In a Business 2.0 article entitled "Why Your Site Traffic Numbers Are Out of Whack" (March
2001) (www.business2.com/articles/mag/print/0,1643,9319,00.html), Brian Caulfield
included a section called "How to Get the Best Site Traffic Numbers," which states:
Even with high-end site analytics software, it's impossible to get dead-on traffic counts. But
with some care, you can get reasonably good numbers. Here's what to do.
1. Stomp out spiders. To distinguish spiders' hits from those created by real
users, look for unusual activity on your logs. Spiders do things no normal
person would, like visit every single page on your site in an hour. Once you
think you've found a spider, comb through Web logs to locate its IP address,
then direct your analytics software to ignore future hits originating from that
address.
2. Watch out for masked IP addresses. Not every address represents an
individual user. Corporations and dial-up ISPs (notably AOL) can show your
server a single IP address for many, many actual users. Look for high traffic
from a single address; it may indicate that you have more users than your data
suggests.
3. Avoid cookie monsters. Don't expect accurate visitor counts from cookies. An
unknown number of Web users set their browsers not to accept cookies.
Cookies also can't distinguish between multiple people using the same
computer—for example, PCs in libraries and schools.
4. Bust those caches. The most common way to defeat the problem of cached
pages is to generate as many pages as possible "on the fly," using scripts to
assemble them from a database, says Scott Hanson, vice president for auditing
services at ABC Interactive. Dynamic pages are extremely difficult to cache.
5. Know your audience. Since Media Metrix and Nielsen//NetRatings track users
only in homes and at work, ask your IT department to filter out users coming
from libraries and schools before comparing trends in your site's traffic with
Media Metrix's figures.
Nothing brings the eCompany Now Web team closer to blows than the issue of traffic. Our
server log files, ad banner logs, and tracking software give numbers that can vary 25 percent,
and everyone who needs that data—from sales to editorial to marketing—is miffed about the
absence of reliable numbers. Here are some lessons we've learned.
• Decide in advance exactly what data you want to collect.
• Pick the right site-metrics software. Because we lacked clear expectations
about what data we wanted, we chose a $10,000 version from WebTrends
when we probably needed the deluxe version that cost 15 times as much.
• Dedicate a powerful piece of hardware to run the software, and have a
technical person learn it, tweak it, and field requests for custom data runs.
The above is what you get when you look at a pile of data (server logs) and ask, "What does it
mean? What can it tell us?" The next step is to ask, "What do we want to know and how much
information is there about that?"
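As a concrete illustration of the first tip in the quoted list, spider hunting can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the log is assumed to be in Common Log Format, the threshold of 100 distinct pages per clock hour is an arbitrary placeholder, and the names find_spiders and drop_ignored are hypothetical rather than part of any analytics product.

```python
import re
from collections import defaultdict

# Assumption: the server writes Common Log Format lines, for example
# 10.0.0.1 - - [10/Mar/2001:13:00:01 -0800] "GET /index.html HTTP/1.0" 200 512
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" \d{3} \S+')

# Hypothetical threshold: distinct pages one address may fetch in one clock hour
SPIDER_PAGES_PER_HOUR = 100


def parse_line(line):
    """Return (ip, hour_bucket, path) for a log line, or None if it does not parse."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    ip, timestamp, path = match.groups()
    # The first 14 characters of the timestamp are "dd/Mon/yyyy:hh"
    return ip, timestamp[:14], path


def find_spiders(lines, threshold=SPIDER_PAGES_PER_HOUR):
    """Return the addresses that fetched at least `threshold` distinct pages in an hour."""
    pages_seen = defaultdict(set)  # (ip, hour) -> set of distinct paths
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            ip, hour, path = parsed
            pages_seen[(ip, hour)].add(path)
    return {ip for (ip, _hour), paths in pages_seen.items() if len(paths) >= threshold}


def drop_ignored(lines, ignored_ips):
    """Yield only the log lines that do not originate from an ignored address."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or parsed[0] not in ignored_ips:
            yield line
```

The sketch flags addresses by distinct pages per hour rather than by raw hit counts because, as the second tip warns, heavy traffic from one address may just as easily be a corporate or ISP proxy standing in for many real users.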