Statistics Tools for Your Web Site
PDI 2008 Workshop Description
What is it?
Web analytics software uses Web server logs and/or tracking codes to collect, aggregate and analyze data on the use of Web pages or sites.
Whenever someone visits a Web page, statistics can be gathered about that visit, such as:
- Date and time of the visit
- URL and title of the page
- HTTP status/error codes
- Data about the user's computer
- IP number
- Browser version
- Screen resolution
- Connection speed
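As a rough illustration of how several of the items above can be read from a Web server log, here is a minimal Python sketch. It assumes the server writes the Apache "combined" log format; the sample line, the field names and the `parse_log_line` helper are made up for this example. Items such as page title, screen resolution and connection speed normally do not appear in server logs and are usually gathered by tracking codes instead.

```python
import re
from datetime import datetime

# Assumed layout (Apache combined log format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp and status code to typed values.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


# Hypothetical sample line, not taken from a real server.
sample = (
    '192.0.2.10 - - [10/Oct/2008:13:55:36 -0600] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.example.org/start.html" "Mozilla/5.0 (Windows; U) Firefox/3.0"'
)
record = parse_log_line(sample)
print(record["ip"], record["time"].isoformat(), record["url"], record["status"])
```

Each parsed record carries the date and time, URL, status code, IP number and browser string for one request.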
How to use Web statistics to improve your Web site design
Why use it?
- Detect successes and problems to improve site design
- Navigation, content, format, usability, accessibility
- Track impact of pages and site on organizational goals and profits
- Reports for grant-funded projects
- Better understand and meet user goals and needs
Strengths
- A lot of data can be collected and aggregated over many users and a long time
- Data collection is non-obtrusive, automated and inexpensive
- More quantitative than subjective methods like usability tests and interviews
Limitations
- Data collection and aggregation are imperfect and limited
- Need to log all data (different media types and Web applications)
- Need to distinguish automated crawlers from human users
- Web content is not equally accessible to all users (e.g. some tools require JavaScript)
- Only shows what users do, not why
- Cannot see the full context in which the user is working
- Difficult to track individual users through entire sessions
- Hypotheses must be tested through other methods
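The need to distinguish automated crawlers from human users can be sketched as a simple filter on the browser (user-agent) string. This is only a heuristic under stated assumptions: the keyword list is illustrative and far from complete, the `is_probable_crawler` name and the sample hits are invented for this example, and crawlers that imitate ordinary browsers will not be caught.

```python
import re

# Illustrative keywords only; real crawler lists are much longer.
BOT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def is_probable_crawler(user_agent):
    """Guess whether a user-agent string belongs to an automated crawler."""
    # Assumption: a missing user-agent string is treated as automated traffic.
    if not user_agent:
        return True
    return BOT_PATTERN.search(user_agent) is not None


# Hypothetical hits, each with the user-agent string taken from a log record.
hits = [
    {"url": "/index.html", "agent": "Mozilla/5.0 (Windows; U) Firefox/3.0"},
    {"url": "/index.html", "agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"url": "/hours.html", "agent": ""},
]
human_hits = [hit for hit in hits if not is_probable_crawler(hit["agent"])]
print(len(human_hits), "of", len(hits), "hits kept")
```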
Other Web data collection methods
These methods are complementary, so use multiple methods.
Some popular tools
CSU Libraries Demos
WebTrends 8
- Reporting
- Simple interface to a large number of statistics
- Select profile, date range, report and graph type
- Export to Word, Excel, CSV or PDF
- Report categories
- Overview dashboard
- Marketing - ads, referrers, search engines
- Visitors - visits, domains, geography
- Pages and files - directories, file types, URL parameters
- Navigation - entry, exit and single access pages, paths
- Technical - hits, page views, bandwidth, errors
- Activity - visits and hits by day, hour, duration, pages
- Browsers - spiders, platforms
- Configuration
- Runs on a Windows-based PC; provides its own Web server
- Uses CGI/Perl and Java
- + No tracking codes or cookies required
- + Can use copies of server logs from multiple servers
- - Requires server logs; processing is slow
- - Need to plan reported subdirectories in advance
- - Options are complex
- Profiles - servers, filters, templates, reports, scheduler, etc.
- Filters - include/exclude; hit/visit; URL/directory/file, browser, day/hour, etc.
- Templates - select reports for a profile to run
Google Analytics
- Free for sites with under 5 million page views/month, or for AdWords users
- Reporting
- + Rich, granular statistics - select combination of dimensions
- + Easy to use interface - filter, sort, export, data visualization
- - Only tracks users with JavaScript and cookies enabled
- Dashboard - overview of saved reports
- Visitors - visits, page views, browser and network capabilities
- Traffic Sources - direct, referring sites, search engines, keywords
- Content - by page views, title, folder; Site Search
- Goals - conversion goal URL, funnel URLs
- Scheduled e-mail reports
- Configuration
- Web-based/hosted
- + Easy to use, good context-sensitive documentation
- + Track multiple servers from one account
- + No need for server logs, automatically filters out crawlers
- - Tracking codes - a few lines of JavaScript required in each page
- - Views of images, PDFs and most documents are not recorded
- - Dependence on Google; limited downloads and customization
- Profiles - server URL, goals, excluded query params, filters, users
- Filters - include/exclude, search/replace, advanced
- Users - grant user or admin privileges to users with Google accounts
Features
File and folder metrics
- Hits, clicks, visits, page views, unique page views, visitors, unique visitors
- Entry/exit, time on page, bounce rate, average time/visit, pages/visit, % new visits
- Wikipedia general definitions vs. Google Analytics definitions
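Because definitions of these metrics differ between tools, the following Python sketch shows one possible set of definitions rather than any product's actual method. The assumptions are: a visitor is identified by IP number (crude in practice), a page view is a request whose URL ends in `.html`, `.htm` or `/`, a visit ends after 30 minutes without a page view, and a bounce is a visit with exactly one page view. The `summarize` function and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)   # assumption: inactivity gap that ends a visit
PAGE_EXTENSIONS = (".html", ".htm", "/")  # assumption: what counts as a page


def is_page(url):
    """Treat HTML files and directory URLs as pages; everything else is only a hit."""
    return url.endswith(PAGE_EXTENSIONS)


def summarize(hits):
    """hits: list of (visitor_id, datetime, url) tuples."""
    page_views = [hit for hit in hits if is_page(hit[2])]
    visits = []     # each visit is a list of page URLs
    last_seen = {}  # visitor_id -> (time of last page view, index of that visit)
    for visitor, when, url in sorted(page_views, key=lambda hit: hit[1]):
        previous = last_seen.get(visitor)
        if previous is None or when - previous[0] > SESSION_TIMEOUT:
            visits.append([])          # start a new visit
            index = len(visits) - 1
        else:
            index = previous[1]        # continue the current visit
        visits[index].append(url)
        last_seen[visitor] = (when, index)
    bounces = sum(1 for visit in visits if len(visit) == 1)
    return {
        "hits": len(hits),
        "page_views": len(page_views),
        "visits": len(visits),
        "unique_visitors": len(last_seen),
        "pages_per_visit": len(page_views) / len(visits) if visits else 0.0,
        "bounce_rate": bounces / len(visits) if visits else 0.0,
    }


start = datetime(2008, 10, 10, 9, 0)
sample_hits = [
    ("192.0.2.10", start, "/index.html"),
    ("192.0.2.10", start + timedelta(seconds=1), "/logo.png"),
    ("192.0.2.10", start + timedelta(minutes=5), "/hours.html"),
    ("192.0.2.10", start + timedelta(hours=2), "/index.html"),
    ("192.0.2.20", start + timedelta(minutes=10), "/"),
]
summary = summarize(sample_hits)
print(summary)
```

With this sample data, five hits yield four page views, three visits and two unique visitors; two of the three visits are bounces.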
Data dimensions
- Personal - Who
- IP numbers, ISPs, languages
- Nominal - What
- URLs, Files, folders, search terms
- Chronological - When
- Date and time trends – hour, weekday, month, year
- Geographical - Where
- User’s location – Library, CSU, other
- Technical - How
- User’s browser and network capabilities
Other features
- Track multiple servers/domains from one account
- Can aggregate across multiple servers/subdomains?
- Filter, sort and group
- Robot filtering: automatic or manual? regular expressions?
- Filter by user IP address?
- Search/Filter URLs? via regular expressions?
- URL query string analysis (e.g. myorg.edu/myscript?variable=value)
- Reporting
- Maximum URLs; URLs displayed; URLs per page
- Export formats: xls, doc, csv, tsv, pdf, xml, html
- Data visualization: charts, graphs, map, site overlay
- Schedule reports to be sent by email
- Access restriction
- Login/accounts? IP-based?
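The URL query string analysis listed above can be illustrated with Python's standard `urllib.parse` module. This is a minimal sketch: the `count_query_values` helper is invented for this example, and the URLs extend the `myorg.edu/myscript?variable=value` pattern with made-up values.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_query_values(urls, parameter):
    """Count how often each value of one query-string parameter occurs."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        for name, value in parse_qsl(query):
            if name == parameter:
                counts[value] += 1
    return counts


# Hypothetical requested URLs, as they might appear in a log or report.
urls = [
    "http://myorg.edu/myscript?variable=value",
    "http://myorg.edu/myscript?variable=value&page=2",
    "http://myorg.edu/myscript?variable=other",
    "http://myorg.edu/index.html",
]
counts = count_query_values(urls, "variable")
print(counts.most_common())  # [('value', 2), ('other', 1)]
```

The same approach can be applied to a site search parameter to see which search terms are used most often.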
Other selection criteria
- Provider: Commercial? Cost? Licensing? Open source? Ease of customization?
- Platform: Windows or Linux, local server or hosted
- Data collection method: Web server logs? Page tagging?
- Ease of configuration: GUI-based and/or file-based?
- Support: phone/email, user community, documentation, training, upgrades, longevity
Other Resources
- Web analytics - Wikipedia
- Web log analysis software - Wikipedia
- Comparison of commercial Web analytics software - 2005
- AWStats comparison with Analog, Webalizer, HitBox
- Search Analytics: Conversations with Your Customers - Rich Wiggins, Michigan State University
- Analyzing Web Server Logs to Improve a Site's Usage. Marshall Breeding. Computers in Libraries. October 2005, pp. 26-28.
- Web Site Measurement Hacks - Eric Peterson, O'Reilly, 2005
- Actionable Web Analytics - Jason Burby, Shane Atchison, 2007
- Web Analytics: An Hour a Day - Avinash Kaushik, Sybex, 2007
- Interpreting Your Website Statistics
- Web Analytics as Performance Management and Optimization means defining Goals and KPI's