This week, on one of our Web Analytics projects, we encountered a discrepancy in the Avg. Visit Duration calculation between a set of dashboard reports and a set of ad hoc reports. We did some testing and research and discovered that the issue was a direct reflection of the fact that there are limited industry standards in web analytics. Visit duration is generally defined as the amount of time spent on the web site. It is measured by calculating the difference between the first time stamp in the visit and the last time stamp in the visit.
One of the noticeable issues with this calculation is that the last time stamp of the visit is recorded when the user starts viewing the last page in their visit, not when they leave the page. The user could continue to dwell on the page, but that dwell time will not be counted as part of the overall duration. There is no way to determine how much time the user spent on that page, since the browser sends no additional requests back to the server.
This flaw is exacerbated in the case of a single page view visit. When a visit includes a single page view (a bounce, in Google Analytics terms), the result is a visit with duration = 0, because it contains only a single page view with a single time stamp. Many web analytics end users may consider this to be a bug, but it is a limitation associated with log data.
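As a minimal sketch of the calculation described above, the snippet below computes a visit's duration as the last page view time stamp minus the first. The function name `visit_duration_seconds` and the sample time stamps are hypothetical, chosen only for illustration; they do not come from any particular analytics product.

```python
from datetime import datetime


def visit_duration_seconds(timestamps):
    """Return last time stamp minus first time stamp, in seconds.

    Time spent on the final page is never captured, because no
    further request reaches the server after that page is loaded.
    """
    if not timestamps:
        raise ValueError("a visit needs at least one page view")
    return (max(timestamps) - min(timestamps)).total_seconds()


# Hypothetical three-page visit: page views at 10:00:00, 10:02:30, 10:05:00.
multi_page_visit = [
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 10, 2, 30),
    datetime(2024, 1, 1, 10, 5, 0),
]

# Hypothetical single page view visit (a bounce): one time stamp only.
bounce_visit = [datetime(2024, 1, 1, 11, 0, 0)]

multi_page_duration = visit_duration_seconds(multi_page_visit)
bounce_duration = visit_duration_seconds(bounce_visit)

print(multi_page_duration)  # 300.0
print(bounce_duration)      # 0.0
```

The bounce comes out as 0 seconds purely because the first and last time stamps are the same record, not because the visitor left instantly.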
But, is duration = 0 really true? Isn’t it more like duration = unknown?
And then, how do we calculate Avg. Visit Duration? After some research and testing, we determined that the discrepancy was due to the fact that the formula for Avg. Visit Duration in the dashboard was:
- Total Time Spent/(Visits – Single Page Visits)
In other words, all of the visits with an “unknown” duration had been removed. Not a bad idea, but it needed to be declared in the documentation. As it stands, this formula violates the definition of “Average”.
But, in the ad hoc reporting sections of the product, the formula for Avg. Visit Duration was:
- Total Time Spent/Visits
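To make the gap between the two formulas concrete, here is a small sketch that applies both to the same set of visits. The function names and the sample data are assumptions made for illustration; each visit is represented as a hypothetical `(page_views, duration_seconds)` pair.

```python
def avg_duration_dashboard(visits):
    """Total Time Spent / (Visits - Single Page Visits)."""
    total_time = sum(duration for _, duration in visits)
    single_page = sum(1 for pages, _ in visits if pages == 1)
    denominator = len(visits) - single_page
    return total_time / denominator if denominator else None


def avg_duration_ad_hoc(visits):
    """Total Time Spent / Visits."""
    total_time = sum(duration for _, duration in visits)
    return total_time / len(visits) if visits else None


# Hypothetical data: two bounces (duration recorded as 0) and three
# multi-page visits lasting 120, 300, and 60 seconds.
sample_visits = [(1, 0), (1, 0), (3, 120), (5, 300), (2, 60)]

dashboard_avg = avg_duration_dashboard(sample_visits)
ad_hoc_avg = avg_duration_ad_hoc(sample_visits)

print(dashboard_avg)  # 160.0
print(ad_hoc_avg)     # 96.0
```

With two bounces out of five visits, the same 480 seconds of total time yields 160 seconds under one formula and 96 seconds under the other, and the gap widens as the share of single page visits grows.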
The Web Analytics Association has released a standard definition of visit duration, and it includes a note that visits with a single page have a duration that cannot be calculated. But, the standard does not indicate how those visits should be handled in aggregate calculations. Therefore, it is still up to the software vendors, and in this case, we see both calculations in the same product!
We think assigning a value to an unknown is a bit deceptive, because it masks the unknown. It would be preferable to make the volume of single page visits visible, and then report Avg. Visit Duration for the remaining visits. If reports called attention to the single page visits, there would be more questions regarding their business value and how to improve it.
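One way this kind of report could look, sketched under our own assumptions: durations for single page visits are stored as `None` (unknown) rather than 0, and the summary shows their volume next to the average for the remaining visits. The function name `summarize_visits`, the field names, and the sample data are hypothetical.

```python
from typing import List, Optional


def summarize_visits(durations: List[Optional[float]]) -> dict:
    """Report single page visits separately from the average duration.

    A duration of None marks a single page visit whose duration is
    unknown; it is counted, but kept out of the average.
    """
    known = [d for d in durations if d is not None]
    unknown_count = len(durations) - len(known)
    return {
        "visits": len(durations),
        "single_page_visits": unknown_count,
        "single_page_share": unknown_count / len(durations) if durations else None,
        "avg_duration_multi_page": sum(known) / len(known) if known else None,
    }


# Hypothetical data: two single page visits and three multi-page visits.
report = summarize_visits([None, None, 120.0, 300.0, 60.0])
print(report)
```

Keeping the unknowns as a separate count means the average is labeled for what it covers, and the share of single page visits is visible in the same report.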