Update on the new Google Analytics

Last week, I wrote a post on the new version of Google Analytics, which drew a response from Avinash Kaushik, Web metrics guru and current Google Analytics evangelist. Avinash, well-known for his blog and best-selling book, wrote to me to clarify some of what I explained last week. I thought his points were well-taken, so I am revisiting the subject today.


Avinash wanted to clarify my discussion of Google Analytics’ ability to handle very high traffic volumes. I characterized Google’s approach as sampling that dropped data, but Avinash explained it more clearly:

For high-volume traffic, the Google Analytics Terms of Service (TOS) is very clear. Up to five million hits a month is OK for anyone. If you go beyond that (and five million is a huge number to hit!), then it states that you must be an AdWords customer (though no spending requirement is stated), and then the limit increases x times (x is not specified, but even a multiple of two or five makes the limit several times the five million hits).

Perhaps the most important point is that Google Analytics does not drop data. 100% of the data is collected, even if you send hundreds of millions of hits per day (which many sites I know do, crazy as that seems!).

There are two sampling scenarios:

1) The GA TOS states that, at Google’s discretion, it may ask you to sample the data, i.e., send it less data (though given that we are talking about millions of hits a day, the sample captured will still be more than statistically significant). All data that you send will be processed and reported to you.

2) In the second scenario, again, 100% of the data is collected, but if you run very large queries over a very long time period (since Google never deletes your data, no matter how large), then that query will sample the stored data to ensure it actually returns results to you. Intelligent algorithms are applied to ensure that the results are statistically accurate.

If this happens, the Google Analytics report will clearly state that the data was sampled and will show you the rate of sampling (in a little yellow square next to each metric).

Both of the above are precisely what all paid Web analytics tools do, too. Every vendor has contracted limits in terms of data you can send them. The more data you contract for, the more you pay.

If you breach your contract limits with any paid vendor, they will ask you to either sample the data at the collection point (your site), so you don’t send them all the data, or they will ask you to pay more to collect all the data. Even in the latter scenario, to have your queries return with results (and I say this from real experience) you will have to sample the data that has been collected.
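To make the sampling idea concrete, here is a minimal Python sketch of how a report could estimate a count from a random sample of stored hits and scale it back up. It is only an illustration: the function name `estimate_from_sample`, the hit records, and the 95% margin of error based on a normal approximation are all my assumptions, not Google's actual algorithm, which Avinash did not describe.

```python
import math
import random


def estimate_from_sample(hits, rate, predicate, seed=0):
    """Estimate how many hits match `predicate` using a random sample.

    Hypothetical helper for illustration only; not Google Analytics code.
    Returns (estimated_count, margin_of_error, sample_size).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Keep each hit independently with probability `rate`.
    sample = [h for h in hits if rng.random() < rate]
    n = len(sample)
    if n == 0:
        raise ValueError("sample is empty; raise the sampling rate")

    # Share of sampled hits that match.
    p = sum(1 for h in sample if predicate(h)) / n
    # 95% margin of error for a proportion (normal approximation).
    margin = 1.96 * math.sqrt(p * (1 - p) / n)

    # Scale the share and its margin back up to the full data set.
    total = len(hits)
    return p * total, margin * total, n


if __name__ == "__main__":
    # Made-up traffic: one hit in five goes to /pricing.
    hits = [{"page": "/pricing" if i % 5 == 0 else "/home"}
            for i in range(200_000)]
    est, margin, n = estimate_from_sample(
        hits, rate=0.10, predicate=lambda h: h["page"] == "/pricing")
    print(f"sampled {n} hits; estimate {est:.0f} +/- {margin:.0f}")
```

The point of the sketch is that a 10% sample of a large data set still pins the answer down to within a small margin, which is why sampling a long-running query costs little accuracy.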

Avinash also thought that I should clarify when support for the old version will end. He rightly points out that Google has announced no end date for support. I seized on their statement that support will last “at least 12 to 18 months,” which is why I advised folks that they might want to consider moving this year. It’s certainly possible that Google will be supporting the old code even several years from now. Avinash advises existing Google Analytics users to switch to the new version if they need the new features, but not to worry about it too much if they don’t.
Thanks, Avinash, for helping my readers with this important decision.

Mike Moran

Mike Moran is an expert in digital marketing, search technology, social media, text analytics, web personalization, and web metrics, who, as a Certified Speaking Professional, regularly makes speaking appearances. Mike’s previous appearances include keynote speaking appearances worldwide. Mike serves as a senior strategist for Converseon, an AI-powered consumer intelligence technology and consulting firm. He is also a senior strategist for SoloSegment, a marketing automation software solutions and services firm. Mike also serves as a member of the Board of Directors of SEMPO. Mike spent 30 years at IBM, rising to Distinguished Engineer, an executive-level technical position. Mike held various roles in his IBM career, including eight years at IBM’s customer-facing website, ibm.com, most recently as the Manager of ibm.com Web Experience, where he led 65 information architects, web designers, webmasters, programmers, and technical architects around the world. Mike's newest book is Outside-In Marketing with world-renowned author James Mathewson. He is co-author of the best-selling Search Engine Marketing, Inc. (with fellow search marketing expert Bill Hunt), now in its Third Edition. Mike is also the author of the acclaimed internet marketing book, Do It Wrong Quickly: How the Web Changes the Old Marketing Rules, named one of the best business books of 2007 by the Miami Herald. Mike founded and writes for Biznology® and writes regularly for other blogs. In addition to Mike’s broad technical background, he holds an Advanced Certificate in Market Management Practice from the Royal UK Charter Institute of Marketing and is a Visiting Lecturer at the University of Virginia’s Darden School of Business. He also teaches at Rutgers Business School. He is a Senior Fellow at the Society for New Communications Research. Mike worked at ibm.com from 1998 through 2006, pioneering IBM’s successful search marketing program.
IBM’s website of over two million pages was a classic “big company” website that has traditionally been difficult to optimize for search marketing. Mike, working with Bill Hunt, developed a strategy for search engine marketing that works for any business, large or small. Moran and Hunt spearheaded IBM’s content improvement that has resulted in dramatic gains in traffic from Google and other internet portals.
