
Do you have enough pages in the search index?

In some ways, you never have enough pages in the search index, because every extra page that sneaks in there is a lottery ticket in the search sweepstakes: you’ve got to be in it to win it. So, the more pages you have in the search index, the more chances you have to be found. But clearly some number of indexed pages suggests you are doing OK and some other number looks bad (zero, for instance, would be bad). How do you figure out how many pages you have in the search index, and how do you know if that number is OK?

First off, you need to understand that there is no single search index: each search engine has its own. Google has its own, Bing has its own, and so do many other search engines. So, you need to know which search engines are worth worrying about. In the U.S., it’s Google and Bing.


So how do you find out how many pages are in Google’s index and how many are in Bing’s?

Both Google and Bing support a search operator called “site:”. In each search engine, enter “site:” followed by your domain name (such as “site:biznology.com”). For some sites, this handy operator works just fine and you can see how many pages are stored in each index. If your results look right, great. But sometimes the results just look nuts. For example, “site:ibm.com” yields 2.8 million pages on Bing but a crazy 12.2 million pages on Google.
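If you check several domains regularly, it can help to generate the “site:” query links rather than typing them by hand. The following is only a minimal sketch: the function name `site_query_urls` and the `SEARCH_ENDPOINTS` table are made up for illustration, and it assumes the ordinary public search URLs with a `q` parameter. It builds the links only; you still open them in a browser and read the counts yourself.

```python
from urllib.parse import urlencode

# Assumption: the standard public search pages, which take the query in "q".
SEARCH_ENDPOINTS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def site_query_urls(domain):
    """Return a {engine: url} dict of "site:" query links for one domain."""
    query = f"site:{domain}"
    return {
        engine: f"{base}?{urlencode({'q': query})}"
        for engine, base in SEARCH_ENDPOINTS.items()
    }


if __name__ == "__main__":
    for engine, url in site_query_urls("biznology.com").items():
        print(engine, url)
```

Keep in mind that the counts shown on those result pages are estimates, so treat them as a rough sanity check rather than an exact figure.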

To avoid the kind of inconsistent counts that “site:” can return, use each search engine’s Webmaster Tools site. Both Google and Bing will tell your Webmaster exactly how many pages are in the index and will even let you know which pages they are having trouble grabbing. It’s possible that the IBM Webmaster is aware that there actually is a big discrepancy between Google and Bing, which might be just fine or might be something they are working on.

I’ve spoken to a few experts and they have varying theories about why the counts differ so much. One told me that Bing stops crawling when more than 1% of the pages get errors; the Bing Webmaster site will clue you in on this. Another speculated that Bing is only returning counts of pages that get search visits, not every page in its index. No one I spoke with knew for sure why this is happening, but it shows you the importance of checking your numbers.

Likewise, a big swing in indexed pages (1,000 pages indexed in Google today vs. 5,000 yesterday) means that you should look into it. And, in general, an inclusion ratio (pages indexed divided by actual pages) below 70% is something that should give you pause, although with these Bing errors, who knows what a good inclusion ratio is for Bing right now.
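Both checks are simple arithmetic, so they are easy to script once you have the counts from the Webmaster Tools sites. Here is a minimal sketch, not a definitive tool: the function names are invented for illustration, the 70% floor comes from the rule of thumb above, and the 50% day-over-day swing threshold is purely an assumption that you should tune for your own site.

```python
def inclusion_ratio(indexed_pages, actual_pages):
    """Pages indexed divided by actual pages on the site."""
    if actual_pages <= 0:
        raise ValueError("actual_pages must be positive")
    return indexed_pages / actual_pages


def check_index_health(indexed_today, indexed_yesterday, actual_pages,
                       min_ratio=0.70, max_swing=0.50):
    """Return a list of warnings; an empty list means nothing looks off.

    min_ratio is the 70% rule of thumb; max_swing (50%) is an assumed
    threshold for what counts as a "big swing" between two days.
    """
    warnings = []

    ratio = inclusion_ratio(indexed_today, actual_pages)
    if ratio < min_ratio:
        warnings.append(
            f"inclusion ratio {ratio:.0%} is below {min_ratio:.0%}"
        )

    if indexed_yesterday > 0:
        swing = abs(indexed_today - indexed_yesterday) / indexed_yesterday
        if swing > max_swing:
            warnings.append(
                f"indexed pages changed by {swing:.0%} since yesterday"
            )

    return warnings


if __name__ == "__main__":
    # The example from the text: 1,000 pages today vs. 5,000 yesterday.
    # The 6,000 actual pages figure is made up for illustration.
    for warning in check_index_health(1000, 5000, 6000):
        print("Look into this:", warning)
```

Run it separately for each search engine, since each one has its own index and its own counts.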

Regardless, knowing how many pages are indexed is the first step to seeing if you have a problem.


Mike Moran

Mike Moran is an expert in digital marketing, search technology, social media, text analytics, web personalization, and web metrics, who, as a Certified Speaking Professional, regularly makes speaking appearances. Mike’s previous appearances include keynote speaking appearances worldwide. Mike serves as a senior strategist for Converseon, an AI powered consumer intelligence technology and consulting firm. He is also a senior strategist for SoloSegment, a marketing automation software solutions and services firm. Mike also serves as a member of the Board of Directors of SEMPO. Mike spent 30 years at IBM, rising to Distinguished Engineer, an executive-level technical position. Mike held various roles in his IBM career, including eight years at IBM’s customer-facing website, ibm.com, most recently as the Manager of ibm.com Web Experience, where he led 65 information architects, web designers, webmasters, programmers, and technical architects around the world. Mike's newest book is Outside-In Marketing with world-renowned author James Mathewson. He is co-author of the best-selling Search Engine Marketing, Inc. (with fellow search marketing expert Bill Hunt), now in its Third Edition. Mike is also the author of the acclaimed internet marketing book, Do It Wrong Quickly: How the Web Changes the Old Marketing Rules, named one of the best business books of 2007 by the Miami Herald. Mike founded and writes for Biznology® and writes regularly for other blogs. In addition to Mike’s broad technical background, he holds an Advanced Certificate in Market Management Practice from the Royal UK Charter Institute of Marketing and is a Visiting Lecturer at the University of Virginia’s Darden School of Business. He also teaches at Rutgers Business School. He is a Senior Fellow at the Society for New Communications Research. Mike worked at ibm.com from 1998 through 2006, pioneering IBM’s successful search marketing program.
IBM’s website of over two million pages was a classic “big company” website that has traditionally been difficult to optimize for search marketing. Mike, working with Bill Hunt, developed a strategy for search engine marketing that works for any business, large or small. Moran and Hunt spearheaded IBM’s content improvement that has resulted in dramatic gains in traffic from Google and other internet portals.
