Post-Panda (Google’s revolutionary, year-old search ranking algorithm that has upended old-style SEO), optimizing individual pages is more difficult and less effective than it used to be. Google now pays more attention to overall site cleanliness and architecture, and duplicate content is more problematic than ever. The larger and more complex your environment, the harder it is to present an optimized site. That’s where content analytics, the science of understanding your content and gleaning actionable insights from it, becomes essential to effective post-Panda web publishing.
You do the keyword research. You use the research to build your content strategy and information architecture. You write compelling copy and complement it with relevant and engaging imagery, all with the best SEO practices in mind. You build the page and embed SEO-enhanced YouTube videos into it. You incorporate social rating and sharing throughout the page. You blog and tweet about the page as soon as it’s launched, using relevant keywords and hashtags. In short, you do everything right. So why is your page not ranking?
A common reason: there are other pages in your environment optimized for the same keywords. Especially in large corporate settings, the main culprit behind SEO failure is duplicate content. Often this is old junk that is just sitting on a server getting in users’ way. More important, the old pages take up slots in Google’s index and distract the algorithm as it tries to rank your content. Even if those pages were never optimized, they have inbound links that should be pointing to your optimized experiences.
What’s the solution? Clean house. Find and retire old, duplicate content. Redirect retired URLs to their appropriate newer experiences (making sure to use 301 permanent redirects, which pass link juice). If you have internal links to the older experiences, point them at your newer pages with the appropriate anchor text—especially on your home page. If you need to retain the old content for legal reasons, de-index it. And govern your site going forward to ensure that you don’t create duplicates in the future.
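For example, a quick script along the following lines can verify that your retired URLs actually return 301s pointing at their replacements. This is only a sketch: the URL map is made up for illustration, and in practice you would draw it from your content inventory.

```python
import urllib.error
import urllib.request

# Hypothetical mapping of retired URLs to the pages that replace them.
REDIRECT_MAP = {
    "https://www.example.com/old-product-page": "https://www.example.com/products/widget",
    "https://www.example.com/2009-press-release": "https://www.example.com/newsroom",
}

class NoFollow(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so we can inspect the first response."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

opener = urllib.request.build_opener(NoFollow)

for old_url, expected in REDIRECT_MAP.items():
    try:
        opener.open(old_url)
        print(f"{old_url}: no redirect -- the old page is still being served")
    except urllib.error.HTTPError as err:
        location = err.headers.get("Location", "")
        if err.code == 301 and location == expected:
            print(f"{old_url}: OK, 301 to {location}")
        else:
            print(f"{old_url}: {err.code} to {location or '(no Location header)'} -- check this one")
```

Refusing to follow redirects lets you see the first response, so a chain of temporary 302s or a stray 404 stands out immediately instead of being silently followed to a working page.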
I know what you’re thinking. That’s easier said than done, right? The larger and more complex your enterprise, the more difficult it is to build a content inventory. Building an intelligent content inventory—what we call an audit in content strategy land—is even more challenging. If two apparent duplicates are owned by different business units, how do you decide which one stays and which one goes? Once you have an audit, how do you govern new content creation?
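To give you a flavor of what even a crude inventory looks like, here is a bare-bones sketch that pulls URLs from a sitemap and flags pages whose titles collide, which are often duplicate-content candidates. The sitemap URL is a placeholder, and a real audit would record much more per page: owner, last-modified date, target keywords, and traffic.

```python
# Rough content inventory: list sitemap URLs and flag colliding <title> tags.
import urllib.request
import xml.etree.ElementTree as ET
from collections import defaultdict
from html.parser import HTMLParser

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

class TitleParser(HTMLParser):
    """Capture the text inside the first <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.title:
            self.in_title = True
    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
    def handle_data(self, data):
        if self.in_title:
            self.title += data

def page_title(url):
    with urllib.request.urlopen(url) as resp:
        parser = TitleParser()
        parser.feed(resp.read().decode("utf-8", errors="ignore"))
        return parser.title.strip()

# Collect every URL listed in the sitemap.
with urllib.request.urlopen(SITEMAP_URL) as resp:
    tree = ET.parse(resp)
urls = [loc.text for loc in tree.findall(".//sm:loc", NS)]

# Group URLs by title; any group larger than one deserves a closer look.
inventory = defaultdict(list)
for url in urls:
    inventory[page_title(url)].append(url)

for title, pages in inventory.items():
    if len(pages) > 1:
        print(f"Possible duplicates for '{title}':")
        for page in pages:
            print(f"  {page}")
```

A script like this won’t settle which page stays and which goes, but it gives the conversation between business units a shared list of candidates to start from.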
Your time and attention won’t allow me to answer all these questions, at least not in any detail. My main point in this blog is to get you thinking about the questions and to give you a sense of how important it is to start answering them. I also want to point you toward a few resources that can guide you to the right answers.
- Auditing your content: To get started with content auditing and strategy, I recommend Content Strategy for the Web by Kristina Halvorson and Melissa Rach. In this second edition, the authors have written the definitive guide to building content audits as foundational components of corporate content strategy.
- Building a strategy: Once you have an audit, the main way to govern content is to let the data answer your questions for you. With the methods I outline in an article in the current issue of Contents Magazine, keyword demand and audience analysis can help you prioritize your content efforts.
- Choosing the right tools: In large organizations, building a comprehensive audit without the right tools is a fool’s errand. At IBM, we use two primary tools to help us with auditing: Covario and Acrolinx IQ. Covario is a web service that helps us audit pages in our environment and discover apparent duplicates and untapped keyword opportunities. Acrolinx IQ is a content quality tool that helps us govern content and publish only the highest-quality, optimized content. In the age of Panda, content quality and optimization are converging.
- Using the right metrics: As Mike likes to point out, these methods help you make your best guess at how to build better content. You need to iterate on the content you choose to nurture using A/B testing, and referral, bounce, and engagement data help you tune it (see the sketch after this list). Mike’s book Do It Wrong Quickly is still the best source for this information.
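To make that concrete, here is a tiny sketch of the kind of comparison you might run once engagement data comes in for two variants of a page. The visit and engagement counts are invented for illustration; plug in your own analytics numbers.

```python
# Compare engagement rates of two page variants with a two-proportion z-test.
from math import erfc, sqrt

def compare_variants(engaged_a, visits_a, engaged_b, visits_b):
    rate_a = engaged_a / visits_a
    rate_b = engaged_b / visits_b
    pooled = (engaged_a + engaged_b) / (visits_a + visits_b)
    stderr = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_b - rate_a) / stderr
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value
    return rate_a, rate_b, z, p_value

# Hypothetical numbers: variant B is the rewritten page.
rate_a, rate_b, z, p = compare_variants(engaged_a=180, visits_a=4000,
                                        engaged_b=240, visits_b=4100)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z={z:.2f}  p={p:.3f}")
```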
On the post-Panda web, content strategy is the new SEO. It’s no longer an option to optimize pages in a vacuum. You need to optimize your site as a whole. Audits, powered by content analytics, are the key way to do this, providing answers to a host of questions about how to improve your site’s search optimization and overall effectiveness.
James Mathewson is the global search strategy lead and co-author of the book Audience, Relevance and Search: Targeting Web Audiences with Relevant Content. The views expressed here are his own and not IBM’s.