
The major hole in our book Outside-In Marketing: Using Big Data to Guide Your Content Marketing is the lack of information on marketing artificial intelligence (AI). We talk about how to transform your marketing organization to take advantage of big data, but we don’t address the crying need of CMOs and marketing analysts to incorporate AI into their marketing tech stack.

Perhaps we can be excused for this omission. Our collective 50 years of experience in marketing technology includes AI expertise. Mike, in particular, has deep expertise in text analytics. My team built a first-of-its-kind Watson-powered keyword research tool. But our biggest challenge with our clients was more fundamental: convincing marketing professionals to embrace a data-driven approach in the first place. That is the mission of the book. In light of that, AI seemed too cutting edge for our target audience.

Since the book was published, AI has gone from cutting edge to table stakes for many CMOs. This is reflected in my job. Almost all of my team’s work is about transforming IBM’s marketing stack with AI. So I thought it was about time to share some of this work. Perhaps in a future edition of the book, we can include a chapter on marketing AI transformation. For now, consider this a sneak peek.

1. Tagging

Measurement is the core of data-driven marketing, and of just about every piece of marketing technology. But much of it suffers from a common flaw: poor tagging. Suppose you want to serve content to your audience based on their interests, as Netflix does. The more movies you consume on Netflix, the better it is at recommending other movies for you.

How does Netflix do it? Obviously, it logs your choices and uses an algorithm to choose movies that are similar to what you’ve watched. But how does it “know” that two movies with vastly different titles and descriptions are similar? Tagging. Every video in the collection is tagged for genre, topic, intended audience, and a load of other attributes.

What about when Netflix gets it wrong? It’s not the algorithm. It’s the tagging. If a movie is mistagged, it will show up under the wrong genres or topics. But Netflix does a good enough job with tagging to get it right most of the time, an impressive feat.

If you want to serve more relevant, personalized content to your audience, you need rich and accurate tagging. The problem is, content producers struggle more with tagging than any other aspect of their jobs. For example, at IBM we have a tag for all of our pages called the Subject tag. We recently audited the pages for Subject and found that 70 percent of all pages had the “null” Subject value. Why? Because content producers could not figure out the most relevant Subject tag from a long list of options. Pressed for time, they left the field blank. If the vast majority of your pages are not tagged for an attribute, that attribute is useless.
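To make the audit idea concrete, here is a minimal sketch of how you might measure tag coverage yourself. It is not the audit we ran at IBM; the `null_rate` function, the page dictionaries, and the `subject` field name are all hypothetical, and it assumes your page metadata is available as plain records.

```python
def null_rate(pages, attribute):
    """Return the fraction of pages whose attribute is missing, empty, or 'null'."""
    if not pages:
        return 0.0
    missing = sum(
        1
        for page in pages
        if str(page.get(attribute) or "").strip().lower() in ("", "null")
    )
    return missing / len(pages)


# Hypothetical sample records; real data would come from your CMS export.
pages = [
    {"url": "/cloud/overview", "subject": "Cloud"},
    {"url": "/ai/intro", "subject": "null"},
    {"url": "/security/faq", "subject": ""},
    {"url": "/storage/pricing"},  # attribute absent altogether
]

print(f"{null_rate(pages, 'subject'):.0%} of pages have no Subject tag")
```

Running the same check for each attribute shows quickly which tags are populated well enough to drive personalization and which are not.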

AI solves this problem by automatically tagging marketing pages and assets for the attributes that matter. At IBM, we built an autotagging system using Watson Knowledge Studio that tags pages for their topics. When we scale the system, not only will every page be tagged with this attribute, but the tags will be accurate 85 percent of the time (if our initial test results hold at scale), which is good enough to get the job done.
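An accuracy figure like that comes from comparing automatic tags with tags a person has reviewed. The sketch below shows one simple way to compute it; it assumes you have both sets of tags keyed by URL, and the function name and sample data are hypothetical rather than part of any Watson tool.

```python
def tagging_accuracy(predicted, reviewed):
    """Share of pages where the automatic tag matches the human-reviewed tag.

    Both arguments map a page URL to a single tag. Only pages present in
    both mappings are compared.
    """
    shared = predicted.keys() & reviewed.keys()
    if not shared:
        raise ValueError("no pages in common to compare")
    correct = sum(1 for url in shared if predicted[url] == reviewed[url])
    return correct / len(shared)


# Hypothetical test sample: four pages, one of them tagged incorrectly.
predicted = {"/a": "cloud", "/b": "ai", "/c": "security", "/d": "ai"}
reviewed = {"/a": "cloud", "/b": "ai", "/c": "storage", "/d": "ai"}

print(f"accuracy: {tagging_accuracy(predicted, reviewed):.0%}")
```

Rerunning a check like this on a fresh reviewed sample as the system scales is one way to see whether the initial test results still hold.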

If we can get autotagging working for all of our most important attributes, we can get better and better at serving the right content to our audience when they need it. We can correlate this with transactional metrics like leads and sales. And we can tune the system over time, end to end. AI is a huge enabler of this.

2. Keywords

Keywords are the lifeblood of your marketing enterprise. Keyword research allows you to learn the voice of your customer and tune your marketing messages for them. But most marketing organizations struggle to find the right keywords for their teams. The words have to be relevant to your business and have enough query volume to indicate sufficient interest in a topic by your target audience.
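Those two criteria, relevance and query volume, can be expressed as a simple filter. The sketch below is only an illustration: the `Keyword` record, the relevance score (0.0 to 1.0), the thresholds, and the sample numbers are all assumptions, not values from our research.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    phrase: str
    monthly_queries: int  # query volume from your keyword data source
    relevance: float      # hypothetical 0.0-1.0 score of fit with the business


def shortlist(keywords, min_queries=500, min_relevance=0.6):
    """Keep keywords that clear both thresholds, highest volume first."""
    keep = [
        k
        for k in keywords
        if k.monthly_queries >= min_queries and k.relevance >= min_relevance
    ]
    return sorted(keep, key=lambda k: k.monthly_queries, reverse=True)


# Hypothetical candidates.
keywords = [
    Keyword("hybrid cloud", 12000, 0.9),
    Keyword("cloud", 90000, 0.3),                 # popular but too generic
    Keyword("mainframe modernization", 300, 0.95),  # relevant but low volume
    Keyword("data governance", 4000, 0.8),
]

for k in shortlist(keywords):
    print(k.phrase, k.monthly_queries)
```

The thresholds are judgment calls. A niche business may accept far lower volume than a broad one.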

At IBM we built a keyword ontology, which is a fancy name for a set of taxonomies related to the keywords our target audiences most often use in their search queries. We get keyword data from Google, and we organize it into topics for our topic taxonomy. We also can arrange keywords by business unit, brands, and products. When you have all these attributes related in an ontology, there is no end to the way you can manage content using them.
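One way to picture an ontology of this kind is as keywords that each carry several attributes, so the same keyword set can be regrouped by any of them. The sketch below uses a plain dictionary with made-up entries; it is a toy model for illustration and says nothing about how the real ontology is stored.

```python
from collections import defaultdict

# Hypothetical entries: each keyword is linked to values in several taxonomies.
ONTOLOGY = {
    "hybrid cloud": {"topic": "cloud computing", "business_unit": "Cloud"},
    "cloud migration": {"topic": "cloud computing", "business_unit": "Services"},
    "text analytics": {"topic": "artificial intelligence", "business_unit": "Data and AI"},
}


def keywords_by(ontology, attribute):
    """Group keywords by the value they hold for one attribute."""
    groups = defaultdict(list)
    for keyword, attrs in ontology.items():
        if attribute in attrs:
            groups[attrs[attribute]].append(keyword)
    return {value: sorted(kws) for value, kws in groups.items()}


print(keywords_by(ONTOLOGY, "topic"))
print(keywords_by(ONTOLOGY, "business_unit"))
```

Because every attribute hangs off the same keywords, adding a new way to slice content means adding one more attribute, not building a new list.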

For example, we have something called the web segment taxonomy. It controls the way we form the URLs for our web pages, among other things. Because it is based on the keyword ontology, we can ensure that new pages are built with the language of the customer in the URL. We can then align the URL semantics with the navigation labels, internal faceted search labels, breadcrumbs, topic tags, social tags, and page headings. The more of these signals you can line up, the easier it is for your audience to find relevant content on your site through search and navigation.
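Here is a small sketch of what lining up those signals can look like when they are all derived from one keyword. The function names, the URL pattern, and the sample segment are hypothetical; our actual URL rules are more involved than this.

```python
import re


def slugify(text):
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_page_signals(segment_path, keyword):
    """Derive URL, breadcrumbs, heading, and topic tag from one keyword.

    segment_path lists taxonomy labels from broadest to narrowest.
    """
    parts = [slugify(label) for label in segment_path] + [slugify(keyword)]
    return {
        "url": "/" + "/".join(parts),
        "breadcrumbs": list(segment_path) + [keyword.title()],
        "heading": keyword.title(),
        "topic_tag": slugify(keyword),
    }


signals = build_page_signals(["Analytics"], "machine learning")
print(signals["url"])          # /analytics/machine-learning
print(signals["breadcrumbs"])  # ['Analytics', 'Machine Learning']
```

Generating every label from the same source keeps the customer's wording consistent across the page, instead of relying on each author to repeat it by hand.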


The keyword ontology and the topic taxonomy are based on an AI classification system, built with Watson Knowledge Studio on TopBraid EDG. You don’t need to use these tools to build something similar. But you do need to understand how to use natural language processing to extract the data and classify it, and how to test and improve the models using machine learning.

The blog format does not allow me to go into the kind of detail I would in a proper book chapter. Fortunately, the industry is full of resources to help you get started in the interim. First, check out my article on cognitive content strategy. Also, here are two of my favorites from InfoQ:

Postscript

In a LinkedIn conversation about this post, I added the following clarification:

I could have been a bit more detailed in my description of our use of the Watson toolkit. We use Watson Natural Language Understanding (NLU) to extract the entities with their confidence scores. And we use Watson Knowledge Studio for modeling and classification. To get the relationships between keywords and brands/product names, we built a custom training set for Watson NLU that consisted of all of our external product documentation on IBM Knowledge Center and developerWorks. Every time a product was mentioned, we extracted all the entities related to it, in the form of unbranded keywords. Then we flipped the model in TopBraid EDG, tagging the unbranded keywords with the most relevant product segments.
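The "flip" described above can be pictured as inverting a product-to-keywords mapping into a keyword-to-product mapping. The sketch below is plain Python, not the Watson NLU API or TopBraid EDG; the product names, confidence scores, and the 0.5 cutoff are invented, and for simplicity it keeps only the single highest-confidence product per keyword.

```python
def flip(product_entities, min_confidence=0.5):
    """Invert {product: [(keyword, confidence), ...]} into
    {keyword: (best_product, confidence)}, dropping low-confidence pairs."""
    best = {}
    for product, entities in product_entities.items():
        for keyword, confidence in entities:
            if confidence < min_confidence:
                continue
            if keyword not in best or confidence > best[keyword][1]:
                best[keyword] = (product, confidence)
    return best


# Hypothetical extraction results: unbranded keywords found near each product.
product_entities = {
    "Product A": [("text analytics", 0.92), ("chatbot", 0.40)],
    "Product B": [("text analytics", 0.71), ("data catalog", 0.88)],
}

for keyword, (product, confidence) in flip(product_entities).items():
    print(f"{keyword} -> {product} ({confidence:.2f})")
```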


James Mathewson

About James Mathewson

James Mathewson is IBM's Distinguished Technical Marketer for search. He has 20 years of experience in web editorial, content strategy, and SEO for large and small companies. A frequent speaker, lecturer and blogger, James has published more than 1600 articles and two books on how web technology and user experience change the nature of effective content. James has two advanced degrees on related subjects from the University of Minnesota.
