If you have content, you tag it. If you have web content, you tag it in your content management system. Your authors do it every day. And it feels free: you never get a bill for content tagging, because people who already work for you spend a few minutes doing it every day. So why would you need to automate content tagging?
For some kinds of content tagging, manual tagging is fine. It's perfectly OK to ask authors to decide what format their content is: they should know whether they are creating a product spec, a white paper, or a case study. But if you ask them to tag a document with the right industry or the right subject? Good luck.
I know that it doesn't sound right. We assume the computer is merely trying to match human accuracy: sure, automation is cheaper, but surely human beings tag things correctly. Well, they don't. At least not when the task is difficult, when they have to choose from a few dozen categories. With that many options, it is genuinely hard to get right.
Don’t believe me?
We have studies for that:
- Humans don't always agree with each other. This is measured as inter-coder agreement, and here is a study that shows how often people disagree with each other. Using one human's opinion as the be-all and end-all of accuracy makes no sense for complex tasks.
- Human coders agree with themselves just 65% of the time. In one study of a complex medical coding task, coders given the same task a couple of days apart fell far short of any reasonable standard of consistency, because they couldn't even replicate their own work.
- Humans fail to agree with each other even on relatively simple tasks. You would be excused for thinking that the first two studies involved very hard tasks; they did. But here is a study showing that two people agree with each other only about three times out of four on a task with just three answers: sentiment analysis. By chance alone they would agree 33% of the time, so that is really low, isn't it? (If you want to run this check on your own content, see the sketch just after this list.)
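If you want to measure this yourself, the arithmetic is simple. Here is a minimal Python sketch that computes raw agreement between two taggers plus Cohen's kappa, which discounts the agreement you would expect by chance; the labels and documents are made up purely for illustration.

```python
from collections import Counter

def agreement_and_kappa(tagger_a, tagger_b):
    """Raw agreement and Cohen's kappa for two lists of labels."""
    assert len(tagger_a) == len(tagger_b)
    n = len(tagger_a)

    # Observed agreement: how often the two taggers picked the same label.
    observed = sum(a == b for a, b in zip(tagger_a, tagger_b)) / n

    # Expected chance agreement, based on each tagger's label frequencies.
    freq_a, freq_b = Counter(tagger_a), Counter(tagger_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(tagger_a) | set(tagger_b)
    )

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical tags from two people labeling the same eight documents.
a = ["positive", "negative", "neutral", "positive", "negative", "neutral", "positive", "positive"]
b = ["positive", "neutral",  "neutral", "positive", "negative", "positive", "neutral",  "positive"]

obs, kappa = agreement_and_kappa(a, b)
print(f"Raw agreement: {obs:.0%}, Cohen's kappa: {kappa:.2f}")
```

Raw agreement alone flatters the numbers, which is why kappa is worth the extra two lines: a kappa near zero means your taggers are doing little better than guessing, no matter how respectable the raw percentage looks.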
So, if you are wondering why you should be looking at AI techniques for content tagging, this is why. People are not very good at it. They don’t agree with each other–geez, they often don’t even agree with themselves. And for complex tasks, where you need authors to choose the right subject or industry from several dozen choices, the results are appallingly bad.
Now, understand, the machines won't be 100% accurate either, not even close. But they can quickly come close to human performance (admittedly, not a high bar), and the silver lining is that they can be improved. Because whatever errors they make are made consistently, we can analyze what is going wrong and fix it. It is much harder to improve human performance on these tasks.
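One concrete way to do that analysis is a confusion matrix over a sample of documents that a person has reviewed. The sketch below assumes you already have the reviewed tags and the model's predictions in hand, and that scikit-learn is available; the industry labels are invented for the example.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical human-reviewed tags and the model's predictions for the same docs.
labels    = ["finance", "healthcare", "retail"]
reviewed  = ["finance", "finance", "healthcare", "retail", "healthcare", "retail", "finance"]
predicted = ["finance", "retail",  "healthcare", "retail", "finance",    "retail", "finance"]

# Rows are the reviewed label, columns the predicted label; off-diagonal
# cells show which categories the model systematically confuses.
cm = confusion_matrix(reviewed, predicted, labels=labels)
print(labels)
print(cm)

# Surface the single most common confusion, so you know where to focus
# new training examples or taxonomy changes.
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
i, j = np.unravel_index(off_diag.argmax(), off_diag.shape)
print(f"Most common mistake: '{labels[i]}' tagged as '{labels[j]}' ({off_diag[i, j]} docs)")
```

Because the machine's mistakes are consistent, the same few off-diagonal cells tend to dominate, and those are exactly the categories worth retraining on or redefining.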
But there is no reason to pit humans against machines like some modern-day John Henry song. The best approach is to use human-in-the-loop techniques, where the machine takes the first shot and "knows" how confident it is about its answer. It can send the uncertain cases to people, who correct the machine if necessary, thus building more training data for the next time. This kind of active learning approach lets the system be taught more about the very cases it is least confident about, improving accuracy rapidly.
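In code, the confidence-routing idea is only a few lines. This is a sketch, not a particular product's API: `classify` and `ask_human` are placeholders for your own model and review workflow, and the threshold is something you would tune on your own data.

```python
CONFIDENCE_THRESHOLD = 0.80  # placeholder; tune against your own review data

def tag_documents(docs, classify, ask_human):
    """Tag documents automatically when confident; route the rest to a person.

    `classify(doc)` is assumed to return (label, confidence) and
    `ask_human(doc)` to return the reviewer's label. Both are hypothetical
    hooks standing in for your model and your CMS review queue.
    """
    tags = {}
    new_training_examples = []  # feeds the next model update

    for doc in docs:
        label, confidence = classify(doc)
        if confidence >= CONFIDENCE_THRESHOLD:
            # High confidence: the machine tags it with no human effort.
            tags[doc] = label
        else:
            # Low confidence: a person decides, and that decision becomes
            # training data for exactly the cases the model struggles with.
            human_label = ask_human(doc)
            tags[doc] = human_label
            new_training_examples.append((doc, human_label))

    return tags, new_training_examples
```

The design point is that human effort is spent only where the machine is unsure, and every minute of that effort is captured as training data instead of evaporating into a tag field nobody audits.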
If you have had blind faith that your content authors are tagging your content properly, maybe it is time to put that to the test. Even if you are skeptical, perhaps you should hand a bunch of tagged documents to a second person in a blind test to see how often they tag them the same way as the original author. The results might be eye-opening.