Just a few days after I blogged about Podzinger, John Battelle has a story about Nexidia, a search engine that is interesting because it takes a different approach. While Podzinger uses speech recognition technology to translate audio into words and then indexes them using standard text retrieval techniques, Nexidia skips the speech recognition step, storing each sound as a phoneme (a unique speech sound) and indexing all of the phonemes.
I know a lot about text retrieval technology, but not much about audio information. So, I need to make some educated guesses about the advantages of Nexidia’s approach.
My guess is that Nexidia translates the search words entered as text into phonemes and matches those against the indexed phonemes, which makes the matching process far more reliable, because translating text to phonemes is far more accurate than translating phonemes to text (speech recognition). Why is this true? One reason, especially important in search applications, is that proper names are frequently missing from speech recognition dictionaries and so are often recognized incorrectly or not at all. Any word that is not correctly recognized won’t be matched at search time to its correct counterpart.
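To make that guess concrete, here is a minimal Python sketch of phoneme-level matching. It is not Nexidia's method, which I haven't seen. The tiny `LEXICON`, the ARPAbet-style phoneme symbols, the `AUDIO` stream, and the `max_errors` tolerance are all assumptions made up for illustration, and the matcher simply counts mismatched phonemes in a sliding window.

```python
# Hypothetical toy pronunciation lexicon (ARPAbet-style symbols).
# A real system would use a full text-to-phoneme converter instead.
LEXICON = {
    "john": ["JH", "AA", "N"],
    "battelle": ["B", "AH", "T", "EH", "L"],
    "search": ["S", "ER", "CH"],
    "engine": ["EH", "N", "JH", "AH", "N"],
}


def text_to_phonemes(query):
    """Translate typed search words into one flat phoneme sequence."""
    phonemes = []
    for word in query.lower().split():
        phonemes.extend(LEXICON[word])
    return phonemes


def find_matches(query, audio_phonemes, max_errors=1):
    """Slide the query's phonemes over the audio's phoneme stream.

    Returns (start_position, mismatch_count) for every window that
    differs from the query in at most max_errors phonemes.
    """
    target = text_to_phonemes(query)
    hits = []
    for start in range(len(audio_phonemes) - len(target) + 1):
        window = audio_phonemes[start:start + len(target)]
        errors = sum(1 for heard, wanted in zip(window, target) if heard != wanted)
        if errors <= max_errors:
            hits.append((start, errors))
    return hits


# Made-up phoneme stream for "search john battelle engine", with the
# "T" in the proper name misheard as "D".
AUDIO = ["S", "ER", "CH", "JH", "AA", "N", "B", "AH", "D", "EH", "L",
         "EH", "N", "JH", "AH", "N"]

print(find_matches("john battelle", AUDIO))
```

In this toy stream the proper name is still found despite one misheard sound, because nothing ever had to decide which dictionary word was spoken. A word-level recognizer without "Battelle" in its dictionary would have had nothing to index.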
For applications where searchers are typing their search queries as text, I would think that a drawback to Nexidia’s approach would be that speech recognition would still be needed to display the title and text of the audio file in the search results page. If Nexidia truly finds the best matches, however, searchers would still be better off, because speech recognition approaches have the same problem. Basically, a speech recognition approach has errors in both matching and snippets while a phonemic approach has those errors only for snippets.
Now, the claims of superiority for a phonemic approach rest on its being more accurate than speech recognition, and that may be changing. David Nahamoo, a colleague at IBM, is demonstrating far more accurate speech recognition than has been seen before. As with most competing technical approaches, it’s hard to tell which one will work better, but there certainly seems to be a lot of innovation around audio search, which ends up being good for searchers and for search marketers looking to attract visitors to their podcasts and other audio content.