Google's manager of speech technologies, Mike Cohen, understands speech on a level most of us never think about -- the level of sound combinations and contextual clues. He has to: he's in charge of the department at Google that works on speech-recognition technology.
Teaching a computer to recognize speech is tricky. To understand English, a machine must clear many hurdles. The English language has a lot of homophones -- words that sound the same but mean different things. Think of "to," "two" and "too." People speaking with an accent or in a regional dialect may pronounce words in a way that's dramatically different from the standard pronunciation. And then there are words like "route" that have alternate pronunciations -- you can say "root" or "rout" and both are correct.
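To get a feel for how context can settle a homophone, here is a toy sketch in Python. It is not Google's system -- the word counts and the `pick_word` helper are made up for illustration -- but it shows the basic idea behind statistical language models: when the recognizer hears the sound "tu," it picks whichever spelling most often follows the previous word.

```python
# Hypothetical bigram counts standing in for a real language model.
# (These numbers are invented for this example.)
BIGRAM_COUNTS = {
    ("want", "to"): 120,
    ("want", "two"): 4,
    ("want", "too"): 1,
    ("bought", "two"): 80,
    ("bought", "to"): 2,
    ("bought", "too"): 3,
}

def pick_word(previous_word, candidates):
    """Return the candidate word most likely to follow previous_word."""
    return max(candidates, key=lambda w: BIGRAM_COUNTS.get((previous_word, w), 0))

homophones = ["to", "two", "too"]
print(pick_word("want", homophones))    # prints "to"
print(pick_word("bought", homophones))  # prints "two"
```

Real systems use far richer context than a single preceding word, but the principle is the same: the sounds alone are ambiguous, and statistics about language break the tie.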
How do you teach a computer to make these distinctions? How can a machine understand what we say and respond appropriately? These are the challenges Cohen and his team face at Google. We spoke with Cohen and asked him to tell us more about his work in speech-recognition research and applications.
On each page, you'll see our questions in the title and Cohen's responses in the body. We started with the basics of speech recognition technology, as you'll see on the next page.