Advances in “Machine Interpretation”: Speech to Text and Back Again

I know this isn’t particularly new news, but after reading a bunch of articles discussing how technology will “revolutionize the language industry” in 2013, I decided to share it. It’s about advances in a sort of Babel Fish or Universal Translator: in other words, a system for instantaneous machine interpretation. Last fall, Microsoft presented its breakthrough in doing this kind of interpretation from English into Chinese.

Right now, the basic process for machine interpretation is this:

  1. Speech-to-text: Using speech recognition software, you create a text of what the speaker is saying. I studied linguistics but was never particularly interested in phonetics, so I don’t know much about the technical side of how speech recognition works.
  2. Machine translation: Now that you have a text, you can run it through a machine translator. When I was in college I knew someone working on a project related to rule-based machine translation (in which, basically, the machine parses the text and uses dictionaries to do the translation). Now, however, statistical methods, which learn translations from large collections of already-translated text, are very popular.
  3. Text-to-speech: I’m guessing that most people are familiar with computers that can “read” text aloud, but this is where the Microsoft project is especially cool. They produced the interpreted speech in Chinese using a voice that sounded like the speaker’s own voice, without him ever having spoken a word of Chinese. Cool, huh?
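
For the programmers reading, here is a toy sketch in Python of how those three steps chain together. To be clear, this is not Microsoft's system or anything close to it. Every name in it (`Audio`, `recognize_speech`, `translate_text`, `synthesize_speech`, `interpret`) is a made-up placeholder, the "audio" is just a text string with a voice label attached, and the translation step is a two-word dictionary lookup, much cruder than either the rule-based or the statistical approach. It only shows the shape of the pipeline.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Hypothetical stand-in for recorded speech: the words plus a voice label."""
    content: str
    voice: str


# Toy English-to-Chinese dictionary (an assumption for illustration only).
TOY_DICTIONARY = {"hello": "你好", "world": "世界"}


def recognize_speech(audio: Audio) -> str:
    # Step 1, speech-to-text (stub): a real recognizer would decode a waveform.
    return audio.content


def translate_text(text: str) -> str:
    # Step 2, machine translation (toy): word-by-word lookup.
    # Unknown words are passed through unchanged.
    words = text.lower().split()
    return "".join(TOY_DICTIONARY.get(word, word) for word in words)


def synthesize_speech(text: str, voice: str) -> Audio:
    # Step 3, text-to-speech (stub): keep the original speaker's voice label.
    return Audio(content=text, voice=voice)


def interpret(audio: Audio) -> Audio:
    # Chain the three steps: speech -> text -> translated text -> speech.
    text = recognize_speech(audio)
    translated = translate_text(text)
    return synthesize_speech(translated, voice=audio.voice)


if __name__ == "__main__":
    spoken = Audio(content="Hello world", voice="speaker-1")
    print(interpret(spoken))
```

The one design point worth noticing is that the voice label travels from the input to the output untouched, which is the toy version of the part of the Microsoft demo I found coolest.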

Read more about Microsoft’s technology for machine interpretation and take a look at this demonstration by Rick Rashid, Microsoft’s chief research officer: