Microsoft is developing software that lets a user speak in one language and, almost immediately, hear their own voice speaking the translation in another. Sci-fi takes another step towards reality.
At an event in Tianjin, China on October 25, Microsoft's chief research officer Rick Rashid demonstrated a new product that translates speech into other languages. Rashid's voice was almost immediately translated from English into Chinese. After he spoke, the words he uttered in English appeared on a screen along with their Chinese translation. Shortly afterward, he was heard speaking Chinese to the cheering audience.
There has been a slew of recent research into voice recognition, particularly into an advance known as "deep neural networks", which are loosely modeled on the way neurons are connected in the human brain and are used to improve speech recognition. By borrowing from how the human brain goes from thought to thought, the software has an easier time predicting which word comes next, based on what has been said previously. This new technique has reduced the errors in speech recognition from 20-25% of the recognized words to just 15%. That is still a lot, but it is a significant improvement.
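To make those percentages concrete, here is a minimal Python sketch of how a speech recognition error rate can be computed and compared. It assumes the figures refer to the standard "word error rate" (word-level edit distance divided by the number of words actually spoken), which the demonstration itself did not spell out. The function names and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction of the old errors that the new system removes."""
    return (old_rate - new_rate) / old_rate


# One wrong word out of six spoken words.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))

# Going from 20-25% down to 15% removes roughly a quarter to two fifths of the errors.
print(round(relative_reduction(0.20, 0.15), 2), round(relative_reduction(0.25, 0.15), 2))
```

Seen this way, a drop from 20-25% to 15% means that roughly 25% to 40% of the earlier mistakes disappear, even though about one word in seven is still wrong.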
The new translator uses these recent improvements to transcribe what a speaker says. Even when the speaker talks unclearly, as one would in a normal conversation rather than when deliberately dictating, the software does fairly well at understanding what is being said. Next, it translates the text, word for word, into the new language, and then reorders the words to fit that language's structure. The final step is to speak the translated text, which the software does in the original speaker's voice. The speaker supplies about an hour of recordings of their own voice, which allows the software to automatically modulate the audio output to sound like the speaker.
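The four stages described above (transcribe, translate word for word, reorder, speak) can be sketched as a toy Python pipeline. This is only an illustration of how the stages chain together, not Microsoft's implementation: the speech recognition and speech synthesis steps are stand-ins that work on plain text, and the four-word glossary and the single reordering rule are invented for this example.

```python
from typing import List

# Hypothetical four-word English-to-Chinese glossary, for illustration only.
GLOSSARY = {"i": "我", "eat": "吃", "apples": "苹果", "tomorrow": "明天"}
TIME_WORDS = {"明天"}


def transcribe(audio_stub: str) -> List[str]:
    """Stand-in for speech recognition: the 'audio' is already text."""
    return audio_stub.lower().split()


def translate_word_for_word(words: List[str]) -> List[str]:
    """Look each word up in the glossary; unknown words pass through unchanged."""
    return [GLOSSARY.get(word, word) for word in words]


def reorder(words: List[str]) -> List[str]:
    """Toy rule: move time words to just after the first word (the subject)."""
    time_words = [w for w in words if w in TIME_WORDS]
    others = [w for w in words if w not in TIME_WORDS]
    return others[:1] + time_words + others[1:]


def synthesize(words: List[str], voice: str) -> str:
    """Stand-in for text-to-speech: tag the output with the speaker's voice."""
    return f"[{voice}] " + "".join(words)


def speech_to_speech(audio_stub: str, voice: str) -> str:
    """Chain the four stages in the order the article describes."""
    words = transcribe(audio_stub)
    words = translate_word_for_word(words)
    words = reorder(words)
    return synthesize(words, voice)


print(speech_to_speech("I eat apples tomorrow", voice="speaker"))
```

In this toy run the word-for-word pass yields 我 吃 苹果 明天, and the reordering pass moves the time word forward to give 我明天吃苹果, which is why translation and reordering are treated as separate steps.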
Rashid has stated that the limits of this technology's accuracy are unknown. However, the most impressive part of the technology is that the software completes the translation almost immediately. In other words, we're not all that far from the translators you see on Star Trek or in Mass Effect, and that is extremely exciting. The future is now! Below is a video demonstration of the technology from the recent event in China: