Skype conversations between people who speak different languages could soon become the norm for an interconnected world. Skype has kicked off the first preview of its real-time translation service for spoken English and Spanish, along with translation options for more than 40 languages within instant messaging conversations.
The preview of the Skype Translator app is available for anyone using Windows 8.1 or Windows 10 Technical Preview on their desktop or tablet devices, the company said early this week. Skype also posted a video showing off the real-time translation during a spoken conversation between Spanish-speaking students in Mexico City and English-speaking students in Tacoma, Wash.
The Skype Translator app currently acts like a third-party interpreter involved in the call. Such a translator bot works by sending the audio streams to speech engines for translation and transcription. That allows it to translate what each person says as soon as he or she has finished talking.
“The automated translator in Skype Translator appears almost as a third speaker,” according to a Skype blog post that explained the new service. “We have seen that customers who are used to speaking through a human interpreter are quickly at ease with the situation. Others require some getting used to this new mode of interaction.”
Skype’s translation software builds upon years of machine learning work by Microsoft Research (Microsoft bought Skype in 2011). The resulting translator combines speech recognition, machine translation, and speech synthesis. The system chops up phrases into individual words before mapping each word over to the other language. It can also filter out the “ahs” and “umms” interspersed throughout normal conversations.
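To make the steps described above concrete, here is a minimal sketch in Python of the text-side portion of such a pipeline: stripping filler words from a transcript, splitting it into individual words, and mapping each word to the other language. It is a toy illustration under stated assumptions, not Microsoft's implementation. The filler list, the three-word lexicon, and every function name are hypothetical, and the speech recognition and speech synthesis stages are left out entirely, with a plain text transcript standing in for recognized speech.

```python
import re

# Hypothetical filler list and toy English-to-Spanish lexicon (illustration only).
FILLERS = {"ah", "ahs", "um", "umm", "uh", "er"}
TOY_LEXICON = {"hello": "hola", "friend": "amigo", "thanks": "gracias"}


def remove_fillers(transcript):
    """Lowercase the transcript, split it into words, and drop filler words."""
    words = re.findall(r"[a-zA-Z']+", transcript.lower())
    return [word for word in words if word not in FILLERS]


def translate_words(words, lexicon):
    """Map each word through the lexicon; unknown words are kept in brackets."""
    return [lexicon.get(word, f"[{word}]") for word in words]


def translate_utterance(transcript, lexicon=TOY_LEXICON):
    """Run the filler filter, then the word-by-word mapping, on one utterance."""
    return " ".join(translate_words(remove_fillers(transcript), lexicon))


print(translate_utterance("Umm, hello, ah, friend"))  # prints "hola amigo"
```

A one-to-one word lookup like this ignores word order, grammar, and context, so it only illustrates the shape of the data flow rather than how a production system would produce fluent output.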
IEEE Spectrum also previously examined how Microsoft researchers trained the translation software to translate casual conversation phrases and terminology found on social media sites such as Facebook. Skype hopes that the newly launched preview phase of Skype Translator will provide even more opportunities to improve Microsoft’s translation and voice recognition services.
Jeremy Hsu has been working as a science and technology journalist in New York City since 2008. He has written on subjects as diverse as supercomputing and wearable electronics for IEEE Spectrum. When he’s not trying to wrap his head around the latest quantum computing news for Spectrum, he also contributes to a variety of publications such as Scientific American, Discover, Popular Science, and others. He is a graduate of New York University’s Science, Health & Environmental Reporting Program.