Whether you find it interesting, scary, or both, it’s fascinating to see elements of science fiction from decades ago become modern technological innovations. Popularised by Star Trek in the late 1960s, the “universal translator” was a plot device designed to explain how humans and aliens on a time-starved TV show could all communicate in English: it delivered a direct audio translation to the listener in the speaker’s own voice. Little more than half a century later, we call this blooming technology machine interpreting, or speech-to-speech translation, and it stands ready to change how we interact with one another.
First and foremost, the term machine interpreting can be broken down into its component words. The word machine comes directly from the Middle French machine, related back to the Latin machina and the Greek makhana (all of which mean ‘device, tool, or machine’). It was first used to mean a general instrument or device constructed to perform a task in 1648, in the English newspaper The Moderate, which wrote (presumably of a balloon): “He hath brought from that Country the invention of a Machine, being Airie, & of a construction so light, nevertheless so sound and firm, that the same is able to bear two men, and hold them up in the Air.” As for the process of using a machine to aid in the translation of languages, the first mention occurs in a 1952 presentation by James W. Perry, entitled Machine Translation of Russian Technical Literature, which was part of the Massachusetts Institute of Technology’s Conference on Mechanical Translation. The second term, interpret (and its gerund interpreting), comes from the Old French interpreter and, further back, the Latin interpretari, meaning ‘to render, explain, or make clear’. It first appears in the 1382 Wycliffite Bible, specifically in Daniel 5:16, where it is written: “I heard of thee, that thou mayst interpret dark things, and unbind bound things.”
Discounting – if that’s possible – the deep learning, machine learning, linguistic variation, and awesome processing power involved, machine interpreting essentially combines three existing technologies: speech recognition, machine translation, and speech synthesis.
Simply put, speech recognition software records what is being said and converts it into text. That text is then processed by a machine translation program, which renders it as text in another language. Finally, the translated text is converted into speech in the target language.
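The three-stage chain described above can be sketched in code. The function names below (`recognize_speech`, `translate_text`, `synthesize_speech`) are illustrative placeholders, not a real library API; in a working system each stage would call an actual speech recognition, machine translation, or text-to-speech engine.

```python
def recognize_speech(audio: bytes) -> str:
    """Stage 1: speech recognition - audio in, source-language text out.
    (Toy placeholder: a real system would run an ASR model here.)"""
    return "where is the station"

def translate_text(text: str, source: str, target: str) -> str:
    """Stage 2: machine translation - source-language text in,
    target-language text out.
    (Toy placeholder: a tiny phrase table stands in for a real MT engine.)"""
    phrase_table = {
        ("en", "de"): {"where is the station": "wo ist der bahnhof"},
    }
    return phrase_table[(source, target)][text]

def synthesize_speech(text: str) -> bytes:
    """Stage 3: speech synthesis - target-language text in, audio out.
    (Toy placeholder: a real system would run a TTS model here.)"""
    return text.encode("utf-8")

def interpret(audio: bytes, source: str, target: str) -> bytes:
    """Chain the three stages: speech recognition -> translation -> synthesis."""
    text = recognize_speech(audio)
    translated = translate_text(text, source, target)
    return synthesize_speech(translated)

print(interpret(b"<raw audio>", "en", "de"))  # b'wo ist der bahnhof'
```

The point of the sketch is the composition: each stage only sees the previous stage’s output, which is also why errors compound, as the next paragraph explains.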
Unfortunately, while the explanation may be simple, perfecting the real-world application isn’t. As anyone who has had to repeat a request to Alexa or Siri several times can attest – or anyone who has marvelled at what speech-to-text software makes of our talking – speech recognition is far from perfect. Compounding the problem, machine translation isn’t perfect either: while it may convey the basic idea presented in a body of text, machine translation alone lacks detailed accuracy, overlooking elements like style, voice, and rhetorical devices.
Progressing from popular science fiction to the cusp of usable technology in little more than 50 years isn’t bad, and even though machine interpreting hasn’t been perfected yet, there’s no reason to worry: according to Star Trek’s timeline, the universal translator won’t be invented for almost another 130 years, so we’re still ahead of schedule.