
The Land of Mordor? How Hackers and Spies Use Google Translate


Pro-Russian sympathizers bearing a Russian flag march past pro-Ukrainian sympathizers gathered and waving Ukrainian flags on March 8, 2014 in Simferopol, Ukraine. Sean Gallup/Getty Images

Google Translate is kind of a miracle. Type any sentence into the box and receive an instant and pretty-darn-accurate translation into any of more than 90 languages, from Afrikaans to Zulu. Google's free service can also translate entire documents or websites, and its smartphone app can decipher words captured in photos and even translate real-time two-way conversations.

The Ukrainian newspaper Ukrainska Pravda has so much faith in Google Translate that it uses the service to publish an online Russian-language edition of the paper. The translations are usually spot on, which is why Russian-language readers were surprised (and a little ticked off) last week to find “Russia” translated as “Mordor” — the seat of evil in J.R.R. Tolkien's "Lord of the Rings" — and Russia's Foreign Minister Sergey Lavrov rendered as “sad little horse.”

Ukraine remains a hotbed of political unrest following Russia's 2014 annexation of Crimea. While armed skirmishes have ceased, there's still no love lost between pro-Ukrainian activists and pro-Russian separatists. The Google-translated version of Ukrainska Pravda also rendered the word “Russian” as “occupant.”

The provocative mistranslations lasted a day before Google fixed the problem, which a spokesman characterized as a normal technical glitch. But some security experts believe that the “Mordor” references and other political jabs couldn't possibly be the result of a computer error. Google Translate was clearly hacked. But how?

To unravel the mystery, HSW spoke with Michael Shoukry, a cybersecurity and intelligence expert with FireEye, an international firm that specializes in identifying and neutralizing sophisticated cybersecurity threats.

Shoukry doesn't buy Google's assertion that the “Mordor” translations resulted from the normal goofiness of machine learning — when computers try to make connections between millions of snippets of text online and occasionally get it wrong.

“I don't think this is a glitch by any means, because these words have meaning to people. They show intent,” says Shoukry. “A computer couldn't think of these things on its own.” 

Lorem Ipsum and All That Latin

Shoukry knows a thing or two about hacking Google Translate. He spent a year researching an earlier oddity in Google Translate involving the ubiquitous Latin placeholder text known as lorem ipsum. Tipped off by an anonymous security researcher named “Kraeh3n,” Shoukry entered Latin words and phrases into Google Translate that returned bizarre and intriguing results:

“lorem ipsum” = China

“ipsum lorem” = the Internet

“Lorem Ipsum” = NATO

“Ipsum Lorem” = the Company (a common nickname for the CIA)

As with the Russian “Mordor” incident, Shoukry's gut reaction was that the lorem ipsum translations couldn't be coincidental. They smacked of cryptography: the kind of coded messages sent by government spies, political activists and other covert groups trying to communicate secretly in plain sight.

“Being in the research and intelligence community, it's something that everyone knows about,” Shoukry says. “People try to pass secret messages all the time.”

The lorem ipsum hack was brilliant in its simplicity. It took a chunk of pre-existing text — in this case a string of Latin words that all web designers and publishers use to create “dummy” layouts — and turned it into a cryptography “key.”

“You don't even have to send any actual content” to relay a coded message, says Shoukry. “I can just tell you, go to template five, page three, paragraph four, word 42, and then repeat it 142 times, and you have your secret message. I could just send you a series of numbers.”
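The scheme Shoukry describes resembles a classic book cipher: both parties hold the same text, and only positions within it are ever sent. Here is a minimal Python sketch of that idea. The sample text, the (page, paragraph, word) coordinate layout and the function names are all assumptions made for illustration, not details of any real incident.

```python
# Hypothetical shared text that both parties already hold, so it is never sent.
# Layout assumed here: a list of pages, each page a list of paragraphs.
SHARED_TEXT = [
    [  # page 0
        "lorem ipsum dolor sit amet",
        "consectetur adipiscing elit sed do",
    ],
    [  # page 1
        "eiusmod tempor incididunt ut labore",
        "et dolore magna aliqua ut enim",
    ],
]


def encode(words, text):
    """Map each word to the (page, paragraph, word) index of its first occurrence."""
    coords = []
    for target in words:
        found = None
        for p, page in enumerate(text):
            for q, paragraph in enumerate(page):
                tokens = paragraph.split()
                if target in tokens:
                    found = (p, q, tokens.index(target))
                    break
            if found is not None:
                break
        if found is None:
            raise ValueError(f"{target!r} does not appear in the shared text")
        coords.append(found)
    return coords


def decode(coords, text):
    """Look each (page, paragraph, word) index back up in the shared text."""
    return [text[p][q].split()[w] for p, q, w in coords]


# Only these numbers would travel between sender and receiver.
message = encode(["magna", "lorem", "elit"], SHARED_TEXT)
recovered = decode(message, SHARED_TEXT)
```

Anyone who intercepts the numbers learns nothing without the shared text, which is why a universally available placeholder like lorem ipsum makes a convenient reference.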

How Google Translate Works

Google Translate is an amazing technical tool but it is susceptible to human interference.
(c) 2016 HowStuffWorks


Most of the heavy lifting of Google Translate is accomplished by supercomputers that spend all day analyzing millions of Web pages to find patterns between chunks of text and their existing translations. That's called statistical machine translation, and while Google's computers are certainly more likely to make mistakes with Latin (because there's a relative scarcity of Latin text online), Shoukry says the real problem is a human one.
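To make the pattern-matching idea concrete, here is a deliberately tiny Python sketch: count how often each source phrase is paired with each translation, then pick the most frequent one. This is a toy model, not Google's actual system, and the phrase pairs and counts are invented for illustration.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each source phrase is seen with each translation."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def summary(table, phrase):
    """Return (top translation, its share of observations, total observations)."""
    if phrase not in table:
        return None
    counts = table[phrase]
    total = sum(counts.values())
    best, best_count = counts.most_common(1)[0]
    return best, best_count / total, total


# Hypothetical corpus: one phrase is well attested, the other is rare.
corpus = (
    [("hello", "hola")] * 950
    + [("hello", "buenas")] * 50
    + [("dolor sit", "pain sits")] * 2
    + [("dolor sit", "sorrow remains")] * 1
)
table = train(corpus)

common = summary(table, "hello")    # backed by 1,000 observations
rare = summary(table, "dolor sit")  # backed by only 3 observations
```

With only three observations behind it, the rare phrase's top translation is far less trustworthy, which mirrors the point about the scarcity of Latin text online.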

“One of Google's missions is to learn from its users, and as part of that, Google really wants to dig deep and understand linguistics,” Shoukry says. Google Translate is a cool service for the public, but it's also a way for Google's computers to refine their language skills to deliver better search results.

That's where the Google Translate community comes in. Google invites multilingual users to help translate words and phrases into their native tongue. You can either translate text yourself or validate other people's translations. In both cases, Google allows users to “check” the power of its own supercomputers. It also leaves open a door to hackers.

“Google heavily relies on end users and content that is published online,” Shoukry explains. “So any language can be altered, because Google is relying on information that might not be true.”

To be clear, a handful of pranksters wouldn't be able to trick the system by repeatedly entering the wrong translations. A deception the size of the lorem ipsum hack or the Russian “Mordor” incident requires “a significant amount of computer power” and technical sophistication, says Shoukry.
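A rough way to see why scale matters: if a system simply picked the most frequently observed translation, bogus entries would have to outnumber the legitimate ones before the output changed. The Python sketch below uses that simple majority rule, which is an assumption for illustration only. How Google actually weighs user contributions is not described here, and every count is made up.

```python
from collections import Counter


def top_translation(counts):
    """Return the most frequently observed translation."""
    return counts.most_common(1)[0][0]


def bogus_needed(counts, bogus):
    """Smallest number of extra bogus observations that makes `bogus` the clear winner."""
    best_other = max((n for t, n in counts.items() if t != bogus), default=0)
    return max(0, best_other - counts[bogus] + 1)


# Hypothetical observation counts for a well-attested word and a rare phrase.
well_attested = Counter({"Russia": 10_000, "Russian Federation": 300})
rare_phrase = Counter({"pain itself": 3})

needed_common = bogus_needed(well_attested, "Mordor")
needed_rare = bogus_needed(rare_phrase, "China")

# A handful of prank submissions leaves the well-attested word unchanged.
after_pranks = well_attested + Counter({"Mordor": 5})
still_top = top_translation(after_pranks)

# Only a flood of submissions flips it.
after_flood = well_attested + Counter({"Mordor": needed_common})
flipped_top = top_translation(after_flood)
```

Under this toy rule, the well-attested word takes thousands of false entries to flip while the rare phrase takes only a few, which fits both the need for significant computing power and the vulnerability of sparse languages like Latin.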

Google has since “fixed” the lorem ipsum hole by refusing to translate the text at all. But as long as Google Translate gives regular users the ability to translate and validate by hand, there's room for “error,” intentional or otherwise. 

