26 September 2013 – Last week we ran a post about Google’s quest to end the language barrier and its possible effect on non-English e-discovery document reviews. We wrote about the long road toward making automated language translation easier, faster and more reliable, and noted that a world of seamless, immediate translation is still out of our grasp. But it is getting better and better: Google, Facebook, IBM and Microsoft are all devoting gobs of money to perfecting instant, seamless translation, and Google, IBM and Microsoft are creating special legal translation units. We also discussed the first machine translation architectures, developed by IBM and based on mathematical models. For that full post click here.
Yesterday the MIT Technology Review published an article that goes deeper, beginning with a fabulous first line: “To translate one language into another, find the linear transformation that maps one to the other. Simple, say a team of Google engineers”.
The article goes on to say that Tomas Mikolov and a couple of his pals at Google in Mountain View have
“… developed a technique that automatically generates dictionaries and phrase tables that convert one language into another. The new technique does not rely on versions of the same document in different languages. Instead, it uses data mining techniques to model the structure of a single language and then compares this to the structure of another language. ‘This method makes little assumption about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs,’ they say.”
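For readers who want to see what "find the linear transformation that maps one to the other" might look like in practice, here is a minimal sketch in Python. It is not Google's code. It assumes each word is already represented as a vector in its own language, and that we have a small seed dictionary of known word pairs to fit the mapping on; both are our assumptions for illustration. The three-dimensional vectors, the English and Spanish word lists, and the function names `learn_mapping` and `translate` are all made up for the example.

```python
import numpy as np

def learn_mapping(src_vectors, tgt_vectors):
    """Fit a matrix W so that src_vectors @ W approximates tgt_vectors
    (ordinary least squares over the seed dictionary)."""
    W, *_ = np.linalg.lstsq(src_vectors, tgt_vectors, rcond=None)
    return W

def translate(word, W, src_vocab, src_vectors, tgt_vocab, tgt_vectors):
    """Map a source word into the target space and return the target
    word whose vector has the highest cosine similarity."""
    mapped = src_vectors[src_vocab.index(word)] @ W
    norms = np.linalg.norm(tgt_vectors, axis=1) * np.linalg.norm(mapped)
    similarities = (tgt_vectors @ mapped) / norms
    return tgt_vocab[int(np.argmax(similarities))]

# Invented toy "word vectors" (real ones would have hundreds of dimensions).
english = ["one", "two", "three", "four", "five"]
english_vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
])

# Assumption: the Spanish space is the English space rotated by 90 degrees,
# so the two languages share the same geometric structure.
rotation = np.array([
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
])
spanish = ["uno", "dos", "tres", "cuatro", "cinco"]
spanish_vectors = english_vectors @ rotation

# Fit the mapping on the first four word pairs only; "five" is held out.
W = learn_mapping(english_vectors[:4], spanish_vectors[:4])

held_out = translate("five", W, english, english_vectors, spanish, spanish_vectors)
print(held_out)
```

Because the toy Spanish vectors are an exact rotation of the English ones, the fitted matrix recovers that rotation and the held-out word lands on its counterpart. Real language pairs are far messier, so treat this as a picture of the idea rather than a working translator.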
Ah, the sheer beauty of algorithms. It’s a very interesting post and includes a link to Mikolov’s detailed explanation. For the full MIT Technology Review piece click here.