Monday, May 23, 2005

Google Translator: The Universal Language

"Still, many people can�t speak English. The collected, shared knowledge that makes up the web is therefore only partly accessible to them. The reverse, of course, is true as well. When you surf the web, you will sometimes come across languages and characters you don�t understand � like Chinese, Arabic, Korean, French, German, Italian, Spanish, or Japanese. Would you be able to fluently read these languages, those sites wouldn�t be a dead end for you. You would discover a wealth of knowledge, and more importantly, opinions. If you�re an US citizen, how many Arabic, German or French sources do you read to get a good understanding of how the world sees the US? How many blogs do you read in foreign languages? Probably not many, unless you�re fluent in those languages."
How do they do that? It's certainly complex to program such a system, but the underlying principle is easy, so easy in fact that the researchers working on this enabled the system to translate from Chinese to English without any of them being able to speak Chinese. The translation system treats every language the same way, and there is no manually created rule-set of grammar, metaphors and such. Instead, the system learns from existing human translations. Google relies on a large corpus of texts which are available in multiple languages.
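
To make the idea of learning from existing translations a little more concrete, here is a toy sketch in Python. It is not Google's system, and nothing in it comes from Google: the four-sentence English/German corpus is invented for illustration, and the scoring method (a simple Dice co-occurrence score) is just one easy way to show the principle. The names PARALLEL, train, best_translation and translate are all made up for this example. The point is only that the program contains no grammar rules and no dictionary, yet it still works out which words belong together by counting how often they show up in matching sentences.

```python
from collections import Counter
from itertools import product

# Hypothetical toy parallel corpus (English, German), invented for this sketch.
PARALLEL = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
    ("a book", "ein buch"),
]


def train(pairs):
    """Count, per sentence pair, which source and target words occur together."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return src_count, tgt_count, joint


def best_translation(word, model):
    """Pick the target word with the highest Dice score for a source word."""
    src_count, tgt_count, joint = model
    scores = {
        t: 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
        if s == word
    }
    # Words never seen in training have no candidates at all.
    return max(scores, key=scores.get) if scores else None


def translate(sentence, model):
    """Word-by-word translation; unknown words are passed through unchanged."""
    return " ".join(best_translation(w, model) or w for w in sentence.split())


if __name__ == "__main__":
    model = train(PARALLEL)
    print(translate("the book", model))
```

A real statistical translation system has to deal with word order, phrases, and far noisier data, so this only illustrates the counting principle. Note that nothing in the code is specific to English or German; swapping in sentence pairs from any two languages would work the same way, which matches the description above of a system that treats every language alike.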

