Thursday, March 22, 2007

Tower of Babel redux

MurryC is right: the Tower of Babel still exists, in spite of my protestations to the contrary, but this will change. As MC correctly pointed out, Chinese will be the most commonly used language on the web, but... Marty added that China is or will be the largest English-speaking country in the world, sooooo... what gives?

Well, "good" machine translation is still a work in progress. To wit, click Raptop Computer to read an unintentionally hilarious English translation from a company that puts modern computer hardware inside outrageous old-fashioned typewriter packaging. The tech is very cool; the translation from Japanese to English is not. But just five years ago, doing this at all would have been all but impossible, because the universal ISO/IEC 10646 (Unicode) character set, able to handle all the characters of the world's major 140 languages, had not been universally accepted by the online community and integrated into all major web browsers. Now that it has, machine translation for writing and speech is starting to rock, as seen in the seminal work being done by IBM.

What's interesting about this approach to speech translation is its use of semantics, heuristics and genetic algorithms combined with neural nets, which enable the system to learn the idioms of the languages in its environment. Google, AT&T and Microsoft, among others, are also getting into the act, because this is big-time business that's poised to take off now that the software that powers the web can also power language translation, something that can truly eliminate the TOB syndrome that has plagued planet Earth since the beginning of time. Minority Report is near because the tech is near. "and the beat goes on..."