
Obviously Google Translate is not error-free, nor will any statistical translation system be comparable to a human translator in the very near future, but you're underestimating the current state of statistical translation. Granted, I'm not a native speaker, but I don't think "I need to meet up" is even a grammatically complete sentence. The underlying model probably predicted something like meeting (satisfying) requirements because the sentence lacks an object and context. Situations like this, where the input is very short and noisy, are obviously going to be a weakness of statistical systems for a long time to come. But looking at how far we are, technologically, from mastering biological systems, I think it's safe to say this is going to be the approach for a while, and it will be very successful at translating properly structured text if proper context can be provided.

Currently, statistical translation has (almost) no awareness of context beyond what some phrase-based or hierarchical models capture. Many people are probably not factoring in that, with exponentially more data and exponentially more computing power, a model could use the context of a whole book while translating a single sentence from it, which is still much less context than human translators use. While translating a sentence, I might even have to draw on what was on the news the night before to infer the correct meaning. We are definitely far from feeding that kind of information to our models, so I'd say this kind of criticism of statistical translation is very unfair.
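
To make that concrete, here is a minimal toy sketch in Python (emphatically not Google's actual pipeline; the phrase table and probabilities are invented for illustration) of how a greedy phrase-based decoder, which sees nothing beyond the phrases in front of it, ends up choosing the "satisfy" sense of "meet":

    # Toy phrase-based decoder: every phrase is translated by its most
    # probable entry, with no document- or discourse-level context.
    # All entries and probabilities below are made up for illustration.
    phrase_table = {
        "need to": [("需要", 0.9)],
        "meet":    [("满足", 0.6),   # "satisfy (a requirement)" - more common in training text
                    ("见面", 0.4)],  # "meet (a person)"
        "up":      [("", 0.8)],      # particle frequently dropped in translation
    }

    def translate(sentence):
        words = sentence.lower().split()
        out, i = [], 0
        while i < len(words):
            for span in (2, 1):                      # try the longest matching phrase first
                phrase = " ".join(words[i:i + span])
                if phrase in phrase_table:
                    best = max(phrase_table[phrase], key=lambda t: t[1])[0]
                    out.append(best)
                    i += span
                    break
            else:
                out.append(words[i])                 # pass unknown words through
                i += 1
        return "".join(out)

    print(translate("I need to meet up"))  # -> "i需要满足": the "satisfy" sense wins without context

A decoder like this simply cannot see the surrounding conversation that would disambiguate "meet"; a wider-context model (or a human) could.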



"We need to meet up" also translates incorrectly "我们需要满足". In fact, I did not originally use a fragment, I wrote a full sentence that Google repeatedly incorrectly translated. I only used a fragment here to simply my example.

To avoid the wrath of the Google fanboys, a better example would have been the pinnacle of statistical AI: the category was "U.S. Cities" and the clue was "Its largest airport is named for a World War II hero; its second largest for a World War II battle." The human competitors Ken Jennings and Brad Rutter both answered correctly with "Chicago," but IBM's supercomputer Watson said "Toronto."

Once again, Watson, a probability-based system, failed where real intelligence would not.

Google has done an amazing job with their machine translation, considering they cling to these outdated statistical methods. And just as the speech recognition field has found out over the last 20 years, they will continue to get diminishing returns until they start borrowing from nature's own engine of intelligence.


You are exhibiting a deep misunderstanding of human intelligence.

Ken Jennings thought that a woman of loose morals could be called a "hoe" (with an "e", which makes no sense!), when the correct answer was "rake". Is Ken Jennings therefore inhuman?




