…but that’s OK.
Here’s what I actually wanted to put up:
Machine Translation: I’m Sick of Waiting
ARTICLE DATE: 09.18.06
By John C. Dvorak
The way I see it, if computers can now play a credible world-class game of chess, then they should be able to translate complex sentences written in the world’s major languages. They should be able to translate to and from English, to and from French, and to and from Russian. I eventually expect a translation to and from Chinese and Japanese, too. Exactly what’s the hangup?
We have the computing power to make this work, so why don’t governments all demand it? Throw $10 billion at the problem, and I bet it is resolved sooner rather than later. $10 billion is less than the cost of one month of the Iraq war, just for comparison.
My French has been in decline since 1973, but I sure know enough to find machine French-to-English translations to be an abomination. For example, with rare exceptions, if you go to a wine site to find out about the latest conditions in Bordeaux, these systems will invariably translate the word chateau as “castle,” despite that Americans (and most English-speaking nations) use the word “chateau” as such. And, in fact, it is always used when referring to a Bordeaux winery such as Chateau Margaux. That’s the name of the place.
For some unknown reason, no translation system can understand this simple fact. Is this rocket science? It’s not the Castle of Margaux or Margaux Castle. How hard is this? It gets even worse if the name of the place itself can be translated. Chateau Latour comes back as the Tower Castle or Castle Tower. What good is it to read that “In big news this week Castle Tower was sold to French investors?” Wha? Huh?
This is just one example, but to me, since it is so obvious and brain-dead, it epitomizes the problem. Apparently nobody is really trying very hard. THAT’S the problem.
Let’s consider an example from Sudouest, an excellent regional newspaper in France. It has a blog at sudouest.com, and I decided to cut a chunk of text and run it through various systems. Here’s the text.
Vous avez essayé d’obtenir une bourse d’étude mais celle-ci, fonction des revenus de vos parents, vous a été refusée. Vous trouvez le cas injuste? Comment cela risque-t-il d’influer sur votre scolarité? Partagez votre expérience sur ce blog en utilisant le lien “Commentaire” ci-dessous..com
A reasonable translation is as follows:
You tried to obtain a scholarship, but it was refused because of your parents’ income. Did you think this was unfair? Tell us how you think it would affect your schooling and tell us your experiences by adding a comment to the blog. Click on the comment button below.
Here are the various results from the online translation systems:
Babelfish:
You tried to obtain a grant but this one, function of the incomes of your parents, was refused to you. You find the case unjust? How that is it likely to influence your schooling? Share your experiment on this blog by using the bond “Comment” below.
Grade: D. That should be experience, NOT experiment. And bond should be link or button. Grammar screwed up.
Freetranslation.com:
You tried to obtain a study stock market but this one, function of the income of your parents, you was refused. You find the unfair case? How that does it risk influencing on your schooling? Divide your experience on this blog while using the link “Commentary” here-under. com
Grade: F. Just horrible—and wrong! (The site offered a human translation of these few sentences for $50.)
PROMT-Online:
You tried to acquire a grant but this one, function of the incomes of your parents, was refused you. You find unfair case? How does it risk having an influence on your schooling? Share your experience on this blog by using link “Comment” below com
Grade: B. You can actually make sense of this. It’s the clear winner here.
WorldLingo:
You tried to obtain a grant but this one, function of the incomes of your parents, you was refused. You find the case unjust? How that is it likely to influence your schooling? Share your experiment on this blog by using the bond “Comment” below.COM
Grade: D. Similar errors to the Babelfish translation, with additional errors. What does it take to use the proper noun “were” instead of “was?” You was? Please.
The computer revolution began a half-century ago. We should have been able to solve this problem by now. What we need is government resolve, because private industry can’t seem to manage it.
–http://www.pcmag.com/article2/0,1759,2017280,00.asp
I think part of the problem might be that the people who work in the computer industry CAN’T EVEN TELL THE DIFFERENCE BETWEEN A NOUN AND A VERB. How can they be expected to know the difference between ‘was’ and ‘were’? Never mind that it’s practically impossible to take into account all the subtleties of a language. And don’t forget the importance of context. This is one of my favourite rant topics. I really have to shut up now or else I’ll be here all night.
Elena Temnova said
The problem with “Chateau Margaux” can be solved by adding this word combination to the French-English dictionary (integrated into the corresponding MT system) with the corresponding translation. So, we will see at least one “abomination” less in the target text. Most desktop MT systems have this feature, e.g. PROMT, Systran etc.
It’s simplicity itself compared to the other part of the problem: telling a verb from a noun in a complicated context. And we have to remember that, unlike the game of chess, a natural language can produce an incalculable number of combinations. There are only 64 squares and 32 pieces in a chess game, and the rules are invariable, whereas a language has millions of words, and each word can have tens of meanings and can be translated in different ways, and everything in language is changeable, and each rule can include many exceptions. That’s why the only way to obtain a perfect machine translation is building a computer provided with a human brain.
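To make the user-dictionary idea concrete, here is a minimal Python sketch. It is not how PROMT or Systran actually implement the feature; the toy lexicon, the dictionary entries, and both function names are made up for illustration. The idea is simply to swap protected phrases for placeholders before the engine sees them, then restore the approved rendering afterwards.

```python
# Hypothetical user dictionary: source phrases mapped to a fixed target rendering.
USER_DICTIONARY = {
    "Château Margaux": "Chateau Margaux",
    "Château Latour": "Chateau Latour",
}

# Toy word-for-word lexicon standing in for a real MT engine (an assumption,
# deliberately naive so that it reproduces the "Castle Tower" mistake).
TOY_LEXICON = {
    "château": "castle",
    "latour": "tower",
    "a": "has",
    "été": "been",
    "vendu": "sold",
}


def naive_translate(text):
    """Translate word by word; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(word.lower(), word) for word in text.split())


def translate_with_user_dictionary(text, user_dictionary=USER_DICTIONARY):
    """Shield dictionary phrases with placeholders, translate, then restore them."""
    placeholders = {}
    # Longest phrases first, so a longer entry is never split by a shorter one.
    for i, source in enumerate(sorted(user_dictionary, key=len, reverse=True)):
        if source in text:
            token = f"__TERM{i}__"
            text = text.replace(source, token)
            placeholders[token] = user_dictionary[source]
    translated = naive_translate(text)
    for token, target in placeholders.items():
        translated = translated.replace(token, target)
    return translated


print(naive_translate("Château Latour a été vendu"))
print(translate_with_user_dictionary("Château Latour a été vendu"))
```

The first call gives the “castle tower” kind of output complained about in the article; the second keeps the winery name intact. The hard part described above, telling a verb from a noun in context, is untouched by this trick.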
The Daily Distracter said
You’re right (even if WordPress decided your comment was spam and I had to rescue it…). There’s just nothing quite like the human touch!
Percy Balemans said
Oh, don’t get me started on this…
It’s not just the people in the computer industry who simply don’t have a clue about what it takes to create a good translation; if I got paid every time someone claimed that Babel Fish is “great for translating your website” or even “for learning a language”, I’d be rich by now…
I think one of the main problems is that, since everyone speaks a language, people automatically assume it doesn’t require a special skill and it’s therefore something a computer could easily do. At the same time, it doesn’t take a lot of language skills to see that the “translations” Babel Fish produces are rubbish, so why do people still think it’s such a great tool?
The Daily Distracter said
I know, I really can’t understand it either. People are quite happy to use Babelfish to translate into a language they don’t know, yet when they get something that’s been translated back by machine translation and see how bad it is, they don’t seem to make the connection that it’s just not going to work in ANY language.
I’ve used Babelfish to translate things from languages I don’t know very well – but in the end I’ve found it easier to just get out my dictionaries and do it myself, especially if there’s even the slightest ambiguity in the text. Obviously, machine translation does have its place, but people are continually overrating what it can and can’t do. I accept that if I have to use it to translate say….oh I don’t know…Chinese (a language I don’t speak at all except to proudly announce in Mandarin that I’m wearing a white shirt and count to ten) then there are going to be a lot of mistakes, not only grammatically, but also in the vocabulary.
I think the problem is that people either won’t or can’t recognise the faults in machine translation, and until they do, the problems will continue. People are happy with the standard, so the computer industry won’t help. Added to that is the fact that, as Elena said “the only way to obtain a perfect machine translation is building a computer provided with a human brain,” simply because the nuances of any language can be so subtle. Very few people speak one language perfectly, fewer still can speak two or three languages perfectly – yet we think that a computer should be able to do it. And that’s not even taking into account the special skills required in TRANSLATING from one language to another.
And I never meant to go into any of this, honest I didn’t – I just meant to see why I hadn’t gotten an e-mail from WordPress that they were going to see me…and I saw another comment here…and it kind of triggered me off…I’d better stop now, before I spend the rest of the afternoon on this.
Elena Temnova said
A natural language has two levels. The first is the invariable basis: it can be represented as a set of morphological and syntactic rules, a vocabulary with a fixed number of words, each having a restricted number of translations in different contexts, and some fixed phrases, proverbs etc. Thanks to all this, people speaking the same language can understand each other.
The second level is variable and flexible; it is the pledge of the language’s development and flexibility. Thanks to this second level we can express our thoughts in a non-standard, individual way, and create literary texts and poetry. If there were only the first level, the constant one, creating a 100 per cent reliable MT system would be a realistic task, although extremely complicated and expensive. As for the second level, where the language’s variance lies, sometimes even a professional human translator cannot understand some of its nuances.
Besides, the language system is not a mere set of rules, but also a huge base of knowledge about the world reflected in this language. If I ask a child the question – “Can you tell me the name of an animal, green-coloured, small, which jumps and croaks?” – he would say: “Surely, it’s a frog!”. The most high-powered computer would never give me such a reply if it doesn’t contain this knowledge.
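The frog question can be sketched in a few lines of Python, assuming a hand-built table of world knowledge. The `KNOWLEDGE_BASE` contents and the `name_animal` function are hypothetical, invented only to illustrate the point: the lookup itself is trivial, and everything depends on whether the knowledge is there.

```python
# Hypothetical mini knowledge base: each animal maps to the attributes
# the program "knows" about it.
KNOWLEDGE_BASE = {
    "frog": {"green", "small", "jumps", "croaks"},
    "grasshopper": {"green", "small", "jumps"},
    "raven": {"black", "croaks", "flies"},
}


def name_animal(clues, knowledge_base=KNOWLEDGE_BASE):
    """Return every animal whose known attributes cover all of the clues."""
    clues = set(clues)
    return [name for name, attrs in knowledge_base.items() if clues <= attrs]


question = {"green", "small", "jumps", "croaks"}
print(name_animal(question))                     # answered from stored knowledge
print(name_animal(question, knowledge_base={}))  # no knowledge, no answer
```

With the table filled in, the program answers as the child does; with an empty table, the same code has nothing to say, however fast the machine running it.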
When I analyze the errors in machine-translated texts, I can see that a certain number of them are due to gaps in the developers’ work. Well, nothing is perfect in this world, and these gaps can be amended sooner or later. The rare words and collocations absent from the computer dictionaries can be added and translated, and the complicated syntactic constructions can be correctly processed. But some errors are “beyond the reach of remedies”, as the doctors say. It means that we will NEVER see an irreproachable machine translation, whatever we do and invest in it, and human translators will never starve, but a good MT system, providing a real basis for understanding a text, is quite a realistic goal.