| H-1173 Budapest, Pesti út 8-12.
| Phone: +36 (1) 202-0202, +36 (20) 919-4153 | E-mail: multi@lingua.hu

Our latest news

Bing Translator vs Google Translate – The right text for the right tool

Does the use of free translation tools immediately mean decreased quality, even in the year 2014 when technologies develop and the language industry grows online faster than ever?

Two tools, five languages, five genres

To assess the quality of translations produced by machine translation tools, a team of language experts at blog.bab.la compiled a small specialized corpus of 50 English sentences drawn from publicly available sources. The corpus was divided into five genre-specific sub-corpora: Twitter, literature, news headlines, food recipes and legal texts. The ten sentences in each genre were then translated into five languages - Finnish, Swedish, French, Russian and Dutch - using two online translation tools, Google Translate and Bing. Each translated sentence was then rated on a scale from 0 to 4. A score of zero indicated, for example, several untranslated words, severe grammar mistakes or overall incomprehensibility. To earn a four, the translation had to be close to what a human translator would produce, although some stylistic or contextual problems were allowed.
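The arithmetic behind the scores reported below follows directly from this setup: ten sentences per genre, each rated 0-4, gives a maximum of 40 points per genre per tool, and 200 points per language across all five genres. A minimal sketch of this scoring scheme (the function names and the example ratings are our own illustration, not the team's actual data or code) might look like this:

```python
# Hypothetical sketch of the rating scheme described above.
# 10 sentences per genre, each rated 0-4, so a tool can earn at most
# 40 points per genre and 200 points per language (5 genres x 40).

GENRES = ["twitter", "literature", "headlines", "recipes", "legal"]

def genre_score(ratings):
    """Sum the 0-4 ratings for one genre (10 sentences -> max 40)."""
    assert len(ratings) == 10 and all(0 <= r <= 4 for r in ratings)
    return sum(ratings)

def language_score(ratings_by_genre):
    """Total across all five genres -> max 200."""
    return sum(genre_score(ratings_by_genre[g]) for g in GENRES)

# Made-up ratings averaging 2.4 per sentence, i.e. 24 points per genre:
example = {g: [3, 2, 3, 2, 2, 3, 2, 3, 2, 2] for g in GENRES}
print(language_score(example))  # 120 - the level Swedish reached overall
```

A language scoring a flat 120, as Swedish did, thus corresponds to an average rating of 2.4 out of 4 per sentence.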

From Twitter to national law: does machine translation stand any chance?

Overall, all languages ranked quite low compared with translations produced by humans. Swedish, with 120 points out of a possible 200, was the clear winner in the quality comparison, followed closely by Dutch, Russian and French. Finnish was left far behind with its 60 points. In the battle between Google Translate and Bing, the former was generally judged to produce the better translations.

The top-ranked Swedish and the second-ranked Dutch are both languages of Germanic origin, like the source language English, while the group's backmarker, Finnish, differs drastically from English in terms of syntax and word order. The middle-position holders, French and Russian, are also of non-Germanic origin.

Figure 1. Quality scores of Twitter according to language
(max. points attainable 40 per tool)

Twitter as a genre received the lowest overall quality score. For Finnish and Dutch, Bing produced the better translations, while for the remaining three languages Google Translate came out ahead. The low score may be due to the structurally complicated nature of Twitter posts: a great deal of information is condensed into a small number of characters, often interspersed with special characters, and machine translation tools seem unable to process these conventions.

Figure 2. Quality scores of literary texts according to language
(max. points attainable 40 per tool)

Literature scored second lowest in terms of quality, which probably comes as no surprise to many. Literary language often contains poetic expressions, complicated structures and other effects that make literary pieces original. What is interesting in the team's findings, though, is that here the Germanic languages Dutch and Swedish scored lower than non-Germanic French and Russian. Overall, Google Translate takes the win in this genre as well, beating Bing in all but one language.

Figure 3. Quality scores of news headlines according to language
(max. points attainable 40 per tool)

Like the two genres above, news headlines provoked some interesting variation between the languages. Somewhat surprisingly, French scored the lowest, while the other three big languages (Swedish, Russian and Dutch) produced relatively good results. Once again, Google Translate proved the better-performing tool.

Figure 4. Quality scores of food recipes according to language
(max. points attainable 40 per tool)

The sub-corpus of food recipe sentences was the genre in which the translation tools produced their best results. Google Translate again dominated, and, especially noteworthy, in Dutch it produced fully comprehensible results considered almost as good as those of a human translator. The good quality of translations in this genre is probably due to simple sentence structures. Unlike in some other categories, however, the lexis proved to be the tools' Achilles heel: they failed to distinguish polysemous words, such as "to serve" and "icing", which lowered the score especially for Finnish.

Figure 5. Quality scores of legal texts according to language
(max. points attainable 40 per tool)

Legal texts provided another triumph for Google Translate: it beat Bing in all five languages. Overall, however, the scores were relatively low in all languages except Swedish and French, most likely because of the complicated, field-specific terminology. Interestingly enough, in this genre several sentences were given a four while equally many deserved only a zero or a one, so no definite conclusion can be drawn from the patterns here.

Figure 6. Results organized according to languages.

Food for thought

There is both good and bad in machine translation and in the tools built for it. Without any doubt, machine translation tools as they stand cannot replace human translators. But they are not entirely bad either: they worked relatively well with short, simple sentences and instructive texts.


Translation tech gets Olympic push

Japan may not be the best in the world when it comes to speaking English, but it remains a pioneer in developing cutting-edge translation technology.
With the 2020 Tokyo Olympics approaching, the nation is once again plotting to surprise the world, this time with high-quality, real-time machine translation systems.
Public and private institutions are working eagerly to develop and upgrade the technology so it can easily be used by tourists, whose numbers are growing sharply.

Preparing for Machine Translation: What Machines Can and Can't Do

There is nothing especially novel about machine translation, a technology that reaches back to 1954, when a team from IBM and Georgetown University first demonstrated a computer's ability to translate short sentences from Russian into English. In the 60 years since, the machines involved in machine translation have evolved: what a room-sized computer could do in 1954, a laptop can do even better today.