Asia Online, in cooperation with the Translation Automation User Society (TAUS), conducted an experiment to study the optimal approaches to building a statistical machine translation engine with shared data. According to the results, smaller pools of clean, shared data yield significant improvements in machine translation quality.
Asia Online performed extensive analysis on data provided by three TAUS member companies in the same industry domain. The training data was used to create 29 separate MT engines, whose output quality was evaluated using the BLEU and F-Measure metrics.
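
For readers unfamiliar with the two metrics, the sketch below illustrates the ideas behind them on a single sentence pair. It is a simplified illustration, not the tooling used in the experiment: the function names and the example sentences are hypothetical, the F-Measure is computed over unigrams only, and the BLEU component shown is just the clipped n-gram precision (full BLEU combines these precisions for several n-gram orders and applies a brevity penalty).

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision, the building block of BLEU.

    Each candidate n-gram is credited at most as many times
    as it appears in the reference.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def f_measure(candidate, reference):
    """Unigram F-measure: harmonic mean of precision and recall."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical MT output and human reference translation.
mt_output = "the cat sat on the mat".split()
human_ref = "the cat is on the mat".split()

print(clipped_precision(mt_output, human_ref, 2))
print(f_measure(mt_output, human_ref))
```

Both scores rise as the engine's output shares more words and word sequences with the human reference, which is what makes them usable for comparing many engines trained on different data pools.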