The State of Machine Translation 2024
An independent multi-domain evaluation of Machine Translation engines and Large Language Models

52 MT Engines and LLMs
11 Language Pairs
9 Content Domains

Disclaimer

March 25 to May 14, 2024
The MT systems and LLMs covered in this report were accessed between March 25 and May 14, 2024. Some systems may have been updated since that time period.

Automatic scoring
This report uses semantic similarity and LLM-based DQF-MQM scores. To pick the top model for your use case, you may need a review by a human linguist or subject matter expert to address particular business requirements.

Stock models only
If you consider customizing an engine on your data, your choice may vary from what is suggested here. In the solutions we build for our clients, top picks often include Amazon, DeepL, Google, Microsoft, ModernMT, and Systran, depending on the languages and the amount of available training data.

Data limitations
The evaluation used plain text data. Results often differ for tagged text with some MT vendors and language pairs because of imperfect inline tag support. This report has also used segment-wise translation rather than leveraging the full-text capabilities of LLMs and some MT systems.

Valid for a specific dataset
This report shows how the systems performed only on the datasets listed. We run multiple evaluations for our clients using various language pairs and domains, and often observe different MT system rankings than those provided in this report on slide 14.

There's no "best" MT system or LLM
MT performance depends on how similar your data is to the data used to train the vendors' models, their algorithms, and your quality requirements.

Trademarks
All third-party trademarks, registered trademarks, product names, and company names or logos mentioned in the Report are the property of their respectiv