Точка Синхронізації

AI Archive of Human History

MTQE.en-he: Machine Translation Quality Estimation for English-Hebrew
| USA | technology

#Machine Translation #Quality Estimation #Hebrew language #MTQE.en-he #arXiv #Natural Language Processing #Benchmarking

📌 Key Takeaways

  • MTQE.en-he is, to its authors' knowledge, the first publicly available benchmark for English-Hebrew Machine Translation Quality Estimation.
  • The dataset includes 959 English-Hebrew segments annotated with Direct Assessment scores from three human experts.
  • Benchmarking was performed using ChatGPT, TransQuest, and CometKiwi to establish performance baselines.
  • The authors report that an ensemble of the three models outperforms the individual models.
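To make the dataset description concrete, here is a minimal sketch of how one benchmark record could be represented. The class name, field names, example sentence pair, and the 0-100 score range are illustrative assumptions; the source does not describe the release's actual file format or scoring scale.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class QESegment:
    """Hypothetical record layout; not the dataset's actual schema."""
    source_en: str                          # English source segment
    mt_he: str                              # machine translation into Hebrew
    da_scores: tuple[float, float, float]   # one Direct Assessment score per annotator

    @property
    def mean_da(self) -> float:
        # Averaging the three annotators is one common way to get a single
        # reference score; the source does not say how scores are aggregated.
        return mean(self.da_scores)


# Invented example values, assuming a 0-100 DA scale.
segment = QESegment(
    source_en="The weather is nice today.",
    mt_he="מזג האוויר נעים היום.",
    da_scores=(85.0, 90.0, 80.0),
)
print(segment.mean_da)
```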

📖 Full Retelling

Researchers have released MTQE.en-he, which they describe as the first publicly available benchmark for English-Hebrew Machine Translation Quality Estimation (QE). The preprint appeared on arXiv in February 2026 (arXiv:2602.06546). The benchmark addresses the lack of standardized evaluation resources for this language pair: Hebrew has had few public resources for automated quality assessment, and the dataset gives researchers language-pair-specific data for training and testing AI models.

The MTQE.en-he dataset comprises 959 English segments drawn from the WMT24++ corpus, each paired with a machine translation into Hebrew. Three human experts annotated the quality of each translation with Direct Assessment (DA) scores. These human judgments serve as the reference against which automated QE methods can be compared.

Alongside the dataset, the researchers benchmarked three approaches: ChatGPT prompting, TransQuest, and CometKiwi. They report that an ensemble of the three models outperforms the individual models. This suggests that combining different kinds of systems, from large language models to specialized quality estimation models, is an effective strategy for predicting translation quality for English-Hebrew.
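The ensembling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how the three systems were combined or which agreement metric was used, so the sketch averages z-normalized scores and measures Pearson correlation against human scores. All function names and score values below are invented.

```python
from statistics import mean, pstdev


def zscores(xs):
    """Standardize scores so systems with different scales are comparable."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def pearson(xs, ys):
    """Pearson correlation computed as the mean product of z-scores."""
    return mean(a * b for a, b in zip(zscores(xs), zscores(ys)))


def ensemble(*systems):
    """Assumed ensembling scheme: per-segment mean of z-normalized scores."""
    normalized = [zscores(s) for s in systems]
    return [mean(column) for column in zip(*normalized)]


# Invented toy scores for five segments (not taken from the benchmark).
human = [90.0, 40.0, 75.0, 20.0, 60.0]        # mean human DA per segment
llm_prompt = [85.0, 50.0, 70.0, 30.0, 40.0]   # e.g. scores from an LLM prompt
transquest = [0.70, 0.30, 0.50, 0.20, 0.60]   # e.g. TransQuest-style outputs
cometkiwi = [0.80, 0.50, 0.75, 0.30, 0.55]    # e.g. CometKiwi-style outputs

combined = ensemble(llm_prompt, transquest, cometkiwi)
for name, scores in [("llm_prompt", llm_prompt), ("transquest", transquest),
                     ("cometkiwi", cometkiwi), ("ensemble", combined)]:
    print(f"{name}: r = {pearson(scores, human):.3f}")
```

Normalizing before averaging matters because the systems score on different scales; without it, the system with the largest numeric range would dominate the average.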

🏷️ Themes

Artificial Intelligence, Linguistics, Technology

📚 Related People & Topics

Hebrew language

Northwest Semitic language

Hebrew is a Northwest Semitic language within the Afroasiatic language family. A regional dialect of the Canaanite languages, it was natively spoken by the Israelites and remained in regular use as a first language until after 200 CE and as the liturgical language of Judaism (since the Second Temple...

Wikipedia →

Natural language processing

Processing of natural language by a computer

Natural language processing (NLP) is the processing of natural language information by a computer. NLP is a subfield of computer science and is closely associated with artificial intelligence. NLP is also related to information retrieval, knowledge representation, computational linguistics, and ling...

Wikipedia →

Benchmarking

Comparing business metrics in an industry

Benchmarking is the practice of comparing business processes and performance metrics to industry bests and best practices from other companies. Dimensions typically measured are quality, time and cost. Benchmarking is used to measure performance using a specific indicator (cost per unit of measure, ...

Wikipedia →

Machine translation

Computerized translation between natural languages

Machine translation is the use of computational techniques to translate text or speech from one language to another, including the contextual, idiomatic, and pragmatic nuances of both languages. While some language models are capable of generating comprehensible results, machine translation tools re...

Wikipedia →

📄 Original Source Content
arXiv:2602.06546v1 Announce Type: cross Abstract: We release MTQE.en-he: to our knowledge, the first publicly available English-Hebrew benchmark for Machine Translation Quality Estimation. MTQE.en-he contains 959 English segments from WMT24++, each paired with a machine translation into Hebrew, and Direct Assessment scores of the translation quality annotated by three human experts. We benchmark ChatGPT prompting, TransQuest, and CometKiwi and show that ensembling the three models outperforms t
