
Expand LLM Evaluation Use Case - Translation #628

Open
jularase opened this issue Jan 15, 2025 · 1 comment
jularase commented Jan 15, 2025

Objective: Expand Lumigator’s capabilities to include workflows for evaluating multilingual models, particularly focusing on translation quality.

Why This Matters:
Translation tasks are critical for industries like localization, global commerce, and media, where fluency and accuracy significantly impact business outcomes.
GitHub’s 2023 Octoverse Report indicates that 44% of developers work on projects requiring internationalization, underscoring the need for robust multilingual model evaluations.

Planned Actions:

  • Evaluate metrics like BLEU, TER, and COMET for translation workflows (a scoring sketch follows this list).

  • Prototype workflows for evaluating translation quality, focusing on accuracy, fluency, and cultural nuances.

  • Develop extensible modules for evaluating and comparing translation models, starting with smaller-scale tasks and scaling based on user feedback.
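For reference, a minimal sketch of what scoring with these metrics could look like. It assumes the sacrebleu package for BLEU and TER; COMET (via the unbabel-comet package, which also needs the source sentences) is outlined in the trailing comment. The example sentences and model choice are placeholders, not part of Lumigator today.

```python
# Minimal sketch: corpus-level BLEU and TER with sacrebleu (pip install sacrebleu).
from sacrebleu.metrics import BLEU, TER

sources = ["Le chat est sur le tapis."]        # source sentences (needed for COMET)
hypotheses = ["The cat is on the carpet."]     # model output
references = [["The cat is on the mat."]]      # one list per reference set

bleu = BLEU().corpus_score(hypotheses, references)
ter = TER().corpus_score(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}  TER: {ter.score:.2f}")

# COMET (neural, reference-based) would look roughly like this:
#   from comet import download_model, load_from_checkpoint
#   model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
#   data = [{"src": src, "mt": mt, "ref": ref}
#           for src, mt, ref in zip(sources, hypotheses, references[0])]
#   print(model.predict(data, batch_size=8, gpus=0).system_score)
```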

💡 Community Contribution Opportunities:

  • Help design evaluation metrics for translation.
  • Share real-world translation benchmarks.
  • Improve evaluation frameworks with multilingual support.

Timeline: Q1 2025 (availability to be defined, ideally end of February / early March)

jularase added the epic label Jan 15, 2025

eu9ene commented Jan 17, 2025

Hi, I'm Evgeny from the Firefox Translations team. There might be an overlap with one of our initiatives to experiment with LLMs as teacher models for knowledge distillation. The first step of this experiment would be to evaluate the translation capabilities of different models, specifically to see the effect on quality and cost of inference (for different model sizes, pretrained vs. fine-tuned on translations, etc.). I did some benchmarking for a limited set of models and languages a while ago, but we need to go deeper this time. I know we're going to meet, but I'm also adding a link here for reference.

GitHub issue: mozilla/translations#994
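A rough sketch of the kind of quality-vs-cost comparison described above (not the Firefox Translations setup): run several candidate models over the same test set, then report a quality metric alongside latency. `translate_with` is a hypothetical placeholder for whatever inference backend is actually used.

```python
import time
from sacrebleu.metrics import BLEU

def translate_with(model_name: str, sentences: list[str]) -> list[str]:
    """Placeholder: call the model under test and return its translations."""
    raise NotImplementedError

def compare(models: list[str], sources: list[str], references: list[list[str]]) -> None:
    bleu = BLEU()
    for name in models:
        start = time.perf_counter()
        hypotheses = translate_with(name, sources)
        elapsed = time.perf_counter() - start
        score = bleu.corpus_score(hypotheses, references).score
        print(f"{name}: BLEU={score:.2f}  latency={elapsed:.1f}s for {len(sources)} sentences")
```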
