Objective: Expand Lumigator’s capabilities to include workflows for evaluating multilingual models, particularly focusing on translation quality.
Why This Matters:
Translation tasks are critical for industries like localization, global commerce, and media, where fluency and accuracy significantly impact business outcomes. GitHub’s 2023 Octoverse Report indicates that 44% of developers work on projects requiring internationalization, underscoring the need for robust multilingual model evaluations.
Planned Actions:
Evaluate metrics such as BLEU, TER, and COMET for translation workflows (see the sketch after this list).
Prototype workflows for evaluating translation quality, focusing on accuracy, fluency, and cultural nuances.
Develop extensible modules for evaluating and comparing translation models, starting with smaller-scale tasks and scaling based on user feedback.
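For reference, here is a minimal sketch of how these metrics could be computed for a translation run, assuming the sacrebleu and unbabel-comet packages; the actual Lumigator integration and module layout are still to be defined.

```python
# Minimal sketch: scoring translation hypotheses with BLEU, TER, and COMET.
# Assumes `pip install sacrebleu unbabel-comet`; not the final Lumigator API.
import sacrebleu
from comet import download_model, load_from_checkpoint

sources = ["Der Hund schläft.", "Guten Morgen!"]
hypotheses = ["The dog sleeps.", "Good morning!"]
references = ["The dog is sleeping.", "Good morning!"]

# Corpus-level BLEU and TER (reference-based, surface string matching).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
ter = sacrebleu.corpus_ter(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}  TER: {ter.score:.2f}")

# COMET: neural metric that also conditions on the source sentence.
model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
data = [
    {"src": s, "mt": h, "ref": r}
    for s, h, r in zip(sources, hypotheses, references)
]
comet = model.predict(data, batch_size=8, gpus=0)
print(f"COMET (system score): {comet.system_score:.3f}")
```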
💡 Community Contribution Opportunities:
Help design evaluation metrics for translation.
Share real-world translation benchmarks.
Improve evaluation frameworks with multilingual support.
Timeline: Q1 2025 (availability to be defined, ideally end of February / early March)
Hi, I'm Evgeny from the Firefox Translations team. There might be an overlap with one of our initiatives to experiment with LLMs as teacher models for knowledge distillation. The first step of this experiment would be to evaluate the translation capabilities of different models, specifically to see the effect on translation quality and inference cost (for different model sizes, pretrained vs. fine-tuned on translation data, etc.). I did some benchmarking for a limited set of models and languages a while ago, but we need to go deeper this time. I know we're going to meet, but I'm also adding a link here for reference.