We should do a proof of concept for knowledge distillation from an LLM to our standard student model. The main benefit, if it works, is that we won't need to deal with cleaning parallel data and training teacher models, both of which can be quite challenging, especially for lower-resource languages.
This would require:
- Estimating the costs of the different LLMs and APIs we could use
- Running quality evaluations on them to see which model provides the best cost/quality trade-off
- Choosing a mix of monolingual data to translate
- Running translation with an LLM (see the sketch after this list)
- Training a regular student model on this data
- Trying different LLMs and corpora of different sizes (for example, 10M and 50M sentences)
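To make the cost estimate concrete: translating 10M sentences at, say, ~30 tokens per sentence is on the order of 300M input tokens plus a comparable volume of output tokens, so the per-token price of each API dominates the budget. Below is a minimal sketch of the translation step, assuming the OpenAI Python client; the model name, prompt, language pair, and file names are placeholders, and a real run would batch requests, handle retries, and log token usage to feed the cost estimate.

```python
# Sketch: translate a monolingual corpus with a hosted LLM to build a
# synthetic parallel corpus for student training. Everything below
# (model, prompt, languages, file names) is an assumption for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SRC_LANG = "English"
TRG_LANG = "Ukrainian"   # example pair, not a decision
MODEL = "gpt-4o-mini"    # placeholder; the PoC would compare several models

def translate(sentence: str) -> str:
    """Translate one sentence; a production run would batch requests."""
    response = client.chat.completions.create(
        model=MODEL,
        temperature=0,
        messages=[
            {"role": "system",
             "content": f"Translate the user's {SRC_LANG} sentence into {TRG_LANG}. "
                        "Return only the translation."},
            {"role": "user", "content": sentence},
        ],
    )
    return response.choices[0].message.content.strip()

with open("mono.en") as src, \
     open("synthetic.en", "w") as out_src, \
     open("synthetic.uk", "w") as out_trg:
    for line in src:
        sentence = line.strip()
        if not sentence:
            continue
        out_src.write(sentence + "\n")
        out_trg.write(translate(sentence) + "\n")
```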
Folks from WMT also suggested we could try pre-training the student on the parallel OPUS corpus as is and then fine-tuning it on a smaller but higher-quality LLM-produced corpus to make the approach more cost-efficient.
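Here is a rough sketch of that two-stage schedule. It uses Hugging Face Transformers purely for illustration (our students are Marian models trained by this pipeline); the checkpoint, file names, and hyperparameters are assumptions, and a from-scratch student would be instantiated from a config rather than an existing checkpoint.

```python
# Sketch: stage 1 pre-trains on raw OPUS parallel data, stage 2 fine-tunes
# on the smaller LLM-generated corpus at a lower learning rate.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Stand-in for the student architecture; a real from-scratch student would be
# built from a config instead of a pretrained checkpoint.
CHECKPOINT = "Helsinki-NLP/opus-mt-en-uk"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)

def load_pairs(src_path: str, trg_path: str) -> Dataset:
    """Read a line-aligned parallel corpus and tokenize it."""
    with open(src_path) as s, open(trg_path) as t:
        data = {"src": [l.strip() for l in s], "trg": [l.strip() for l in t]}

    def tokenize(batch):
        return tokenizer(batch["src"], text_target=batch["trg"],
                         truncation=True, max_length=128)

    return Dataset.from_dict(data).map(tokenize, batched=True,
                                       remove_columns=["src", "trg"])

def run_stage(train_ds: Dataset, output_dir: str, lr: float, epochs: float) -> None:
    """One training stage; hyperparameters are placeholders."""
    args = Seq2SeqTrainingArguments(output_dir=output_dir, learning_rate=lr,
                                    num_train_epochs=epochs,
                                    per_device_train_batch_size=32,
                                    save_strategy="no")
    Seq2SeqTrainer(model=model, args=args, train_dataset=train_ds,
                   data_collator=DataCollatorForSeq2Seq(tokenizer, model=model)
                   ).train()

# Stage 1: cheap pre-training on the OPUS parallel corpus used "as is".
run_stage(load_pairs("opus.en", "opus.uk"), "student-pretrain", 3e-4, 1)
# Stage 2: fine-tune on the smaller, higher-quality LLM-produced corpus.
run_stage(load_pairs("synthetic.en", "synthetic.uk"), "student-finetune", 5e-5, 3)
```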