Full finetuning with Roberta-Large #40

Open
aparna-aketi opened this issue Oct 18, 2024 · 5 comments
Comments

@aparna-aketi commented Oct 18, 2024

I want to run full fine-tuning with RoBERTa-large. The README suggests using the following command:

# Adam fine-tuning
TASK=SST-2 K=16 SEED=42 BS=8 LR=1e-5 MODEL=roberta-large bash finetune.sh

However, the TYPE parameter is set to TYPE:-"prompt" by default. Shouldn't this be set to "finetune"?
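
For example, something along these lines (untested sketch; assuming finetune.sh reads TYPE from the environment the same way it reads TASK, MODEL, etc., which the TYPE:-"prompt" default syntax seems to imply):

# Hypothetical override of the default prompt-based mode (untested)
TASK=SST-2 K=16 SEED=42 BS=8 LR=1e-5 MODEL=roberta-large TYPE=finetune bash finetune.sh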

@gaotianyu1350 (Member)

Hi,

Here "prompt" just means to prompt-based fine-tuning (https://arxiv.org/abs/2012.15723), a very standard way to fine-tuning language models nowadays.

@aparna-aketi (Author) commented Oct 22, 2024

Hi, thanks for the response. Just for clarification: in Figure 2 of the MeZO paper, does FT correspond to full fine-tuning or prompt-based fine-tuning? I want to reproduce the results corresponding to that figure.

@gaotianyu1350 (Member)

Hi, everything we report uses prompt-based fine-tuning, since that gives much better performance.

@aparna-aketi (Author) commented Oct 23, 2024

Okay, thanks for the clarification. One more question: mezo.sh sets the number of steps to 100k, while run_fewshot.sh uses 1000 steps. In Figure 2, is MeZO therefore run for 100x more steps than FT? That doesn't seem like a fair comparison, since MeZO uses 100x as many steps as FT. Even if we consider a backward pass to be 2x as expensive as a forward pass, MeZO should only need about 3x as many steps as FT for a fair comparison. It would be great if you could provide some insights here.
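
To make the arithmetic explicit, here is a rough sketch under my assumption that a backward pass costs about 2x a forward pass (and counting one MeZO step as roughly one forward-equivalent):

# Back-of-the-envelope compute budgets (illustrative only)
MEZO_STEPS=100000   # from mezo.sh
FT_STEPS=1000       # from run_fewshot.sh
echo "FT   budget: $((FT_STEPS * 3)) forward-equivalents"    # forward + 2x-cost backward = 3000
echo "MeZO budget: $((MEZO_STEPS * 1)) forward-equivalents"  # 100000, far more than the FT budget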

@gaotianyu1350 (Member)

Hi,

Yes, MeZO is run for 100x more steps than FT, and it is not a fair comparison in terms of wall-clock time. The RoBERTa-large experiments are mainly meant to showcase that it is possible to train models without backpropagation (which saves a lot of memory).
