-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model wrapper with local fine-tuning #169
Conversation
…d local hugging face model and finetune loaded model with hugging face dataset Added features to download models from hugging face model hub/load local hugging face model and finetune loaded model with hugging face dataset. Model loading and fine-tuning can happen both at the initialization stage and after the agent has been initialized (see README in `agentscope/examples/load_finetune_huggingface_model` for details). Major changes to the repo include creating the example script `load_finetune_huggingface_model`, adding a new model wrapper `HuggingFaceWrapper`, and creating a new agent type Finetune_DialogAgent. All changes are done in a new example directory `agentscope/examples/load_finetune_huggingface_model`.
made customized hyperparameters specification available from `model_configs` for fine-tuning at initialization, or through `fine_tune_config` in `Finetune_DialogAgent`'s `fine_tune` method after initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the inline comments. Besides, you can run pre-commit run --all-files
to check if your coding style satisfies our repo's requirement.
# Decode the generated tokens to a string | ||
generated_text = self.tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True) | ||
|
||
return ModelResponse(text=generated_text, raw={'model_id': self.model_id}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with the other model wrappers, the 'raw' should contain the generated_text as well. Also, the
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. What was the second point?
Need to disable E203 for pre-commit run. It automatically reformats `examples/load_finetune_huggingface_model/huggingface_model.py' line147 by adding whitespace before ':' and then flags it as an error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
About the flask8 code style error on github, maybe the link can can help https://flake8.pycqa.org/en/3.1.1/user/ignoring-errors.html#in-line-ignoring-errors
-
try to split the the agent to an separated file to make it more clean.
-
change the name of this folder "load_finetune_huggingface_model" to "conversation_with_agent_with_finetuned_model" to make the naming style consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a better practice, please create new branches (in your forked repo) for the updates on reformatting examples tasks (one branch for one example). Please do not let the changes affects each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Description of this PR seems outdated, please update it accordingly.
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see inline comments, and:
- It's not clear who is responsible to fine-tune the model, the agent or model wrapper? Since the new model configuration has all required arguments for fine-tuning, why don't we just implement the fine-tuning functionality all in the new model wrapper class (e.g. the constructor function)? So that all other agents in AgentScope can re-use the model wrapper and fine-tune their local models without modifications.
self.model = AutoModelForCausalLM.from_pretrained( | ||
model_id, | ||
token=self.huggingface_token, | ||
device_map="auto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the model is loaded by device_map
as auto
, will there be any conflict between the attribute self.device
(and device
argument in the constructor)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presumed fine-tuning of LLMs happens on gpus, since it is (as far as I'm aware) infeasible to do on cpu. self.device
is meant for device for inference only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is updated. If the user didn't specify device
in model_configs
, device_map="auto"
by default; otherwise device_map
is set to the user-specified device
. (see 8685213)
"config_name": "my_custom_model", | ||
# Or another generative model of your choice. | ||
# Needed from loading from Hugging Face. | ||
"model_id": "openlm-research/open_llama_3b_v2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using the argument model_name_or_path
like what transformers library does in from_pretrained
function rather than providing two arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed according to suggestion.
examples/conversation_with_agent_with_finetuned_model/README.md
Outdated
Show resolved
Hide resolved
examples/conversation_with_agent_with_finetuned_model/README.md
Outdated
Show resolved
Hide resolved
], | ||
) | ||
|
||
# # alternatively can load `model_configs` from json file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggested to remove lines 56-59
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to give the user the flexibility to specify the arguments needed in a json file, but also wanted to provide some explanation which cannot be done with json file, so I have two identical model_configs
, one in agentscope/examples/conversation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py
and the other in agentscope/examples/conversation_with_agent_with_finetuned_model/configs/model_configs.json
.
) | ||
|
||
dialog_agent.load_model( | ||
model_id="openlm-research/open_llama_3b_v2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design is a little strange, and the interface should be more concise.
- We have already specified the model configuration in
agentscope.init
and providemodel_config_name
in line 66, why we have to repeat to setmodel_id
in line 70 and 74? - Similar issue with dataset name in line 40 and 86.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an optional step where the user can choose to load another model after the agent has been instantiated if needed, depending on their use cases. Added a comment to clarify its purpose.
time_string = now.strftime("%Y-%m-%d_%H-%M-%S") | ||
|
||
# Specify the filename | ||
log_name_temp = model.config.name_or_path.split("/")[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should support user customized saving directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new argument 'output_dir
for the users to customize saving directory. By default will save to the save example directory if left unspecified.
Reloading and fine-tuning models after an agent has been instantiated need to introduce a new method to the agent class. If this is not required, this model wrapper in principle supports loading and fine-tuning for other types of agents. |
Yes, and my question is that why we need to reloading and fine-tuning models within the agent? In my view, once we create a huggingface model wrapper, we have already decided to fine-tune the model. So that the training can be finished automatically within the constructor function of the model wrapper as follows? class HuggingFaceWrapper(ModelWrapperBase):
def __init__(self, *args, **kwargs):
# load model
self.model = self.load_model(xxx)
# fine tuning
self.fine_tune_model()
# ... In this way, the agent doesn't need to do anything, and developers only need to set their model configuration. |
I have a long-term use case scenario in mind, where there might be new data becoming available after the agent has been deployed (i.e., continual learning setting). In this case, the agent might need to be fine-tuned (potentially multiple times) after deployment. One such example we touched on during the discussion is fine-tuning the agents from their interaction traces in a multi-agent setup. Perhaps I can rename the agent class to better reflect this use case? |
Okay, I understand your consideration. For me, |
updated the dependencies needed
Updated the way to read token from .env file, so that it can work in any example directory.
Further update of this pull request can be found here as it was moved to a new branch #240 |
name: Pull Request
about: Create a pull request
Description
Added features to download models from hugging face model hub/load local hugging face model and fine-tune loaded model with hugging face dataset. Model loading and fine-tuning can happen both at the initialization stage and after the agent has been initialized (see
README
inagentscope/examples/conversation_with_agent_with_finetuned_model
for details). Major changes to the repo include creating the example scriptconversation_with_agent_with_finetuned_model.py
, adding a new model wrapperHuggingFaceWrapper
inagentscope/examples/conversation_with_agent_with_finetuned_model/huggingface_model.py
, and creating a new agent typeFinetune_DialogAgent
in 'agentscope/examples/conversation_with_agent_with_finetuned_model/finetune_dialogagent.py'. All changes are done in a new example directoryagentscope/examples/conversation_with_agent_with_finetuned_model
.To test, run
agentscope/examples/conversation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py
by following the instructions in theREADME
in the same directory.Checklist
Please check the following items before code is ready to be reviewed.