feat(trainer_builder): refactor trainer_builder and preserve optional callable for custom model dispatch function in isp mode #293
Experimental results for acc/loss alignment: Code base:
ISP adaptation code: https://github.com/InternLM/InternEvo-HFModels/commit/6bfd9576005817e74302000ffa35567aca8260b4 For other huggingface models, refer to the adaptation code/docs for huggingface InternLM1 and InternLM2; only a few lines of code need to be modified.
Status (Done.):
519aed7 to 925637f
Weight parallel is now also enabled, with acc/loss aligned: https://github.com/InternLM/InternEvo-HFModels/tree/enable_wp
925637f to ab6265c
Closing this PR, since we have chosen to drop the repo.
Motivation
Refactor trainer_builder, i.e., make the code self-documenting.
Preserve an optional callable, model_dispatch_func, which might be useful when you try to integrate huggingface models with ISP.
Modification
internlm/core/trainer_builder.py
internlm/core/parallel/comm/isp.py
internlm/model/builder.py
internlm/model/ops/attention.py
internlm/train/pipeline.py
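As a rough illustration of the optional-callable idea from the Motivation, the sketch below shows one way a builder could accept a custom dispatch function and fall back to a default when none is given. Everything here is an assumption made for illustration: `TrainerBuilderSketch`, `default_dispatch`, `hf_dispatch`, and the dict-based "model" are hypothetical stand-ins, not the actual code in internlm/core/trainer_builder.py.

```python
from typing import Callable, Optional

# Hypothetical type alias: a dispatch function takes a model and returns
# the model prepared for ISP. A plain dict stands in for a real model.
DispatchFunc = Callable[[dict], dict]


def default_dispatch(model: dict) -> dict:
    """Stand-in for a built-in dispatch path (hypothetical)."""
    return {**model, "dispatch": "default"}


class TrainerBuilderSketch:
    """Hypothetical builder that keeps an optional custom dispatch callable."""

    def __init__(
        self,
        model: dict,
        model_dispatch_func: Optional[DispatchFunc] = None,
    ) -> None:
        self.model_dispatch_func = model_dispatch_func
        self.model = self._dispatch(model)

    def _dispatch(self, model: dict) -> dict:
        # Use the caller-supplied function when present, else the default.
        if self.model_dispatch_func is not None:
            return self.model_dispatch_func(model)
        return default_dispatch(model)


def hf_dispatch(model: dict) -> dict:
    """Example custom dispatch for a Hugging Face style model (hypothetical)."""
    return {**model, "dispatch": "custom-hf"}


default_model = TrainerBuilderSketch({"name": "internlm2"}).model
custom_model = TrainerBuilderSketch(
    {"name": "hf-model"}, model_dispatch_func=hf_dispatch
).model
```

Keeping the parameter optional means existing callers that pass no function keep the default behavior, while a Hugging Face integration can supply its own function without the builder itself changing.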
BC-breaking (Optional)
None
Use cases (Optional)
None
Checklist
Before PR:
After PR: