Hey! Yep, this is a known issue. AFAIK it is a limitation of the way we implement gradient checkpointing.
You can try this branch: #34987. With GradientCheckpointLayer it should work fine. Tell me if that helps?
System Info
transformers 4.47.1
Who can help?
@ArthurZucker
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
https://github.com/huggingface/transformers/blob/241c04d36867259cdf11dbb4e9d9a60f9cb65ebc/src/transformers/models/llama/modeling_llama.py#L896C1-L931C54
Expected behavior
x