RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED #7

Open
buptgxt opened this issue Nov 13, 2020 · 1 comment
buptgxt commented Nov 13, 2020

Traceback (most recent call last):
  File "main.py", line 116, in <module>
    main(args)
  File "main.py", line 38, in main
    trainer.train(c_ids, q_ids, a_ids, start_positions, end_positions)
  File "/home/2018/Info-HCVAE-master/vae/trainer.py", line 35, in train
    loss.backward()
  File "/home/2018/anaconda3/envs/transformers/lib/python3.6/site-packages/torch/tensor.py", line 150, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/2018/anaconda3/envs/transformers/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

Environment: CUDA 9.0, cuDNN 7, Python 3.6, PyTorch 1.3

Thank you for your work! I am trying to train the model, but I get the error “RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED”. It is raised when this part of the code runs: loss.backward()
Can you help me solve it?

Owner

seanie12 commented Feb 5, 2021

I think it might be due to a GPU memory issue.

Try using a smaller batch size.
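
One way to try the suggestion above without guessing a value by hand is to halve the batch size until a training step succeeds. The following is only a minimal sketch, not part of this repository: `run_with_smaller_batches` and `fake_train_step` are hypothetical names, and `train_step` stands for whatever callable runs the forward pass and `loss.backward()` for a batch of the given size. It also assumes the cuDNN error really is caused by memory pressure, which is not confirmed in this thread.

```python
def run_with_smaller_batches(train_step, batch_size, min_batch_size=1):
    """Call train_step(batch_size), halving batch_size after GPU-style RuntimeErrors.

    train_step is a hypothetical callable that runs one forward/backward pass
    for a batch of the given size. Returns the batch size that succeeded.
    """
    while True:
        try:
            train_step(batch_size)
            return batch_size
        except RuntimeError as err:
            message = str(err)
            # Only retry on errors that plausibly come from memory pressure.
            if "out of memory" not in message and "cuDNN error" not in message:
                raise
            # Give up once the batch cannot shrink any further.
            if batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)


def fake_train_step(batch_size):
    # Stand-in for a real step (assumption): pretends that any batch
    # larger than 8 samples fails the way the traceback above does.
    if batch_size > 8:
        raise RuntimeError("cuDNN error: CUDNN_STATUS_NOT_INITIALIZED")


print(run_with_smaller_batches(fake_train_step, batch_size=64))  # prints 8
```

In a real run, `fake_train_step` would be replaced by a function that builds a batch of the requested size and calls the trainer. If the error persists even at a batch size of 1, memory is probably not the cause, and the CUDA, cuDNN, and PyTorch versions are worth checking instead.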
