Single GPU Training problem #8

Open
GeoVectorMatrix opened this issue Aug 16, 2022 · 2 comments

GeoVectorMatrix commented Aug 16, 2022

Traceback (most recent call last):
  File "/home/Prjs/ECCV22-PointMixer-main/sem_seg/train_pl.py", line 157, in <module>
    cli_main()
  File "/home/Prjs/ECCV22-PointMixer-main/sem_seg/train_pl.py", line 150, in cli_main
    trainer.fit(model, train_loader, val_loader)
  File "/home/anaconda3/envs/pointmixer/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 738, in fit
    self._call_and_handle_interrupt(
  File "/home/anaconda3/envs/pointmixer/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 683, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/anaconda3/envs/pointmixer/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 773, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/anaconda3/envs/pointmixer/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1195, in _run
    self._dispatch()
  ...
    raise RuntimeError("Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

GeoVectorMatrix (Author) commented


Solved by following megvii-model/YOLOF#11 (comment).

I am not sure whether this is the right fix, but it works.
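For readers hitting the same error: it is typically raised when some code path calls into `torch.distributed` (for example a synchronized batch-norm layer or a rank query) while no process group exists, which is the normal situation in a plain single-GPU run. The sketch below shows one common workaround, creating a one-process default group before `trainer.fit`. It is not necessarily what the linked comment does, and the helper names `single_process_env` and `ensure_default_process_group`, the `gloo` backend choice, and the port number are assumptions made for illustration.

```python
import os


def single_process_env(master_addr="127.0.0.1", master_port=29500):
    """Environment variables read by the env:// rendezvous for a 1-process group."""
    return {
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
        "RANK": "0",
        "WORLD_SIZE": "1",
    }


def ensure_default_process_group(backend="gloo"):
    """Create a one-process default group if none exists; return True if created."""
    # Imported lazily so the helper above stays usable without PyTorch installed.
    import torch.distributed as dist

    if not dist.is_available() or dist.is_initialized():
        return False
    for key, value in single_process_env().items():
        # Do not overwrite values a launcher may already have set.
        os.environ.setdefault(key, value)
    dist.init_process_group(backend=backend, rank=0, world_size=1)
    return True
```

Calling `ensure_default_process_group()` once, before the trainer is created, would be the intended use. Whether this is appropriate for PointMixer depends on which call actually needs the process group, which the truncated traceback does not show.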

LifeBeyondExpectations (Owner) commented Aug 18, 2022

I have not yet checked the full code implementation.
In particular, the current semseg code uses an old version of PyTorch Lightning, which could be the cause of this issue.

Once I have checked the overall code against the newest PyTorch Lightning,
I will close this issue.
Thanks for letting me know about it.
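Since the installed PyTorch Lightning version is the suspected cause, it may help to report it alongside the traceback. A small sketch using only the standard library follows; `installed_version` is a made-up helper name, and `pytorch-lightning` and `torch` are assumed to be the distribution names in the environment.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions most relevant to this issue.
    for name in ("pytorch-lightning", "torch"):
        print(name, installed_version(name))
```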
