Distributed Learning with Queue and pytorch-lightning #1096
Replies: 2 comments
-
Thanks @nicoloesch for opening the discussion and especially for sharing your findings in the note! Pinging @hsyang1222 in case he has any comments.
-
Q. Does the `Queue` require the `DistributedSampler` in the case of DDP, if the overarching `DataLoader` encompassing the `torchio.Queue` is already "equipped" with it?

I am going to implement `pytorch-lightning`-compatible DDP data loading in `torchio.Queue`. When I am done, I will let you know.
-
Hi everyone,

I am currently preparing my model and my data loading for DDP. Luckily, this is rather straightforward in `pytorch-lightning`. However, I stumbled upon a mismatch between the functionality of `pytorch-lightning` and the `Queue` with respect to the `DistributedSampler`.

The documentation of `pytorch-lightning` mentions the following: under the hood, it reinstantiates the `DataLoader` with the `DistributedSampler` automatically, meaning that the `DataLoader` then has the respective sampler as well as all its other attributes.

However, the documentation of `torchio.Queue` mentions that the `Queue` requires the `DistributedSampler` in the case of DDP.

My question is the following: does the `Queue` require the `DistributedSampler` in the case of DDP if the overarching `DataLoader` encompassing the `torchio.Queue` is already "equipped" with it? Or should I instantiate it twice (once myself and once automatically by `pytorch-lightning`), which I feel is probably not the right approach?

I tried to follow the path of the respective `DistributedSampler`s and `DataLoader`s, but my knowledge about data loading reached its limit, so I am seeking advice from someone who has more experience with this!

Thank you in advance for your help!
Nico
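As background for the question above: a `DistributedSampler` simply partitions dataset *indices* across DDP ranks, so what matters is which dataset it wraps (the subjects, or the patches produced by the `Queue`). A minimal sketch with a plain `torch` dataset standing in for a subjects dataset (the class name `DummySubjects` is hypothetical); passing `num_replicas`/`rank` explicitly lets us inspect the sharding without initializing a process group:

```python
import torch
from torch.utils.data import Dataset, DistributedSampler

class DummySubjects(Dataset):
    """Stand-in for a subjects dataset with 10 subjects."""
    def __len__(self):
        return 10
    def __getitem__(self, index):
        return index

dataset = DummySubjects()

# One sampler per rank; shuffle=False so the partition is easy to read.
shards = [
    list(DistributedSampler(dataset, num_replicas=2, rank=r, shuffle=False))
    for r in range(2)
]
print(shards)  # each rank iterates a disjoint subset of subject indices
```

Each rank sees a disjoint, interleaved slice of the indices, which is why the sampler has to sit at the level of the data you actually want to split across processes.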
Edit

It appears that the `pytorch_lightning.Trainer` has the flag `use_distributed_sampler`, which determines whether the respective `DataLoader` should be wrapped or not. As a result, I will continue by setting this to `False` and using the `DistributedSampler` in the `torchio.Queue` as required!
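For reference, a sketch of the setup described in the edit above, following the DDP pattern in the torchio documentation (passing a `DistributedSampler` over the subjects as `subject_sampler` and disabling the `Queue`'s own subject shuffling). The patch size, queue lengths, and variable names are illustrative, and the snippet only runs inside an actual DDP launch, so treat it as a configuration sketch rather than runnable code:

```python
import torchio as tio
from torch.utils.data import DataLoader, DistributedSampler
from pytorch_lightning import Trainer

# subjects_dataset: a tio.SubjectsDataset built elsewhere (assumed)
subject_sampler = DistributedSampler(subjects_dataset, shuffle=True)

patches_queue = tio.Queue(
    subjects_dataset,
    max_length=300,
    samples_per_volume=10,
    sampler=tio.data.UniformSampler(patch_size=(64, 64, 64)),
    shuffle_subjects=False,        # the DistributedSampler handles subject order
    subject_sampler=subject_sampler,
)

# num_workers=0: the Queue does its own multiprocessing internally
patches_loader = DataLoader(patches_queue, batch_size=4, num_workers=0)

# Prevent Lightning from wrapping the loader with a second DistributedSampler
trainer = Trainer(strategy="ddp", use_distributed_sampler=False)
```

With `use_distributed_sampler=False`, Lightning leaves the `DataLoader` untouched, so the sharding happens exactly once, at the subject level inside the `Queue`.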