Slow training process, same as issues #11 and #28
I also tried some of the methods from NVIDIA/MinkowskiEngine#121, but they did not work either.
For the V100 being slower than the 1080 Ti, use export OMP_NUM_THREADS=20 or lower.
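For anyone who prefers to apply the thread cap from inside a training script rather than the shell, here is a minimal sketch. The helper name cap_omp_threads is hypothetical (it is not part of FCGF or MinkowskiEngine), and it assumes the variable is set before numpy, torch, or any other OpenMP-backed library is imported, since those libraries typically read it once at startup.

```python
import os


def cap_omp_threads(limit=20):
    """Ensure OMP_NUM_THREADS is a positive integer no larger than `limit`.

    Hypothetical helper for illustration. Call it before importing
    numpy/torch so the OpenMP runtime picks up the value.
    """
    current = os.environ.get("OMP_NUM_THREADS")
    try:
        value = int(current)
    except (TypeError, ValueError):
        value = None  # unset or not a plain integer

    # Overwrite only when the value is missing, invalid, or above the cap.
    if value is None or value < 1 or value > limit:
        os.environ["OMP_NUM_THREADS"] = str(limit)

    return int(os.environ["OMP_NUM_THREADS"])


threads = cap_omp_threads(20)
print(f"OMP_NUM_THREADS={threads}")
```

A lower existing value is left untouched, which matches the "20 or lower" advice above.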
@chrischoy if it is OK to ask, how long did one epoch on the 3DMatch dataset take you? I need 1.5 hours per epoch on a GTX 1080. Is that slow, or is that the usual speed?
Yes, that is the usual speed. The default argument uses batch size = 4, which uses only a fraction of the GPU. Try increasing the batch size. Also, the codebase is not particularly optimized, but I think some parts could be sped up significantly if you tune some of the hard negative mining parameters.
@chrischoy have you tried any other PyTorch spatially sparse convolution library, such as spconv (https://github.com/traveller59/spconv) or SparseConvNet (https://github.com/facebookresearch/SparseConvNet)? Could one of these libraries speed up the training process? Thanks a lot.
No, I haven't. There are several poorly written parts in the data loader that take up huge resources. One of them is https://github.com/chrischoy/FCGF/blob/master/lib/data_loaders.py#L257, which uses parallel KD-trees to create a large set of indices and tends to hog CPU resources. This is not really necessary to compute the loss, since we can compute whether a correspondence is correct or not from the ground truth transformation. I was planning to replace this part with on-the-fly loss computation, but I didn't have much time and I just left it there.
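To make the idea of checking correspondences from the ground truth transformation concrete, here is a rough sketch. It is not the FCGF implementation: the function name correspondence_mask, the argument layout, and the distance threshold are all assumptions for illustration. Each candidate pair is tested by moving the source point into the target frame and comparing the distance against a threshold, with no KD-tree involved.

```python
import numpy as np


def correspondence_mask(src_pts, tgt_pts, pairs, T_gt, thresh):
    """Mark which candidate pairs are correct under the ground truth pose.

    Hypothetical helper for illustration.
    src_pts: (N, 3) source points, tgt_pts: (M, 3) target points,
    pairs:   (K, 2) integer indices (column 0 into src, column 1 into tgt),
    T_gt:    (4, 4) rigid transform mapping the source frame to the target frame,
    thresh:  distance threshold, in the same units as the points (assumed).
    """
    R, t = T_gt[:3, :3], T_gt[:3, 3]
    # Transform only the source points that appear in a candidate pair.
    src_in_tgt = src_pts[pairs[:, 0]] @ R.T + t
    dist = np.linalg.norm(src_in_tgt - tgt_pts[pairs[:, 1]], axis=1)
    return dist < thresh


# Toy example: the ground truth pose is a pure translation by (1, 0, 0).
T_gt = np.eye(4)
T_gt[:3, 3] = [1.0, 0.0, 0.0]
src = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
tgt = np.array([[1.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
pairs = np.array([[0, 0], [1, 1]])

mask = correspondence_mask(src, tgt, pairs, T_gt, thresh=0.1)
print(mask.tolist())  # [True, False]
```

The cost is linear in the number of candidate pairs, so in principle it could run per batch inside the loss instead of being precomputed in the data loader.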
Hi Chris,
@chrischoy @sjnarmstrong, thanks for sharing. I tried your code on the 3DMatch dataset using the default configuration and found the training process very slow. Specifically, one epoch took about an hour and a half (as you mention in the paper, you trained FCGF for 100 epochs, which would take roughly a week in my configuration). GPU memory usage is under 5000 MB and GPU utilization is below 10%, but CPU utilization is high. Is this normal, and which part is the most time-consuming? Note that I am training on a V100, and I also found that training on a GTX 1080 Ti is faster than on the V100.
I could not find a solution in issue #11, so could you suggest another way to solve this problem?
Thanks a lot.