OSError: [Errno 24] Too many open files #42

Open
liyw420 opened this issue May 18, 2024 · 2 comments

Comments

liyw420 commented May 18, 2024

Hello everyone. Thanks for the great work. I ran the command python scripts/n3v2blender.py data/N3V/$scene_name in the same environment described in the README. However, I encountered a "too many open files" error during training. Could anyone suggest how to solve this problem?

Using /home/vincent/.cache/torch_extensions/py38_cu116 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/vincent/.cache/torch_extensions/py38_cu116/diff_gaussian_rasterization/build.ninja...
Building extension module diff_gaussian_rasterization...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module diff_gaussian_rasterization...
Optimizing output/N3V/cut_roasted_beef
Output folder: output/N3V/cut_roasted_beef [18/05 22:33:30]
Tensorboard not available: not logging progress [18/05 22:33:30]
Found transforms_train.json file, assuming Blender data set! [18/05 22:33:30]
Reading Training Transforms [18/05 22:33:30]
100%|████████████████████████████████████████████████████████| 5700/5700 [00:00<00:00, 14028.13it/s]
Reading Test Transforms [18/05 22:33:30]
100%|██████████████████████████████████████████████████████████| 300/300 [00:00<00:00, 15703.12it/s]
Loading Training Cameras [18/05 22:33:31]
Loading Test Cameras [18/05 22:33:31]
Number of points at initialisation :  300000 [18/05 22:33:32]
100%|█████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00,  8.45it/s]
100%|█████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00,  8.80it/s]
[ITER 500] Evaluating train: L1 0.02834104597568512 PSNR 27.286881637573245 [18/05 22:36:30]
100%|█████████████████████████████████████████████████████████████| 300/300 [00:34<00:00,  8.75it/s]
100%|█████████████████████████████████████████████████████████████| 300/300 [00:34<00:00,  8.74it/s]
[ITER 500] Evaluating test: L1 0.02064640684053302 PSNR 29.6612756729126 [18/05 22:37:04]

[ITER 500] Saving best checkpoint [18/05 22:37:04]
Training progress:   2%| | 670/30000 [04:05<1:41:48,  4.80it/s, Loss=0.0106492, PSNR=27.73, Ll1=0.03Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Training progress:   2%| | 680/30000 [04:08<1:45:51,  4.62it/s, Loss=0.0125025, PSNR=24.48, Ll1=0.03Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Training progress:   2%| | 690/30000 [04:09<1:34:29,  5.17it/s, Loss=0.0123019, PSNR=24.30, Ll1=0.03Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/queues.py", line 239, in _feed
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 359, in reduce_storage
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/vincent/ProgramFiles/miniconda3/envs/fudan4dgs/lib/python3.8/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
@JasonLSC

I ran into the same problem.

@JasonLSC

Hi, it seems that the code below might help:
torch.multiprocessing.set_sharing_strategy('file_system')
You could put it in train.py before the training loop and give it a try.
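For anyone trying this, here is a minimal sketch of how the call could be wired in. The traceback goes through DupFd, which suggests tensors are being passed between processes as duplicated file descriptors, so switching the sharing strategy may avoid hitting the per-process limit. The helper name use_file_system_sharing is hypothetical and not part of this repo, and the resource lines are only an assumed diagnostic (Unix-only) for checking the current open-file limit; where exactly this belongs in train.py is for you to decide.

```python
import resource

import torch.multiprocessing as mp


def use_file_system_sharing() -> str:
    """Switch tensor sharing to 'file_system' when the platform supports it.

    Hypothetical helper for illustration; returns the strategy now in effect.
    """
    if "file_system" in mp.get_all_sharing_strategies():
        mp.set_sharing_strategy("file_system")
    return mp.get_sharing_strategy()


# Diagnostic only: show the per-process open-file limit (Unix-only module).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft}, hard={hard}")

# Call this once, before any DataLoader or worker process is created.
print("sharing strategy:", use_file_system_sharing())
```

If the printed soft limit is low, raising it in the shell before launching training is another thing that might be worth trying, though I have not verified that it resolves this particular case.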
