Memory Error after 2nd tick #15

Open
rmbwalsh opened this issue May 28, 2020 · 2 comments

Comments

@rmbwalsh

Hi, congrats on this code! I'm training a model on a 768x1280 dataset with this command:

nohup python run_training.py --num-gpus=1 --data-dir=./dataset --config=config-f --dataset=stainedglass1 --mirror-augment=true --metric=none --total-kimg=20000 --min-h=5 --min-w=3 --res-log2=8

I'm then getting this error after the second tick running this fork:

Traceback (most recent call last):
  File "run_training.py", line 218, in <module>
    main()
  File "run_training.py", line 213, in main
    run(**vars(args))
  File "run_training.py", line 136, in run
    dnnlib.submit_run(**kwargs)
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/training/training_loop.py", line 349, in training_loop
    grid_fakes = Gs.run(grid_latents, grid_labels, is_validation=True, minibatch_size=sched.minibatch_gpu)
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/dnnlib/tflib/network.py", line 433, in run
    out_arrays = [np.empty([num_items] + expr.shape.as_list()[1:], expr.dtype.name) for expr in out_expr]
  File "/home/rmbwalsh/stylegan-skyflynil/stylegan2/dnnlib/tflib/network.py", line 433, in <listcomp>
    out_arrays = [np.empty([num_items] + expr.shape.as_list()[1:], expr.dtype.name) for expr in out_expr]
MemoryError

I also got this error on another attempt:

MemoryError: Unable to allocate 450. MiB for an array with shape (40, 3, 1280, 768) and data type float32
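For reference, the 450 MiB figure in that message can be checked by hand. The sketch below only redoes the arithmetic from the shape and dtype printed in the error; the helper name `array_nbytes` is made up for illustration and is not part of the repository. It suggests the failing allocation is a single host-side buffer for 40 full-resolution float32 images, which matches the `np.empty` call in the traceback.

```python
def array_nbytes(shape, itemsize):
    """Bytes needed for a dense array with the given shape and element size.

    Hypothetical helper for illustration; not part of the training code.
    """
    count = 1
    for dim in shape:
        count *= dim
    return count * itemsize

# Shape and dtype are copied from the MemoryError message.
# float32 uses 4 bytes per element.
grid_shape = (40, 3, 1280, 768)
nbytes = array_nbytes(grid_shape, itemsize=4)
mib = nbytes / 2**20

print(f"{nbytes} bytes = {mib:.1f} MiB")  # 471859200 bytes = 450.0 MiB
```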

Has anyone seen anything similar?

@Oranging1

Oranging1 commented May 29, 2020

I've run into the same problem. Have you solved it?

@rmbwalsh
Author

Yes. I was running this model on a Google Cloud GPU instance. It needs the CPU RAM to be quite a bit larger than the GPU RAM. I was using a GPU with 16 GB of memory, and I changed my virtual machine's configuration to have 30 GB of CPU RAM. The model ran fine after that.
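A rough way to check this before launching a long run is sketched below. It is an assumption-heavy illustration, not something from the repository: `host_ram_bytes` and `has_headroom` are hypothetical helpers, the `factor=1.5` threshold is an arbitrary stand-in for "quite a bit larger", and `os.sysconf` is only available on Unix-like systems.

```python
import os

GIB = 2**30

def host_ram_bytes():
    """Total physical host RAM in bytes, or None if sysconf cannot report it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows; some names are missing elsewhere.
        return None

def has_headroom(host_bytes, gpu_bytes, factor=1.5):
    """Hypothetical rule of thumb: host RAM should be at least factor * GPU RAM.

    The factor is an assumption chosen for illustration, not a measured value.
    """
    return host_bytes >= factor * gpu_bytes

# The configuration reported to work in this thread: 30 GB host, 16 GB GPU.
print(has_headroom(30 * GIB, 16 * GIB))

total = host_ram_bytes()
if total is not None:
    print(f"This machine has {total / GIB:.1f} GiB of host RAM")
```

The GPU memory size is passed in by hand here because querying it portably would need a vendor tool or library.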
