During transfer learning on the Marylux-648 dataset, starting from the Glow-TTS LJSpeech model, training stops at epoch 145 with a RuntimeError reporting a NaN (not a number) loss.
Traceback (most recent call last):
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 1007, in fit
    self._fit()
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 992, in _fit
    self.train_epoch()
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 820, in train_epoch
    _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 690, in train_step
    outputs, loss_dict_new, step_time = self._optimize(
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 601, in _optimize
    outputs, loss_dict = self._model_train_step(batch, model, criterion)
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/trainer.py", line 560, in _model_train_step
    return model.train_step(*input_args)
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/tts/models/glow_tts.py", line 381, in train_step
    loss_dict = criterion(
  File "/home/mbarnig/COQUI_TTS_0.5.0/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/mbarnig/COQUI_TTS_0.5.0/TTS/TTS/tts/layers/losses.py", line 437, in forward
    raise RuntimeError(f" [!] NaN loss with {key}.")
RuntimeError: [!] NaN loss with loss.
I resumed training from the latest checkpoint, but the same error appeared again 16 epochs later.
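For anyone trying to reproduce this, the check that fires in the traceback appears to walk over the entries of the loss dictionary and raise on the first NaN. Below is a minimal sketch of that kind of check, not the actual code from `losses.py`. It assumes the loss values are plain Python floats (in the trainer they are tensors), and the names `find_nan_losses` and `check_losses` are hypothetical helpers made up for illustration.

```python
import math


def find_nan_losses(loss_dict):
    """Return the keys of loss_dict whose values are NaN.

    Hypothetical helper: assumes every value is a plain Python float.
    """
    return [key for key, value in loss_dict.items() if math.isnan(value)]


def check_losses(loss_dict):
    """Raise a RuntimeError in the style of the traceback if any value is NaN."""
    bad_keys = find_nan_losses(loss_dict)
    if bad_keys:
        # Report every offending key at once instead of only the first one.
        raise RuntimeError(f" [!] NaN loss with {', '.join(bad_keys)}.")
    return loss_dict
```

Calling `find_nan_losses` on the loss dictionary before the raise, and logging the result together with the current step, might help narrow down which batch triggers the NaN.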