reward turns to nan #2

Open
iriscxy opened this issue Mar 22, 2018 · 8 comments

Comments

@iriscxy commented Mar 22, 2018

As training goes on, the reward and loss all become 'nan'. Did this problem also occur with your data?
A -> B
('[s]', 'Old power means the fossil ##AT##-##AT## nuclear energies : oil , natural gas , coal and uranium exploited in centralised , monopolistic energy systems supported by short ##AT##-##AT## term thinking politics .')
('[smid]', ' Interaktion Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Fachkompetenz Schecks')
r1= nan r2= nan rk= nan fw_loss= nan bw_loss= nan
A loss = nan B loss = nan

@JCly-rikiu (Collaborator)

Do the reward and loss become 'nan' every time? At which step?
The 2 pre-trained NMT models impact the result a lot.
How about pre-training more or trying other data, and seeing what happens?

@wky9710 commented Apr 9, 2018

I met the same problem...
I tried pre-training on 20% of the data for 100 epochs, and tried a better dataset, but it still failed.
My tutor told me to change the learning rate from 1e-3 to 1e-5, and it ran well for longer than before, but still failed after about 200 steps; now 1e-6 is running...
So is a lower learning rate really useful? It seems 1e-3 fails very quickly (about 50 steps), and lowering the lr just makes the failure happen later.
And what does nan mean? The language model fails? The NMT fails?
Thanks for your patience :)
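
To narrow down where training blows up, one option is to guard each update and halt at the first non-finite loss instead of silently continuing. A minimal sketch, assuming a PyTorch-style loop; `guarded_step`, `compute_loss`, and the other names are illustrative, not this repo's actual code:

```python
import torch

def guarded_step(model, optimizer, compute_loss, batch, step):
    """Run one update, but halt at the first non-finite loss."""
    optimizer.zero_grad()
    loss = compute_loss(model, batch)  # hypothetical scalar loss
    if not torch.isfinite(loss):
        # Stop and report instead of training on garbage, so the nan
        # can be traced back to a specific step and batch.
        raise RuntimeError(f"non-finite loss {loss.item()} at step {step}")
    loss.backward()
    optimizer.step()
    return loss.item()
```

Catching the first bad step also helps answer the question above: if the loss is finite right up until one update, the problem is more likely exploding gradients or a degenerate reward than a broken model.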

@yangkexin

I finally found that I met the same problem as you: when it tries to generate words in beam(), the new_hyp_scores turn to nan at about 1000 steps. Then I changed the learning rate from 1e-3 to 1e-5 as suggested above, and it ran well for longer than before. I think this result shows that the NMT models must be trained more. As a next step I want to change the optimizer, e.g. to Adam. If you find any useful methods, please tell me how to do it. Thank you :)
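
To confirm that the beam scores are where the nan first appears, one can assert finiteness on them inside the search loop. A minimal sketch; only the `new_hyp_scores` name comes from the comment above, the rest is hypothetical:

```python
import torch

def check_beam_scores(new_hyp_scores: torch.Tensor, step: int) -> None:
    """Fail fast if any candidate score in the beam is nan or inf."""
    finite = torch.isfinite(new_hyp_scores)
    if not finite.all():
        bad = (~finite).nonzero().flatten().tolist()
        raise RuntimeError(
            f"non-finite beam scores at decode step {step}: indices {bad}")
```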

@JCly-rikiu (Collaborator)

We have tried Adam before, but the result was bad.
We think the reason may be that Adam changes the learning rate constantly, and the translation loss is not smooth. That makes the training process go out of control, so the loss can't decrease.
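
For reference, swapping the optimizer is a small change in PyTorch. A sketch under that assumption; `make_optimizer` and `model` are illustrative names, not from this repo:

```python
import torch.optim as optim
from torch import nn

def make_optimizer(model: nn.Module, use_sgd: bool = True, lr: float = 1e-3):
    """Build the optimizer; SGD keeps a fixed step size, Adam adapts it."""
    if use_sgd:
        # Plain SGD reportedly avoided the nan loss in this project.
        return optim.SGD(model.parameters(), lr=lr)
    # Adam rescales updates per parameter, which the collaborator
    # suspects interacts badly with the noisy translation loss.
    return optim.Adam(model.parameters(), lr=lr)
```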

@wky9710 commented May 5, 2018

After several steps (about 20) with a learning rate of 1e-6 (maybe small enough... 1e-3 was also tried, and the loss turned to nan after 2 steps, even before saving a model...), the loss turns to nan again...
I've tried to retrain the NMT models for about 1M iterations, with BLEU of about 33.7 for model A and 15.5 for model B, but it just won't work...
Does this mean that whether the method works depends heavily on the data? Or on the NMT model?

@JCly-rikiu (Collaborator)

  1. Yes, this method depends heavily on the data. We have read a review that mentions this.
  2. We think the nan loss is probably due to exploding gradients, but we didn't see the nan loss anymore after we changed the optimizer to SGD (see the clipping sketch below).
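
If exploding gradients are indeed the cause, gradient-norm clipping is the standard mitigation and works with either optimizer. A minimal sketch of one update step, assuming a PyTorch model; the function and its names are hypothetical:

```python
import torch
from torch import nn

def clipped_step(model: nn.Module, optimizer: torch.optim.Optimizer,
                 loss: torch.Tensor, max_norm: float = 5.0) -> float:
    """Backprop, clip the global gradient norm, then update."""
    optimizer.zero_grad()
    loss.backward()
    # Rescale gradients so their combined L2 norm is at most max_norm;
    # this keeps one bad batch from blowing up the parameters.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return float(grad_norm)
```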

@oceanypt commented Jan 8, 2019

> After several steps (about 20) with a learning rate of 1e-6 (maybe small enough... 1e-3 was also tried, and the loss turned to nan after 2 steps, even before saving a model...), the loss turns to nan again...
> I've tried to retrain the NMT models for about 1M iterations, with BLEU of about 33.7 for model A and 15.5 for model B, but it just won't work...
> Does this mean that whether the method works depends heavily on the data? Or on the NMT model?

I think the nan problem comes from the reward calculation: the reward is divided by its standard deviation, but the std can be zero, so changing the reward formula may solve the problem.
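
A minimal sketch of that epsilon fix, assuming the rewards are a PyTorch tensor; `normalize_rewards` and its arguments are illustrative, not this repo's actual code:

```python
import torch

def normalize_rewards(rewards: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Standardize rewards; eps keeps the division finite when std is zero."""
    # The std is zero when every sampled hypothesis gets the same reward,
    # e.g. when they all collapse to the same repeated token as in the
    # log above. unbiased=False also keeps it well-defined for one sample.
    std = rewards.std(unbiased=False)
    return (rewards - rewards.mean()) / (std + eps)
```

This matches the symptom in the original report: a degenerate output like the repeated 'Fachkompetenz' sequence can yield identical rewards across samples, a zero std, and hence nan after the division.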

@fuzihaofzh

I also ran into this problem. Has anyone found a way to solve it?
