Why does MWER use stop gradient? #2

Mddct · 2021-11-15T14:02:46Z

Why does MWER use stop gradient? Is it just a regularization?

Mddct · 2021-11-15T14:57:45Z

> Why does MWER use stop gradient? Is it just a regularization?

Maybe it is for variance reduction.

leixiaoning · 2021-12-03T12:30:06Z

I find that the TF CTC beam search loses the gradients.

TeaPoly · 2022-12-09T01:24:47Z

> I find that the TF CTC beam search loses the gradients.

Beam search only finds candidate paths, so no gradient is required through it. Gradients are still propagated back to the logits, because the probability P of each path, which is computed from the logits, is an input to the MWER loss. The N-best paths from CTC beam search can even be generated offline to speed up training.
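To make the point above concrete, here is a minimal sketch in plain Python. It is not this repository's implementation. The function names, the 3-best list, and the error counts are hypothetical, and the loss form (expected word errors relative to the mean error over the N-best list) is an assumed common formulation. The hypotheses and their word errors enter only as constants, which is what a stop gradient would enforce in an autodiff framework. The only differentiable inputs are the path log-probabilities.

```python
import math

def softmax(log_probs):
    """Renormalize path log-probabilities over the N-best list."""
    m = max(log_probs)
    exps = [math.exp(lp - m) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]

def mwer_loss(log_probs, word_errors):
    """Expected word errors over the N-best list, relative to the mean error.

    log_probs:   log P of each candidate path, computed from the logits.
    word_errors: edit-distance counts against the reference; constants.
    """
    posteriors = softmax(log_probs)
    mean_err = sum(word_errors) / len(word_errors)
    return sum(p * (w - mean_err) for p, w in zip(posteriors, word_errors))

def mwer_grad(log_probs, word_errors):
    """Analytic gradient of mwer_loss with respect to log_probs only."""
    posteriors = softmax(log_probs)
    expected_err = sum(p * w for p, w in zip(posteriors, word_errors))
    return [p * (w - expected_err) for p, w in zip(posteriors, word_errors)]

# Hypothetical 3-best list. The paths and their error counts are fixed inputs
# (they could have been decoded offline); only log_probs carry gradient.
log_probs = [-1.0, -1.5, -3.0]
word_errors = [0.0, 2.0, 5.0]

loss = mwer_loss(log_probs, word_errors)
grad = mwer_grad(log_probs, word_errors)
print(loss, grad)
```

In this sketch the gradient is negative for the path with fewer errors than expected and positive for the paths with more, so a gradient step shifts probability mass toward better hypotheses without ever differentiating through the search itself.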
