Does it sometimes make sense to save the model with the best CER, on the assumption that a lower CER will help the beam decoder make better decisions (even if the greedy decoder gives a worse WER at that CER)? Add the ability to choose which model to persist: best WER, best CER, or best loss.
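A minimal sketch of what the switch could look like, assuming validation produces a dict with `wer`, `cer`, and `loss` per epoch and that lower is better for all three. `BestModelTracker`, `SUPPORTED_METRICS`, and `should_save` are hypothetical names for illustration, not existing code in the project:

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical metric names; lower is assumed to be better for all three.
SUPPORTED_METRICS = ("wer", "cer", "loss")


@dataclass
class BestModelTracker:
    """Tracks the best value of one chosen metric across epochs."""

    metric: str = "wer"
    best: float = field(default=float("inf"), init=False)

    def __post_init__(self) -> None:
        if self.metric not in SUPPORTED_METRICS:
            raise ValueError(
                f"metric must be one of {SUPPORTED_METRICS}, got {self.metric!r}"
            )

    def should_save(self, results: Dict[str, float]) -> bool:
        """Return True (and record the value) if the chosen metric improved."""
        value = results[self.metric]
        if value < self.best:
            self.best = value
            return True
        return False
```

In the training loop, the caller would persist the checkpoint only when `tracker.should_save(results)` returns `True`. With `metric="cer"`, an epoch where CER improves but greedy WER gets worse would still be saved, which is the case the question above is about.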