
p(blank symbol) >> p(non-blank symbol) during NN-CTC training #4

lifelongeek opened this issue Jun 24, 2015 · 1 comment
@lifelongeek

Hi all,

I want to discuss an issue regarding training DNN/CNN-CTC for speech recognition (Wall Street Journal corpus). I modeled the output units as characters.

I observed that the CTC objective function increased and eventually converged during training.

[figure: CTC objective curve over training]

But I also observed that the final NN outputs show a clear tendency: p(blank symbol) >> p(non-blank symbol) for every speech time frame, as in the following figure.

[figure: per-frame output probabilities, blank dominating at all frames]

In Alex Graves' paper, a trained RNN should show high p(non-blank) spikes at some points, as in the following figure.

[figure: example from Graves' paper with localized non-blank spikes]

Do you see the same behavior when you train NN-CTC for a sequence labeling problem? I suspect the reason is that I use an MLP/CNN instead of an RNN, but I can't clearly explain why that would be the cause.
Any ideas about this result?
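One way to quantify the symptom described above is to measure, over the per-frame softmax outputs, how often blank is the argmax and how much mass it holds on average. A minimal NumPy sketch (function name, class layout, and blank index are illustrative, not from the original post):

```python
import numpy as np

def blank_dominance(posteriors, blank_index=0):
    """Given per-frame posteriors of shape (T, C), return the fraction of
    frames whose argmax is the blank symbol and the mean blank probability."""
    frac_blank_argmax = float(np.mean(np.argmax(posteriors, axis=1) == blank_index))
    mean_blank_prob = float(np.mean(posteriors[:, blank_index]))
    return frac_blank_argmax, mean_blank_prob

# Synthetic example: 4 frames, 3 classes (blank = index 0), blank-dominated
posteriors = np.array([
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.95, 0.03, 0.02],
    [0.80, 0.15, 0.05],
])
frac, mean_p = blank_dominance(posteriors)  # frac = 1.0, mean_p = 0.875
```

If `frac` stays near 1.0 late into training, the network has collapsed to the all-blank solution the question describes.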

Thank you for reading my question.

@tbluche

tbluche commented Feb 29, 2016

Hi,
I have had much the same experience with handwriting recognition.
I explored CTC training with different NNs during my PhD, and the results were the following:

  • CTC training works especially well with RNNs.
  • CTC training makes NNs first learn to predict only blanks; it can take some time for relevant predictions to appear. Adaptive learning-rate methods like RMSProp work very well to circumvent this issue.
  • Training for one epoch with HMM Viterbi alignments before switching to CTC might also help: from scratch, it can be hard to learn to align and transcribe at the same time.
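For reference, the objective being optimized in this thread can be computed with the standard CTC forward (alpha) recursion over the label extended with blanks. A minimal NumPy sketch (function name, blank index, and the toy example are illustrative, not from this thread):

```python
import numpy as np

def ctc_neg_log_likelihood(posteriors, label, blank=0):
    """CTC forward (alpha) recursion over per-frame posteriors of shape (T, C).

    Returns -log p(label | x), summing over all frame-level alignments that
    collapse to `label` after removing repeats and blanks."""
    T = posteriors.shape[0]
    # Extended label with blanks interleaved, e.g. [a, b] -> [-, a, -, b, -]
    ext = [blank]
    for c in label:
        ext += [c, blank]
    S = len(ext)

    alpha = np.zeros((T, S))
    alpha[0, 0] = posteriors[0, ext[0]]
    if S > 1:
        alpha[0, 1] = posteriors[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1, s]
            if s >= 1:
                total += alpha[t - 1, s - 1]
            # Skip transition only between distinct non-blank labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1, s - 2]
            alpha[t, s] = total * posteriors[t, ext[s]]

    p = alpha[T - 1, S - 1] + (alpha[T - 1, S - 2] if S > 1 else 0.0)
    return -np.log(p)

# Two frames, two classes (blank=0, 'a'=1), uniform posteriors: the valid
# alignments for label [a] are (-,a), (a,-), (a,a), each with p = 0.25
probs = np.full((2, 2), 0.5)
loss = ctc_neg_log_likelihood(probs, [1])  # -log(0.75) ≈ 0.288
```

Because blank appears at every other position of the extended label, an all-blank output assigns probability only to the empty transcription; this is why gradients early in training push mass toward blanks before alignment emerges, matching the behavior described above.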
