Ticket #481 (new enhancement)
add MFCC_0_D_A model type
| Reported by: | kmaclean | Owned by: | kmaclean |
|---|---|---|---|
| Priority: | major | Milestone: | Acoustic Model 0.1.2 |
| Component: | Acoustic Model | Version: | Acoustic Model 0.1.1 |
| Keywords: | | Cc: | |
Description
from this post
Hi!
We (simon) have basically been using the voxforge script (transformed to C++ code) to create the speech model from the users' input files.
Yesterday, we got a suggestion to switch the model type to MFCC_0_D_A (which means that the model uses 39 features instead of just 25). According to someone at the SPSC Graz this would especially improve the model when the training data uses more than one microphone.
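The 39 versus 25 figures follow from how HTK target-kind qualifiers stack up. Below is a rough sketch of that arithmetic. It assumes 12 cepstral coefficients (NUMCEPS = 12) and that the current voxforge setup uses MFCC_0_D_N_Z, neither of which is stated in the post. The function name `feature_vector_size` is made up for illustration, and only the 0, E, D, A, N and Z qualifiers are handled.

```python
def feature_vector_size(target_kind: str, num_ceps: int = 12) -> int:
    """Estimate the HTK feature vector size for a target kind such as MFCC_0_D_A.

    Assumption: num_ceps cepstral coefficients per frame (12 is a common choice).
    """
    qualifiers = set(target_kind.split("_")[1:])

    # Static part: cepstra plus one energy-like term for _0 (C0) or _E (log energy).
    energy_terms = int("0" in qualifiers) + int("E" in qualifiers)
    static = num_ceps + energy_terms

    # _D appends delta coefficients, _A appends acceleration coefficients.
    blocks = 1 + int("D" in qualifiers) + int("A" in qualifiers)
    size = static * blocks

    # _N suppresses the absolute (static) energy term but keeps its deltas.
    if "N" in qualifiers:
        size -= energy_terms

    # _Z (cepstral mean normalisation) does not change the vector size.
    return size


if __name__ == "__main__":
    for kind in ("MFCC_0_D_N_Z", "MFCC_0_D_A"):
        print(kind, feature_vector_size(kind))
```

Under these assumptions MFCC_0_D_N_Z gives 25 and MFCC_0_D_A gives 39, which matches the numbers in the post. The prototype's vector size has to be changed to agree with whichever target kind is used.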
Moreover, he suggested using HHEd's MU command to add more Gaussian mixture components to the final model. I implemented the suggestions in simon and the improvement in recognition rate was drastic (in my tests).
Maybe you could try to change the model creation procedure for the voxforge model and see if this improves recognition rates there as well?
Steps to take if you want to try it out:
Change your model type to MFCC_0_D_A, adjust your prototype to use 39 features and add a few new steps after hmm15:
Use HHEd like this:
HHEd -A -D -T 1 -H hmm15/macros -H hmm15/hmmdefs -M hmm16 gmm1.hed tiedlist
Where gmm1.hed contains:
MU 4 {*.state[2-4].mix}
Re-estimate hmm16 twice, and repeat (technically for as long as you see recognition rates improve).
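The split-then-re-estimate cycle described above could be planned with a small script like the one below. This is only a sketch: it builds the .hed contents and the command lines without running anything. The HHEd line is the one from the post. The HERest arguments (config, wintri.mlf, train.scp and the pruning thresholds) are assumptions modelled on the usual voxforge tutorial setup, and the mixture schedule (4, then 8) and the function name `mixture_up_plan` are hypothetical.

```python
# Assumed re-estimation command; the file names and thresholds are not from the post.
HEREST_TEMPLATE = (
    "HERest -A -D -T 1 -C config -I wintri.mlf -t 250.0 150.0 3000.0 "
    "-S train.scp -H hmm{src}/macros -H hmm{src}/hmmdefs -M hmm{dst} tiedlist"
)


def mixture_up_plan(mixture_counts, start=15, reestimations=2):
    """Return (.hed file contents, command lines) for repeated MU + re-estimation rounds.

    mixture_counts: target number of mixture components per round, e.g. [4, 8].
    start: index of the last existing model directory (hmm15 in the post).
    """
    hed_files = {}
    commands = []
    current = start
    for round_no, count in enumerate(mixture_counts, start=1):
        hed_name = f"gmm{round_no}.hed"
        # MU raises the number of mixture components in emitting states 2-4.
        hed_files[hed_name] = f"MU {count} {{*.state[2-4].mix}}\n"
        commands.append(
            f"HHEd -A -D -T 1 -H hmm{current}/macros -H hmm{current}/hmmdefs "
            f"-M hmm{current + 1} {hed_name} tiedlist"
        )
        current += 1
        # Re-estimate the enlarged model, each pass writing to a new directory.
        for _ in range(reestimations):
            commands.append(HEREST_TEMPLATE.format(src=current, dst=current + 1))
            current += 1
    return hed_files, commands


if __name__ == "__main__":
    heds, cmds = mixture_up_plan([4, 8])
    for name, body in heds.items():
        print(f"# {name}: {body.strip()}")
    for cmd in cmds:
        print(cmd)
```

With the schedule [4, 8] this plans hmm16 through hmm21. In practice you would evaluate recognition after each round and stop extending the schedule once the rate no longer improves.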
You can find the simon implementation here:
Greetings,
Peter