Ticket #153 (closed defect: invalid)

Opened 10 years ago

Last modified 10 years ago

htk error on step 10

Reported by: anonymous Owned by: kmaclean
Priority: major Milestone: Website 0.2
Component: Acoustic Model Version: 0.1-alpha
Keywords: Cc:

Description

Step 10 - Making Tied-State Triphones

============================================================== making hmm13

ERROR [+2662] AssignStructure: cannot find tree for b state 2

FATAL ERROR - Terminating program C:\cygwin\HTK\htk-3.3-windows-binary\htk\HHEd.exe

Attachments

debug.rar (35.4 KB) - added by kmaclean 10 years ago.
log files

Change History

comment:1 Changed 10 years ago by kmaclean

  • Status changed from new to assigned

Hi Lucian,

Please zip your logs directory created when you ran the HTK_Compile_Model.sh script and attach it to this ticket.

I'm most interested in the contents of the "logs/Step10_HHed_hmm13.log" log file. However, HTK lets an error made in an earlier step go unnoticed until it causes problems in a later step, so I might need all the log files.
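Because an early mistake may only surface later, it can help to scan every log file for HTK messages rather than only the last one. The following is a minimal Python sketch, not part of the VoxForge scripts: the function names are hypothetical, and it assumes the logs are plain-text `*.log` files in which HTK messages look like the `ERROR [+2662] ...` line in this ticket (warnings are assumed to follow the same pattern with the word WARNING).

```python
import re
from pathlib import Path

# Assumed message shape, based on the error quoted in this ticket:
#   ERROR [+2662] AssignStructure: cannot find tree for b state 2
HTK_MESSAGE = re.compile(r"^\s*(ERROR|WARNING)\s+\[([+-]?\d+)\]\s*(.*)$")


def scan_log_text(text):
    """Return (level, code, message) tuples for each HTK message in text."""
    found = []
    for line in text.splitlines():
        match = HTK_MESSAGE.match(line)
        if match:
            found.append((match.group(1), match.group(2), match.group(3)))
    return found


def scan_log_dir(log_dir):
    """Map each *.log file name in log_dir to the HTK messages it contains."""
    results = {}
    directory = Path(log_dir)
    if not directory.is_dir():
        return results  # nothing to scan
    for path in sorted(directory.glob("*.log")):
        messages = scan_log_text(path.read_text(errors="replace"))
        if messages:
            results[path.name] = messages
    return results
```

Calling `scan_log_dir("logs")` returns a dictionary keyed by file name. Files are visited in name order, which is not necessarily step order, so the earliest affected step still has to be picked out by eye.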

Are you trying anything different from the steps in the Howto? Did you add any new prompts?

thanks,

Ken

comment:2 Changed 10 years ago by lucianbancescu@…

Hi Ken,

I'm trying to train a Romanian acoustic model, and since I couldn't find a suitable lexicon for Romanian I created a lexicon file of about 330 words that I'm attaching here.

So right now I have a prompts file of about 43 lines and 43 wav files (see attachment).

I recorded all the words into wav files and processed them using the HTK_Compile_Model.sh script. I also ran the steps manually (one by one) and got the same result.

At first I suspected this was happening because I hadn't balanced my prompts file to cover all the English phonemes, but I got the same result when I recorded the example files with my own voice ("CALL STEVE YOUNG...").

The audio files are correctly recorded (low noise, low echo, correct sample rate, mono, etc.)

I can send you the wav files as well, if this is of any help.

I'm using VoxForge under Cygwin.

Is this happening because some words contain phonemes that don't exist in English (e.g. Romanian has a voiceless unrounded central vowel as in ROM_A_N)?

Thanks in advance

Lucian

PS. The lexicon file contains some English words that I added to see whether anything changes once the prompts are phonetically balanced.

Ah, and it's not possible to upload files, so I took the liberty of sending the files to your email address.

comment:3 Changed 10 years ago by kmaclean

Hi Lucian,

Sorry, I forgot to set some permissions for anonymous uploads - it's fixed now.

"Is this happening because some words contain phonemes that don't exist in English (e.g. Romanian has a voiceless unrounded central vowel as in ROM_A_N)?"

This might be part of the problem.

Since you are trying to create a non-English Acoustic Model, you don't need to do Step 10. Try using Julius to recognize with your Step 9 acoustic models.

Step 10 tries to create *unseen* triphones based on the complete *English* dictionary. What this means is that in Steps 1 to 9 you are creating acoustic models based on the words in your prompts file (in your case these are in Romanian). Step 10 uses this data to make a best guess at models for the pronunciations of words that were not in the prompts file (and which have no corresponding audio data). Since you used Romanian prompts for Steps 1 to 9, you will get problems in Step 10 because the VoxForge script uses an English dictionary, which has a completely different set of phonemes. You would need a larger dictionary of *Romanian* words (that are not included in your prompts file) for Step 10 to be of any use to you. Since you don't have one, you should try recognition with your Step 9 acoustic models (hmm12/hmmdef and hmm12/macros).
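To see how far apart the two phoneme sets are, the phone symbols used in each dictionary can be compared directly. The following is a minimal Python sketch with hypothetical function names and made-up sample entries. It assumes HTK-style dictionary lines of the form `WORD [OUTPUT] phone phone ...`, where the bracketed output symbol is optional and contains no spaces.

```python
def phones_in_lexicon(lines):
    """Collect the phone symbols used in HTK-style dictionary lines.

    Assumes each line looks like: WORD [OUTPUT] phone phone ...
    """
    phones = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        rest = fields[1:]
        if rest[0].startswith("["):
            rest = rest[1:]  # skip the optional [OUTPUT] field
        phones.update(rest)
    return phones


def compare_inventories(trained_lexicon, target_lexicon):
    """Report phones that appear in only one of the two dictionaries."""
    trained = phones_in_lexicon(trained_lexicon)
    target = phones_in_lexicon(target_lexicon)
    return {
        "only_in_trained": sorted(trained - target),
        "only_in_target": sorted(target - trained),
    }


# Made-up entries for illustration only; the phone symbols are hypothetical.
romanian = ["ROMAN [ROMAN] r o m i_ n", "DA [DA] d a"]
english = ["CALL [CALL] k ao l", "DO [DO] d uw"]
report = compare_inventories(romanian, english)
```

Any phone listed under `only_in_target` has no training audio behind it, which is the situation described above when English dictionary entries meet models trained on Romanian prompts.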

Where did you get your phoneme list? Are you using a modified IPA dictionary?

In addition to requiring a larger dictionary in the target language, Step 10 is a little tricky in that the tree.hed file needs 'questions' (QS), which are used to figure out the missing triphones. These questions are language specific, and I have not really dug into them to understand exactly what they do. The HTK book and/or mailing list archives might provide some info on this.
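Besides the QS questions, tree.hed also contains one TB (tree-building) command per phone and state. The message "cannot find tree for b state 2" appears to mean that HHEd had no such tree for phone "b" when it tried to synthesize an unseen triphone. The following is a minimal Python sketch that writes out TB lines for a given phone list. It follows the TB layout shown in the HTK book tutorial as I recall it; the function name, the 350.0 threshold, and the assumption of three emitting states numbered 2 to 4 are illustrative and should be checked against your own prototype and the HTK book.

```python
def tb_commands(phones, threshold=350.0, states=(2, 3, 4)):
    """Build one HHEd TB command per phone and emitting state.

    Assumed line layout (HTK book tutorial style):
      TB 350.0 "b_s2" {(b, *-b, *-b+*, b+*).state[2]}
    """
    lines = []
    for state in states:
        for phone in phones:
            # Monophone, left-context, triphone and right-context patterns.
            items = f"{phone}, *-{phone}, *-{phone}+*, {phone}+*"
            lines.append(
                f'TB {threshold} "{phone}_s{state}" {{({items}).state[{state}]}}'
            )
    return lines


# Hypothetical phone list for illustration only.
for command in tb_commands(["a", "b"]):
    print(command)
```

This only covers the TB lines. The QS questions that the trees are built from still have to be written by hand for the target language.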

hope this helps,

Ken

Changed 10 years ago by kmaclean

log files

comment:4 Changed 10 years ago by kmaclean

  • Status changed from assigned to closed
  • Resolution set to invalid
  • Milestone set to Website 0.2

Not a bug ...
