root/Trunk/Scripts/Audio_scripts/AudioSegmentation/AudioBook.pm

Revision 2637, 25.9 kB (checked in by kmaclean, 4 months ago)

minor grammar fix

1 #! /usr/bin/perl
2 $VERSION = '0.2.2';
3
4 =head1 NAME
5
6 AudioBook - Convert a single transcribed audio file into audio segments
7
8 =cut
9
10 package AudioBook;
11 use strict;
12 use diagnostics;
13 use Carp;
14 use Getopt::Std;
15 use File::Basename;
16 use File::Copy;
17
18 use AudioBook::Audio;
19 use AudioBook::Text;
20 use AudioBook::Dictionary;
21
22 use AudioBook::Segments;
23 use AudioBook::MissingWords;
24 use AudioBook::MissingWords::CommandLine;
25 use AudioBook::Chapter;
26
27 =head1 SYNOPSIS
28
29  $./AudioBook.pm -h                                             display help
30  $./AudioBook.pm -a speechfile.wav -t text.txt          minimal run configuration
31  
32 =head1 REQUIREMENTS
33
34 =item 1 - Sequitur G2P trainable Grapheme-to-Phoneme converter (GPL v2; requires Python to be installed)
35
36         http://www-i6.informatik.rwth-aachen.de/web/Software/g2p.html
37
38 =item 2 - HTK Hidden Markov Model Toolkit (note: the source is "open" - i.e. you can read the code - but there are distribution restrictions)
39
40         http://htk.eng.cam.ac.uk/
41        
42         The HTK toolkit needs to be in your path
43         (see http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/download)
44
45 =item 3 - Perl packages
46
47         Term::ReadLine::Gnu
48         Audio::Wav
49         Lingua::EN::Numbers
50         XML::LibXML
51         File::Copy
52
53 =head1 DESCRIPTION
54
55 This is a command line program that segments a speech audio file into 15 word (on average) speech segments. 
56 It is executable from the command line and uses the following configuration options to help in segmenting speech:
57
58         VoxForge Audio Segmentation Script Parameters
59         =============================================
60         -a      * audio file name (WAV format only)
61         -b      notify if beam width for Forced Alignment exceeds a certain level (default = 250)
62                 (does not set HVite's beam width parameter)
63         -d      pronunciation dictionary  (default = AudioBook/input_files/VoxForgeDict)
64         -h      show help
65         -i      interactive validation of missing word pronunciations
66         -l      LICENSE file (default = AudioBook/input_files/LICENSE)
67         -m      Target maximum sentence length (default = 20 words)
68         -p      Minimum pause for sentence break (default = 2000000 in units of 100ns)
69         -q      log words with single quotes (default = yes)
70         -r      README file (default = AudioBook/input_files/README)
71         -s      Average sentence length (default = 15 words)
72         -t      * text file name (containing transcriptions of speech in audio file)
73         -u      username or name you want file stats collected by on VoxForge Metrics
74                 page:   (http://www.voxforge.org/home/downloads/metrics)
75         -v      validate segment audio files to prompt text using forced alignment
76         -w      validate missing word pronunciations to audio recordings
77         -x      unique tar file suffix (max 3 characters - remainder is truncated)
78         -S      run sanity test
79         -T      create gzipped/tar file
80
81         * minimum required for script to run
82
83
84 =head1 Suggested Segmentation Approach:
85
86 =head2 Step 1
87
88 Spell check the text for the audiobook and remove any mistakes or archaic spellings (it's good to remove
89 these to ensure that the pronunciation dictionary does not get cluttered).
90
91
92 =head2 Step 2 - First Pass Forced Alignment - Getting it to Run Completely Without Errors
93
94 Execute the script as follows using only the audio file ('-a') and text file ('-t') parameters:
95
96   $./AudioBook.pm -a audio -t eText.txt
97  
98 This tries to match the words in the text file with the words in the speech audio file, and create time stamps for each word. 
99 These time stamps are used to determine where pauses are located, and if the pause is large enough, it will create a segment
100 of the sentence, and put an entry into the prompts file.
101
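As a rough illustration of how word time stamps can be turned into candidate break points, the sketch below parses aligned lines of the form "start end word" (times in units of 100ns) and reports each word that is followed by a pause at least as long as the -p threshold.  The input layout, the sub name find_pauses and the sample data are assumptions made for this example; this is not the code used by AudioBook::Segments.

```perl
use strict;
use warnings;

# Hypothetical helper: return [word, pause] pairs for every word that is
# followed by a gap of at least $min_pause (all times in units of 100ns).
sub find_pauses {
    my ($min_pause, @lines) = @_;
    my @words;
    foreach my $line (@lines) {
        # assumed layout: "<start> <end> <word>"
        my ($start, $end, $word) = split ' ', $line;
        next unless defined $word;
        push @words, { start => $start, end => $end, word => $word };
    }
    my @pauses;
    foreach my $i (0 .. $#words - 1) {
        # pause = start of the next word minus end of this word
        my $gap = $words[$i + 1]{start} - $words[$i]{end};
        push @pauses, [ $words[$i]{word}, $gap ] if $gap >= $min_pause;
    }
    return @pauses;
}

# made-up sample data
my @labels = (
    "0 3000000 THE",
    "3000000 7000000 CAT",
    "9500000 12000000 SAT",
    "12500000 15000000 DOWN",
);
foreach my $p (find_pauses(2000000, @labels)) {
    # divide by 1e7 to convert units of 100ns to seconds
    printf "pause of %.2f s after %s\n", $p->[1] / 1e7, $p->[0];
}
```

With the default threshold of 2000000 (0.2 seconds), only the gap after CAT in the sample data is long enough to qualify as a break.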
102 =head3 NOTES
103
104 =head4 Text Does not Match Audio
105
106 The text file *must exactly* match the contents of the speech audio file (...well actually, it will run OK even if some words do not exactly
107 match... it only needs about 98-99% accuracy).
108
109 If there are any errors when you are trying to run the segmentation script for the first time on a new set of text and speech audio files,
110 the likely reason is that there is something in the text file that does not match what was said in the audio file.  Figuring this out usually
111 ends up being an iterative process (i.e. you fix an error, run the script, fix another error, ... until you get an error-free run).  Look
112 for non-alphanumeric characters and remove them from the text - like multiple dashes (---), multiple periods (...) - any weird non-alphanumeric
113 characters not being automatically removed by the script.
114
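A minimal sketch of this kind of manual clean-up is shown below.  The sub name clean_line and the exact set of characters kept (ASCII letters, digits, apostrophes and whitespace) are assumptions for illustration only; they are not the rules applied by AudioBook::Text.

```perl
use strict;
use warnings;

# Hypothetical helper: replace runs of dashes or periods, and any character
# that is not an ASCII letter, digit, apostrophe or whitespace, with a space.
sub clean_line {
    my ($line) = @_;
    $line =~ s/-{2,}|\.{2,}/ /g;        # "---" and "..." become a space
    $line =~ s/[^A-Za-z0-9'\s]/ /g;     # remaining punctuation becomes a space
    $line =~ s/\s+/ /g;                 # collapse runs of whitespace
    $line =~ s/^ | $//g;                # trim leading and trailing space
    return $line;
}

print clean_line("Well---I thought... it's #42, isn't it?"), "\n";
```

Running a pass like this over a copy of the text file and comparing it with the original is a quick way to spot the characters that may be causing a failed run.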
115 If there is a large divergence in the text from the speech audio, then you will have to manually listen to the speech audio to determine
116 where the biggest transcription errors lie, and then modify the original text file to match the speech audio file.  This may involve mistakes
117 (e.g. the reader missing a line while reading the text) or formatting issues in the text (e.g. there might be columnar data in the text,
118 and it is read by column by the reader - you then need to rewrite the text to match how the reader read the passage).
119  
120 =head4 Dealing With Out-of-vocabulary Words
121
122 Forced Alignment is performed with HTK's HVite tool as part of the segmentation process.   Forced Alignment simply means that HVite
123 takes the known transcription, looks up the phone sequence for each word in the pronunciation dictionary, and finds the position in
124 the audio that best matches each word's phone sequence.
125
126 HVite requires each word in the text being force-aligned to have a pronunciation entry in the pronunciation lexicon.  The AudioBook.pm
127 script uses Sequitur G2P (trained on the VoxForge pronunciation lexicon) to provide draft pronunciations for Out-Of-Vocabulary words so that
128 the first pass forced alignment will work. 
129
130 This seems to be "good enough" to find silences of reasonable lengths.  Using this information, the script can create prompt entries and
131 corresponding audio segments. 
132
133 =head4 Segmenting Large Audio Files
134
135 For larger files (i.e. greater than 30 minutes of audio), you *may* need to manually split the audio file into 30 minute segments, with
136 corresponding text files.
137
138 =head4 Automatic Numeric Conversion
139
140 This script converts numbers to their word equivalent using these Perl packages:
141
142         Lingua::EN::Numbers qw(num2en num2en_ordinal);
143         Lingua::EN::Numbers::Years;
144
145 These packages make assumptions that need to be validated.  Usually 1, 2, and 3 digit numbers get processed OK. 
146 4 digit numbers can be pronounced a couple of ways, and should be checked.  For example, the script will convert
147 these numbers as follows:
148
149         converted number:7500: to seven thousand five hundred
150         converted number:8500: to eight thousand five hundred
151
152 But the reader actually said "seventy five hundred" and "eighty five hundred".  These need to be
153 corrected manually.
154
155 This script makes the assumption that 4 digit numbers between 1000 and 2100 are years - this needs to be validated.
156
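To make the checks above concrete, the sketch below sorts numeric tokens into those treated as years, those that should be reviewed by hand, and those that are usually safe.  The sub name classify_number and the decision to flag every number of 4 or more digits for review are assumptions for this example; the script itself relies on the Lingua::EN::Numbers packages listed above.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how much manual checking a numeric token needs.
#   'year'   - 4 digits between 1000 and 2100 (assumed to be a year)
#   'review' - any other number with 4 or more digits (assumption: check by hand)
#   'ok'     - 1 to 3 digits (usually converted correctly)
sub classify_number {
    my ($token) = @_;
    return 'not_a_number' unless $token =~ /^\d+$/;
    return 'year'   if length($token) == 4 && $token >= 1000 && $token <= 2100;
    return 'review' if length($token) >= 4;
    return 'ok';
}

foreach my $token (qw(42 1865 7500 8500)) {
    printf "%-5s %s\n", $token, classify_number($token);
}
```

A list produced this way can be used to search the text file for the numbers that most need a listening check.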
157 =head2 Step 3 - First Pass Forced Alignment - Runs completely, but there are Errors
158
159 If the transcription errors are only minor, then the first pass forced alignment usually completes successfully.  However, you might
160 see "No tokens survived to final node of network at beam" errors in the HVite log (located in interim_files/logs).
161
162 You need to fix these errors by ensuring that the prompt text matches the prompt audio.
163
164 =head2 Step 4 - First Pass Forced Alignment - Verify the Segments
165
166 Get the script to perform a forced alignment on each of the segments, and display the worst 15 "average log likelihood per frame"
167 scores.  Then check the transcription and listen to the corresponding audio, and make corrections to the text, repeat as needed.
168
169 Run the script as follows:
170
171   $./AudioBook.pm -a audio -t eText.txt -v
172
173 The verify switch performs a forced alignment on the individual segments generated from the first pass forced alignment.  Low scores
174 (i.e. the lowest average log likelihood per frame scores) indicate that the transcription text *might* not match the corresponding audio
175 file.  Look at the segment text and listen to the corresponding audio file to determine if they match.  If they do not match (they might
176 still match, but just have a low score), fix the text in your original text transcription file and repeat this process (i.e. run
177 the AudioBook program again with the verify switch on) until you get a clean run. 
178
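The ranking step can be pictured with the short sketch below, which picks the segments with the lowest scores from a hash of segment ID to average log likelihood per frame.  The sub name worst_segments, the segment IDs and the scores are all made up for illustration; how the script actually collects and reports the scores is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the $count segment IDs with the lowest scores,
# worst first.  %$scores maps a segment ID to its average log likelihood
# per frame; ties are broken by segment ID.
sub worst_segments {
    my ($scores, $count) = @_;
    my @ids = sort { $scores->{$a} <=> $scores->{$b} or $a cmp $b } keys %$scores;
    $count = @ids if $count > @ids;    # never ask for more than we have
    return @ids[0 .. $count - 1];
}

# made-up scores for illustration
my %scores = (seg0001 => -62.4, seg0002 => -81.9, seg0003 => -58.0, seg0004 => -75.3);
print "$_\t$scores{$_}\n" foreach worst_segments(\%scores, 2);
```

A low score only marks a segment as worth listening to; it does not prove that the text and audio disagree.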
179 =head2 Step 5 - First Pass Forced Alignment - Adjusting Prompt Length
180
181 After you can get the First Pass Forced Alignment to run without errors, check the AudioBook.log log file (in the output_files directory) and
182 review the length of the created prompts. If there are too many prompts over 30 words long (one or two prompts in the low 30s is passable...),
183 reduce the size of the pause ("-p" switch) and run First Pass Forced Alignment again - something like this:
184
185   $./AudioBook.pm -a audio -t eText.txt -v -p 1000000
186  
187 Continue making adjustments until you can get reasonable prompt lengths.
188
189 =head4 Note
190
191 The worst case scenario is that you cannot segment your audio because it does not have any pauses that are long enough to use for a
192 segment.  This is unlikely, given that people need to breathe in every once in a while.  What will likely occur is that you will have a few
193 very long segments because the person spoke continuously for a long period of time.  You will probably have to segment these longer
194 prompts manually.
195
196 =head2 Step 6 - Validate Suggested Out-of-Vocabulary Word Pronunciations
197
198 The pronunciations generated by the Sequitur G2P scripts need to be manually reviewed before they are added to the
199 pronunciation dictionary.  One way to do this is to use Speech Recognition to determine the phone set of the word in the actual audio file.
200 You can do this with the '-w' switch:
201
202   $./AudioBook.pm -a audio -t eText.txt -v -p 1000000 -w
203
204 The -w switch generates a report (MissingWords_combined) that contains a list of all the OOV words, with the speech segment ID and text
205 (so you can listen to the audio segment), the g2p recommended phone list, and HVite phone list recommendations (determined using speech
206 recognition), so you can manually validate the final pronunciations.
207
208 =head4 Note
209
210 Note that this approach is only as good as the acoustic model you are using.  The pronunciations still need to be validated against the Sequitur G2P recommended
211 pronunciations.  Please donate some speech to VoxForge to help improve our acoustic models.
212
213 =head2 Step 7 - Interactive Missing Word Validation
214
215 You can also use the script interactively (using the -i switch) to review the Sequitur G2P suggested phone lists and HVite pronunciations.  It
216 is a simple command line script.
217
218 This mode requires the output of the -w switch (an XML version of the MissingWords_combined file called MissingWords.xml; the -w switch in turn needs the -v
219 switch).  The -i switch uses the contents of the MissingWords.xml file to prompt the user to select or edit a suggested pronunciation.  Results
220 are placed in the MissingWords_final file, and if the -d switch is given, then that dictionary will be updated with the results, like this:
221
222   $./AudioBook.pm -i -d /home/me/voxforge/VoxForgeDict
223
224 =head2 Step 8 - Validated Pronunciation Lexicon
225
226 If you are submitting your segmented audio to VoxForge, please include your *validated* Out-of-Vocabulary word pronunciations
227 with your submission as a separate file called: "OOV_pron.txt"
228
229 =head1 ALGORITHM
230
231 =head2 Audio Segmentation
232
233 This program tries to segment the speech audio file into 15 word sentences.  However, if the pause following the 15th word relative to the current
234 sentence start position is too short, the algorithm looks at the previous word (i.e. word 14) to see if it has a pause of suitable duration. 
235 If not, it then looks at the word following the 15th word (i.e. word 16), and so on until a pause of suitable
236 duration can be found, increasing the number of words to look behind and ahead each time.
237
238 The default pause duration is 2000000 in units of 100ns (i.e. 0.2 seconds).  This can be changed (using the "-p" switch) if the speech audio file doesn't segment well
239 enough with this default.
240
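The search order described in this section can be sketched roughly as follows.  This is an illustration only: the sub name find_break, its arguments and the sample pause list are assumptions, and it ignores the maximum sentence length (-m), so it should not be read as the code in AudioBook::Segments.

```perl
use strict;
use warnings;

# Hypothetical helper.  @$pauses holds the pause (in units of 100ns) that
# follows each word; $start is the index of the first word of the sentence.
# Returns the index of the word to break after, or nothing if no remaining
# word is followed by a long enough pause.
sub find_break {
    my ($pauses, $start, $target_len, $min_pause) = @_;
    my $target = $start + $target_len - 1;    # e.g. the 15th word
    my $last   = $#{$pauses};
    for (my $offset = 0; ; $offset++) {
        my $behind = $target - $offset;
        my $ahead  = $target + $offset;
        last if $behind < $start && $ahead > $last;    # nothing left to try
        # look behind first (word 14), then ahead (word 16), and so on
        foreach my $i ($offset ? ($behind, $ahead) : ($target)) {
            next if $i < $start || $i > $last;
            return $i if $pauses->[$i] >= $min_pause;
        }
    }
    return;
}

# made-up data: 20 words with short pauses, except after words 13 and 17
my @pauses = (500000) x 20;
$pauses[12] = 2500000;
$pauses[16] = 3000000;
my $break = find_break(\@pauses, 0, 15, 2000000);
print "break after word ", $break + 1, "\n";
```

In the sample data words 13 and 17 are equally far from word 15, and the break lands after word 13 because the word behind the target is checked before the word ahead.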
241 =head2 Generating Out-of-Vocabulary Word Pronunciations
242
243 The script gets Sequitur to generate the 20 likeliest pronunciations for each OOV word, and then adds them to the dict file.  It then performs
244 another forced alignment on the audio segment containing the Out-of-Vocabulary word.  HVite will take the sequence of phoneme sounds that it
245 recognizes and try to match it to one of the possible pronunciations in the dictionary.  We are therefore using the audio to automatically
246 help generate the correct pronunciation.  Because the VoxForge acoustic models are not that accurate, these suggestions need to be validated
247 and compared with the pronunciations generated by Sequitur, and a judgment call needs to be made to select the correct pronunciations.
248
249 =cut
250
251 ####################################################################
252 ### Class Variables
253 ####################################################################
254 our($opt_a,$opt_b,$opt_d,$opt_h,$opt_i,$opt_l,$opt_m,$opt_p,$opt_r,$opt_s,$opt_t,$opt_x,$opt_q,$opt_S,$opt_T,$opt_u,$opt_v,$opt_w); # need to define these because using strict.
255 my $self = {};
256 $self->{'debug'} = 0;
257 $self->{'g2p_model'} = "AudioBook/input_files/g2p/models/model-5";
258 $self->{'htk_files'} = "AudioBook/input_files/htk";
259 $self->{'log'} = "AudioBook/output_files/AudioBook_Log";
260 bless($self,"AudioBook");
261
262 my $default_average_sentence_length = 15;
263 my $default_max_sentence_length = 20;
264 my $default_min_pause_for_sentence_break = 2000000;
265 my $command;
266
267 ####################################################################
268 ### Main
269 ####################################################################
270 $self->getOptions();
271 if ($self->getInteractive) {
272         my $xmlfile = 'AudioBook/interim_files/MissingWords.xml';                       
273         my $missingWords = AudioBook::MissingWords::CommandLine->new($xmlfile);
274         $missingWords->interactive();
275 } else {
276         $self->cleanupFiles();
277         if ($self->getTesting) {
278                 $command = ("cp AudioBook/input_files/VoxForgeDict AudioBook/interim_files/VoxForgeDict"); print "cmd:$command\n" if $self->{'debug'} ; system($command);
279         }
280         $self->process();
281 }
282 print "completed!\n";
283
284 ####################################################################
285 ### Methods
286 ####################################################################
287
288 =head1 METHODS (not user accessible)
289
290 =head2 process
291
292 Segment the user designated speech audio file (-a) using the supplied text file (-t)
293
294 =cut
295
296 sub process {
297         my ($self)= @_;
298         my $tarSuffix = $self->{"tarSuffix"};
299         my $chapter = AudioBook::Chapter->new($self);
300         # need draft missing word pronunciations before audio can be processed
301         my $missingWords = $chapter->processText();
302         $chapter->processAudio();               
303        
304         my $segments = AudioBook::Segments->new($self,$chapter);
305         $segments->processAudio();     
306                        
307         if ($chapter->getMissingWordFound()) { 
308                 if ($self->getVerify_out_of_vocabulary_pronunciations()) {
309                         $missingWords->getAudio();
310                 }
311         }
312                
313         if (defined($tarSuffix)){
314                 _createTarFile($self);
315         }       
316 }
317
318 =head2 cleanupFiles
319
320 Removes any old files in the AudioBook/interim_files/ and AudioBook/output_files/ directories, prior to processing.
321
322 =cut
323
324 sub cleanupFiles {
325         my ($self)= @_;
326         if (defined(<AudioBook/interim_files/*>)) {
327                 unlink (<AudioBook/interim_files/*>);
328         }
329         if (defined(<AudioBook/interim_files/logs/*>)) {
330                 unlink (<AudioBook/interim_files/logs/*>);     
331         }
332         if (defined(<AudioBook/interim_files/missingWordsFolder/*>)) {
333                 unlink (<AudioBook/interim_files/missingWordsFolder/*>);       
334         }
335         if (defined(<AudioBook/interim_files/wav/*>)) {
336                 unlink (<AudioBook/interim_files/wav/*>);       
337         }       
338         if (defined(<AudioBook/output_files/wav/*>)) {
339                 unlink (<AudioBook/output_files/wav/*>);       
340         }
341         if (defined(<AudioBook/output_files/*>)) {     
342                 unlink (<AudioBook/output_files/*>);
343         }
344 }
345
346 =head2 _createTarFile
347
348 Creates a gzipped tar file from the files contained in AudioBook/output_files
349
350 =item * -T      Switch to turn on this functionality
351
352 =item * -u      username used in name of the tar file
353
354 =cut
355
356 sub _createTarFile { # private
357         my ($self)= @_;
358         my $debug = $self->{'debug'};
359         my $username = $self->{"username"};
360         my $tarSuffix = $self->{"tarSuffix"};
361         my $readme = $self->{"README"};
362         my $license = $self->{"LICENSE"};
363        
364         my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
365         $year += 1900;
366         $mon = sprintf("%02d", $mon + 1);   # localtime() months are 0-based
367         $mday = sprintf("%02d", $mday);
368         print "creating gzipped tar file:$username\-$year$mon$mday\-$tarSuffix\.tgz \n";
369         if (defined($readme)) {
370                 copy("$readme","AudioBook/output_files/README");
371         } else {
372                 print "Warning: no README file to copy\n";
373         }
374         if (defined($license)) {
375                 copy("$license","AudioBook/output_files/LICENSE");
376         } else {
377                 print "Warning: no LICENSE file to copy\n";
378         }
379         copy("AudioBook/interim_files/prompts","AudioBook/output_files/prompts");
380         $command = ("cp AudioBook/interim_files/wav/* AudioBook/output_files/wav/"); print "cmd:$command\n" if $debug; system($command);       
381         if ($debug) {
382                 $command = ("tar -zcvf $username\-$year$mon$mday\-$tarSuffix\.tgz AudioBook/output_files --exclude \"\.svn\" "); print "cmd:$command\n" if $debug; system($command);
383         } else {
384                 $command = ("tar -zcf $username\-$year$mon$mday\-$tarSuffix\.tgz AudioBook/output_files --exclude \"\.svn\" "); print "cmd:$command\n" if $debug; system($command);
385         }
386         print "please submit your tar file to: www.voxforge.org\n";     
387 }
388
389 sub _random_characters {
390         my ($length) = @_;     
391         my @chars=('a'..'z');
392         my $randomString;
393         foreach (1..$length){
394                 $randomString.=$chars[rand @chars];
395         }
396         return $randomString;
397 }
398
399 =head2 getOptions
400
401 Get the user submitted options ('a:b:d:hil:m:p:r:s:t:u:x:q:vwST')
402
403 =cut
404
405 sub getOptions {
406         my ($self)= @_;
407         my $debug = $self->{'debug'};   
408         getopts('a:b:d:hil:m:p:r:s:t:u:x:q:vwST');    #  sets $opt_* as a side effect.
409         if ($opt_h) {
410                 print "\nVoxForge Audio Segmentation Script Parameters\n";     
411                 print   "=============================================\n";     
412                 print "-a\t* audio file name (WAV format only)\n";
413                 print "-b\tnotify if beam width for Forced Alignment exceeds a certain level (default = 250)\n";
414                 print "\t(does not set HVite's beam width parameter)\n";
415                 print "-d\tpronunciation dictionary  (default = AudioBook/input_files/VoxForgeDict)\n";
416                 print "-h\tshow help\n";
417                 print "-i\tinteractive validation of missing word pronunciations\n";   
418                 print "-l\tLICENSE file (default = AudioBook/input_files/LICENSE)\n";
419                 print "-m\tTarget maximum sentence length (default = $default_max_sentence_length words)\n";
420                 print "-p\tMinimum pause for sentence break (default = $default_min_pause_for_sentence_break in units of 100ns)\n";             
421                 print "-q\tlog words with single quotes (default = yes)\n";             
422                 print "-r\tREADME file (default = AudioBook/input_files/README)\n";                             
423                 print "-s\tAverage sentence length (default = $default_average_sentence_length words)\n";                               
424                 print "-t\t* text file name (containing transcriptions of speech in audio file)\n";
425                
426                 print "-u\tusername or name you want file stats collected by on VoxForge Metrics \n";
427                 print "\tpage:\t(http://www.voxforge.org/home/downloads/metrics)\n";   
428                
429                 print "-v\tvalidate segment audio files to prompt text using forced alignment\n";
430                 print "-w\tvalidate missing word pronunciations to audio recordings\n";         
431                 print "-x\tunique tar file suffix (max 3 characters - remainder is truncated)\n";
432                 print "-S\trun sanity test\n";         
433                 print "-T\tcreate gzipped/tar file\n";
434                 print "\n\t* minimum required for script to run\n";     
435                 print "\n";     
436                 print "--\n";                   
437                 print "Free Speech... Recognition\n";
438                 print "http://www.voxforge.org\n\n";
439                 exit;
440         } elsif ($opt_S) { # Sanity test switch
441                 $self->{'testing'} = 1;
442                 $self->{"audiofile"}="AudioBook/test/audio.wav";
443                 #$self->{"textFile"}="AudioBook/test/text-simple.txt";
444                 $self->{"textFile"}="AudioBook/test/text-original.txt";
445                 $self->{"pronDict"}="AudioBook/interim_files/VoxForgeDict";
446                 $self->{"tarSuffix"}=_random_characters(3);
447                 $self->{"username"}="test";
448                 $self->{"average_sentence_length"}= $default_average_sentence_length;
449                 $self->{"max_sentence_length"}= $default_max_sentence_length;
450                 $self->{"min_pause_for_sentence_break"}=$default_min_pause_for_sentence_break;
451                
452                 $self->{"log_single_quotes"}= 1;
453                 $self->{"verify_segments"}=1;   
454                 $self->{"verify_out_of_vocabulary_pronunciations"}=1;   
455                 $self->{"README"}="AudioBook/test/README";
456                 $self->{"LICENSE"}="AudioBook/test/LICENSE";
457         } elsif ($opt_a and $opt_t) {   
458                 if (-r $opt_a) {
459                         $self->{"audiofile"}=$opt_a;
460                 } else {
461                         die "can't open -a" . $self->{"audiofile"} . "\n";             
462                 }
463                 if (-r $opt_t) {
464                         $self->{"textFile"}=$opt_t;
465                 } else {
466                         die "can't open -t" . $self->{"textFile"} . "\n";               
467                 }
468                 if (defined($opt_d)) {
469                         if (-r $opt_d) {
470                                 $self->{"pronDict"}=$opt_d;
471                         } else {
472                                 die "can't open -d" . $self->{"pronDict"} . "\n";       
473                         }
474                 } else {
475                         $self->{"pronDict"}="AudioBook/input_files/VoxForgeDict";       
476                 }
477                 ### Audio Processing
478                 if ($opt_s) {
479                         $self->{"average_sentence_length"}=$opt_s;
480                 } else {
481                         $self->{"average_sentence_length"}= $default_average_sentence_length;   
482                 }
483                 if ($opt_m) {
484                         $self->{"max_sentence_length"}=$opt_m;
485                 } else {
486                         $self->{"max_sentence_length"}= $default_max_sentence_length;   
487                 }
488                 if ($opt_p) {
489                         $self->{"min_pause_for_sentence_break"}=$opt_p;
490                 } else {
491                         $self->{"min_pause_for_sentence_break"}= $default_min_pause_for_sentence_break;
492                 }       
493                 if ($opt_q) {
494                         if ($opt_q =~ /^(?:n|no)$/i){
495                                 $self->{"log_single_quotes"}= 0;
496                         } else {
497                                 $self->{"log_single_quotes"}= 1;       
498                         }
499                 } else {
500                         $self->{"log_single_quotes"}= 1;       
501                 }       
502                 if ($opt_b) {
503                         $self->{"beam_width"}=$opt_b;
504                 } else {
505                         $self->{"beam_width"}=250;     
506                 }
507                 if ($opt_v) {
508                         $self->{"verify_segments"}=1;
509                 } else {
510                         $self->{"verify_segments"}=0;   
511                 }       
512                 if ($opt_w) {
513                         $self->{"verify_out_of_vocabulary_pronunciations"}=1;
514                 } else {
515                         $self->{"verify_out_of_vocabulary_pronunciations"}=0;   
516                 }       
517                 ### Tar file processing
518                 if (defined($opt_T)) {
519                         if ($opt_x) {
520                                 $self->{"tarSuffix"}=substr($opt_x,0,3); # only use 1st 3 characters.                   
521                         }else {
522                                 $self->{"tarSuffix"}=_random_characters(3);
523                         }
524                         if ($opt_u) {
525                                 $self->{"username"}=$opt_u;     
526                         }else {
527                                 $self->{"username"}="anonymous";
528                         }       
529                         if ($opt_r) {
530                                 if (-r $opt_r) {
531                                         $self->{"README"}=$opt_r;       
532                                 } else {
533                                         die "can't open -r " . $opt_r . "\n";
534                                 }
535                         } else {
536                                 $self->{"README"}="AudioBook/input_files/README";
537                         }               
538                         if ($opt_l) {
539                                 if (-r $opt_l) {
540                                         $self->{"LICENSE"}=$opt_l;     
541                                 } else {
542                                         die "can't open -l " . $opt_l . "\n";
543                                 }
544                         } else {
545                                 $self->{"LICENSE"}="AudioBook/input_files/LICENSE";
546                         }
547                 }
548         } elsif ($opt_i) {
549                 if ($opt_i) {
550                         $self->{"interactive"}=1;
551                 }
552                 if (defined($opt_d)) {
553                         if (-r $opt_d) {
554                                 $self->{"pronDict"}=$opt_d;
555                         } else {
556                                 die "can't open -d " . $opt_d . "\n";
557                         }
558                 } else {
559                         $self->{"pronDict"}="AudioBook/input_files/VoxForgeDict";       
560                 }
561         } else {
562                 print "\nVoxForge Audio Segmentation Script\n";
563                 print   "==================================\n";
564                 print "Parameters -a and -t need to be defined. Use the -h parameter for more information\n\n";
565                 print "--\n";                   
566                 print "Free Speech... Recognition\n";
567                 print "http://www.voxforge.org\n\n";
568                 exit;
569         }
570         print "audiofile:" . $self->{"audiofile"}. "\n" if $debug;
571         print "textFile:" . $self->{"textFile"}. "\n" if $debug;
572         print "pronDict:" . $self->{"pronDict"} . "\n\n" if $debug;     
573 }
574
575 =head2 Getters
576
577 =item * getAverage_sentence_length()
578
579 =cut
580
581 sub getAverage_sentence_length {
582         my $self = shift;
583         return $self->{"average_sentence_length"};
584 }
585
586 =item * getInteractive()
587
588 =cut
589
590 sub getInteractive {
591         my $self = shift;
592         return $self->{"interactive"};
593 }
594
595 =item * getTesting()
596
597 =cut
598
599 sub getTesting {
600         my $self = shift;
601         return $self->{'testing'};
602 }
603
604 =item * getBeam_width()
605
606 =cut
607
608 sub getBeam_width {
609         my $self = shift;
610         return $self->{"beam_width"};
611 }
612
613 =item * getMax_sentence_length()
614
615 =cut
616
617 sub getMax_sentence_length {
618         my $self = shift;
619         return $self->{"max_sentence_length"};
620 }
621
622 =item * getMin_pause_for_sentence_break()
623
624 =cut
625
626 sub getMin_pause_for_sentence_break {
627         my $self = shift;
628         return $self->{"min_pause_for_sentence_break"};
629 }
630
631 =item * getLog_single_quotes()
632
633 =cut
634
635 sub getLog_single_quotes {
636         my $self = shift;
637         return $self->{"log_single_quotes"};
638 }
639
640
641 =item * getTextFile()
642
643 =cut
644
645 sub getTextFile {
646         my $self = shift;
647         return $self->{"textFile"};
648 }
649
650 =item * getAudiofile()
651
652 =cut
653
654 sub getAudiofile {
655         my $self = shift;
656         return $self->{"audiofile"};
657 }
658
659 =item * getUsername()
660
661 =cut
662
663 sub getUsername {
664         my $self = shift;
665         return $self->{"username"};
666 }   
667
668 =item * getLog()
669
670 =cut
671
672 sub getLog {
673         my $self = shift;
674         return $self->{"log"};
675 }
676
677
678 =item * getPronDict()
679
680 =cut
681
682 sub getPronDict {
683         my $self = shift;
684         return $self->{"pronDict"};
685 }       
686
687 =item * getHtk_files()
688
689 =cut
690
691 sub getHtk_files {
692         my $self = shift;
693         return $self->{'htk_files'};
694 }
695
696 =item * getG2p_model()
697
698 =cut
699
700 sub getG2p_model {
701         my $self = shift;
702         return $self->{'g2p_model'};
703 }
704
705 =item * getDebug()
706
707 =cut
708
709 sub getDebug {
710         my $self = shift;
711         return $self->{'debug'};
712 }
713
714 =item * getVerify_segments()
715
716 =cut
717
718 sub getVerify_segments {
719         my $self = shift;
720         return $self->{'verify_segments'};
721 }
722
723 =item * getVerify_out_of_vocabulary_pronunciations()
724
725 =cut
726
727 sub getVerify_out_of_vocabulary_pronunciations {
728         my $self = shift;
729         return $self->{"verify_out_of_vocabulary_pronunciations"};
730 }
731                
732
733 =head1 Change Log   
734
735   2008/06/12 - 0.2.2 - created CommandLine class to permit interactive validation of missing word pronunciations
736   2008/06/1 - 0.2.1 - refactor to create Chapter, Segments & MissingWords classes
737   2008/06/09 - 0.2.1 - refactor to create Chapter, Segments & MissingWords classes
738   2008/05/02 - 0.2 - convert to class; major refactor; renamed fullrun.pl to AudioBook.pm
739   2008/01/31 - 0.1 - created
740        
741 =head1 AUTHOR
742    
743   Ken MacLean
744   contact@voxforge.org
745      
746 =head1 COPYRIGHT AND LICENSE       
747      
748 Copyright (C) 2008 Ken MacLean
749    
750 This program is free software; you can redistribute it and/or
751 modify it under the terms of the GNU General Public License
752 as published by the Free Software Foundation; either version 2
753 of the License, or (at your option) any later version.
754    
755 This program is distributed in the hope that it will be useful,
756 but WITHOUT ANY WARRANTY; without even the implied warranty of
757 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
758 GNU General Public License for more details.
759
760 =cut
761
762 1;