| 1 |
|
|---|
| 2 |
$VERSION = 0.2; |
|---|
| 3 |
|
|---|
| 4 |
=head1 NAME |
|---|
| 5 |
|
|---|
| 6 |
AudioBook - Segment a single transcribed audio file into audio segments of approximately 15 words |
|---|
| 7 |
|
|---|
| 8 |
=cut |
|---|
| 9 |
|
|---|
| 10 |
package AudioBook; |
|---|
| 11 |
use strict; |
|---|
| 12 |
use diagnostics; |
|---|
| 13 |
use Carp; |
|---|
| 14 |
use Getopt::Std; |
|---|
| 15 |
use File::Basename; |
|---|
| 16 |
use File::Copy; |
|---|
| 17 |
use lib '/home/kmaclean/VoxForge-dev/Main/Scripts/Audio_scripts/AudioSegmentation'; |
|---|
| 18 |
use AudioBook::Audio; |
|---|
| 19 |
use AudioBook::Text; |
|---|
| 20 |
use AudioBook::Dictionary; |
|---|
| 21 |
|
|---|
| 22 |
=head1 SYNOPSIS |
|---|
| 23 |
|
|---|
| 24 |
$./AudioBook -h display help |
|---|
| 25 |
$./AudioBook -a speechfile.wav -t text.txt minimal run configuration |
|---|
| 26 |
|
|---|
| 27 |
=head1 DESCRIPTION |
|---|
| 28 |
|
|---|
| 29 |
This is a command line program that segments a speech audio file into speech segments of 15 words on average. |
|---|
| 30 |
It uses the following configuration options to control the segmentation: |
|---|
| 31 |
|
|---|
| 32 |
-a * audio file name (WAV format only) |
|---|
| 33 |
-b notify if beam width for Forced Alignment exceeds a certain level (default = 250) |
|---|
| 34 |
(does not set HVite's beam width parameter) |
|---|
| 35 |
-d pronunciation dictionary (default = AudioBook/input_files/VoxForgeDict) |
|---|
| 36 |
-h show help |
|---|
| 37 |
-l LICENSE file (default = AudioBook/input_files/LICENSE) |
|---|
| 38 |
-m Target maximum sentence length (default = 20 words) |
|---|
| 39 |
-p Minimum pause for sentence break (default = 2000000 in units of 100ns) |
|---|
| 40 |
-q log words with single quotes (default = yes) |
|---|
| 41 |
-r README file (default = AudioBook/input_files/README) |
|---|
| 42 |
-s Average sentence length (default = 15 words) |
|---|
| 43 |
-t * text file name (containing transcriptions of speech in audio file) |
|---|
| 44 |
-u username or name you want file stats collected by on VoxForge Metrics |
|---|
| 45 |
page: (http://www.voxforge.org/home/downloads/metrics) |
|---|
| 46 |
-v validate segment audio files against prompt text using forced alignment |
|---|
| 47 |
-w validate missing word pronunciations to audio recordings |
|---|
| 48 |
-x unique tar file suffix (max 3 characters - remainder is truncated) |
|---|
| 49 |
-S run sanity test |
|---|
| 50 |
-T create gzipped/tar file |
|---|
| 51 |
|
|---|
| 52 |
* required for script to run |
|---|
| 53 |
|
|---|
| 54 |
|
|---|
| 55 |
=head1 NOTES |
|---|
| 56 |
|
|---|
| 57 |
=head3 Text Does not Match Audio |
|---|
| 58 |
|
|---|
| 59 |
If the contents of the text file do not *exactly* match the contents of the speech audio file, the segmentation process necessarily becomes |
|---|
| 60 |
a manual, iterative process. |
|---|
| 61 |
|
|---|
| 62 |
If there is a large divergence between the text and the speech audio, then you will have to manually listen to the speech audio to determine |
|---|
| 63 |
where the biggest transcription errors lie, and then modify the original text file to match the speech audio file. |
|---|
| 64 |
|
|---|
| 65 |
If the transcription errors are minor, then the first pass forced alignment usually completes successfully. However, if you see "No tokens survived to final node of network at beam" errors in the |
|---|
| 66 |
HVite log (located in interim_files/logs), then using the "-v" verify switch might be helpful in determining where transcription problems |
|---|
| 67 |
might exist. |
|---|
| 68 |
|
|---|
| 69 |
The verify switch performs a forced alignment on the individual segments generated from the first pass forced alignment. Low scores |
|---|
| 70 |
(i.e. the lowest average log likelihood per frame score) indicate that the transcription text might not match the corresponding audio |
|---|
| 71 |
file. Look at the segment text and listen to the corresponding audio file to determine if they match. If they do not match, then fix the |
|---|
| 72 |
text in your original text transcription file, and repeat this process (i.e. run the AudioBook program again with the verify switch on) |
|---|
| 73 |
until you can get a clean run. |
|---|
| 74 |
|
|---|
| 75 |
=head3 Segmenting large audio files |
|---|
| 76 |
|
|---|
| 77 |
For larger files (i.e. greater than 30 minutes of audio), you *may* need to manually segment the audio file into 30 minute segments. |
|---|
| 78 |
|
|---|
| 79 |
=head3 Automatically Adding Out-of-Vocabulary Words to Pronunciation Dictionary |
|---|
| 80 |
|
|---|
| 81 |
The pronunciations generated by the Sequitur G2P scripts need to be manually reviewed before any new pronunciations are added to the |
|---|
| 82 |
pronunciation dictionary. Make sure you review the pronunciations before committing these changes to SVN. |
|---|
| 83 |
|
|---|
| 84 |
=head1 REQUIREMENTS |
|---|
| 85 |
|
|---|
| 86 |
=item 1 - Sequitur G2P trainable Grapheme-to-Phoneme converter (GPL v2; requires Python to be installed) |
|---|
| 87 |
|
|---|
| 88 |
http://www-i6.informatik.rwth-aachen.de/web/Software/g2p.html |
|---|
| 89 |
|
|---|
| 90 |
=item 2 - HTK Hidden Markov Model Toolkit (note: the source is "open", but there are distribution restrictions) |
|---|
| 91 |
|
|---|
| 92 |
http://htk.eng.cam.ac.uk/ |
|---|
| 93 |
|
|---|
| 94 |
The HTK tools need to be in your PATH (see http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/download) |
|---|
| 95 |
|
|---|
| 96 |
=head1 ALGORITHM |
|---|
| 97 |
|
|---|
| 98 |
This program tries to segment the speech audio file into 15 word sentences. However, if the pause following the 15th word relative to the current |
|---|
| 99 |
sentence start position is too short, the algorithm looks at the previous word (i.e. word 14) to see if it has a pause of suitable duration. |
|---|
| 100 |
If not, it then looks at the word after the 15th word (i.e. word 16), and so on until a pause of suitable |
|---|
| 101 |
duration can be found, increasing the number of words to look behind and ahead each time. |
|---|
| 102 |
|
|---|
| 103 |
The default pause duration is 2000000 in units of 100ns (i.e. 0.2 seconds). This can be changed (using the "-p" switch) if the speech audio file does not segment well |
|---|
| 104 |
enough with this default. |
|---|
| 105 |
|
|---|
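The widening search described above can be sketched as follows. This is a minimal illustration only, not the code used by AudioBook::Audio: the function name find_sentence_break and its arguments are hypothetical, the pauses are assumed to be available as a plain array (one pause, in 100ns units, after each word), and the target maximum sentence length (-m) is ignored.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AudioBook): return the index of the word
# that should end the current sentence.
#   $pauses     - array ref; $pauses->[$i] is the pause after word $i (100ns units)
#   $start      - index of the first word of the current sentence
#   $target_len - desired sentence length in words (e.g. 15)
#   $min_pause  - shortest pause that qualifies as a break (e.g. 2000000)
sub find_sentence_break {
    my ($pauses, $start, $target_len, $min_pause) = @_;
    my $last   = $#{$pauses};
    my $target = $start + $target_len - 1;    # index of the "15th" word
    $target = $last if $target > $last;

    # Widen the search one word at a time: target, then one word behind,
    # one word ahead, two behind, two ahead, and so on.
    for my $offset (0 .. $last) {
        for my $candidate ($target - $offset, $target + $offset) {
            next if $candidate < $start || $candidate > $last;
            return $candidate if $pauses->[$candidate] >= $min_pause;
        }
    }
    return $last;    # no suitable pause found: fall back to the last word
}

# Example: 20 words, with the only long pause after word 14 (index 13).
my @pauses = (0) x 20;
$pauses[13] = 2500000;
my $end = find_sentence_break(\@pauses, 0, 15, 2000000);
print "sentence ends after word ", $end + 1, "\n";
```

Checking the word behind before the word ahead at each step mirrors the order given in the description (word 14 before word 16).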
| 106 |
=cut |
|---|
| 107 |
|
|---|
| 108 |
#################################################################### |
|---|
| 109 |
### Class Variables |
|---|
| 110 |
#################################################################### |
|---|
| 111 |
our($opt_a,$opt_b,$opt_d,$opt_h,$opt_l,$opt_m,$opt_p,$opt_r,$opt_s,$opt_t,$opt_x,$opt_q,$opt_S,$opt_T,$opt_u,$opt_v,$opt_w); # need to define these because using strict. |
|---|
| 112 |
my %self; |
|---|
| 113 |
$self{'debug'} = 0; |
|---|
| 114 |
$self{'g2p_model'} = "AudioBook/input_files/g2p/models/model-5"; |
|---|
| 115 |
$self{'htk_files'} = "AudioBook/input_files/htk"; |
|---|
| 116 |
$self{'log'} = "AudioBook/output_files/AudioBook_Log"; |
|---|
| 117 |
my $self=\%self; |
|---|
| 118 |
bless($self,"AudioBook"); |
|---|
| 119 |
|
|---|
| 120 |
my $default_average_sentence_length = 15; |
|---|
| 121 |
my $default_max_sentence_length = 20; |
|---|
| 122 |
my $default_min_pause_for_sentence_break = 2000000; |
|---|
| 123 |
my $command; |
|---|
| 124 |
|
|---|
| 125 |
#################################################################### |
|---|
| 126 |
### Main |
|---|
| 127 |
#################################################################### |
|---|
| 128 |
$self->cleanupFiles(); |
|---|
| 129 |
$self->getOptions(); |
|---|
| 130 |
$self->process(); |
|---|
| 131 |
print "completed!\n"; |
|---|
| 132 |
|
|---|
| 133 |
#################################################################### |
|---|
| 134 |
### Methods |
|---|
| 135 |
#################################################################### |
|---|
| 136 |
|
|---|
| 137 |
=head1 METHODS (not user accessible) |
|---|
| 138 |
|
|---|
| 139 |
=head2 process |
|---|
| 140 |
|
|---|
| 141 |
Segment the user designated speech audio file (-a) using the supplied text file (-t) |
|---|
| 142 |
|
|---|
| 143 |
=cut |
|---|
| 144 |
|
|---|
| 145 |
sub process { |
|---|
| 146 |
my ($self)= @_; |
|---|
| 147 |
my $debug = $self->{'debug'}; |
|---|
| 148 |
my $audiofile = $self->{"audiofile"}; |
|---|
| 149 |
my $textfile = $self->{"textfile"}; |
|---|
| 150 |
my $username = $self->{"username"}; |
|---|
| 151 |
my $tarSuffix = $self->{"tarSuffix"}; |
|---|
| 152 |
my $pronDict = $self->{"pronDict"}; |
|---|
| 153 |
my $htk_files = $self->{'htk_files'}; |
|---|
| 154 |
my $log = $self->{'log'}; |
|---|
| 155 |
my $dict = "AudioBook/interim_files/dict"; |
|---|
| 156 |
my $originalDict = "AudioBook/interim_files/originalDict"; |
|---|
| 157 |
my $altDict = "AudioBook/interim_files/altDict"; |
|---|
| 158 |
my $prompts = "AudioBook/interim_files/prompts"; |
|---|
| 159 |
|
|---|
| 160 |
my $tempPronDict = "AudioBook/interim_files/pronDict"; |
|---|
| 161 |
copy($pronDict,$tempPronDict); |
|---|
| 162 |
|
|---|
| 163 |
my $textContents = AudioBook::Text->new($self,$textfile); |
|---|
| 164 |
$textContents->createWLISTFile("AudioBook/interim_files/wlist"); |
|---|
| 165 |
|
|---|
| 166 |
my $dictionary = AudioBook::Dictionary->new($self); |
|---|
| 167 |
my $missingwordfound = $dictionary->findOutOfVocabularyWords($pronDict,"AudioBook/interim_files/MissingWords"); |
|---|
| 168 |
if ($missingwordfound) { |
|---|
| 169 |
$dictionary->getRecommendedPronunciations("AudioBook/interim_files/MissingWords_out"); |
|---|
| 170 |
$dictionary->updatePronDict($tempPronDict); |
|---|
| 171 |
copy($dict,$originalDict); |
|---|
| 172 |
|
|---|
| 173 |
|
|---|
| 174 |
$command = ("HDMan -A -D -T 1 -g $htk_files/global.ded -m -w AudioBook/interim_files/wlist -i -l AudioBook/interim_files/dlog $dict $tempPronDict"); system($command) == 0 or confess "fullrun $command failed: $?"; |
|---|
| 175 |
$command = ("mv AudioBook/interim_files/dlog AudioBook/interim_files/logs/dlog2"); print "cmd:$command\n" if $debug; system($command); |
|---|
| 176 |
|
|---|
| 177 |
} else { |
|---|
| 178 |
open(LOG,">>$log") or confess ("cannot open log file $log: $!"); |
|---|
| 179 |
print LOG "\nMissing Words that need to be added to Pronunciation Dictionary, with suggested pronunciations::\n"; |
|---|
| 180 |
print LOG "------------------------------------------------\n"; |
|---|
| 181 |
print LOG "no missing words\n"; |
|---|
| 182 |
close LOG; |
|---|
| 183 |
} |
|---|
| 184 |
|
|---|
| 185 |
|
|---|
| 186 |
|
|---|
| 187 |
my $audio = AudioBook::Audio->new($self); |
|---|
| 188 |
$audio->segment($audiofile,$textContents); |
|---|
| 189 |
if ($self->{"verify_segments"}) { |
|---|
| 190 |
$audio->verifySegments; |
|---|
| 191 |
} |
|---|
| 192 |
if ($missingwordfound) { |
|---|
| 193 |
if ($self->{"verify_out_of_vocabulary_pronunciations"}) { |
|---|
| 194 |
$dictionary->getAlternatePronunciations("AudioBook/interim_files/MissingWords_alt",15); |
|---|
| 195 |
$dictionary->createAltDict($originalDict,$altDict); |
|---|
| 196 |
$dictionary->validateAlternatePronunciations($originalDict,$altDict,$prompts); |
|---|
| 197 |
} |
|---|
| 198 |
$dictionary->updatePronDict($pronDict); |
|---|
| 199 |
} |
|---|
| 200 |
|
|---|
| 201 |
if (defined($tarSuffix)){ |
|---|
| 202 |
_createTarFile($self); |
|---|
| 203 |
} |
|---|
| 204 |
} |
|---|
| 205 |
|
|---|
| 206 |
=head2 cleanupFiles |
|---|
| 207 |
|
|---|
| 208 |
Removes any old files in the AudioBook/interim_files/ and AudioBook/output_files/ directories, prior to processing. |
|---|
| 209 |
|
|---|
| 210 |
=cut |
|---|
| 211 |
|
|---|
| 212 |
sub cleanupFiles { |
|---|
| 213 |
my ($self)= @_; |
|---|
| 214 |
if (defined(<AudioBook/interim_files/*>)) { |
|---|
| 215 |
unlink (<AudioBook/interim_files/*>); |
|---|
| 216 |
} |
|---|
| 217 |
if (defined(<AudioBook/interim_files/logs/*>)) { |
|---|
| 218 |
unlink (<AudioBook/interim_files/logs/*>); |
|---|
| 219 |
} |
|---|
| 220 |
if (defined(<AudioBook/interim_files/missingWordsFolder/*>)) { |
|---|
| 221 |
unlink (<AudioBook/interim_files/missingWordsFolder/*>); |
|---|
| 222 |
} |
|---|
| 223 |
if (defined(<AudioBook/interim_files/wav/*>)) { |
|---|
| 224 |
unlink (<AudioBook/interim_files/wav/*>); |
|---|
| 225 |
} |
|---|
| 226 |
if (defined(<AudioBook/output_files/wav/*>)) { |
|---|
| 227 |
unlink (<AudioBook/output_files/wav/*>); |
|---|
| 228 |
} |
|---|
| 229 |
if (defined(<AudioBook/output_files/*>)) { |
|---|
| 230 |
unlink (<AudioBook/output_files/*>); |
|---|
| 231 |
} |
|---|
| 232 |
} |
|---|
| 233 |
|
|---|
| 234 |
=head2 _createTarFile |
|---|
| 235 |
|
|---|
| 236 |
Creates a gzipped tar file from the files contained in AudioBook/output_files |
|---|
| 237 |
|
|---|
| 238 |
=item * -T Switch to turn on this functionality |
|---|
| 239 |
|
|---|
| 240 |
=item * -u username used in name of the tar file |
|---|
| 241 |
|
|---|
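The archive name has the form username-YYYYMMDD-suffix.tgz. As a minimal sketch, the date stamp could be built with POSIX::strftime, which avoids handling the 0-based month field of localtime by hand. The helper name tar_file_name is hypothetical and is not part of AudioBook.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not part of AudioBook):
# build "<username>-<YYYYMMDD>-<suffix>.tgz".
# $epoch is optional and defaults to the current time.
sub tar_file_name {
    my ($username, $suffix, $epoch) = @_;
    $epoch = time unless defined $epoch;
    my $date = strftime("%Y%m%d", localtime($epoch));
    # The suffix is limited to 3 characters, as with the -x switch.
    return "$username-$date-" . substr($suffix, 0, 3) . ".tgz";
}

print tar_file_name("anonymous", "abc"), "\n";
```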
| 242 |
=cut |
|---|
| 243 |
|
|---|
| 244 |
sub _createTarFile { |
|---|
| 245 |
my ($self)= @_; |
|---|
| 246 |
my $debug = $self->{'debug'}; |
|---|
| 247 |
my $username = $self->{"username"}; |
|---|
| 248 |
my $tarSuffix = $self->{"tarSuffix"}; |
|---|
| 249 |
my $readme = $self->{"README"}; |
|---|
| 250 |
my $license = $self->{"LICENSE"}; |
|---|
| 251 |
|
|---|
| 252 |
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time); |
|---|
| 253 |
$year += 1900; |
|---|
| 254 |
$mon = sprintf("%02d", $mon + 1); # localtime returns months as 0..11 |
|---|
| 255 |
$mday = sprintf("%02d", $mday); |
|---|
| 256 |
print "creating gzipped tar file:$username\-$year$mon$mday\-$tarSuffix\.tgz \n"; |
|---|
| 257 |
if (defined($readme)) { |
|---|
| 258 |
copy("$readme","AudioBook/output_files/README"); |
|---|
| 259 |
} else { |
|---|
| 260 |
print "Warning: no README file to copy\n"; |
|---|
| 261 |
} |
|---|
| 262 |
if (defined($license)) { |
|---|
| 263 |
copy("$license","AudioBook/output_files/LICENSE"); |
|---|
| 264 |
} else { |
|---|
| 265 |
print "Warning: no LICENSE file to copy\n"; |
|---|
| 266 |
} |
|---|
| 267 |
copy("AudioBook/interim_files/prompts","AudioBook/output_files/prompts"); |
|---|
| 268 |
$command = ("cp AudioBook/interim_files/wav/* AudioBook/output_files/wav/"); print "cmd:$command\n" if $debug; system($command); |
|---|
| 269 |
if ($debug) { |
|---|
| 270 |
$command = ("tar -zcvf $username\-$year$mon$mday\-$tarSuffix\.tgz AudioBook/output_files --exclude \"\.svn\" "); print "cmd:$command\n" if $debug; system($command); |
|---|
| 271 |
} else { |
|---|
| 272 |
$command = ("tar -zcf $username\-$year$mon$mday\-$tarSuffix\.tgz AudioBook/output_files --exclude \"\.svn\" "); print "cmd:$command\n" if $debug; system($command); |
|---|
| 273 |
} |
|---|
| 274 |
print "please submit your tar file to: www.voxforge.org\n"; |
|---|
| 275 |
} |
|---|
| 276 |
|
|---|
| 277 |
sub _random_characters { |
|---|
| 278 |
my ($length) = @_; |
|---|
| 279 |
my @chars=('a'..'z'); |
|---|
| 280 |
my $randomString; |
|---|
| 281 |
foreach (1..$length){ |
|---|
| 282 |
$randomString.=$chars[rand @chars]; |
|---|
| 283 |
} |
|---|
| 284 |
return $randomString; |
|---|
| 285 |
} |
|---|
| 286 |
|
|---|
| 287 |
=head2 getOptions |
|---|
| 288 |
|
|---|
| 289 |
Get the user submitted options ('a:b:d:hl:m:p:r:s:t:u:x:q:vwST') |
|---|
| 290 |
|
|---|
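In the option string, a letter followed by a colon takes a value and a bare letter is a boolean switch. The sketch below shows the same Getopt::Std call collecting a reduced set of options into a hash rather than into $opt_* package variables. The parse_args wrapper and the shortened option string are assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical wrapper (not part of AudioBook): parse a reduced option string
# into a hash. 'a:' means -a takes a value; 'v' means -v is a switch.
sub parse_args {
    my @args = @_;
    local @ARGV = @args;    # getopts reads from @ARGV
    my %opts;
    getopts('a:t:p:hv', \%opts) or die "invalid options\n";
    return \%opts;
}

my $opts = parse_args('-a', 'speechfile.wav', '-t', 'text.txt', '-v');
print "audio: $opts->{a}, text: $opts->{t}, verify: ",
    ($opts->{v} ? 1 : 0), "\n";
```

Options that were not supplied are simply absent from the hash, so defaults can be applied with a defined check.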
| 291 |
=cut |
|---|
| 292 |
|
|---|
| 293 |
sub getOptions { |
|---|
| 294 |
my ($self)= @_; |
|---|
| 295 |
my $debug = $self->{'debug'}; |
|---|
| 296 |
getopts('a:b:d:hl:m:p:r:s:t:u:x:q:vwST'); |
|---|
| 297 |
if ($opt_S) { |
|---|
| 298 |
$self->{"audiofile"}="AudioBook/test/audio.wav"; |
|---|
| 299 |
|
|---|
| 300 |
$self->{"textfile"}="AudioBook/test/text-original.txt"; |
|---|
| 301 |
$command = ("cp AudioBook/input_files/VoxForgeDict AudioBook/interim_files/VoxForgeDict"); print "cmd:$command\n"; system($command); |
|---|
| 302 |
$self->{"pronDict"}="AudioBook/interim_files/VoxForgeDict"; |
|---|
| 303 |
$self->{"tarSuffix"}=_random_characters(3); |
|---|
| 304 |
$self->{"username"}="test"; |
|---|
| 305 |
$self->{"average_sentence_length"}= $default_average_sentence_length; |
|---|
| 306 |
$self->{"max_sentence_length"}= $default_max_sentence_length; |
|---|
| 307 |
$self->{"min_pause_for_sentence_break"}=$default_min_pause_for_sentence_break; |
|---|
| 308 |
|
|---|
| 309 |
$self->{"log_single_quotes"}= 1; |
|---|
| 310 |
$self->{"verify_segments"}=1; |
|---|
| 311 |
$self->{"verify_out_of_vocabulary_pronunciations"}=1; |
|---|
| 312 |
$self->{"README"}="AudioBook/input_files/README"; |
|---|
| 313 |
$self->{"LICENSE"}="AudioBook/input_files/LICENSE"; |
|---|
| 314 |
} elsif ($opt_a and $opt_t) { |
|---|
| 315 |
if (-r $opt_a) { |
|---|
| 316 |
$self->{"audiofile"}=$opt_a; |
|---|
| 317 |
} else { |
|---|
| 318 |
die "can't open -a $opt_a\n"; |
|---|
| 319 |
} |
|---|
| 320 |
if (-r $opt_t) { |
|---|
| 321 |
$self->{"textfile"}=$opt_t; |
|---|
| 322 |
} else { |
|---|
| 323 |
die "can't open -t $opt_t\n"; |
|---|
| 324 |
} |
|---|
| 325 |
if (defined($opt_d)) { |
|---|
| 326 |
if (-r $opt_d) { |
|---|
| 327 |
$self->{"pronDict"}=$opt_d; |
|---|
| 328 |
} else { |
|---|
| 329 |
die "can't open -d $opt_d\n"; |
|---|
| 330 |
} |
|---|
| 331 |
} else { |
|---|
| 332 |
$self->{"pronDict"}="AudioBook/input_files/VoxForgeDict"; |
|---|
| 333 |
} |
|---|
| 334 |
|
|---|
| 335 |
if ($opt_s) { |
|---|
| 336 |
$self->{"average_sentence_length"}=$opt_s; |
|---|
| 337 |
} else { |
|---|
| 338 |
$self->{"average_sentence_length"}= $default_average_sentence_length; |
|---|
| 339 |
} |
|---|
| 340 |
if ($opt_m) { |
|---|
| 341 |
$self->{"max_sentence_length"}=$opt_m; |
|---|
| 342 |
} else { |
|---|
| 343 |
$self->{"max_sentence_length"}= $default_max_sentence_length; |
|---|
| 344 |
} |
|---|
| 345 |
if ($opt_p) { |
|---|
| 346 |
$self->{"min_pause_for_sentence_break"}=$opt_p; |
|---|
| 347 |
} else { |
|---|
| 348 |
$self->{"min_pause_for_sentence_break"}= $default_min_pause_for_sentence_break; |
|---|
| 349 |
} |
|---|
| 350 |
if ($opt_q) { |
|---|
| 351 |
if ($opt_q =~ /^(?:n|no)$/i){ |
|---|
| 352 |
$self->{"log_single_quotes"}= 0; |
|---|
| 353 |
} else { |
|---|
| 354 |
$self->{"log_single_quotes"}= 1; |
|---|
| 355 |
} |
|---|
| 356 |
} else { |
|---|
| 357 |
$self->{"log_single_quotes"}= 1; |
|---|
| 358 |
} |
|---|
| 359 |
if ($opt_b) { |
|---|
| 360 |
$self->{"beam_width"}=$opt_b; |
|---|
| 361 |
} else { |
|---|
| 362 |
$self->{"beam_width"}=250; |
|---|
| 363 |
} |
|---|
| 364 |
if ($opt_v) { |
|---|
| 365 |
$self->{"verify_segments"}=1; |
|---|
| 366 |
} else { |
|---|
| 367 |
$self->{"verify_segments"}=0; |
|---|
| 368 |
} |
|---|
| 369 |
if ($opt_w) { |
|---|
| 370 |
$self->{"verify_out_of_vocabulary_pronunciations"}=1; |
|---|
| 371 |
} else { |
|---|
| 372 |
$self->{"verify_out_of_vocabulary_pronunciations"}=0; |
|---|
| 373 |
} |
|---|
| 374 |
|
|---|
| 375 |
if (defined($opt_T)) { |
|---|
| 376 |
if ($opt_x) { |
|---|
| 377 |
$self->{"tarSuffix"}=substr($opt_x,0,3); |
|---|
| 378 |
}else { |
|---|
| 379 |
$self->{"tarSuffix"}=_random_characters(3); |
|---|
| 380 |
} |
|---|
| 381 |
if ($opt_u) { |
|---|
| 382 |
$self->{"username"}=$opt_u; |
|---|
| 383 |
}else { |
|---|
| 384 |
$self->{"username"}="anonymous"; |
|---|
| 385 |
} |
|---|
| 386 |
if ($opt_r) { |
|---|
| 387 |
if (-r $opt_r) { |
|---|
| 388 |
$self->{"README"}=$opt_r; |
|---|
| 389 |
} else { |
|---|
| 390 |
die "can't open -r $opt_r\n"; |
|---|
| 391 |
} |
|---|
| 392 |
} else { |
|---|
| 393 |
$self->{"README"}="AudioBook/input_files/README"; |
|---|
| 394 |
} |
|---|
| 395 |
if ($opt_l) { |
|---|
| 396 |
if (-r $opt_l) { |
|---|
| 397 |
$self->{"LICENSE"}=$opt_l; |
|---|
| 398 |
} else { |
|---|
| 399 |
die "can't open -l $opt_l\n"; |
|---|
| 400 |
} |
|---|
| 401 |
} else { |
|---|
| 402 |
$self->{"LICENSE"}="AudioBook/input_files/LICENSE"; |
|---|
| 403 |
} |
|---|
| 404 |
} |
|---|
| 405 |
} elsif ($opt_h) { |
|---|
| 406 |
print "\nVoxForge Audio Segmentation Script Parameters\n"; |
|---|
| 407 |
print "=============================================\n"; |
|---|
| 408 |
print "-a\t* audio file name (WAV format only)\n"; |
|---|
| 409 |
print "-b\tnotify if beam width for Forced Alignment exceeds a certain level (default = 250)\n"; |
|---|
| 410 |
print "\t(does not set HVite's beam width parameter)\n"; |
|---|
| 411 |
print "-d\tpronunciation dictionary (default = AudioBook/input_files/VoxForgeDict)\n"; |
|---|
| 412 |
print "-h\tshow help\n"; |
|---|
| 413 |
print "-l\tLICENSE file (default = AudioBook/input_files/LICENSE)\n"; |
|---|
| 414 |
print "-m\tTarget maximum sentence length (default = $default_max_sentence_length words)\n"; |
|---|
| 415 |
print "-p\tMinimum pause for sentence break (default = $default_min_pause_for_sentence_break in units of 100ns)\n"; |
|---|
| 416 |
print "-q\tlog words with single quotes (default = yes)\n"; |
|---|
| 417 |
print "-r\tREADME file (default = AudioBook/input_files/README)\n"; |
|---|
| 418 |
print "-s\tAverage sentence length (default = $default_average_sentence_length words)\n"; |
|---|
| 419 |
print "-t\t* text file name (containing transcriptions of speech in audio file)\n"; |
|---|
| 420 |
|
|---|
| 421 |
print "-u\tusername or name you want file stats collected by on VoxForge Metrics \n"; |
|---|
| 422 |
print "\tpage:\t(http://www.voxforge.org/home/downloads/metrics)\n"; |
|---|
| 423 |
|
|---|
| 424 |
print "-v\tvalidate segment audio files against prompt text using forced alignment\n"; |
|---|
| 425 |
print "-w\tvalidate missing word pronunciations to audio recordings\n"; |
|---|
| 426 |
print "-x\tunique tar file suffix (max 3 characters - remainder is truncated)\n"; |
|---|
| 427 |
print "-S\trun sanity test\n"; |
|---|
| 428 |
print "-T\tcreate gzipped/tar file\n"; |
|---|
| 429 |
print "\n\t* required for script to run\n"; |
|---|
| 430 |
print "\n"; |
|---|
| 431 |
print "--\n"; |
|---|
| 432 |
print "Free Speech... Recognition\n"; |
|---|
| 433 |
print "http://www.voxforge.org\n\n"; |
|---|
| 434 |
exit; |
|---|
| 435 |
} else { |
|---|
| 436 |
print "\nVoxForge Audio Segmentation Script\n"; |
|---|
| 437 |
print "==================================\n"; |
|---|
| 438 |
print "Parameters -a and -t are required. Use the -h parameter for more information\n\n"; |
|---|
| 439 |
print "--\n"; |
|---|
| 440 |
print "Free Speech... Recognition\n"; |
|---|
| 441 |
print "http://www.voxforge.org\n\n"; |
|---|
| 442 |
exit; |
|---|
| 443 |
} |
|---|
| 444 |
print "audiofile:" . $self->{"audiofile"}. "\n"; |
|---|
| 445 |
print "textfile:" . $self->{"textfile"}. "\n"; |
|---|
| 446 |
print "pronDict:" . $self->{"pronDict"} . "\n\n"; |
|---|
| 447 |
} |
|---|
| 448 |
|
|---|
| 449 |
=head2 Getters - Public (used by methods in other sub-classes) |
|---|
| 450 |
|
|---|
| 451 |
=item * getAverage_sentence_length() |
|---|
| 452 |
|
|---|
| 453 |
=cut |
|---|
| 454 |
|
|---|
| 455 |
sub getAverage_sentence_length { |
|---|
| 456 |
my $self = shift; |
|---|
| 457 |
return $self->{"average_sentence_length"}; |
|---|
| 458 |
} |
|---|
| 459 |
|
|---|
| 460 |
=item * getMax_sentence_length() |
|---|
| 461 |
|
|---|
| 462 |
=cut |
|---|
| 463 |
|
|---|
| 464 |
sub getMax_sentence_length { |
|---|
| 465 |
my $self = shift; |
|---|
| 466 |
return $self->{"max_sentence_length"}; |
|---|
| 467 |
} |
|---|
| 468 |
|
|---|
| 469 |
=item * getMin_pause_for_sentence_break() |
|---|
| 470 |
|
|---|
| 471 |
=cut |
|---|
| 472 |
|
|---|
| 473 |
sub getMin_pause_for_sentence_break { |
|---|
| 474 |
my $self = shift; |
|---|
| 475 |
return $self->{"min_pause_for_sentence_break"}; |
|---|
| 476 |
} |
|---|
| 477 |
|
|---|
| 478 |
=head1 Change Log |
|---|
| 479 |
|
|---|
| 480 |
2008/05/02 - 0.2 - convert to class; major refactoring; renamed fullrun.pl to AudioBook.pm |
|---|
| 481 |
2008/01/31 - 0.1 - created |
|---|
| 482 |
|
|---|
| 483 |
=cut |
|---|
| 484 |
|
|---|
| 485 |
=head1 AUTHOR |
|---|
| 486 |
|
|---|
| 487 |
Ken MacLean |
|---|
| 488 |
contact@voxforge.org |
|---|
| 489 |
|
|---|
| 490 |
=head1 COPYRIGHT AND LICENSE |
|---|
| 491 |
|
|---|
| 492 |
Copyright (C) 2008 Ken MacLean |
|---|
| 493 |
|
|---|
| 494 |
This program is free software; you can redistribute it and/or |
|---|
| 495 |
modify it under the terms of the GNU General Public License |
|---|
| 496 |
as published by the Free Software Foundation; either version 2 |
|---|
| 497 |
of the License, or (at your option) any later version. |
|---|
| 498 |
|
|---|
| 499 |
This program is distributed in the hope that it will be useful, |
|---|
| 500 |
but WITHOUT ANY WARRANTY; without even the implied warranty of |
|---|
| 501 |
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
|---|
| 502 |
GNU General Public License for more details. |
|---|