voxforge.org
VoxForge Dev
Timestamp: 06/18/08 21:42:49 (6 months ago)
Author: kmaclean
Message: AudioSegmentation script - snapshot re: interactive Missingword update

Files:

  • Trunk/Scripts/Audio_scripts/AudioSegmentation/AudioBook/output_files/AudioBook_Log

    r2608 r2615
        3     3    Changes made to Text file
        4     4    -------------------------
        5        - changed URL MOJOMOVE411.COM to:MOJOMOVE411 DOT COM;
        6        - converted number:4: to four
        7        - converted number:1: to one
        8        - converted number:1: to one
        9        - changed URL VOXFORGE.ORG to:VOXFORGE DOT ORG;
              5  + converted number:8: to eight
              6  + changed URL LIBRIVOX.ORG to:LIBRIVOX DOT ORG;
              7  + converted number:200: to two hundred
              8  + converted number:4000: to four thousand
              9  + converted number:500: to five hundred
             10  + converted number:800: to eight hundred
             11  + converted date:1500: to fifteen hundred
             12  + converted number:3000: to three thousand
             13  + converted number:5000: to five thousand
             14  + converted number:220: to two hundred and twenty
             15  + converted number:245: to two hundred and forty five
             16  + converted date:1825: to eighteen twenty five
             17  + converted number:26: to twenty six
             18  + converted date:1826: to eighteen twenty six
             19  + converted number:3: to three
             20  + converted number:57: to fifty seven
             21  + converted number:9: to nine
             22  + converted number:134: to one hundred and thirty four
             23  + converted number:17: to seventeen
             24  + converted number:5: to five
             25  + converted number:245: to two hundred and forty five
             26  + converted number:5000: to five thousand
             27  + converted date:2000: to two thousand
             28  + converted number:200: to two hundred
             29  + converted number:4000: to four thousand
             30  + converted number:220: to two hundred and twenty
             31  + converted number:200: to two hundred
             32  + converted number:200: to two hundred
             33  + converted number:300: to three hundred
             34  + converted number:5500: to five thousand five hundred
             35  + converted number:5000: to five thousand
             36  + converted number:3000: to three thousand
             37  + converted number:4000: to four thousand
             38  + converted number:150: to one hundred and fifty
             39  + converted number:450: to four hundred and fifty
             40  + converted number:5000: to five thousand
             41  + converted number:9000: to nine thousand
             42  + converted number:200: to two hundred
             43  + converted number:200: to two hundred
             44  + converted number:250: to two hundred and fifty
             45  + converted number:7500: to seven thousand five hundred
             46  + converted number:8500: to eight thousand five hundred
             47  + converted number:240: to two hundred and forty
             48  + converted number:250: to two hundred and fifty
             49  + converted number:8: to eight
       10    50
       11    51    Words with period (".") removed from body of word - please review:

       14    54    Words with single quotes (no change made to word, unless otherwise stated) - please review:
       15    55    -------------------------------------------------------------------------------------------
       16        - changed:'T to:T
             56  + INDIAN'S
             57  + HOUR'S
             58  + NATURE'S
             59  + ONE'S
             60  + NATURE'S
             61  + ONE'S
             62  + ONE'S
       17    63
       18    64    Missing Words that need to be added to Pronunciation Dictionary, with suggested pronunciations:
       19    65    -----------------------------------------------------------------------------------------------
       20        - MOJOMOVE        [MOJOMOVE]      m ow jh ow m uw v
       21        - VOXFORGE        [VOXFORGE]      v aa k s f ao r jh
             66  + ABIES           [ABIES]         ey b iy z
             67  + ACCLIVITIES     [ACCLIVITIES]   ae k l ih v ax dx iy z
             68  + BEHELD          [BEHELD]        b ix hh eh l d
             69  + BRACTED         [BRACTED]       b r ae k t ax d
             70  + BRANCHLESS      [BRANCHLESS]    b r ae n ch l ax s
             71  + BURS            [BURS]          b er z
             72  + CASTILLEIA      [CASTILLEIA]    k aa s t iy l ey iy ax
             73  + CEANOTHUS       [CEANOTHUS]     s iy n aa th ax s
             74  + CHAMOEBATIA     [CHAMOEBATIA]   ch ax m iy b ax t iy ax
             75  + COLONNADES      [COLONNADES]    k aa l ax n ey d z
             76  + CONCOLOR        [CONCOLOR]      k aa n k ah l er
             77  + CONTIGUITY      [CONTIGUITY]    k ax n t ih g y uw ax dx iy
             78  + CONVENTIONALITIES [CONVENTIONALITIES] k ax n v eh n sh ax n ae l ix dx iy z
             79  + CYPRIPEDIUM     [CYPRIPEDIUM]   s ih p r iy p iy dx iy ax m
             80  + DECURRENS       [DECURRENS]     d ix k er ax n z
             81  + DEODAR          [DEODAR]        d ix ow dx er
             82  + DOUGLASII       [DOUGLASII]     d ah g l ax s iy iy
             83  + DREAMILY        [DREAMILY]      d r iy m ax l iy
             84  + DROWSING        [DROWSING]      d r aw z ix ng
             85  + ESSENCES        [ESSENCES]      eh s ax n s ix z
             86  + EXUDING         [EXUDING]       ix g z uw dx ix ng
             87  + FERNY           [FERNY]         f er n iy
             88  + FRONDED         [FRONDED]       f r aa n d ix d
             89  + FROW            [FROW]          f r aw
             90  + GILIAS          [GILIAS]        g ih l iy ax z
             91  + HEAVENWARD      [HEAVENWARD]    hh eh v ax n w er d
             92  + IMBRICATED      [IMBRICATED]    ix m b r ix k ey dx ix d
             93  + IMPRESSIVENESS [IMPRESSIVENESS] ix m p r eh s ix v n ax s
             94  + INTERBLENDED    [INTERBLENDED]  ih n t er b l eh n d ix d
             95  + LAMBERTIANA     [LAMBERTIANA]   l ae m b er t iy ae n ax
             96  + LARKSPURS       [LARKSPURS]     l aa r k s p er z
             97  + LEAFIEST        [LEAFIEST]      l iy f iy ix s t
             98  + LIBOCEDRUS      [LIBOCEDRUS]    l ay b ow s eh d r ax s
             99  + LITHELY         [LITHELY]       l ay th l iy
            100  + LUMBERMEN       [LUMBERMEN]     l ah m b er m ax n
            101  + LUPINES         [LUPINES]       l uw p ay n z
            102  + MAGNIFICA       [MAGNIFICA]     m ae g n ih f ix k ax
            103  + MANZANITA       [MANZANITA]     m ae n z ey n iy dx ax
            104  + MORAINES        [MORAINES]      m er ey n z
            105  + MOTTLE          [MOTTLE]        m aa dx ax l
            106  + MYRIADS         [MYRIADS]       m ih r iy ax d z
            107  + NOBLENESS       [NOBLENESS]     n ow b ax l n ax s
            108  + OBTUSA          [OBTUSA]        aa b t uw s ax
            109  + OUTSPREAD       [OUTSPREAD]     aw t s p r eh d
            110  + OUTSWEEPING     [OUTSWEEPING]   aw t s w iy p ix ng
            111  + PARVUM          [PARVUM]        p aa r v ax m
            112  + PERVIOUS        [PERVIOUS]      p er v iy ax s
            113  + PICTURESQUELY   [PICTURESQUELY] p ih k ch er ax s k l iy
            114  + PINNATE         [PINNATE]       p ih n ix t
            115  + PINNATED        [PINNATED]      p ih n ey dx ix d
            116  + PINUS           [PINUS]         p ay n ax s
            117  + PLATEAUS        [PLATEAUS]      p l ae t ow z
            118  + PLUMY           [PLUMY]         p l ah m iy
            119  + PLUSHY          [PLUSHY]        p l ah sh iy
            120  + PSEUDOTSUGA     [PSEUDOTSUGA]   s uw dx ax t s uw g ax
            121  + REPOSING        [REPOSING]      r iy p ow z ix ng
            122  + RESINY          [RESINY]        r ix z ay n iy
            123  + RETINOSPORA     [RETINOSPORA]   r eh t ax n aa s p er ax
            124  + RIFTED          [RIFTED]        r ih f t ax d
            125  + SABINIANA       [SABINIANA]     s ax b iy n iy aa n ax
            126  + SAITH           [SAITH]         s ey th
            127  + SARCODES        [SARCODES]      s aa r k ow d z
            128  + SAUNTERING      [SAUNTERING]    s ao n t er ix ng
            129  + SEDGES          [SEDGES]        s eh jh ix z
            130  + SELVAS          [SELVAS]        s eh l v aa z
            131  + SESSILE         [SESSILE]       s eh s ax l
            132  + SHEEPMEN        [SHEEPMEN]      sh iy p m eh n
            133  + SILICIOUS       [SILICIOUS]     s ix l ih sh ax s
            134  + SILVERING       [SILVERING]     s ih l v er ix ng
            135  + SPIRY           [SPIRY]         s p ay r iy
            136  + SPRUCES         [SPRUCES]       s p r uw s ix z
            137  + STAMINATE       [STAMINATE]     s t ae m ax n ey t
            138  + SUBLIMELY       [SUBLIMELY]     s ax b l ay m l iy
            139  + SUNGOLD         [SUNGOLD]       s ah n g ow l d
            140  + TALUSES         [TALUSES]       t ae l ax s ix z
            141  + TASSELS         [TASSELS]       t ae s ax l z
            142  + TOPMOST         [TOPMOST]       t ax p m ow s t
            143  + TUBERCULATA     [TUBERCULATA]   t ax b er k y ax l ax dx ax
            144  + UMPQUA          [UMPQUA]        ah m p k w ax
            145  + UNDECAYED       [UNDECAYED]     ax n d ix k ey d
            146  + UNDIMMED        [UNDIMMED]      ax n d ih m d
            147  + UNRESERVEDLY    [UNRESERVEDLY]  ax n r ix z er v ax d l iy
            148  + UNTHOUGHT       [UNTHOUGHT]     ax n th ao t
            149  + UNWHOLESOME     [UNWHOLESOME]   ax n hh ow l s ax m
            150  + UPTURNING       [UPTURNING]     ah p t er n ix ng
            151  + VERATRUMALBA    [VERATRUMALBA]  v er ae t r ax m ae l b ax
            152  + WHORLED         [WHORLED]       w er l d
            153  + YOSEMITES       [YOSEMITES]     y ow s eh m ix dx iy z
       22   154
       23   155    Audio Segmenting summary:

       25   157    Settings:average sentence length: 15
       26   158             target max sentence length: 20
       27        -          pause length: 2000000 (0.2 seconds)
       28        -
       29        - Sentence Length: min:aud0002: 11
       30        -                  max:aud0003: 16
            159  +          pause length: 1000000 (0.1 seconds)
            160  +
            161  + Sentence Length: min:mtn0496: 6
            162  +                  max:mtn0495: 30
       31   163
       32   164    Prompt lines with more than max_sentence_length of 20 words:
       33        -         none
            165  +         mtn0016:23
            166  +         mtn0039:23
            167  +         mtn0108:26
            168  +         mtn0113:22
            169  +         mtn0115:24
            170  +         mtn0150:24
            171  +         mtn0152:23
            172  +         mtn0187:22
            173  +         mtn0231:22
            174  +         mtn0244:24
            175  +         mtn0261:22
            176  +         mtn0296:22
            177  +         mtn0315:22
            178  +         mtn0346:22
            179  +         mtn0373:22
            180  +         mtn0465:22
            181  +         mtn0495:30
            182  +         mtn0512:22
            183  +         mtn0526:22
            184  +         mtn0536:23
       34   185    Checking for "No tokens survived to final node of network at beam" warnings:
       35   186    ----------------------------------------------------------------------------

       38   189    (confirm anything with an avg log likelihood of less than 60):
       39   190    ---------------------------------------------------------------
       40        - 57.7317 aud0009 THIS RECORDING IS IN THE PUBLIC DOMAIN
       41        - 60.5411 aud0004 OF YOUTH AND HOME AND THAT SWEET TIME WHEN LAST I HEARD THEIR SOOTHING CHIME
       42        - 60.7235 aud0007 AND SO T WILL BE WHEN I AM GONE THAT TUNEFUL PEAL WILL STILL RING ON
       43        - 61.0408 aud0006 WITHIN THE TOMB NOW DARKLY DWELLS AND HEARS NO MORE THOSE EVENING BELLS
       44        - 61.0721 aud0008 WHILE OTHER BARDS SHALL WALK THESE DELLS AND SING YOUR PRAISE SWEET EVENING BELLS
       45        - 61.1881 aud0005 THOSE JOYOUS HOURS ARE PASSED AWAY AND MANY A HEART THAT THEN WAS GAY
       46        - 61.5020 aud0003 THOSE EVENING BELLS THOSE EVENING BELLS THOSE EVENING BELLS HOW MANY A TALE THEIR MUSIC TELLS
       47        - 61.7058 aud0002 AS PART OF THE VOXFORGE DOT ORG SHORTS WEEKLY MONOLOGUE COLLECTION
       48        - 61.9362 aud0001 THOSE EVENING BELLS BY THOMAS MOORE READ FOR MOJOMOVE FOUR ONE ONE DOT COM BY ROBERT SCOTT
            191  + 60.0676 mtn0181 ABOUT AN HOUR'S WALK FROM THE CAMP I MET AN INDIAN
            192  + 60.7904 mtn0477 A DIAMETER OF A LITTLE MORE THAN FIVE FEET
            193  + 61.5720 mtn0147 THEN A YOUNG GROVE IMMEDIATELY SPRINGS UP GIVING BEAUTY FOR ASHES SUGAR PINE PINUS LAMBERTIANA
            194  + 61.7243 mtn0422 ESPECIALLY IN YOSEMITE GORGES MOISTENED BY THE SPRAY OF WATERFALLS INCENSE CEDAR LIBOCEDRUS DECURRENS
            195  + 61.8324 mtn0149 BUT ALSO IN KINGLY BEAUTY AND MAJESTY
            196  + 61.9729 mtn0083 THE NUT PINE PINUS SABINIANA THE NUT PINE THE FIRST CONIFER MET IN ASCENDING THE RANGE FROM THE WEST
            197  + 61.9864 mtn0198 AMONG SEVERAL THAT HAD BEEN BLOWN DOWN BY THE WIND
            198  + 62.0141 mtn0177 EIGHTEEN TWENTY SIX WEATHER DULL COLD AND CLOUDY
            199  + 62.0437 mtn0171 FAR TO THE SOUTHWARD OF THE COLUMBIA
            200  + 62.0466 mtn0393 TREE IS THE KING OF THE SPRUCES AS THE SUGAR PINE IS KING OF PINES
            201  + 62.2501 mtn0392 TO LEAVE ROOM FOR EVEN A HEAVENWARD CARE DOUGLAS SPRUCE PSEUDOTSUGA DOUGLASII THIS
            202  + 62.3192 mtn0350 THAT IT MAY WELL BE CALLED THE YOSEMITE PINE
            203  + 62.3502 mtn0524 TO MAKE A FRINGE FOR ITS FEET AND SHOW IT OFF TO BEST ADVANTAGE
            204  + 62.3740 mtn0308 AND FLOWERY PARK LIKE GROUND INTO A SCENE OF ENCHANTMENT ON
            205  + 62.3747 mtn0451 WHITE SILVER FIR ABIES CONCOLOR WE COME NOW TO THE MOST REGULARLY PLANTED OF ALL THE MAIN FOREST BELTS
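
Note on the "converted number:" and "converted date:" entries in the log: they appear to follow two spelling conventions. Plain numbers are read with "and" after the hundreds and without hyphens (220 becomes "two hundred and twenty"), while dates are read as two pairs of digits (1825 becomes "eighteen twenty five", 1500 becomes "fifteen hundred"). The sketch below reproduces those outputs in Python. It is not the AudioSegmentation script's code, which this changeset does not show. The function names are hypothetical, and the fallback for years such as 1805 is an assumption that the log neither confirms nor contradicts.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_hundred(n):
    """Spell 0..99 without hyphens, e.g. 57 -> 'fifty seven'."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def number_to_words(n):
    """Spell 0..9999 as a quantity, with 'and' before the last two digits."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is handled in this sketch")
    if n < 100:
        return below_hundred(n)
    thousands, rest = divmod(n, 1000)
    hundreds, tail = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if tail:
        parts.append(f"and {below_hundred(tail)}")
    return " ".join(parts)


def year_to_words(year):
    """Spell a four-digit year as two digit pairs where that reads naturally."""
    if not 1000 <= year <= 9999:
        raise ValueError("only four-digit years are handled in this sketch")
    century, tail = divmod(year, 100)
    # Assumption: round millennia (2000) and years ending in 01..09 fall
    # back to the quantity reading; the log only shows the 2000 case.
    if century % 10 == 0 or 0 < tail < 10:
        return number_to_words(year)
    if tail == 0:
        return f"{below_hundred(century)} hundred"
    return f"{below_hundred(century)} {below_hundred(tail)}"


if __name__ == "__main__":
    for n in (8, 245, 5500):
        print(f"converted number:{n}: to {number_to_words(n)}")
    for y in (1500, 1825, 2000):
        print(f"converted date:{y}: to {year_to_words(y)}")
```

How the script decides whether a token is a date or a plain number is not visible in this log, so the sketch leaves that choice to the caller.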