Ticket #337 (new enhancement)

Opened 12 years ago

Last modified 12 years ago

Automated audio segmentation using forced alignment

Reported by: kmaclean
Owned by: kmaclean
Priority: major
Milestone: WebSite 0.2.1
Component: Scripts
Version: Website 0.2
Keywords:
Cc:

Description (last modified by kmaclean)

Need a script to automatically segment audio from LibriVox.

Current challenges:

  • dealing with out-of-vocabulary words
  • automatically adjusting the pause interval for optimal audio segmentation

Both are currently done manually and should be automated.
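
As a starting point for the out-of-vocabulary challenge, a minimal Python sketch that lists transcript words missing from a pronunciation dictionary. The function name `find_oov_words`, the tokenizing regex, and the toy `lexicon` are assumptions made for illustration; they are not part of any existing VoxForge script.

```python
import re

def find_oov_words(transcript, lexicon):
    """Return words in `transcript` with no entry in `lexicon`,
    in order of first appearance, without duplicates."""
    seen = set()
    oov = []
    # Assumption: words are runs of letters and apostrophes, compared in lowercase.
    for word in re.findall(r"[a-z']+", transcript.lower()):
        if word not in lexicon and word not in seen:
            seen.add(word)
            oov.append(word)
    return oov

# Hypothetical toy lexicon: word -> list of phones.
lexicon = {
    "the": ["dh", "ax"],
    "ship": ["sh", "ih", "p"],
    "sat": ["s", "ae", "t"],
}

print(find_oov_words("The ship sat; the Zeppelin sat.", lexicon))
# prints ['zeppelin']
```

The words returned here would still need pronunciations before forced alignment can run; how those are generated is left open.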

See Automated Audio Segmentation Using Forced Alignment (Draft)
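
For the pause-interval challenge listed above, one possible approach is sketched below in Python. The ticket does not say what "optimal" means, so this sketch assumes the goal is the largest pause threshold that keeps every segment under a maximum length. The `(label, start, end)` tuple format, the function names, and the candidate thresholds are all hypothetical; they do not reflect the output format of any particular aligner.

```python
def split_at_pauses(alignment, min_pause):
    """Split (label, start, end) tuples into segments at every
    'pau' entry that lasts at least `min_pause` seconds."""
    segments, current = [], []
    for label, start, end in alignment:
        if label == "pau":
            if (end - start) >= min_pause and current:
                segments.append(current)
                current = []
        else:
            current.append((label, start, end))
    if current:
        segments.append(current)
    return segments

def longest_duration(segments):
    """Duration in seconds of the longest segment (first start to last end)."""
    return max(seg[-1][2] - seg[0][1] for seg in segments)

def choose_min_pause(alignment, max_segment_len, candidates=(0.5, 0.25, 0.125)):
    """Return the largest candidate threshold whose segments all fit
    within `max_segment_len` seconds, or None if no candidate does."""
    for min_pause in candidates:  # assumed to be sorted from largest to smallest
        segments = split_at_pauses(alignment, min_pause)
        if segments and longest_duration(segments) <= max_segment_len:
            return min_pause
    return None

# Hypothetical word-level alignment, times in seconds.
alignment = [
    ("hello", 0.0, 0.5),
    ("pau", 0.5, 0.75),
    ("world", 0.75, 1.25),
    ("pau", 1.25, 1.75),
    ("again", 1.75, 2.25),
]

print(choose_min_pause(alignment, max_segment_len=1.0))
# prints 0.25
```

Trying thresholds from largest to smallest prefers splitting only at long pauses, and falls back to shorter pauses only when a segment would otherwise be too long.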

Change History

comment:1 Changed 12 years ago by kmaclean

Festival's English phone list is the radio phoneset:

Phone  Examples
aa     fAther, wAshington
ae     fAt, bAd
ah     bUt, hUsh
ao     lAWn, dOOr, mAll
aw     hOW, sOUth, brOWser
ax     About, cAnoe
ay     hIde, bIble
eh     gEt, fEAther
el     tabLE, usabLE
em     systEM, communisM
en     beatEN
er     fERtile, sEARch, makER
ey     gAte, Ate
ih     bIt, shIp
iy     bEAt, shEEp
ow     lOne, nOse
oy     tOY, OYster
uh     fUll, wOOd
uw     fOOl, fOOd
b      Book, aBrupt
ch     CHart, larCH
d      Done, baD
dh     THat, faTHer
f      Fat, lauGH
g      Good, biGGer
hh     Hello, loopHole
jh     diGit, Jack
k      Camera, jaCK, Kill
l      Late, fuLL
m      Man, gaMe
n      maN, New
ng     baNG, sittiNG
p      Pat, camPer
r      Reason, caR
s      Sit, maSS
sh     SHip, claSH
t      Tap, baT
th     THeatre, baTH
v      Various, haVe
w      Water, cobWeb
y      Yellow, Yacht
z      Zero, quiZ, boyS
zh     viSion, caSual
pau    short silence
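
The phone list in this comment can be used to sanity-check dictionary entries before alignment. A minimal Python sketch follows; `RADIO_PHONESET` contains only the symbols listed in this comment, and the helper name `invalid_phones` is an assumption made for illustration, not a Festival function.

```python
# Phone symbols copied from the list in this comment (44 including 'pau').
RADIO_PHONESET = frozenset("""
aa ae ah ao aw ax ay eh el em en er ey ih iy ow oy uh uw
b ch d dh f g hh jh k l m n ng p r s sh t th v w y z zh
pau
""".split())

def invalid_phones(pronunciation):
    """Return the phones in `pronunciation` that are not in the list above."""
    return [p for p in pronunciation if p.lower() not in RADIO_PHONESET]

print(invalid_phones(["sh", "ih", "p"]))   # prints []
print(invalid_phones(["s", "ix", "t"]))    # prints ['ix']
```

A non-empty result points to a symbol from another phoneset or a typo in the dictionary entry.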

comment:2 Changed 12 years ago by kmaclean

  • Description modified

comment:3 Changed 12 years ago by kmaclean

  • Summary changed from Automated uadio segmentation using forced alignment to Automated audio segmentation using forced alignment

comment:4 Changed 12 years ago by kmaclean

  • Priority changed from critical to major
  • Milestone changed from SpeechSubmission 0.1.5 to WebSite 0.2.1