Ticket #132 (closed task: wontfix)

Opened 14 years ago

Last modified 13 years ago

Reconciling Approaches to Collecting Audio for a Speech Corpus

Reported by: kmaclean Owned by: kmaclean
Priority: major Milestone: WebSite 0.2.1
Component: Audio Version: 0.1-alpha
Keywords: Cc:

Description

There seem to be two broad approaches to collecting audio for a speech corpus:

1. collect speech with as little external noise as possible

The presumption here is that any speech to be recognized will be noise filtered (i.e. echo, hardware noise, external sounds, ... removed) before it is sent to the speech recognition engine. Julius includes parameters to aid in this noise removal (using spectral subtraction: 'ssload' and 'ssalpha') ...

2. collect speech in its natural environment, warts and all

The approach here is to ensure that we get recordings covering as many different hardware configurations as possible (different microphone types; computers with audio cards and with on-board audio; noisy and quiet cooling fans/hard drives) and as many different recording environments as possible (rooms with and without echo). The presumption here is that there would be no noise pre-filtering on the speech to be recognized, because any such noise removal algorithm introduces noise of its own.
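To make the trade-off between the two approaches more concrete, here is a minimal sketch of magnitude spectral subtraction on a single frame, the general technique named under approach 1. It is not Julius's implementation: the function names and the `alpha` and `floor` parameters are hypothetical, with `alpha` acting as an over-subtraction factor (presumably similar in spirit to 'ssalpha') and the noise estimate standing in for whatever a stored noise spectrum would provide. A naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(frame):
    """Naive O(n^2) discrete Fourier transform; adequate for a short demo frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def estimate_noise_magnitude(noise_frames):
    """Average the magnitude spectrum over frames assumed to contain noise only."""
    spectra = [[abs(v) for v in dft(frame)] for frame in noise_frames]
    n = len(spectra[0])
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]


def spectral_subtract(frame, noise_mag, alpha=2.0, floor=0.5):
    """Subtract alpha * noise magnitude from each bin, keeping the noisy phase.

    A bin is never reduced below floor * its original magnitude. This clamping
    is one source of the artifacts that approach 2 is worried about.
    """
    cleaned = []
    for value, noise in zip(dft(frame), noise_mag):
        mag = abs(value)
        new_mag = max(mag - alpha * noise, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(value)))
    return idft(cleaned)
```

With a perfectly stationary hum and `alpha=1.0, floor=0.0` the hum is removed almost exactly; with real, fluctuating noise the estimate never matches the frame, and the residue left behind is the "noise of its own" that the second approach tries to avoid.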

Both approaches require that we get good monophone and triphone coverage, and as many different people and dialects (and accents) as possible.
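As a rough illustration of what checking coverage could look like, the sketch below counts monophones and triphones over phone-level transcriptions. The function names, the input format (one list of phone symbols per utterance) and the left-centre+right triphone notation are assumptions made for this example, not something the corpus tooling is known to provide.

```python
from collections import Counter


def coverage(transcriptions):
    """Count monophones and within-utterance triphones.

    transcriptions: iterable of lists of phone symbols, one list per utterance.
    Triphones are labelled "left-centre+right" (an assumed notation).
    """
    monophones = Counter()
    triphones = Counter()
    for phones in transcriptions:
        monophones.update(phones)
        # Slide a window of three phones across the utterance.
        for left, centre, right in zip(phones, phones[1:], phones[2:]):
            triphones[f"{left}-{centre}+{right}"] += 1
    return monophones, triphones


def missing_monophones(monophones, phone_set):
    """Return the phones from phone_set that never occur in the counted data."""
    return sorted(set(phone_set) - set(monophones))
```

Running `coverage` over the transcriptions of all submitted prompts, and then `missing_monophones` against the target phone set, would show which phones still need prompts, regardless of which recording approach is chosen.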

Need to follow up on David Gelbert's recommendation to ask the comp.speech.research newsgroup for advice...

Change History

comment:1 Changed 14 years ago by kmaclean

  • Milestone changed from Website 0.2 to WebSite 0.2.1

comment:2 Changed 13 years ago by kmaclean

  • Status changed from new to closed
  • Resolution set to wontfix