Version 10 (modified by kmaclean, 12 years ago)

--

Audio Processing Tools

Audio Segmenting Tools

We call this project Interslice; it is to be released under Festvox 
(Alan would have more comments).

The basic idea of Interslice is to automatically build synthetic voices 
from large speech databases, typically those available in the public 
domain from sources such as librivox.org and loudlit.org.

Interslice comes with a segmentation tool capable of handling arbitrarily 
large corpora and chunking them into utterances and *.lab files.

You may also want to refer to Kishore Prahallad, Arthur R. Toth and Alan 
W. Black, "Automatic Building of Synthetic Voices from Large 
Multi-Paragraph Speech Databases" 
<http://speech.iiit.net/%7Espeech/publications/interslice_v3.pdf>, in 
Proceedings of Interspeech, Antwerp, Belgium, 2007.

Cross Platform Audio API

Converting To and From Different Audio Formats

(from this site) Shell commands to convert audio from one format to another using SoX, LAME, FLAC, and madplay:

  • flac->wav
    • flac -sd $in -o $out
  • flac->mp3
    • flac -sdc $in | lame - $out
  • wav->flac
    • flac -s $in -o $out
  • wav->mp3
    • lame $in $out
  • mp3->flac
    • madplay -q -o wave:- $in | flac -s - -o $out
  • mp3->wav
    • sox $in $out
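
The commands above can be collected into one small shell function that picks the right command from the file extensions. This is only a sketch: the function name convert_audio and the variable names src and dst are made up for illustration and do not come from the original site. It assumes flac, lame, madplay and sox are installed and on the PATH, and that extensions are written in lower case.

```shell
# Hypothetical wrapper (the name convert_audio is illustrative only).
# Chooses one of the conversion commands listed above based on the
# extensions of the source and destination file names.
convert_audio() {
    src=$1
    dst=$2
    # ${var##*.} removes everything up to the last dot, leaving the extension.
    case "${src##*.}:${dst##*.}" in
        flac:wav)  flac -sd "$src" -o "$dst" ;;
        flac:mp3)  flac -sdc "$src" | lame - "$dst" ;;
        wav:flac)  flac -s "$src" -o "$dst" ;;
        wav:mp3)   lame "$src" "$dst" ;;
        mp3:flac)  madplay -q -o wave:- "$src" | flac -s - -o "$dst" ;;
        mp3:wav)   sox "$src" "$dst" ;;
        *)
            # Any other pair of extensions is reported and rejected.
            echo "convert_audio: unsupported conversion: $src -> $dst" >&2
            return 1
            ;;
    esac
}
```

For example, `convert_audio chapter01.flac chapter01.wav` would run the flac->wav command. The file names are quoted so that paths containing spaces still work. Note that the mp3->wav case only works if the installed SoX was built with MP3 support.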