Lazily extracting utterances
Manually cutting up and aligning audio files is a fiddly nuisance. The method described here tries to minimise effort at the expense of missing some perfectly good utterances. It works for the likes of Hansard.
- Open the audio file in Audacity
- Select Analyse -> Silence Finder or Analyse -> Sound Finder
- Reduce the silence length a little bit (around 0.3 seconds is a good start).
- The silence finder will put labels at each silence that size or bigger.
- Select File -> Export Multiple, choose WAV and prefix-and-number naming.
Actually, before Export Multiple, you might want to visit Edit -> Preferences -> Import/Export and turn off the metadata editing pop-up.
Festival recommends utterances between 5 and 30 seconds long. WAV file sizes are proportional to their length, but depend on the number of channels and sample rate.
Here are approximate file sizes in decimal k (1 k = 1000 bytes), assuming 16-bit samples.
rate (kHz)  channels     1 s     5 s    30 s
--------------------------------------------
16          mono          32     160     960
44.1        mono          88     441    2646
44.1        stereo       176     882    5292
48          mono          96     480    2880
48          stereo       192     960    5760
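The figures in the table can be reproduced with a little shell arithmetic. This is only a sketch: `wav_kb` is a made-up helper name, it assumes 16-bit samples, and it ignores the small WAV header.

```shell
# wav_kb RATE_HZ CHANNELS SECONDS
# Prints the approximate size of uncompressed PCM audio in decimal kilobytes.
# Assumption: 16-bit (2-byte) samples; the WAV header is not counted.
wav_kb() {
    rate_hz=$1            # sample rate in Hz, e.g. 44100
    channels=$2           # 1 for mono, 2 for stereo
    seconds=$3            # duration in whole seconds
    bytes_per_sample=2    # assumed 16-bit samples
    echo $(( rate_hz * channels * bytes_per_sample * seconds / 1000 ))
}

wav_kb 44100 2 5     # 44.1 kHz stereo, 5 seconds
wav_kb 16000 1 30    # 16 kHz mono, 30 seconds
```

If your files use 24-bit samples, change `bytes_per_sample` to 3 and the sizes scale accordingly.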
Find out your files' sample rate and number of channels, then see whether a good number of files are the right size:
# for 44.1 kHz stereo
ls | wc -l                              # total number of files
find . -size -900k -type f | wc -l      # shorter than about 5 seconds
find . -size +5000k -type f | wc -l     # longer than about 30 seconds
This tells you how many you are throwing out. If there are not many the right size, go back to Audacity and search again for silences using different parameters. Then, when it is right:
find . -size -900k -type f -exec rm {} +
find . -size +5000k -type f -exec rm {} +
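Note that find counts `k` in 1024-byte blocks and rounds sizes up, so the thresholds above are only approximate. If you want to match the table exactly, the `c` suffix compares plain byte counts. A rough sketch, where `count_in_range` is a hypothetical helper and the byte limits are the 5 and 30 second figures for 44.1 kHz stereo:

```shell
# count_in_range DIR MIN_BYTES MAX_BYTES
# Counts .wav files under DIR whose size is strictly between the two limits.
# The 'c' suffix makes find compare exact byte counts rather than blocks.
count_in_range() {
    find "$1" -type f -name '*.wav' -size "+${2}c" -size "-${3}c" | wc -l | tr -d ' '
}

# 44.1 kHz stereo, 16-bit: 5 s is 882000 bytes, 30 s is 5292000 bytes
count_in_range . 882000 5292000
```

Comparing this count with the total from `ls | wc -l` shows what fraction of the files would survive the clean-up.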
Audacity numbers the files without zero padding, so they sort in the wrong order. Fix that with:
rename 's/(\d+\.wav)$/00000$1/' *.wav
rename 's/0+(\d{4}\.wav)$/$1/' *.wav
(/usr/bin/rename seems to come with Perl. The rename from util-linux takes different arguments and will not understand these expressions.)
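If the Perl rename is not available, much the same zero padding can be done in plain shell. This is a sketch rather than a drop-in replacement: `pad_numbers` is a hypothetical helper, and it assumes the number is the last thing before `.wav`, as with prefix-and-number naming.

```shell
# pad_numbers DIR
# Renames files like utt-7.wav in DIR to utt-0007.wav (four-digit padding).
pad_numbers() {
    for f in "$1"/*.wav; do
        [ -e "$f" ] || continue            # no .wav files at all
        base=${f##*/}                      # file name without the directory
        stem=${base%.wav}                  # drop the extension
        digits=${stem##*[!0-9]}            # trailing run of digits
        [ -n "$digits" ] || continue       # skip names with no trailing number
        head=${stem%"$digits"}             # everything before the number
        n=$digits
        # strip leading zeros so printf does not read the number as octal
        while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do n=${n#0}; done
        new=$(printf '%s%04d.wav' "$head" "$n")
        [ "$base" = "$new" ] || mv -- "$f" "$1/$new"
    done
}

# usage, from the directory holding the exported files:
# pad_numbers .
```

Running it twice is harmless, since already padded names are left alone.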
Open the transcript in a text editor. Keep a backup. Listen to the files with:
mkdir done
# play comes with SoX; press Enter after each file to move it to done/
for x in *.wav; do play "$x"; echo "$x"; read && mv "$x" done; done
For each file, delete everything in the transcript up to the first word you hear, write in the file number, then insert a line break after the last word.
Look out for any numerals -- you'll need to spell them out.
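A quick way to spot numerals is to grep the finished transcript for digits. A minimal sketch, where `find_numerals` is a hypothetical helper and `transcript.txt` is a placeholder for whatever your file is called:

```shell
# find_numerals FILE
# Lists every line of FILE that contains a digit, prefixed with its line number.
find_numerals() {
    grep -n '[0-9]' "$1"
}

# usage:
# find_numerals transcript.txt
```

No output means there are no digits left to spell out.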