From: jaiger
As part of your nightly job, perhaps you can also collect metrics on the archive such as:
- total audio data, in seconds and MB
- the total compared with some goal: e.g. we're at 10% of our 100-hour goal
- similar audio submission metrics by user: e.g. jaiger submitted 1 hour of audio, or 1% of the 100-hour goal
Collecting and publishing the metrics might spur submissions from those of us with competitive personalities, and would at least show us where we stand relative to our project goals.
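To make the idea concrete, here is a rough sketch of what such a nightly metrics pass could look like. It is only an illustration under assumptions of mine: that the archive is a directory tree laid out as <archive>/<user>/.../*.wav, and that the audio is plain WAV readable by Python's standard wave module. The names collect_metrics and wav_seconds are made up for this example.

```python
import wave
from collections import defaultdict
from pathlib import Path

GOAL_HOURS = 100  # the project goal used in the examples above


def wav_seconds(path):
    """Return the duration of a WAV file in seconds, from its header."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def collect_metrics(archive_root, goal_hours=GOAL_HOURS):
    """Sum audio seconds and file sizes, in total and per user.

    Assumed (hypothetical) layout: archive_root/<user>/.../*.wav,
    so the first path component under the root is taken as the user.
    """
    root = Path(archive_root)
    goal_seconds = goal_hours * 3600
    per_user = defaultdict(lambda: {"seconds": 0.0, "bytes": 0})

    for path in sorted(root.rglob("*.wav")):
        user = path.relative_to(root).parts[0]
        per_user[user]["seconds"] += wav_seconds(path)
        per_user[user]["bytes"] += path.stat().st_size

    def summarize(seconds, nbytes):
        return {
            "seconds": seconds,
            "megabytes": nbytes / 1e6,
            "percent_of_goal": 100.0 * seconds / goal_seconds,
        }

    users = {u: summarize(v["seconds"], v["bytes"]) for u, v in per_user.items()}
    total = summarize(
        sum(v["seconds"] for v in per_user.values()),
        sum(v["bytes"] for v in per_user.values()),
    )
    return {"total": total, "users": users}
```

Calling collect_metrics("/path/to/archive") returns a dict with a "total" entry and one entry per user under "users"; the nightly job could format that into whatever report page is convenient.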
For ease of future programming, you might also create (at submission time) an XML file containing the license, prompts, and README data. The XML file could also hold other data, such as the time metrics calculated as above, or MD5 hashes of the audio files so that a downloaded file can be checked for corruption. This might make it easier for future scripts to manipulate the data, say for import into a database or for other queries.
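As a sketch of what I mean by the XML file, something like the following could run at submission time. The element and attribute names (submission, document, audio, file, md5) and the file names LICENSE, prompts, and README are placeholders I picked for illustration, not an existing schema, and I am again assuming WAV audio.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(submission_dir, text_files=("LICENSE", "prompts", "README")):
    """Build an XML tree describing one submission directory.

    The text file names and the XML vocabulary are hypothetical.
    """
    root_dir = Path(submission_dir)
    submission = ET.Element("submission", {"name": root_dir.name})

    # Embed the text documents that exist in the submission.
    for name in text_files:
        path = root_dir / name
        if path.is_file():
            doc = ET.SubElement(submission, "document", {"name": name})
            doc.text = path.read_text(encoding="utf-8", errors="replace")

    # Record size and MD5 for every audio file, for later integrity checks.
    audio = ET.SubElement(submission, "audio")
    for path in sorted(root_dir.rglob("*.wav")):
        ET.SubElement(audio, "file", {
            "path": path.relative_to(root_dir).as_posix(),
            "bytes": str(path.stat().st_size),
            "md5": md5_of_file(path),
        })
    return ET.ElementTree(submission)
```

The tree can be saved with build_manifest(some_dir).write("manifest.xml", encoding="utf-8", xml_declaration=True). Someone who downloads a file can then recompute its MD5 and compare it with the md5 attribute to check for corruption.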