audio issues in mir evaluation - eecs - school of ...josh/documents/reisssandler-ismir2003.pdf · 3...
Audio issues in MIR evaluation
• Overview of audio formats
• Preferred presentation of audio files in an MIR testbed
• A set of simple recommendations
Audio Formats I
1. Apple → AIFF (Audio Interchange File Format)
2. Microsoft, IBM → WAV (Wave)
• Stores data in chunks
• Supports a variety of
  – Bit resolutions
  – Sample rates
  – Channels
• 2 most common uncompressed formats
• Digital Audio Workstations support both
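Both formats store audio as a sequence of tagged chunks. As a minimal sketch of what that means for WAV (a little-endian RIFF container), the Python below walks the top-level chunks of a file and decodes the PCM fields of the `fmt ` chunk. The helper names (`iter_riff_chunks`, `parse_fmt`, `make_demo_wav`) are illustrative and not from the slides; the demo file is generated in memory with the standard `wave` module so no external audio is needed.

```python
import io
import struct
import wave


def iter_riff_chunks(data: bytes):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE file."""
    riff_id, riff_size, form_type = struct.unpack_from("<4sI4s", data, 0)
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    offset = 12                            # skip 'RIFF', size, 'WAVE'
    end = min(len(data), 8 + riff_size)    # size field excludes the first 8 bytes
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        payload = data[offset + 8 : offset + 8 + size]
        yield chunk_id.decode("ascii"), payload
        offset += 8 + size + (size & 1)    # chunks are padded to an even length


def parse_fmt(payload: bytes) -> dict:
    """Decode the 16 PCM bytes at the start of a 'fmt ' chunk."""
    tag, channels, rate, _byte_rate, _block_align, bits = struct.unpack_from(
        "<HHIIHH", payload, 0
    )
    return {
        "format_tag": tag,            # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": rate,
        "bits_per_sample": bits,
    }


def make_demo_wav(channels=2, rate=48000, sample_width=3, frames=10) -> bytes:
    """Build a short silent WAV in memory (hypothetical test data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(bytes(channels * sample_width * frames))
    return buf.getvalue()


for chunk_id, payload in iter_riff_chunks(make_demo_wav()):
    print(repr(chunk_id), len(payload), "bytes")
```

AIFF follows the same idea, but with big-endian sizes inside a `FORM` container, so the sketch would need adapting before it could read AIFF files.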
Broadcast Wave Format
• EBU standard
• Based on Wave audio files
• Additional header chunk
  – defines data format
  – sound sequence description
  – originator name
  – origination date and time
• Play on any system capable of playing WAV
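To make the additional header chunk concrete, here is a hedged Python sketch that decodes the leading fields of a `bext` chunk payload. It assumes the field layout given in EBU Tech 3285 (description, originator, originator reference, origination date and time, a 64-bit time reference, version); the layout should be checked against the specification before relying on it. `parse_bext` and `make_demo_bext` are illustrative names, and the demo values are made up.

```python
import struct

# Leading fixed-size fields of the 'bext' payload (assumed per EBU Tech 3285):
# Description[256], Originator[32], OriginatorReference[32],
# OriginationDate[10], OriginationTime[8],
# TimeReferenceLow (uint32), TimeReferenceHigh (uint32), Version (uint16)
BEXT_HEAD = struct.Struct("<256s32s32s10s8sIIH")


def _text(raw: bytes) -> str:
    """Strip NUL padding from a fixed-width ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


def parse_bext(payload: bytes) -> dict:
    """Decode the leading fields of a 'bext' chunk payload."""
    desc, originator, ref, date, time, tr_lo, tr_hi, version = BEXT_HEAD.unpack_from(
        payload, 0
    )
    return {
        "description": _text(desc),
        "originator": _text(originator),
        "originator_reference": _text(ref),
        "origination_date": _text(date),
        "origination_time": _text(time),
        # Sample count since midnight, split across two 32-bit words.
        "time_reference": (tr_hi << 32) | tr_lo,
        "version": version,
    }


def make_demo_bext() -> bytes:
    """Build a hypothetical payload; fields after Version are left zeroed."""
    head = BEXT_HEAD.pack(
        b"Demo take", b"Hypothetical Studio", b"REF0001",
        b"2000-01-01", b"12:00:00", 48000, 0, 1,
    )
    return head + bytes(602 - BEXT_HEAD.size)


print(parse_bext(make_demo_bext()))
```

Because `bext` is just one more chunk in the file, a player that does not know about it can skip it and still play the audio, which is why these files play on any WAV-capable system.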
AES31
1. physical data transport
  – How files move between systems via removable media or networks
2. audio file format
  – Broadcast Wave format
3. simple project structure
  – Audio Decision List (ADL)
4. object-oriented project structure
Audio Codecs
• MPEG-2
• MP3
• MPEG-4 AAC
  – Advanced Audio Coding, efficient, extensible
• Dolby Digital AC-3
• DTS
• SDDS (Sony Dynamic Digital Sound)
• WMA9
  – High res compression
  – Windows specific features
Wrappers and exchange formats
• AES31
• OMF (Open Media Framework), 1997 → AAF (Advanced Authoring Format)
  – metadata, DRM, links, interactive content
  – platform independent, extensible, royalty free
• MXF (Material Exchange Format)
  – file wrapper
  – read metadata regardless of internal data
Simple Guidelines #1
• Audio files should be presented in the highest quality format possible, ideally the original master recordings. If a compressed format is used, it should be used in tandem with the original format.
• Why?
Examples
• Preparation
  – Companding
  – Equalisation
  – Boost
• Analysis
  – Wavelet analysis
    – WMA
  – Time domain analysis
    – low sampling rate files
  – Instrument templates
    – Masking and high frequency content in mp3s
Audio Artefacts
• Pre-echo
• Aliasing (low sampling frequency)
• Birdies (masking)
• Loss of Stereo Image
• Repeated Encoding
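Of the artefacts listed, aliasing is the easiest to show numerically. The sketch below assumes an ideal sampler with no anti-aliasing filter and a pure tone as input; under those assumptions any component above half the sampling rate folds back to a lower frequency. The function name `alias_frequency` is illustrative.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency of a pure tone f_hz after ideal sampling at fs_hz.

    Assumes no anti-aliasing filter: the tone folds around multiples of the
    sampling rate and lands somewhere between 0 and fs_hz / 2.
    """
    if fs_hz <= 0:
        raise ValueError("sampling rate must be positive")
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)


# A 7 kHz partial in a file sampled at 8 kHz is indistinguishable from 1 kHz.
print(alias_frequency(7000, 8000))
# A 3 kHz partial lies below fs / 2 and is unaffected.
print(alias_frequency(3000, 8000))
```

An analysis method that relies on high-frequency content would therefore see spurious low-frequency energy in a low sampling rate file, unless the file was properly filtered before resampling.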
Why use the master recordings?
• Guarantee highest quality
  – Then test robustness
• Far richer data
  – 20+ tracks
  – 96 kHz+
  – 24 bit
• Many audio processing methods reverse-engineer
  – Source separation
  – Instrument recognition
  – Transcription
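To put "far richer data" in numbers, the uncompressed PCM data rate is simply tracks × sample rate × bytes per sample. The short sketch below compares a 20-track, 96 kHz, 24-bit master with a stereo 44.1 kHz, 16-bit mix; the comparison point and the function name are assumptions added for illustration, not figures from the slides.

```python
def pcm_bytes_per_second(tracks: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed PCM data rate in bytes per second (headers ignored)."""
    return tracks * sample_rate_hz * bits_per_sample // 8


master = pcm_bytes_per_second(tracks=20, sample_rate_hz=96_000, bits_per_sample=24)
stereo_mix = pcm_bytes_per_second(tracks=2, sample_rate_hz=44_100, bits_per_sample=16)

print(f"master:     {master:,} bytes/s")
print(f"stereo mix: {stereo_mix:,} bytes/s")
print(f"ratio:      {master / stereo_mix:.1f}x")
```

The extra data is what source separation, instrument recognition and transcription try to recover from a mixdown, so having the multitrack master gives a ground truth to evaluate against.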
Simple Guidelines #2
• Audio files should be presented in the most open format possible. Users should be allowed to understand how the file was encoded and access the files from any operating system, on any platform, using any development environment.
Simple Guidelines #3
• Audio files should be presented in the simplest format possible. Tools should be available to read and write files in that format.
– Converters should be provided for exchange between all major formats.
Recommendations I
• Popular, simple, well-understood and uncompressed format for original files
  – WAV
• Provide various converters
  – Store files in multiple formats
• Guarantee interoperability and usability
  – OS
  – Development environment
  – Programming language
  – Media player
Recommendations II
• Very low quality popular format with embedded artefacts for listening tests
  – Low sample rate, highly compressed mp3s
  – Mono
  – Watermarks
• Emphasised responsibility
• Enhanced liability
  – Thumbnails
  – Streaming
  – Pings, drop-outs,…? annoying
• Listening tests
• Demonstrations
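As a rough sketch of how a low-quality listening copy might be derived from a WAV original, the Python below mixes a 16-bit PCM file down to mono and reduces its sample rate by an integer factor. It uses only the standard library, so it stops short of mp3 encoding and watermarking, which would need external tools. Block averaging stands in for a proper low-pass filter and is deliberately crude. `make_preview` and its `factor` parameter are hypothetical names.

```python
import io
import struct
import wave


def make_preview(wav_bytes: bytes, factor: int = 4) -> bytes:
    """Return a mono copy of a 16-bit PCM WAV at 1/factor of the sample rate."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit PCM only")
        channels = src.getnchannels()
        rate = src.getframerate()
        n_frames = src.getnframes()
        raw = src.readframes(n_frames)
    # The wave module hands back samples in native byte order.
    samples = struct.unpack(f"={n_frames * channels}h", raw)

    # Mix down to mono by averaging the channels of each frame.
    mono = [
        sum(samples[i : i + channels]) // channels
        for i in range(0, len(samples), channels)
    ]

    # Crude decimation: average each block of `factor` samples.
    usable = len(mono) - len(mono) % factor
    low = [sum(mono[i : i + factor]) // factor for i in range(0, usable, factor)]

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate // factor)
        dst.writeframes(struct.pack(f"={len(low)}h", *low))
    return out.getvalue()
```

A testbed could keep such previews alongside the originals for listening tests and demonstrations, while all analysis runs on the uncompressed files.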