LAM: Musical Audio Similarity
Michael Casey
Centre for Cognition, Computation and Culture
Department of Computing
Goldsmiths College, University of London
Overview
• Machine Music Understanding
  • Features / Classes / Clusters
• Real-Time Audio Matching
  • Feature Extraction
  • Feature Similarity (Indexing / Retrieval)
  • PD/MSP Tools
• Music Similarity Applications
  • Sound object matching
  • Texture matching
Sound Understanding
Signal Processing Sound Understanding
Feature Extraction
[Figure: audio source divided into overlapping frames (frame 1, frame 2, frame 3); time axis marked at 10 ms, 20 ms, 30 ms, 40 ms]
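The overlapping framing in the figure can be sketched as follows. This is a minimal illustration, not the author's implementation: the 20 ms frame length, the 10 ms hop, the 16 kHz sample rate and the name `frame_signal` are all assumptions chosen so that three frames span the 40 ms shown on the time axis.

```python
import numpy as np

def frame_signal(x, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split a 1-D signal into overlapping frames, one frame per row.

    frame_ms and hop_ms are assumed values, not taken from the slides.
    """
    x = np.asarray(x, dtype=float)
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if len(x) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(x) - frame_len) // hop_len
    # Row i covers samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]

sr = 16000                                   # assumed sample rate
x = np.arange(sr * 40 // 1000, dtype=float)  # 40 ms of dummy samples
frames = frame_signal(x, sr)                 # three frames, 50% overlap
```

With these settings consecutive frames share half of their samples, so each feature vector computed per frame advances by 10 ms.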
Statistical Learning for Decision Making
$P(\text{class} \mid x) = \dfrac{p(x \mid \text{class})\, P(\text{class})}{p(x)}$
Decision boundary
Partitioning of feature space
Music / Speech
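A small worked example of this decision rule for the two classes named on the slide. It is only a sketch under strong assumptions: a single scalar feature, Gaussian class-conditional densities, and made-up priors, means and standard deviations. None of these values come from the talk.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posteriors(x, classes):
    """classes maps name -> (prior, mean, std); returns name -> P(name | x)."""
    joint = {name: gaussian_pdf(x, mean, std) * prior
             for name, (prior, mean, std) in classes.items()}
    evidence = sum(joint.values())  # p(x), the normalising term
    return {name: j / evidence for name, j in joint.items()}

# Hypothetical parameters, for illustration only.
classes = {"music": (0.5, 0.0, 1.0), "speech": (0.5, 2.0, 1.0)}
post = posteriors(0.5, classes)
decision = max(post, key=post.get)  # class with the largest posterior
```

With equal priors and equal spreads, the decision boundary in this toy setup sits midway between the two means, which is where the two posteriors are equal.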
MPEG-7 Audio Tools
• Audio
• Log Frequency Spectrogram: AudioSpectrumEnvelopeD
• Log Amplitude
• Decorrelating Transform / Dimension Reduction: AudioSpectrumProjectionD
• State Path: SoundModelStatePathD
• Use estimated state sequence as a feature
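The first stages of this chain (log-frequency spectrum, log amplitude, decorrelating transform with dimension reduction) can be approximated in a few lines. This is a simplified stand-in, not the normative MPEG-7 extraction: the Hann window, the band layout, the use of an SVD as the decorrelating transform, and the names `spectral_projection`, `n_bands` and `n_components` are all assumptions.

```python
import numpy as np

def spectral_projection(frames, n_bands=8, n_components=3, eps=1e-10):
    """Log-band spectral envelope in dB, then an SVD-based projection."""
    frames = np.asarray(frames, dtype=float)
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    n_bins = power.shape[1]
    # Logarithmically spaced band edges over FFT bins (bin 0, DC, is skipped).
    edges = np.unique(np.round(np.geomspace(1, n_bins, n_bands + 1)).astype(int))
    bands = np.stack(
        [power[:, lo:hi].sum(axis=1) for lo, hi in zip(edges[:-1], edges[1:])],
        axis=1,
    )
    log_env = 10.0 * np.log10(bands + eps)     # log amplitude (dB)
    centered = log_env - log_env.mean(axis=0)  # remove the per-band mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T                # leading right singular vectors
    return log_env, centered @ basis           # envelope, reduced features

rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 256))        # 50 dummy frames of noise
log_env, projection = spectral_projection(frames)
```

The columns of `projection` are mutually uncorrelated by construction, which is the property the decorrelating step is meant to provide before a model is fitted to the features.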
MPEG-7 Audio Strings
Acoustic Lexicons
• Audio
• Log Frequency Spectrogram: AudioSpectrumEnvelopeD
• Log Amplitude
• Decorrelating Transform / Dimension Reduction: AudioSpectrumProjectionD
• Hidden Markov Model: SoundModelDS
• State Path: SoundModelStatePathD
• Symbol string: ? 7 1 V 7 1 0 1 ...
State Symbol Sequence (40 State Model)
?71V7101 ...
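One way a state path could be written as a symbol string is sketched below. The encoding is an assumption, not something the slides state: each state index s is mapped to the printable ASCII character with code 48 + s. Under that assumption the sample string decodes to indices below 40, but the actual mapping used in the talk is not given. The function names are hypothetical.

```python
def states_to_string(states, base=ord("0")):
    """Encode state indices as characters (assumed offset: code = base + index)."""
    return "".join(chr(base + s) for s in states)

def string_to_states(symbols, base=ord("0")):
    """Inverse of states_to_string under the same assumed offset."""
    return [ord(c) - base for c in symbols]

path = string_to_states("?71V7101")
symbols = states_to_string(path)
```

A one-character-per-frame encoding keeps string positions aligned with frame times, so standard string tools can be applied directly to state paths.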
SoundModelStateHistogramD
[Figure: two panels with axes labelled state index; time axis in seconds, 0.01 s frames]
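A state histogram can be computed from a state path by counting how often each state occurs. The sketch below normalises the counts to relative frequencies, which is an assumption about the descriptor's scaling. The function name and the example path are illustrative only.

```python
import numpy as np

def state_histogram(state_path, n_states):
    """Relative frequency of each state in a non-empty state path."""
    counts = np.bincount(np.asarray(state_path, dtype=int), minlength=n_states)
    return counts / counts.sum()

# Example path for a 40-state model (values are illustrative).
hist = state_histogram([15, 7, 1, 38, 7, 1, 0, 1], n_states=40)
```

The histogram discards frame order, so two segments that visit the same states in different orders get the same vector. The string representation keeps the order.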
Self-Similarity Matrix
$\cos(a, b) = \dfrac{a^{T} b}{\|a\|\,\|b\|}$
[Figure: S-Matrix, with feature vectors labelled a and b]
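A self-similarity matrix compares every frame of a recording with every other frame. The sketch below assumes the comparison is the cosine of the angle between per-frame feature vectors. Whether the slides use the cosine itself or a distance derived from it is not recoverable, so treat that choice, and the name `self_similarity`, as assumptions.

```python
import numpy as np

def self_similarity(features, eps=1e-12):
    """S[i, j] = cosine between feature vectors i and j (one vector per row)."""
    f = np.asarray(features, dtype=float)
    norms = np.linalg.norm(f, axis=1, keepdims=True)
    unit = f / np.maximum(norms, eps)  # guard against all-zero frames
    return unit @ unit.T

S = self_similarity([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Repeated material in the audio shows up as off-diagonal stripes of high similarity, while the main diagonal is always 1 for non-zero frames.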
Efficient Storage / Retrieval
• Real-Time Access
• Large Databases
• Distributed Databases
PostgreSQL Database Representation of State Path “Strings” and Histograms
Similarity
• Compute distance between feature pairs
  • Features == SoundModelStateHistogramD
• Similarity Metric
  • dist(a,b) >= 0
  • dist(a,b) == 0 iff a == b
  • dist(a,b) + dist(b,c) >= dist(a,c)
• Vector Dot Product
$\cos(a, b) = \dfrac{a^{T} b}{\|a\|\,\|b\|}$
Similarity of Feature Trajectories
Dynamic Time Warping
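For reference, a minimal dynamic time warping distance between two feature trajectories. This is the textbook recurrence with Euclidean local cost and no path constraints. It is a sketch, not necessarily the variant used in the talk.

```python
import numpy as np

def dtw_distance(a, b):
    """Cumulative cost of the best monotonic alignment of sequences a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat scalars as 1-D feature vectors
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # local frame distance
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])  # time-stretched copy
different = dtw_distance([0, 1, 2], [0, 1, 3])
```

Because a frame may be matched to several frames of the other sequence, a time-stretched copy of a trajectory aligns at zero cost, which a frame-by-frame comparison would not allow.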
Acousticon Strings
• Distance Metric
  – String Edit Distance (Levenshtein)
• Scalable to Large Databases
  – PostgreSQL Implementation
  – Can use built-in Index Structures
• Scalable to Real-Time Implementation
  – matching and audio streaming (< 20 ms)
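The string edit distance named above can be sketched with the standard two-row dynamic programme. The example strings reuse the state-symbol notation from earlier slides and are illustrative only. This is plain Python, not the PostgreSQL implementation the slide refers to.

```python
def levenshtein(s, t):
    """Minimum number of insertions, deletions and substitutions turning s into t."""
    if len(s) < len(t):
        s, t = t, s  # keep the shorter string on the inner loop
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,               # delete cs
                current[j - 1] + 1,            # insert ct
                previous[j - 1] + (cs != ct),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]

d_same = levenshtein("?71V7101", "?71V7101")
d_near = levenshtein("?71V7101", "?71V711")  # one symbol dropped
```

Edit distance satisfies the metric properties listed on the Similarity slide, which is what allows metric index structures to be used for lookup.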
Information Retrieval for Creativity
• Utilize an extant sound database for new material
• Take the structure of a music clip but replace the content.
• New interfaces for music creativity.
Audio Information Retrieval
• Audio Query
• Extract: feature extraction from audio.
• Segment: partitioning of audio into chunks.
• Match: find similar chunks of audio.
• MPEG-7 Database: a pre-indexed collection of sounds.
• Result List: a sound or scene or list of sounds.
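The segment and match stages can be sketched end to end on state paths. Everything here is an assumption made for illustration: fixed-length chunks, state histograms as the chunk feature, cosine similarity as the match score, and the names `chunk_histograms` and `rank_matches`. The database is a toy array, not an MPEG-7 store.

```python
import numpy as np

def chunk_histograms(state_path, n_states, chunk_len):
    """Split a state path into fixed-length chunks; one histogram per chunk."""
    path = np.asarray(state_path, dtype=int)
    n_chunks = len(path) // chunk_len  # a trailing partial chunk is dropped
    hists = np.zeros((n_chunks, n_states))
    for k in range(n_chunks):
        chunk = path[k * chunk_len:(k + 1) * chunk_len]
        hists[k] = np.bincount(chunk, minlength=n_states) / chunk_len
    return hists

def rank_matches(query_hist, database_hists, top_k=3):
    """Return (chunk index, cosine score) pairs, best match first."""
    q = query_hist / np.linalg.norm(query_hist)
    db = database_hists / np.linalg.norm(database_hists, axis=1, keepdims=True)
    scores = db @ q
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 2-state database of three 4-frame chunks, and a one-chunk query.
database = chunk_histograms([0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], 2, 4)
query = chunk_histograms([0, 0, 0, 1], 2, 4)[0]
results = rank_matches(query, database)
```

The returned list plays the role of the result list: indices of database chunks ordered by how closely their state usage matches the query chunk.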
Real-Time Matching
Musaics