Page 1: A Short Introduction to Audio Fingerprintinghpac.rwth-aachen.de/teaching/sem-mus-17/Final-slides/Froitzheim.pdf · Synopsis References Simon Froitzheim (RWTH Aachen) Audio Fingerprinting

Audio Fingerprinting Basics An Algorithmic Overview Shazam Conclusive Remarks

A Short Introduction to Audio Fingerprinting

Simon Froitzheim

RWTH Aachen University

[email protected]

June 28th, 2017

Simon Froitzheim (RWTH Aachen) Audio Fingerprinting June 28th, 2017 1 / 26

Overview

1 Audio Fingerprinting Basics

2 An Algorithmic Overview

3 Shazam: Overview, Technical Details

4 Conclusive Remarks: Synopsis, References

Audio Fingerprinting Basics

Central Issue

Given: An unlabeled piece of audio in any format

Goal: To encode the given audio piece in a so-called "fingerprint"

Why is this useful?

Applications for Audio Fingerprinting

Content-based audio identification (CBID)

Content-based integrity verification

Watermarking support

All of this implies storage, indexing and comparison of fingerprints!

Content-based Audio Identification

Goal: To identify audio tracks solely based on the track itself (without any given metadata)

This can be achieved by:

Computing a fingerprint for every known audio piece

Storing each fingerprint in a database

Content-based Audio Identification (continued)

When confronted with an unknown audio excerpt:

Compute the corresponding fingerprint

Match it against the database

Identify music piece based on the matching
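The two phases above (indexing every known track, then matching an unknown excerpt) can be sketched as follows. This is a toy illustration under stated assumptions, not the method of any particular system: toy_fingerprint is a hypothetical stand-in that "fingerprints" a sequence of integers by collecting its overlapping windows, and the database is a plain dictionary.

```python
from collections import Counter, defaultdict


def toy_fingerprint(samples, window=4):
    """Hypothetical stand-in fingerprint: the set of all length-`window` sub-sequences."""
    return {tuple(samples[i:i + window]) for i in range(len(samples) - window + 1)}


def build_database(tracks):
    """Map every sub-fingerprint to the set of track names that contain it."""
    database = defaultdict(set)
    for name, samples in tracks.items():
        for sub in toy_fingerprint(samples):
            database[sub].add(name)
    return database


def identify(database, excerpt):
    """Return the track sharing the most sub-fingerprints with the excerpt, or None."""
    votes = Counter()
    for sub in toy_fingerprint(excerpt):
        for name in database.get(sub, ()):
            votes[name] += 1
    return votes.most_common(1)[0][0] if votes else None


tracks = {
    "track_a": [1, 5, 2, 8, 3, 9, 4, 7, 6, 0],
    "track_b": [9, 9, 1, 2, 3, 4, 5, 6, 7, 8],
}
database = build_database(tracks)
# The excerpt is cropped from the middle of track_a.
print(identify(database, [2, 8, 3, 9, 4]))
```

Because the excerpt is matched through many small sub-fingerprints rather than one value for the whole track, a cropped excerpt can still be found. Exact-match lookup like this would not survive distortions, which is what the requirements on the following slide address.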

Requirements for Audio Fingerprinting Techniques [Cano, 2002]

Discrimination power

Robustness

Cropping

Distortions

Compactness

Computational simplicity

The usual trade-off between reliability and efficiency is present here!

An Algorithmic Overview

Audio Fingerprinting - A Modern Research Field

Various procedures have been developed since the very late 20th century

Most use the same general framework

Of high commercial interest for music companies

Especially relevant for mobile devices

State-of-the-Art

The Philips technique [Haitsma, 2002]

Shazam [Wang, 2003]

Google Waveprint [Baluja, 2007]

MASK [Anguera, 2012]

Overview Technical Details

Shazam

Application Details

Free mobile app that recognizes music, TV and media based on short audio snippets recorded with your phone

Features special camera interaction for interactive experiences and additional content

Possible connections to Google, Snapchat, Facebook, Spotify, ...

Encourages music purchases

Demonstration

Let’s see Shazam in action!

How Does it Work? - An Overview

The main steps of Shazam are:

Spectrogram computation

Constellation map construction

Combinatorial hashing

Database searching

Scoring of possible matches
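The last two steps, database searching and scoring, can be sketched as follows. The idea, as described in [Wang, 2003], is that for the correct track the matching hashes share a consistent time offset between database and query, so the score is the size of the largest group of hits with the same offset. The function names, the string hashes and the tiny data set are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def build_index(tracks):
    """tracks: {track_id: [(hash, time), ...]} -> {hash: [(track_id, time), ...]}"""
    index = defaultdict(list)
    for track_id, entries in tracks.items():
        for h, t in entries:
            index[h].append((track_id, t))
    return index


def score_matches(index, query):
    """Score each track by its largest group of hits sharing one time offset."""
    offsets = defaultdict(Counter)
    for h, t_query in query:
        for track_id, t_track in index.get(h, ()):
            offsets[track_id][t_track - t_query] += 1
    return {track_id: max(hist.values()) for track_id, hist in offsets.items()}


# Placeholder hashes ("a", "b", ...) stand in for real 32-bit hash values.
tracks = {
    "song_1": [("a", 0), ("b", 1), ("c", 2), ("d", 3), ("e", 4)],
    "song_2": [("c", 0), ("x", 1), ("b", 2), ("y", 3), ("d", 9)],
}
index = build_index(tracks)
# The excerpt starts two time steps into song_1, so its hits line up at offset 2.
query = [("c", 0), ("d", 1), ("e", 2)]
scores = score_matches(index, query)
print(max(scores, key=scores.get), scores)
```

Here song_2 also contains two of the query hashes, but at inconsistent offsets, so it scores lower than song_1.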

Spectrograms

Visually represent the signal strength (energy) of a signal over time at various frequencies

Can also be used for sound waves
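A spectrogram can be computed by cutting the signal into short overlapping frames and measuring the energy per frequency bin in each frame. The sketch below uses only the Python standard library and a naive discrete Fourier transform so that it stays self-contained. The frame size, hop size and Hann window are assumptions chosen for the demonstration, and a real implementation would use an FFT.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds the energy in every frequency bin."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        energies = []
        # Naive DFT, O(frame_size^2): fine for a demo, too slow for real audio.
        for k in range(frame_size // 2 + 1):
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            energies.append(abs(coeff) ** 2)
        frames.append(energies)
    return frames


# A pure tone with exactly 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), "frames,", len(spec[0]), "bins, loudest bin:", loudest_bin)
```

Each row of spec is one time step and each column one frequency bin, which is the time-frequency grid the following steps operate on.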

From Spectrograms to Constellation Maps

Shazam selects high-energy spectrogram peaks as candidates, subject to a density criterion

Why high amplitudes? Robustness!

Results in a constellation map
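Peak selection can be sketched as a local-maximum search over the spectrogram grid. This is a simplified illustration, not Shazam's actual selection rule: the neighborhood parameter (which controls how dense the peaks can be) and the min_energy threshold are assumptions introduced for this example.

```python
def constellation_map(spec, neighborhood=1, min_energy=0.0):
    """Return (time_index, freq_bin) for every local energy maximum in `spec`."""
    peaks = []
    n_frames = len(spec)
    n_bins = len(spec[0]) if spec else 0
    for t in range(n_frames):
        for f in range(n_bins):
            value = spec[t][f]
            if value <= min_energy:
                continue
            neighbors = [
                spec[tt][ff]
                for tt in range(max(0, t - neighborhood), min(n_frames, t + neighborhood + 1))
                for ff in range(max(0, f - neighborhood), min(n_bins, f + neighborhood + 1))
                if (tt, ff) != (t, f)
            ]
            # Keep the point only if it dominates its whole neighborhood.
            if all(value > other for other in neighbors):
                peaks.append((t, f))
    return peaks


spec = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.8],
    [0.0, 0.1, 0.1, 0.2],
]
peaks = constellation_map(spec, neighborhood=1, min_energy=0.5)
print(peaks)
```

Only the coordinates of the peaks are kept; the energy values themselves are discarded, which is part of what makes the map robust against changes in volume or equalization.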

Combinatorial Hashing

Anchor points and target zones are determined

Pairwise hashing between anchor and points in the zone

Two frequency components plus the time difference form one hash
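The pairing step can be sketched as follows: every peak acts as an anchor, and it is paired with the peaks inside a target zone slightly ahead of it in time. The shape of the target zone (min_dt, max_dt, max_df) and the fan_out limit are assumed values for this illustration, not Shazam's actual parameters.

```python
def pair_hashes(peaks, min_dt=1, max_dt=3, max_df=2, fan_out=3):
    """Yield ((f_anchor, f_target, dt), t_anchor) for anchor/target peak pairs."""
    peaks = sorted(peaks)  # sort by time, then by frequency
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # all later peaks are even further away in time
            # Target zone: a box a little ahead of the anchor in time.
            if dt >= min_dt and abs(f2 - f1) <= max_df:
                yield (f1, f2, dt), t1
                paired += 1
                if paired == fan_out:
                    break  # cap the number of pairs per anchor


peaks = [(0, 5), (1, 6), (2, 4), (2, 9), (5, 5)]
hashes = list(pair_hashes(peaks))
for key, anchor_time in hashes:
    print(key, "at anchor time", anchor_time)
```

Each hash depends only on two frequencies and a time difference, so it does not change when the excerpt starts at a different position in the track. The anchor time is stored next to the hash so that matches can later be checked for a consistent offset.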

Why is Hashing Necessary?

Problem: Constellation maps are very sparse, which results in long matching times

Hashing enables the use of 64-bit structs (32 bits for the hash, 32 bits for the time offset and track ID)

Limited number of points in every target zone to limit combinatorial explosion

Overall trade: 10 times more disk space for 10,000 times faster matching
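The 64-bit layout can be sketched with bit shifts and the standard struct module. The slide only fixes the two 32-bit halves; the bit widths inside each half (10 + 10 + 12 bits for the hash, 14 + 18 bits for time offset and track ID) are assumptions made for this example.

```python
import struct

FREQ_BITS = 10    # assumed width of each frequency component
DT_BITS = 12      # assumed width of the time difference
OFFSET_BITS = 14  # assumed split of the second 32-bit word
TRACK_BITS = 18


def _check(value, bits, name):
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit into {bits} bits")


def pack_entry(f1, f2, dt, offset, track_id):
    """Pack one database entry into 8 bytes: 32-bit hash + 32-bit (offset, track ID)."""
    _check(f1, FREQ_BITS, "f1")
    _check(f2, FREQ_BITS, "f2")
    _check(dt, DT_BITS, "dt")
    _check(offset, OFFSET_BITS, "offset")
    _check(track_id, TRACK_BITS, "track_id")
    hash32 = (f1 << (FREQ_BITS + DT_BITS)) | (f2 << DT_BITS) | dt
    value32 = (offset << TRACK_BITS) | track_id
    return struct.pack("<II", hash32, value32)


def unpack_entry(entry):
    """Inverse of pack_entry: return (f1, f2, dt, offset, track_id)."""
    hash32, value32 = struct.unpack("<II", entry)
    f1 = hash32 >> (FREQ_BITS + DT_BITS)
    f2 = (hash32 >> DT_BITS) & ((1 << FREQ_BITS) - 1)
    dt = hash32 & ((1 << DT_BITS) - 1)
    offset = value32 >> TRACK_BITS
    track_id = value32 & ((1 << TRACK_BITS) - 1)
    return f1, f2, dt, offset, track_id


entry = pack_entry(f1=513, f2=100, dt=37, offset=1200, track_id=4242)
print(len(entry), "bytes:", unpack_entry(entry))
```

Fixed-size entries like this can be stored in a flat array and compared as plain integers, which is one reason the hashed representation is so much faster to search than the raw constellation map.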

Searching and Scoring

Matching of all sample hashes with database hashes

Association of time pairs for every matching hash (both offset times)

Distribution of time pairs into bins according to track ID

Construction of a scatterplot of association between sample and database files

Detection of point clusters that form a diagonal line
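The steps above can be sketched as follows. Points on a diagonal line share the same difference between database time and sample time, so this sketch detects the diagonal by counting how often each difference occurs per track. The function names build_index and best_match and the dictionary-based index are assumptions for illustration, not the original implementation.

```python
from collections import Counter, defaultdict


def build_index(tracks):
    """tracks: {track_id: [(hash, t_db), ...]} -> {hash: [(track_id, t_db), ...]}"""
    index = defaultdict(list)
    for track_id, entries in tracks.items():
        for h, t_db in entries:
            index[h].append((track_id, t_db))
    return index


def best_match(index, sample):
    """sample: [(hash, t_sample), ...] -> (track_id, score) of the best track.

    The score is the size of the largest group of time pairs that share
    the same offset difference, i.e. that lie on one diagonal line.
    """
    # Collect time pairs for every matching hash, binned by track ID.
    bins = defaultdict(list)
    for h, t_sample in sample:
        for track_id, t_db in index.get(h, ()):
            bins[track_id].append((t_db, t_sample))

    best = (None, 0)
    for track_id, pairs in bins.items():
        # Histogram of offset differences; a tall peak means a diagonal.
        deltas = Counter(t_db - t_sample for t_db, t_sample in pairs)
        score = max(deltas.values())
        if score > best[1]:
            best = (track_id, score)
    return best


if __name__ == "__main__":
    db = build_index({
        "track_a": [(1, 10), (2, 11), (3, 12)],
        "track_b": [(1, 5), (9, 6)],
    })
    print(best_match(db, [(1, 0), (2, 1), (3, 2)]))
```

In practice a threshold on the score would decide whether the best track counts as a match at all; the value of such a threshold is not given in the slides.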

A Successful Match

Conclusive Remarks

Audio fingerprinting aims to construct a compact, discriminative, robust and efficient encoding of audio data

Works reliably on very short song snippets

Shazam, as an exemplary algorithm, is based on constellation maps and uses combinatorial hashing for speedup

References

J. Haitsma and T. Kalker (2002)

A Highly Robust Audio Fingerprinting System

Journal of New Music Research 32 (2), 211 – 221

E. Batlle, P. Cano, J. Haitsma and T. Kalker (2002)

A Review of Algorithms for Audio Fingerprinting

IEEE Workshop on Multimedia Signal Processing 2002

A. Wang (2003)

An Industrial-Strength Audio Search Algorithm

Proc. of the 4th International Conference on Music Information Retrieval

S. Baluja and M. Covell (2007)

Audio Fingerprinting: Combining Computer Vision & Data Stream Processing

Acoustics, Speech and Signal Processing

References - continued

X. Anguera, A. Garzon and T. Adamek (2012)

MASK: Robust Local Features for Audio Fingerprinting

IEEE International Conference on Multimedia and Expo

All graphics in this presentation are taken from [Wang, 2003].

Thank you for listening