beat tracking - the department of electrical engineering - columbia

19
E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19 Lecture 10: Beat Tracking Dan Ellis Dept. Electrical Engineering, Columbia University [email protected] http://www.ee.columbia.edu/~dpwe/e4896/ 1 1. Rhythm Perception 2. Onset Extraction 3. Beat Tracking 4. Dynamic Programming ELEN E4896 MUSIC SIGNAL PROCESSING

Upload: others

Post on 12-Feb-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Lecture 10:Beat Tracking

Dan EllisDept. Electrical Engineering, Columbia University

[email protected] http://www.ee.columbia.edu/~dpwe/e4896/1

1. Rhythm Perception2. Onset Extraction3. Beat Tracking4. Dynamic Programming

ELEN E4896 MUSIC SIGNAL PROCESSING

Page 2: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

1. Rhythm Perception

• What is rhythm?aspects, origin?

2

Page 3: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Rhythm Perception Experiments• Tapping experiment

ambiguous; hierarchy

3

0

10

20

30

40

time / sec

Freq

uenc

y

0 5 10 15 20 25 30

240

761

2416

7671

McKinney & Moelants 2006

mirex06/train10 Harnoncourt

Page 4: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Rhythm Tracking Systems• Two main components

front end: extract ‘events’ from audioback end: find plausible beat sequence to match

• Other outputstempotime signaturemetrical level(s)

4

Onset Eventdetection

Beatmarking

Musicalknowledge

Audio Beat times etc.

Page 5: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

2. Onset detection• Simplest thing is

energy envelope

5

e(n0) =W/2�

n=�W/2

w[n] |x(n + n0)|2

time / sec

freq

/ kHz

level

/ dB

level

/ dB

0 2 4 6 8 10

0

2

4

6

8

0102030405060

0102030405060

time / sec0 2 4 6 8 10

Harnoncourt Maracatu

Bello et al. 2005

emphasis on high frequencies?

f

f · |X(f, t)|

f

|X(f, t)|

Page 6: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Multiband Derivative• Sometimes energy just “shifts”

calculate & sum onset in multiple bandsuse ratio instead of difference - normalize energy

6

o(t) =�

f

W (f) max(0,|X(f, t)|

|X(f, t� 1)| � 1)

0 2 4 6 8 100

10203040

0 2 4 6 8 10

freq

/ kHz

level

/ dB

0

2

4

6

8

5060

time / sec time / sec

Puckette et. al 1998

bonk~

Page 7: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Phase Deviation• When amplitudes don’t change much,

phase discontinuity may signal new note

• Can detect by comparing actual phase with extrapolation from past

combine with amplitude?7

8.96 8.97 8.98 8.99 9 9.01 9.02 9.03 9.04 9.05 9.06

−0.1−0.05

00.050.10.15

X(f, tn-1)∆Θ∆Θ

X(f, tn+1)^

X(f, tn+1)

X(f, tn)

Re{X}

Im{X}

prediction

actual

Complexdeviation

Bello et al. 2005

X̂(f, tn+1) = X(f, tn)X(f, tn)

X(f, tn�1)

Page 8: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

3. Rhythm Tracking• Earliest systems were rule based

based on musicology Longuet-Higgins and Lee, 1982inspired by linguistic grammars - Chomsky

input: event sequence (MIDI)output: quarter notes, downbeats

8

Desain & Honing 1999

Page 9: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Resonators• How to address:

build-up of rhythmic evidence“ghost events”(audio input)

• Seems more like a comb filter...resonant filterbank of

for all possible T

9

Scheirer 1998

y(t) = �y(t� T ) + (1� �)x(t)

Page 10: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Multi-Hypothesis Systems• Beat is ambiguous→ develop several alternatives

inputs: music audiooutputs: beat times, downbeats, BD/SD patterns...

10

Goto & Muraoka 1994Goto 2001Dixon 2001

A/D conversion Beat prediction Beat informationtransmission

Frequencyanalysis

tf

Frequency

spectru

m

Musical audio signals

Compact disc

Beat informationAgents

Manager

Onset-finders vectorizers

Drum-soundfinder

time Onset-time

Higher-levelcheckers

Page 11: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

4. Dynamic Programming• Re-cast beat tracking as optimization:

Find beat times {ti} to maximize

is onset strength function is tempo consistency score e.g.

• Looks like an exponential search over all {ti}... but Dynamic Programming saves us

11

C({ti}) =N�

i=1

O(ti) + �N�

i=2

F (ti � ti�1, �p)

F (�t, �) = ��

log�t

�2F (�t, �)O(t)

Ellis 2007

Page 12: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Dynamic Programming (DP)• DP is a general algorithm for optimizing

“optimal substructure” problemsi.e. where optimal total solution can be built from optimal partial solutions

• e.g. best path through cost matrix

path after (i,j) is independent of how we got there12

Input frames fi

Ref

eren

ce fr

ames

r j D(i,j) = d(i,j) + min{ }D(i-1,j) + T10D(i,j-1) + T01

D(i-1,j-1) + T11D(i-1,j)

D(i-1,j) D(i-1,j)

T10

T 01

T 11 Local match cost

Lowest cost to (i,j)

Best predecessor (including transition cost)

Page 13: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Tempo Estimation• Algorithm needs global tempo period τ

otherwise problem is not “optimal substructure”

13

8 8.5 9 9.5 10 10.5 11 11.5 12-2

0

2

4

0

200

400

0 0.5 1 1.5 2 2.5 3 3.5 4lag / s

time / s

-100

0

100

200

Onset Strength Envelope (part)

Raw Autocorrelation

Windowed Autocorrelation

Primary Tempo PeriodSecondary Tempo Period

• Pick peak in onset envelope autocorrelationafter applying “human preference” windowcheck for subbeat

Page 14: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Beat Tracking by DP• To optimize

define C*(t) as best score up to time t then build up recursively (with traceback P(t))

final beat sequence {ti} is best C* + back-trace

14

C({ti}) =N�

i=1

O(ti) + �N�

i=2

F (ti � ti�1, �p)

C*(t) = O(t) + max{αF(t – τ, τp) + C*(τ)}τ

P(t) = argmax{αF(t – τ, τp) + C*(τ)}τ

O(t)

C*(t)

Page 15: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

beatsimple• Beat tracking in 15 lines of Matlab

15

function beats = beat_simple(onset, osr, tempo, alpha)% beats = beat_simple(onset, osr, tempo, alpha)% Core of the DP-based beat tracker% <onset> is the onset strength envelope at frame rate <osr>% <tempo> is the target tempo (in BPM)% <alpha> is weight applied to transition cost% <beats> returns the chosen beat sample times (in sec).% 2007–06–19 Dan Ellis [email protected]

if nargin < 4; alpha = 100; end

% backlink(time) is best predecessor for this point% cumscore(time) is total cumulated score to this pointlocalscore = onset;backlink = -ones(1,length(localscore));cumscore = zeros(1,length(localscore));

% convert bpm to samplesperiod = (60/tempo)*osr;

% Search range for previous beatprange = round(-2*period):-round(period/2);% Log-gaussian window over that rangetxwt = (-alpha*abs((log(prange/-period)).^2));

for i = max(-prange + 1):length(localscore) timerange = i + prange; % Search over all possible predecessors % and apply transition weighting scorecands = txwt + cumscore(timerange); % Find best predecessor beat [vv,xx] = max(scorecands); % Add on local score cumscore(i) = vv + localscore(i); % Store backtrace backlink(i) = timerange(xx);

end

% Start backtrace from best cumulated score[vv,beats] = max(cumscore);% .. then find all its predecessorswhile backlink(beats(1)) > 0 beats = [backlink(beats(1)),beats];end

% convert to secondsbeats = (beats-1)/osr;

Page 16: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Results• Verify

against human tapping datavary tradeoff weight α vary tempo estimate τ

16

100

70

5040

30

150

200

mea

n ou

tput

BPM

bonus1 (Choral): BPM target vs. BPM out

0

0.5

1Beat accuracy

1000

0.1

0.2m/+ of inter-beat-intervals

target BPM

_ = 400_ = 100

100

70

150

200

300

400

mea

n ou

tput

BPM

accu

racy

accu

racy

m/+

m/+

bonus6 (Jazz): BPM target vs. BPM out

_ = 400_ = 100

0

0.5

1Beat accuracy

10070705030 40 200150 200150 300 4000

0.1

0.2

target BPM

m/+ of inter-beat-intervals

Page 17: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Downbeat Detection• Downbeat = start of “bar”

one level up in metrical hierarchy

• Approaches

Goto’94 BTS: Pop music SD/BD template

Jehan’05: Trained classifier

17

Page 18: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

Summary• Rhythm perception

Innate and strong, hierarchic

• Beat tracking modelsNeed to account for buildup & persistence

• Dynamic ProgrammingNeat way to maintain multiple hypotheses

18

Page 19: Beat Tracking - the Department of Electrical Engineering - Columbia

E4896 Music Signal Processing (Dan Ellis) 2013-04-01 - /19

References• J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, M. B. Sandler, “A Tutorial on Onset Detection in

Music Signals,” IEEE Tr. Speech and Audio Proc., vol.13, no. 5, pp. 1035-1047, September 2005.

• P. Desain & H. Honing, “Computational models of beat induction: The rule-based approach,” J. New Music Research, vol. 28 no. 1, pp. 29-42, 1999.

• Simon Dixon, “Automatic extraction of tempo and beat from expressive performances,” J. New Music Research, vol. 30 no. 1, pp. 39-58, 2001.

• D. Ellis, “Beat Tracking by Dynamic Programming,” J. New Music Research, vol. 36 no. 1, pp. 51-60, March 2007.

• Tristan Jehan, “Creating Music By Listening,” Ph.D Thesis, MIT Media Lab, 2005.

• Masataka Goto & Yoichi Muraoka “A Beat Tracking System for Acoustic Signals of Music,” ACM Multimedia, pp.365-372, October 1994.

• Masataka Goto, “An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds,” J. New Music Research, vol. 30 no. 2, pp.159-171, June 2001.

• M. F. McKinney and D. Moelants, “Ambiguity in tempo perception: What draws listeners to different metrical levels?” Music Perception, vol. 24 no. 2 pp. 155-166, December 2006.

• M. Puckette, T. Apel, D. Zicarelli, “Real-time audio analysis tools for Pd and MSP,” Proc. Int. Comp. Music Conf., Ann Arbor, pp. 109–112, October 1998.

• Eric. D. Scheirer, “Tempo and beat analysis of acoustic musical signals,” J. Acoust. Soc. Am., vol. 103, pp. 588-601, 1998.

19