
Educational Software using Audio to Score Alignment

Antoine Gomas, supervised by Dr. Tim Collins & Prof. Corinne Mailhes

7th of September, 2007

2

Agenda

- Introduction
- Objectives
- Review & Innovation
- Work
  - Dynamic Time Warping
  - Hidden Markov Models
  - Interface
- Conclusion

3

Audio to score alignment?

- Associate notes in a score with timing points in a recording
- Example

4

Project objectives

Implement a monophonic audio to score alignment algorithm

Evaluate characteristics of the performance

Design a learning interface to help music students improve their performance

5

Review (1)

Previous work
- Algorithms already exist
- Similar to Spoken Language Processing
- Application: musicology
- Professional recordings

6

Review (2)

Previous work (continued)
- Dynamic Time Warping
  - Few parameters
  - Computationally heavy
  - Low flexibility
- Hidden Markov Models
  - Very flexible
  - Large number of parameters (training)

7

Review (3)

Innovation
- Apply to educational software
- Requires modifications & new functionalities
  - Cope with errors
  - Detect errors

8

Work

Dynamic Time Warping

Hidden Markov Models

ITS & Interface design

9

DTW (1) Overview

- Get a first version to work
- Attack, Sustain, Silence
- Uses Dynamic Time Warping

10

DTW (2) Structure

- Feature extraction
- Distance matrix
- Find optimal path

[Diagram: Signal -> Feature vectors; Score + Instrument model -> Feature vectors; both -> DTW -> Aligned sequence]
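The three steps on this slide (feature extraction, distance matrix, optimal path) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: the one-dimensional toy feature vectors stand in for real feature extraction, and the Euclidean distance and the names `distance_matrix` and `dtw_path` are hypothetical choices.

```python
import math


def distance_matrix(perf, score):
    """Euclidean distance between every performance frame and every score frame."""
    return [[math.dist(p, s) for s in score] for p in perf]


def dtw_path(dist):
    """Return (total_cost, path) for the cheapest monotonic alignment.

    path is a list of (performance_index, score_index) pairs that starts at
    (0, 0) and ends at the last frame of both sequences.
    """
    n, m = len(dist), len(dist[0])
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = dist[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                cost[i - 1][j] if i > 0 else inf,                 # performance advances
                cost[i][j - 1] if j > 0 else inf,                 # score advances
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,   # both advance
            )
            cost[i][j] = dist[i][j] + best_prev

    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


# Toy data (assumption): one feature per frame, e.g. a normalised energy value.
score = [[0.0], [1.0], [0.0]]                # three score events
perf = [[0.0], [0.0], [1.0], [1.0], [0.0]]   # five recording frames
total, path = dtw_path(distance_matrix(perf, score))
```

Each pair in `path` links a recording frame to a score event; the frames at which the score index changes give the timing points of the notes.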

11

DTW (3) Instrument model

Silence, Energy, Attack, Energy, Sustain

[Figure panels: Guitar, Vibes]

12

DTW (4) Results

- ~95% notes aligned on “good” performances
- Rhythm errors
  - Very high tolerance
  - Provided pitches are correct
- Pitch errors
  - Tuning errors: no problem
  - Note errors: OK
- Good results, but limitations

13

DTW (5) Limitations

Impossible to recover from severe student mistakes

Self-correction not perfect

14

HMM (1) Why?

- Expected
  - Lower computing requirements
  - Flexibility to recover from student’s errors
- And also
  - Use state-of-the-art techniques
  - Find connections with SLP

15

HMM (2) Application to ASA

HMM                  ASA
Observed symbols     Recording frames
State trellis        Score representation
Emission matrix      Instrument model
Decoded sequence     Performance image
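The mapping on this slide can be made concrete with a small Viterbi decoder: the recording frames are the observations, the score defines the states, the instrument model supplies the emission probabilities, and the decoded state sequence is the image of the performance. The sketch below is illustrative only. The function names, the three-note left-to-right model and all probability values are assumptions, not values from the project.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to -infinity instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(start, trans, emit):
    """Most likely state sequence.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[t][s]  : likelihood of recording frame t under state s
    """
    n = len(start)
    delta = [safe_log(start[s]) + safe_log(emit[0][s]) for s in range(n)]
    back = []
    for t in range(1, len(emit)):
        pointers, new_delta = [], []
        for s in range(n):
            # Best predecessor of state s at frame t.
            prev = max(range(n), key=lambda r: delta[r] + safe_log(trans[r][s]))
            pointers.append(prev)
            new_delta.append(
                delta[prev] + safe_log(trans[prev][s]) + safe_log(emit[t][s])
            )
        back.append(pointers)
        delta = new_delta

    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical three-note score: each note may repeat or hand over to the next.
start = [1.0, 0.0, 0.0]
trans = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
# Made-up instrument-model likelihoods for six frames (two frames per note).
emit = (
    [[0.9, 0.05, 0.05]] * 2
    + [[0.05, 0.9, 0.05]] * 2
    + [[0.05, 0.05, 0.9]] * 2
)
decoded = viterbi(start, trans, emit)
```

Here `decoded` assigns one score note to each frame, so note onsets are the frames at which the decoded state changes.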

16

HMM (3) Flexibility

[Figure: note-level state diagrams. Each state is a note with a duration and a pitch (Note 1: D1, P1 through Note 5: D5, P5), linked by transition probabilities (p12, p23, 1). Variants add alternative states reached with the complementary probabilities (1-p12, 1-p23): Note 6 (D6, P6 in one diagram; D2, P’2 in another, left with p63 or 1-p63), Note 7 (D’3, P3) and Note 8 (D’4, P4).]
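The idea behind these topologies, an extra state that gives the decoder a path for an anticipated mistake, can be sketched as a transition table. This is a simplified, hypothetical construction: the function name, the single alternative state and the rule for rejoining the score are assumptions for illustration and do not reproduce the exact diagrams of the slide.

```python
def chain_with_alternative(n_notes, branch_from, p_main):
    """Note-level transition table for a linear score plus one alternative note.

    States 0 .. n_notes-1 are the score notes in order. State n_notes is an
    assumed alternative to note branch_from + 1 (for instance the same
    duration played with a wrong pitch). Requires branch_from < n_notes - 1.
    """
    alt = n_notes
    trans = {s: {} for s in range(n_notes + 1)}

    # Plain score: every note is followed by the next one with probability 1.
    for s in range(n_notes - 1):
        trans[s][s + 1] = 1.0
    trans[n_notes - 1][n_notes - 1] = 1.0  # the last note absorbs

    # Split the outgoing probability between the expected note and the alternative.
    trans[branch_from] = {branch_from + 1: p_main, alt: 1.0 - p_main}

    # The alternative rejoins the score after the note it replaces.
    rejoin = min(branch_from + 2, n_notes - 1)
    trans[alt] = {rejoin: 1.0}
    return trans


# Five-note score where the second note may be replaced by an alternative.
table = chain_with_alternative(n_notes=5, branch_from=0, p_main=0.9)
```

With this table the decoder follows Note 1 -> Note 2 -> Note 3 when the student plays the score, and Note 1 -> alternative -> Note 3 when the anticipated error occurs, which is also what allows the error to be detected.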

17

HMM (4) Results

- 100% on rhythmic recordings
- Good on melodic recordings
- Rhythm errors
  - Good tolerance, though inferior to DTW
- Pitch errors
  - No data
- Severe mistakes
  - Fine when anticipated
- Self correction
  - More robust than DTW
- Tempo estimation not critical

18

HMM (5) Extensions

- Pitch
- Other note topologies
- Improve speed
  - Local algorithm
  - Language
- Waiting state

19

ITS & Interface (1)

- Intelligent Tutoring Systems
- Knowledge models
  - Domain model
  - Learner model
- Open Learner Model

[Diagrams: Domain model (DM), Learner model (LM) and Teaching strategies, shown for the Overlay and Perturbation cases]

20

ITS & Interface (2)

21

Conclusion

- DTW not suitable for education
- Promising HMM results
  - Works without pitch
  - Additional paths for anticipated errors
- Still room for improvements
  - Pitch
  - Computation efficiency
- Coherent ground together with interface design

22

Thank you for listening

Any questions?
