

Supervisors: David Ahn (ILPS)

Balder ten Cate (ILPS)

Signed:

Melodic contour matching for MIDI files

Alex Mazurel

August 23, 2006

submitted in partial fulfillment of the B.Sc.

Computer Science

Bachelor Informatica

Universiteit van Amsterdam

Melodic contour matching for MIDI files University of Amsterdam

Alex Mazurel August 23, 2006 of 28

Abstract

With ever increasing amounts of music in everyone’s personal library and more and more

music video channels, sometimes it is hard to keep track of it all. What if you hear a

song on the radio that you like, but didn’t catch the artist and or title? How can you

ever find out which song it is based on the tune stuck in your head?

The program presented in this thesis may help you out. It parses a (potentially large)

database of music and looks for the melody requested.

A few small experiments are conducted to test how well the program works. These tests consist of repeatedly querying the program with computer-generated queries, to see how much input is required to give acceptable results.



Acknowledgments

I would like to thank the following people for their help with this project:

• David Ahn and Balder ten Cate for guiding me during this project and assisting me where needed,

• Mutopia for providing me with a large, copyright-free database of music,

• my girlfriend for all the mental support,

• Jeroen Bulters for reviewing my thesis and for his comments during the development of the program,

• Stephan Schroevers and Marc Makkes for helping me out with all my C coding problems,

• and of course all my fellow students for their ideas and comments during the weekly meetings.


Contents

1 Introduction

2 Theoretical background
  2.1 Music formats
    2.1.1 Monophonic versus Polyphonic
    2.1.2 UDRM
    2.1.3 Relative
    2.1.4 XML
    2.1.5 MIDI
  2.2 Matching
    2.2.1 Exact matching
    2.2.2 Approximate matching

3 Program design
  3.1 User interface
  3.2 Music Database
  3.3 The program
    3.3.1 Exact matching
    3.3.2 Approximate matching

4 Experiments
  4.1 Monophonic data
  4.2 Polyphonic data

5 Future work
  5.1 Improvements
  5.2 Additions

6 Conclusions

CHAPTER 1

Introduction

Why would someone want to do a study about melodic contour matching? A small anecdote makes it clear. I was talking to my girlfriend about old DOS games. She told me she always used to play a great game, but didn’t remember the title¹. All she remembered was the tune that played while the game was running. She hummed it for me and I instantly recognised the song. However, I didn’t know the title of this song either. When I asked more people whether they knew the title, they all answered the same thing: “I know this song, but I can’t remember the title”². A few weeks later I had to choose a topic for my Bachelor’s thesis and I thought back to that song. The next moment I chose melodic contour matching as my subject.

Hopefully this small piece of personal history makes clear not only my personal motivation for this subject, but also the need for a program to find a piece of music given the melody.

In this thesis I first describe the background information needed to understand the work, followed by the solution I created for the problem given above. This solution is not, however, an end product; it is merely a beginning. There are many improvements to be made, but music recognition and/or retrieval is a very complicated field in which there is no standard method to use.

The solution is split up into a few parts: a user interface, a database and the actual matching/retrieval part.

Next, a few small experiments are conducted to test how well the program works.

Finally, I present my conclusions and my ideas for future work.

¹ The game my girlfriend meant was ’Digger’, released by Windmill Software in 1983.

² In the meantime the song has been identified as “The popcorn song”.


CHAPTER 2

Theoretical background

In this chapter I explain the theory used in melodic contour matching. Even though the focus of this paper lies on monophonic data, I have ensured compatibility with polyphonic data.

2.1 Music formats

For melodic contour matching, multiple file formats can be used. Audio formats such as Wave or MP3 are the most exact and the most widely used, but they are hard to process. Therefore this thesis relies on MIDI files (generated from sheet music).

These MIDI files are then converted to different formats used for searching, because the MIDI files themselves are too complex to search directly. For very fast and coarse matching, a simple format (UDRM) is used. For more precise matching, a more complex format is needed. I will describe the formats implemented for this paper, ordered from simple (UDRM) to complex (MIDI).

2.1.1 Monophonic versus Polyphonic

First of all, there are two main types of music: monophonic and polyphonic. Monophonic literally means ’capable of sounding one note, or voice, at a time’ (Wikipedia definition). Polyphonic is then easy to guess: ’capable of sounding multiple notes, or voices, at the same time’. Among MIDI music files there are pieces written for monophonic instruments (e.g. trumpet) as well as pieces written for polyphonic instruments (e.g. piano). There is also music that is meant to be ’multitimbral’. This means multiple different instruments playing the same notes at the same time. Multitimbral music is therefore always polyphonic, but polyphonic music is not always multitimbral. For this paper, multitimbral music will be considered monophonic, as only the notes played are considered and not the instrument. Of course, multitimbral, polyphonic music will still be considered polyphonic.

Figure 2.1: The first two measures of Beethoven’s Fifth symphony

2.1.2 UDRM

The ’UDRM’ format is the most straightforward format one can think of. It uses only the direction of the pitch change relative to the previous note, so note duration and exact pitch are not used. The changes are encoded by the letters ’U’ for ’Up’, ’D’ for ’Down’, ’R’ for ’Repeat’ and ’M’ for ’Multiple’.

This is a very compact and easy format. Even though most people will not be able to

hear or remember the exact difference and/or pitch and octave of all notes, this format

can still be used as it solely relies on a melodic contour and not on the exact notes.

Of course in monophonic music the ’M’ will never occur, so we have another easy way

to tell whether music is monophonic or polyphonic: If the ’M’ occurs, then the piece of

music is polyphonic.

The first note in a piece of music does not have a previous note, so we always use ’U’ to

represent the first note.

Even though multiple simultaneous notes may all be higher than the previous note(s), this is not checked. This means that all occurrences of multiple notes will be classified as ’M’, even though some of them could be classified as ’U’, ’D’ or ’R’.

For notes following an ’M’, the ’last’ note of the ’M’ will be used for classification.

Many occurrences of the two cases mentioned above could be filtered out, but due to a lack of time this is left as work for follow-up studies.

Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become

’URRMURRM’
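The rules above can be sketched in a few lines of Python. This is only an illustration and not the conversion script used for this thesis: the input shape (a list of events, each a list of MIDI note numbers that start together), the function name `to_udrm` and the pitches of the second voice in the example are assumptions made for the sketch.

```python
def to_udrm(events):
    """Convert note events to a UDRM string.

    Each event is a list of MIDI note numbers that start together
    (a hypothetical input shape, chosen for this sketch).
    """
    letters = []
    previous = None  # pitch the next event is compared against
    for notes in events:
        if len(notes) > 1:
            letters.append("M")   # simultaneous notes are not inspected further
        elif previous is None:
            letters.append("U")   # the first note is always written as 'U'
        elif notes[0] > previous:
            letters.append("U")
        elif notes[0] < previous:
            letters.append("D")
        else:
            letters.append("R")
        previous = notes[-1]      # after an 'M', the 'last' note is the reference
    return "".join(letters)


# Opening of Beethoven's Fifth as a single voice (G G G E-flat, F F F D).
single_voice = [[67], [67], [67], [63], [65], [65], [65], [62]]
# Same opening with a second (assumed) note sounding on the long notes.
two_voices = [[67], [67], [67], [63, 51], [65], [65], [65], [62, 50]]

print(to_udrm(single_voice))  # URRDURRD
print(to_udrm(two_voices))    # URRMURRM
```

The second call shows how a single extra voice on the long notes is enough to turn the ’D’ characters into ’M’ characters.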

2.1.3 Relative

The ’Relative’ format is mostly the same as ’UDRM’. However the Relative format

measures the exact difference between successive notes, not just Up, Down or Repeat.

The format consists of notes separated by spaces, where each note is represented by ’U’, ’D’, ’R’ or ’M’ followed by the exact difference.

Of course the first note, an ’M’ and all notes directly after an ’M’ don’t have an exact difference, so for these the difference is set to 0.

For the notes directly after an ’M’, other heuristics can also be used (e.g. the average of all exact differences).

Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become ’U0

R0 R0 M0 U0 R0 R0 M0’
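As a rough sketch of these rules, the Relative string could be produced as follows. The unit of the ’exact difference’ is assumed here to be the absolute difference in MIDI note numbers (semitones); the thesis does not fix this, and the function name and input shape are likewise hypothetical.

```python
def to_relative(events):
    """Convert note events to a Relative string.

    Each event is a list of MIDI note numbers that start together.
    Differences are absolute semitone counts (an assumption of this sketch).
    """
    tokens = []
    previous = None     # reference pitch for the next event
    after_m = False     # True when the previous event was an 'M'
    for notes in events:
        if len(notes) > 1:
            tokens.append("M0")            # an 'M' has no exact difference
            after_m = True
        else:
            pitch = notes[0]
            if previous is None:
                tokens.append("U0")        # first note: 'U' with difference 0
            else:
                diff = pitch - previous
                letter = "U" if diff > 0 else "D" if diff < 0 else "R"
                size = 0 if after_m else abs(diff)  # no difference right after 'M'
                tokens.append(f"{letter}{size}")
            after_m = False
        previous = notes[-1]               # 'last' note of an 'M' is the reference
    return " ".join(tokens)


single_voice = [[67], [67], [67], [63], [65], [65], [65], [62]]
two_voices = [[67], [67], [67], [63, 51], [65], [65], [65], [62, 50]]

print(to_relative(single_voice))  # U0 R0 R0 D4 U2 R0 R0 D3
print(to_relative(two_voices))    # U0 R0 R0 M0 U0 R0 R0 M0
```

The single-voice output illustrates what is lost when a note is classified as ’M’: the differences 4, 2 and 3 all collapse to 0.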

2.1.4 XML

This format has been implemented, but has not been used for melodic contour matching

yet. The definition of this format is given by the following DTD:

<!DOCTYPE track [

<!ELEMENT track (note*, meta)>

<!ELEMENT note (start, pitch, octave, duration, relative)>

<!ELEMENT start (#PCDATA)>

<!ELEMENT pitch (#PCDATA)>

<!ELEMENT octave (#PCDATA)>

<!ELEMENT duration (#PCDATA)>

<!ELEMENT relative (#PCDATA)>

<!ELEMENT meta (notes, removed)>

<!ELEMENT notes (#PCDATA)>

<!ELEMENT removed (#PCDATA)>

]>

The meta information tells how many notes there are in the track and how many note events (Multiples) have been removed.
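To make the structure of the DTD concrete, the following sketch builds a track in this layout with Python’s standard `xml.etree.ElementTree` module. It is not the MIDI2XML script; the function name `build_track`, the dictionary keys and the example values (including the units of `start` and `duration`) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Child elements of <note>, in the order required by the DTD.
NOTE_FIELDS = ("start", "pitch", "octave", "duration", "relative")


def build_track(notes, removed):
    """Build a <track> element: zero or more <note> elements, then <meta>."""
    track = ET.Element("track")
    for note in notes:
        node = ET.SubElement(track, "note")
        for field in NOTE_FIELDS:
            ET.SubElement(node, field).text = str(note[field])
    meta = ET.SubElement(track, "meta")
    ET.SubElement(meta, "notes").text = str(len(notes))
    ET.SubElement(meta, "removed").text = str(removed)
    return track


# Hypothetical values; the real scripts may use different units.
example = [
    {"start": 0, "pitch": "G", "octave": 4, "duration": 1, "relative": 0},
    {"start": 1, "pitch": "G", "octave": 4, "duration": 1, "relative": 0},
]
track = build_track(example, removed=0)
print(ET.tostring(track, encoding="unicode"))
```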

Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become:

<?xml version="1.0" encoding="ISO-8859-1"?>

<track>

<note>



</note>

<note>


<pitch> F </pitch>




</note>

<note>


<pitch> F </pitch>




</note>

<note>


<pitch> M </pitch>

<octave> M </octave>

<duration> M </duration>

<relative> M </relative>

</note>

<meta>

<notes> 1800 </notes>

<removed> 860 </removed>

</meta>

</track>

2.1.5 MIDI

Musical Instrument Digital Interface, or MIDI, is an industry-standard electronic communications protocol that defines each musical note and electronic musical instrument such as a synthesizer by means of events. A single MIDI file can contain one or multiple tracks with one or multiple events per track. As this paper focuses on monophonic music, we will assume that every MIDI file has only a single track.



A single event contains a few values:

time      the timestamp of the event.

channel   the channel in which the event occurs.

type      the type of event. For this paper, only “MIDI note events” are considered (i.e. note-on and note-off).

note      the note number. For a reference of note numbers, see Figure 2.2.

velocity  the pressure applied when the key on the keyboard was struck. Keyboards that don’t record applied force often use the value 64 (half of the maximum).

N.B. In the Standard MIDI File format, channel and type are merged into a single byte.
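The merged byte can be split again with two bit operations: the upper four bits hold the event type and the lower four bits the channel. The sketch below shows this, together with the usual equal-temperament relation between note number and frequency that underlies a table such as Figure 2.2. The tuning reference (note 69 = 440 Hz) and the function names are assumptions of this sketch, not something taken from the conversion scripts.

```python
NOTE_OFF = 0x8  # event type nibble for note-off
NOTE_ON = 0x9   # event type nibble for note-on


def split_status(status):
    """Split a channel-message status byte into (type, channel)."""
    return status >> 4, status & 0x0F


def note_frequency(note):
    """Frequency in Hz of a MIDI note number, assuming note 69 is 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


event_type, channel = split_status(0x93)
print(event_type == NOTE_ON, channel)  # True 3
print(note_frequency(69))              # 440.0
```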

2.2 Matching

This section describes the algorithms and tools used in the actual matching of MIDI files.

There are two main methods of matching: exact and approximate. Exact matching finds an exact match to a given query. Approximate matching, however, finds the closest matches, even if they don’t match completely.

2.2.1 Exact matching

For exact matching (all formats), the simplified Boyer-Moore fast string searching algorithm (BM) is used.

The algorithm pre-processes the target string that is being searched for (the pattern), but not the text being searched. It doesn’t need to check every character of the text, but rather skips over some of them. Its efficiency derives from the fact that, with each unsuccessful attempt to find a match between the pattern and the text, it uses the information gained from that attempt to rule out as many positions of the text as possible where the pattern could not match.

For an explanation of the Boyer-Moore method see [BM77]. The algorithm used for this experiment is given by Algorithm 1.



Figure 2.2: MIDI note numbers, frequency and international musical notation



Algorithm 1 search(Files, pattern)
  bC ← badCharacter(pattern)
  n ← length(pattern)
  for all file ∈ Files do
    m ← length(file)
    offset ← 0
    while offset ≤ m − n do
      index ← n
      while index ≥ 0 and pattern[index] = file[offset + index] do
        index ← index − 1
      end while
      if index < 0 then
        offset ← offset + max(1, bC(file[offset]))
        results[file] ← results[file] + 1
      else
        offset ← offset + max(1, bC(file[offset + index]))
      end if
    end while
  end for
  return results

Algorithm 2 badCharacter(pattern, Σ)
  m ← length(pattern)
  for all a ∈ Σ do
    λ[a] ← 0
  end for
  for j ← 1 to m do
    λ[pattern[j]] ← j
  end for
  return λ
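For readers who prefer running code, here is a minimal Python sketch of a bad-character search in the spirit of Algorithms 1 and 2. It is not a transcription of the C implementation: it uses zero-based indices and the textbook bad-character shift (mismatch position minus last occurrence of the text character in the pattern), and it always advances by one position after a full match so that overlapping matches are counted. These choices, and all names, are assumptions of the sketch.

```python
def bad_character_table(pattern):
    """Map each pattern character to the index of its last occurrence."""
    return {ch: i for i, ch in enumerate(pattern)}


def count_matches(text, pattern):
    """Count (possibly overlapping) exact occurrences of pattern in text."""
    last = bad_character_table(pattern)
    n, m = len(pattern), len(text)
    count = 0
    offset = 0
    while n > 0 and offset <= m - n:
        index = n - 1
        # Compare from the right end of the pattern towards the left.
        while index >= 0 and pattern[index] == text[offset + index]:
            index -= 1
        if index < 0:
            count += 1
            offset += 1
        else:
            # Align the mismatching text character with its last
            # occurrence in the pattern (or skip past it if absent).
            shift = index - last.get(text[offset + index], -1)
            offset += max(1, shift)
    return count


def search(files, pattern):
    """Return the number of matches per file; files maps name -> string."""
    return {name: count_matches(text, pattern) for name, text in files.items()}


print(search({"a": "URRDURRDURRD", "b": "UUDDUUDD"}, "URRD"))
```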



2.2.2 Approximate matching

UDRM

The algorithm used for approximate matching in the ’UDRM’ format is called ’bitap’. The algorithm tells whether a given text contains a substring which is “approximately equal” to a given pattern, where approximate equality is defined in terms of Levenshtein distance. If the substring and pattern are within a given distance k of each other, then the algorithm considers them equal.

The bitap algorithm is perhaps best known as one of the underlying algorithms of the Unix utility ’agrep’, written by Udi Manber and Sun Wu. For a more detailed explanation of the algorithm see [WM92].
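The matching criterion itself can be illustrated without the bit-parallel machinery of bitap. The sketch below deliberately swaps in a plain dynamic-programming approach (the classic column-by-column computation for approximate substring matching) that answers the same question: is there a substring of the text within Levenshtein distance k of the pattern? It is slower than bitap and is not the code used in the program; the function names are hypothetical.

```python
def best_distance(text, pattern):
    """Smallest Levenshtein distance between pattern and any substring of text."""
    n = len(pattern)
    # column[i] = distance between pattern[:i] and the best substring
    # ending at the current text position.
    column = list(range(n + 1))
    best = column[n]
    for ch in text:
        new = [0]  # an empty pattern prefix matches anywhere at cost 0
        for i in range(1, n + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new.append(min(column[i - 1] + cost,  # match or substitution
                           column[i] + 1,         # extra character in the text
                           new[i - 1] + 1))       # pattern character missing
        column = new
        best = min(best, column[n])
    return best


def approx_contains(text, pattern, k):
    """True if some substring of text is within distance k of pattern."""
    return best_distance(text, pattern) <= k


print(best_distance("DDURRMURRMDD", "URRDURRM"))  # 1
print(approx_contains("DDURRMURRMDD", "URRDURRM", 1))  # True
```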

Relative

The method used for approximate matching in the ’Relative’ format is the Bhattacharyya coefficient. This method is actually a very simple formula:

$$\mathrm{Bhattacharyya}(p, q) = \sum_{x \in X} \sqrt{p(x)\, q(x)}$$

where X is all values in p, and p and q are both a piece of relative data. This means a series of numbers, which can be interpreted as a vector. The straightforward geometrical interpretation of the Bhattacharyya coefficient is the cosine of the angle between two n-dimensional vectors.

This method has the advantage that the amplitude of the melody doesn’t count anymore. As long as the differences stay relatively the same, the Bhattacharyya coefficient will treat them as an exact match. So not only does the offset of the melody from a fixed point disappear from the equation (e.g. CD will match DE), but with this method the amplitude also becomes irrelevant.
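A small sketch may help to see why the amplitude drops out. It assumes that p and q are equally long sequences of non-negative numbers (for example absolute differences) and that each is first scaled so that its values sum to 1. Both the normalisation step and the function name are assumptions of this sketch; the thesis does not spell out how the relative data is prepared.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two non-negative sequences.

    Both sequences are normalised to sum to 1 first (an assumption of
    this sketch), so scaling a sequence does not change the result.
    """
    if len(p) != len(q):
        raise ValueError("sequences must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("sequences must have a positive sum")
    return sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))


# The second melody has the same shape with twice the amplitude.
print(bhattacharyya([1, 2, 1], [2, 4, 2]))
# Two sequences with no overlap at all.
print(bhattacharyya([1, 0], [0, 1]))
```

With this normalisation the first call gives a coefficient of 1 (a perfect match) and the second gives 0.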


CHAPTER 3

Program design

In order to do anything with the music database and user input, a program is of course needed. How this program is built up, as well as its inner workings, is discussed in this chapter.

The main matching program consists of a few separate parts: an interface for user queries, a music database, format conversion modules and, last but not least, the matching program itself.

3.1 User interface

The user interface has received very little attention for this paper. There is currently only a single format in which queries can be given (UDRM). The task of the user interface is to give a user the chance to input his/her query and put the matching program to work. For now it has only 4 basic elements: input fields for the host and port the daemon is running on, an input field for the query and a submit button. The result page is a listing of possible matches to the user query, sorted by descending probability. Every result is at the same time a link to the piece of music, so the user can verify the results.

3.2 Music Database

The music database used for this project contains all the Mutopia project files. The music in this database consists of pieces of public-domain music. This means that all copyrights have expired and you are free to copy or perform this music. All copyrights on music will expire 70 years after the last copyright holder dies. Because of this restriction the Mutopia collection contains only older songs, so the program presented here has not been tested or released with any popular music. Of course, everyone is free to expand or replace the database with any or all music that they own themselves.

Since the program presented in this thesis works on formats devised by me, I had to convert all the provided music to these formats. Of course, in a conversion information can only be lost, not added or modified. This means we can only convert from MIDI to ’coarser’ representations such as XML, Relative and UDRM. Conversions from XML to Relative and from Relative to UDRM are also possible, but these scripts have not been written yet.

All format conversions are done with a similar script, using Perl::MIDI. (For more information about Perl or the Perl::MIDI module please refer to the Perl website and the CPAN website.) In total 3 scripts are used: MIDI2UDRM, MIDI2REL and MIDI2XML¹.

Since this thesis works on only a single track at a time, all the files from the Mutopia database have been split up into separate tracks prior to conversion. The final step in the conversion process is to remove all songs with fewer than 25 notes. This is done because these tracks contain too little information to be of any use in pattern matching.

3.3 The program

The most essential part of the program is the daemon. This is the part that connects

the database and the matching algorithms to the interface through a socket. For the

working of this daemon see Algorithm 3.

The files are all loaded into memory prior to the first socket accept, so minimal loading

time can be guaranteed.

Algorithm 3 daemon(Files)
  while true do
    socket ← acceptSocketRequest()
    query ← socket.readQuery()
    results ← search(Files, query)
    socket.send(results)
    socket.close()
  end while
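The loop of Algorithm 3 could look as follows in Python. This is only a sketch of the idea, not the daemon written for this thesis: the wire format (one newline-terminated query, answered with one tab-separated "name, score" line per file, best first), the default host and port, and the function names are all assumptions. The search function is passed in, so any of the matching methods can be plugged in.

```python
import socket


def handle_connection(conn, files, search):
    """Read one query from conn, run the search and send back the results."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        query = reader.readline().strip()
        results = search(files, query)
        # Highest score first, mirroring the sorted result page.
        ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
        reply = "".join(f"{name}\t{score}\n" for name, score in ranked)
        conn.sendall(reply.encode("utf-8"))


def serve(files, search, host="127.0.0.1", port=5000):
    """Accept connections forever; files are already loaded in memory."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _address = server.accept()
            handle_connection(conn, files, search)
```

Keeping `handle_connection` separate from the accept loop makes it possible to try the request handling on a pair of connected sockets without starting a real server.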

3.3.1 Exact matching

For an exact matching daemon, Algorithm 3 is used with Algorithm 1 as the search

function.

For experimental purposes, the simplified Boyer-Moore algorithm has been adapted to accommodate extra features. These features include ’M matches all’ and ’Fractional matching’.

¹ All scripts are freely available from the author.



M matches all

This expansion does exactly what the name suggests: it matches every pattern character to an ’M’ in the text. For example, a user searches for ’URRDURRD’ (the beginning of Beethoven’s Fifth symphony), but the pattern stored in the database is ’URRMURRM’. This expansion lets these patterns match.

Besides this advantage, the expansion also has a disadvantage: pieces of music with a long series of ’M’ characters will always match the pattern.
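Both the advantage and the disadvantage are easy to reproduce with a small sketch. For clarity it uses a naive scan over all positions instead of the adapted Boyer-Moore search of the actual program; the function names are hypothetical.

```python
def matches_at(text, pattern, offset):
    """True if pattern matches text at offset, with 'M' in the text as wildcard."""
    window = text[offset:offset + len(pattern)]
    return len(window) == len(pattern) and all(
        t == "M" or t == p for t, p in zip(window, pattern)
    )


def count_matches_m_wildcard(text, pattern):
    """Count all positions where pattern matches under 'M matches all'."""
    last_start = len(text) - len(pattern)
    return sum(1 for offset in range(last_start + 1)
               if matches_at(text, pattern, offset))


# The advantage: the user's query finds the stored polyphonic pattern.
print(count_matches_m_wildcard("URRMURRM", "URRDURRD"))  # 1
# The disadvantage: a run of ten 'M' characters matches at every position.
print(count_matches_m_wildcard("MMMMMMMMMM", "URRD"))    # 7
```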

Fractional matching

Fractional matching tries to achieve the advantage of the previous expansion without its disadvantage. It does this by counting the ’M’ characters in the match found and deriving a partial match from this. The formula for each match becomes:

$$\mathrm{fractionalMatch} = 1 - \frac{\text{number of 'M' characters in the match}}{\text{total length of the pattern}}$$

What this actually does is give a match to a series of only ’M’ characters weight 0. A match with only the exact characters will keep weight 1 (unless there is an ’M’ in the query).

Instead of using 1 for every match, this gives us the possibility of using a more precise scale. Using 1 for an exact match, but less for a partial match, makes it possible to rank one match against another.

As with every method, this method also has a disadvantage. All pieces of music with

a lot of ’M’ characters are ’punished’ for this fact. Even if you were looking for that

particular piece of music, if the piece of music contains a lot of ’M’ characters it would

still end up low in the result listing.
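The weighting can be sketched as follows. The sketch assumes that the weight of a match is one minus the fraction of ’M’ characters in the matched stretch of text, which is what the description above implies (weight 0 for a stretch of only ’M’ characters, weight 1 for a match without any), and that the weights of all matches in a file are summed, in the same way that matches are counted in Algorithm 1. Both points, and the function names, are assumptions; the scan is again a naive one.

```python
def fractional_weight(window):
    """Weight of one match: 1 minus the fraction of 'M' characters in it."""
    return 1 - window.count("M") / len(window)


def fractional_score(text, pattern):
    """Sum of the weights of all 'M matches all' matches of pattern in text."""
    n = len(pattern)
    if n == 0:
        return 0.0
    total = 0.0
    for offset in range(len(text) - n + 1):
        window = text[offset:offset + n]
        if all(t == "M" or t == p for t, p in zip(window, pattern)):
            total += fractional_weight(window)
    return total


print(fractional_score("URRMURRM", "URRDURRD"))  # 0.75
print(fractional_score("MMMMMMMMMM", "URRD"))    # 0.0
print(fractional_score("URRDURRD", "URRD"))      # 2.0
```

The first line also shows the disadvantage described above: the piece the user was actually looking for scores only 0.75 because two of its eight characters are ’M’.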

3.3.2 Approximate matching

For the approximate matching daemon, the search method is exchanged for either bitap (UDRM format) or Bhattacharyya (Relative format).

A choice of format is, however, not yet included in either the interface or the communications protocol between the interface and the daemon.


CHAPTER 4

Experiments

All experiments done to determine how well the matching program works are divided into two separate categories: monophonic data and polyphonic data. This is done because the program is meant for monophonic data, so at least the experiments using monophonic data should give acceptable results. Acceptable results for polyphonic data would be nice, but are not expected for the simple UDRM format.

The normal measures used in information retrieval include precision, recall and fall-out, but these can’t be used very well here for a simple reason. All three measures are defined in terms of relevant or irrelevant documents retrieved, which is ambiguous in this thesis. All documents found are relevant in the sense that they all contain the pattern searched for (exact search); however, only a single document is really relevant: the piece of music you were looking for.

More interesting here is the rank at which the piece of music you are looking for ends up. Is it (almost) always in the top 10, or is it randomly distributed? In the first case this program would be useful, but in the latter it would be completely useless. If the rank were randomly distributed, you would find the requested piece of music just as fast by randomly listening to all pieces of music in the database.

For these experiments the search string is a random pattern selected from the database. The length can be limited by a preset length or by the length of the data file, but in both cases it will be randomly distributed over all possible values.
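The query generation and the rank computation can be sketched as follows. This is not the test harness used for the experiments: the function names are hypothetical, and the tie rule (the rank is one plus the number of files with a strictly higher score) is an assumption, since the text does not say how equal scores are ordered.

```python
import random


def random_query(files, min_len=10, max_len=150, rng=random):
    """Pick a random file and a random substring of it as the query.

    files maps a file name to its UDRM (or other) string. The length is
    drawn uniformly and is capped by the length of the chosen file.
    """
    name = rng.choice(sorted(files))
    text = files[name]
    length = rng.randint(min(min_len, len(text)), min(max_len, len(text)))
    start = rng.randint(0, len(text) - length)
    return name, text[start:start + length]


def rank_of(results, source):
    """Rank of the source file: 1 + number of files with a higher score."""
    target = results.get(source, 0)
    return 1 + sum(1 for score in results.values() if score > target)


files = {"a": "UDRUDRUDRUDRUDR", "b": "UUUUDDDDRRRRUUUU"}
name, pattern = random_query(files, min_len=3, max_len=8, rng=random.Random(0))
print(name, pattern)
print(rank_of({"a": 3, "b": 1, "c": 5}, "a"))  # 2
```

Repeating `random_query`, a search and `rank_of` 5000 times and tabulating the ranks gives statistics of the kind reported in the tables of this chapter.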

If there had been enough time and people willing to test the program presented here, I would have liked to have people listen to a random piece of music and then search for it in the database. This would give more accurate results, since people hear music rather than read the data files in the database. However, due to the nature of this project, this was not possible.


Table 4.1: Rank percentages for 5000 runs, exact matching, pattern length 10 to 150 on monophonic data

            Rank 1 %    Cumulative top 10 %
UDRM        96.9        top 7 is 100%
Relative    98.4        top 2 is 100%

Table 4.2: Statistics for 5000 runs, exact matching, pattern length 10 to 150, UDRM format on monophonic data

                  Minimum    Maximum    Mean    Standard deviation
Rank              1          12         1.0     0.4
Pattern length    10         150        66.1    39.5

4.1 Monophonic data

For monophonic data only 111 of the 1158 files in the total database are used. This already illustrates that most songs are polyphonic. However, a song with just a single ’M’ is already classified as polyphonic, so with better methods for conversion the fraction of monophonic songs could very well increase drastically.

The first experiment done here is to determine the average rank given a pattern length. This gives us an idea of what pattern length is required to give an acceptable result. If, for example, a pattern length of 500 were required to get a top 10 result 95% of the time, this would clearly be unacceptable.

What I have done is take randomly selected patterns with a length of 10 to 150 (both inclusive) and make a table of pattern length versus ranking. Here ranking is of course the rank at which the source of the pattern ends up in the results listing of the program. This experiment is run 5000 times to get accurate statistics, for both the UDRM format and the Relative format.

As can be seen in Table 4.1, both the UDRM and the Relative format perform acceptably on monophonic data. The Relative format works even better than UDRM, but this is of course to be expected.

Given a larger database of monophonic data, another experiment could be done to examine the scalability of this result. Will it always give a top 10 result, or will the ranks scale with the size of the database?

Table 4.3: Statistics for 5000 runs, exact matching, pattern length 10 to 150, Relative format on monophonic data

Figure 4.1: Plot of 5000 runs, exact matching, pattern length 10 to 150, UDRM format on monophonic data

Figure 4.2: Plot of 5000 runs, exact matching, pattern length 10 to 150, Relative format on monophonic data

Table 4.4: Rank percentages for 5000 runs, UDRM format, pattern length 10 to 150 on polyphonic data

                 Rank 1 %    Cumulative top 10 %
M matches all    0.1         1.0
Fractional       0.3         1.6

Table 4.5: Statistics for 5000 runs, M matches all, pattern length 10 to 150, UDRM format on polyphonic data

4.2 Polyphonic data

The second experiment is to see how well ’M matches all’ and ’Fractional matching’

perform. To do this, the same experiment as above is executed with an adapted version

of the search method. Note that for the second experiment the full 1158 files are used,

so lesser results are to be expected.

For details on the precise patterns chosen, the rankings, etc., the log files of these runs

are available freely.

Table 4.6: Statistics for 5000 runs, Fractional matching, pattern length 10 to 150, UDRM format on polyphonic data

Figure 4.3: Plot of 5000 runs, M matches all, pattern length 10 to 150, UDRM format on polyphonic data

Figure 4.4: Plot of 5000 runs, Fractional matching, pattern length 10 to 150, UDRM format on polyphonic data

Table 4.7: Rank percentages for 5000 runs, Relative format, pattern length 10 to 150 on polyphonic data

                 Rank 1 %    Cumulative top 10 %
M matches all    0.1         0.8
Fractional       5.7         19.4

Table 4.8: Statistics for 5000 runs, M matches all, pattern length 10 to 150, Relative format on polyphonic data

Figure 4.5: Plot of 5000 runs, M matches all, pattern length 10 to 150, Relative format on polyphonic data

Table 4.9: Statistics for 5000 runs, Fractional matching, pattern length 10 to 150, Relative format on polyphonic data

Figure 4.6: Plot of 5000 runs, Fractional matching, pattern length 10 to 150, Relative format on polyphonic data


CHAPTER 5

Future work

In this chapter I give ideas that I did not have time to implement, or that were not within the project scope from the beginning.

5.1 Improvements

The first and, as far as I am concerned, most important improvement is to improve the conversion scripts. Especially a reduction of the number of cases classified as ’M’ would be useful.

The second improvement would be an adaptation to the communications protocol to

include algorithm selection and format selection.

5.2 Additions

The first addition would be an alternative matching algorithm (with hopefully better results). As most algorithms can only be tested empirically, the selection and/or optimisation of a search algorithm is a project on its own.

Adding a new format is another possible expansion. Since this would also require an additional module to do the matching, this would be a lot of work, if not an entire project on its own.

A new search and scoring method is the simplest addition possible. Due to the structure of the program this only requires a new method for pattern matching, which can be plugged into the existing program.

A feature that was planned but not implemented is the use of the suggested XML format. Of course, everyone is free to suggest an alternative format as well.

The final addition would be an option in the communications protocol to accommodate adding new songs to the database while the program is running. This also requires a minor modification of the program’s source, but this should be easy for anyone with basic C programming skills.

CHAPTER 6

Conclusions

Which of the initial goals have been achieved and which have not? Especially why not would be nice to know, since this would give us a lead for future improvements.

First of all, did I achieve the basic goal? Given a pattern in UDRM format, where will the song it originates from end up in the result listing? On monophonic data, my program always places the required song in the top 7 (see Table 4.1). Using the Relative format improves this to a guaranteed top 2 (see Table 4.1). So the basics of the program work. But there are a few buts to this. First, this database contains ’only’ 117 files. This is of course nothing compared to the size of a ’real’ database. For example, my home database contains about 5000 songs. Second, this is monophonic music, while almost all songs have a polyphonic element in them.

Next, how well does it work on polyphonic data? Unfortunately it is not usable in real life yet. An average rank of 118 for UDRM (see Table 4.6) is far too low in the listing. Using the Relative format gives a significant improvement to 76 on average (see Table 4.9). However, this is still too low to use in real life. Of course the rank improves with the length of the pattern (see Figures 4.3, 4.4, 4.5 and 4.6), but no one will remember almost the entire song. I think that the bottleneck in this program is the number of ’M’ characters. In my opinion it would work a lot better if we had fewer ’M’ characters.

Using the XML format would probably also increase the performance significantly, since this adds the element of rhythm. However, I did not implement the matching algorithm for this format, so I could be very wrong. Using music theory, however, is not the solution I thought it would be. There is a difference between music notation and what people hear. For this difference a mapping is very hard or even impossible to make, so we can’t use it in our program. If one were to succeed in creating this mapping, it would be a huge advantage. For more information on this subject, see [BC02].

So, as I expected at the beginning of this project, the goals were doable, but they require more time and work than I had available for this project.


Bibliography

[BC02] Donald Byrd and Tim Crawford. Problems of music information retrieval in

the real world. Inf. Process. Manage., 38(2):249–272, 2002.

[BM77] Robert S. Boyer and J. Strother Moore. A fast string searching algorithm.

Commun. ACM, 20(10):762–772, 1977.

[WM92] Sun Wu and Udi Manber. Fast text searching: allowing errors. Commun. ACM,

35(10):83–91, 1992.
