Supervisors: David Ahn (ILPS)
Balder ten Cate (ILPS)
Signed:
Melodic contour matching for MIDI files
Alex Mazurel
August 23, 2006
Submitted in partial fulfillment of the B.Sc.
Computer Science
Bachelor Informatica
Universiteit van Amsterdam
Melodic contour matching for MIDI files University of Amsterdam
Alex Mazurel August 23, 2006 Page 2 of 28
Abstract
With ever increasing amounts of music in everyone’s personal library and more and more
music video channels, sometimes it is hard to keep track of it all. What if you hear a
song on the radio that you like, but didn’t catch the artist and/or title? How can you
ever find out which song it is based on the tune stuck in your head?
The program presented in this thesis may help you out. It parses a (potentially large)
database of music and looks for the melody requested.
A few small experiments are conducted to test how well the program works. These
tests consist of repeatedly querying the program with computer-generated queries, to
see how much input is required to give acceptable results.
Acknowledgments
I would like to thank the following people for their help with this project:
• David Ahn and Balder ten Cate for guiding me during this project and assisting
me where needed,
• Mutopia for providing me with a large, copyright-free database of music,
• My girlfriend for all the mental support,
• Jeroen Bulters for reviewing my thesis and comments during the development of
the program,
• Stephan Schroevers and Marc Makkes for helping me out with all my C coding
problems,
• and of course all my fellow students for their ideas and comments during the weekly
meetings.
Contents
1 Introduction
2 Theoretical background
  2.1 Music formats
    2.1.1 Monophonic versus Polyphonic
    2.1.2 UDRM
    2.1.3 Relative
    2.1.4 XML
    2.1.5 MIDI
  2.2 Matching
    2.2.1 Exact matching
    2.2.2 Approximate matching
3 Program design
  3.1 User interface
  3.2 Music Database
  3.3 The program
    3.3.1 Exact matching
    3.3.2 Approximate matching
4 Experiments
  4.1 Monophonic data
  4.2 Polyphonic data
5 Future work
  5.1 Improvements
  5.2 Additions
6 Conclusions
CHAPTER 1
Introduction
Why would someone want to do a study about melodic contour matching? A small
anecdote makes this clear. I was talking to my girlfriend about old DOS games. She
told me she always played this great game, but didn’t remember the title.¹ All she
remembered was the tune that played while the game was running. She hummed it for me
and I instantly recognised the song. However, I didn’t know the title of this song either.
When I asked more people if they knew the title of this song, they all answered the same
thing: “I know this song, but I can’t remember the title”.² A few weeks later I had
to choose a topic for my Bachelor’s thesis and I thought back to that song. The next
moment I chose melodic contour matching as the subject.
Hopefully this small piece of personal history makes clear not only my personal motiva-
tion for this subject, but also the need for a program to find a piece of music given the
melody.
In this thesis I will first describe the background information needed to understand it,
followed by the solution I created for the problem given above. This solution is
not, however, an end product; it is merely a beginning. There are many improvements
to be made, but music recognition and/or retrieval is a very complicated field in which
there is no standard method to use.
The solution is split up into a few parts: a user interface, a database
and the actual matching/retrieval part.
Next, a few small experiments are conducted to test how well the program works.
Finally, I will present my conclusions and any ideas for future work that I may have.
¹ The game my girlfriend meant was ’Digger’, released by Windmill Software in 1983.
² In the meantime the song has been identified as “The popcorn song”.
CHAPTER 2
Theoretical background
In this chapter I will explain the theory used in melodic contour matching. Even
though the focus of this paper lies with monophonic data, I have ensured compatibility
with polyphonic data.
2.1 Music formats
For melodic contour matching, multiple file formats can be used. Of course Wave and MP3
are the most exact and widely used formats, but they are hard to process. Therefore
this thesis relies on MIDI files (generated from sheet music).
These MIDI files are then converted to different formats used for searching, because the
MIDI files themselves are too complex to use. For very fast and coarse matching, a simple
format (UDRM) is used. For more precise matching, a more complex format is needed.
I will describe the formats implemented for this paper, ordered from simple (UDRM) to
complex (MIDI).
2.1.1 Monophonic versus Polyphonic
First of all, there are two main types of music: monophonic and polyphonic music.
Monophonic literally means ’capable of sounding one note, or voice, at a time’ (Wikipedia
definition). Polyphonic is then easy to guess: ’capable of sounding multiple notes, or voices,
at the same time’. In MIDI music files there are pieces of music written for monophonic
instruments (e.g. trumpet) as well as pieces written for polyphonic instruments (e.g.
piano). There is also music written that is meant to be ’multitimbral’. This means
multiple different instruments playing the same notes at the same time. This multitimbral
music is therefore always polyphonic, but polyphonic music is not always multitimbral.
For this paper, multitimbral music will be considered monophonic, as the instrument is not
considered, but only the notes played. Of course, multitimbral, polyphonic music will
still be considered polyphonic.
Figure 2.1: The first two measures of Beethoven’s Fifth symphony
2.1.2 UDRM
The ’UDRM’ format is the most straightforward format one can think of. It uses the
relative note frequency only. So note duration and exact frequency are not used. The
differences are given by the letters ’U’ for ’Up’, ’D’ for ’Down’, ’R’ for ’Repeat’ and ’M’
for ’Multiple’.
This is a very compact and easy format. Even though most people will not be able to
hear or remember the exact difference and/or pitch and octave of all notes, this format
can still be used as it solely relies on a melodic contour and not on the exact notes.
Of course in monophonic music the ’M’ will never occur, so we have another easy way
to tell whether music is monophonic or polyphonic: If the ’M’ occurs, then the piece of
music is polyphonic.
The first note in a piece of music does not have a previous note, so we always use ’U’ to
represent the first note.
Even though multiple simultaneous notes may all be higher than the previous note(s),
this is not checked. This means that all occurrences of multiple notes will be classified
as ’M’ even though these could be classified as ’U’, ’D’ or ’R’.
For notes following an ’M’, the ’last’ note of the ’M’ will be used for classification.
The two cases mentioned above could be resolved in a number of cases, but due to a
lack of time this is left as work for any follow-up studies.
Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become
’URRMURRM’
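The classification rules above can be sketched in a few lines of Python. This is only an illustration, not the conversion script used for this thesis: the function name `to_udrm` and the input layout (one list of MIDI note numbers per point in time) are assumptions made for the sketch, and the note numbers in the usage line are illustrative values chosen to reproduce the contour of the example.

```python
from typing import List, Optional, Sequence


def to_udrm(events: Sequence[Sequence[int]]) -> str:
    """Convert note events to a UDRM string.

    Each event is the non-empty list of MIDI note numbers that start at the
    same moment (an assumed input layout, not the thesis's file format).
    """
    symbols: List[str] = []
    previous: Optional[int] = None      # note the next single note is compared with
    for notes in events:
        if len(notes) > 1:
            symbols.append("M")         # simultaneous notes are always 'M'
        elif previous is None:
            symbols.append("U")         # the first note is always 'U'
        elif notes[0] > previous:
            symbols.append("U")
        elif notes[0] < previous:
            symbols.append("D")
        else:
            symbols.append("R")
        previous = notes[-1]            # after an 'M', its 'last' note is the reference
    return "".join(symbols)


# Illustrative note numbers only: three repeated notes, then a chord, twice.
print(to_udrm([[67], [67], [67], [51, 63], [65], [65], [65], [50, 62]]))
```

A purely monophonic input such as `[[60], [62], [62], [59]]` never produces an ’M’ and yields ’UURD’.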
2.1.3 Relative
The ’Relative’ format is mostly the same as ’UDRM’. However the Relative format
measures the exact difference between successive notes, not just Up, Down or Repeat.
The format is notes separated by spaces, where each note is represented by ’U’, ’D’, ’R’
or ’M’ followed by the exact difference.
Of course the first note, an ’M’ and all notes directly after an ’M’ don’t have an exact
difference, so this will be set to 0.
For the notes directly after an ’M’, other heuristics can also be used (e.g. the average
of all exact differences).
Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become ’U0
R0 R0 M0 U0 R0 R0 M0’
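A minimal Python sketch of the Relative format, under the same assumed input layout as a list of simultaneous MIDI note numbers per event. The function name `to_relative` is hypothetical, and measuring the exact difference in semitones (the difference between MIDI note numbers) is an assumption of this sketch; the note numbers in the usage line are illustrative.

```python
from typing import List, Optional, Sequence


def to_relative(events: Sequence[Sequence[int]]) -> str:
    """Convert note events to the space-separated Relative format."""
    tokens: List[str] = []
    previous: Optional[int] = None
    previous_was_multiple = False
    for notes in events:
        if len(notes) > 1:
            tokens.append("M0")                      # an 'M' has no exact difference
        elif previous is None:
            tokens.append("U0")                      # neither has the first note
        else:
            step = notes[0] - previous
            letter = "U" if step > 0 else "D" if step < 0 else "R"
            # Directly after an 'M' the difference is set to 0.
            size = 0 if previous_was_multiple else abs(step)
            tokens.append(f"{letter}{size}")
        previous_was_multiple = len(notes) > 1
        previous = notes[-1]
    return " ".join(tokens)


print(to_relative([[60], [64], [62], [62]]))  # a monophonic line
```

Replacing the `size = 0` rule after an ’M’ is the place where another heuristic, such as the average of all exact differences, could be plugged in.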
2.1.4 XML
This format has been implemented, but has not been used for melodic contour matching
yet. The definition of this format is given by the following DTD:
<!DOCTYPE track [
<!ELEMENT track (note*, meta)>
<!ELEMENT note (start, pitch, octave, duration, relative)>
<!ELEMENT start (#PCDATA)>
<!ELEMENT pitch (#PCDATA)>
<!ELEMENT octave (#PCDATA)>
<!ELEMENT duration (#PCDATA)>
<!ELEMENT relative (#PCDATA)>
<!ELEMENT meta (notes, removed)>
<!ELEMENT notes (#PCDATA)>
<!ELEMENT removed (#PCDATA)>
]>
The meta information tells how many notes there are in the track and how many note
events (Multiples) have been removed.
Example: the first piece of Beethoven’s Fifth symphony (Figure 2.1) would become:
<?xml version="1.0" encoding="ISO-8859-1"?>
<track>
<note>
<start> 192 </start>
<pitch> G </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> U </relative>
</note>
<note>
<start> 384 </start>
<pitch> G </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> R </relative>
</note>
<note>
<start> 576 </start>
<pitch> G </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> R </relative>
</note>
<note>
<start> 768 </start>
<pitch> M </pitch>
<octave> M </octave>
<duration> M </duration>
<relative> M </relative>
</note>
<note>
<start> 1728 </start>
<pitch> F </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> U </relative>
</note>
<note>
<start> 1920 </start>
<pitch> F </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> R </relative>
</note>
<note>
<start> 2112 </start>
<pitch> F </pitch>
<octave> 3 </octave>
<duration> 192 </duration>
<relative> R </relative>
</note>
<note>
<start> 2304 </start>
<pitch> M </pitch>
<octave> M </octave>
<duration> M </duration>
<relative> M </relative>
</note>
<meta>
<notes> 1800 </notes>
<removed> 860 </removed>
</meta>
</track>
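As a rough illustration of how a track in this format could be read back, the following Python sketch parses it with the standard library and recovers the UDRM contour from the `relative` elements. The names `read_track`, `contour` and `SAMPLE` are hypothetical, and the sample (two notes, with made-up meta counts) is not taken from the database.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-note track in the layout defined by the DTD above.
SAMPLE = """<track>
<note><start> 192 </start><pitch> G </pitch><octave> 3 </octave>
<duration> 192 </duration><relative> U </relative></note>
<note><start> 384 </start><pitch> G </pitch><octave> 3 </octave>
<duration> 192 </duration><relative> R </relative></note>
<meta><notes> 2 </notes><removed> 0 </removed></meta>
</track>"""


def read_track(xml_text):
    """Return (list of note dictionaries, meta dictionary) for one track."""
    root = ET.fromstring(xml_text)
    notes = [
        {field.tag: field.text.strip() for field in note}
        for note in root.findall("note")
    ]
    meta = {field.tag: int(field.text) for field in root.find("meta")}
    return notes, meta


def contour(notes):
    """Join the 'relative' field of every note into a UDRM string."""
    return "".join(note["relative"] for note in notes)


notes, meta = read_track(SAMPLE)
print(contour(notes), meta)
```

The values are kept as strings because a field may hold ’M’ instead of a number.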
2.1.5 MIDI
Musical Instrument Digital Interface, or MIDI, is an industry-standard electronic
communications protocol that describes each musical note played on an electronic musical
instrument, such as a synthesizer, by means of events. A single MIDI file can contain one
or multiple tracks with one or multiple events per track. As this paper focuses on monophonic
music, we will assume that every MIDI file has only a single track.
A single event contains a few values:
time: the timestamp of the event.
channel: the channel in which the event occurs.
type: the type of event. For this paper, only “MIDI note events” are considered
(i.e. note-on and note-off).
note: the note number. For a reference of note numbers, see Figure 2.2.
velocity: the pressure applied when the key on the keyboard was struck. Keyboards
that don’t record applied force often use the value 64 (half of the maximum).
N.B. In the Standard MIDI File format, channel and type are merged into a single byte.
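The event fields listed above, and the merged channel/type byte, can be sketched as follows. In a MIDI channel message the upper four bits of the status byte hold the event type (0x8 for note-off, 0x9 for note-on) and the lower four bits hold the channel; a note-on with velocity 0 is conventionally treated as a note-off. The class and function names are hypothetical and the sketch only covers the two note events considered in this paper.

```python
from dataclasses import dataclass


@dataclass
class NoteEvent:
    time: int       # timestamp of the event
    channel: int    # channel 0-15
    type: str       # "note-on" or "note-off"
    note: int       # MIDI note number 0-127
    velocity: int   # key pressure 0-127


def split_status(status: int):
    """Split a status byte into (type nibble, channel)."""
    return status & 0xF0, status & 0x0F


def decode_note_event(time: int, status: int, note: int, velocity: int) -> NoteEvent:
    kind, channel = split_status(status)
    if kind == 0x90 and velocity > 0:
        name = "note-on"
    elif kind == 0x80 or kind == 0x90:
        name = "note-off"   # includes note-on with velocity 0
    else:
        raise ValueError("not a note event")
    return NoteEvent(time, channel, name, note, velocity)


print(decode_note_event(192, 0x91, 67, 64))
```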
2.2 Matching
This section describes the algorithms and tools used in the actual matching of MIDI
files.
There are two main methods of matching: exact and approximate matching. Exact
matching finds an exact match to a given query. Approximate matching, however,
finds the closest matches, even if they don’t match completely.
2.2.1 Exact matching
For exact matching (all formats), the simplified Boyer-Moore fast string searching
algorithm (BM) is used.
The Boyer-Moore algorithm pre-processes the pattern that is being searched for, but not
the text being searched. It doesn’t need to check every character of the text, but
rather skips over some of them. Its efficiency derives from the
fact that, with each unsuccessful attempt to find a match between the pattern and
the text it is searching in, it uses the information gained from that attempt to rule out
as many positions of the text as possible where the pattern could not match.
For an explanation of the Boyer-Moore method see [BM77]. The algorithm used for
this experiment is given by Algorithm 1.
Figure 2.2: MIDI note numbers, frequency and international musical notation
Algorithm 1 search(Files, pattern)
  bC ← badCharacter(pattern)
  n ← length(pattern)
  for all file ∈ Files do
    m ← length(file)
    offset ← 0
    while offset ≤ m − n do
      index ← n
      while index ≥ 0 and pattern[index] = file[offset + index] do
        index ← index − 1
      end while
      if index < 0 then
        offset ← offset + max(1, bC(file[offset]))
        results[file] ← results[file] + 1
      else
        offset ← offset + max(1, bC(file[offset + index]))
      end if
    end while
  end for
  return results

Algorithm 2 badCharacter(pattern, Σ)
  m ← length(pattern)
  for all a ∈ Σ do
    λ[a] ← 0
  end for
  for j ← 1 to m do
    λ[pattern[j]] ← j
  end for
  return λ
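For readers who want to experiment, here is a minimal Python sketch of a search using only the bad-character rule. It is not the C implementation used for this thesis: it uses 0-based indexing and the textbook form of the shift (the mismatch position minus the last position of the offending text character in the pattern), which is an assumption of the sketch and is written differently from the shift in the listing. The names `bad_character_table`, `count_matches` and `search_files` are hypothetical.

```python
def bad_character_table(pattern):
    """Map each character to the last (0-based) index where it occurs in pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def count_matches(text, pattern):
    """Count (possibly overlapping) occurrences of a non-empty pattern in text."""
    last = bad_character_table(pattern)
    n, m = len(pattern), len(text)
    count = 0
    offset = 0
    while offset <= m - n:
        index = n - 1
        # Compare from right to left.
        while index >= 0 and pattern[index] == text[offset + index]:
            index -= 1
        if index < 0:
            count += 1
            offset += 1                 # simplest safe shift after a match
        else:
            # Align the mismatching text character with its last occurrence
            # in the pattern, or move past it if it does not occur at all.
            offset += max(1, index - last.get(text[offset + index], -1))
    return count


def search_files(files, pattern):
    """files maps a file name to its UDRM string; returns matches per file."""
    return {name: count_matches(text, pattern) for name, text in files.items()}


print(search_files({"fifth": "URRMURRM", "other": "UDUDUD"}, "URR"))
```

Shifting by one position after a full match keeps overlapping occurrences, so ’RR’ is counted three times in ’RRRR’.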
2.2.2 Approximate matching
UDRM
The algorithm used for approximate matching in the ’UDRM’ format is called ’bitap’.
The algorithm tells whether a given text contains a substring which is “approximately
equal” to a given pattern, where approximate equality is defined in terms of Levenshtein
distance. If the substring and pattern are within a given distance k of each other, then
the algorithm considers them equal.
The bitap algorithm is perhaps best known as one of the underlying algorithms of the
Unix utility ’agrep’, written by Sun Wu and Udi Manber. For a more detailed explanation
of the algorithm see [WM92].
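The question bitap answers (does the text contain a substring within Levenshtein distance k of the pattern?) can be illustrated with a plain dynamic-programming check. Note that this sketch is deliberately not bitap itself: bitap evaluates the same kind of recurrence with bit-parallel operations, whereas the code below uses an ordinary column-by-column table for readability. The function names are hypothetical.

```python
def min_substring_distance(text, pattern):
    """Smallest Levenshtein distance between pattern and any substring of text."""
    m = len(pattern)
    # Column for the empty text: matching pattern[:i] costs i deletions.
    previous = list(range(m + 1))
    best = previous[m]
    for ch in text:
        current = [0]                                # a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            current.append(min(previous[i - 1] + cost,   # match or substitution
                               previous[i] + 1,          # extra character in text
                               current[i - 1] + 1))      # missing pattern character
        best = min(best, current[m])
        previous = current
    return best


def approx_contains(text, pattern, k):
    """True if some substring of text is within distance k of pattern."""
    return min_substring_distance(text, pattern) <= k


print(approx_contains("URRMURRM", "URRD", 1))
```

With k = 0 this reduces to an ordinary exact substring test.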
Relative
The method used for approximate matching in the ’Relative’ format is the Bhattacharyya
coefficient. This method is actually a very simple formula:
\[ \mathrm{Bhattacharyya}(p, q) = \sum_{x \in X} \sqrt{p(x)\, q(x)} \]
where x ranges over X, the positions in p. p and q are both a piece of relative data. This
means a series of numbers, which can be interpreted as a vector. The straightforward
geometrical interpretation of the Bhattacharyya coefficient is the cosine of the angle
between two n-dimensional vectors, namely the vectors with components √p(x) and √q(x),
provided p and q are each scaled so that their values sum to 1.
This method has the advantage that the amplitude of the melody no longer counts.
As long as the differences keep the same proportions, the Bhattacharyya coefficient will
treat them as an exact match. So not only does the offset from a fixed point of the
melody disappear from the equation (e.g. CD will match DE), but with this method the
amplitude also becomes irrelevant.
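A small Python sketch of the coefficient, to make the amplitude argument concrete. It assumes that p and q are equally long series of non-negative interval sizes and that each series is first scaled to sum to 1; both the scaling step and the function names are assumptions of this sketch rather than a description of the program.

```python
import math


def normalise(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("series must contain at least one non-zero value")
    return [value / total for value in values]


def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two equally long series of interval sizes."""
    if len(p) != len(q):
        raise ValueError("series must have the same length")
    p, q = normalise(p), normalise(q)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


same_shape = bhattacharyya([2, 2, 1], [4, 4, 2])   # same proportions, doubled
different = bhattacharyya([2, 2, 1], [1, 2, 2])    # different proportions
print(round(same_shape, 6), round(different, 6))
```

Doubling every interval leaves the coefficient at 1 (up to rounding), while changing the proportions lowers it.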
CHAPTER 3
Program design
In order to do anything with the music database and user input, a program is of course
needed. How this program is built up, as well as its inner workings, is discussed in this
chapter.
The main matching program consists of a few separate parts: an interface for user
queries, a music database, format conversion modules and, last but not least, the
matching program.
3.1 User interface
The user interface has received very little attention in this paper. There is currently only a
single format in which queries can be given (UDRM). The task of the user interface is to
give a user the chance to enter his/her query and put the matching program to work.
For now it only has four basic elements: input fields for the host and port the daemon is
running on, an input field for the query and a submit button. The result page is a listing
of possible matches to the user query, sorted by descending probability. Every result is
at the same time a link to the piece of music, so the user can verify the results.
3.2 Music Database
The music database used for this project contains all the Mutopia project files. The
music in this database consists of pieces of public-domain music. This means that all
copyrights have expired and you are free to copy or perform this music. All copyrights on
music expire 70 years after the last copyright holder dies. Because of this restriction
the Mutopia collection contains only older pieces, so the program presented here has not
been tested or released with any popular music. Of course, everyone is free to expand
or replace the database with any or all music that they own themselves.
Since the program presented in this thesis works on formats devised by me, I had to
convert all the provided music to these formats. Of course, in a conversion information
can only be lost, not added or modified. This means we can only convert from MIDI
to ’coarser’ representations such as XML, Relative and UDRM. Conversions from
XML to Relative and from Relative to UDRM are also possible, but these scripts have
not been written yet.
All format conversions are done with a similar script, using Perl::MIDI. (For more
information about Perl or the Perl::MIDI module please refer to the Perl website and the
CPAN website.) In total three scripts are used: MIDI2UDRM, MIDI2REL and MIDI2XML.¹
Since this thesis works on only a single track at a time, all the files from the Mutopia
database have been split up into separate tracks prior to conversion. The final step
in the conversion process is to remove all tracks with fewer than 25 notes. This is done
because these tracks contain too little information to be of any use in pattern matching.
3.3 The program
The most essential part of the program is the daemon. This is the part that connects
the database and the matching algorithms to the interface through a socket. For the
working of this daemon see Algorithm 3.
The files are all loaded into memory prior to the first socket accept, so minimal loading
time can be guaranteed.
Algorithm 3 daemon(Files)
  while true do
    socket ← acceptSocketRequest()
    query ← socket.readQuery()
    results ← search(Files, query)
    socket.send(results)
    socket.close()
  end while
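The accept, read, search, send, close loop of Algorithm 3 could look roughly like this in Python. The actual daemon is written in C, and its wire format is not described here, so everything specific in this sketch is an assumption: one query per connection terminated by a newline, a reply of tab-separated name and score lines, the default host and port, and the function names.

```python
import socket


def read_query(conn):
    """Read bytes until a newline arrives or the client closes the connection."""
    data = b""
    while not data.endswith(b"\n"):
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    return data.decode("ascii").strip()


def handle_request(conn, files, search):
    """Serve a single query on an accepted connection, then close it."""
    with conn:
        query = read_query(conn)
        results = search(files, query)          # mapping: file name -> score
        ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
        reply = "".join(f"{name}\t{score}\n" for name, score in ranked if score > 0)
        conn.sendall(reply.encode("ascii"))


def daemon(files, search, host="127.0.0.1", port=5000):
    """files is loaded into memory before the first accept, as in the text."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _address = server.accept()
            handle_request(conn, files, search)
```

Passing `search` as a parameter mirrors the design of the program: swapping the matching method does not touch the socket handling.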
3.3.1 Exact matching
For an exact matching daemon, Algorithm 3 is used with Algorithm 1 as the search
function.
For experimental purposes, the simplified Boyer-Moore algorithm has been adapted to
accommodate extra features. These features include ’M matches all’ and ’Fractional
matching’.
¹ All scripts are freely available from the author.
M matches all
This extension does exactly what the name suggests: it lets an ’M’ in the text match
every pattern character. For example, a user searches for ’URRDURRD’ (the beginning of
Beethoven’s Fifth symphony), but the real pattern should be ’URRMURRM’. This extension
lets these patterns match.
Besides an advantage, this extension also has a disadvantage: pieces of music with a
long series of ’M’ characters would always match the pattern.
Fractional matching
Fractional matching tries to achieve the advantage of the previous extension without the
disadvantage. It does this by counting the ’M’ characters in the match found
and deriving a partial match from this. The formula for each match becomes:
\[ \mathrm{fractionalMatch} = 1 - \frac{\text{number of 'M' characters in the match}}{\text{total length of the pattern}} \]
What this actually does is give a match consisting of only ’M’ characters weight 0. A
match with only exact characters keeps weight 1 (unless there is an ’M’ in the
query).
Instead of using 1 for every match, this gives us the possibility of using a more precise
scale. Using 1 for an exact match, but less for a partial match, makes it possible to
rank one match against another.
As with every method, this method also has a disadvantage. All pieces of music with
a lot of ’M’ characters are ’punished’ for this fact. Even if you were looking for that
particular piece of music, if it contains a lot of ’M’ characters it would
still end up low in the result listing.
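Both extensions can be sketched together in Python. The sketch follows the prose description (a match of only ’M’ characters weighs 0, a match without any ’M’ weighs 1) and assumes that the score of a file is the sum of the weights of all its matches; that summation, the naive scan over all offsets and the function names are assumptions for illustration, not the adapted Boyer-Moore code.

```python
def matches_at(text, pattern, offset):
    """'M matches all': an 'M' in the text matches any pattern character."""
    window = text[offset:offset + len(pattern)]
    return len(window) == len(pattern) and all(
        t == "M" or t == p for t, p in zip(window, pattern)
    )


def fractional_weight(matched):
    """1 for a match without 'M', 0 for a match made of 'M' characters only."""
    return 1 - matched.count("M") / len(matched)


def fractional_score(text, pattern):
    """Sum of the weights of all 'M matches all' matches of pattern in text."""
    total = 0.0
    for offset in range(len(text) - len(pattern) + 1):
        if matches_at(text, pattern, offset):
            total += fractional_weight(text[offset:offset + len(pattern)])
    return total


print(fractional_score("URRMURRM", "URRD"))   # two matches, each with one 'M'
print(fractional_score("MMMM", "URRD"))       # matches, but carries no weight
```

The second call shows the intended effect: a run of ’M’ characters still matches, but no longer contributes to the ranking.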
3.3.2 Approximate matching
For the approximate matching daemon, the search method is exchanged for either bitap
(UDRM format) or Bhattacharyya (Relative format).
A choice of format is, however, not yet included in either the interface or the
communications protocol between the interface and the daemon.
CHAPTER 4
Experiments
All experiments done to determine how well the matching program works are divided into
two separate categories: monophonic data and polyphonic data. This is done
because the program is meant for monophonic data, so at least the experiments using
monophonic data should give acceptable results. Acceptable results for polyphonic data
would be nice, but are not expected for the simple UDRM format.
The normal measures used in information retrieval include precision, recall and fall-out,
but these cannot be used very well here for a simple reason. All three measures are
defined in terms of relevant or irrelevant documents retrieved, and this is a bit ambiguous
in this thesis. All documents found are relevant, since they all contain the pattern
searched for (exact search); however, only a single document is really relevant: the piece
of music you were looking for.
More interesting here would be the rank where the piece of music you are looking for will
end up. Is it (almost) always in the top 10, or is it randomly distributed? In the first
case this program would be useful, but in the latter, this program would be completely
useless. If the rank was randomly distributed you would have found the requested piece
of music just as fast by randomly listening to all pieces of music in the database.
For these experiments the search string is a random pattern selected from the
database. The length can be limited by a preset length or by the length of the data file,
but in both cases it is randomly distributed over all possible values.
If there had been enough time and people willing to test the program presented here, I
would have liked to have people listen to a random piece of music and then search
for it in the database. This would give more accurate results, since people hear music
and do not read the data files in the database. However, due to the nature of this project,
this is impossible here.
Table 4.1: Rank percentages for 5000 runs, exact matching, pattern length 10 to 150 on monophonic data
Format      Rank 1 %   Cumulative top 10 %
UDRM        96.9       top 7 is 100%
Relative    98.4       top 2 is 100%

Table 4.2: Statistics for 5000 runs, exact matching, pattern length 10 to 150, UDRM format on monophonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         12        1.0     0.4
Pattern length   10        150       66.1    39.5
4.1 Monophonic data
For monophonic data only 111 of the 1158 files in the total database are used. This
already illustrates that most songs are polyphonic. However, a song with just a single
’M’ is classified as polyphonic, so with better methods for conversion the fraction of
monophonic songs could very well increase drastically.
The first experiment done here is to determine the average rank given a pattern length.
This gives us an idea what pattern length is required to give an acceptable result. If
for example the required pattern length for a top 10 result 95% of the time would be a
length of 500, this would clearly be unacceptable.
What I have done is take randomly selected patterns with a length of 10 to 150 (both ends
inclusive) and make a table of pattern length vs. ranking. Here ranking is of course the
rank at which the source of the pattern ends up in the results listing of the program. This
experiment is run 5000 times to get accurate statistics, for both the UDRM format and
the Relative format.
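The experimental procedure can be summarised in a short Python sketch. It is an illustration under stated assumptions, not the test harness that produced the tables: the database is assumed to be a dictionary from file name to data string, `search` is any function returning a score per file, ties are resolved in favour of the source file, and all names and default values other than the length bounds 10 and 150 are hypothetical.

```python
import random


def rank_of_source(files, source, pattern, search):
    """Rank of the source file in the result listing (1 is best)."""
    scores = search(files, pattern)
    better = sum(1 for score in scores.values() if score > scores[source])
    return better + 1           # assumption: ties are resolved in favour of the source


def run_experiment(files, search, runs, min_len=10, max_len=150, seed=0):
    """Query with random patterns cut from random files; return the list of ranks."""
    rng = random.Random(seed)
    names = sorted(files)
    ranks = []
    for _ in range(runs):
        source = rng.choice(names)
        text = files[source]    # assumed to be at least min_len characters long
        # The length is limited by the preset maximum or by the file itself.
        length = rng.randint(min_len, min(max_len, len(text)))
        start = rng.randint(0, len(text) - length)
        pattern = text[start:start + length]
        ranks.append(rank_of_source(files, source, pattern, search))
    return ranks


def count_search(files, pattern):
    """Stand-in exact search: number of occurrences per file."""
    return {name: text.count(pattern) for name, text in files.items()}


toy_database = {"a": "UDUDUDUD", "b": "RRRRRRRR"}
print(run_experiment(toy_database, count_search, runs=5, min_len=2, max_len=5))
```

From the returned ranks the minimum, maximum, mean and standard deviation can be computed with the standard `statistics` module.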
As can be seen in table 4.1, both the UDRM and the Relative format perform acceptably on
monophonic data. The Relative format works even better than UDRM, but this is
of course to be expected.
Given a larger database of monophonic data, another experiment could be done to
examine the scalability of this result. Will it always give a top 10 result, or will the
ranks scale with the size of the database?

Table 4.3: Statistics for 5000 runs, exact matching, pattern length 10 to 150, Relative format on monophonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         3         1.0     0.1
Pattern length   10        150       65.9    39.5

Figure 4.1: Plot of 5000 runs, exact matching, pattern length 10 to 150, UDRM format on monophonic data
Figure 4.2: Plot of 5000 runs, exact matching, pattern length 10 to 150, Relative format on monophonic data

Table 4.4: Rank percentages for 5000 runs, UDRM format, pattern length 10 to 150 on polyphonic data
Method          Rank 1 %   Cumulative top 10 %
M matches all   0.1        1.0
Fractional      0.3        1.6

Table 4.5: Statistics for 5000 runs, M matches all, pattern length 10 to 150, UDRM format on polyphonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         727       122.5   117.7
Pattern length   10        150       66.8    39.3
4.2 Polyphonic data
The second experiment is to see how well ’M matches all’ and ’Fractional matching’
perform. To do this, the same experiment as above is executed with an adapted version
of the search method. Note that for the second experiment the full 1158 files are used,
so poorer results are to be expected.
For details on the precise patterns chosen, the rankings, etc., the log files of these runs
are freely available.
Table 4.6: Statistics for 5000 runs, Fractional matching, pattern length 10 to 150, UDRM format on polyphonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         669       118.6   115.5
Pattern length   10        150       66.5    39.5

Figure 4.3: Plot of 5000 runs, M matches all, pattern length 10 to 150, UDRM format on polyphonic data
Figure 4.4: Plot of 5000 runs, Fractional matching, pattern length 10 to 150, UDRM format on polyphonic data

Table 4.7: Rank percentages for 5000 runs, Relative format, pattern length 10 to 150 on polyphonic data
Method          Rank 1 %   Cumulative top 10 %
M matches all   0.1        0.8
Fractional      5.7        19.4

Table 4.8: Statistics for 5000 runs, M matches all, pattern length 10 to 150, Relative format on polyphonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         497       104.7   100.2
Pattern length   10        150       66.6    39.3

Figure 4.5: Plot of 5000 runs, M matches all, pattern length 10 to 150, Relative format on polyphonic data

Table 4.9: Statistics for 5000 runs, Fractional matching, pattern length 10 to 150, Relative format on polyphonic data
                 Minimum   Maximum   Mean    Standard deviation
Rank             1         498       76.1    90.4
Pattern length   10        150       64.9    38.8

Figure 4.6: Plot of 5000 runs, Fractional matching, pattern length 10 to 150, Relative format on polyphonic data
CHAPTER 5
Future work
In this chapter I will give ideas that I did not have time to implement, or that were not
within the project scope from the beginning.
5.1 Improvements
The first and most important improvement, as far as I am concerned, is to improve the
conversion scripts. Especially a reduction of the number of cases classified as ’M’ would
be useful.
The second improvement would be an adaptation to the communications protocol to
include algorithm selection and format selection.
5.2 Additions
The first addition would be an alternative matching algorithm (with hopefully better
results). As most algorithms can only be tested empirically, the selection and/or opti-
misation of a search algorithm is a project on its own.
Adding a new format is another possible expansion. Since this would also require an
additional module to do the matching, this would be a lot of work, if not an entire
project on its own.
A new search and scoring method is the simplest addition possible. Due to the structure
of the program this only requires a new method for pattern matching, which can be
plugged in to the existing program.
A feature that was planned, but not implemented is the use of the suggested XML
format. Of course everyone is free to suggest an alternative format as well.
The final addition would be to add an option to the communications protocol to
accommodate the addition of new songs to the database whilst the program is running.
This also requires a minor modification to the program’s source, but this is considered easy
for anyone with basic C programming skills.
CHAPTER 6
Conclusions
Which of the initial goals have been achieved and which have not? For the goals that were
not achieved it is especially useful to know why, since this gives us a lead for future
improvements.
First of all, did I achieve the basic goal? Given a pattern in UDRM format, where will
the song it originates from end up in the result listing? On monophonic data, my program
always gives the required song in the top 7 places (see table 4.1). Using the
Relative format improves this to a guaranteed top 2 (see table 4.1). So the basics
of the program work. But there are a few caveats. First, this database contains
’only’ 117 files. This is of course nothing compared to the size of a ’real’ database. For
example, my home database contains about 5000 songs. Second, this is monophonic
music. Almost all songs have a polyphonic element in them.
Next, how well does it work on polyphonic data? Unfortunately it is not usable in real
life yet. An average rank of 118 for UDRM (see table 4.6) is far too low in the listing.
Using the Relative format gives a significant improvement, to 76 on average (see table 4.9).
However, this is still too low to use in real life. Of course the rank improves with the
length of the pattern (see figures 4.3, 4.4, 4.5 and 4.6), but no one will remember almost
the entire song. I think that the bottleneck in this program is the number of ’M’
characters. In my opinion it would work a lot better if we had fewer ’M’ characters.
Using the XML format would probably also increase the performance significantly, since this
adds the element of rhythm. However, I did not implement the matching algorithm for
this format, so I could be very wrong. Using music theory, however, is not the solution
I thought it would be. There is a difference between music notation and what people
hear. For this difference a mapping is very hard or even impossible to make, so we can’t
use it in our program. If someone succeeded in creating this mapping, it would be a
huge advantage. For more information on this subject, see [BC02].
So, as I expected at the beginning of this project, the goals were doable, but they require
more time and work than I have available for this project.
Bibliography
[BC02] Donald Byrd and Tim Crawford. Problems of music information retrieval in
the real world. Inf. Process. Manage., 38(2):249–272, 2002.
[BM77] Robert S. Boyer and J. Strother Moore. A fast string searching algorithm.
Commun. ACM, 20(10):762–772, 1977.
[WM92] Sun Wu and Udi Manber. Fast text searching: allowing errors. Commun. ACM,
35(10):83–91, 1992.