Musicological and Technological Exploration of Truths and Myths in Carnatic Music, the Raagam in Particular

Thesis submitted in partial fulfillment of the requirements for the degree of

Masters in

Computer Science

by

Koduri Gopala Krishna
200502005

[email protected]

Cognitive Science Lab
International Institute of Information Technology

Hyderabad - 500 032, INDIA
December 2010

Copyright © Koduri Gopala Krishna, 2011

All Rights Reserved

To

Amma, Nanna & Chinni.

Acknowledgements

An individual is moulded by the experiences in life. Few of us are lucky to have valuable experiences in a conducive environment to grow in. I’m deeply indebted to almost every person I have encountered in the last few years of my stay at IIIT-H and outside. I’ll go in chronological order, for it will help me better in recollecting most of them.

My journey at IIIT-H commenced with meeting two similar-natured guys of coastal Andhra descent, Bharat ram (Ambati) from Vijayawada and Vijay bharat (Yaram) from Guntur. Without the endless fun episodes of Yaram and mission-critical tutorials of Ambati, I would not have found the place that interesting and hospitable. In the course of my stay, I discovered a wide variety of creatures in the jungle; I can fill pages with their names.

Mesmerized by the lectures of Prof. Jawahar in the first year, we three stayed back on campus in the summer holidays of 2006, with the sole aim of securing our future at the prestigious Center for Visual Information and Technology, under his guidance. Obviously, in the beginning I did not have any objective of my own; I was after what was lucrative in the popular opinion. Prof. Jawahar was, is and will be one of the most adored professors of IIIT-H. I spent two years in that lab. There, I interacted with several people. I never saw Rasagna frowning or complaining, however the day went. He shared his experiences with a very open heart. He is also the person to whom I can relate in most cases. Pramod Nair had always been there whenever I needed some guidance. I would call him and say, “Anna, chinna salaha kavali..” (Bro, I need some advice), and then it would go on. I partnered with Ravindra, who thinks very analytically, in several of my course projects. He was a very good partner. He resembled Buddha, as he never got emotional, however much I freaked out.

Above all, it is Prof. Jawahar to whom I’m indebted. Though I did not take up any serious computer vision work while I was in the lab, I was involved in two projects - one related to the work on font encodings, and the other, document image retrieval. The work on font encodings later helped me serve my first love - Open Source and Ethnocomputing. Today, I still devote a significant part of my time to it. There are several valuable experiences I have gained along the way - leading a small team for the development of an Indic Firefox-plugin (Padma), working as an intern at a company, identifying the core problems with Ethnocomputing in India and so on. A very, very special thanks to him.

In late 2008 and early 2009, I got interested in cognitive science. I conveyed the same to Prof. Jawahar, who said that I should pursue whatever my interest is. It was then that I met Prof. Bipin, who warmly welcomed me into his lab. At first, I did not have a special interest in any particular topic in cognitive science. Without the freedom Prof. Bipin usually leaves his students with, it would have been a really difficult situation for me. I kept jumping from topic to topic; I was intrigued by the cognition of language, the role of images in comprehension, narrative structures etc. But finally, I zeroed in on music cognition and music information retrieval. By late 2009, I started working seriously on it. The topic I have chosen is: explore the musicological literature of India, see what has been done technologically, and address an interesting issue. The interdisciplinary nature of the topic made it difficult to move ahead at the normal pace of a typical masters student. I’m immensely indebted to Prof. Bipin for his patience and especially the opportunities he provided me to learn from.

Prof. Bipin was very kind in leaving me the freedom to take decisions that would help me. The internship opportunity he provided with Prof. Christophe of IRIT - ENSEEIHT, France during late 2009 has deeply impacted my thoughts. It helped me discover Prof. Christophe’s radically different views of human perception, of music in particular. I’m fascinated by the non-statistical approach to addressing problems in information retrieval. Prof. Christophe is a very kind and friendly person. Without him, my stay in France, which was my first stay outside India, would have been a nightmare.

I’m very grateful to Prof. Preeti Rao, DAP lab, IIT-B for guiding me in building the raaga recognition system. The three months I spent in her lab in summer 2010 were very fruitful in gaining several insights into audio processing. A special thanks to Sankalp Gulati and other DAP lab members for making my stay at IIT-B a peaceful and interesting one. I’m also grateful to Dr. Suvarnalatha Rao for replying patiently to my queries on Indian classical music. I thank Prof. Navjyoti, Pranav Kumar Vasishta, Kavita Vemuri, Sai Gollapudi and Violin Vasudevan for providing me valuable contacts and resources on Indian classical music. I thank Anupama, Abhilash, Ambati, Divya and Siva for reviewing the drafts written for conferences. I’m also greatly thankful to Prof. Xavier and Joan, of the Music Technology Group at UPF in Barcelona, for reviewing parts of this thesis and providing critical and wonderful feedback.

I feel very lucky to be in the company of my friends at IIIT-H and outside. The experiences we shared are very influential. There are commendable outcomes from the discussions over societal issues - http://team-samvedana.org and http://techsetu.com. There is also a drastic change in my nature and the way I socialize. My gratitude is inexpressible.

With a weak economic status and a rural background, the firm determination of my parents in providing me a decent schooling is the sole reason why I’m doing what I’m doing today - that which I like and enjoy. No words can possibly express what I owe them.

Abstract

The classical music traditions of the Indian subcontinent, Hindustani and Carnatic, offer an excellent ground on which to test the limitations of the current music information research approaches. At the same time, their study can shed light on how to solve new and complex music modeling problems. Both traditions have very distinct characteristics, especially compared with western ones: they have developed their own instruments, musical forms, performance practices, social uses and context. In this thesis, we focus on the Carnatic music tradition of south India, especially on its melodic characteristics.

Raaga is the spine of Indian classical music. It is the single most crucial element of the melodic framework on which the music of the subcontinent thrives. Naturally, automatic raaga recognition is an important step in computational musicology as far as Indian music is concerned. It has several applications, like indexing Indian music, automatic note transcription, comparing, classifying and recommending tunes, and teaching, to mention a few. Simply put, it is the first logical step in the process of creating computational methods for Indian classical music. In this thesis, we investigate the properties of a raaga and the natural process by which people identify the raaga. We survey the past raaga recognition techniques, correlating them with human techniques, in both Hindustani and Carnatic music systems. We identify the main drawbacks and propose minor, but multiple, improvements to the state-of-the-art raaga recognition technique.

Music is said to evoke emotions. After the advent of advanced signal processing techniques and easily accessible computational resources, scientists and engineers have been trying to understand the nature of music in this very context. One of the several aspects of Indian music which interests us here is the traditional association of emotions with raagas. Besides ancient scriptures like Natyasastra, the recent articles of several scholars also associate the raagas with emotions. A part of our work is dedicated to the investigation of the origin of this association. We discuss the term rasa, often mistaken as emotion. We also report the results of a survey conducted to study the aforementioned raaga-emotion association.

We also overview the other theoretical aspects that are relevant for music information research and discuss the scarce computational approaches developed so far. We put emphasis on the limitations of the current methodologies and we present some open issues that have not yet been addressed and that we believe are important to work on.

Contents

1 Introduction (Why did you end up doing this work?)
    1.1 Motivation
    1.2 Why Carnatic music?
    1.3 Goals of our work
    1.4 Contributions from the thesis
    1.5 Organization of the content

2 Computational approaches to Indian classical music
    2.1 Computational approaches to melody
        2.1.1 How do people identify raaga
            2.1.1.1 Non-trained person or the rasika’s way
            2.1.1.2 The trained musician’s way
        2.1.2 Swaras and Srutis
        2.1.3 Arohana and Avarohana
        2.1.4 Unexploited properties of raaga
            2.1.4.1 Gamakas
            2.1.4.2 Various Roles Played by the Notes
    2.2 Computational approaches to rhythm
    2.3 Musical forms
        2.3.1 Improvisatory Forms
            2.3.1.1 Raaga alapana
            2.3.1.2 Taanam
            2.3.1.3 Pallavi exposition
            2.3.1.4 Swara kalpana
            2.3.1.5 Niraval
        2.3.2 Composed Forms
        2.3.3 Associated work

3 Raaga and Rasa (History, context, truths and myths)
    3.1 A brief history of rasa
        3.1.1 Origin
        3.1.2 Rasa nishpatti (The process of experiencing rasa)
        3.1.3 Categories of rasa
        3.1.4 Rasa in drama and music
        3.1.5 Raaga paintings
    3.2 A behavioural study to analyse the raaga-emotion relationship
        3.2.1 Introduction
        3.2.2 Related work
        3.2.3 Differences between Carnatic and Hindustani traditions
        3.2.4 The hypothesis
        3.2.5 Conceptualization of emotions
        3.2.6 Details of the study
            3.2.6.1 Design
            3.2.6.2 Participants
            3.2.6.3 Results and observations
            3.2.6.4 Implications to music recommendation systems
        3.2.7 Conclusions

4 Raaga Recognition
    4.1 Introduction
    4.2 Problems that need to be addressed
        4.2.1 Gamakas and pitch extraction for Carnatic music
        4.2.2 Skipping tonic detection
        4.2.3 Resolution of pitch-classes
        4.2.4 A comprehensive dataset
    4.3 Our method
        4.3.1 Pitch extraction
        4.3.2 Finding the tuning offset
        4.3.3 Note segmentation
        4.3.4 Pitch-class profiles
        4.3.5 Distance measure
    4.4 Experiment and results
        4.4.1 Dataset
        4.4.2 Classification experiment
    4.5 Conclusions

5 Conclusions
    5.1 Impact of this work and the future directions
    5.2 Few guidelines for future students/researchers

Appendix A: Basic Acoustics
    A.1 Demonstration of various physical properties
    A.2 Sine waves
    A.3 Harmonics
    A.4 Timbre
    A.5 Frequency measures
    A.6 Tuning systems
        A.6.1 Equal-temperament
        A.6.2 Just-intonation

Bibliography

List of Figures

2.1 Results of Rajeswari & Geeta’s raaga identification method

3.1 Raaga paintings of Vasanta Ragini (left) and Hindola (right) raagas

3.2 Average of the ratings collected per rasa, across all users and tunes, for each raaga. X-axis denotes rasa index and Y-axis denotes the average value of the ratings.

3.3 (a) Average rating obtained per rasa for a tune in Nadanamakriya. (b) Standard deviation of the ratings for the same tune. X-axis denotes the rasa indices. Y-axis denotes the normalized values.

3.4 Ratings given by six users for a sample track in each raaga. X-axis denotes rasa indices. Y-axis denotes the ratings quantifiers.

3.5 Histograms of ratings obtained per rasa, for the six raagas. X-axis has the four rating quantifiers - None at all, A Little, Somewhat and Very. Y-axis denotes the number of ratings obtained for each quantifier.

4.1 Screenshot from the melodic pitch extraction system of [37] showing the detected pitch superimposed on the signal spectrogram. The axis on the right indicates pitch value (Hz).

4.2 Note segmentation and labeling. Thin line: continuous pitch contour; Thick line: detected stable note regions.

A.1 Wave pattern generated by a plucked string

A.2 Condensation and rarefaction represented as a sine wave

A.3 Examples of sine waves with high and low frequencies

A.4 Possible harmonics in a given string

List of Tables

2.1 The scales used in Indian classical music

2.2 The values of 22 Srutis derived by [41]

2.3 Accuracy of raaga identification reported in [31]

2.4 Comparison of various classifiers used in Chordia and Rae’s system

3.1 Categories of rasas as given in Natyasastra

3.2 An Emotion Classification Based on Navarasa

3.3 Raagas and their intended rasas

4.1 Description of the dataset across 10 raagas.

4.2 Performance of weighted-k-NN classification with various pitch-class profiles

A.1 Fundamental and its harmonics

Chapter 1

Introduction

(Why did you end up doing this work?)

1.1 Motivation

In the last decade, music information research has played a vital role in commercial music recommendation services in the western music industry. Examples include last.fm¹, pandora² etc. However, such services for Indian music have not appeared to date. The few web music services that exist, like raaga.com³, work merely with textual metadata. Even the currently available web and desktop-based services for syncing or getting the metadata from the web are very western centric. The metadata of a typical Indian film song is much more than what is allowed in Musicbrainz [52]. For instance, the involvement of various artists in creating an Indian film song cannot be completely accounted for using the schema of, say, Musicbrainz. This is because the kinds of artists involved differ between, say, western pop and Indian films, and it is difficult to reflect these differences in Musicbrainz without compromise. Moreover, the majority of the content-based music recommendation algorithms found in open source media players [57] are not at all suitable for Indian music - classical, film or otherwise.

In an attempt to build a web based service with metadata syncing and content based music recommendation for Indian film music, we surveyed the related music information research and musicological literature. Methods proposed in several publications were based on western concepts which were not sufficient or relevant in the context of Indian film music. For instance, the concept of genre has no meaning as far as most film music is concerned. Classification of film songs requires a radically different formulation, which, to our knowledge, has not been attempted. However, we found a few publications interesting [24, 8]. They attempt to classify Indian classical music based on raagam and taalam [49]. We also encountered an interesting theoretical mood-based music classification in Indian classical music, which we found to be a scarcely researched topic.

¹ http://last.fm
² http://pandora.com
³ http://raaga.com

In this thesis, we focus on Carnatic music and survey a few relevant computational models researched in the past. We propose a few enhancements to the state-of-the-art in raaga recognition. Further, we also investigate the raaga-emotion association with a behavioral study.

1.2 Why Carnatic music?

In the myriad of world music traditions, Indian classical music has a few unique properties, as we’ll see in detail in the following chapter. There are two classical music traditions in India. Owing to the popularity of Pandit Ravi Shankar⁴ and The Beatles⁵ in the west, Hindustani, which is the north Indian classical music, is often mistaken as the classical music tradition throughout India. The four south Indian states which form a large part of India - Andhra Pradesh, Karnataka, Kerala and Tamilnadu - and parts of Maharashtra and Orissa have a distinct classical music tradition called Carnatic music.

Ever since India saw invaders, the evolution of the classical music tradition in India has taken two different paths. In the north, where it was greatly influenced by Sufi, it is called Hindustani. In south India, where it was less influenced, it is called Carnatic music. The Carnatic tradition has adopted something from outside only if it proved to uphold the innate characteristics of the tradition. The violin stands as a living testimonial to this fact. The ability of the instrument to imitate the human voice is crucial for its use in Carnatic music, which is full of gamakas, the curvy movements between notes. Though there are several commonalities between Carnatic and Hindustani, the differences are notable and very significant. In chapter 3, we outline a few such differences between the two traditions.

Therefore, to be specific about what we are working with, we have chosen Carnatic music⁶. Moreover, the two classical music traditions of India have extensive musicological literature, and a few existing computational attempts that can help us in our investigation. Film music, however, has neither the extensive literature nor any existing computational models. Since none of us is a musician by profession, we chose classical music, for we can seek guidance from the available literature during our investigation.

1.3 Goals of our work

⁴ http://ravishankar.org
⁵ http://en.wikipedia.org/wiki/The_Beatles
⁶ It is also a natural choice since we are natives of the state of Andhra Pradesh.

To our knowledge, this is the first thesis on computational and theoretical aspects of Carnatic music, presenting a thorough overview of the current state-of-the-art, and discussing several open issues that are computationally relevant in the realm. Though there are several musicological works in the past, there has not been much discussion with an emphasis on building computational models, barring a few exceptions [51] [53]⁷. As is the case with any area of research which is mostly untouched, it has been tough to choose a narrow topic to work on. We chose these two broad aspects of Carnatic music to investigate in depth.

• Carnatic raaga recognition.

• The raaga-rasa relationship.

1.4 Contributions from the thesis

This thesis is intended to open doors to a new type of music for the scientific community to work on, rather than to propose solutions with a significant leap over the state-of-the-art. But we do report our investigations in the two broad aspects which we have just listed. The following are the outcomes of the thesis:

1. Critical analysis of the raaga-rasa relationship and a survey to test the hypothetical association between them. Almost every scholarly article available on Indian music treats the term rasa as though it is identical to the term emotion. In the course of our work, we used the term in the same sense. But our investigation has yielded an insight which is very different from the current understanding of the term. In this work, we report the results of a survey conducted to analyse the relationship between raaga and emotion, and discuss the term rasa.

2. A survey of the state-of-the-art raaga recognition techniques, identifying the drawbacks and the future directions. We present a general overview of the plausible approaches to raaga recognition, and discuss various systems with respect to their contributions and drawbacks.

3. A raaga recognition system. To know a raaga, it is often said that there is no other way except to listen and feel it. This well defined, yet abstract entity drew our attention to build a model to identify the raaga of a given musical piece. In this work, we discuss the previous work and propose a few enhancements to a Hindustani raaga recognition model to suit the requirements of Carnatic music, and discuss the results.

Apart from these major contributions, the thesis also includes the following minor contributions.

1. Discussed various concepts of Carnatic music like rasa, 22 srutis and microtonal intervals in the light of knowledge shared by the recent investigations of the scientific community. We also present several other open problems that are computationally relevant.

2. Built a ground-truth dataset of 10 raagas with 170 tunes. This dataset is drawn from real stage concerts and audio CDs. As far as we know, it is also by far the most diverse Carnatic raaga dataset reported.

⁷ This thesis work, which I discovered only recently, was done almost in parallel at IIT-Madras on other aspects of Carnatic music.

1.5 Organization of the content

Chapter 2: This chapter introduces melodic, rhythmic and structural aspects of Carnatic music, coupled with critical reviews of the past computational work. We focus on the drawbacks and propose a few enhancements. Further, we discuss the advantages of computational modeling of various musical aspects of Carnatic music.

Chapter 3: We discuss the term rasa from a historical perspective, and present our criticism of its usage in today’s Indian classical music context. A thorough analysis of a survey on raaga and emotion is also presented, and its implications for mood based music recommendation systems are discussed.

Chapter 4: In this chapter we identify a few drawbacks of the current raaga recognition systems and present our method, which attempts to overcome them. We also present a new Carnatic raaga ground-truth dataset which can help researchers in their future efforts.

Chapter 5: We conclude the thesis by presenting a few open problems to the community, and also the possible future direction of this work.

Chapter 2

Computational approaches to Indian classical music

Though all music traditions share a few characteristics, each one can be recognized by some very particular features that need to be identified and preserved. The Information Technologies used for music processing have typically targeted the western music traditions, and current research is emphasizing this bias even more. However, to develop technologies that can deal with the richness of our world’s music we need to study and exploit the unique aspects of other musical cultures. By looking at the problems emerging from various musical cultures we will not only help those specific cultures but also open up our computational methodologies, making them much more versatile. In turn, we will help preserve the diversity of our world’s culture.

The classical music traditions of the Indian subcontinent, Hindustani and Carnatic, offer an excellent ground on which to test the limitations of the current music information research approaches. At the same time, their study can shed light on how to solve new and complex music modeling problems. Both traditions have very distinct characteristics, especially compared with western ones: they have developed their own instruments, musical forms, performance practices, social uses and context. As noted earlier, in this thesis we focus on the Carnatic music tradition of south India, especially on its melodic characteristics.

The computational study of Carnatic music offers a number of problems that require new research approaches. Its instruments emphasize sonic characteristics that are quite distinct and not well understood. The concepts of Raaga and Taala are completely different from the western concepts used to describe melody and rhythm. Its music scores serve a different purpose than those of western music. The tight musical and sonic coupling between the singing voice, the other melodic instruments and the percussion accompaniment within a piece requires going beyond the modular approaches commonly used in music information research (MIR). The tight communication established in concerts between performers and audience offers great opportunities to study issues of social cognition. Its devotional aim is fundamental to understanding the music. The study of the lyrics of the songs is also essential to understanding the rhythmic, melodic and timbre aspects of Carnatic music.

This chapter focuses on the melodic (Sec 2.1) and rhythmic (Sec 2.2) aspects of Carnatic music, overviewing the theoretical aspects that are relevant for MIR and discussing the scarce computational approaches that have been presented. We put emphasis on the limitations of the current methodologies and we present some open issues that have not yet been addressed and that we believe are important to work on.

2.1 Computational approaches to melody

In Carnatic music, the melody is carried mainly by the vocalist. The voice always plays the central role; however, sometimes instruments like violin or veena take its place, usually imitating its manner of articulation. The most fundamental melodic concept in Indian classical music is raaga. Matanga is the first known person to define what a raaga is [45]: “In the opinion of the wise, that particularity of notes and melodic movements, or that distinction of melodic sound by which one is delighted, is raaga”. Therefore, the raaga is neither a tune nor a scale [32]. It is a set of rules which can together be called a melodic framework. The notion that a raaga is not just a sequence of notes is important in understanding it, and for developing a computational representation. A raaga evolves over time, i.e. no raaga was understood the way it is today. A given raaga can nonetheless be described by a set of properties: a set of notes (swaras), their progressions (arohana/avarohana), the way they are intoned using various movements (gamakas), characteristic phrases, and the relative position, strength and duration of notes (types of swaras). In order to identify raagas computationally, swara intonation, scale, note progressions and characteristic phrases are used (Sec 2.1.2 and 2.1.3). Other unexploited properties of a raaga include gamakas and the various roles the swaras play (Sec 2.1.4).
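The list of properties above suggests a simple record type as a starting point for a computational representation. The following is a minimal sketch, not the representation used in this thesis: the class name `Raaga`, its field names and the method `allows` are hypothetical, and gamakas and swara roles are deliberately left out. The ascending and descending scales shown for Mohanam are its commonly cited ones; the characteristic phrases are left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Raaga:
    """Hypothetical container for some of the properties that describe a raaga."""
    name: str
    arohana: Tuple[str, ...]     # ascending progression of swaras
    avarohana: Tuple[str, ...]   # descending progression of swaras
    phrases: Tuple[Tuple[str, ...], ...] = ()  # characteristic phrases, if known

    @property
    def swaras(self) -> frozenset:
        # The set of notes is derived from both progressions.
        # Octave is ignored here: upper and lower "S" share one label.
        return frozenset(self.arohana) | frozenset(self.avarohana)

    def allows(self, notes: Iterable[str]) -> bool:
        # True if every given note belongs to the raaga's set of swaras.
        return set(notes) <= self.swaras


# Illustrative instance; phrases are intentionally left empty.
mohanam = Raaga(
    name="Mohanam",
    arohana=("S", "R2", "G3", "P", "D2", "S"),
    avarohana=("S", "D2", "P", "G3", "R2", "S"),
)
```

A set-membership check like `allows` captures only the scale. As the text stresses, a raaga is more than its notes, so such a record is at best a container on which richer models can be built.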

2.1.1 How do people identify raaga

Though there are no rules of thumb for identifying a raaga, there are usually two procedures by which people get to know the raaga of a composition. Which one is used normally depends on whether the person is a trained musician or a rasika, the non-trained but knowledgeable person. People who do not have much knowledge of raagas cannot identify them unless they memorize the compositions and their raagas.

2.1.1.1 Non-trained person or the rasika’s way

In a nutshell, the procedure followed by a rasika typically involves correlating two tunes based on how similar they sound. Years of listening to tunes composed in various raagas give a listener enough exposure. A new tune is juxtaposed with the known ones and is classified depending on how similar it sounds to a previously heard tune. This similarity can arise from a number of factors: the rules for transitions between notes imposed by arohana and avarohana, characteristic phrases, the usage pattern of a few notes, and gamakas.

This method depends a lot on the cognitive abilities of a person. Without enough previous exposure, it is not feasible for a person to attempt identifying a raaga. There is a noteworthy observation about this method. Though people cannot express in a concrete manner what a raaga is, they are still able to identify it. This very fact hints at a possible classifier that can be trained with enough data for each raaga.

2.1.1.2 The trained musician’s way

A musician tries to find the characteristic phrases of the raaga. These are called pakads in Hindustani music and swara sancharas in Carnatic music. If the musician finds these phrases in the tune being played, the raaga is immediately identified. But at times these phrases might not be found, or are too vague. In this case, the musicians play the tune on an instrument (imaginary or otherwise) and identify the swaras being used. They observe the gamakas used on these swaras, the locations of various notes within the musical phrases and the transitions between swaras. They use these clues to arrive at a raaga.

This method seems to use almost all the characteristics a raaga has. It looks more programmatic in its structure and implementation. If current music technology can derive the low-level features needed to identify such clues, the same procedure could be implemented computationally, potentially with very good results. Both methods, that of the trained musician and that of the non-trained listener, need to be understood in order to implement a raaga recognition system, or to model the raaga in a broad sense.

2.1.2 Swaras and Srutis

In Indian music, swaras are the seven notes in the scale, denoted by Sa, Ri, Ga, Ma, Pa, Da and Ni¹ [43]. Except for the tonic and the fifth, all the other swaras have two variations each, which accounts for 12 notes in an octave, called swarasthanas. There are three kinds of scales that one generally encounters in Carnatic and Hindustani music theory: a 12-note scale, a 16-note scale and the scale which claims 22 srutis². The 16-note scale is the same as the 12-note scale except that 4 of the 12 notes have two names each, in order to be backward compatible with an older nomenclature (see Table 2.1). The tuning itself, whether it is just-intonation or equi-tempered, is an issue of debate³ [22]. Since Indian classical music is an orally transmitted tradition, perception plays a vital role. For instance, tuning seldom involves an external tool. Even the tambura, which is used as a drone, has a very unstable frequency. Hence the analysis of empirical data coupled with perceptual studies is important.

A few musicians and scholars claim that there are more srutis in practice than those explained above. Though many of them argue the total number to be 22, that itself is debated [18]. A more important question is whether they are used in current practice at all. Some musicologists say that they are no longer used [35]. It is also said that they are wrongly attributed to Bharata, who used sruti to mean “the interval between two notes such that the difference between them is perceptible”. Krishnaswamy [23] argues that the micro tonal intervals observed in Carnatic music are the perceptual phenomena

¹ This notation is analogous to, e.g., Do, Re, Mi, Fa, So, La and Ti.
² Sruti is the least perceptible interval, as defined in Natyasastra [36].
³ http://cnx.org/content/m12459/1.11


Table 2.1: The scales used in Indian classical music

Swaram | Notation | Western | Sthanam | Ratio
Sadjamam | Sa | C | 1 | 1
Suddha Rishabam (Komal) | Ri1 | C# | 2 | 16/15
Chathusruthi Rishabam (Tivra) | Ri2 | D | 3 | 9/8
Shatsruthi Rishabam | Ri3 | D#/Eb | 4 | 6/5
Suddha Gandharam | Ga1 | D | 3 | 9/8
Sadharana Gandharam (Komal) | Ga2 | D#/Eb | 4 | 6/5
Anthara Gandharam (Tivra) | Ga3 | E | 5 | 5/4
Suddha Madhyamam (Komal) | Ma1 | F | 6 | 4/3
Prati Madhyamam (Tivra) | Ma2 | F#/Gb | 7 | 64/45
Panchamam | Pa | G | 8 | 3/2
Suddha Dhaivatham (Komal) | Da1 | G#/Ab | 9 | 8/5
Chathusruthi Dhaivatham (Tivra) | Da2 | A | 10 | 5/3
Shatsruthi Dhaivatham | Da3 | A#/Bb | 11 | 16/9
Suddha Nishadam | Ni1 | A | 10 | 5/3
Kaisiki Nishadam (Komal) | Ni2 | A#/Bb | 11 | 16/9
Kakali Nishadam (Tivra) | Ni3 | B | 12 | 15/8

caused by the gamakas, i.e. that these micro tonal intervals are what a few scholars and musicians claim as the 22 srutis. However, we believe that these claims need to be verified with perceptual and behavioral studies. In general, more empirical, quantitative and large-scale evidence on the tuning of Carnatic music needs to be gathered. From our encounters with musicians, we can only conclude that most of them are unaware of any usage of 22 srutis in practice. The few musicians who claim that they are used are not ready to demonstrate them in a raaga. Table 2.2 shows the 22 sruti values derived by Sambamurthy [41].
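To make the just-intonation versus equi-tempered question concrete, the ratios of Table 2.1 can be converted to cents and compared with the 100-cent semitone grid. The following is only a minimal sketch: the names JUST_RATIOS, ratio_to_cents and deviation_from_equal_temperament are our own illustrative choices, and the ratios are simply copied from Table 2.1.

```python
import math
from fractions import Fraction

# Ratios copied from Table 2.1, keyed by sthanam (1..12).
# The dictionary name and layout are illustrative assumptions.
JUST_RATIOS = {
    1: Fraction(1, 1),    # Sa
    2: Fraction(16, 15),  # Ri1
    3: Fraction(9, 8),    # Ri2 / Ga1
    4: Fraction(6, 5),    # Ri3 / Ga2
    5: Fraction(5, 4),    # Ga3
    6: Fraction(4, 3),    # Ma1
    7: Fraction(64, 45),  # Ma2
    8: Fraction(3, 2),    # Pa
    9: Fraction(8, 5),    # Da1
    10: Fraction(5, 3),   # Da2 / Ni1
    11: Fraction(16, 9),  # Da3 / Ni2
    12: Fraction(15, 8),  # Ni3
}


def ratio_to_cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(float(ratio))


def deviation_from_equal_temperament(sthanam):
    """Cents by which the Table 2.1 ratio departs from the equi-tempered grid."""
    return ratio_to_cents(JUST_RATIOS[sthanam]) - 100.0 * (sthanam - 1)
```

Under these ratios, Pa sits about 2 cents above its equi-tempered position while Ga3 sits about 14 cents below it, which gives a sense of the size of the differences that empirical studies would need to resolve.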

It is a well accepted notion that a note (swarasthana) is a region rather than a point [13, 43]. Thus, a

fixed tuning for each note is not as important as it is in, say, western classical music. In addition, Sa can

be any frequency. It depends on the comfort of the singer or the choice of the instrument player. A given

note is intoned in different ways for each raaga. Even if two raagas have the same scale, the intonation

of notes varies significantly. Belle et al [4] have used this clue to differentiate raagas that share the same

scale. They evaluated their system on 10 audio excerpts accounting for 2 distinct scale groups (two

raagas each). They showed that the use of swara intonation features improved the accuracies achieved

with pitch-class distributions [8]. This clearly indicates that intonation differences are significant to

understanding and modeling raagas computationally. Levy [29] analyses the intonation in Hindustani

raaga performances and notes that it is highly variable, and that it does not seem to agree with any

standard tuning system. Subramanian [51] reports much the same for Carnatic music. These studies call


Table 2.2: The values of 22 Srutis derived by [41]

Name of Sruti | Notation | Ratio | Interval | Freq (Hz) | Interval (cents) | Equi-temp ratio
Shadja | sa | 1 | - | 240 | 0 | -
Ekasruti Rishabha | ra, r1 | 256/243 | 1.0534 | 252.8 | 90 | 1.05946
Dvisruti Rishabha | ri, r2 | 16/15 | 1.0125 | 256 | 112 | -
Trisruti Rishabha | ru, r3 | 10/9 | 1.0416 | 266.6 | 182 | -
Chatussruti Rishabha | re, r4 | 9/8 | 1.0125 | 270 | 204 | 1.1224
Suddha Gandhara or Komal Sadharana Gandhara | ga, g1 | 32/27 | 1.0534 | 284.4 | 294 | 1.1891
Sadharana Gandhara | gi, g2 | 6/5 | 1.0125 | 288 | 316 | -
Antara Gandhara | gu, g3 | 5/4 | 1.0416 | 300 | 386 | 1.2599
Chyuta Madhyama Gandhara or Pythagorean Major 3rd | ge, g4 | 81/64 | 1.0125 | 303.75 | 408 | -
Suddha Madhyama | ma, m1 | 4/3 | 1.0534 | 320 | 498 | 1.3348
Tivra Suddha Madhyama | mi, m2 | 27/20 | 1.0125 | 324 | 520 | -
Prati Madhyama | mu, m3 | 45/32 | 1.0416 | 337.5 | 590 | 1.4147
Chyuta Panchama Madhyama | me, m4 | 729/512 | 1.0125 | 341.7 | 610 | -
- | - | 64/45 | 1.0113 | 341.3 | - | -
Panchama | pa | 3/2 | 1.0534 | 360 | 702 | 1.4982
Ekasruti Dhaivata | dha, d1 | 128/81 | 1.0534 | 379 | 792 | 1.5873
Dvisruti Dhaivata | dhi, d2 | 8/5 | 1.0125 | 384 | 814 | -
Trisruti Dhaivata | dhu, d3 | 5/3 | 1.0416 | 400 | 884 | -
Chatussruti Dhaivata or Pythagorean Major 6th | dhe, d4 | 27/16 | 1.0125 | 405 | 906 | 1.6817
Suddha Nishada or Komala Kaisiki Nishada | na, n1 | 16/9 | 1.0534 | 426.6 | 996 | 1.7817
Kaisiki Nishada | ni, n2 | 9/5 | 1.0125 | 432 | 1018 | -
Kakali Nishada | nu, n3 | 15/8 | 1.0416 | 450 | 1088 | 1.8876
Chyuta Shadja Nishada or Tivra Kakali Nishada or Pythagorean Major 7th | ne, n4 | 243/128 | 1.0125 | 455.6 | 1110 | -
Tara Shadja | sa | 2 | 1.0534 | 480 | 1200 | 2

for the need to understand the extent to which a given note can be intoned. In particular, this could be

of interest to differentiate artists and styles.

All these works indicate that a complete characterization of swarasthanas must go beyond static

frequency measurements and that their dynamics need to be considered. The problem implies much

more than trying to discriminate whether swarasthanas are tuned to just-intonation, equi-tempered or

following 22 srutis. Much empirical data like the one reported in [51] and [29] needs to be gathered to

investigate the intervals, the range of intonations and the temporal evolution of each swarasthana.
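One simple way to gather the kind of empirical data called for above is to fold a pitch track into a single octave relative to the tonic and build a fine-grained histogram, so that each swarasthana appears as a region rather than a point. The sketch below assumes a pitch track is already available as a list of frequencies in Hz, with 0 marking unvoiced frames; the function names and the 10-cent bin width are our own assumptions, not taken from any of the cited systems.

```python
import math
from collections import Counter


def hz_to_folded_cents(freq_hz, tonic_hz):
    """Cents above the tonic, folded into a single octave [0, 1200)."""
    return (1200.0 * math.log2(freq_hz / tonic_hz)) % 1200.0


def intonation_histogram(pitch_track_hz, tonic_hz, bin_cents=10):
    """Count voiced frames per cent bin; frames at or below 0 Hz are skipped."""
    n_bins = 1200 // bin_cents
    hist = Counter()
    for f in pitch_track_hz:
        if f <= 0:
            continue  # unvoiced frame
        cents = hz_to_folded_cents(f, tonic_hz)
        # The extra modulo guards against a float result of exactly 1200.0.
        hist[int(cents // bin_cents) % n_bins] += 1
    return hist
```

With a tonic of 240 Hz, frames at 120, 240 and 480 Hz all fall into bin 0, while a frame at 360 Hz falls into the bin that covers 700 to 710 cents. The width and shape of each peak, and not just its position, would then describe how a note is intoned.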

2.1.3 Arohana and Avarohana

Typically, a raaga is represented using ascending (arohana) and descending (avarohana) progressions

of notes. There are certain note transition rules that are necessary to be followed when performing a

raaga. The set of unique notes in these progressions form a scale. For raaga identification, Rajeswari

et al [49] estimate the scale from the given tune, and compare it with the template scales in a given

database. Then, the raaga corresponding to the best match is output. Their test data consists of 30


tunes in 3 raagas sung by 4 artists. They use the harmonic product spectrum algorithm [28] to extract

the pitch, and the tonic is manually fed. The other frequencies in the scale are marked down based on

the respective ratio with the tonic. The results obtained are shown in Figure 2.1, which shows a 67%

accuracy. The authors claim that such a low accuracy could be due to discrepancies in the manually fed

tonic. But considering that their system identifies only the swaras that are used in a raaga and no other

relevant data, the result shows that the swaras alone can be very useful. However, there are raagas which

have the same swaras. Since the scales of the raagas they considered are different, this is not an issue.
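The scale-matching idea described above can be sketched in a few lines. This is not the implementation of [49]: we quantize pitches to the nearest equi-tempered semitone instead of using ratios with the tonic, we score matches with Jaccard similarity, and the three templates and the min_share threshold are illustrative assumptions of ours.

```python
import math

# Illustrative scale templates as semitone offsets from the tonic.
# These are assumptions for the sketch, not the database used in [49].
TEMPLATES = {
    "Mohanam": {0, 2, 4, 7, 9},
    "Mayamalavagowla": {0, 1, 4, 5, 7, 8, 11},
    "Shankarabharanam": {0, 2, 4, 5, 7, 9, 11},
}


def estimate_scale(pitches_hz, tonic_hz, min_share=0.05):
    """Quantize voiced pitches to 12 positions; keep positions used often enough."""
    counts = [0] * 12
    for f in pitches_hz:
        semitones = 12.0 * math.log2(f / tonic_hz)
        counts[int(round(semitones)) % 12] += 1
    total = sum(counts)
    return {i for i, c in enumerate(counts) if total and c / total >= min_share}


def best_match(scale, templates=TEMPLATES):
    """Name of the template with the highest Jaccard similarity to the scale."""
    def jaccard(a, b):
        return len(a & b) / len(a | b)
    return max(templates, key=lambda name: jaccard(scale, templates[name]))
```

As the text notes, such a matcher cannot separate raagas that share the same swaras, since their templates would be identical sets.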

Figure 2.1: Results of Rajeswari & Geeta’s raaga identification method

Shetty et al [46] use a similar approach when they try to recognize raagas. The features extracted are

the individual swaras and their relation in arohana-avarohana (swara pairs). The sequence of features

is used for training a neural network. They report an accuracy of 95% over 90 tunes from 50 raagas,

using 60 tunes as training data and the remaining 30 tunes as test data. However, such a high accuracy

is questionable given how little data is available per class: 90 tunes over 50 raagas is fewer than two tunes per raaga on average.

Sahasrabudde et al [40] model the raaga as a finite automaton. A finite automaton has a set of states

between which the transitions take place. In the case of raaga, the swarasthanas are the states and the

note transitions are observed. This idea is used to generate a number of music compositions in the

form of symbolic notation, which they claim are technically correct and indistinguishable from human

compositions. Inspired by this, Pandey et al [31] use HMM models to recognize the raagas. The rules

to form a melodic sequence for a given raaga are well defined [41] and the number of notes is finite.

Therefore, intuitively, HMM models should be good at capturing those rules in note transitions imposed

by arohana and avarohana patterns.
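The intuition that note-transition rules can separate raagas can be illustrated with something simpler than an HMM. The sketch below is a first-order Markov chain over swara symbols with add-one smoothing, which we use as a stand-in; it is not the model of [31] or [40], and the function names and the seven-symbol alphabet are our own assumptions.

```python
import math
from collections import defaultdict

SWARAS = ["S", "R", "G", "M", "P", "D", "N"]


def train(sequences, alphabet=SWARAS):
    """Estimate add-one-smoothed transition log-probabilities from note sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    model = {}
    for a in alphabet:
        total = sum(counts[a].values()) + len(alphabet)
        model[a] = {b: math.log((counts[a][b] + 1) / total) for b in alphabet}
    return model


def log_likelihood(model, seq):
    """Sum of transition log-probabilities along the sequence."""
    return sum(model[a][b] for a, b in zip(seq, seq[1:]))


def classify(models, seq):
    """Pick the raaga whose transition model makes the sequence most likely."""
    return max(models, key=lambda name: log_likelihood(models[name], seq))
```

Given one model per raaga trained on transcribed tunes, classify returns the raaga whose transition statistics best explain a new note sequence. A full HMM would add hidden states and emission probabilities on top of this.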


Raaga | Samples | HMM | HMM + Phrase matching
Yaman Kalyan | 15 | 80% | 80%
Bhupali | 16 | 75% | 94%
Total | 31 | 77% | 87%

Table 2.3: Accuracy of raaga identification reported in [31]

Each raaga also has a few characteristic phrases. They are called swara sancharas in Carnatic and

pakads in Hindustani. These phrases are said to be very crucial for conveying the feeling of the raaga.

Typically, in a concert, the artist starts by singing these phrases. They are the main clues for the listeners

to identify which raaga it is. Pandey et al have complemented their approach with values obtained

from two modules that match characteristic phrases, taking advantage of this information. In one such

module, characteristic phrases are identified with a substring matching algorithm. In the other one, they

are identified by counting the occurrences of frequency n-grams in the phrase.
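The two kinds of phrase matching mentioned above can be sketched on symbolic note sequences as follows. We assume the tune has already been transcribed into a list of swara symbols; the scoring rule in phrase_score (the fraction of the phrase's n-grams found in the tune) is our own simplification and not necessarily the measure used by Pandey et al.

```python
from collections import Counter


def contains_phrase(notes, phrase):
    """Substring test on note sequences given as lists of symbols."""
    n, m = len(notes), len(phrase)
    return any(notes[i:i + m] == phrase for i in range(n - m + 1))


def ngram_counts(notes, n):
    """Count every contiguous n-gram in the note sequence."""
    return Counter(tuple(notes[i:i + n]) for i in range(len(notes) - n + 1))


def phrase_score(notes, phrase, n=2):
    """Fraction of the phrase's distinct n-grams that also occur in the tune."""
    wanted = ngram_counts(phrase, n)
    if not wanted:
        return 0.0
    found = ngram_counts(notes, n)
    return sum(1 for gram in wanted if gram in found) / len(wanted)
```

Exact substring matching is strict and fails on a single transcription error, whereas the n-gram score degrades gradually, which may be why the two are used as complementary modules.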

The other important contributions by Pandey et al include two heuristics to improve the transcription

of Indian classical music: the hill peak heuristic and the note duration heuristic. As mentioned, Indian

music has a lot of micro tonal variations, which make even monophonic note transcription a challenging

problem [31]. The two heuristics proposed in their approach try to get through these micro tonal

fluctuations in attaining a better transcription. The hill peak heuristic states that a significant change in

the slope of a pitch contour (or the sign reversal of such slope) is closely associated with the presence

of a note. The note duration heuristic considers only the notes that are played for at least a certain span

of time. The approach was tested on two raagas. Table 2.3 shows the results obtained by using HMM

models alone, and by complementing the models with characteristic phrase matching. Not much can be

said about the reliability of the features they have used since the number of classes considered is just

two. But the advantage of characteristic phrase matching is evident. HMM models of raaga are also

used by Sinith et al [47] to search for musical patterns in a catalog of monophonic Carnatic music. They

build HMM models for 6 typical music patterns corresponding to 6 raagas (they report 100% accuracy

in classifying an unspecified number of tunes into 6 raagas). HMMs are also used by Das and Choudary

[12] to automatically generate Hindustani classical music.
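The two transcription heuristics described in this paragraph can be sketched as follows. This is our reading of the verbal description only: slope_reversals marks the points where the slope of a pitch contour changes sign, and filter_short_notes drops notes held for fewer than a minimum number of frames. The function names, the frame-based durations and the helper run_lengths are illustrative assumptions, not the implementation of [31].

```python
from itertools import groupby


def slope_reversals(contour):
    """Indices where the slope of the pitch contour changes sign (peaks and valleys)."""
    indices = []
    for i in range(1, len(contour) - 1):
        left = contour[i] - contour[i - 1]
        right = contour[i + 1] - contour[i]
        if left * right < 0:
            indices.append(i)
    return indices


def run_lengths(frame_labels):
    """Collapse per-frame note labels into (label, number_of_frames) pairs."""
    return [(label, len(list(group))) for label, group in groupby(frame_labels)]


def filter_short_notes(notes, min_frames):
    """Note duration heuristic: keep only notes lasting at least min_frames."""
    return [(label, n) for label, n in notes if n >= min_frames]
```

For example, a frame-level labelling S S S R G G collapses to three notes, and with a minimum of 2 frames the single-frame R is discarded as a likely micro tonal fluctuation.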

Chordia and Rae [8] use pitch class profiles and bi-grams of pitches to classify raagas. The dataset

used in their system consists of 72 minutes of monophonic instrumental (sarod) data in 17 raagas played


Classifier | Accuracy
Multi Variate Normal | 94%
FFNN | 75%
K-NN Classifier | 67%
Tree-based Classifier | 50%

Table 2.4: Comparison of various classifiers used in Chordia and Rae’s system

by a single artist. Again, the harmonic product spectrum algorithm [28] is used to extract the pitch.

Note onsets are detected by observing the sudden changes in the phase and the amplitude of the signal.

Then, the pitch-class profiles and the bi-grams are calculated. It is shown that bi-grams are useful in

discriminating the raagas with the same scale. They use several classifiers combined with dimensionality

reduction techniques. The feature vector size is reduced from 144 (bi-grams) + 12 (pitch profile) to 50

with PCA. Using just the pitch class profiles, the system achieves an accuracy of 75%. Using only bi-grams

of pitches, the accuracy is 82%. The best accuracy, 94%, is achieved using the maximum a posteriori

rule with a multi-variate likelihood model. A comparison with other classifiers is shown in Table 2.4.
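The 144 + 12 dimensional feature vector mentioned above can be sketched as follows, assuming the tune has already been reduced to a sequence of pitch classes (integers 0 to 11 relative to the tonic). Counting note events rather than weighting by duration is our simplification, and the function names are illustrative; the dimensionality reduction and classification steps are not shown.

```python
def pitch_class_profile(notes):
    """Normalized 12-bin histogram of pitch classes (integers, taken modulo 12)."""
    hist = [0.0] * 12
    for pc in notes:
        hist[pc % 12] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def bigram_profile(notes):
    """Normalized 144-bin histogram of consecutive pitch-class pairs."""
    hist = [0.0] * 144
    for a, b in zip(notes, notes[1:]):
        hist[(a % 12) * 12 + (b % 12)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def feature_vector(notes):
    """Concatenate bi-gram (144) and pitch-class (12) features into 156 values."""
    return bigram_profile(notes) + pitch_class_profile(notes)
```

Two raagas with the same scale produce similar 12-bin profiles, but their bi-gram bins differ wherever the allowed note transitions differ, which is consistent with the reported gain from bi-grams.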

2.1.4 Unexploited properties of raaga

2.1.4.1 Gamakas

The various forms of pitch movements are together called gamakas. A sliding movement from

one note to another or a vibrato are examples of gamakas. There are various ways to group these

movements, but the most accepted classification speaks of 15 types of gamakas. Gamakas are not just

decorative items or embellishments, but very essential constituents of a raaga [18]. Each raaga has

gamakas characteristic to its nature. Thus the detection of gamakas is a crucial step to model and

identify raagas.

A gamaka is often represented using discrete notes, but it does not necessarily mean that one plays

them using discrete notes. The representation is only a handy expression of a more continuous sounding

pattern, which is difficult to represent on paper. The gamaka is almost always a smooth change in

the dynamics of a pitch contour. Though they are used in both Carnatic and Hindustani [33], the pattern

of usage is very distinct. Owing to their tremendous influence on how a tune sounds, they are often

considered the soul of Indian classical music.


There are two major issues that make identifying a gamaka a challenging problem. First, it requires

accurate pitch transcription, without octave errors. Second, the variations found for different artists

in performing a gamaka complicate it further. Krishnaswamy [25] and Subramanian [51] report such

variations across different artists performing the same gamaka. They also propose some theoretical

guidelines to resolve the second problem to some extent. These variations should be exploited in the

computational modeling of performers, a field that lacks much research in the case of Indian classical music.

2.1.4.2 Various Roles Played by the Notes

In a given raaga, not all the notes play the same role. Even when two raagas have the same set of

constituent notes, the function of those notes can be very different, leading to a different feeling altogether [55].

For example, some swaras occur frequently, some are prolonged, some occur either at the beginning or

the end of the phrases, etc.

In addition, there are alankaras, patterns of note sequences which are supposed to beautify and instill

feelings when listened to.

Though emotion is a subjective issue, it gets into almost every discussion involving raagas. That

is because each raaga is said to evoke characteristic emotions. To test this hypothesis, Chordia and

Rae [9] have conducted a survey to check whether Hindustani raagas elicit emotions consistently across

listeners. Positive results are reported, jointly with the musical properties like relative weight of the

notes, which partially explain the phenomenon. Koduri et al⁴ [21] have conducted a similar survey with

Carnatic raagas. Though not as significant as the pattern reported by Chordia et al, the results indicate

that Carnatic raagas elicit emotions which are consistent across listeners. Wieczorkowska et al [56] test

whether raagas elicit emotions, and also arrive at a mapping between melodic sequences of 3 or 4 notes and

the elicited emotions. Their work suggests that different compositions in the same raaga might elicit

different emotions, which is consistent with the observations made by Koduri et al [21]. Wieczorkowska

et al note that these melodic sequences are related vaguely to the subjects’ emotional responses. Another

interesting observation is the significance in the similarity between the responses of people from various

cultures, which is consistent with the observations made in a previous study conducted by Balkwill et al

[2].

The previous work has verified whether raagas elicit emotions, and tried to map the musical features which are responsible for such a phenomenon. Besides the note sequences, another important aspect responsible for the emotional character of Indian classical music is the gamaka. So far, there are no studies which report the effect of gamakas. The kind of instruments used and the rhythmic aspects also need to be accounted for.

⁴ This work is discussed in detail in Chapter 3.

2.2 Computational approaches to rhythm

In Indian classical music, the rhythm is carried by the mrudangam, a cylindrical drum with two faces, in Carnatic music,

and by the tabla, a pair of kettle drums, in Hindustani music. While raaga is the most fundamental

concept related to melody, taala is the most fundamental concept related to rhythm [55]. A taala

is a rhythmic cycle, which is divided into specific uneven sections, each of them subdivided into even

measures. The first beat of each taala section is accented, with notable melodic and percussive events.

Further rhythmic variations are developed along the cycle, giving the taala a unique pattern of bols. A

bol is the main unit in learning and playing taalas. It is a mnemonic associated with the sound produced

by either a single stroke or a combination of multiple strokes.

Though the concept of taala is used in both Carnatic and Hindustani music, there are slight differences.

For example, based on the composition and the order of sections, there are seven classes of taalas

in Carnatic music. Sambamurthy [41] lists all such taalas and provides a description of each. In

Hindustani music, on the other hand, hundreds of taalas have been used and are mentioned, but

nowadays only about ten of them, with their known variations, are common [11].

The first research on the acoustics of Indian drums was conducted by Raman [34] who studied the

relationship between the vibrational modes of the drum’s membrane and the harmonic overtones of

the sound, which allow the drum to be finely tuned. In general, the current MIR research on rhythm

deals with detecting drum strokes in monophonic context. Gillet and Richard [16] segmented tabla

solo recordings in order to build a database of bols. Then they used a probabilistic approach based

on HMMs to label the specific bols. The resultant transcription system was embedded in a real time

environment called Tablascope [16]. Chordia and Rae [10] built a system to recognize bols, as part of

an automatic tabla-solo accompaniment software, Tabla Gyan. Their research extends the studies of

Gillet and Richard on categorizing bols using different classifiers: Multivariate Gaussian, probabilistic

neural network and feed forward neural network. Classification accuracies of 92%, 94% and 84% over

10-15 classes were obtained, respectively [6]. Chordia and collaborators’ research focuses more on the

tabla solos and the logic of building the improvisation sequences and ornamentation. Chordia et al [7]


have described a system which predicts the continuation of tabla compositions, using a variable length

n-gram model, to attain an entropy rate of 0.780 in a cross-validation experiment.
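To illustrate what an entropy rate measures in this setting, the sketch below trains a fixed-order bi-gram model with add-one smoothing on a sequence of bols and reports the average number of bits needed to predict each next bol in a test sequence. This is a deliberately simpler model than the variable length n-gram model of [7], and the function names are our own; it also assumes that every bol in the test sequence was seen in training.

```python
import math
from collections import Counter


def train_bigram(seq):
    """Collect pair counts, context counts and the vocabulary from a training sequence."""
    pair_counts = Counter(zip(seq, seq[1:]))
    context_counts = Counter(seq[:-1])
    vocab = sorted(set(seq))
    return pair_counts, context_counts, vocab


def cross_entropy_bits(model, test_seq):
    """Average -log2 P(next | previous) over the test sequence, add-one smoothed."""
    pair_counts, context_counts, vocab = model
    v = len(vocab)
    total_bits = 0.0
    n = 0
    for a, b in zip(test_seq, test_seq[1:]):
        p = (pair_counts[(a, b)] + 1) / (context_counts[a] + v)
        total_bits -= math.log2(p)
        n += 1
    return total_bits / n if n else 0.0
```

A lower value means the continuation of a composition is more predictable under the model; in a cross-validation setting the model would be trained on some compositions and evaluated on held-out ones.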

Bol transcription in a polyphonic context is an open topic. The current MIR research on drum

transcription uses a small number of drum stroke classes. Each class is associated with a specific (single)

drum, usually based on the typical drum set. Conversely, with tabla and mrudangam, multiple classes

(bols) are associated with each drum. Considering that they are tuned, such information can be used,

besides timbre, to separate the classes. A more ambitious topic is the classification of taalas, which

currently can only be performed by musicians and trained audience. As the musician always tries

to embellish the taala, there is a strong variation from performance to performance. Therefore, the

main goal would be to gain insensitivity to these variations in order to classify taalas or, otherwise, to

model these variations for understanding performance and improvisation. Indeed, there is a well-defined

structure to improvisation which should be exploited [18].

2.3 Musical forms

Apart from melodic and rhythmic aspects of Carnatic music, another important concept is the musical

form. It is the way a given song is organized or perceived. Over the evolution of Carnatic music, several

historical stages can be identified, each with its own set of such musical forms. This section

briefly discusses the need for identifying them computationally, and its advantages. Our discussion on

musical forms is limited in its scope to a few of the current forms in Carnatic music. Sambamurthy [41]

and Janakiraman [18] thoroughly discuss them in a historical perspective.

There are a number of ways to classify a given musical form. The classification scheme discussed here, because it is computationally

relevant, distinguishes Manodharma⁵ (improvisatory) from Kalpita⁶ (composed) forms of music.

2.3.1 Improvisatory Forms

There are five kinds of improvisatory forms - Raaga alapana, Taanam, Pallavi exposition, Swara

kalpana and Sahitya prastara or Niraval. We briefly discuss them and describe the need to distinguish

them computationally.

⁵ Manodharma indicates acting according to one’s heart.
⁶ Kalpita literally means that which was already created.


2.3.1.1 Raaga alapana

The artist improvises in a raaga without any rhythmic constraints. The improvisation is done using

syllables which do not have any meaning, such as vowels. It does not involve any accompaniment. It

normally has three or four stages. During the first stage the artist gives direct hints about the raaga, such

as the swara sancharas⁷. The second stage is the most elaborate one, which consists of singing in three

octaves and at various speeds in the given raaga. The third and fourth stages conclude the alapana.

2.3.1.2 Taanam

Taanam involves permuting and combining swaras. It starts with as few as 2 or 3 swaras at a time, and

once the artist runs out of combinations, a swara is added. Taanam is sung using a few auspicious

words such as anandam, anantam and taanam. There is a perceptible rhythmic component, unlike in

alapana. Indeed, when the pallavi of a kruti is preceded by an alapana, it is customary to sing a brief

taanam as a conclusion to the alapana.

2.3.1.3 Pallavi exposition

The word pallavi comes from three words, pada laya vinyasam, which literally translates to the grand

showcase of magic with the words and the rhythm. The artist’s virtuosity in rhythmic aspects is put to

a tough test in this section. It consists of playing the pallavi, which is the first stanza of a kruti, in three

speeds keeping the taala constant. So, if the pallavi is sung once in t seconds the first time, it is sung

twice in the same t seconds the next time, and four times the time after that. This pattern is called anuloma.

There is another called pratiloma, which is the converse of it. The pallavi exposition also consists of

sangatis. They are the subtle melodic variations introduced each time the artist sings the stanza.

2.3.1.4 Swara kalpana

The phrase itself conveys that it is the creative imagination of swara patterns. This musical form is

sung with solfege syllables. This form is rhythmically constrained too. Each melodic phrase constructed

with swaras is sung to fit in a cycle of taala. The phrase presents a complete melodic picture, that is

each taala cycle accommodates a complete musical phrase. The length of such patterns grows as the

artist progresses. There are several decorative structures like alankaras that are used in constructing

interesting patterns. In old literature, a few such alankaras are mentioned as gamakas. But now, they are

well distinguished from each other.

⁷ Swara sancharas are the characteristic phrases of a raaga.

2.3.1.5 Niraval

This is a very interesting form which brings the two most important elements together, namely the

sahitya/lyrics and sangeeta/music. Lyrics are an almost indispensable part of Carnatic music. So much

so that even training which involves just instruments, like the violin, is bound to include the lyrical

aspect, since it inherently carries with it the context and, more importantly, the musical structure. For

instance, the pauses and the flexibility to stretch or break a few phrases are important for the artist to improvise

[18].

In Niraval, the artist chooses a phrase which is deep in meaning and reflective of the poem/kruti and

presents it with varying melodic formations. The goal is to sketch the hidden bhavas/feelings in the

kruti. This is done by elaborating the central idea musically. The variations in such melodic formations

include changing gamakas, pauses, stressed points etc.

2.3.2 Composed Forms

The composed forms are primarily of two kinds - abhyasa gana and sabha gana. The forms which

are classified as abhyasa gana are helpful to a student in gaining insights into the raaga and the taala,

incrementally. Examples include swarajatis, varnams etc. Each of them is an exercise to practise an

aspect of the raaga or the taala. For instance, a form called jatiswara helps the student in learning

the intricacies of the taala. The forms which are classified as sabha gana are sung on stage. Krutis and

keertanas are examples of sabha gana. These are more elaborate and also accommodate

artistic improvisation. For instance, when a kruti is sung as the main piece in a concert, it features the

alapana, pallavi exposition, niraval and swara kalpana.

2.3.3 Associated work

We found no research on understanding the musical forms computationally. Improvisation is an indispensable element of Indian classical music. A significant effort in teaching goes into training the students to be creative, using the basic building blocks taught as part of the improvisatory forms. These forms are musically very structured and provide good scope for being explored programmatically. Understanding how a particular form evolves helps us to see the role played by the properties of the raaga and the structure of the taala in shaping it. Further, each artist has his/her own style of improvisation. Thus, studying the evolution of musical forms facilitates modeling the performances and artists.


Chapter 3

Raaga and Rasa

(History, context, truths and myths)

3.1 A brief history of rasa

Music is certainly known to evoke feelings. An example which everyone is familiar with is a lullaby

the mother sings to put a baby to sleep. Those babies barely understand the words used in the lullaby.

What soothes them is the melody. Another familiar example is war music. Traditionally, raagas are

often associated with the feelings they are said to evoke, called rasas. In this section we’ll learn more

about them.

3.1.1 Origin

The term Rasa in the context of arts, to our knowledge, was first used in Natyasastra, a magnum opus

on theatrical arts, especially the drama. The text opines that there is no limit to the continuum

of emotions a human can express. However, it puts forward eight distinct rasas which are experienced

through an amalgamation of emotions. Please note that emotions are well recognised, and are distinct and different

from rasas. Rasa is not an atomic entity. It is a process that the audience goes through while

watching a drama. This process is outlined in the following subsection.

3.1.2 Rasa nishpatti

(The process of experiencing rasa)

Nishpatti literally means production. Rasa is best understood with an analogy. Think of the dish that

you like the most. Several condiments go into making it. Those condiments

correspond to several flavours. The final taste obtained in the dish using all those spices is analogous to

rasa. Here, the spices are bhavas/emotions. Now, we will lay out the actual process of experiencing the

rasa. We’ll run a simultaneous example to help the reader comprehend the process easily.

In all, there are 49 bhavas listed in Natyasastra. These bhavas are subclassified into three

groups.

1. Sthayi (8)

2. Satvika (8)

3. Sanchari (33)

The sanchari bhavas are the temporary feelings or emotions which a person goes through before

attaining a rasa. The satvika bhavas are the physiological responses, and the sthayi bhavas are the final

emotional states a person attains.

1. Vibhaava (The stimulus): Let us consider an example - a small incident with a father and his kid.

Let us say the father is not very happy with the kid’s mischievous nature. Now the cause of this

worry can be from two kinds of sources.

(a) Alambana vibhaava (From the actual source): The father might have seen the kid doing

some mischief. This is the direct source of information.

(b) Uddeepana vibhaava (That which reinforces the information from the actual source): A

neighbour might have complained to the father about the kid’s mischievous behaviour. This

is not a direct source of information.

2. Anubhaava (Involuntary responses): The immediate reaction of the father without much thought

is called Anubhaava. ‘Worry’ and ‘Disappointment’ are most probably his Anubhaavas.

3. Satvika Anubhaava (Physical responses): If the father is too worried, he might be sweating, or

having tears in his eyes. These are bodily responses to the incident.

4. Sanchari/Vyabhichaari bhaava (Temporary sentiments): The father gently talks to the kid and tries

to teach and persuade him/her not to repeat the mistakes. He teaches the kid to behave properly

with neighbours. These are his intermediary reactions to the incident, which will finally give way

to a final sentiment.

Table 3.1: Categories of rasas as given in Natyasastra

Index  Rasa        Sthayi Bhaava
1      Srungara    Love
2      Hasya       Humour
3      Karuna      Grief
4      Raudra      Anger
5      Vira        Bravery
6      Adbhuta     Surprise
7      Beebhatsa   Disgust
8      Bhayanaka   Fear

5. Sthayi bhaava (The perceived final sentiment): The kid has committed some mistakes. The

parental love towards the kid made the father worried. This love and worry drove him to teach the kid

about good behaviour. Now, the final state of the father, or even the audience observing the father,

is the sthayi bhaava.

After witnessing this process, the state of audience corresponds to a rasa. Another example given

in Natyasastra speaks about sadness, happiness, disappointment etc in Srungara (Love) rasa. With the

process outlined here, it should be clear that a rasa may contain an interplay of several emotions.

3.1.3 Categories of rasa

Table 3.1 lists the rasas identified in Natyasastra, and the equivalent English term of the Sthaayi

bhaava corresponding to the rasa.

Later, Saantha rasa (Peace) was added to these 8 rasas. Together, they are called Navarasas1. The

number of rasas is a subject of debate. For instance, a few scholars identify devotion as a separate rasa.

A few others go further and identify a mother’s love towards a child as a separate rasa. However,

Bharata’s classification, together with Saantha rasa, is the most popular one today.

3.1.4 Rasa in drama and music

It is an understatement to say that the rasa theory of Natyasastra is very influential on Indian art

forms. The magnum-opus has discussed the rasa in the context of drama. It considers music to be a part

of the drama. For the audience to experience a rasa, music alone is insufficient. It must be accompanied

by the other components of drama like gestures, body movements, dialogues, context etc. A typical

1 Navarasas is a Sanskrit term for nine rasas.

drama depicts an incident from a story which involves human actions and responses. These often consist

of the regular activities of day-to-day life. The stronger element which moved the audience came from

these depictions.

Whereas today, music is an art form by itself. It has evolved much beyond its usage in dramas. It

may be true that different raagas evoke different emotions in the audience. But calling such emotions

rasas is something which we found very inconsistent with the definition of rasa. The process involved

in experiencing a rasa is completely irrelevant in the case of music. Music needs a different emotional

model and corresponding nomenclature to describe the feelings evoked by it. However, to our knowledge,

there are no such attempts. We discuss the nearest, though unintended, attempt we know of so far

in the following subsection.

3.1.5 Raaga paintings

The feelings triggered by music are not just confined to aural senses. When we listen to raaga music,

it is not unusual that we visualize (ourselves in) some place/scenery appropriate to the kind of music

we are listening to. For example, the serene slow-tempo Indian flute music drives our thoughts around

mountains, rivers and fresh air. The fast paced high frequency patterns on violin can lead to a different

visualization or interpretation. In the medieval period, a rather interesting development took place in

this direction. Each raaga was painted by various schools of art in India. The paintings usually depicted the

mood set by the raaga, and the season and time of day when it is sung. They are an interesting projection

of the aesthetics of a raaga. They give us a hint of the feasibility of expressing the emotions evoked

by music in a non-traditional way. These paintings are popular in Hindustani and not so much in

Carnatic. Two such paintings are shown in Figure 3.1.

In our effort to validate the hypothesis that raagas evoke peculiar rasas, we conducted a behavioural

study. Like many scholars, we were earlier misled into believing that rasa is the same as

emotion. As we said earlier, we did not have the knowledge of the previous section when we conducted

this study. Nevertheless, we would like to describe the set-up and analysis of the study as it was

done. Towards the end, we’ll conclude the analysis in the light of the knowledge from the previous

section.

Figure 3.1: Raaga paintings of Vasanta Ragini (left) and Hindola (right) raagas

3.2 A behavioural study to analyse the raaga-emotion relationship

The motivation for this behavioural study is our desire to build a culture-specific content-based music

recommendation system. A total of 750 subjective emotional responses to tunes composed in popular

raagas of Carnatic music are collected to investigate the long speculated relation between raagas and

rasas (considered as emotion clusters). We discuss the results from analysis of this survey, which show

that raagas are indeed useful as a first step in a different direction, towards building a content-based

music recommendation system. We used a classification based on a novel approach in conceptualization

of emotions based on navarasa, which is the emotion classification method given by Bharata, that suits

behavioural studies with Indian arts.

3.2.1 Introduction

For a long time, there has been an ongoing debate on how music induces emotions in listeners

[17, 27]. Past and recent developments [48], with behavioural [26] and neurological [58] evidence,

show that music indeed induces emotions, and the corresponding activation patterns in brain circuits are

found [42] in the same locations that are active when emotions are induced by a different stimulus,

say language. However, emotion induced by music does not provoke us to act out that emotion, as is

the case when we are emotionally excited by a different stimulus like language. But there are instances

when music does seem to evoke physiological responses like respiration and perspiration [27] while it

lasts, as in war music. It is unclear whether we have more cognitive control on emotions induced by

music [54].

There has been tremendous progress in carving out theories which can explain what patterns of

music elicit which emotions. Juslin and Sloboda [19] have published a thorough survey of the

state-of-the-art in this field. Most of the work has been on western music. The music cultures of the eastern

hemisphere have been left fairly untouched, barring a few attempts. In this study, we examine the emotional

responses of Indian listeners to Carnatic music.

As discussed in chapter 2, raaga is the most fundamental melodic aspect in the Carnatic and Hindustani

music traditions. Many definitions of a raaga given in chapter 2 clearly say that it is much more

than a scale. In the cognition of Indian music, the gamakas, which typically come as note transitions, are

held to be as important as an independent note [38]. In each raaga, a few notes are emphasized, based on which

the emotion of the tune changes. Such notes are called jiva-swaras. Essentially, the emotion carried

by the compositions in a raaga depends on the role played by various swaras in the respective raaga. If a

raaga has many possible jiva-swaras, it is capable of evoking more emotions, based on the emphasized

note. So, each one of these raagas is associated with a particular rasa or a set of them accordingly.

Sambamurthy [41] discusses the theory behind such attribution to some extent.

3.2.2 Related work

Very little empirical work on behavioural studies in Indian classical music has been reported till date.

We are aware of two similar studies [2, 9] in the context of Indian classical music. Both studies were

conducted with raagas from Hindustani music. Their findings cannot be assumed, without empirical

backing, to hold for Carnatic music, due to the following important differences between raagas in

these two music traditions.

3.2.3 Differences between Carnatic and Hindustani traditions

Most raagas of these two traditions differ. A few raagas share the same scale but have different names,

such as Hindolam in Carnatic and Malkauns in Hindustani. Some others have the same name and

the same scale as well, but they are rendered in distinct styles, such as Hamsadwani, owing to other

properties of the raaga. Carnatic tradition does not always employ pakad which is a characteristic phrase

of a raaga in Hindustani tradition [44]. Pakad is typically used in Hindustani concerts to establish or set

the mood of raaga. Though Carnatic music has characteristic phrases called swara-sancharas, they are

not used the same way as pakad.

The Carnatic tradition, unlike Hindustani, no longer adheres to the time concept of raaga, in which

each raaga is performed only during a specific part of the day. Hindustani organizes its raagas with the Thaat system

[5], whereas the raagas in Carnatic are organized into the Melakarta system, which affects properties

such as the scale and composition of raagas [41]. The Hindustani tradition is highly influenced by other music

traditions like Sufi. But the Carnatic tradition has remained relatively unaffected by the influences of

other traditions. These factors clearly indicate that these two music traditions of India are quite distinct

and the results of studies on Hindustani music cannot be extended to Carnatic music, hence demanding

a separate study of association between raagas and emotions in Carnatic tradition. Later, a study on

both the traditions analysed together might help us to better understand the similarities and differences

between Hindustani and Carnatic music, in the context of raagas and emotions.

3.2.4 The hypothesis

Our goal is to test a hypothesis held in Indian classical music tradition for many centuries. It states

that each raaga is peculiar in evoking a particular rasa. Rasa is understood as a conglomerate of a few

emotional states [1]. But, in the context of modern India, it has become ambiguous. Rasa, by many

modern musicians, is taken as a simple emotional state. It would become clear in the following section

that the traditional definition of rasa is not a proper choice in the context of music.

We selected a set of popular Carnatic raagas and conducted a behavioural experiment with tunes

composed in those raagas by expert artists. Before going into details of the experiment, we will discuss

the subjective measures which we chose to collect from the listeners.

3.2.5 Conceptualization of emotions

Western researchers have outlined several psychological models of emotions which were used in

behavioural studies [14, 38, 39]. There are several other approaches in conceptualizing emotions [19].

But for our task, we need a classification that suits the culture and perception of native Indians. Indian

arts use navarasa as a way to understand and represent human emotions. Therefore, we use the emotion

clusters derived from a classification given by ancient Indian musicologist, Bharata. Our motive is to

present a few choices through which one can express his/her emotions after listening to a tune. To this

Table 3.2: An Emotion Classification Based on Navarasa

Index  Rasa         Cluster
0      Srungaram    Arousal, Longing, Desire, Naughty, Romance, Love
1      Hasyam       Satire, Imitation, Tickle, Wit, Comedy
2      Karunam      Hesitation, Poverty, Hardships, Ignorance, Repentance
3      Raudram      Anger, Arrogance, Frenzy, Demolition
4      Viram        Pride, Bravery, Fury, Perseverance, Awestruck
5      Bhayanakam   Startle, Tremor, Stiffen, Paleness, Dreadful
6      Adbhutam     Surprise, Wonder, Inexplicable, Overwhelming
7      Santam       Steady, Rest, Peace
8      Bhakti       Devotion, Eternal, Ritual, Preaching
9      Santosham    Joy, Pleasure, Excitement, Contentment

end, we have made a few changes to the emotion clusters of navarasa to incorporate three major aspects

which we think are important. We present an overview of the procedure followed.

Natyasastra lists 49 emotional and non-emotional states which are shared between eight clusters. We

modify and prune these clusters by making the following changes. As we do not want to merge two

seemingly different emotions and, at the same time, not to make it too difficult for the listeners to record

their responses, we have limited the clusters to 10.

The first change concerns the difficulty in choosing between some emotions like longing, sad and so

on. So, we have only considered keeping those emotional states in each cluster that are considerably

different from each other in their valence and/or strength. The second major change is to regroup the

emotions, i.e., add a new cluster, modify or remove the existing cluster from among the navarasas. We

will give an example as to why this is important. Navarasa does not explicitly accommodate some

emotions. For example, devotion is not a very obvious choice in it. The third change is to tune the

clusters to gather responses to music. Since Natyasastra is intended for theatrical arts, the rasas in it

are defined to fit in a greater context. It is justified to say that resolution and recognition of emotions

in music is a hard task when compared to, say, a drama. Table 3.2 gives the final classification we have

arrived at.

Table 3.3: Raagas and their intended rasas

Raaga             Intended rasa
Ananda Bhairavi   Srungaram, Karunam, Santosham
Atana             Veera
Hamsadwani        Veera, Hasya
Kalyani           Multiple positive valence rasas
Kedaragowla       Veera
Nadanamakriya     Pathos

3.2.6 Details of the study

3.2.6.1 Design

We have chosen six raagas based on their popularity with the help of a music trainer. They are

Ananda Bhairavi, Atana, Hamsadwani, Kedaragowla, Kalyani and Nadanamakriya. Each of these raagas,

by its properties, is believed to evoke a peculiar emotion, as shown in Table 3.3. In each of

these raagas, five tunes played on violin, each of approximately 1 minute duration, were selected. These are

excerpts from kruti renditions. In a similar survey with Hindustani raagas, Chordia [9] used excerpts

from the alapana section, which, in our view, has a few drawbacks. In alapana, the artist improvises within

the constraints of the raaga. In an unreported pilot survey with six subjects, using alapana excerpts

from two tracks per raaga, most listeners reported only whether they enjoyed the excerpt or

not. They said they did not feel any emotion. This observation is congruent with the remarks of Sambamurthy

[41], who says the so-called art music leaves listeners in an ecstasy called sangitananda, which

is bliss and does not necessarily evoke any particular emotion. Listeners appreciate this very process

which is highly dependent on the artist’s skill, but one might not essentially feel any emotion. However,

choosing the stimuli from alapana section bars effects from other variables such as accompaniment and

tempo. Tempo is another important aspect which can affect the perception so much that a raaga which is

typically used for melancholic tunes can be used with a faster tempo to bring about ferociousness. But

for this survey, for each raaga, we have selected those tunes that have more or less a common tempo.

Please note that it is tempo, and not taalam. The tempo varied only slightly in the tunes across the

raagas. The problems that might arise due to accompaniment differences are taken care of, since the

only accompaniment in the tunes selected, mrudangam2, if at all present, is mild.

2 Mrudangam is a barrel-shaped percussion instrument with the two ends of the barrel covered with skin. It is a tuned instrument.

Figure 3.2: Average of the ratings collected per rasa, across all users and tunes, for each raaga. X-axis denotes rasa index and Y-axis denotes the average value of the ratings.

To reduce participants’ fatigue, we divided the selected 30 tunes into two sets of 15 tunes each. Each

set had at least 2 tunes from each raaga. We set up a web portal where each participant received one set and

marked the subjective responses. Each cluster of words from Table 3.2 could be rated as None at

all, A Little, Somewhat or Very, based on how well that cluster expressed the participant’s emotions. For

example, if the listener feels the tune is very romantic, he/she selects Very for the cluster Srungaram. They

could mark multiple clusters for the same tune. They were also asked to describe in their own words how they

felt after listening to the tune.

3.2.6.2 Participants

A total of 750 responses were recorded by 48 people with a median age of 22. The majority of them

were undergraduate or graduate students; 88% were male and 12% were female. They

described their familiarity with Indian classical music tradition, either Hindustani or Carnatic, as None

(35%) and Moderate (65%).

3.2.6.3 Results and observations

Figure 3.2 shows the normalized averages of subjective responses recorded by participants for all

the tracks in each raaga. We have quantized the verbal responses (None at all, A Little, Somewhat

and Very) as 0, 1, 2 and 3 respectively to arrive at this plot. One can immediately see the similarity

between the plots of Ananda Bhairavi and Kalyani, and between those of Atana and Hamsadwani. The plots of Kedaragowla and

Nadanamakriya are distinct, as they do not show much similarity with those of the other raagas.

However, Ananda Bhairavi differs from Kalyani when the relative height of peaks within each of their

plots is considered. To cross check whether the obtained peaks for rasas in each raaga correspond to

consistent user ratings, standard deviation of ratings given for each rasa across all tracks for each raaga

has been calculated.
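
The quantization and averaging described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name, the response format (a dict from rasa index to verbal rating) and the choice to treat unmarked clusters as None at all are assumptions, and the normalization applied in Figure 3.2 is not specified in the text, so it is omitted.

```python
# Verbal scale mapped to the numeric codes given in the text.
RATING_VALUES = {"None at all": 0, "A Little": 1, "Somewhat": 2, "Very": 3}


def mean_rating_per_rasa(responses, num_rasas=10):
    """Average the quantized ratings for each rasa cluster.

    `responses` is a list of dicts mapping a rasa index (0-9, as in
    Table 3.2) to a verbal rating. Clusters a listener left unmarked are
    treated as "None at all" (an assumption of this sketch).
    """
    totals = [0] * num_rasas
    for response in responses:
        for rasa in range(num_rasas):
            totals[rasa] += RATING_VALUES[response.get(rasa, "None at all")]
    return [total / len(responses) for total in totals]


# Hypothetical responses for one raaga (not data from the study).
example = [
    {2: "Very", 7: "A Little"},
    {2: "Somewhat"},
    {2: "Very", 8: "Somewhat"},
]
means = mean_rating_per_rasa(example)
```

In this made-up example the Karunam cluster (index 2) receives the highest mean, which is the kind of peak read off from the plots.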

Figure 3.3: (a) Average rating obtained per rasa for a tune in Nadanamakriya. (b) Standard deviation of the ratings for the same tune. X-axis denotes the rasa indices. Y-axis denotes the normalized values.

Figure 3.4: Ratings given by six users for a sample track in each raaga. X-axis denotes rasa indices. Y-axis denotes the ratings quantifiers.

But it has been realized that the popular measures like standard deviation cannot be relied upon for

this analysis since the actual subjective responses were obtained through verbal terms which were later

quantified numerically for analytical purpose. Standard deviation of these values will not reflect the

truth. Let us look at an example. After listening to a particular tune, if a few users have rated Somewhat

for a rasa and a few have rated Very for the same rasa, the deviation in the values for that particular

rasa is high compared to the case where all the users rated it either Very or Somewhat.

Figure 3.3 shows one such example of the average rating and standard deviation for all rasas for a tune

in raaga Nadanamakriya. Hence, non-standard numerical quantification of such terms is ruled out as a

measure to check the consistency in ratings.
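
The example in the paragraph above can be worked through numerically. The sketch below uses hypothetical ratings (not data from the study) and the 0 to 3 coding described earlier. It compares listeners split between Somewhat and Very with listeners who all chose Very.

```python
from statistics import pstdev

RATING_VALUES = {"None at all": 0, "A Little": 1, "Somewhat": 2, "Very": 3}


def rating_spread(verbal_ratings):
    """Population standard deviation of the quantized ratings."""
    return pstdev([RATING_VALUES[r] for r in verbal_ratings])


# Hypothetical ratings for one rasa of one tune.
split = ["Somewhat"] * 3 + ["Very"] * 3  # listeners agree the rasa is present
unanimous = ["Very"] * 6

spread_split = rating_spread(split)
spread_unanimous = rating_spread(unanimous)
```

Both groups agree that the rasa is clearly present, yet only the first produces a non-zero deviation. That deviation depends entirely on the arbitrary numeric distance assigned between Somewhat and Very.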

With that in mind, we have resorted to a naive but straightforward method to validate if the rasa-peaks

for each raaga are valid in attributing a rasa to the corresponding raaga. Figure 3.5 shows the

histograms of values obtained for each rasa for all raagas. For instance, let us consider the raaga

Nadanamakriya. One can observe that consistently a large number of responses have been recorded

against rasa clusters other than the third cluster, which is sympathy. So, observing those histograms and

correlating with the mean plot of Nadanamakriya shown in Figure 3.2, we arrive at a conclusion that

Nadanamakriya primarily induces sympathy. Figure 3.4 shows responses of nine users for a track in

each raaga. For instance let us consider Nadanamakriya again. A consistent trend has been observed in

such plots for this raaga, which reaffirms that this raaga primarily induces Karuna rasa.
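
A histogram of this kind only counts how often each verbal label was chosen, so it does not depend on the numeric coding. The sketch below is illustrative: the ratings are hypothetical, and the helper `share_at_least` is an added convenience for summarizing a histogram, not a measure used in the study.

```python
from collections import Counter

SCALE = ["None at all", "A Little", "Somewhat", "Very"]


def rating_histogram(verbal_ratings):
    """Count how many times each verbal rating was chosen for one rasa."""
    counts = Counter(verbal_ratings)
    return [counts.get(label, 0) for label in SCALE]


def share_at_least(verbal_ratings, threshold="Somewhat"):
    """Fraction of ratings at or above `threshold` on the verbal scale."""
    cutoff = SCALE.index(threshold)
    hits = sum(1 for r in verbal_ratings if SCALE.index(r) >= cutoff)
    return hits / len(verbal_ratings)


# Hypothetical ratings of the Karunam cluster for one raaga.
karunam = ["Very", "Somewhat", "Very", "Somewhat", "A Little", "Very"]
hist = rating_histogram(karunam)
share = share_at_least(karunam)
```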

But this is not the case with all the raagas though. For instance, ratings for raaga Kedaragowla have

not been very consistent across listeners. This can be observed from plots for other raagas in Figure

3.4. Kalyani is an interesting raaga with many jiva-swaras, which are the most stressed notes. The

results reaffirm this, showing that it arouses an array of emotions based on the frequencies stressed

in the composition. Though not as evident as Nadanamakriya, the other raagas more or less show a

convergence in ratings and the ratings show a consistency across users. However, from Figure 3.5 and

Figure 3.4, it can be observed that the emotional responses for any given raaga are not in favour of

any single rasa. Almost every participant has chosen ratings for multiple clusters and stated

in words that he/she can feel emotions in multiple rasas listening to the tune, as is evident in the plot.

What interests us is the converging patterns in those ratings for a given raaga. The observed rasas

of a few raagas are not consistent with the intended rasas, but there is certainly an overlap. Though this

observation keeps us from making a final statement on the raaga-rasa association, from the converging

pattern between ratings for a raaga and their variance across raagas, we can say that the raaga certainly

encapsulates the melodic patterns responsible for eliciting specific emotions.

3.2.6.4 Implications to music recommendation systems

We have investigated the possibility of building a novel recommendation system based on emotion,

specific to Indian culture. The results from the analysis of this survey have several direct and indirect

implications which can be used to increase the effectiveness of content-based music recommendation

systems in general. The fact that raaga holds properties of a tune responsible for perception of melody

and evoking emotions can be a potential key to building recommendation systems that complement and

contrast with the western approaches.

3.2.7 Conclusions

We have reported a behavioural study that has empirically tested the hypothesis that each raaga in

Carnatic music evokes peculiar emotions characteristic of that raaga. But a few questions still remain

to be answered. The participants in the study were Indians. So, the influence of raaga-based music

on other cultures is yet to be seen. For now, we can only speculate based on an analysis reported by

Balkwill et al. [3]. In an attempt to avoid biasing the raagas and the tunes chosen, we have refrained from

deliberately picking raagas and/or tunes. The tunes selected were not very distinct from each other.

This has resulted in a dataset with less polarity in emotional content of the tunes, as can be seen from

Figure 3.2.

With additional data from new raagas and tunes that bring more polarity to the dataset, a study must

be conducted with participants from other ethnic groups around the world to observe whether cultural factors

play a dominant role in responding emotionally to Carnatic music. As we have observed a great

similarity between the plots of user responses for a few raagas, it will be interesting to see if the note transition

patterns are common in the melodies constructed using those raagas. Further, these patterns can be analysed

to see if they make a dominant contribution in evoking peculiar emotions. It will also be interesting to

see the extent to which a mood-based recommendation system improves when the results of this study

and our future work are incorporated into an existing statistical music recommendation program,

and to identify the features responsible for the perception of various emotions. Later, it can also be verified if

the same features hold true in perceiving other genres of music.

And the most important aspect in taking this work any further is to take care in incorporating the

knowledge of the previous section on rasa. Without it, this study can only be considered as the one which

investigates the relationship between raaga and emotion. Even if the rasa term is properly interpreted,

it is almost impossible to take the user responses to empirically measure it. So, as we have said in the

beginning, the definition of the rasa does not make it a good emotion model in the case of music.

Figure 3.5: Histograms of ratings obtained per rasa, for the six raagas. X-axis has the four rating quantifiers - None at all, A Little, Somewhat and Very. Y-axis denotes the number of ratings obtained for each quantifier.

Chapter 4

Raaga Recognition

Raaga is the spine of Indian classical music. It is the single most crucial element of the melodic

framework on which the music of the subcontinent thrives. Naturally, automatic raaga recognition is

an important step in computational musicology as far as Indian music is concerned. It has several

applications, such as indexing Indian music, automatic note transcription, comparing, classifying and

recommending tunes, and teaching, to mention a few. Simply put, it is the first logical step in the process of

creating computational methods for Indian classical music. In this chapter, we identify the main drawbacks

of the previous raaga recognition techniques and propose minor, but multiple improvements to the

state-of-the-art raaga recognition technique. We discuss the results obtained with our raaga recognition

system with those improvements.

4.1 Introduction

Geekie [15] very briefly summarizes the importance of raaga recognition for Indian music and its

applications in music information retrieval in general. Raaga recognition is primarily approached as

determining the scale used in composing a tune. However, the raaga contains more information, which

is lost if it is dealt with using western methods such as this. This information plays a very central role in the

perception of Indian classical music.

Though our work primarily concerns Carnatic music, most of the discussion applies to Hindustani

music as well, unless mentioned otherwise. In chapter 2, we have surveyed the computational

approaches to melody. In the following section, we identify a few problems that we address using our

raaga recognition system.

4.2 Problems that need to be addressed

4.2.1 Gamakas and pitch extraction for Carnatic music

An appropriate pitch extraction module is one that can accurately represent the gamakas. This has not

been a severe problem for classification systems that did not depend on the gamakas of a note for

classification. If there is such a pitch extraction system in place, gamakas can be used as an additional

feature to improve the accuracies of existing systems. Gamakas assume a major role when the number

of raaga classes is high in the dataset.

4.2.2 Skipping tonic detection

The manually implemented tonic (the base frequency of the instrument/singer) identification stage

needs to be eliminated if possible. Since the tonic identification itself involves some amount of error,

this could adversely impact the performance of a raaga recognition system. Neither the Carnatic nor

Hindustani systems adhere to any absolute tonic frequency, therefore it makes sense to build a system

that can ignore the absolute location of the tonic.

4.2.3 Resolution of pitch-classes

Though 12 bins for pitch-class profiles look ideal to the Western eye, we hypothesize that a more

continuous model can capture more relevant information related to Indian classical music. Dividing an

octave into n bins where n > 12 can help us model the distribution with better resolution. Gamakas (the

micro tonal variations) play a vital role in the perception of Indian music, and this has been confirmed

by several accomplished artists. The transitions involved in a gamaka and the notes through which its

trajectory passes are two factors that need to be captured. We hypothesize that this information can be

obtained, at least partially, using a higher number of bins for the first-order pitch distribution.
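
To make the hypothesis concrete, the sketch below builds an octave-folded first-order pitch distribution with a configurable number of bins. It is an illustration under stated assumptions rather than the implementation evaluated in this chapter: the function name, the 220 Hz reference, the centring of bin 0 on the reference pitch and the treatment of non-positive values as unvoiced frames are all choices made for the example.

```python
import math


def pitch_class_distribution(pitches_hz, num_bins=12, reference_hz=220.0):
    """Octave-folded pitch histogram with `num_bins` bins per octave.

    Returns relative frequencies that sum to 1 (or all zeros if no
    voiced frame is present). Non-positive pitch values are skipped.
    """
    bin_width = 1200.0 / num_bins  # bin width in cents
    counts = [0] * num_bins
    for f in pitches_hz:
        if f <= 0:
            continue
        cents = (1200.0 * math.log2(f / reference_hz)) % 1200.0
        # Centre bin 0 on the reference pitch (an assumption of this sketch).
        index = int((cents + bin_width / 2) // bin_width) % num_bins
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * num_bins


# Hypothetical frames: the reference, a pitch 30 cents above it, and its octave.
frames = [220.0, 220.0 * 2 ** (30 / 1200), 440.0]
coarse = pitch_class_distribution(frames, num_bins=12)
fine = pitch_class_distribution(frames, num_bins=36)
```

With 12 bins, all three frames fall into the same bin. With 36 bins, the frame 30 cents above the reference lands in a neighbouring bin, which is the kind of microtonal detail a finer resolution is expected to retain.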

4.2.4 A comprehensive dataset

The previous datasets which are used for testing have several problems. In Tansen, and the work by

Sridhar and Geeta, the datasets had as few as 2 or 3 raagas. The dataset used by Chordia has all the data

played on a single instrument by a single artist. The test datasets were constrained to some extent by

the requirement of monophonic audio (unaccompanied melodic instrument) for reliable pitch detection.

In the present work, we investigate raaga recognition performances on a more comprehensive dataset

with more raaga classes with significant number of tunes in each across different artists and different

compositions. This should enable us to obtain better insight into the raaga identification problem.

With these issues in raaga recognition in mind, we have implemented a system which addresses

some of the challenges described. The following sections introduce our method and present

a detailed analysis and discussion of the results.

4.3 Our method

As mentioned earlier, we propose to address some of the issues described in the previous section.

We have taken a diverse set of tunes to include in the dataset. The use of amply available recorded

music necessitates a pitch detection method that can robustly track the melody line in the presence of

polyphony. The obtained sequence of pitch values converted to cents scale (100 cents = 1 semitone)

constitutes the pitch contour. The pitch contour may be used as such to obtain a pitch-class distribution.

On the other hand, given the heavy presence of ornamentation in Indian music, it may help to use identified

stable note segments before computing the pitch-class distribution. We investigate both approaches.

Finally, a similarity measure, that is insensitive to the location of the tonic note, is used to determine the

best matched raaga to a given tune based on available labeled data. Each of the aforementioned steps is

detailed next.

4.3.1 Pitch extraction

Pitch detection is carried out at 10 ms intervals throughout the sampled audio file using a predominant

pitch detection algorithm designed to be robust to pitched accompaniment [37]. The pitch detector

tracks the predominant melodic voice in polyphonic audio accurately enough to preserve fast

pitch modulations. This is achieved by the combination of harmonic pattern matching with dynamic

programming based smoothing. Analysis parameter settings suitable to the pitch range and type of

polyphony are available via a graphical user interface thus facilitating highly accurate pitch tracking

with minimal manual intervention across a wide variety of audio material. Figure 4.1 shows the output

pitch track superimposed on the signal spectrogram for a short segment of Carnatic vocal music where

the instrumental accompaniment comprised violin and mridangam (percussion instrument with tonal

characteristics). While the violin usually follows the melodic line, it plays held notes in this particular

segment. Low amounts of reverberation were audible as well. We observe that the detected pitch track

faithfully captures the vocal melody unperturbed by interference from the accompanying instruments.

Figure 4.1: Screenshot from the melodic pitch extraction system of [37] showing the detected pitch superimposed on the signal spectrogram. The axis on the right indicates pitch value (Hz).

4.3.2 Finding the tuning offset

The pitch values obtained at 10 ms intervals are converted to the cents scale by assuming an equi-

tempered tuning scale at 220 Hz. All the pitch values are folded into a single octave. The finely-binned

histogram maximum of the deviation of the cents value from the notes of the equi-tempered 12-note

grid provides us the underlying tuning offset of the audio with respect to 220 Hz. The tuning offset is

applied to the pitch values to normalize the continuous pitch contour to standard 220 Hz tuning by a

simple vertical shift but without any quantization to the note grid at this point.
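
The tuning-offset step can be sketched as below. This is a hedged illustration of the procedure as described, not the actual implementation: the function names, the 1-cent histogram resolution and the use of the bin centre as the estimate are assumptions. The normalized contour is left unfolded here so that it remains continuous in time.

```python
import math


def hz_to_cents(f_hz, reference_hz=220.0):
    """Pitch in cents relative to the reference (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_hz / reference_hz)


def tuning_offset(pitches_hz, resolution_cents=1.0):
    """Estimate the offset (in cents) from the 220 Hz equi-tempered grid.

    Each voiced frame is folded into one octave, its deviation from the
    nearest grid note (in [-50, 50) cents) is histogrammed, and the centre
    of the fullest bin is returned.
    """
    num_bins = int(round(100.0 / resolution_cents))
    hist = [0] * num_bins
    for f in pitches_hz:
        if f <= 0:  # skip unvoiced frames
            continue
        cents = hz_to_cents(f) % 1200.0
        deviation = (cents + 50.0) % 100.0 - 50.0
        index = int((deviation + 50.0) / resolution_cents) % num_bins
        hist[index] += 1
    peak = max(range(num_bins), key=hist.__getitem__)
    return (peak + 0.5) * resolution_cents - 50.0


def normalize_contour(pitches_hz, offset_cents):
    """Shift the cents contour onto the 220 Hz grid without quantizing it."""
    return [hz_to_cents(f) - offset_cents for f in pitches_hz if f > 0]


# Hypothetical contour: four notes, each held for five frames, 20.4 cents sharp.
notes = [0, 200, 700, 1200]
frames = [220.0 * 2 ** ((c + 20.4) / 1200) for c in notes for _ in range(5)]
estimate = tuning_offset(frames)
normalized = normalize_contour(frames, estimate)
```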

4.3.3 Note segmentation

As we observe in Figure 4.1, the pitch contour is continuous and marked by glides and oscillations

connecting more stable pitch regions. The stable note regions too are marked by low pitch modulations.

As described in chapter 2, melodic ornamentation in Indian classical music is very diverse and elaborate.

For our investigation of pitch class profiles confined to stable notes, we need to detect relatively stable

note regions within the continuously varying pitch contour. The local slope of the pitch contour can be

used to differentiate stable note regions from connecting glides and ornamentation.

At each time instant, the pitch value is compared with its two neighbors (i.e. 10 ms removed from

it) to find the local slope in each direction. If either local slope lies below a threshold value of 15

semitones per second, the current instant is considered to belong to a stable note region. This condition is summarized by Eq. 4.1.

$$\left( \left| F(i-1) - F(i) \right| < \theta \right) \;\lor\; \left( \left| F(i+1) - F(i) \right| < \theta \right) \tag{4.1}$$

where $F(i)$ is the pitch value at time index $i$ and $\theta$ is the slope threshold. To put the selected threshold value in perspective, a large vibrato (spanning a 1 semitone pitch range) at 6 Hz pitch modulation frequency has a maximum slope of about 15 semitones per second. All instants where the slope does not meet this constraint are considered to belong to the ornamentation.
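The stability test of Eq. 4.1 can be sketched in a few lines of Python. This is an illustrative reading rather than the original code: it assumes the contour is already expressed in semitones, that frames are 10 ms apart, and that a frame at either end of the contour is judged on its single available neighbor; the function name is hypothetical.

```python
def stable_mask(pitch_semitones, hop_s=0.010, theta=15.0):
    """Return True for frames whose local slope toward at least one
    neighbor is below theta (semitones per second), as in Eq. 4.1."""
    n = len(pitch_semitones)
    mask = []
    for i in range(n):
        slopes = []
        if i > 0:
            slopes.append(abs(pitch_semitones[i - 1] - pitch_semitones[i]) / hop_s)
        if i < n - 1:
            slopes.append(abs(pitch_semitones[i + 1] - pitch_semitones[i]) / hop_s)
        mask.append(any(s < theta for s in slopes))
    return mask
```

With a 10 ms hop, the 15 semitones per second threshold corresponds to a frame-to-frame difference of 0.15 semitones.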

Finally, the pitch values in the segmented stable note regions are quantized to the nearest available

note value in the 220 Hz equi-tempered scale. This step smoothes out the minor fluctuations within

intended steady notes. Figure 4.2 shows a continuous pitch contour with the corresponding segmented

and labeled note sequence superimposed. We note several passing notes are detected which on closer

examination are found to last for durations of 30 ms or more.
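The quantization and labeling step might look as follows in Python. The sketch assumes the pitch contour is given in semitones relative to the 220 Hz reference together with a per-frame boolean stability mask; the function name and the choice to report durations in frames (multiply by the 10 ms hop for seconds) are assumptions.

```python
def quantize_notes(pitch_semitones, mask):
    """Group consecutive stable frames that round to the same semitone
    into (note_label, n_frames) segments; unstable frames break segments."""
    notes = []
    current, count = None, 0
    for p, stable in zip(pitch_semitones, mask):
        label = round(p) if stable else None
        if label == current:
            count += 1
        else:
            if current is not None:
                notes.append((current, count))
            current, count = label, 1
    if current is not None:
        notes.append((current, count))
    return notes
```

A segment of three frames corresponds to a 30 ms passing note at the 10 ms hop used in the text.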

Figure 4.2: Note segmentation and labeling. Thin line: continuous pitch contour; Thick line: detected stable note regions.

4.3.4 Pitch-class profiles

We investigate various approaches to deriving the pitch class profile. The first of two broad ap-

proaches corresponds to considering only the stable notes, segmented and labeled in the previous step.

The pitch class profile is then a 12-bin histogram corresponding to the octave-folded note label values.

There are two choices for weighting the note values for histogram computation. We call these P1 and

P2, where P1 refers to weighting a note bin by the number of instances of the note, and P2 refers to

weighting by total duration over all instances of the note in the music piece.
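The two weightings can be made concrete with a short Python sketch. It assumes each detected note is available as a (semitone label, duration) pair and normalizes the histogram to sum to one; the normalization and the function name are assumptions not stated in the text.

```python
def pcp_from_notes(notes, weight="duration"):
    """12-bin pitch-class profile from (semitone_label, duration) pairs.

    weight="count" gives P1 (number of instances),
    weight="duration" gives P2 (total duration).
    """
    hist = [0.0] * 12
    for label, duration in notes:
        hist[label % 12] += duration if weight == "duration" else 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

For example, two short instances and one long instance of different notes contribute 2:1 under P1 but can reverse their ranking under P2.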

A second broad approach is to ignore the note segmentation step and to consider all pitches in the

pitch contour irrespective of whether they correspond to stable notes or ornamentation regions. We

call this P3. Further, the number of divisions of the octave is varied representing different levels of

fineness in pitch resolution. The investigation of varying quantization intervals is motivated by the

widely recognized microtonal character of Indian music.
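A P3-style profile with a configurable number of bins could be computed as in the Python sketch below. Centering bin 0 on the 220 Hz reference note, skipping non-positive pitch values as unvoiced, and normalizing to unit sum are assumptions made for illustration.

```python
import math


def pcp_from_contour(pitches_hz, n_bins=12, ref_hz=220.0):
    """Octave-folded histogram of all pitch values with n_bins divisions."""
    hist = [0.0] * n_bins
    width = 1200.0 / n_bins  # bin width in cents
    for f in pitches_hz:
        if f <= 0:
            continue
        cents = (1200.0 * math.log2(f / ref_hz)) % 1200.0
        # Shift by half a bin so that bin 0 is centred on the reference note.
        idx = int((cents + width / 2.0) // width) % n_bins
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Calling it with n_bins set to 12, 24, 36, 72 or 240 gives the octave divisions compared in the experiment.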

4.3.5 Distance measure

In order to compare pitch-class profiles computed from two different tunes, it is necessary that the

distribution intervals are aligned in terms of the locations of corresponding scale degrees. This can be

ensured by the cyclic rotation of one of the distributions to achieve alignment of its tonic note interval

with that of the other distribution. Since information about the tonic note of each tune is not available

a priori, we consider all possible alignments between two pitch class profiles and choose the one that

matches best in terms of minimizing the distance measure. This is achieved by cyclic rotation of one of

the distributions in 12 steps with computation of the distance measure at each step.

As for choosing the distance measure itself, we would like it to reflect the extent of similarity between

two tunes in terms of shared raaga characteristics. We choose the Kullback-Leibler (KL) divergence

measure as a distance measure suitable for comparing distributions. Symmetry is incorporated into this

measure by summing the two values as given below [4].

$$D_{KL}(P, Q) = d_{KL}(P \mid Q) + d_{KL}(Q \mid P) \tag{4.2}$$

$$d_{KL}(P \mid Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)} \tag{4.3}$$

where i refers to the bin index in the pitch class profile, and P and Q refer to pitch class distributions

of two tunes.
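The symmetric KL distance and the search over cyclic alignments can be sketched in Python as follows. The small additive constant that avoids taking the logarithm of an empty bin is an assumption (the text does not say how empty bins are handled), as are the function names; profiles with more than 12 bins are assumed to be rotated in 12 semitone-sized steps.

```python
import math


def sym_kl(p, q, eps=1e-6):
    """Symmetric KL divergence of Eqs. 4.2 and 4.3.

    eps smoothing is an assumption to keep log() finite for empty bins.
    """
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    zp, zq = sum(ps), sum(qs)
    ps = [x / zp for x in ps]
    qs = [x / zq for x in qs]
    d_pq = sum(a * math.log(a / b) for a, b in zip(ps, qs))
    d_qp = sum(b * math.log(b / a) for a, b in zip(ps, qs))
    return d_pq + d_qp


def aligned_distance(p, q, steps=12):
    """Minimum symmetric KL distance over cyclic rotations of q in 12 steps."""
    n = len(q)
    step = max(1, n // steps)
    return min(sym_kl(p, q[s:] + q[:s]) for s in range(0, n, step))
```

Because the minimum is taken over all 12 rotations, two profiles of the same tune transposed to different tonics yield a distance of zero.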

4.4 Experiment and results

We describe a raaga classification experiment and present results on the comparative performances of

the various types of pitch-class profiles for different classifier settings. A suitable dataset is constructed

from commercially available CD audio recordings. To make the best use of available data, we use leave-

one-out cross validation with a k-NN (k Nearest Neighbors) classifier to evaluate the performance of our

system. The details of the experiment are provided next.

4.4.1 Dataset

There are a few observations worth mentioning in connection with the design of a test dataset for

our raaga recognition system. During preliminary trials of our system, we observed a performance bias

in available datasets arising from the fact that several popular compositions in Carnatic music originate

in the 17th, 18th and 19th centuries. Some of these compositions sung by several artists lead to the

occurrence of several sets of near identical tunes in the dataset resulting in very similar pitch profiles for

supposedly different pieces of music. This prompted us to exercise due care in selecting music pieces

for our test dataset. We have been careful not to include different versions of the same composition in

the dataset. For instance, a tune which renders the kruti ‘nanu brOvamani cheppavE’ is not included

if another tune based on that kruti already existed in the dataset. However, since alapanas are not pre-composed, and are purely based on the artist's virtuosity, we have included them. To get a bigger dataset

we considered the complete Raagam-Taanam-Pallavis of various artists besides shorter krutis. This

expanded the list of options from which it is possible to extract a clip to be included in the dataset. The

clips were extracted from the live performances and CD recordings of 31 artists, both vocal (male and

female) and instrumental (veena, violin, mandolin and saxophone) music. The dataset consisted of 170

tunes from across 10 raagas with at least 10 tunes in each raaga (except Ananda Bhairavi with 9 tunes) as summarized in Table 4.1. The duration of each tune averages 1 minute. The tunes are converted to mono-channel, 22.05 kHz sampling rate, 16 bit PCM. The dataset can be considered very representative of Carnatic classical music, since it includes artists spanning several decades, male and female, and

all the popular instruments.

| Raaga | Total tunes | Avg. duration in seconds | Composition of tunes |
|---|---|---|---|
| Abheri | 11 | 61.3 | 6 vocal, 5 instrumental |
| Abhogi | 10 | 62 | 5 vocal, 5 instrumental |
| Ananda Bhairavi | 9 | 64.7 | 4 vocal, 5 instrumental |
| Arabhi | 10 | 64.9 | 8 vocal, 2 instrumental |
| Atana | 21 | 56.75 | 12 vocal, 9 instrumental |
| Begada | 17 | 61.17 | 9 vocal, 8 instrumental |
| Behag | 14 | 59.71 | 12 vocal, 2 instrumental |
| Bilahari | 13 | 61.38 | 10 vocal, 3 instrumental |
| Hamsadwani | 41 | 57.07 | 14 vocal, 27 instrumental |
| Hindolam | 24 | 60 | 15 vocal, 9 instrumental |

Table 4.1: Description of the dataset across 10 raagas.

| Pitch-class profile | k=1 | k=3 | k=5 | k=7 |
|---|---|---|---|---|
| P1 (12 bins, weighted by number of instances) | 55.9 | 56.5 | 57.1 | 59.4 |
| P2 (12 bins, weighted by duration) | 71.2 | 73.5 | 76.5 | 76.5 |
| P3 (12 bins) | 73.5 | 70 | 74.7 | 75.3 |
| P3 (24 bins) | 72.4 | 72.9 | 75.3 | 74.1 |
| P3 (36 bins) | 68.2 | 72.4 | 72.9 | 74.1 |
| P3 (72 bins) | 67.7 | 68.2 | 69.4 | 68.2 |
| P3 (240 bins) | 65.3 | 68.2 | 66.5 | 65.9 |

Table 4.2: Performance of weighted-k-NN classification with various pitch-class profiles

4.4.2 Classification experiment

A k-NN classification framework is adopted where several values of k are tried. In a leave-one-out

cross-validation experiment, each individual tune is considered a test tune in turn while all the remaining

constitute the training data. The k nearest neighbors of the test tune in terms of the selected distance

measure are considered to estimate the raaga label of the test tune. The distance measure used is the

symmetric KL distance presented in the previous section. Since there are in all a minimum of 9 tunes

per raaga, we consider values of k=1, 3, 5 and 7. Since the number of classes is high (10 raagas), it is

more appropriate to consider a weighted-distance k-NN classification rather than simple voting to find

the majority class. Weighted k-NN classification is described by the equations below. The chosen class

is C*,

$$C^{*} = \arg\max_{c} \sum_{i} w_i \, \delta(c, f_i(x)) \tag{4.4}$$

where $c$ is the class label (raaga identity in our case), $f_i(x)$ is the class label for the $i$th neighbor of $x$ and $\delta(c, f_i(x))$ is the indicator function that is 1 if $f_i(x) = c$, and 0 otherwise. The weights are given by,

$$w_i = \frac{1}{d(x, y)} \tag{4.5}$$

where $d(x, y)$ is the symmetric KL distance between the pitch-class profile $x$ of the test tune and the profile $y$ of its $i$th neighbor.
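The weighted vote of Eqs. 4.4 and 4.5 and the leave-one-out loop can be sketched in Python as below. This is an illustration only: the function names are hypothetical, the distance function is passed in by the caller, and the small floor on the distance (to avoid division by zero for identical profiles) is an assumption.

```python
from collections import defaultdict


def weighted_knn(distances, labels, k):
    """Pick the class with the largest sum of 1/d over the k nearest items."""
    nearest = sorted(range(len(distances)), key=distances.__getitem__)[:k]
    votes = defaultdict(float)
    for i in nearest:
        # Floor on the distance is an assumed guard against division by zero.
        votes[labels[i]] += 1.0 / max(distances[i], 1e-12)
    return max(votes, key=votes.get)


def leave_one_out_accuracy(profiles, labels, k, dist):
    """Each item is classified in turn using all remaining items as training data."""
    correct = 0
    for t in range(len(profiles)):
        others = [i for i in range(len(profiles)) if i != t]
        d = [dist(profiles[t], profiles[i]) for i in others]
        l = [labels[i] for i in others]
        if weighted_knn(d, l, k) == labels[t]:
            correct += 1
    return correct / len(profiles)
```

With distances 0.1, 0.5 and 0.6 to neighbors labeled A, B and B, the weighted vote selects A (weight 10 against roughly 3.7), whereas a simple majority vote would select B.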

The results in terms of percentage accuracy in raaga identification, obtained on the test dataset,

appear in Table 4.2. Two important points emerge from the comparison of accuracies across the different

types of pitch-class profiles. For all values of k, except k=1, in the k-NN classification, we see that P2

(the note segmented, duration weighted pitch-class profile) yields the highest accuracies. This implies

that note durations play an important role in determining their relative prominence for a particular raaga

realization. This is consistent with the fact that long sustained notes like dirgha swaras play a greater role

in characterizing a raaga than other functional notes which occur briefly in the beginning, the end or in

the transitions. The benefit of note segmentation is seen in the slightly superior performance of P2 over

P3 (12 bin). P2 does not consider those instants that lie outside detected stable note regions. The second

important point emerging from Table 4.2 is the decreasing classification accuracy with increasing bin

resolution. Although the reverse might be expected in view of the widely held view that the specific

intonation of notes within micro-intervals is a feature peculiar to a raaga, a more carefully designed,

possibly unequal, division of the octave may be needed to observe this.

The overall best accuracy of 76.5%, which is much higher than chance for the 10-way classification task, indicates the effectiveness of the pitch-class profile as a feature vector for raaga identification.

It is encouraging to find that a simple first order pitch distribution provides considerable information

about the underlying raaga although the complete validation of this aspect can be achieved only by testing with a much larger number of raaga classes on a larger dataset. Including the ornamentation regions

in the pitch-class distribution did not help. As mentioned before, the gamakas play an important role

in characterizing the raaga as evidenced by performance as well as listening practices followed. How-

ever, for gamakas to be effectively exploited in automatic identification, it is necessary to represent their

temporal characteristics such as the actual pitch variation with time. A first-order distribution which

discards all time sequence information is quite inadequate for the task.

4.5 Conclusions

A brief but comprehensive introduction to the raaga and its properties is presented. Previous raaga

recognition techniques are surveyed with a focus on their approach and contributions. Key aspects that

need to be addressed are outlined and a method which deals with a few of them is discussed. Apart from

these contributions of our work, we have also highlighted details such as the composition of the testing

dataset, and provided insights into the post-processing steps involved with pitch extraction procedure

for Carnatic music. This is the first work, to the best of our knowledge, that uses polyphonic audio

recordings in the raaga recognition task.

The transitions in gamakas are discarded in the method explained, or are not fully utilized. A higher

number of bins in the pitch distribution proved to be not necessarily useful. Future raaga recognition

techniques can take into account the other properties of a raaga. Most important of these are the charac-

teristic phrases and gamakas which suggest that temporal properties may be usefully exploited in future

work. An automatic pitch-transcription system as accurate as the semi-automatic polyphonic pitch-

extraction system used in our work, is also necessary to scale the work to a large number of raagas.

Chapter 5

Conclusions

Very little research has been carried out on Indian music and even less on the specific characteristics

that make it so special. The few existing computational approaches to melody, discussed in chapter 2,

have focused mainly on raaga recognition. Given the number of raagas which are commonly performed

and their unique properties, the data used in the literature is not representative. Indeed, the high accu-

racies reported might be due to the limited number of raagas used and the overall size of the dataset.

Moreover, important properties of the raagas, like their specific use of gamakas, have not been exploited

and issues beyond recognition have not been approached. As more representative datasets are gathered,

the features used will not be sufficient to discriminate the raaga classes. Features such as pitch-class profiles and pitch-class dyad distributions capture only partial information about the raagas. The other roles of notes, which are not evident in such features, also need to be exploited. Symbolic scores can also be used for building

more complex models, especially to model the characteristic melodic movements of particular raagas.

It should be noted that raaga recognition is only a starting point to model a raaga and thus a lot remains

to be done.

At the level of musical instruments there is practically nothing done. Physical modeling of their

many non-linear behaviors is quite complex and the lack of instrument standardization does not help.

Some research has been done on modeling tabla and sitar [20] and there have been a few attempts in

developing sound synthesis systems [50]. In order to obtain credible synthesized sounds, as well as to

describe performance practice, the modeling of gamakas is a bottleneck.

The variability in performances of the same song is quite large, especially due to the importance

of improvisation. The same composition sung by two artists can be different in many musical and

expressive facets. These differences may challenge the version identification methods developed for

western commercial music. In addition to the compositional forms, there are many improvisatory forms

that are performed with well-defined structural criteria [18]. Nothing has been done on these topics.

Through this thesis, we have mentioned a number of characteristics of the Carnatic music that deserve

to be studied. Given that this music tradition is so different from the ones used to develop the current

methodologies, there is a need to also deal with some more fundamental issues. We need to study

how the musical concepts and terms in Indian music are understood, specifying proper ontologies with

which to frame our work. Also the cultural and community aspects of the music are so important that

without studying them we will not be able to develop proper musical models. In summary, to approach the computational modeling of Carnatic music in a way that does justice to its richness, it is fundamental to take a cultural approach and thus take into account musicological and contextual information.

To conclude, in this thesis, we have made the following contributions.

1. Strong theoretical arguments are presented to show that the term rasa cannot be used in the context of music, and with the help of a behavioural study, raagas in Carnatic music are shown to evoke feelings to a certain extent.

2. A survey of the state of the art in raaga recognition is presented, identifying the problems to be addressed.

3. Based on an existing raaga recognition system for Hindustani music, a system with several improvements is built for Carnatic music and tested on a comprehensive real-world dataset.

4. Ground-truth data drawn from real stage concerts and CD recordings is contributed, making it the most diverse and extensive dataset for Carnatic raagas to date.

5. A brief discussion of a few standing debates, such as that on the 22 srutis, is presented, concluding with the future steps necessary to resolve such debates.

6. A discussion on several open problems in Carnatic music, to be explored computationally.

5.1 Impact of this work and the future directions

Possible applications of our work include music recommendation systems based on mood and raaga, learning aids that help students visualize the feedback from their practice sessions, automatic digitization and archiving of large amounts of music data with correct metadata, and analysis of various artistic styles.

The data used to test raaga recognition systems so far covers very few raagas compared to the hundreds of raagas in Carnatic music. The first future step for this raaga recognition technique is to consider a much larger and more diverse dataset of, say, 100 raagas. This step is not as straightforward as it sounds. In Carnatic music, the same kruti is often sung by different artists on many occasions, and new krutis are uncommon. The dataset gets biased if we include two versions of the same kruti. To handle this, we need to extract those sections of a rendition which are not pre-composed. Alapana and Swarakalpana are two such sections. One possible step would be to extract such sections programmatically from a huge pool of renditions. As the data grows, there will arise a need to exploit the hitherto unexploited properties of the raaga.

Cognitive and behavioural aspects of Carnatic music need to be studied in a systematic manner.

These studies will shed light on aspects such as the perception of gamakas, the extent to which the variations of a swara are normally allowed, and the association between raaga and emotion.

5.2 A few guidelines for future students/researchers

The thesis has been a huge learning activity for us. It included a number of domains - musicology,

cognition, music performance, signal processing, machine learning and pattern recognition. During the

course of this thesis work, we have had several experiences which helped us to get an overview of the

scientific research in Indian classical music. We would like to draw the attention of the reader, especially of those who are working in this field, to a few points.

The first point concerns the flexibility of an art form. Music, being a highly celebrated art form, is practically an endless creative domain. It is very natural to observe deviations from the written rules. Indian classical music in particular, being an oral tradition, depends heavily on what people perceive and transfer to the next generation. Students learn by listening to the guru sing and perform. The only feedback is the agreement with the guru. But this does not mean that rules can be broken. Good examples of this are gamakas and the swaras. The artists take the liberty of changing them a little to sound good depending on the context. This is very different from the western scenario, where music is played by reading the notation.

The second point we would like to stress is the scale used. Some artists say that it is the same as the western equi-tempered scale. Others disagree, saying it is just-intonated. Though we have spent some time trying to obtain the correct information, it appears that it is not that important, since a swarasthanam is not a fixed point anyway; it is a region. Moreover, the tuning is often based on perceptual measures rather than objective tuning instruments. For mathematical simplicity, we have used the equi-tempered scale in our raaga recognition system. However, we do observe that there is a slight perceptual difference between the two tuning systems.

The third point, something which bothered us throughout, is this: does one need to be a musician to do research related to Indian music? There is no definitive answer. It is always better to have hands-on experience with something one works with. But a musician can be as good or bad as a non-musician at explaining the science behind it. Not being a musician should not constrain one from approaching the domain as a researcher. In this case, a lot of listening and reading of the musicological literature helps, as has been the case with us.

Appendix A

Basic Acoustics

Consider any object. Everything we observe about this object has an explanation. Studying its

physical aspects allows us to know how it behaves in response to various actions. Sound is the response of objects to natural actions or our own. Of course, not all sounds qualify as music. But the physical properties we study about an object are the same whether the sound it produces is music or noise.

Pluck a string and observe the wave pattern generated. It looks like the complex wave pattern in

Figure A.1. Our vocal cords are no different, only that they are made of muscles.

Figure A.1: Wave pattern generated by a plucked string

A.1 Demonstration of various physical properties

Like any other object, sound also has certain physical properties. Though we are intuitively very

sensitive to them, we do not always realize them. To be able to appreciate this fact better, do the

following experiments, and see how the sound changes compared to the default case.

1. Stretch the string keeping the length of the string between the two points the same.

2. Vary the length keeping the tension in the string the same. i.e., hold it at different points to vary

the length without stretching or letting it go slack.

3. Pluck it with force.

4. Change the string. If it is rubber, now consider a brass/other-material string.

See if your observations concur with the following; they probably should.

1. The sound is sharper, like your younger sibling’s shrill cry.

2. The sound is flatter than the original one when the string is longer, and sharper when it is shorter.

3. It is louder.

4. It is a different sound altogether, though sharpness/flatness may be the same/different compared

to the original one.

Now, we’ll learn the physical properties which are involved in these observations. Any sound is caused because the source vibrates and the vibrations are carried to our ear drum, which vibrates in sympathy. As a result of the vibrations of the source, alternating high and low pressure regions are created next to it. The particles of the medium, typically air, are set in motion and disturb their adjacent particles in turn. In this way the whole region around the source is set to vibrate. Observe the waves shown in Figure A.2.

Figure A.2: Condensation and rarefaction represented as a sine wave

A.2 Sine waves

The wave in Figure A.2 can be represented as a series of crests and troughs, as shown. In mathematical terminology, it is called a sine wave. It has certain physical properties. It is periodic, i.e., it is nothing but a pattern that is copied over and over. The number of such repetitions passing through a point in a second is called the frequency (f). Each such repetition is called a cycle. The time it takes for a cycle to pass through a point is called the time period (T). The distance between corresponding points on two adjacent cycles is called the wavelength (λ). An example of such points would be the crests of two adjacent cycles. The height of the wave is called the amplitude (A). See Figure A.2.

The frequency is the factor which we perceive when we observe a tone to be sharper or flatter. The higher the frequency, the sharper the sound. Time period and wavelength are inversely proportional to the frequency. The amplitude is perceived as the volume or loudness. The greater the amplitude, the louder the sound. See Figure A.3, for example.

Figure A.3: Examples of sine waves with high and low frequencies
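The relations between frequency, time period, wavelength and amplitude can be checked numerically with a small Python sketch. The speed of sound in air (343 m/s) is an assumed typical value that does not come from the text, and the function names are chosen only for illustration.

```python
import math


def period(f_hz):
    """Time period T = 1 / f, in seconds."""
    return 1.0 / f_hz


def wavelength(f_hz, v=343.0):
    """Wavelength = v / f; v = 343 m/s is an assumed speed of sound in air."""
    return v / f_hz


def sine_sample(t, f_hz, amplitude=1.0):
    """Value of a sine wave of the given frequency and amplitude at time t."""
    return amplitude * math.sin(2.0 * math.pi * f_hz * t)
```

Doubling the frequency halves both the time period and the wavelength, and the wave reaches its amplitude a quarter of a period after it starts.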

Normally when we speak or sing, or when an instrument is being played, these factors keep changing and we perceive them collectively as noise, speech or music.

A.3 Harmonics

But we have one more question left to be answered. Why do two sounds produced by different

materials sound different when all these factors are made equal? Let us experiment again. Set a string

to vibrate and observe it. It might be difficult to see with naked eye, but the vibration does not look like

a perfect single sine wave. As said earlier, it is a complex wave pattern. When a string is plucked, there

are various modes in which it can vibrate. See Figure A.4. These are called harmonics; the frequency of the wave with the longest wavelength, λ1, is called the first harmonic or the fundamental.

Figure A.4: Possible harmonics in a given string

Now let’s see Figure A.1 again. When we pluck a string, we can observe it vibrating not exactly

like a simple sine wave, but something more complex. This complexity arises from the mixture of other

harmonics with the fundamental. Given that the two ends of the vibrating string are fixed, the length

and other physical properties of the string will have a crucial effect on the properties of wave generated

with it. A string of length L can only have waves of wavelength 2L, 2L/2, 2L/3, 2L/4, etc. Frequencies

corresponding to the fundamental and second, third and fourth harmonics are given in the Table A.1. v

is the velocity with which the wave travels.

$$\text{Frequency} = \frac{\text{velocity}}{\text{wavelength}} \tag{A.1}$$

Table A.1: Fundamental and its harmonics

| Harmonic | Frequency | In terms of f1 |
|---|---|---|
| Fundamental or first harmonic | v/2L | f1 |
| Second harmonic or first overtone | 2v/2L | 2f1 |
| Third harmonic or second overtone | 3v/2L | 3f1 |
| Fourth harmonic or third overtone | 4v/2L | 4f1 |
| nth harmonic or (n-1)th overtone | nv/2L | nf1 |
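The harmonic series of Table A.1 follows directly from Eq. A.1 with wavelengths 2L/n, as the Python sketch below illustrates. The function name and the example values (a 0.5 m string with a wave speed of 220 m/s) are hypothetical.

```python
def harmonic_frequencies(length_m, wave_speed, n_harmonics=4):
    """Frequencies n * v / (2L) of the first n harmonics of a string fixed at both ends."""
    f1 = wave_speed / (2.0 * length_m)  # fundamental: wavelength 2L
    return [n * f1 for n in range(1, n_harmonics + 1)]
```

For the example values above, the fundamental is 220 Hz and the next three harmonics are its integer multiples.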

A.4 Timbre

Due to the fact that the force between molecules varies according to the material, the nature of wave

propagation is affected. In turn, the harmonic pattern is affected. Thus, the nature of these harmonics

varies from material to material. It is this property that distinguishes the sound waves of varied origins

(e.g.: rubber and steel). It is called the timbre. This is also one of the important reasons why we are able

to discriminate between two persons speaking.

Let us now look at the frequency measures in use.

A.5 Frequency measures

Frequency is measured in Hertz and Cents. Hertz is a linear scale measure whereas cents is used

to measure the interval between two frequency values in logarithmic scale. The interval in cents is

calculated using the following formula.

$$\text{Value in cents } (C) = \log_{10}\left(\frac{a}{b}\right) \times k \tag{A.2}$$

Here $a$ and $b$ are the frequencies and $k$ is a constant whose value is $\frac{1200}{\log_{10}(2)}$.

For calculation purposes, if one has to deal with more than one octave, which usually is the scenario,

it is often advisable to use cents. Let us look at a quick example. Consider two octaves, one from 10 Hz to 20 Hz and another from 1000 Hz to 2000 Hz. If we take the absolute differences between the fifth note and the tonic, they would be (10 × 3/2 − 10 = 5 Hz) and (1000 × 3/2 − 1000 = 500 Hz) respectively. But if we take the interval in cents, both will be the same.

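$$D_1 = \frac{1200}{\log_{10}(2)} \times \log_{10}\left(\frac{15}{10}\right) = 701.95 \text{ cents}$$

$$D_2 = \frac{1200}{\log_{10}(2)} \times \log_{10}\left(\frac{1500}{1000}\right) = 701.95 \text{ cents}$$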

The cents measure will also enable us to know the number of octaves between two frequency values.

Each octave has an interval of 1200 cents. So, for example the number of octaves between 12 Hz and

16400 Hz would be,

$$\text{Number of octaves} = \frac{\frac{1200}{\log_{10}(2)} \times \log_{10}\left(\frac{16400}{12}\right)}{1200} = \frac{\log_{10}(16400/12)}{\log_{10}(2)} \approx 10.42 \text{ octaves.}$$
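The two calculations above can be reproduced with a short Python sketch of Eq. A.2. The function names are hypothetical; the formula is the one given in the text, with base-10 logarithms.

```python
import math


def interval_cents(a, b):
    """Interval from frequency b up to frequency a in cents (Eq. A.2)."""
    k = 1200.0 / math.log10(2.0)
    return k * math.log10(a / b)


def octaves_between(low_hz, high_hz):
    """Number of octaves between two frequencies (1200 cents per octave)."""
    return interval_cents(high_hz, low_hz) / 1200.0
```

Here interval_cents(15, 10) and interval_cents(1500, 1000) agree, since the measure depends only on the frequency ratio, and octaves_between(12, 16400) gives the octave count of the example.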

This knowledge should be sufficient to understand the chapters in this thesis. We recommend that the reader refer to relevant books [30] for other concepts as and when required. We will now introduce the scales used in Carnatic music and then understand raaga and its properties.

A.6 Tuning systems

A.6.1 Equal-temperament

This is a western standard. It is preferred mostly due to its mathematical simplicity. In this scale, all

the 12 notes in the scale are equally spaced. That is, the ratio between 2nd and 1st note is the same as

the ratio between 3rd and 2nd note. We can derive the value corresponding to this ratio. Say $x_i$ is the $i$th frequency value.

$$k = \frac{x_2}{x_1} = \frac{x_3}{x_2} = \frac{x_4}{x_3} = \dots = \frac{x_{12}}{x_{11}} = \frac{2 x_1}{x_{12}}$$

$$x_2 = k x_1, \quad x_3 = k x_2, \quad \dots, \quad x_{12} = k x_{11}, \quad \text{and} \quad 2 x_1 = k x_{12}$$

So, $x_{12} = k^{11} x_1$

Hence $k x_{12} = k^{12} x_1$

$$2 x_1 = k^{12} x_1$$

$$k = \sqrt[12]{2}$$

So, in the equi-tempered scale, each subsequent note is obtained by multiplying the current note with $k$, where $k = \sqrt[12]{2}$.
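The derivation can be verified numerically with a minimal Python sketch that builds the scale by repeated multiplication with k. The 220 Hz tonic is chosen here only because it is the reference used elsewhere in the thesis; the function name is hypothetical.

```python
def equal_tempered_scale(tonic_hz=220.0, n_notes=12):
    """Frequencies of the tonic, the n_notes - 1 notes above it, and the octave."""
    k = 2.0 ** (1.0 / n_notes)  # the twelfth root of 2 for a 12-note scale
    return [tonic_hz * k ** i for i in range(n_notes + 1)]
```

The last element of the returned list is the octave, twice the tonic, which confirms that twelve steps of ratio k span a factor of 2.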

A.6.2 Just-intonation

On the other hand, the just-intonation tuning system uses ratios of small integers as intervals between two notes of the scale. There are a number of ways to tune using small integer ratios. But the key essence of this tuning method is to use these ratios instead of a geometric progression to find the intervals.
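To give a feel for how the two systems differ, the Python sketch below compares one possible set of small-integer ratios with the equi-tempered grid. The particular ratios are an assumption: they are one commonly quoted just-intonation choice, not a set specified in this thesis, and other choices are equally valid.

```python
import math
from fractions import Fraction

# One commonly quoted set of just ratios (assumed for illustration only).
JUST_RATIOS = [
    Fraction(1, 1), Fraction(16, 15), Fraction(9, 8), Fraction(6, 5),
    Fraction(5, 4), Fraction(4, 3), Fraction(45, 32), Fraction(3, 2),
    Fraction(8, 5), Fraction(5, 3), Fraction(9, 5), Fraction(15, 8),
]


def deviation_from_equal_temperament(ratios=JUST_RATIOS):
    """Cents by which each just interval differs from its equi-tempered counterpart."""
    devs = []
    for i, r in enumerate(ratios):
        just_cents = 1200.0 * math.log2(float(r))
        devs.append(just_cents - 100.0 * i)
    return devs
```

Under this assumed set, the fifth (3/2) lies about 2 cents above its equi-tempered counterpart and the major third (5/4) about 14 cents below, and no note deviates by as much as 20 cents.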

Bibliography

[1] P. S. R. Apparao. Natyasastramu. Natyamala Publications, 1959.

[2] L. L. Balkwill and W. F. Thompson. A cross-cultural investigation of the perception of emotion in music:

Psychophysical and cultural cues. Music Perception, 17(1):43–64, 1999.

[3] L. L. Balkwill, W. F. Thompson, and R. Matsunaga. Recognition of emotion in Japanese, Western, and Hindustani music by Japanese listeners. Japanese Psychological Research, 46(4):337–349, 2004.

[4] S. Belle, R. Joshi, and P. Rao. Raga Identification by using Swara Intonation. Journal of ITC Sangeet

Research Academy, 23, 2009.

[5] V. N. Bhatkande. Hindusthani Sangeet Paddhati. Sangeet Karyalaya, 1934.

[6] P. Chordia. Segmentation and recognition of tabla strokes. In Proc. of ISMIR, pages 107–114, 2005.

[7] P. Chordia, A. Albin, A. Sastry, and T. Mallikarjuna. Multiple viewpoints modeling of tabla sequences. In Proc. of ISMIR, pages 381–386, 2010.

[8] P. Chordia and A. Rae. Raag recognition using pitch-class and pitch-class dyad distributions. In Proc. of

ISMIR, pages 431–436, 2007.

[9] P. Chordia and A. Rae. Understanding emotion in raag: An empirical study of listener responses. Computer

Music Modeling and Retrieval, pages 110–124, 2008.

[10] P. Chordia and A. Rae. Tabla Gyan : An Artificial Tabla Improviser. In International Conference on

Computational Creativity, 2010.

[11] M. Clayton. Time in Indian Music : Rhythm , Metre and Form in North Indian Rag Performance. Oxford

University Press, 2000.

[12] D. Das and M. Choudhury. Finite State Models for Generation of Hindustani Classical Music. In Proceedings of International Symposium on Frontiers of Research in Speech and Music, 2005.

[13] A. Datta, R. Sengupta, N. Dey, and D. Nag. Experimental Analysis of Shrutis from Performances in Hindustani Music. Scientific Research Department, ITC Sangeet Research Academy, 2006.

[14] P. Ekman. An argument for basic emotions. Cognition & Emotion, 6(3):169–200, 1992.

[15] G. Geekie. Carnatic ragas as music information retrieval entities. In Proc. of ISMIR, pages 257–258, 2002.

[16] O. Gillet and G. Richard. Automatic labelling of tabla signals. In Proc. of ISMIR, 2003.

[17] E. Hanslick, G. Cohen, and M. Weitz. The beautiful in music. Liberal Arts Press, New York, 1957.

[18] S. R. Janakiraman. Essentials of Musicology in South Indian Music. The Indian Music Publishing House,

2008.

[19] P. Juslin. Music and Emotion: Theory and Research. Oxford University Press, November 2001.

[20] A. Kapur, P. Davidson, P. Cook, W. Schloss, and P. Driessen. Preservation and extension of traditional techniques: digitizing north indian performance. Journal of New Music Research, 34(3):227–236, 2005.

[21] G. K. Koduri and B. Indurkhya. A Behavioral Study of Emotions in South Indian Classical Music and its

Implications in Music Recommendation Systems. In SAPMIA, ACM Multimedia, pages 55–60, 2010.

[22] A. Krishnaswamy. On the twelve basic intervals in south indian classical music. 10 2003.

[23] A. Krishnaswamy. Inflexions and Microtonality in South Indian Classical Music. In Frontiers of Research

on Speech and Music, 2004.

[24] A. Krishnaswamy. Melodic atoms for transcribing carnatic music. In Proc. of ISMIR, pages 345–348, 2004.

[25] A. Krishnaswamy. Multi-Dimensional Musical Atoms in South Indian Classical Music. In Proc. of the

International Conference of Music Perception & Cognition, 2004.

[26] C. L. Krumhansl. An exploratory study of musical emotions and psychophysiology. Canadian Journal of Experimental Psychology, 51(4):336–353, 1997.

[27] S. K. Langer. Philosophy in a new key: a study in the symbolism of reason, rite, and art. New York, USA: Mentor Books, 1959.

[28] K. Lee. Automatic chord recognition from audio using enhanced pitch class profile. In Proc. of the International Computer Music Conference, 2006.

[29] M. Levy and N. A. Jairazbhoy. Intonation in North Indian Music: A Select Comparison of Theories with

Contemporary Practice. Aditya Prakashan, New Delhi, 1982.

[30] G. Loy. Musimathics: the mathematical foundations of music. Vol. II. With a foreword by John Chowning.

Cambridge, MA: MIT Press, 2007.

[31] G. Pandey, C. Mishra, and P. Ipe. Tansen: A system for automatic raga identification. In Proc. of Indian

International Conference on Artificial Intelligence, pages 1350–1363, 2003.

[32] H. S. Powers. The Background of the South Indian Raaga-System. PhD thesis, Princeton University, 1959.

[33] Pratyush. Analysis and Classification of Ornaments in North Indian (Hindustani) Classical Music. Master’s

thesis, University of Pompeu Fabra, 2010.

[34] C. V. Raman. The Indian musical drums. Proceedings Mathematical Sciences, 1(3):179–188, 1934.

[35] N. Ramanathan. Shrutis according to ancient texts. Journal of the Indian Musicological Society, 12(3):31–

37, 1981.

[36] A. Rangacharya. The Natyasastra. Munshiram Manoharlal Publishers, 2010.

[37] V. Rao and P. Rao. Vocal melody extraction in the presence of pitched accompaniment in polyphonic music.

Audio, Speech, and Language Processing, IEEE Transactions on, 18(8):2145–2154, 2010.

[38] E. Rosch. Principles of Categorization. John Wiley & Sons Inc, 1978.

[39] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39:1161–1178,

1980.

[40] H. Sahasrabuddhe and R. Upadhye. On the computational model of raag music of India. In Workshop on AI and Music: European Conference on AI, 1992.

[41] P. Sambamoorthy. South Indian Music. The Indian Music Publishing House, 1998.

[42] L. A. Schmidt and L. J. Trainor. Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition & Emotion, 15(4):487–500, 2001.

[43] V. Shankar. The art and science of Carnatic music. Music Academy Madras, Chennai, 1983.

[44] M. Sharma. Tradition of Hindustani Music. A.P.H Publishing Corporation, 2006.

[45] P. Sharma and K. Vatsyayan. Brihaddeshi of Sri Matanga Muni. South Asian Books, 1992.

[46] S. Shetty and K. Achary. Raga Mining of Indian Music by Extracting Arohana-Avarohana Pattern. In International Journal of Recent Trends in Engineering, volume 1, pages 362–366. Academy Publisher, 2009.

[47] M. Sinith and K. Rajeev. Hidden Markov Model based Recognition of Musical Pattern in South Indian

Classical Music. In IEEE International Conference on Signal and Image Processing, Hubli, India, 2006.

[48] J. A. Sloboda. Music Structure and Emotional Response: Some Empirical Findings. Psychology of Music,

19(2):110–120, 1991.

[49] R. Sridhar and T. Geetha. Raga identification of Carnatic music for music information retrieval. International Journal of Recent Trends in Engineering, 1(1):571–574, 2009.

[50] M. Subramanian. Synthesizing Carnatic Music with a Computer. Sangeet Natak (Journal of the Sangeet Natak Akademi), New Delhi, 133-134(June):16–24, 1999.

[51] M. Subramanian. Carnatic Ragam Thodi Pitch Analysis of Notes and Gamakams. Journal of the Sangeet

Natak Akademi, XLI(1):3–28, 2007.

[52] A. Swartz. Musicbrainz: A semantic web service. IEEE Intelligent Systems, 17:76–77, January 2002.

[53] D. Swathi. Analysis of Carnatic Music: A Signal Processing Perspective. Master’s thesis, IIT Madras, 2009.

[54] L. J. Trainor and L. A. Schmidt. Processing emotions induced by music. In The cognitive neuroscience of

music, pages 311–324, 2003.

[55] T. Viswanathan and M. H. Allen. Music in South India. Oxford University Press, 2004.

[56] A. Wieczorkowska, A. Datta, R. Sengupta, N. Dey, and B. Mukherjee. On Search for Emotion in Hindusthani Vocal Music. Advances in Music Information Retrieval, pages 285–304, 2010.

[57] G. Wood and S. O’Keefe. On techniques for content-based visual annotation to aid intra-track music navigation, 2005.

[58] R. J. Zatorre, A. C. Evans, and E. Meyer. Neural mechanisms underlying melodic perception and memory

for pitch. Journal of Neuroscience, 14(4):1908, 1994.
