A Knowledge-Based Signal Processing Approach to Tonic Detection in Indian Classical Music

Indian Institute of Technology, Madras

July 13, 2012

Tonic Pitch histogram Approach Carnatic Music Hindustani Music Drone Extraction

Outline

Tonic

Pitch Histogram Approach

Carnatic Music

Hindustani Music

Drone Extraction


Tonic

• Tonic: the pitch chosen by the performer to serve as a reference
• The reference note is the swara Sa, also called shadja
• Accompanying instruments also tune to the tonic
• Melodies are defined relative to the tonic
• A drone (tanpura/tambura) is played to establish the tonic
• Pitch-based analysis requires normalization

[Figure: three pitch histogram panels plotted against frequency in Hz (0 to 600 Hz), and three panels plotted against frequency in cents (−1000 to 1500 cents).]
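The normalization mentioned above is commonly done by expressing pitch in cents relative to the tonic, so that recordings with different tonics share one axis. The following is a minimal sketch, assuming the standard definition of 1200 cents per octave; the function name `hz_to_cents` and the sample frequencies are illustrative and do not come from the slides.

```python
import numpy as np

def hz_to_cents(freq_hz, tonic_hz):
    """Convert frequencies in Hz to cents relative to the tonic (0 cents = Sa)."""
    freq_hz = np.asarray(freq_hz, dtype=float)
    return 1200.0 * np.log2(freq_hz / tonic_hz)

# Hypothetical example: tonic at 140 Hz, a note at ratio 1.5 (Pa), and Sa one octave up.
cents = hz_to_cents([140.0, 210.0, 280.0], tonic_hz=140.0)
```

With this mapping the tonic sits at 0 cents and its octave at 1200 cents, whatever the performer's chosen pitch.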


Approaches to detect tonic

• Two approaches were attempted:

1. Exploit characteristics of the music to detect the tonic
   • Use the pitch of entire recordings
   • Intra-piece and inter-piece analysis
   • Histograms are the primary form of representation

2. Extract the drone in the background to detect the tonic


Pitch histogram Approach

• A single pitch is extracted using YIN¹

• Observations from typical pitch histograms of Carnatic music:
  • Shadja has a prominent peak, though not necessarily the most dominant one
  • Shadja and the swara Pa are less inflected
  • Histograms are continuous because of gamakas
  • The tanpura and percussion accentuate the shadja peak

• The objective is to exploit these characteristics to detect the tonic
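A pitch histogram of the kind described above can be sketched as follows. This assumes a pitch track in Hz is already available (for example from a YIN implementation) with unvoiced frames marked as 0; the function name, the 1 Hz bin width and the frequency range are illustrative assumptions, not values from the slides.

```python
import numpy as np

def pitch_histogram(pitch_hz, fmin=50.0, fmax=600.0, bin_width=1.0):
    """Histogram of pitch values, normalized so that the tallest bin is 1."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    # Drop unvoiced (0 Hz) and out-of-range frames.
    voiced = pitch_hz[(pitch_hz >= fmin) & (pitch_hz < fmax)]
    edges = np.arange(fmin, fmax + bin_width, bin_width)
    counts, edges = np.histogram(voiced, bins=edges)
    hist = counts / counts.max() if counts.max() > 0 else counts.astype(float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist

# Hypothetical pitch track: 50 frames near 220 Hz, 30 frames at 146.5 Hz, 20 unvoiced.
track = np.array([220.2] * 50 + [146.5] * 30 + [0.0] * 20)
centers, hist = pitch_histogram(track)
```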

[Figure: three pitch histogram panels (x-axis: frequency in Hz, 0 to 600 Hz) and a percussion histogram (x-axis: frequency in Hz, 0 to 300 Hz) with the tonic marked.]

¹ A. de Cheveigné and H. Kawahara. YIN, a fundamental frequency estimator for speech and music. Journal of the Acoustical Society of America, 111(4):1917–1930, 2002.


Group Delay Histograms

• The group delay function is the negative derivative of the phase spectrum¹

• Used to study spectral resonances with high resolution
• Group delay processing of a synthetic histogram shows:
  • Ability to resolve peaks
  • Emphasis on peaks with narrower bandwidth

[Figure: synthesized histogram, GD synthesized histogram, and peaks, over the range 100 to 400.]
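The definition of group delay as the negative derivative of the phase spectrum can be illustrated on an ordinary sequence. The sketch below uses the well-known identity that avoids explicit phase unwrapping; it shows only the definition. The slides do not state how a histogram is turned into a signal for group delay processing, so that step is not attempted here, and the function name `group_delay` is illustrative.

```python
import numpy as np

def group_delay(x, n_fft=None):
    """Group delay (in samples) of a finite sequence x, one value per DFT bin.

    Uses tau[k] = Re(Y[k] * conj(X[k])) / |X[k]|^2, where X is the DFT of x[n]
    and Y is the DFT of n * x[n]. This equals -d(phase)/d(omega).
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.fft(x, n_fft)
    Y = np.fft.fft(n * x, n_fft)
    # Guard against division by zero at spectral nulls.
    power = np.maximum(np.abs(X) ** 2, 1e-12)
    return np.real(Y * np.conj(X)) / power

# A unit impulse delayed by 3 samples has linear phase, hence constant group delay.
x = np.zeros(16)
x[3] = 1.0
tau = group_delay(x)
```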

¹ Hema A. Murthy and B. Yegnanarayana. Group delay functions and its application to speech processing. Sadhana, 36(5):745–782, November 2011.


Concert Method

• Tonic detection for a whole CD or a concert

• Database in the form of audio CDs or recordings of concerts

• The tonic is kept constant across pieces

• A CD or concert consists of a number of pieces in different ragas

• Shadja is present in the percussion and the drone, and in every raga

• Not necessarily the case with the other swaras


Concert Method

• Compute the histogram or GD histogram for each piece
• Multiply the histograms bin-wise
• The bin of the tallest peak in the product histogram gives the tonic

[Figure: GD histograms of Tracks 1, 2, 3 and 4 and their product, with Sa marked in each panel; x-axis: frequency in Hz (50 to 500 Hz).]

• Shadja might not have the tallest peak in the individual histograms
• Multiplying them makes the shadja peak the tallest
• Histograms and group delay histograms perform equally well
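The bin-wise product at the heart of the concert method can be sketched as below. Plain histograms on a shared set of bins are assumed. The log-domain accumulation and the `floor` value are implementation assumptions added here so that a single empty bin does not zero the product; the toy histograms are invented for illustration.

```python
import numpy as np

def concert_tonic(histograms, bin_centers, floor=1e-6):
    """Multiply per-piece histograms bin-wise; return the frequency of the tallest peak.

    The product is accumulated as a sum of logs to avoid underflow. `floor` is an
    assumption (not from the slides) that keeps empty bins from zeroing the product.
    """
    hists = np.asarray(histograms, dtype=float)
    log_product = np.sum(np.log(np.maximum(hists, floor)), axis=0)
    return bin_centers[int(np.argmax(log_product))]

# Hypothetical 5-bin histograms for three pieces. The 150 Hz bin is never the
# tallest in any single piece, but it is consistently strong in all of them.
bin_centers = np.array([100.0, 150.0, 200.0, 225.0, 300.0])
pieces = [
    [0.1, 0.8, 1.0, 0.2, 0.1],
    [0.2, 0.8, 0.1, 1.0, 0.3],
    [1.0, 0.8, 0.2, 0.1, 0.2],
]
tonic = concert_tonic(pieces, bin_centers)
```

In this toy case the product favours the 150 Hz bin because it is the only one that stays high across all pieces.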


Segmented Histograms Method

• Piecewise tonic detection
• Attempts to detect shadja even when it is not the most dominant note

[Figure: segment histograms and segment GD histograms for four segments, and the product of the segment GD histograms with the tonic marked; x-axis: frequency in Hz (0 to 500 Hz).]

• The piece is divided into segments and a group delay histogram is computed for each
• The drone and the mridanga ensure a local peak in each segment histogram
• The group delay function accentuates peaks with narrower bandwidth
• Multiplying the segment histograms yields the tallest peak at the shadja bin
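The segmentation and multiplication steps can be sketched as follows. This uses plain per-segment histograms rather than group delay histograms, which is a simplification. The number of segments, the 1 Hz bins and the `floor` value are assumptions for illustration, and the outcome depends on the choice of floor.

```python
import numpy as np

def segmented_tonic(pitch_hz, n_segments=4, fmin=50.0, fmax=500.0, floor=1e-3):
    """Split a pitch track into segments, histogram each, and multiply bin-wise."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    edges = np.arange(fmin, fmax + 1.0, 1.0)  # 1 Hz bins
    log_product = np.zeros(len(edges) - 1)
    for segment in np.array_split(pitch_hz, n_segments):
        counts, _ = np.histogram(segment, bins=edges)
        peak = counts.max()
        hist = counts / peak if peak > 0 else np.ones(len(counts))
        # Sum of logs equals the log of the bin-wise product.
        log_product += np.log(np.maximum(hist, floor))
    best = int(np.argmax(log_product))
    return 0.5 * (edges[best] + edges[best + 1])

# Hypothetical track: 200.5 Hz dominates overall, but 150.5 Hz occurs in every segment.
track = np.concatenate([
    [200.5] * 70 + [150.5] * 30,
    [200.5] * 70 + [150.5] * 30,
    [200.5] * 70 + [150.5] * 30,
    [225.5] * 70 + [150.5] * 30,
])
tonic = segmented_tonic(track)
```

Here the note that dominates the whole track is missing from one segment, so the product favours the note that is present in every segment.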


Template Matching Method

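• Piecewise tonic detection
• Identify local peaks in the group delay histogram
• The bin values of these peaks are the candidate shadja values
• Create a Sa-Pa-Sa template for each of the candidates¹
• For a candidate shadja $S_j$ of frequency $f_j$:

\[
\left[
\begin{array}{cccccc}
0.5\,f_j & 0.75\,f_j & f_j & 1.5\,f_j & 2\,f_j & 3\,f_j \\
S_j^{\text{lower}} & P_j^{\text{lower}} & S_j & P_j & S_j^{\text{higher}} & P_j^{\text{higher}}
\end{array}
\right]
\]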

[Figure: GD histogram with local peaks marked; x-axis: frequency in Hz (50 to 500 Hz).]

• The candidate with the best template fit is taken as the shadja

¹ S. Arthi, H. G. Ranjani, and T. V. Sreenivas. Shadja, swara identification and raga verification in alapana using stochastic models. WASPAA 2011, pages 29–32, 2011.


Template Matching Method

[Figure: GD histogram with local peaks marked, followed by two further panels; x-axis: frequency in Hz (50 to 500 Hz).]

• This method attempts to exploit:
  • The fixed ratio between shadja and Pa
  • The less inflected nature of shadja and Pa
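One way to score a candidate against the Sa-Pa template is sketched below. The slides do not say how the "best template fit" is measured, so the score used here (the sum of the tallest histogram values found near each template frequency) and the tolerance `tol_hz` are assumptions. The histogram values are invented for illustration.

```python
import numpy as np

# Ratios of the six-element template relative to the candidate frequency f_j:
# lower Sa, lower Pa, Sa, Pa, higher Sa, higher Pa.
TEMPLATE_RATIOS = np.array([0.5, 0.75, 1.0, 1.5, 2.0, 3.0])

def template_score(hist, bin_centers, candidate_hz, tol_hz=2.0):
    """Sum, over template positions, of the tallest histogram value within tol_hz."""
    score = 0.0
    for target in TEMPLATE_RATIOS * candidate_hz:
        near = np.abs(bin_centers - target) <= tol_hz
        if near.any():  # template positions outside the histogram range add nothing
            score += hist[near].max()
    return score

def best_candidate(hist, bin_centers, candidates_hz):
    """Return the candidate frequency whose template has the highest score."""
    scores = [template_score(hist, bin_centers, f) for f in candidates_hz]
    return candidates_hz[int(np.argmax(scores))]

# Hypothetical histogram on a 1 Hz grid. The tallest peak is at 200 Hz, but the
# peaks at 75, 150, 225, 300 and 450 Hz line up with the template for 150 Hz.
bin_centers = np.arange(50.0, 501.0, 1.0)
hist = np.zeros_like(bin_centers)
for freq, height in [(75.0, 0.4), (150.0, 0.7), (200.0, 1.0),
                     (225.0, 0.8), (300.0, 0.5), (450.0, 0.2)]:
    hist[bin_centers == freq] = height
shadja = best_candidate(hist, bin_centers, [150.0, 200.0, 225.0])
```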


Results of Pitch Histogram Approach on Carnatic Dataset

• 78 randomly chosen concerts comprising 722 pieces in total

• Concerts are of varying quality

• 44 male and 13 female artists, 21 instrumental leads

• Ground truth estimated manually

Method                        GD Histogram   Histogram
Concert                       100%           100%
Template Matching (T1)        92.37%         91.66%
Template Matching (T2)        95%            92.17%
Unsegmented Histograms (US)   90.69%         87.14%
Segmented Histograms (S)      95.28%         87.14%

Template T1 = Pm S P St Pt

Template T2 = Sm Pm S P St Pt


Pitch Histogram method on Hindustani Music

• The above methods perform poorly on Hindustani music
• Concert method
  • Fewer pieces
  • A swara other than shadja might dominate across pieces
• Segmented histogram and template methods
  • Notes are not as heavily inflected as in Carnatic music
  • The group delay function does not succeed in emphasizing the shadja peak


vAdi samvAdi Template Matching

• Use raga information to detect the tonic
• vAdi and samvAdi: the dominant notes of a raga
• vAdi- and samvAdi-based template matching

Template TVS = [S vAdi samvAdi St]

Example: candidate peak $S_j$ of frequency $f_j$ for raga Darbari:

\[
\left[
\begin{array}{cccc}
f_j & 1.1225\,f_j & 1.5\,f_j & 2\,f_j \\
S_j & R2_j & P_j & S_j^{\text{higher}}
\end{array}
\right]
\]

• The methods were tested on 126 pieces of Hindustani music
• Drawback: raga information is required

Method                     Accuracy (GD histogram)
Template Matching (T1)     66%
Template Matching (TVS)    84.9%
Segmented Histogram (S)    62%


Drone Extraction Approach

• The tanpura (drone) serves as a cue to detect the tonic

• The drone is present in the background throughout the concert

• The pitch of the drone is not captured by single-pitch extraction methods in the presence of vocal or instrumental music

• Objective: locate frames in which the sound of the drone is prominently audible

• Process these frames to detect the tonic


Spectral Entropy and Short Term Energy

• Frames with a prominently audible drone were observed to have:
  • Low short-term energy
  • High spectral entropy

\[
PSD_n[k] = \frac{|X_n[k]|^2}{\sum_{k'} |X_n[k']|^2}
\]

\[
SE[n] = -\sum_{k} PSD_n[k] \, \log PSD_n[k] \tag{1}
\]

• $X_n$ is the DFT of the $n$-th frame; $k$ is the index of the DFT bin.
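Equation (1) and the short-term energy can be computed per frame as in the sketch below. It assumes the signal has already been cut into frames (one per row). The use of the one-sided `rfft`, the natural logarithm and the small constants that avoid `log(0)` are implementation assumptions.

```python
import numpy as np

def spectral_entropy_and_energy(frames):
    """Per-frame spectral entropy SE[n] and short-term energy.

    frames: 2-D array with one time-domain frame per row.
    """
    frames = np.asarray(frames, dtype=float)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # |X_n[k]|^2
    total = np.maximum(power.sum(axis=1, keepdims=True), 1e-12)
    psd = power / total                                       # PSD_n[k], sums to 1 per frame
    entropy = -np.sum(psd * np.log(psd + 1e-12), axis=1)      # SE[n]
    energy = np.sum(frames ** 2, axis=1)
    return entropy, energy

# Two hypothetical 64-sample frames: a unit impulse (flat spectrum, maximal entropy)
# and a pure sinusoid (one spectral line, entropy close to zero).
n = np.arange(64)
impulse = np.zeros(64)
impulse[0] = 1.0
sinusoid = np.sin(2 * np.pi * 8 * n / 64)
entropy, energy = spectral_entropy_and_energy([impulse, sinusoid])
```

The two frames sit at opposite ends of the measure: a spectrum spread evenly over all bins gives the largest entropy, while a spectrum concentrated in one bin gives an entropy near zero.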

[Figure: waveform and zero-mean FFT entropy; x-axis: samples (×10⁶), y-axis: amplitude.]


• High-entropy, low-energy regions are identified as regions of prominent drone sound

• A pitch histogram is computed over these regions

[Figure: pitch histogram and high-entropy pitch histogram, with the tonic marked; x-axis: frequency in Hz (0 to 350 Hz).]

• Mostly the lower-octave shadja pitch is registered in regions of prominent drone

• Only a small excerpt of a piece is required to detect the tonic pitch
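The frame selection and the restricted histogram can be sketched as follows. The slides do not give the thresholds for "high" entropy and "low" energy, so the percentile cut-offs used here are assumptions, as are the function name and the toy values.

```python
import numpy as np

def drone_frame_pitch_histogram(pitch_hz, entropy, energy,
                                entropy_pct=75.0, energy_pct=25.0,
                                fmin=50.0, fmax=350.0):
    """Histogram of pitch restricted to high-entropy, low-energy frames."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    entropy = np.asarray(entropy, dtype=float)
    energy = np.asarray(energy, dtype=float)
    # Percentile thresholds are an assumption, not taken from the slides.
    selected = ((entropy >= np.percentile(entropy, entropy_pct)) &
                (energy <= np.percentile(energy, energy_pct)))
    edges = np.arange(fmin, fmax + 1.0, 1.0)  # 1 Hz bins
    counts, _ = np.histogram(pitch_hz[selected], bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts, selected

# Hypothetical per-frame values: six loud, low-entropy frames at 220.5 Hz and
# two quiet, high-entropy frames at 110.5 Hz.
pitch = [220.5] * 6 + [110.5] * 2
entropy = [1.0] * 6 + [3.0] * 2
energy = [5.0] * 6 + [1.0] * 2
centers, counts, selected = drone_frame_pitch_histogram(pitch, entropy, energy)
drone_pitch = centers[int(np.argmax(counts))]
```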


Results of Pitch Histogram Approach on MTG Dataset

• Tonic identification was attempted on MTG phase 1 tonic database

• Audio CD recordings

• Pieces are 3-minute excerpts extracted from any part of a recording

• Preliminary Results

Music        Number of Files   Accuracy
Carnatic     344               88%
Hindustani   195               87.2%


NMF - Creating Dictionary

• Frames with high entropy and low energy are identified¹

• The magnitude spectrum is estimated on this bag of frames

• NMF is applied to the magnitude spectrum

• A set of basis vectors is estimated

• The basis vectors are referred to as the tanpura dictionary

• They represent the space in which the tanpura lies in terms of spectral vectors

¹ P. Smaragdis, B. Raj, and M. V. Shashanka. Supervised and semi-supervised separation of sounds from single-channel mixtures. ICA and Signal Separation, 2007.


NMF - Projecting along the Dictionary

• NMF is applied to the magnitude spectrum of the original waveform

• The basis vectors corresponding to the tanpura dictionary are kept constant

• The weights for these vectors, H′, are estimated

• Compute S′ = W′H′ and synthesize the waveform

• This is a projection along the 'tanpura direction'
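The two NMF stages (learning a dictionary from drone-dominated frames, then estimating weights with the dictionary held fixed) can be sketched as below. This uses the standard multiplicative updates for the Euclidean cost as a stand-in; the cited work uses a related probabilistic formulation, so this is not the authors' implementation. The rank, iteration count and random input matrices are assumptions, and resynthesis of the waveform (which needs phase information) is not shown.

```python
import numpy as np

def nmf(V, W=None, rank=4, n_iter=200, update_W=True, seed=0, eps=1e-9):
    """Multiplicative-update NMF: find W, H >= 0 such that V is close to W @ H.

    If W is given and update_W is False, only the weights H are estimated.
    """
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    if W is None:
        W = rng.random((V.shape[0], rank)) + eps
    else:
        W = np.array(W, dtype=float)  # copy so the caller's dictionary is untouched
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical magnitude spectra (frequency bins x frames); random stand-ins for real data.
rng = np.random.default_rng(1)
drone_frames = rng.random((33, 20))   # spectra of high-entropy, low-energy frames
full_spec = rng.random((33, 50))      # spectrogram of the whole excerpt

W_tanpura, _ = nmf(drone_frames, rank=4)                    # tanpura dictionary
_, H_prime = nmf(full_spec, W=W_tanpura, update_W=False)    # weights, dictionary fixed
S_prime = W_tanpura @ H_prime                               # S' = W'H'
```

Holding the dictionary fixed in the second call is what restricts the reconstruction to the span of the tanpura basis vectors.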


• Pitch extracted from the synthesized waveform:

[Figure: two panels, "Pitch Contour" and "Pitch Contour after NMF"; y-axis range 0 to 500.]


Conclusions

• Pitch histograms work well for Carnatic music

• Performance on Hindustani music improves with raga information

• Processing the drone to establish the tonic gives promising results

• Drone extraction together with NMF can establish the tonic from very small excerpts
