
Methods for Modeling Realistic Playing in Acoustic Guitar Synthesis

Mikael Laurson,* Cumhur Erkut,† Vesa Välimäki,† and Mika Kuuskankare*

*Center for Music and Technology, Sibelius Academy, Helsinki, Finland
http://cmt.siba.fi
†Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, Espoo, Finland
http://www.acoustics.hut.fi/

Computer Music Journal, 25:3, pp. 38–49, Fall 2001. © 2001 Massachusetts Institute of Technology.

Sound synthesis based on physical modeling of stringed instruments has been an active research field for the last decade. The most efficient synthesis models have been obtained using the theory of digital waveguides (Smith 1992). Commuted waveguide synthesis (Smith 1993; Karjalainen et al. 1993) is based on the linearity and time-invariance of the synthesis model and is an important method for developing a generic string instrument model. Recently, such a model has been presented including consolidated pluck and body wavetables, a pluck-shaping filter, a pluck-position comb filter, string models with loop filters and continuously variable delays, and sympathetic couplings between the strings (Karjalainen et al. 1998).

Our model is realized in a real-time software synthesizer called PWSynth. PWSynth is a user library for PatchWork (Laurson 1996) that attempts to effectively integrate computer-assisted composition and sound synthesis. PWSynth is a part of our project that investigates different control strategies for physical models of musical instruments. PatchWork is used also to generate control data from an extended score representation, the Expressive Notation Package (ENP) (Laurson et al. 1999; Kuuskankare and Laurson 2000; Laurson 2000).

Calibration of the synthesis model is based on the analysis of recorded guitar tones (Välimäki et al. 1996; Tolonen 1998). A recent article (Erkut et al. 2000) addressed the revision of the calibration process to improve efficiency and robustness. It also proposed extended methods to capture information about performance characteristics such as different pluck styles, vibrato, and dynamic variations of a professional player. In addition, the article presented basic techniques for simulation of the transients. Instead of using a detailed finger–string interaction model like that proposed by Cuzzucoli and Lombardo (1999), the simulation consolidates all the transient effects into the excitation signal and the update trajectories of the model parameters.

The current article summarizes our achievements in model-based sound synthesis of the acoustic guitar with improved realism. First, a simplified physical model of a string instrument realized in our work is described. The next section discusses the calibration of the synthesis model. Then, we address controlling the synthesizer using ENP. After this, we provide an overview of the real-time synthesizer PWSynth. The final section discusses how we simulate various playing styles used in the classical guitar repertoire. Musical excerpts related to this article will be included on the forthcoming Computer Music Journal 25:4 compact disc.

Structure of the Synthesizer

We have implemented a string instrument model that is based on the principle of commuted waveguide synthesis. We now present both the basic string model and a guitar string model that contains two basic models.

Basic String Model

A model for a vibrating string is the only part of the system that explicitly models a physical phenomenon. Our string model implementation is illustrated in Figure 1. It is a feedback loop that contains a delay line and two digital filters, as suggested previously in the literature (Jaffe and Smith 1983; Välimäki et al. 1996). The input signal x(n) of the system is obtained from a recorded guitar tone, as described later. The digital filter seen in Figure 1 inside a box drawn with a broken line is called the loop filter. Its output signal is computed as

$$y_1(n) = b(n)\,y(n) - a(n)\,y_1(n-1) \qquad (1)$$

where n is the discrete time index, a(n) is the feedback coefficient, and b(n) = g(n)[1 + a(n)] is the gain coefficient of the loop filter. The magnitude of both g and a must be less than 1 to ensure that the feedback loop will be stable. Usually, g is slightly smaller than 1 and a is slightly smaller than 0, in which case the filter has a gentle lowpass characteristic. Note that the loop-filter coefficients a(n) and b(n) are allowed to be time-varying in Figure 1, since they must be changed, for example, during attenuation or re-plucking of the string.

In Figure 1, the delay line denoted by $z^{-M}$ and the FIR filter with coefficients h(k), which acts as a fractional delay filter, together constitute the loop delay which controls the pitch of the synthetic tone. The length of the delay line is M = L – 2 samples, where L is the integer part of the loop delay. The output of the four-tap FIR filter is computed as
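As an illustration of the loop filter, the following minimal sketch runs the one-pole recurrence with constant coefficients. It assumes the recurrence y1(n) = b y(n) − a y1(n − 1) with b = g(1 + a); the function name `loop_filter` and the values g = 0.99 and a = −0.05 are illustrative choices, not values taken from PWSynth.

```python
def loop_filter(y, g, a):
    """One-pole loop filter with constant g and a (illustrative sketch)."""
    b = g * (1.0 + a)            # gain coefficient b = g(1 + a)
    y1_prev = 0.0                # y1(n - 1), initially at rest
    out = []
    for sample in y:
        y1 = b * sample - a * y1_prev
        out.append(y1)
        y1_prev = y1
    return out


G, A = 0.99, -0.05               # hypothetical values: g just below 1, a just below 0

# Steady-state gain for a constant input (lowest frequency).
dc_gain = loop_filter([1.0] * 2000, G, A)[-1]

# Steady-state gain for an alternating input (highest frequency).
alternating = [(-1.0) ** n for n in range(2000)]
nyquist_gain = abs(loop_filter(alternating, G, A)[-1])
```

With a slightly negative a, the gain at the lowest frequency equals g, while the gain at the highest frequency is g(1 + a)/(1 − a), which is smaller. This is the gentle lowpass behavior described in the text.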

$$y_{L+d}(n) = h(0)\,y_1(n-M-1) + h(1)\,y_1(n-M-2) + h(2)\,y_1(n-M-3) + h(3)\,y_1(n-M-4) \qquad (2)$$

where h(k) are the third-order Lagrange interpolation coefficients that are functions of the fractional delay parameter d:

$$\begin{aligned}
h(0) &= -\tfrac{1}{6}\,d\,(d-1)(d-2) \\
h(1) &= \tfrac{1}{2}\,(d+1)(d-1)(d-2) \\
h(2) &= -\tfrac{1}{2}\,(d+1)\,d\,(d-2) \\
h(3) &= \tfrac{1}{6}\,(d+1)\,d\,(d-1)
\end{aligned} \qquad (3)$$

When d = 0, all coefficients h(k) will be zero except h(1) = 1.0, thus the delay produced by the FIR filter will correspond to one sample, and the overall loop delay in Figure 1 will be L = M + 2 samples (assuming that a is small and the loop filter does not contribute much to the delay).
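The basic string model can be sketched in a few lines. The code below is a minimal sketch under stated assumptions: the coefficients are the standard third-order Lagrange formulas (which give h(1) = 1 at d = 0, as stated above), and the loop topology (excitation added to the fed-back FIR output, followed by the loop filter) is a hypothetical reading of Figure 1. The names `lagrange3` and `string_model` are illustrative and do not come from PWSynth.

```python
def lagrange3(d):
    """Third-order Lagrange coefficients h(0)..h(3) for a fractional delay d."""
    return [
        -d * (d - 1.0) * (d - 2.0) / 6.0,
        (d + 1.0) * (d - 1.0) * (d - 2.0) / 2.0,
        -(d + 1.0) * d * (d - 2.0) / 2.0,
        (d + 1.0) * d * (d - 1.0) / 6.0,
    ]


def string_model(x, L, d, g, a, n_samples):
    """Run the feedback loop for n_samples and return the output y(n).

    Assumed topology (hypothetical reading of Figure 1):
      y(n)  = x(n) + sum_k h(k) * y1(n - M - 1 - k),  with M = L - 2
      y1(n) = b * y(n) - a * y1(n - 1),               with b = g * (1 + a)
    """
    M = L - 2
    h = lagrange3(d)
    b = g * (1.0 + a)
    y = [0.0] * n_samples
    y1 = [0.0] * n_samples
    for n in range(n_samples):
        feedback = 0.0
        for k in range(4):
            idx = n - M - 1 - k
            if idx >= 0:                      # the loop starts at rest
                feedback += h[k] * y1[idx]
        excitation = x[n] if n < len(x) else 0.0
        y[n] = excitation + feedback
        previous = y1[n - 1] if n > 0 else 0.0
        y1[n] = b * y[n] - a * previous
    return y


# With d = 0, g = 1 and a = 0 the loop is a pure delay of L samples,
# so a single impulse returns every L samples.
echo = string_model([1.0], L=10, d=0.0, g=1.0, a=0.0, n_samples=25)
```

In this lossless special case the impulse recirculates unchanged, which makes the relation L = M + 2 easy to check. Values of g below 1 and a below 0 make each pass quieter and duller.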

In some studies, an allpass filter is used to implement the fractional delay (Jaffe and Smith 1983), but we prefer an FIR filter, because we find it easier to generate clean vibrato and glissando tones with it. While the Lagrange interpolation filter produces a good approximation of ideal delay at low frequencies, it has a disadvantage: for non-zero values of d, it attenuates high frequencies. If this is harmful, a higher filter order can be used that effectively pushes the problem further towards the Nyquist frequency. According to our experience, a third-order filter is the minimum required for high-quality synthesis at the sampling rate of 44.1 kHz.
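The high-frequency attenuation can be seen in a short worked case. Assuming the standard third-order Lagrange coefficients and taking d = 1/2 as an example:

$$h(0) = -\tfrac{1}{16}, \quad h(1) = \tfrac{9}{16}, \quad h(2) = \tfrac{9}{16}, \quad h(3) = -\tfrac{1}{16}$$

$$H(e^{j\omega}) = \sum_{k=0}^{3} h(k)\,e^{-j\omega k}, \qquad H(e^{j0}) = \sum_{k=0}^{3} h(k) = 1, \qquad H(e^{j\pi}) = \sum_{k=0}^{3} (-1)^k h(k) = -\tfrac{1}{16} - \tfrac{9}{16} + \tfrac{9}{16} + \tfrac{1}{16} = 0$$

The gain is unity at zero frequency but falls to zero at the Nyquist frequency, whereas for d = 0 the filter is a pure one-sample delay with unity gain at all frequencies.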

Guitar String Model

The structure of the guitar string model is illustrated in Figure 2 for one of the guitar’s six strings. Two basic string models of Figure 1 are used for each guitar string, since the horizontal and vertical polarizations of vibration are modeled separately. When one of the string models is slightly detuned, a pleasant beating, characteristic of natural string tones, appears.
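The origin of the beating can be illustrated with an idealized case that assumes two equal-amplitude, non-decaying sinusoids at the fundamental frequencies $f_1$ and $f_2$ of the two polarizations (real string tones decay and contain many harmonics):

$$\cos(2\pi f_1 t) + \cos(2\pi f_2 t) = 2\cos\big(\pi (f_1 - f_2)\,t\big)\,\cos\big(\pi (f_1 + f_2)\,t\big)$$

The slowly varying factor gives an amplitude envelope that repeats $|f_1 - f_2|$ times per second. For example, detuning one model from 220 Hz to 220.5 Hz would produce one beat every two seconds.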

Figure 1. Block diagram of the basic string model.

The excitation signals are collected in a database. For each tone, an excitation signal is selected according to string and fret numbers. The signal is first processed by the pluck-shaping filter (see Figure 2) and then by a feedforward comb filter that changes the plucking-point effect, if needed. This processed excitation signal is fed into both string models. Note that we could adjust the gain of the excitation signal for each polarization (Karjalainen et al. 1998), but we decided this was unnecessary in practice. The output of the horizontal string model is coupled with the vertical part of all guitar strings to account for sympathetic vibrations in an inherently stable manner (Karjalainen et al. 1998). Note that alternatively vertical models could be coupled to the horizontal ones; the key point, however, is to ensure stability by avoiding feedback. As seen in Figure 2, the synthetic guitar tone is composed of the sum of the horizontal and vertical output signals. Again, we could use weights in the mixing of these signals, but have decided this is unnecessary in practice.
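A feedforward comb filter of the kind used for the plucking-point effect can be sketched as follows. This is only an illustration: the subtractive form y(n) = x(n) − c·x(n − D), the delay D, and the coefficient c are assumptions, and the function name `plucking_point_comb` is hypothetical.

```python
def plucking_point_comb(x, delay, coeff):
    """Feedforward comb filter: y(n) = x(n) - coeff * x(n - delay)."""
    out = []
    for n in range(len(x)):
        delayed = x[n - delay] if n >= delay else 0.0
        out.append(x[n] - coeff * delayed)
    return out


# An impulse produces the direct pulse followed by a scaled, inverted copy.
shaped = plucking_point_comb([1.0, 0.0, 0.0, 0.0, 0.0], delay=3, coeff=0.5)
```

Because the filter has no feedback path, it cannot become unstable whatever coefficient is chosen.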

Furthermore, as indicated in Figure 2, we have created a database of "special effects" (such as rubbing and scraping of the string and various knockings on the guitar body) that are essential in synthesizing modern guitar repertoire. Such samples can be mixed with the guitar signal before playback. In our implementation, two sample players per string can read the effects database simultaneously, allowing another effect to be triggered before the previous one has ended.

To synthesize various playing effects, rules are needed that generate envelopes for parameter values. These are discussed briefly later in this article. Next, we consider how parameter values can be extracted from a recording.

Calibration of the Synthesizer

The calibration of PWSynth can be divided into three subtasks. The first task is the estimation of the basic string model parameters (see Figure 1) for each string and fret, namely, the gain g and the coefficient a of the loop filter, the fractional part d and the integer part L of the loop delay, and the excitation signal x(n). The second task is the calibration of the guitar string model (see Figure 2). Here, a subset of the calibration system is used to capture important characteristics of the analyzed recordings. Instead of providing the final parameters of the model, these tools render restricted initial estimates that can be used for further experimentation. The third task involves the analysis of special data that contains examples of different playing styles. In general, the analysis provides a template for parameter updates. The first and second tasks are presented below, whereas the third task is discussed with the methods for simulation of playing styles.

Calibration of the Basic String Model

The basic string model parameters are estimated with an iterative parameter extraction algorithm described in Erkut et al. (2000). The algorithm first extracts the initial model parameters from the anechoic recordings using a method based on the pitch-synchronous short-time Fourier transform as described in Tolonen (1998). We call this the core calibrator. Then, it updates the model parameters to provide a better match between the analyzed and synthesized tones. Figure 3 shows the details of the system.

Figure 2. Block diagram of the guitar string model. [Blocks: database of excitation signals; pluck-shaping filter; plucking-point filter; horizontal and vertical string models S_h(z) and S_v(z); coupling matrix (from horizontal string models, to vertical string models); database of special effects with gains for effect #1 and effect #2; database of damping signals; output.]

The core calibrator decomposes the analyzed tone into its deterministic and residual parts, just like the spectral modeling synthesis (SMS) method described by Serra and Smith (1990). However, in the core calibrator, the underlying model structure that accounts for the deterministic part is the basic string model, rather than the SMS sinusoidal model. This approach allows us to obtain the loop-filter parameters that provide the best match between the string model output and the deterministic part. Unlike SMS, the residual is not modeled as a stochastic process; it is directly used as a component of the excitation signal. The initial levels of the harmonics constitute the other component of the excitation. Tolonen (1998) provides further details.

The use of iteration is motivated by the core calibrator’s high degree of sensitivity to measurement noise and the regularity of the analyzed samples. As a consequence, the loop-filter parameter estimates exhibit a large variance depending on the data. Estimation errors in the loop-filter parameters result in an unnatural decay in the synthetic tone. Moreover, the quality of the excitation signal is affected.

The iterative system shown in Figure 3 synthesizes a signal with the initial parameter estimates of the core calibrator. The amplitude envelope of the synthetic tone is compared to that of the original tone. If there is a discrepancy between the decay of envelopes of the original and synthetic tones, an iterative optimization algorithm is used to detect the optimal loop-filter parameters by minimizing the error, e(k). This approach improves the perceived quality of the synthetic guitar tones.

Recent research has shown that the variation of the loop-filter parameters is not perceived below certain frequency-dependent thresholds (Tolonen and Järveläinen 2000). This property can be incorporated into the envelope-matching algorithm in Figure 3.
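The link between an amplitude envelope and the loop gain can be illustrated with a much simpler closed-form estimate. The sketch below is not the iterative optimizer described above: it assumes a purely exponential envelope, ignores the frequency dependence of the loop filter, and fits a straight line to the envelope in dB. The function names are hypothetical.

```python
import math


def fit_decay_db_per_sample(envelope):
    """Least-squares slope of the envelope in dB versus sample index."""
    n = len(envelope)
    ys = [20.0 * math.log10(v) for v in envelope]   # envelope must be positive
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(range(n), ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def loop_gain_from_decay(slope_db_per_sample, L):
    """Loop gain g whose attenuation per period of L samples matches the slope."""
    return 10.0 ** (slope_db_per_sample * L / 20.0)


# Synthetic check: an envelope that loses a factor g_true every L samples.
g_true, L = 0.995, 100
envelope = [g_true ** (n / L) for n in range(5000)]
g_est = loop_gain_from_decay(fit_decay_db_per_sample(envelope), L)
```

Comparing the slope obtained from a recorded tone with the slope obtained from the synthetic tone gives one possible scalar error for an envelope-matching loop to reduce.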

Calibration of the Guitar String Model

The excitation signals $x_{ij}(n)$ for each string i and each fret j are obtained from samples of a guitar played mezzo-forte using the iterative algorithm discussed above, and they are stored in the database shown in Figure 2. The special-effects database consists of non-processed samples. A small number of damping signals is included in a separate database for simulation of natural-sounding transients. These signals are further elaborated later in this article when we discuss the simulation of playing styles.

Figure 3. An extended iterative parameter extraction scheme that matches the overall decay of the analyzed and synthesized sounds, after Erkut et al. (2000).

The current pluck-shaping filter of PWSynth is a one-pole lowpass filter, similar to the loop filter in Equation 1. This filter converts the mezzo-forte excitations into softer (i.e., piano) ones. A method based on the deconvolution of two excitation signals differing in dynamics has been proposed for estimation of the pluck-shaping filter coefficients (Erkut et al. 2000). Some coefficients estimated by this method are plugged into PWSynth, but they are manually varied and extrapolated to ensure the synthesis quality over a broad frequency and dynamic range.

Recently, we developed another technique for designing a second-order (biquad) pluck-shaping filter. The design prevents high deviations of the filter coefficients from the noise introduced by deconvolution. In this technique, we represent the individual low-pass characteristics of the excitation signals parametrically and calibrate the filter according to the difference between the parametric curves. The following example (see Figure 4) demonstrates the method.

The excitation signals for mezzo-forte and piano plucks are parameterized using a second-order Linear Predictive Coding (LPC) fit. The ratio of the parametric curves (i.e., the difference on the dB scale) directly gives the magnitude response of the pluck-shaping filter to obtain the piano excitation signals from the mezzo-forte ones. The filtering is carried out with a biquad filter that has the following transfer function:

$$H(z) = \frac{g_p\,\left(1 + a_{1m} z^{-1} + a_{2m} z^{-2}\right)}{g_m\,\left(1 + a_{1p} z^{-1} + a_{2p} z^{-2}\right)} \qquad (4)$$

In this example, the mezzo-forte pluck parameters are $g_m$ = 0.0479, $a_{1m}$ = –1.8746, and $a_{2m}$ = 0.8765, and the piano pluck parameters are $g_p$ = 0.0084, $a_{1p}$ = –1.9531, and $a_{2p}$ = 0.9541. Note that the inherent nonlinearities in the energetic excitation signals prevent an exact perceptual match between different dynamics, and the described linear filtering procedure cannot handle these effects. Therefore, nonlinear methods such as frequency modulation may improve the realistic simulation of dynamics.

Figure 4. Calibration of the biquad pluck-shaping filter. The second-order LPC models of mezzo-forte (top) and piano (middle) excitation signals are drawn with thick lines. The magnitude spectra of the excitation signals are also shown. The difference between the parametric curves is the desired magnitude response of the pluck-shaping filter (bottom). [Panels: Mezzoforte; Piano; Transfer function of pluck-shaping filter. Axes: Frequency [kHz], 0 to 20; Magnitude [dB].]
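The biquad response can be evaluated numerically from the quoted parameter values. The sketch below assumes that the transfer function is the ratio of the two second-order LPC models, that is, the piano gain times the mezzo-forte polynomial divided by the mezzo-forte gain times the piano polynomial. The function names and the 44.1 kHz sampling rate used as a default are illustrative.

```python
import cmath
import math

# Parameter values quoted in the text.
G_M, A1_M, A2_M = 0.0479, -1.8746, 0.8765   # mezzo-forte LPC model
G_P, A1_P, A2_P = 0.0084, -1.9531, 0.9541   # piano LPC model


def pluck_shaping_response(freq_hz, sample_rate=44100.0):
    """Complex response of the assumed biquad at freq_hz."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    numerator = G_P * (1.0 + A1_M * z_inv + A2_M * z_inv ** 2)
    denominator = G_M * (1.0 + A1_P * z_inv + A2_P * z_inv ** 2)
    return numerator / denominator


def to_db(value):
    """Magnitude in decibels."""
    return 20.0 * math.log10(abs(value))


low_db = to_db(pluck_shaping_response(0.0))
high_db = to_db(pluck_shaping_response(22050.0))
```

Under these assumptions the filter attenuates at both ends of the band, with more attenuation near the Nyquist frequency than near zero frequency, which matches the role of turning a mezzo-forte excitation into a softer and duller one.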

The excitation signals of the database contain a comb-filtering effect caused by the actual plucking point of the recorded samples. Ideally, this effect should be cancelled by preprocessing and then re-simulated at the synthesis level, thereby providing precise timbral control by varying the plucking point. Although the guitar model includes a plucking-point filter, excitation equalization has not been performed in PWSynth, owing to the lack of a method that can reliably estimate the plucking point. Such a method has been recently reported, however (Traube and Smith 2000). Incorporation of this method into the calibration system is left as a future task.

Currently, a method for estimating the delay lengths of a dual-polarization string model is available (Välimäki et al. 1999), and techniques for estimating the other parameters are under development. However, these techniques have not yet been used in PWSynth. Instead, in order to introduce a slight beating, a detuning parameter value for each dual-polarization string model was found by ear.

The calibration of the coupling matrix (see Figure 2) requires a detailed analysis of the string terminations at the bridge and is an open research problem. The coupling matrix coefficients of the PWSynth have been adjusted by trial and error.

Synthesis Control Using ENP

In the following, we discuss a music notation software package called ENP that we currently use for controlling the synthesizer. The use of notation in expressive synthesis control is motivated by the lack of adequate real-time controllers, the familiarity of music notation, and the required precision of control. The use of music notation requires no special technical training, which in turn makes it possible to employ professional players to test and verify various physical models at a deeper level than before. This kind of collaboration is of course crucial, as it allows the combination of both musical and technical expertise.

Expressive Notation Package (ENP)

The user enters musical material in common music notation into ENP. The system requires no textual input. Users can also add both standard and non-standard expressions that allow them to specify instrument-specific playing styles with great precision. Expressions can be applied to a single note (such as string number, pluck position, vibrato, or dynamics) or to a group of notes (e.g., left-hand slurs or finger-pedals). Groups can overlap, and they may contain other objects, such as breakpoint functions (BPFs). Macro expressions generate additional note events, such as tremolo, trills, portamento, and rasgueado (a strumming technique using the right-hand fingernails that is common in the flamenco playing style).

ENP allows fine-tuning of timing with the help of graphical tempo functions. To ensure synchronization of polyphonic scores, all tempo functions are merged and translated internally into a global time-map (Jaffe 1985). Tempo modifications are defined by first selecting a group of notes in the score. After this, a tempo function (group-BPF) is applied to the group. The tempo function can be opened and edited with the mouse. In addition to conventional accelerandi and ritardandi, the user can apply special "give and take" rubato effects to a group. As in the previous case, the user starts with a selection in the score and applies a tempo function to it. The difference, however, is that the duration of the selection is not affected by the time modification. Time modifications are only effective inside the selected group.
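The idea of a time-map, a function from score time to performance time, can be sketched with a simplified model. The code below assumes piecewise-constant tempo segments given as (start beat, beats per minute) pairs; ENP itself uses graphical breakpoint functions, so the data layout and the name `beats_to_seconds` are hypothetical.

```python
def beats_to_seconds(beat, tempo_segments):
    """Map a score position in beats to seconds.

    tempo_segments is a list of (start_beat, bpm) pairs sorted by start_beat,
    with the first segment starting at beat 0 (an assumed representation).
    """
    seconds = 0.0
    for i, (start, bpm) in enumerate(tempo_segments):
        if beat <= start:
            break
        if i + 1 < len(tempo_segments):
            end = tempo_segments[i + 1][0]
        else:
            end = float("inf")
        # Time spent in this segment: beats covered times seconds per beat.
        seconds += (min(beat, end) - start) * 60.0 / bpm
    return seconds


# Four beats at 60 bpm followed by a faster section at 120 bpm.
segments = [(0.0, 60.0), (4.0, 120.0)]
onset_times = [beats_to_seconds(b, segments) for b in (0.0, 4.0, 6.0)]
```

Because every voice is converted through the same map, notes that coincide in the score also coincide in time, which is the synchronization property a global time-map provides.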

ENP also supports user-definable performance rules (Laurson et al. 1999) that allow modification of score information. Performance rules are used to calculate timing information, dynamics, and other synthesis parameters in a way similar to the Swedish "Rulle" system (Friberg 1991). The ENP rules use a syntax that was originally designed for PWConstraints (for more detail, see Laurson 1996 and Laurson et al. 1999).

After all musical information has been entered, the score is translated into control information. This process is executed in two main steps. First, the note information provided by the input score is modified by the tempo functions and the ENP performance rules. In addition, some instrument-specific rules are applied that further modify the input score. Second, all notes of the input score are scheduled. While the scheduler is running, each note sends a special method to its instrument, which in turn starts other scheduled methods that typically produce the final control data. These methods are responsible for creating discrete control data (e.g., excitation information) or continuous data (e.g., gain of the loop filter, filter coefficients, or other low-level data).

Example from the Classical Guitar Repertoire

Figure 5 gives a fairly typical ENP example from the classical guitar repertoire (a transcription of Loure from the E major Partita for lute by J. S. Bach, BWV 1006a). In addition to conventional pitch and rhythm information, the score contains several standard expressions, such as left-hand slurs and the encircled "2," which indicates that the corresponding notes should be played on the second string. Furthermore, there are several portamento expressions (lines marked with "port") that indicate a rapid glide of the left-hand finger. The non-standard expressions "vb5" and "vb4" denote that the notes in question should be played with a moderate vibrato. The example also illustrates a graphical tempo function that controls the amount of rubato applied to the passage. The slow-moving dance character is kept, however, by ensuring that the cadences are strictly in time. The timing is also modified within the dotted rhythmic motives characteristic of the piece with the help of performance rules.

A Contemporary Notation Example

Figure 6 presents a contemporary excerpt for the classical guitar (from Lettera Amorosa by Juan Antonio Muro). This example is interesting from both notational and synthesis perspectives. Note that ENP permits the notation of unmeasured music. ENP also allows the use of non-standard noteheads, which in turn permits the user to express novel instrumental playing techniques. For instance, the first non-standard notehead (the box with a triangularly shaped waveform right after the first run) indicates that the performer should rub the strings with the left hand. The second one (the small box containing the letter "T") stands for a "tambura" effect whereby the player hits the bridge of the instrument with the right-hand thumb. The last one (a note-head with an encircled "x") indicates a hit with the right-hand nail on the body (golpe in the Spanish terminology). These extended techniques are synthesized using the unprocessed samples contained in the special effects database shown in Figure 2.

Synthesis Engine

This section describes first the synthesis engine, PWSynth, used in our project. After this, we show how complex instrumental models can be parameterized using special parameter matrices.

Figure 5. A musical excerpt from the standard classical guitar repertoire.

PWSynth

The starting point in PWSynth is a patch consisting of boxes and connections. This patch is just like any other PatchWork patch except that each PWSynth box contains a private C structure. The C structures inside PatchWork boxes are visible both to the Lisp environment and to a collection of subroutines written in C that are interfaced with the Lisp system. The Lisp environment is responsible for converting a PWSynth patch to a tree of C structures. Also, it fills and initializes all needed structure slots with appropriate data. Once the patch has been initialized, PWSynth calls the main synthesis C routine which in turn starts to evaluate the C structure tree. Real-time sliders and MIDI devices can be used inside the patch.

This scheme supports embedded definitions: a C structure can contain other C structures to any depth. This feature is of primary importance when designing complex instrument models. For example, the guitar model used in our system (consisting of a coupling matrix and six dual-polarization string models) contains a C structure defining the complete instrument. The instrument, in turn, contains seven substructures, one for the sympathetic couplings and six for the strings. The most complex entity used in our example is the dual-polarization guitar string model illustrated in Figure 2. It is built from ten substructures: two delay lines with third-order Lagrange interpolation, two loop filters, three sample players, a delay, a pluck-shaping filter, and a plucking-point filter.

Parameter Matrix

The interaction between ENP and PWSynth is realized in complex cases with the help of graphical matrices (see Figure 7). Each row and column is named. By taking the intersection of the row and column names, a large symbolic parameter space is easily generated. Each resulting parameter name (pointing to an address in a C structure) is a pathname similar to the ones found in other synthesis communication protocols, such as Open Sound Control (Wright and Freed 1997). Thus, if we assume that our current guitar synthesizer is named "guitar1," we can refer to the loop-filter coefficient ("lfcoef") of the second string ("2") of our guitar model with the pathname "guitar1/2/lfcoef." This naming scheme is powerful because it allows the simultaneous use of many instrument instances with separate parameter paths. For instance, we can easily add to a patch new instrument boxes (such as "guitar2," "guitar3," etc.), each having a unique parameter name space.
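The construction of the symbolic parameter space can be sketched as follows. The sketch builds pathnames of the form instrument/row/column and stores values in a dictionary. The dictionary stands in for the addresses in the C structures, the column list is only a subset of the names mentioned in the text, and the helper names are hypothetical.

```python
def parameter_paths(instrument, rows, columns):
    """Return one pathname per matrix cell, row by row."""
    return [f"{instrument}/{row}/{column}" for row in rows for column in columns]


strings = [str(i) for i in range(1, 7)]      # string numbers 1-6
columns = ["freq", "lfgain", "lfcoef"]       # subset of the parameter names
matrix = {path: 0.0 for path in parameter_paths("guitar1", strings, columns)}


def update(matrix, path, value):
    """Write one control value; unknown pathnames raise KeyError."""
    if path not in matrix:
        raise KeyError(path)
    matrix[path] = value


# Illustrative value only: set the loop-filter coefficient of the second string.
update(matrix, "guitar1/2/lfcoef", -0.05)
```

A second instrument instance would call `parameter_paths` with a different instrument name and therefore receive a disjoint set of pathnames.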

The rows in Figure 7 indicate the string numbers (1–6), and the topmost row gives the parameter names. The numerical values in the 6 × 18 matrix are used as initial values when starting the synthesis. During synthesis, the control data list, which is generated by an ENP input score, updates the matrix items continuously.

The first two parameter names, "SIno" and "sndno," refer to the excitation sample used by the current string. The system uses a double indexing scheme where the first index refers to a collection of samples or "Sample-Instrument" (SI). The second index points to the actual excitation sample within the current SI. Assuming that the excitation samples for the first string are found in the SI with index 0, we would use an index pair (0, 7) to refer to the excitation sample of the seventh fret of the first string.

The next parameter in Figure 7, "freq," gives the desired frequency of the current string, and "plgain" in turn gives the pluck gain. The next pair, "lfgain" and "lfcoef," controls the behavior of the loop filter. The parameter "plpos" gives the current pluck position, "freqcor" is a frequency correction factor used to correct the playback speed of the current excitation sample, "detune" defines the amount of detuning used by the dual-polarization strings of the guitar string model, and "plposg" determines the coefficient of the plucking-point filter (see Figure 2). The pair "plfgain" and "plfcoef" control the pluck-shaping filter. The final six parameters define the parameters of the two extra sample players used to trigger special playing effects.

Figure 6. A musical texture with some modern notational conventions.

Simulation of Playing Styles

We now present some ways various playing styles used by a classical guitarist can be simulated with ENP. We describe basic techniques, such as simple plucks, fast "re-plucks" (i.e., repeated plucks), staccato, pizzicato, left-hand slurs, portamento, vibrato, and forte playing. General implementation of some of these techniques has been mentioned in Erkut et al. (2000); here, we present them as implemented in our present system.

Simple pluck events are simulated as follows. The system selects from the database the appropriate excitation sample, the nominal fundamental frequency, and some other controller parameters according to the current string and fret numbers. Amplitude and pluck position values are read from the current note. After the system has written the new parameter values to their appropriate addresses, the sample player module triggers the current excitation signal to produce a tone.

A human player touches the vibrating string momentarily to perform certain basic techniques, such as damping a vibrating string or re-plucking it. At this instant, the touch splits the string into two portions and constitutes a common lossy boundary condition for each part. The vibration of the string dies out very rapidly due to the losses. Such transient regimes have been presented in Erkut et al. (2000), and it has been shown that their inclusion improves the quality of synthetic tones. The touching point is usually close to the plucking point; hence the portion from this point to the bridge remains the same regardless of the actual fret position. To simulate this effect, the system uses special excitation signals from a database of damping signals that correspond to various touching points.

The harmonic tones resulting from the string portion between the touching point and the nut have been removed from these short damping signals, so that they can be injected into any note. The update of the loop-filter parameters roughly simulates the rapid decay of the string. The update times of the parameters are based on our previous observations from analyzed examples.

Just before the string is re-plucked, the gain of the loop filter of the current string is lowered to zero in a very short time (typically around 10 msec). For fast re-plucks, that is, when the current string to be plucked is still ringing, the system sends one of the special samples just before the actual pluck excitation. This technique is also used when playing in a staccato style where the player dampens the strings with the right-hand fingers.
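
The gain update before a re-pluck can be sketched as a short per-sample ramp. The text specifies only that the gain reaches zero in roughly 10 msec; the linear ramp shape, the 44.1-kHz sample rate, the starting gain, and the function name `gain_ramp` are assumptions made for illustration.

```python
def gain_ramp(start_gain, duration_s=0.010, sample_rate=44100):
    """Linear per-sample ramp from start_gain down to exactly 0.0.

    The linear shape and the default sample rate are assumptions;
    only the approximate 10-msec duration comes from the text.
    """
    n = max(int(round(duration_s * sample_rate)), 1)
    return [start_gain * (1.0 - (i + 1) / n) for i in range(n)]


ramp = gain_ramp(0.995)
print(len(ramp), ramp[-1])  # 441 0.0
```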

In guitar playing, pizzicati are a special class of tones that are produced by plucking the string with the nail of the right-hand thumb and then damping the string with the palm of the same hand. The analysis of pizzicato signals suggests that their main difference from normal plucks is the decay characteristics of the harmonics. Figure 8 shows the extracted amplitude envelope trajectories for the first three harmonics of normal and pizzicato plucks, respectively. The overall decay rates increase significantly for the pizzicato case, and the figure suggests an additional frequency-dependent decay rate increase for higher harmonics. In our system, the pizzicato effect is accomplished by slightly lowering the gain and the cut-off frequency of the loop filter of the current string. The parameters extracted from the pizzicato sample set using the iterative methods described in the calibration section provide a template for the range of the alteration.

Figure 7. Graphical parameter name matrix.
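
The effect of lowering both the gain and the cut-off frequency of the loop filter can be illustrated numerically. The sketch below assumes a one-pole loop filter of the form g(1 + a)/(1 + a z^-1), which is common in the plucked-string literature; the coefficient values are illustrative and not calibrated, and the function name is hypothetical.

```python
import math


def loop_filter_gain(g, a, freq_hz, sample_rate=44100):
    """Magnitude response of the assumed one-pole filter g(1+a)/(1+a z^-1)."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    return g * (1.0 + a) / math.sqrt(1.0 + 2.0 * a * math.cos(w) + a * a)


# Illustrative (uncalibrated) settings: pizzicato uses a lower gain and a
# more negative coefficient, which lowers the cut-off frequency.
normal = dict(g=0.995, a=-0.05)
pizzicato = dict(g=0.97, a=-0.30)

for harmonic in (1, 2, 3):
    f = harmonic * 220.0
    print(harmonic,
          loop_filter_gain(freq_hz=f, **normal),
          loop_filter_gain(freq_hz=f, **pizzicato))
```

With these settings the per-round-trip gain is lower for pizzicato at every harmonic, and the gap widens with frequency, which matches the frequency-dependent decay described above.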

Although this technique produces reasonable results, the quality of the pizzicato effect may be improved by filtering the excitation signal as well. This prediction is based on several recent observations. Figure 9 shows the Fourier transforms of the normal and pizzicato excitation signals together with the magnitude responses of the corresponding second-order LPC models. The pizzicato excitation signals contain considerably less high-frequency energy compared to those of normal plucks, because the string is damped with the right hand so quickly. The LPC models can be used for composing a biquad pluck-shaping filter (see Equation 4, where the same idea was applied) that converts a normal excitation into a pizzicato excitation.
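
One way to compose such a biquad is to whiten the normal excitation with its own LPC inverse filter and then reshape it with the pizzicato all-pole model, giving H(z) = A_normal(z) / A_pizz(z). The sketch below follows that reading; it is an assumption rather than the authors' procedure, it ignores the LPC gain terms, and it uses filtered noise as a stand-in for real excitation signals. The helper names `lpc2` and `biquad` are hypothetical.

```python
import random


def lpc2(x):
    """Second-order LPC (autocorrelation method).

    Returns (a1, a2) of the inverse filter A(z) = 1 + a1 z^-1 + a2 z^-2.
    """
    r = [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(3)]
    # Solve the 2x2 normal equations for the predictor coefficients p1, p2.
    det = r[0] * r[0] - r[1] * r[1]
    p1 = (r[1] * r[0] - r[1] * r[2]) / det
    p2 = (r[0] * r[2] - r[1] * r[1]) / det
    return -p1, -p2


def biquad(x, num, den):
    """Filter x with (1 + b1 z^-1 + b2 z^-2) / (1 + c1 z^-1 + c2 z^-2)."""
    (b1, b2), (c1, c2) = num, den
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = xn + b1 * x1 + b2 * x2 - c1 * y1 - c2 * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y


# Stand-in signals (not real guitar data): the same noise shaped by a mild
# and by a stronger low-pass all-pole filter.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
normal_exc = biquad(noise, (0.0, 0.0), (-0.5, 0.1))
pizz_exc = biquad(noise, (0.0, 0.0), (-1.2, 0.4))

a_normal = lpc2(normal_exc)
a_pizz = lpc2(pizz_exc)

# Pluck-shaping biquad: numerator from the normal model, denominator from
# the pizzicato model.
converted = biquad(normal_exc, a_normal, a_pizz)
print(a_pizz, lpc2(converted))
```

If the conversion works, the LPC model of `converted` should lie close to the pizzicato model `a_pizz`.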

The left-hand slurring technique is implemented simply by sending a pluck event with a small amplitude value. In this case, the gain of the loop filter is not modified. Portamento is realized with an extremely fast chromatic left-hand slur passage between two notes situated on the same string. (This simulates the movement of the left-hand finger on the fingerboard.)
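
The chromatic passage used for portamento can be sketched as the list of fretted pitches between the two notes. The function name and the open-string frequency below are illustrative; equal-tempered fret spacing is assumed.

```python
def portamento_frequencies(open_string_hz, start_fret, end_fret):
    """Frequencies of every fret passed when sliding along one string.

    Assumes equal-tempered fret spacing (a factor of 2**(1/12) per fret).
    """
    step = 1 if end_fret >= start_fret else -1
    return [
        open_string_hz * 2.0 ** (fret / 12.0)
        for fret in range(start_fret, end_fret + step, step)
    ]


# Slide from the 2nd to the 5th fret on a string tuned to 110 Hz.
print(portamento_frequencies(110.0, 2, 5))
```

Each returned frequency would then be played as a very short slur event, that is, a pluck with a small amplitude value.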

Additional pitch information is sent from ENP as a scaling factor. If there is no glissando, the factor has a constant value of unity. If the score has a glissando sign between two notes (having the initial and target frequencies freq1 and freq2, respectively), the factor is an envelope starting at 1.0 and ending at the ratio of freq2 to freq1.
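
A minimal sketch of such a scaling envelope follows. The text fixes only the end points (1.0 and freq2/freq1); the geometric interpolation between them, the number of points, and the function name are assumptions.

```python
def glissando_factor(freq1, freq2, num_points):
    """Scaling envelope from 1.0 to freq2/freq1.

    Geometric interpolation is assumed so that the pitch moves evenly in
    musical (logarithmic) terms. Without a glissando (freq1 == freq2) the
    envelope stays at unity.
    """
    ratio = freq2 / freq1
    if num_points < 2:
        return [ratio]
    return [ratio ** (i / (num_points - 1)) for i in range(num_points)]


env = glissando_factor(220.0, 330.0, 5)
print(env[0], env[-1])  # 1.0 1.5
```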

Our system implements a parametric representation of the vibrato control with a rate, maximum depth, and temporal envelope for depth. This implementation is consistent with our observations of a live player's use of vibrato in a musical context.

The maximum depth (max-depth) of vibrato is calculated as follows. If the score does not contain any specific vibrato expressions (i.e., we want only to play a "straight" tone), the max-depth value depends on whether the current fret is zero (i.e., an open string) or not. If it is zero, the max-depth is equal to zero. For higher fret values, the max-depth is calculated by adding increasing amounts of vibrato (the amount is always moderate) as the fret value increases. This addresses the fact that, for higher frets, the string is looser, which in turn makes it more difficult for the player to keep the pitch stable. If, however, the score contains vibrato expressions, the vibrato max-depth is calculated depending on the name of the vibrato expression. Vibrato expressions are named using "vb" with a suffix (an integer from one to nine) indicating the max-depth of the vibrato. For example, a slight vibrato is simulated with the expression "vb1," a moderate vibrato is produced with "vb5," and an extreme vibrato is achieved with "vb9."
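
The max-depth rule can be summarized in a few lines. In this sketch the depth is expressed as a fraction of the nominal frequency, and the two scaling constants are placeholders; neither the units nor the values are given in the text, and the function name is hypothetical.

```python
import re


def vibrato_max_depth(fret, expression=None,
                      depth_per_fret=0.0005, depth_per_level=0.002):
    """Maximum vibrato depth as a fraction of the nominal frequency.

    depth_per_fret and depth_per_level are placeholder constants.
    """
    if expression is not None:
        match = re.fullmatch(r"vb([1-9])", expression)
        if match is None:
            raise ValueError(f"not a vibrato expression: {expression!r}")
        return int(match.group(1)) * depth_per_level
    # No explicit expression: an open string (fret 0) gets no vibrato,
    # and higher frets get a moderate, increasing amount.
    return fret * depth_per_fret


print(vibrato_max_depth(0), vibrato_max_depth(7), vibrato_max_depth(7, "vb5"))
```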

The rate of the vibrato is normally kept constant (typically around 5–6 Hz). The overall depth, however, is controlled by an envelope scaled to the current max-depth value, with an ascending–descending function with two "humps" to avoid a mechanical-sounding effect when applying a vibrato.
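
The depth envelope and the resulting pitch modulation might look as follows. The break points of the two-hump envelope, the sinusoidal modulation shape, and the 5.5-Hz default rate are illustrative assumptions; the text specifies only a constant rate of about 5–6 Hz and an ascending-descending envelope with two humps.

```python
import math

# Hypothetical two-hump envelope over normalized note time (0 to 1).
BREAKPOINTS = [(0.0, 0.0), (0.25, 1.0), (0.5, 0.5), (0.75, 0.8), (1.0, 0.0)]


def depth_envelope(u):
    """Piecewise-linear interpolation of BREAKPOINTS at normalized time u."""
    u = min(max(u, 0.0), 1.0)
    for (u0, v0), (u1, v1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if u <= u1:
            return v0 + (v1 - v0) * (u - u0) / (u1 - u0)
    return 0.0


def vibrato_factor(t, duration, max_depth, rate_hz=5.5):
    """Frequency scaling factor at time t (seconds) within a note."""
    depth = max_depth * depth_envelope(t / duration)
    return 1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t)


# Scaling factors over a 2-second note, sampled every 100 ms.
factors = [vibrato_factor(i * 0.1, 2.0, 0.01) for i in range(21)]
print(min(factors), max(factors))
```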

In addition to the pluck-shaping operations mentioned previously, the initial pitch of a note played forte is somewhat sharp, quickly and smoothly returning to the nominal, fingered pitch value. A more elaborate way to simulate this effect would be to use a tension modulation algorithm (Tolonen et al. 2000).

Figure 8. Amplitude envelopes of the first (solid line), the second (dash-dot line), and the third harmonic (dashed line) for normal (top) and pizzicato (bottom) plucks. (Plot omitted; the panels "Normal Pluck" and "Pizzicato Pluck" show magnitude in dB against time in seconds.)
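
The sharp onset of a forte note can be approximated by a decaying pitch offset. The exponential shape, the initial offset, and the time constant in this sketch are assumptions chosen for illustration; the text says only that the pitch starts somewhat sharp and returns quickly and smoothly to the nominal value.

```python
import math


def forte_pitch_factor(t, initial_sharpness=0.003, time_constant_s=0.05):
    """Frequency scaling factor t seconds after a forte pluck.

    Starts at 1 + initial_sharpness and decays exponentially toward 1.0.
    Both default values are placeholders.
    """
    return 1.0 + initial_sharpness * math.exp(-t / time_constant_s)


for t in (0.0, 0.05, 0.2, 1.0):
    print(t, forte_pitch_factor(t))
```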

Conclusions and Future Work

Recent developments in the model-based synthesis of the classical guitar were described, including signal processing and control methods for simulating several playing styles. Examples of short phrases and excerpts from musical pieces that demonstrate the novel capabilities of our classical guitar synthesizer will be available on the forthcoming Computer Music Journal 25:4 compact disc.

Our future plans include further reduction of the excitation database's size by removing the redundancies between the excitation signals. The parameterization of the low-frequency modes, as suggested previously (Välimäki et al. 1996; Tolonen 1998; Välimäki and Tolonen 1998), is also an attractive method for shortening the excitation signals. We are currently working on the synthesis of different plucked string instruments, such as those from the lute family. The excitation signals and parameter values of the synthesizer must be extracted, and some of the control rules must be customized for each instrument.

Acknowledgments

This research was conducted within the projects "Sounding Score – Modeling of Musical Instruments, Virtual Musical Instruments, and Their Control" and "Sound Source Models – Analysis, Synthesis, and Coding" financed by the Academy of Finland. The work of C. Erkut has been supported by the Jenny and Antti Wihuri Foundation, and the work of V. Välimäki has been supported by a postdoctoral research grant from the Academy of Finland. The authors are grateful to Mr. Klaus Helminen, who played the samples used for calibrating the guitar synthesizer.

References

Cuzzucoli, G., and V. Lombardo. 1999. "A Physical Model of the Classical Guitar, Including the Player's Touch." Computer Music Journal 23(2):52–69.

Erkut, C., V. Välimäki, M. Karjalainen, and M. Laurson. 2000. "Extraction of Physical and Expressive Parameters for Model-Based Sound Synthesis of the Classical Guitar." Paper presented at the 108th AES Convention. New York: Audio Engineering Society.

Friberg, A. 1991. "Generative Rules for Music Performance: A Formal Description of a Rule System." Computer Music Journal 15(2):49–55.

Jaffe, D. A. 1985. "Ensemble Timing in Computer Music." Computer Music Journal 9(4):38–48.

Jaffe, D. A., and J. O. Smith. 1983. "Extensions of the Karplus–Strong Plucked-String Algorithm." Computer Music Journal 7(2):76–87.

Karjalainen, M., V. Välimäki, and Z. Jánosy. 1993. "Towards High-Quality Sound Synthesis of the Guitar and String Instruments." Proceedings of the 1993 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 56–63.

Karjalainen, M., V. Välimäki, and T. Tolonen. 1998. "Plucked-String Models: From the Karplus–Strong Algorithm to Digital Waveguides and Beyond." Computer Music Journal 22(3):17–32.

Kuuskankare, M., and M. Laurson. 2000. "Expressive Notation Package (ENP), a Tool for Creating Complex Musical Output." Proceedings of Les Journées d'Informatique Musicale. Bordeaux, France: SCRIME, pp. 49–56.

Figure 9. The second-order LPC models of normal (top) and pizzicato (middle) excitation signals are drawn with thick lines. The magnitude spectra of the excitation signals are also shown. (Plot omitted; the panels "Normal Pluck" and "Pizzicato Pluck" show magnitude in dB against frequency in kHz.)

Laurson, M. 1996. "PATCHWORK: A Visual Programming Language and Some Musical Applications." Doctoral dissertation, Sibelius Academy.

Laurson, M., et al. 1999. "From Expressive Notation to Model-Based Sound Synthesis: A Case Study of the Acoustic Guitar." Proceedings of the 1999 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 1–4.

Laurson, M. 2000. "Real-Time Implementation and Control of a Classical Guitar Synthesizer in SuperCollider." Proceedings of the 2000 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 74–77.

Serra, X., and J. O. Smith. 1990. "Spectral Modeling Synthesis: A Sound Analysis/Synthesis System Based on a Deterministic Plus Stochastic Decomposition." Computer Music Journal 14(4):12–24.

Smith, J. O. 1992. "Physical Modeling Using Digital Waveguides." Computer Music Journal 16(4):74–91.

Smith, J. O. 1993. "Efficient Synthesis of Stringed Musical Instruments." Proceedings of the 1993 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 64–71.

Tolonen, T. 1998. "Model-Based Analysis and Resynthesis of Acoustic Guitar Tones." Report 46. Espoo, Finland: Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing.

Tolonen, T., and H. Järveläinen. 2000. "Perceptual Study of Decay Parameters in Plucked String Synthesis." Paper presented at the 109th AES Convention. New York: Audio Engineering Society.

Tolonen, T., V. Välimäki, and M. Karjalainen. 2000. "Modeling of Tension Modulation Nonlinearity in Plucked Strings." IEEE Transactions on Speech and Audio Processing 8(3):300–310.

Traube, C., and J. O. Smith. 2000. "Estimating the Plucking Point on a Guitar String." Proceedings of the COST-G6 Conference on Digital Audio Effects. Verona, Italy: Università degli Studi di Verona, pp. 153–158.

Välimäki, V., et al. 1996. "Physical Modeling of Plucked String Instruments with Application to Real-Time Sound Synthesis." Journal of the Audio Engineering Society 44(5):331–353.

Välimäki, V., and T. Tolonen. 1998. "Development and Calibration of a Guitar Synthesizer." Journal of the Audio Engineering Society 46(9):766–778.

Välimäki, V., et al. 1999. "Nonlinear Modeling and Synthesis of the Kantele – a Traditional Finnish String Instrument." Proceedings of the 1999 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 220–223.

Wright, M., and A. Freed. 1997. "Open Sound Control: A New Protocol for Communicating with Sound Synthesizers." Proceedings of the 1997 International Computer Music Conference. San Francisco: International Computer Music Association, pp. 101–104.
