Automatic Measurement and Modelling of Contact Sounds
by
Joshua L. Richmond
B.A.Sc., University of Waterloo, 1998
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE STUDIES
(Department of Computer Science)
We accept this thesis as conforming to the required standard
The University of British Columbia
August 2000
© Joshua L. Richmond, 2000
Abstract
Sound plays an important role in our everyday interactions with the environment. Sound models enable virtual objects to produce realistic sounds. The manual creation of sound models from real objects is tedious and inaccurate. A brief review of sound models is presented, with details of a sound model for contact sounds.
This thesis documents the development of a system for the automatic acquisition of sound models. The system is composed of four modules: a sound acquisition device, an asynchronous data server, an algorithm for computing prototypical sound models and an adaptive sampling algorithm. A description of each module and its requirements is included. Implementations of each module are tested and explained.
Results of typical data collections are discussed. Sound models for a calibration object, brass vase, plastic speaker and toy drum are constructed using the system. Comparisons of the sound models to the original recordings are displayed for each object.
Under ideal circumstances the system produces accurate sound models. Environmental noise, however, decreases the accuracy of the estimation technique. An evaluation of the parameter estimation algorithm confirms this observation.
Many opportunities exist for future work on this system. Ideas for improvements and future investigations are suggested.
Contents
Abstract
Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
2 Contact Sounds
2.1 Overview
2.2 Related Work
2.3 A Contact Sound Model
2.4 Empirical Parameter Estimation
2.4.1 Performance Evaluation
3 A System for Automatic Measurement of Contact Sounds
3.1 Objectives
3.2 System Requirements
3.3 System Overview
4 A Sound Acquisition Device
4.1 Overview
4.2 Related Work
4.3 Requirements
4.4 The Active Measurement Facility
4.5 Sound Effector
4.6 Sound Capture Hardware
4.7 Sound Effector Software
5 An Asynchronous Data Server
5.1 Overview
5.2 Requirements
5.2.1 Generic Data Server Requirements
5.2.2 Sound Server Requirements
5.3 Server Architecture
5.3.1 Server Execution
5.3.2 Data Flow
5.3.3 Specialisation to Sound Data
5.4 Implementation Details
5.4.1 Performance Evaluation
5.4.2 Limitations
6 Building a Prototypical Model
6.1 Overview
6.2 Related Work
6.3 Spectrogram Averaging
6.4 Performance Evaluation
7 An Adaptive Sampling Algorithm
7.1 Overview
7.2 Related Work
7.3 Requirements
7.4 Surface Representation
7.5 Acoustic Distance Metrics
7.5.1 Frequency-Independent Damping Coefficient
7.5.2 Frequency Similarity
7.6 Perceptual Thresholds
8 Sample Data Collections
8.1 Overview
8.2 Tuning Fork
8.2.1 Estimation Results
8.3 Brass Vase
8.3.1 Estimation Results
8.3.2 Refinement Results
8.4 Plastic Speaker
8.4.1 Estimation Results
8.4.2 Refinement Results
8.5 Toy Drum
8.5.1 Estimation Results
8.5.2 Refinement Results
9 Conclusions
9.1 Overview
9.2 Future Work
Bibliography
Appendix A Sound Effector Specifications
A.1 Mounting Bracket
A.2 Control circuit
Appendix B Effect of White Noise on Spectrograms
Appendix C Details of Unique Frequency-Mapping Algorithm
List of Tables
5.1 Results of sound server performance evaluation.
7.1 Regression of similarity on frequency and decay difference.
7.2 Examples of calculated similarity factors.
8.1 Summary of setup parameters for test objects.
8.2 Summary of refinement results.
List of Figures
2.1 Attributes of contact sounds.
2.2 Parameter Estimation Algorithm.
2.3 Mean errors of the parameter estimation algorithm on synthetic data.
3.1 Sound Measurement-to-Production Pipeline.
3.2 General procedure for sound model creation.
3.3 A test object on the ACME test station.
4.1 The Sound Effector.
4.2 Digital Output Connection server architecture.
5.1 Data server architecture.
5.2 Data server state diagram.
5.3 Control loop for RUNNING state.
5.4 Control loop for CAPTURING state.
5.5 Server data flow.
5.6 One AudioPacket.
5.7 Spectrogram of sound containing “pops”.
6.1 Mean errors of the parameter estimation algorithm on spectrogram-averaged synthetic data.
6.2 Standard deviation of error for estimated damping parameters.
7.1 Adaptive sampling algorithm.
7.2 Result of adaptive sampling.
7.3 Loop scheme refinement masks for subdivision.
7.4 Algorithm to find unique frequency mapping between two models.
8.1 Setup for acquiring model of tuning fork.
8.2 Recorded spectrogram.
8.3 Results of tuning fork collection.
8.4 A photo of the brass vase and the subdivision surface which represents it.
8.5 Setup for acquiring sound model of brass vase.
8.6 Results of brass vase experiment.
8.7 Detail of narrow-band low-frequency noise.
8.8 Effect of noise on low-frequency modes.
8.9 Refinement results of brass vase experiment.
8.10 A photo of the plastic speaker and the subdivision surface which represents it.
8.11 Results of plastic speaker experiment.
8.12 Refinement results of plastic speaker experiment.
8.13 A photo of the toy drum and the subdivision surface which represents it.
8.14 Pivot of the drum’s metal bars.
8.15 Results of toy drum experiment (metal).
8.16 Results of toy drum experiment (plastic).
8.17 Refinement results of toy drum.
A.1 Bending schedule for sound effector mounting bracket.
A.2 Schematic for solenoid control circuit.
B.1 Effect of white noise on spectrograms.
C.1 Algorithm to find unique frequency mapping between two models.
C.2 Difference matrix.
C.3 Order array.
C.4 Index array.
C.5 Matched array.
C.6 Mapping array.
C.7 FindUniqueMapping Pseudo-code.
Acknowledgements
“Great discoveries and improvements invariably involve the cooperation of many minds.” - Alexander Graham Bell
This thesis is the result of encouragement and support of dozens of my friends. Without the contributions listed below, this thesis may have never proceeded beyond the course project from which it began.
I owe ACME members past and present a huge thank you! In particular, Jochen Lang and John Lloyd contributed the bulk of the ACME software within which my experiments resided. Thank you for the seemingly endless discussions regarding the sensor class, mysterious aborts and assorted odd ACME behaviours.
Derek DiFilippo, Paul Kry and Doug James contributed many focused ideas when mine were fuzzy. Thanks for all the insight into sound, surfaces and life!
My supervisor, Dinesh Pai, played an enormous role in the development of this thesis. His thoughtful suggestions and advice steered me out of dark corners whenever I got lost. Thank you for being a fantastic supervisor and friend!
Many other friends supplied advice and support when I needed it most over these past two years. Jim Green and Jacob Ofir were constant barometers of my progress. Bonnie Taylor, Kristen Playford and Andrea Bunt kept me going through tough times. Kelly Cronin gave me a good reason to finish. And of course, the rest of the faculty, staff and students of the department always made me feel welcome.
This thesis would not have been possible without the generous financial contributions of NSERC, BC Advanced Systems Institute and IRIS.
JOSHUA L. RICHMOND
The University of British Columbia
August 2000
Dedicated to my family: Gary, Carol and Erica Richmond.
Thank you for your unwavering encouragement and support. You are terrific examples of success in work, love and life.
Chapter 1
Introduction
When radio dramatists began adding sound effects to their productions in the 1920’s, they demonstrated the important role of sound in our perception of everyday events and environments [20]. Without visual aid, listeners could easily identify unspoken actions, e.g., a character’s entrance into a room by the sound of a door opening, followed by the sound of footsteps. Even with the advent of motion pictures in 1896, it was quickly realized that the rattles, pings and knocks of events were needed to cement the realism of the viewing experience.¹ What the entertainment industry has learned is that sound is a critical component of our everyday lives.
The ubiquity of sound in our lives is not surprising when the physics of contact is considered. When contact occurs between two objects, the energy of the impact is transferred to each object. This energy propagates through the objects, producing vibrations of their surfaces. The frequency, amplitude and decay of these vibrations depend, in part, on the shape and material of the object [31]. These surface vibrations create fluctuations in the air pressure around each object and are perceived as sound. Such “contact sounds” provide a listener with much information: location of contact, contact force, material composition of the object as well as its shape, size and surface texture. Humans can use these audio cues to recognise events and discriminate between materials [10, 12, 16].
¹ The emergence of sound in film in 1926 was delayed only by the technical difficulties of synchronising recorded sound and film. In fact, Thomas Edison viewed moving pictures as an accessory to the phonograph! [29]
Draft of 3:03am, Friday, August 18, 2000
Given the importance of sound to real-life interactions, it is clear that sound is a required component of any successful virtual environment or simulation. One approach to including sound in virtual environments is synthesis from physics-based models [30]. A sound model of an object can synthesise appropriate contact sounds, given a contact force and contact location, by direct computation. By “sound model” we mean a mathematical representation that can be used for simulation of the object’s sound. This is in contrast to an approach where recorded samples are simply replayed, or modulated to account for various forces and locations. An advantage of the model-based approach is that one model can be used to generate sounds for any number of different interactions (e.g., scraping, pinging, rolling) by substituting the appropriate force profile.
The parameters of a sound model can be derived analytically for objects with simple geometry and material composition [30]. Most everyday objects, however, have complex geometry and material composition which complicate analytical solutions. For such objects, model parameters can be estimated from empirical measurements [30].
Recent work on robotic perception has determined that sound models can also be used to automatically identify materials [8, 17].
A sound model parameterised over the surface of an object can be viewed as an acoustic map of the object, analogous to a texture map in graphics. To create such a sound model of an object from empirical measurements requires recording sounds at hundreds of locations over the object’s surface. To populate even a simple virtual environment with sonified objects could therefore require thousands of measurements. The system described in this thesis automates the sound measurement procedure. To our knowledge, this system is the first to automatically create complete sound models of objects. This sound model can be registered to other models of an object (e.g., surface, deformation) to represent a complete reality-based model. Parts of this work were published previously: [25], [26].
A review of sound models and measurement techniques is presented in Chapter 2. The requirements of a sound measurement system and an overview of our system are included in Chapter 3. The components of the system are subsequently described: the sound acquisition device (Chapter 4), asynchronous data server (Chapter 5), prototypical model generation (Chapter 6) and adaptive sampling algorithm (Chapter 7). Results of typical data collections are presented in Chapter 8, and conclusions on the work are stated in Chapter 9.
Chapter 2
Contact Sounds
2.1 Overview
Contact sounds are the sounds produced by our everyday interactions with the environment. We use the phrase “everyday” in the context of Gaver’s definition of everyday listening: the perception of events from the sounds they make [9]. Just as he contrasts everyday listening to musical listening, we distinguish contact sounds from musical sounds. Musical sounds are themselves the intended result of some action; contact sounds are the unintentional consequence of an action. In general, musical sounds are harmonic while contact sounds are inharmonic. These distinctions are not strict, but serve to separate the domain of our task from the modelling of musical instruments.
As an illustrative example of contact sounds, imagine the sound produced by setting your coffee mug onto your desk, or pushing it across the desk. The sound of the mug striking the desk, and the scraping sound of it being pushed, are both contact sounds. The ease with which these sounds are brought to mind emphasises the importance of sound to our everyday experiences; though you may have never consciously attended to the sounds in real life, they can be easily recalled.
Much useful information is conveyed by contact sounds. Gaver was the first to enumerate this information in [9]. He classified the information into three categories: material, interaction and configuration. These categories are expanded in Figure 2.1. Such information is beneficial to many applications. For instance, it is useful feedback in teleoperative systems. As mentioned previously, the addition of contact sounds to virtual environments and simulations enhances the realism of the environment. Durst and Krotkov also demonstrated the use of contact sound models for automatic material classification [8]. Furthermore, recent work has related the parameters of a contact sound model to human perception of material [16, 12].
[Figure 2.1 diagram; node labels: Material, Interaction, Configuration, Restoring Force, Density, Damping, Internal Structure, Type, Force, Shape, Size, Resonating Cavities]
Figure 2.1: Attributes of contact sounds. Gaver identified three categories of information conveyed by contact sounds: material, interaction and configuration. Each category is composed of more primitive dimensions as shown. Adapted from [9].
The system described by this thesis automatically acquires the measurements required to create contact sound models. By “contact sound model”, we mean a mathematical representation that can be used to synthesise the sound produced by contact between two objects. The model is parameterised over the surface of an object, much like a texture map in graphics.
This chapter is an overview of the mathematical model we use to represent contact sounds. A brief description of the model and its parameters is given. The algorithm for estimating the model’s parameters from recorded samples is also explained.
2.2 Related Work
Much research has been conducted in the field of sound modelling. In particular, many people have modelled musical instruments. Physical modelling is possible for a variety of musical instruments [1]. As one example, Chaigne and Doutaut modelled the acoustic response of wooden xylophone bars using a one-dimensional Euler-Bernoulli equation with the addition of two damping terms and a restoring force [3]. This formulation is derived from the geometry of the xylophone bars, with empirical values for the damping parameters. Another example is Cook and Trueman, who used principal component analysis, Infinite Impulse Response (IIR) filter estimation and warped linear prediction methods to model the directional impulse response of stringed instruments from empirical data [4].
Gaver introduced the modelling of everyday sounds [9]. His model of metal and wood bars is based on the wave equation with an exponential damping term for each frequency mode. The fundamental mode, its initial amplitude and damping factor are determined empirically. Partial modes are calculated as ratios of the fundamental mode.
Recently, a physics-based model for contact sounds of everyday objects was developed by van den Doel [30, 31]. The model of van den Doel was selected for use in our work, and will be described in the subsequent sections of this chapter.
The model of van den Doel is appealing for several reasons. First, it is similar in structure to models used by Gaver [9], Hermes [12] and Klatzky et al. [16] in their perception experiments. Such perceptual studies may yield results useful to evaluating the performance of our system. Furthermore, these studies provide information applicable to our adaptive sampling algorithm as described in Chapter 7. Secondly, the model is suitable for synthesising sounds in real-time, a necessary component of interactive simulations. Finally, it was selected for its effectiveness at representing contact sounds of everyday objects.
It should be noted that the measurement system described herein is not restricted to producing one specific sound model. Any sound model whose parameters can be estimated from recordings of acoustic impulse responses may be created using this system.
2.3 A Contact Sound Model
Formulating a general sound model is complex because it depends on many parameters: material, geometry, mass distribution, etc. This section follows the development of the contact sound model presented in [30, 31].
If we assume linear-elastic behaviour,¹ the vibration of an object’s surface is characterised by the function $\mu(x, t)$. Here $\mu$ represents the deviation of the surface from equilibrium at point $x$ at time $t$. $\mu$ obeys a wave equation of the form in Equation 2.1, where $A$ is a self-adjoint differential operator, $c$ is a constant related to the speed of sound in the material and $F(x, t)$ is the external force [31].
$$\left(A - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\mu(x, t) = F(x, t) \qquad (2.1)$$
In the absence of external forces, Equation 2.1 can be solved by the expression
$$\mu(x, t) = \sum_{i=1}^{\infty}\left(a_i \sin(\omega_i c t) + b_i \cos(\omega_i c t)\right)\Psi_i(x) \qquad (2.2)$$
where $a_i$ and $b_i$ are determined by boundary conditions, $\omega_i$ are related to the eigenvalues of operator $A$ and $\Psi_i$ are the corresponding eigenfunctions.
When the object is sufficiently far from the listener, the sound pressure due to this solution can be approximated by the impulse response function in Equation 2.3. The exponential term is added to model material damping. Complete details of this derivation are provided in [30].
$$p(x, t) = \sum_{n=1}^{N_f} a_{x,n}\, e^{-d_{x,n} t} \sin(\omega_{x,n} t) \qquad (2.3)$$
This model expresses the sound pressure $p$ at time $t$ as the sum of $N_f$ frequency modes. Here $x$ is the location of the force impulse. Each mode $n$ in the model has a frequency $\omega_{x,n}$, initial amplitude $a_{x,n}$ and exponential damping factor $d_{x,n}$. The damping factor models material damping due to internal friction; in the literature [32], this is parameterised by an internal friction parameter $\phi$, which Equation 2.4 relates to $d_{x,n}$ and $\omega_{x,n}$.
¹ A reasonable assumption for light contacts.
1. Compute the windowed DFT spectrogram of the recorded sample.
2. Identify the signal.
3. Estimate the frequency modes.
4. Estimate the damping parameters.
5. Estimate the initial amplitudes.
Figure 2.2: Parameter Estimation Algorithm.
For each location $x$ on the surface of an object, the parameters $\omega_{x,n}$, $d_{x,n}$ and $a_{x,n}$ must be estimated for all $N_f$ modes.
Because the model is an impulse response model, sounds produced by any linear force interaction can be synthesised by a simple convolution of the force and the model.
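As a minimal illustration of the impulse response model and of synthesis by convolution, the following Python sketch samples Equation 2.3 at one surface location and convolves the result with a force profile. It is not code from this work: the function names, the mode parameters and the force profile are invented for the example, and frequencies are assumed to be given in Hz (so that $\omega = 2\pi f$).

```python
import math

def impulse_response(modes, sample_rate, n_samples):
    """Sample p(t) = sum_n a_n * exp(-d_n * t) * sin(2*pi*f_n * t) at one location.

    modes: list of (frequency_hz, damping_per_s, amplitude) tuples
    (an illustrative format, not one defined in the thesis).
    """
    out = []
    for k in range(n_samples):
        t = k / sample_rate
        out.append(sum(a * math.exp(-d * t) * math.sin(2.0 * math.pi * f * t)
                       for f, d, a in modes))
    return out

def convolve(force, response):
    """Direct-form linear convolution of a force profile with an impulse response."""
    out = [0.0] * (len(force) + len(response) - 1)
    for i, f in enumerate(force):
        if f == 0.0:
            continue  # zero force samples contribute nothing
        for j, r in enumerate(response):
            out[i + j] += f * r
    return out

modes = [(440.0, 8.0, 1.0), (1210.0, 15.0, 0.5)]  # made-up mode parameters
response = impulse_response(modes, sample_rate=8000, n_samples=400)
tap = [1.0]                 # unit impulse
scrape = [0.5, 0.0, 0.25]   # arbitrary short force profile
sound = convolve(scrape, response)
```

Convolving with the unit impulse `tap` returns the impulse response unchanged, which is a quick sanity check on the convolution routine.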
2.4 Empirical Parameter Estimation
The parameters for the sound model as it is expressed in Equation 2.1 can be determined analytically by Equation 2.2 for objects of simple geometry and material composition [30]. Of course, to solve this expression for arbitrary everyday objects is not feasible. The alternative is to estimate the parameters of a similar model (i.e., Equation 2.3) from empirical measurements. That is, the object is struck at position $x$, the resulting sound recorded, and the parameters estimated from the recording. Examples of such techniques are [4, 8, 30].
Since we are using the model of van den Doel, we will use the parameter estimation algorithm described in the same work. An overview of the algorithm is described in this section. Figure 2.2 lists the steps of the algorithm; a brief explanation of each step follows. Readers requiring more detail are referred to [30].
In the first step, a spectrogram is computed for the entire recorded sample. The spectrogram is computed by calculating the discrete Fourier transform (DFT) on fixed-width (in time) segments of the recording. The segments are selected using overlapping Hanning windows. For details on Hanning windows and the DFT, a good introduction is [28].
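The windowed DFT of this first step can be sketched in a few lines of Python. The sketch is illustrative only (the evaluation in this thesis was implemented in Matlab): it uses a naive $O(N^2)$ DFT so that it needs nothing beyond the standard library, and the names `hann`, `dft_magnitudes` and `spectrogram`, as well as the `hop` parameter that expresses the overlap, are choices made for this example.

```python
import cmath
import math

def hann(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0 .. len(frame)//2, computed with a naive DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, window_size, hop):
    """One magnitude spectrum per Hann-windowed segment; segments start every hop samples."""
    window = hann(window_size)
    segments = []
    for start in range(0, len(samples) - window_size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + window_size], window)]
        segments.append(dft_magnitudes(frame))
    return segments

# A pure tone with exactly 8 cycles per 64-sample window should peak in bin 8.
tone = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(256)]
spec = spectrogram(tone, window_size=64, hop=16)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spec]
```

With a hop of 16 samples and a 64-sample window, successive segments overlap by 75%; a real recording would call for an FFT rather than the naive transform used here.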
The signal must now be isolated from the background noise (Step two). The segment of the spectrogram with maximum intensity is the start of the signal. This peak corresponds to the onset of the impact. The end of the signal is the first segment $m$ whose intensity $I_m$ falls below $\bar{I} + k\,\sigma(I)$, where $\bar{I}$ is the average intensity of the region before the signal’s start, $\sigma(I)$ is the standard deviation of that region and $k$ is a constant with a typical value of 10.
(in time)of thespectrogramwithin thesignalregioncastsvotesfor the Z;[ frequencieswith
greatestamplitudewithin thatsegment.The Z [ frequenciesthatobtainthemostvotesover
all thesegmentsareselectedasthedominantfrequency modes(\"].^ _ ) of themodelat that
location.
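The voting scheme can be sketched with a counter over DFT bins. The function names `dominant_bins` and `bin_to_hz` and the toy spectrogram are assumptions made for the example; ties between bins with equal votes are broken arbitrarily here, which the description above does not address.

```python
from collections import Counter

def dominant_bins(spectrogram, start, end, n_modes):
    """Return the n_modes DFT bins that collect the most votes over segments [start, end)."""
    votes = Counter()
    for spectrum in spectrogram[start:end]:
        # Each segment votes for its n_modes bins of greatest amplitude.
        ranked = sorted(range(len(spectrum)), key=spectrum.__getitem__, reverse=True)
        votes.update(ranked[:n_modes])
    return sorted(bin_index for bin_index, _ in votes.most_common(n_modes))

def bin_to_hz(bin_index, sample_rate, window_size):
    """Centre frequency in Hz of a DFT bin."""
    return bin_index * sample_rate / window_size

# Toy spectrogram: three segments of six bins each (made-up magnitudes).
toy = [
    [0.1, 5.0, 0.2, 3.0, 0.1, 0.0],
    [0.0, 4.0, 0.3, 2.0, 0.1, 0.9],
    [0.2, 1.0, 0.1, 0.3, 0.0, 2.0],
]
bins = dominant_bins(toy, start=0, end=3, n_modes=2)
```

In the toy data bin 1 receives three votes, bin 3 two and bin 5 one, so bins 1 and 3 are selected even though bin 5 dominates the last segment.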
For each mode, the log of the spectrogram is fit to a linear function of $m$, where $m$ is an index into the segments of the spectrogram, and the signal starts at segment $m = 0$. The damping coefficients $d_{x,n}$ are then calculated by Equation 2.5 from the fitted slope, the window overlap factor of the DFT, the sampling frequency $f_s$ and the size $N$ of the DFT window.
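The line fit behind the damping estimate can be sketched as follows. This is a hedged illustration, not the implementation used in this work: it assumes a magnitude (not power) spectrogram and expresses the segment spacing directly as a hop size in samples, so the conversion from slope to damping, $d = -\mathrm{slope} \cdot f_s / \mathrm{hop}$, is derived under those assumptions rather than copied from Equation 2.5. The function names are hypothetical.

```python
import math

def fit_line(ys):
    """Least-squares slope and intercept of ys against the indices 0, 1, 2, ..."""
    n = len(ys)
    mean_x = (n - 1) / 2.0
    mean_y = sum(ys) / n
    sxx = sum((m - mean_x) ** 2 for m in range(n))
    sxy = sum((m - mean_x) * (y - mean_y) for m, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_damping(magnitudes, sample_rate, hop):
    """Damping (1/s) of one mode from its magnitude in successive segments.

    Assumes magnitude ~ exp(-d * t) with segments spaced hop samples apart,
    so log-magnitude falls by d * hop / sample_rate per segment.
    Also returns the fitted intercept (log-magnitude at segment 0).
    """
    slope, intercept = fit_line([math.log(a) for a in magnitudes])
    return -slope * sample_rate / hop, intercept

# Noise-free check: a mode decaying with d = 12 per second.
true_d = 12.0
mags = [3.0 * math.exp(-true_d * m * 16 / 8000) for m in range(20)]
d_est, intercept = estimate_damping(mags, sample_rate=8000, hop=16)
```

On noise-free data the fit recovers the damping essentially exactly; with noisy recordings the later, quieter segments would be expected to dominate the error.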
The initial amplitude of each frequency mode can then be calculated by Equation 2.6, which is expressed in terms of the quantity $d_{x,n} N / f_s$.
As stated, the algorithm assumes the recording is the response to an impulsive impact. The algorithm can be used with responses to other forces by de-convolving the signal with the force profile.
Again, the measurement system described by this thesis could use any suitable parameter estimation technique, and is not restricted to the method outlined above. This technique was chosen because it is designed specifically for the selected sound model.
2.4.1 Performance Evaluation
An evaluation of the estimation algorithm was performed using synthetic data to determine its robustness to noise. Since the real samples will be recorded in a relatively noisy environment, this evaluation is necessary to estimate the expected performance of the algorithm with real data.
To evaluate the estimation algorithm, a test signal is synthesised using the model in Equation 2.3. Fifty frequency modes are randomly selected from a Gaussian distribution ($\mu$ = 7 kHz, $\sigma$ = 4 kHz). For each frequency mode, damping factors ($d$) and initial amplitudes ($a$) are randomly selected from uniform distributions over fixed ranges. The initial amplitudes are then rescaled.
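The random selection of test modes might look like the sketch below. Only the number of modes and the Gaussian parameters (7 kHz mean, 4 kHz standard deviation) come from the text; the uniform ranges, the rule of redrawing non-positive frequencies and the normalisation of the amplitudes to sum to one are placeholders chosen for this example, as is the function name `random_modes`.

```python
import random

def random_modes(n_modes, rng, damping_range=(1.0, 50.0), amplitude_range=(0.1, 1.0)):
    """Draw (frequency_hz, damping, amplitude) triples for a synthetic test signal.

    damping_range and amplitude_range are placeholder values, not the
    ranges used in the thesis.
    """
    modes = []
    while len(modes) < n_modes:
        freq = rng.gauss(7000.0, 4000.0)
        if freq <= 0.0:
            continue  # assumption: redraw non-physical frequencies
        modes.append((freq, rng.uniform(*damping_range), rng.uniform(*amplitude_range)))
    total = sum(a for _, _, a in modes)
    # Assumption: amplitudes are rescaled so that they sum to one.
    return [(f, d, a / total) for f, d, a in modes]

test_modes = random_modes(50, random.Random(0))
```

Passing an explicit `random.Random` instance keeps each trial reproducible, which is convenient when the same test signal has to be reused at several noise levels.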
Noise is added to this test signal at a specified signal-to-noise ratio. Most of the noise in the ACME environment originates from fans used to cool equipment. A quick spectral analysis of the room revealed the highest concentration of noise in a band from 0 to 200 Hz. Ambient white noise was present, though at lower energy levels. This room noise is approximated in our simulation by low-pass Gaussian noise, band-limited at 200 Hz by a fourth-order filter. Broadband white noise is added at a fixed fraction of the normalised amplitude of the low-frequency noise.
One hundred trials of the evaluation were executed. For each trial, a test signal was synthesised with noise added at eight signal-to-noise levels: ∞ (i.e., no noise), 100, 50, 30, 20, 15, 10 and 5.² The noisy signal was then processed by the estimation algorithm outlined in Section 2.4. The entire test was implemented in Matlab.
² The signal-to-noise ratio is calculated as the maximum signal amplitude divided by the maximum noise amplitude. This yields an appropriate measure of signal-to-noise for our experiments since the signal decays to zero amplitude.
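Under the peak-amplitude definition of signal-to-noise ratio given above, mixing noise into a test signal at a chosen level reduces to a single gain. The sketch below assumes the noise has already been generated and is non-zero; the function name `add_noise` and the sample values are hypothetical.

```python
def add_noise(signal, noise, snr):
    """Mix noise into signal so that max|signal| / max|scaled noise| equals snr."""
    if snr == float("inf"):
        return list(signal)  # the no-noise case
    peak_signal = max(abs(s) for s in signal)
    peak_noise = max(abs(v) for v in noise)
    gain = peak_signal / (snr * peak_noise)
    return [s + gain * v for s, v in zip(signal, noise)]

signal = [0.0, 2.0, -1.0, 0.5]      # made-up samples
noise = [0.5, -0.25, 0.1, 0.0]      # made-up noise samples
noisy = add_noise(signal, noise, snr=10.0)
clean = add_noise(signal, noise, snr=float("inf"))
```

Here the signal peak is 2.0 and the noise peak is 0.5, so a gain of 0.4 scales the noise peak to 0.2 and gives a ratio of 10.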
The estimated model parameters are compared by the following metrics. Each estimated frequency mode is assumed to be the estimation of the generated frequency mode that minimises the difference between the two frequencies. A logarithmic error ratio (Equation 2.7) was computed for each estimated model parameter: $E_a$ for initial amplitude, $E_f$ for frequency and $E_d$ for damping. The means of these error ratios are plotted in Figure 2.3.
As illustrated by Figure 2.3, the parameter estimation algorithm is extremely sensitive to noise. For estimates of initial amplitude and frequency, the error increases consistently with increasing levels of noise (Figure 2.3 (a,b)). The trend is not as consistent for the damping parameter estimate (Figure 2.3 (c)). However, as the dashed line in Figure 2.3 (c) indicates, the variance of estimation error increases dramatically with increasing noise. Most importantly, it should be noted that frequency estimates have an error ratio of almost 30 at even modest levels of noise (SNR = 100).
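The matching step used by these metrics can be sketched as below: each estimated mode is paired with the generated mode of nearest frequency before any error is computed. The error itself is written here as $|\ln(\text{estimate}/\text{true})|$, which is only an assumed form of a logarithmic error ratio and may differ from the definition in Equation 2.7; the function name and the sample modes are likewise invented.

```python
import math

def mean_log_errors(estimated, generated):
    """Mean |ln(est / true)| for frequency, damping and amplitude, in that order.

    estimated, generated: lists of (frequency, damping, amplitude) tuples.
    Each estimated mode is compared with the generated mode whose frequency
    is closest to its own.
    """
    sums = [0.0, 0.0, 0.0]
    for est in estimated:
        ref = min(generated, key=lambda g: abs(est[0] - g[0]))
        for i in range(3):
            sums[i] += abs(math.log(est[i] / ref[i]))
    return tuple(s / len(estimated) for s in sums)

generated = [(1000.0, 10.0, 1.0), (2000.0, 20.0, 0.5)]           # made-up ground truth
estimated = [(1010.0, 10.0, 1.0), (1990.0, 20.0 * math.e, 0.5)]  # made-up estimates
e_f, e_d, e_a = mean_log_errors(estimated, generated)
```

Note that nearest-frequency matching allows two estimated modes to map to the same generated mode, so a spurious noise peak is always scored against some real mode.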
[Figure 2.3 plots; abscissa of each panel: noise/signal, 0 to 0.2]
(a) Mean initial amplitude errors ($\bar{E}_a$)
(b) Mean frequency errors ($\bar{E}_f$)
(c) Mean damping errors ($\bar{E}_d$); legend: Mean, Std Deviation
Figure 2.3: Mean errors of the parameter estimation algorithm on synthetic data. The mean error over 100 trials is plotted for eight signal-to-noise ratios: ∞, 100, 50, 30, 20, 15, 10 and 5. For convenience, the inverse ratio (noise/signal) is used as the abscissa. Sub-figure (c) also plots the standard deviation of error in the damping parameter estimate. See Equation 2.7 for definitions of $E_f$, $E_a$ and $E_d$.
Chapter 3
A System for Automatic Measurement of Contact Sounds
3.1 Objectives
The goal of this work is to produce a system which automatically acquires contact sound measurements over the surface of an arbitrary object for the purpose of creating a sound model. This system is one component of a larger project focused on creating complete reality-based models. A reality-based model is one whose parameters are calculated from empirical measurements of real objects. A complete reality-based model is inherently multi-modal, with sound comprising only one part. Other modes could include surface texture, shape and deformation.
As part of the reality-based modelling project, our sound measurement system shares its development platform: the Active Measurement (ACME) facility at the University of British Columbia (UBC). The ACME facility is a fifteen degree-of-freedom (DOF) robot designed to acquire measurements for reality-based model creation [21]. The facility is equipped with a 5-DOF gantry, a 6-DOF robot arm on a linear stage and a 3-DOF test stage. These actuators are used to position sensors and make measurements of objects mounted on the test station. Sensors include a 3-CCD colour camera, Triclops [22] trinocular vision system and a 6-DOF force/torque sensor.
[Figure 3.1 diagram; block labels: Sound Measurement, estimate, model (x,y,z), Simulation, Force, Position, mixer, Convolve, Environmental Effects (reverb, spatialisation, etc.), Scope of thesis]
Figure 3.1: The sound measurement-to-production pipeline. Measurements at many locations on an object’s surface (x, y, z) are acquired to create a model at each location. Force and position input from a simulation are used to synthesise sounds produced by interactions with the object. A mixer algorithm interpolates between sound models at different locations. Environmental effects such as reverb and spatialisation enhance the realism of the synthesised sound. Our scope is restricted to measurement and modelling (shaded box).
The work that is the subject of this thesis includes the design and implementation of a contact sound measurement system for ACME. As described in Section 3.3, the sound measurement system includes all the hardware and software needed to meet the design requirements listed in Section 3.2.
The scope of the project is restricted to a system for the measurement of contact sounds. Related research, including sound synthesis, interpolation between sound models and modelling of the environment, is outside the scope of this thesis. Figure 3.1 places our project in the context of the sound measurement-to-production pipeline.
3.2 System Requirements
To be successful, a sound measurement system must be able to deliver low-inertia, near-impulsive impacts over the surface of an arbitrary object. Models generated by the system must be registered to the surface of the object for integration with other modes of the complete reality-based model.
The force profile of each impact must be known so that the parameters of a sound model can be estimated as described in Section 2.4.
The system should acquire models automatically. Enough samples should be collected to adequately represent the variation in sound over the entire outer surface of the object.
Since the system will be a component of ACME, all devices used by the system must be controllable under the ACME framework. That is, an interface to every device must exist in the ACME Java architecture and be operable from a remote workstation.
It is assumed that a surface representation of the test object is provided to the sound measurement system by the ACME trinocular camera. This aspect of the system is not an objective of the current research and is not discussed herein.
More detailed requirements of each component of the sound measurement system are presented in the respective chapters.
3.3 System Overview
A sound measurement system was designed and implemented to meet the requirements listed in Section 3.2. The system can be viewed as four modules: a sound acquisition device, an asynchronous data server, an algorithm for computing prototypical sound models and an adaptive sampling algorithm. Each of these modules will be discussed in the following four chapters.
Chapter 4 outlines the requirements and implementation of the sound acquisition device. This includes the design of a sound effector for the ACME robot arm, the software to control it and the selection and location of microphones. The sound effector is a solenoid that mounts on the end of the Puma 260 robot arm and delivers near-impulsive impacts to objects under computer control.
A description of the asynchronous data server is provided in Chapter 5. The data server architecture is adaptable to many types of data. Its design and specialisation to sound data are discussed.
One advantage of using a robotic system is the ability to record multiple samples at the same surface location. A collection of samples can be used to generate a prototypical model which best represents the sound at that location. One algorithm for generating prototypical models is explained in Chapter 6.
The final module is an adaptive sampling algorithm for selecting points on the object’s surface at which to sample. The sampling algorithm uses the object’s surface representation to create a mesh of sample locations. It is adaptive because it uses differences in sound models to adjust the granularity of the sampling mesh. The surface representation and adaptive algorithm are described in Chapter 7.
The general procedure for creating a complete sound model using this system is diagrammed in Figure 3.2. First, a test object is positioned at the center of the test station (Figure 3.3). Typically the object is attached such that it does not move from light contact, yet remains free to vibrate. A surface model of the object is then acquired using the trinocular stereo camera. From this surface model, a coarse sampling mesh is created using the surface’s vertices as sample locations. Next, the sound effector is positioned 5 mm away from each sample location. This is done with a guarded move: the sound effector approaches the sample location until contact is sensed, and is then retracted 5 mm. Once in position, the sound effector is actuated multiple times, the resultant sound samples are recorded, and models are computed. From the multiple samples, a prototypical model is produced for each sample location. If two prototypical models at adjoining vertices of the surface mesh differ by too much, the sampling mesh is refined, and a new model is acquired at a position between the two coarse locations. This procedure continues until no further refinements are required.
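The refinement step of this procedure can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis (which is Java-based and operates on a surface mesh); it assumes, purely for illustration, that sample locations are points on a line, that `acquire_model` and `distance` are hypothetical callables standing in for the measurement and the acoustic distance, and that the new location is the midpoint of the two coarse locations.

```python
def refine_samples(locations, acquire_model, distance, threshold, max_rounds=10):
    """Adaptively refine a 1-D list of sample locations (illustrative only).

    locations     -- coarse sample positions (1-D stand-in for mesh vertices)
    acquire_model -- hypothetical callable: position -> prototypical model
    distance      -- hypothetical callable: (model, model) -> acoustic distance
    threshold     -- refine between neighbours whose models differ by more
    """
    models = {x: acquire_model(x) for x in locations}
    for _ in range(max_rounds):
        xs = sorted(models)
        # Midpoints between adjoining locations whose models differ too much.
        new_points = [
            (a + b) / 2.0
            for a, b in zip(xs, xs[1:])
            if distance(models[a], models[b]) > threshold
        ]
        if not new_points:
            break  # no further refinements are required
        for x in new_points:
            models[x] = acquire_model(x)
    return models
```

Calling `refine_samples([0.0, 1.0, 2.0], measure, dist, 0.5)` with any two such callables returns a dictionary mapping every sampled position to its model; the `max_rounds` guard is an added assumption to keep the sketch from looping indefinitely.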
The complete sound model produced by this procedure is easily registered to other modes of the object model since it creates sound models at vertices of the object’s surface model.
Results of some typical data collections are presented in Chapter 8. The accompanying CD contains audio tracks comparing recorded sounds to sounds synthesised from a model, and a Java application for browsing the sound model of an object.
Figure 3.2: General procedure for sound model creation.
Figure 3.3: A test object on the ACME test station.
Chapter 4
A Sound Acquisition Device
4.1 Overview
In order to estimate the parameters of an object’s sound model, a recording of the object being struck with a known force is required. The process of estimating parameters from the recording is described in Section 2.4. Since our goal is to create sound models automatically, we require a device which is capable of striking arbitrary objects in a manner suitable for the estimation process. A device is also required to record the resultant sounds. More detailed requirements of these devices are listed in Section 4.3.
This chapter describes a device that was designed to strike objects for the purpose of sound measurement. The device is an end effector for the Puma 260 robot arm which is part of the Active Measurement facility at UBC. The Active Measurement facility is described in Section 4.4. Details of the end effector and sound capturing hardware are presented in Sections 4.5 and 4.6.
Section 4.7 describes the software required to control the end effector. The software used to record the sound samples is the topic of Chapter 5.
4.2 Related Work
Although this is the first work to automatically create sound models over the entire surface of an object, other people have investigated sound model creation from recordings of real objects. The mechanisms by which these recordings were produced are varied.
Cook and Trueman recorded directional impulses from a number of acoustic instruments by striking the instruments’ strings with a Modal Shop Model 086C80 miniature force hammer [4]. The sound was recorded using a twelve-microphone icosahedral grid assembly. Recordings were stored using two Tascam DA-88 digital audio tape recorders. It was not mentioned how precisely the impact locations were registered in space, since registration was not important to the study. Impacts were performed manually, and the force profile of each impact was used to filter “bad” impacts. The commercial force hammer used by Cook and Trueman is not suited to our task because it cannot effect an impact under its own power. While it could be mounted at the end of a robot arm, the swinging of a hammer is difficult to control.
Durst and Krotkov created impulse response models of different materials by dropping an aluminum cane with a plastic tip onto each material [8]. The cane was dropped from a constant height through a cylindrical guide [17]. Recordings were made using an omni-directional condenser microphone connected to a PC-based A/D board. This method is clearly not suitable for our task.
van den Doel created sound models from impacts with a non-instrumented hammer [30]. Again, the hammer mechanism is not suited to our application; this experiment is mentioned because satisfactory results were obtained with only a crude approximation of the impact force.
Huang uses a tapping device to position and orient objects in a plane [13]. His requirements of the device are similar to ours, except he does not require the device to operate off the horizontal plane. The mechanism he proposes is similar to a pinball plunger; a spring-loaded rod is released by an electric latch, then reloaded automatically [W. Huang, personal communication, April 28, 2000]. Unfortunately, his device may not operate correctly with any vertical inclination.
4.3 Requirements
The striking device must deliver low-inertia impacts to objects at any position on their surface. This requires operation through a µY¶ · ¸ vertical range.
The force of impact should not be so strong as to move the object, yet strong enough to produce a recordable sound. The quantity of force is dependent on the material being measured.
The force must be near-impulsive. A true impulse is not realisable, but may be approximated by a force that is well localised in space and time. As demonstrated by van den Doel, such approximate impulses yield satisfactory sound models [30].
To permit future experiments, the device should be usable for scraping objects. This will permit acquisition of sounds for granular synthesis, another model for sound synthesis.
For integration with other models produced using the Active Measurement facility, the locations of impact must be registered to a common frame-of-reference.
4.4 The Active Measurement Facility
As mentioned in the first chapter, our system is a component of a larger project at UBC: the Active Measurement facility (ACME). The ACME facility provides a rich environment for our measurement system. Currently, ACME consists of three main subsystems: the field measurement system (FMS), test station and contact measurement system (CMS) [21]. These subsystems are used as a base platform for the hardware required by the sound measurement system. The roles of each subsystem are expanded below.
The field measurement system is a five degree-of-freedom (DOF) robot consisting of a 3-DOF gantry and a pan/tilt unit. Currently a 3-CCD colour camera and Triclops trinocular stereo camera are attached to the pan/tilt unit. The FMS is used to acquire measurements at a distance from the test object (i.e., in “the field”). A microphone is added to the FMS for making sound measurements.
The test station is a 3-DOF robot which can position (x, y) and orient objects being measured. Its accuracy is ¹Yº.» º º º ¼ ½ ¾ ¾ and ¹L¿ º arc-min [21].
The contact measurement system consists of a 6-DOF Puma 260 robot arm with an ATI force/torque sensor mounted at its tip. The CMS can move an end effector into contact with the test object in a working volume around the test station. To prepare the CMS for sound measurement, the three fans of the Puma control box were replaced by whisper fans. The Puma arm cannot be used to directly strike the objects because its inertia prevents light, impulsive impacts. Instead, a special end effector is attached. The end effector is described in Section 4.5.
Each of these subsystems is controlled remotely using Java-based control software. For more details on this architecture, refer to [21].
Using the ACME facility as a development platform provides registration of sound samples to other models (e.g., deformation models) since all ACME sensors and actuators share a common frame-of-reference. The sound model is specifically registered to the surface model of the test object.
4.5 Sound Effector
We refer to the end effector designed to strike objects for sound measurement as the sound effector (see Figure 4.1). This device is centered on an electric push-solenoid (Ledex STA model 195025-227). The solenoid is augmented with a return-spring and mounting bracket. A retaining mechanism is designed into the mounting bracket to provide some rigidity in the de-energised state of the solenoid. This rigidity enables the sound effector to be used for scraping objects. The mounting bracket is constructed from aluminum plate.
Figure 4.1: The sound effector is a spring-return push solenoid mounted on an aluminum bracket. A condenser microphone is attached to the bottom of the bracket.
The current design is minimal so as to reduce the weight of the effector. For example, given that a threaded rod is available on the robot’s interface plate, the design incorporates the rod for both attachment purposes and as the aforementioned retention device. The total torque applied to the rod by the effector is approximately 0.0653 N·m (total weight: 106 g), within acceptable limits of the force/torque sensor (500 g). A blueprint of the mounting bracket and details of its construction are included in Appendix A.
An interface circuit is required to activate the solenoid from software. The circuit schematic is included in Appendix A. The interface circuit is connected to a digital output of a Precision MicroDynamics MC8 board. The board provides a 5 VDC output controllable from its on-board SHARC DSP or the host computer. Currently the MC8 board is also used to run a PID control loop for the FMS and test station. A description of the software used to control the sound effector is presented in Section 4.7.
4.6 Sound Capture Hardware
The sound capture hardware consists of two condenser microphones and a PC sound card.
The microphones are Optimus omni-directional lapel microphones with a flat frequency response from 70 to 16 000 Hz [14]. One microphone is attached to the bottom of the sound effector’s mounting bracket (Figure 4.1); the other to the pan/tilt unit of the FMS. This placement enables near and far-field recording of impact sounds. The microphone on the FMS can be moved to any location around the object for experiments evaluating directional impulse responses.
A Creative SoundBlaster Live! card is used to record the sounds digitally. This card is commercially available and can sample at up to 48 kHz [5]. Using a PC sound
card facilitates easy and affordable upgrades as technology improves. One disadvantage of the card is that only two channels of sound may be recorded simultaneously. Furthermore, both channels must use the line-in connection, since the microphone connection is single channelled. This restricts our system to using two microphones, both of which must be pre-amplified to line levels. If more input channels are required, a high-end sound card could be purchased.
4.7 Sound Effector Software
Activation of the sound effector requires a unit step signal from the digital output of the MC8 board. The step width determines the stroke distance of the solenoid. Initially, the step function was to be generated in the interface electronics. A software solution, however, enables us to change the width of the unit step, and hence stroke length, at runtime. To control this output using ACME requires a Java interface compliant with the ACME Device interface. This section describes the design of the ACME DigitalOutputConnectionServer (DOCS), an interface between ACME and the digital outputs of the MC8 board. This interface is used by the sound effector, but can also be used by other devices requiring similar control (e.g., lights).
The DigitalOutputConnectionServer has three main components (Figure 4.2): a ConnectionServer, OutputServer and one or more DigitalOutputDevices (e.g., sound effector, or spotlight). The MC8 board has 32 digital output lines available for external devices [23]. The ConnectionServer manages the allocation of each output line to a specific DigitalOutputDevice. The OutputServer is a ‘C’ program (with a Java native interface) which controls the initialisation, timing and output of the signal to the MC8 board. The DigitalOutputDevice is an abstract implementation of the ACME Device interface. Each device using the DigitalOutputConnectionServer is controlled by a class extending DigitalOutputDevice. Each device must lock one channel of the MC8 for its exclusive use by registering with the ConnectionServer at initialisation.
Figure 4.2: Digital Output Connection server architecture.
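The exclusive allocation of output lines can be pictured with a small Python sketch. It is only an illustration of the idea: the real ConnectionServer is a Java process reached over RMI, and the method names (`register`, `release`, `owner`) and the lowest-free-line policy used here are assumptions rather than the actual ACME interface.

```python
class ConnectionServer:
    """Toy allocator for digital output lines (illustrative only)."""

    def __init__(self, num_lines=32):
        self._num_lines = num_lines   # the MC8 board offers 32 output lines
        self._owner = {}              # line number -> device name

    def register(self, device_name):
        """Lock the lowest free line for exclusive use by device_name."""
        for line in range(self._num_lines):
            if line not in self._owner:
                self._owner[line] = device_name
                return line
        raise RuntimeError("no free digital output lines")

    def release(self, line):
        """Free a line so that another device may register for it."""
        self._owner.pop(line, None)

    def owner(self, line):
        """Return the name of the device holding the line, or None."""
        return self._owner.get(line)
```

In this sketch a device would call `register` once at initialisation and keep the returned line number for all later output requests.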
The OutputServer and ConnectionServer run as separate processes from the ACME server. Although the OutputServer and ConnectionServer reside on the Solaris computer hosting the MC8 board, devices can be controlled from ACME experiments running on any computer because the DigitalOutputDevices communicate to the ConnectionServer using the Java Remote Method Invocation (RMI) interface.
As mentioned above, timing of the unit step is controlled by the OutputServer. This ‘C’ program provides resolution better than a millisecond, limited only by the Solaris operating system. Isolating the timing from the ACME experiment ensures consistent timing of the output signal.
Chapter 5
An Asynchronous Data Server
5.1 Overview
The next module of the system is the software used to record sounds produced by striking the test object. Previously, no generic architecture existed within ACME for capturing streaming data. As with the DigitalOutputConnectionServer (see Section 4.7), a generic data server was designed, and then specialised to sound data for this research. The generic data server framework became the Sensor class of the ACME project.
This chapter describes the design and implementation of an asynchronous data server for ACME, including its specialisation to sound data. The next section lists the requirements of such software. Section 5.3 describes the architecture of the data server. Implementation details are discussed in Section 5.4.
5.2 Requirements
This section outlines the requirements of a generic data server (Section 5.2.1) and its specialisation to sound data (Section 5.2.2).
5.2.1 Generic Data Server Requirements
One data server must exist for each sensor device. If multiple sensor devices of the same data type exist, each sensor must have its own data server process.
Each data server will run as a separate process. This division enables distribution of the server processes over multiple computers. If the sensor hardware resides on a computer that is not the main ACME host, the sensor process should also reside on that computer. Since data collection can be an intensive operation, distribution increases the data servers’ ability for real-time collection by allocating more processing resources. Distribution also reduces the amount of data which must be streamed over the network in real-time.
If platform-dependent software is required for a data server, it must not prevent the ACME experiment from accessing the data from a different operating system.
The data server must be asynchronous. That is, the data server process controls the starting and stopping of data collection independently of the main ACME server. This autonomy eliminates the need for the data server to stream data to the ACME server in real-time. Since sensor data can be large (e.g., image data from cameras) or frequent (e.g., 44.1 kHz for sound), it is impractical to transmit each frame of data to the ACME server for real-time monitoring.
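To give a sense of scale for the sound case, the raw data rate of an uncompressed audio stream can be worked out directly. The short Python calculation below is only illustrative; the two-channel, 16-bit format is one of the formats considered later in this chapter, and the function name is hypothetical.

```python
def raw_data_rate(frame_rate_hz, channels, bits_per_sample):
    """Bytes per second produced by an uncompressed PCM stream."""
    return frame_rate_hz * channels * bits_per_sample // 8


# Two channels of 16-bit samples at 44.1 kHz.
rate = raw_data_rate(44_100, channels=2, bits_per_sample=16)
print(rate)  # 176400 bytes per second
```

At this rate, sending every frame individually would mean 44 100 separate transmissions per second, which is why the server is required to collect data autonomously.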
The criteria for starting and stopping data collection must be definable by the ACME experiment.
A method must be provided to the ACME experiment that indicates when data is being collected and when it has finished.
5.2.2 Sound Server Requirements
The sound server must capture data at a minimum of two sampling rates: 44 100 Hz and 22 050 Hz. If the capturing hardware supports higher sampling rates, the sound server should accommodate them by user-definable properties. The format of the data may be 8 or 16-bit, and one or more channels (to the limitations of the capturing hardware).
Since PC sound cards are readily available and of sufficient quality for our purposes, the sound server will capture data from a commercial sound card.
5.3 Server Architecture
The data server architecture has four main components: sensor hardware, a SensorServer, SensorDevice, and a data stream connecting the SensorServer and SensorDevice (Figure 5.1). The sensor hardware component is an abstraction of both the physical hardware and device drivers. Typically, a native interface is created to allow the device to communicate with other Java components. Also, although incoming data is typically buffered at the hardware level, the hardware cannot be guaranteed not to drop frames of data if read too slowly. The SensorServer is a process which resides on the computer hosting the sensor hardware. This process queries the sensor hardware at a prescribed rate, starts and stops data capture, and buffers the incoming data. The SensorDevice is a remote interface to the SensorServer. It communicates to the SensorServer via the Java Remote Method Invocation (RMI) protocol. The SensorDevice resides in the ACME server and is the ACME experiment’s link to the SensorServer. A data stream connects the SensorServer and SensorDevice. The stream uses socket communication to pass data from the SensorServer to the SensorDevice. Data is written to the stream by the SensorServer and buffered until it is read by the SensorDevice. Thus, data can be reliably streamed to the ACME experiment at a rate slower than its acquisition at the hardware device.
This architecture enables us to acquire data from any computer, regardless of its platform. Only the SensorServer uses platform-dependent native code; the SensorDevice can reside on any remote computer that hosts a Java Virtual Machine. Since each SensorServer is written for a specific device operating on a fixed platform, this dependence is not restrictive.
Not all data is passed from the sensor hardware to the ACME experiment. Users define the criteria for starting and stopping data capture by creating custom SensorTriggers. One SensorTrigger is created to start capturing, and one to stop capturing. These SensorTriggers can be monitored by other processes at runtime by registering a TriggerListener object. The TriggerListener object will be notified when a SensorTrigger is activated. This mechanism provides a rough synchronisation tool to the ACME experiment. The next section clarifies the role of the SensorTriggers in server execution.
Figure 5.1: The architecture of the data server is divided into four main components: sensor hardware, SensorServer, SensorDevice and a data stream. The SensorServer runs as a separate process and is not necessarily run on the same computer as the ACME server.
5.3.1 Server Execution
This section describes the operation of the data server. Its operation can best be viewed as the state diagram in Figure 5.2. The server begins in a LIMBO state. This is a default state indicating that it has been created, but not yet initialised. The ACME server initialises the data server at startup. The data server then initialises sensor hardware and creates required data buffers. After initialisation, the data server waits in a STOPPED state.
The RUNNING state is entered by a request from the ACME experiment to start spooling data. In the RUNNING state, the data server executes a control loop at a rate prescribed by the ACME experiment. The control loop is listed in Figure 5.3. At each iteration, the next frame of data is requested from the sensor hardware. Here, a frame is defined as a data measurement for one time period (e.g., one image from a camera, or one 6-value reading from a force/torque sensor). The current frame is passed with a window of previous data to the START trigger. If the conditions for the START trigger are satisfied by the current frame, the data server is placed into the CAPTURING state. Otherwise, the RUNNING control loop repeats.
Figure 5.2: Data server state diagram. Transitions: LIMBO to STOPPED on init(); STOPPED to RUNNING on startSpooling(); RUNNING to CAPTURING on the START trigger; CAPTURING to STOPPED on the STOP trigger.
1. Read next frame from sensor hardware.
2. Evaluate frame in START trigger.
3. If START trigger fires, change state to CAPTURING.
4. Otherwise, loop to step 1.
Figure 5.3: Control loop for RUNNING state.
Once the START trigger has been activated, the data server is in the CAPTURING state. The CAPTURING state runs a control loop similar to that for the RUNNING state, with the exception that data is passed on to the data stream (Figure 5.4). Data is read one frame at a time from the sensor hardware, then analysed by the STOP trigger. If the conditions of the STOP trigger are met, the data server enters the STOPPED state. Otherwise, the data is added to the stream and the control loop repeats. Section 5.3.2 elaborates on the flow of data throughout this process.
1. Read next frame from sensor hardware.
2. Evaluate frame in STOP trigger.
3. If STOP trigger fires, change state to STOPPED.
4. Otherwise, send frame to data stream.
5. Loop to step 1.
Figure 5.4: Control loop for CAPTURING state.
Figure 5.5: Server data flow. Frames pass from the hardware buffer to the ring buffer and the trigger (where the RUNNING loop stops), then to the transfer buffer and the data stream.
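The state diagram and the two control loops (Figures 5.3 and 5.4) can be condensed into a short Python sketch. This is an illustration under stated assumptions, not the ACME code: the class and method names are hypothetical, frames are plain numbers drawn from a list, and the triggers are simple callables. The sketch follows the figures literally, so the frame that fires a trigger is not itself written to the stream.

```python
from enum import Enum


class State(Enum):
    LIMBO = 0
    STOPPED = 1
    RUNNING = 2
    CAPTURING = 3


class DataServer:
    """Toy data server following Figures 5.2 to 5.4 (illustrative only)."""

    def __init__(self, frames, start_trigger, stop_trigger):
        self._frames = iter(frames)   # stand-in for the sensor hardware
        self._start = start_trigger   # callable: frame -> bool
        self._stop = stop_trigger     # callable: frame -> bool
        self.stream = []              # stand-in for the data stream
        self.state = State.LIMBO

    def init(self):
        self.state = State.STOPPED

    def start_spooling(self):
        if self.state is not State.STOPPED:
            raise RuntimeError("server must be STOPPED to start spooling")
        self.state = State.RUNNING

    def step(self):
        """One iteration of the RUNNING or CAPTURING control loop."""
        frame = next(self._frames)
        if self.state is State.RUNNING:
            if self._start(frame):
                self.state = State.CAPTURING
        elif self.state is State.CAPTURING:
            if self._stop(frame):
                self.state = State.STOPPED
            else:
                self.stream.append(frame)

    def run(self):
        """Loop until the STOP trigger returns the server to STOPPED."""
        while self.state in (State.RUNNING, State.CAPTURING):
            self.step()
        return self.stream
```

For example, with frames `[0, 0, 5, 6, 7, 0, 1]`, a START trigger that fires on values of at least 5 and a STOP trigger that fires on 0, the stream receives `[6, 7]` and the server ends in the STOPPED state.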
5.3.2 Data Flow
The preceding discussion of server control loops presented a simplified view of the flow of data through the system. More details are included in this section. Figure 5.5 illustrates the flow of data through the server. Both the RUNNING and CAPTURING control loops use the same data flow structure to the point indicated in Figure 5.5.
At each iteration of the control loops, one frame of data is read from the sensor hardware. As mentioned in Section 5.3, the sensor hardware typically provides a data buffer; this occurs at either the hardware or device driver level. This buffer is assumed to overwrite old data as the buffer fills. Thus, if data is not read quickly enough, frames may be lost. It is assumed that each frame read has not been read previously.
This frame of data is copied into a ring buffer that resides in the SensorServer. The SensorServer guarantees that the data in the ring buffer will not be overwritten until it has been examined.
The ring buffer is passed to the trigger object (either START or STOP depending on the system state). The trigger has a defined window size for looking at the data. This window size must be smaller than the size of the ring buffer and greater than or equal to one frame. By defining a window size greater than one frame, the trigger can perform time-domain filtering of the signal, or use a time-dependent trigger condition. An example of such a trigger is one that is activated by a signal that is greater than the average of the past five frames.
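The example trigger just mentioned can be sketched in a few lines of Python. The sketch assumes scalar frames and keeps its own five-frame history in a `collections.deque`, which stands in for the window onto the server's ring buffer; the class name, the `evaluate` method and the choice to wait until the window is full are all assumptions made for illustration.

```python
from collections import deque


class AboveRecentAverageTrigger:
    """Fires when a frame exceeds the mean of the previous `window` frames."""

    def __init__(self, window=5):
        # A bounded deque behaves like a small ring buffer: appending to a
        # full deque silently discards the oldest frame.
        self._history = deque(maxlen=window)

    def evaluate(self, frame):
        full = len(self._history) == self._history.maxlen
        fired = full and frame > sum(self._history) / len(self._history)
        self._history.append(frame)
        return fired
```

Fed the sequence 1, 1, 1, 1, 1, 1, 3, 1, this trigger fires only on the value 3, the first frame that exceeds the average of the five frames before it.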
If the server is in the RUNNING state, the data flow ends here. In the CAPTURING state, the new frame of data is appended to a transfer buffer. Once the transfer buffer is full, it is written to the data stream. The size of the transfer buffer is controlled by the user. For data which is sampled at a high rate, it may be desirable to buffer several hundred frames before writing to the stream. This reduces the amount of communication overhead needed to transmit each frame over the network.
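The batching behaviour of the transfer buffer can be illustrated with the following Python sketch. The names are hypothetical, the `write` callable stands in for the socket-based data stream, and the explicit `flush` for a partially filled buffer at the end of a capture is an assumption, since that case is not described here.

```python
class TransferBuffer:
    """Accumulates frames and writes them to a sink in batches."""

    def __init__(self, capacity, write):
        self._capacity = capacity   # frames per write (user-controlled)
        self._write = write         # callable taking a list of frames
        self._frames = []

    def append(self, frame):
        """Buffer one frame; write the batch once the buffer is full."""
        self._frames.append(frame)
        if len(self._frames) >= self._capacity:
            self.flush()

    def flush(self):
        """Write any buffered frames, e.g. when capture stops."""
        if self._frames:
            self._write(self._frames)
            self._frames = []
```

With a capacity of three, appending seven frames results in two writes of three frames each, and a final `flush` writes the remaining frame, so seven frames cost three writes rather than seven.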
The ACME experiment can monitor the data capture by querying the data stream. Once the data capture is complete, the data stream’s end-of-file flag is raised.
5.3.3 Specialisation to Sound Data
The ACME sound sensor server is a specialisation of the generic data server model. The specialisation is straightforward with one exception that is explained below.
Sound data is typically captured from the sound card at a frame rate of 44 100 Hz. One frame of sound data contains one or two channels of 16 or 8-bit sound samples (Figure 5.6).
Figure 5.6: An AudioPacket contains multiple frames of audio data. Each audio frame contains one or more channels of data. Each channel contains one audio sample. Here, two channels (L, R) of 16-bit (2-byte) samples are illustrated.
As explained in Section 5.3.2, considerable processing and data movement occur with the receipt of each frame of data. Although data is passed by reference where possible, there are inevitably data copies and memory allocations when new data is passed on to the data stream. There is also some computational overhead each time the trigger analyses the data.
The generic server model cannot process incoming data at audio frame rates; because data is not read quickly enough from the hardware, frames are dropped. To achieve desired frame rates, audio data must be processed in groups of more than one audio frame. A simple redefinition of “frame” suffices to achieve real-time performance with the generic data server model. Here we define an AudioPacket to be an array of one or more audio frames (Figure 5.6). The AudioPacket becomes the conceptual “frame” that is read from the sound sensor hardware by the sound server. Performance of the sound server using AudioPackets is discussed in Section 5.4.1.
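The relationship between samples, audio frames and packets can be made concrete with a short Python sketch. It is illustrative only: the function names are hypothetical, and the little-endian, signed 16-bit interleaved layout is an assumption (it is the usual PCM layout on Windows, but the byte order is not specified here).

```python
import struct


def decode_frames(pcm_bytes, channels=2):
    """Decode little-endian signed 16-bit interleaved PCM into audio frames.

    Each frame is a tuple with one sample per channel, e.g. (left, right).
    Trailing bytes that do not fill a whole frame are ignored.
    """
    frame_size = 2 * channels
    usable = len(pcm_bytes) - len(pcm_bytes) % frame_size
    samples = struct.unpack("<%dh" % (usable // 2), pcm_bytes[:usable])
    return [tuple(samples[i:i + channels])
            for i in range(0, len(samples), channels)]


def make_packets(frames, frames_per_packet):
    """Group audio frames into packets of at most frames_per_packet frames."""
    return [frames[i:i + frames_per_packet]
            for i in range(0, len(frames), frames_per_packet)]
```

With 100 frames per packet, a 44.1 kHz stream is handled as 441 packets per second instead of 44 100 individual frames, which is the reduction in per-frame overhead that the AudioPacket is meant to provide.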
5.4 Implementation Details
The generic data server model is an abstract Java class, with interfaces defined for Sensor and Trigger objects. The sound server is also written in Java with the exception of the code that reads data from the sound card.
Originally, data was to be read from the sound card using the JavaSound API from Sun Microsystems. At the time of development, JavaSound was a Beta release (version 0.86). Not all required functionality was available, and frequent changes to the API rendered this option undesirable.
Currently, data is read from the sound card using the Microsoft DirectX API (version 6.0). A Java wrapper was written to interface the DirectX ‘C’ code with the rest of the sound server. Every effort was taken to create a Java wrapper that paralleled the JavaSound API so that it could be substituted at a later date.
Two triggers are used to capture sound for modelling: a ThresholdTrigger and a FixedDurationTrigger. The ThresholdTrigger is used as a START trigger to begin capturing data after the amplitude of the signal exceeds a threshold. The FixedDurationTrigger is used as a STOP trigger to make each recording the same length. This combination of triggers yields recordings of the impulse responses that are identical in length and contain approximately the same length of silence before each impact.
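The way these two triggers cooperate can be sketched in Python. The class names are borrowed from the text, but the bodies, the `evaluate` method and the `capture` helper are assumptions written for illustration; the actual triggers are Java classes. Packets are modelled as lists of single-channel samples, and this toy version drops the packet that fires the START trigger, so it does not show how the real server retains pre-impact audio.

```python
class ThresholdTrigger:
    """START trigger: fires when any sample magnitude exceeds a threshold."""

    def __init__(self, threshold):
        self._threshold = threshold

    def evaluate(self, packet):
        return any(abs(sample) > self._threshold for sample in packet)


class FixedDurationTrigger:
    """STOP trigger: fires once a fixed number of samples has been passed."""

    def __init__(self, duration_s, sample_rate_hz):
        self._remaining = int(round(duration_s * sample_rate_hz))

    def evaluate(self, packet):
        self._remaining -= len(packet)
        return self._remaining < 0


def capture(packets, start, stop):
    """Return the packets recorded between the START and STOP triggers."""
    recorded, capturing = [], False
    for packet in packets:
        if not capturing:
            capturing = start.evaluate(packet)
        elif stop.evaluate(packet):
            break
        else:
            recorded.append(packet)
    return recorded
```

Because the STOP condition counts samples rather than inspecting the signal, every recording produced this way has the same length whenever the requested duration is a whole number of packets.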
5.4.1 Performance Evaluation
The sound server was subjected to a performance evaluation to guarantee the integrity of the recordings, and estimate bounds on the size of the AudioPackets and transfer buffer. A one-second 1 kHz sinusoidal tone was used as the test stimulus. A cable connected the line-level output of one computer (a dual PII 350 MHz running Windows NT with a Creative SoundBlaster Live! sound card) to the line-level input of a second computer (a Pentium 120 MHz running Windows 98 with a Creative SoundBlaster Live! sound card). The sound server ran on the second computer and transferred data to a client running on the first computer over an Ethernet connection. A FixedDuration trigger was used as a STOP trigger to record 2.5 seconds of sound for each trial. In each trial, the size of either the transfer buffer or AudioPackets was varied. The test was run five times for each parameter setting with the one-second tone played at a random start time within the 2.5 second recording window. The recorded sound was saved as a PCM wave file and examined visually and audibly for evidence of degradation. Recordings were captured at a 44.1 kHz sampling rate.
Figure 5.7: Spectrogram of sound containing “pops”. Three “pops” are visible in the spectrogram as vertical lines at approximately 1.4, 1.5 and 1.65 seconds. These “pops” are caused by dropped audio frames. This sample was recorded using a transfer buffer of one second, and an AudioPacket size of 200. The recording parameters were two-channel, 16-bit sound sampled at 44.1 kHz. Axes: Time (horizontal), Frequency (vertical, ×10^4).
Two criteria are used to subjectively measure performance: the length of the recorded tone, and the presence or absence of “pops”. The length of the recorded tone is computed manually using a graphical sound editor. If the recorded tone is less than one second, the parameter settings are designated unsatisfactory. Smaller gaps in the recording are experienced as “pops” when heard, and observed as spikes in the spectrogram (Figure 5.7). Parameter settings producing “pops” are also unsatisfactory.
Results of this evaluation are tabulated in Table 5.1. From this table, it is evident that the audio signal will lose large chunks of data if the AudioPacket size is not great enough. Similarly, “pops” will be present in the recorded data unless the transfer buffer is large enough. From this data, it is concluded that a transfer buffer of two seconds and an AudioPacket size of 100 frames are required for adequate performance.
Channels  Bits  AudioPacket size  Transfer buffer size (secs)  Too short  “Pop”s
1         16    1                 1                            Y          N
1         16    10                1                            Y          N
1         16    100               1                            N          Y
1         16    200               1                            N          N   *
2         16    1                 1                            Y          N
2         16    10                1                            Y          N
2         16    100               1                            Y          N
2         16    200               1                            N          Y
2         16    1000              1                            N          Y
2         16    1                 2                            Y          N
2         16    10                2                            Y          N
2         16    100               2                            N          N   *
2         16    200               2                            N          N   *
Table 5.1: Results of the sound server performance evaluation. Rows marked * indicate parameter settings that are successful. Recordings were captured at a sampling rate of 44.1 kHz on a Pentium 120 MHz running Windows 98.
5.4.2 Limitations
While performance of the sound server is acceptable, one limitation remains. Using the DirectX v6.0 API prevents us from running the sound sensor server on operating systems other than Microsoft Windows 95/98/2000. As mentioned in Section 5.3, this does not prevent us from streaming data to clients on other computing platforms, but restricts us to Windows-compatible sound cards. Currently, Windows is the most widely-supported operating system for sound cards, so this is not a major limitation. Also, the DirectX Java interface maintains the JavaSound design and class structure to enable easy substitution once the JavaSound API is complete.
Chapter 6
Building a Prototypical Model
6.1 Overview
The hardware and software described in the previous two chapters enable us to acquire multiple samples at any location on the surface of an object. These samples are each represented by a sound model. The sound model may be any one of the impulse response models discussed in Chapter 2. The parameters of the model are computed for each sample using the appropriate estimation technique. In our current implementation, the model of van den Doel [30] represents each sample; the parameters of the model are computed by the technique presented in [30]. Refer to Section 2.4 for a brief description of this technique.

The advantage of acquiring multiple samples at the same location is that instrumentation and background noise can be minimised by averaging the samples. A prototypical model, representative of multiple models, can be created at each sample location. The prototypical model should be less affected by noise than each individual model. Such a prototypical model is used for comparing the acoustic distance between two sample locations; this use is discussed in Chapter 7. A prototypical model is also used for synthesis when the object is simulated.

This chapter outlines one approach to generating a prototypical model from the available data. The approach is an intuitive one and produces satisfying results. While there are other solutions to this problem which may produce better results, our intention is to provide one example by which to demonstrate the utility of such models.
6.2 Related Work
Although our exact problem appears to be unique, two other fields of audio research have produced related work: speech and speaker recognition and audio morphing. Neither of these fields shares our exact goal, but each is similar in some respect.

Speech and speaker recognition researchers have investigated methods for clustering sets of audio data. Their problem is one of classic pattern recognition: given a set of N categories (e.g., different speakers or words), and M training exemplars for each category, classify a new data sample into one of the categories. At a high level, this is most often accomplished by creating a prototype for each category from the training set, then computing a distance from the new data to each of the prototypes. These prototypes are typically formed using a modified K-means algorithm on the vectors resulting from Linear Predictive Coding (LPC) analysis [24]. Similar methods were implemented on our sound models, but the variability of our models (due to noise) caused unsatisfactory results.

Audio morphing is a process that smoothly blends one sound into another. If you imagine a continuum between two sounds, an audio morphing algorithm can generate a sound at any point on the continuum. This sound contains a proportion of each sound, yet is perceived as a single new sound. Slaney et al. describe one such method in [27]. They claim that cross-fading conventional spectrograms is not convincing if the two source sounds are not similar in pitch. Their approach is to represent sounds using a smooth spectrogram (derived from the mel-frequency cepstral coefficients [24]) and a residual spectrogram which encodes the pitch. Interpolation occurs in this higher-order spectral representation which is then inverted to produce the resultant sound. If their morphing algorithm could be extended to morph between more than two sounds, this approach could be used to generate our prototypical sound models by estimating model parameters from the composite sound produced by the morph. Indeed, a morphing algorithm is a powerful tool in its ability to weight the contribution of each sound to the morphed result.
6.3 Spectrogram Averaging
The approach we use is similar in spirit to the morphing algorithm of [27], but much less sophisticated. We are able to use a simpler algorithm because we are dealing with a narrow class of sounds: single impact sounds which decay exponentially and are similar in pitch. Our approach is to compute the “average” spectrogram of M samples at one sample location, then estimate the model parameters from this “average” spectrogram. The approach is intuitively satisfying and produces reasonable sound models.

Computing the average time signal of sound samples is non-trivial. Because the modes of the impulse response may not be in phase across samples, exact alignment in time is difficult. Phase differences introduced by inexact alignment will create a signal which sounds like many separate sounds played together.

In contrast, computing the average spectrogram is easier since the spectrogram contains no phase information. Furthermore, the average spectrogram is a natural representation for our task, given that the model parameters are estimated from spectrograms. Since the samples are recorded at the same sample location, we assume the samples have the same pitch, thereby avoiding the problems indicated by Slaney et al. [27].

Aligning the M spectrograms is straightforward. Since each spectrogram represents an impact sound, they can be aligned by matching the onset of impact. This onset is represented in the spectrogram as the time frame with the maximum total amplitude. Once aligned, the average spectrogram is computed as the mean amplitude of each frequency in each time frame.
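
The alignment and averaging steps can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: each spectrogram is assumed to be a list of time frames, each frame a list of magnitudes per frequency bin, and the function names and the decision to keep only the frames that every aligned spectrogram can supply are assumptions made for the example.

```python
def onset_frame(spectrogram):
    """Index of the time frame with the largest total amplitude."""
    totals = [sum(frame) for frame in spectrogram]
    return totals.index(max(totals))


def average_spectrograms(spectrograms):
    """Align spectrograms on their onset frames and average them.

    Each spectrogram is a list of time frames; each frame is a list of
    non-negative magnitudes, one per frequency bin.
    """
    onsets = [onset_frame(s) for s in spectrograms]
    # Keep only the frames that every aligned spectrogram can supply
    # (an assumption of this sketch, not stated in the text).
    before = min(onsets)
    after = min(len(s) - o for s, o in zip(spectrograms, onsets))
    count = len(spectrograms)
    average = []
    for offset in range(-before, after):
        frames = [s[o + offset] for s, o in zip(spectrograms, onsets)]
        # Mean amplitude of each frequency bin in this time frame.
        average.append([sum(bins) / count for bins in zip(*frames)])
    return average
```

With two toy spectrograms whose onsets fall in different frames, the result starts at the shared onset and contains only the frames both recordings cover.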
Since the energy of each impact is not exact, the energy of each signal is normalised prior to computing its spectrogram (Equation 6.1). Energy normalisation ensures that $\sum_t \hat{s}(t)^2 = 1$.

$$\hat{s}(t) = \frac{s(t)}{\sqrt{\sum_t s(t)^2}} \qquad (6.1)$$
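
As a small illustration of energy normalisation, the sketch below scales a sampled signal so that the sum of its squared samples is one. It assumes that "energy" means the sum of squared samples; the function name is hypothetical and the handling of a silent signal is an addition of this example.

```python
import math


def normalise_energy(signal):
    """Scale a signal so that the sum of its squared samples equals 1."""
    energy = sum(x * x for x in signal)
    if energy == 0.0:
        # A silent recording has no energy to normalise (assumed behaviour).
        raise ValueError("cannot normalise a silent signal")
    scale = 1.0 / math.sqrt(energy)
    return [x * scale for x in signal]
```

Normalising before the spectrogram is computed means that a hard strike and a soft strike at the same location contribute equally to the average.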
Mathematically, our approach is also satisfying. If we assume we are recording a signal $y_i(t)$ which is composed of the true signal $s(t)$ and a random additive noise process $n_i(t)$, the development in Equation 6.2 shows that the average spectrum $\bar{Y}(\omega)$ is identically equal to the true spectrum $S(\omega)$ if we also assume a zero-mean noise process.¹

$$y_i(t) = s(t) + n_i(t)$$
$$Y_i(\omega) = S(\omega) + N_i(\omega)$$
$$\bar{Y}(\omega) = \frac{\sum_{i=1}^{M} Y_i(\omega)}{M} = \frac{\sum_{i=1}^{M} S(\omega)}{M} + \frac{\sum_{i=1}^{M} N_i(\omega)}{M} = S(\omega) + \bar{N}(\omega) \qquad (6.2)$$
6.4 Performance Evaluation
To evaluate the effectiveness of spectrogram averaging, a test was conducted using synthetic data. This test is similar to the evaluation of the parameter estimation algorithm (Section 2.4.1).

For this evaluation, M samples were created for each trial by adding noise to sounds synthesised using fifty random modes. A description of the noise and synthesised sounds is found in Section 2.4.1. Eight signal-to-noise ratios were used: ∞ (i.e., no noise), 100, 50, 30, 20, 15, 10 and 5. Averaging over the M samples produced a spectrogram from which model parameters were estimated. The experiment was conducted using five values of M: 1 (control), 2, 5, 10 and 20. One hundred trials were conducted for each pairing of SNR and M values.

The resulting sound models are evaluated by the same measures as Section 2.4.1; the metrics are repeated in Equation 6.3.

¹ This characterisation of noise is a common assumption, though unlikely to be correct in our situation.
[Equation 6.3: the error measures $E_f$ (frequency), $E_d$ (damping) and $E_a$ (initial amplitude) of Section 2.4.1.] (6.3)
where ÿ is chosen to minimise ÿ õ.

The mean error for each signal-to-noise ratio (SNR) is plotted by the dashed lines in Figure 6.1. The solid lines are the mean error over all eight SNRs. The colour of each line indicates the number of samples (M) used in the spectrogram average. As Figure 6.1 shows, spectrogram averaging substantially reduces the mean error for initial amplitude and frequency estimates. Even averaging two samples yields improvements of 20% and 12% on $E_a$ and $E_f$ respectively.

The mean error of the damping parameter is not significantly reduced by spectrogram averaging. The standard deviation of the error is, however, dramatically reduced with increasing values of M. As Figure 6.2 illustrates, the standard deviation of error is reduced by an average of 61% using only five samples. This result implies that spectrogram averaging yields a more consistent estimation of the damping parameter.
[Three plots against noise/signal (0 to 0.2), each with curves for 1, 2, 5, 10 and 20 samples.]
(a) Mean initial amplitude errors ($E_a$). (b) Mean frequency errors ($E_f$). (c) Mean damping errors ($E_d$).
Figure 6.1: Mean errors of the parameter estimation algorithm on spectrogram-averaged synthetic data. The mean error over 100 trials is plotted for eight signal-to-noise ratios (∞, 100, 50, 30, 20, 15, 10 and 5) by the dashed lines. A solid line is the mean error over all signal-to-noise ratios. The colour of each line signifies the number of samples (M) used in the average. For convenience, the inverse ratio (noise/signal) is used as the abscissa. See Equation 6.3 for definitions of $E_a$, $E_f$ and $E_d$.
[Plot of $E_d$ against noise/signal (0 to 0.2), with curves for 1, 2, 5, 10 and 20 samples.]
Figure 6.2: Standard deviation of error for estimated damping parameters. The standard deviation of the mean error over 100 trials is plotted for eight signal-to-noise ratios (∞, 100, 50, 30, 20, 15, 10 and 5) by the dashed lines. A solid line is the mean standard deviation over all signal-to-noise ratios. The colour of each line signifies the number of samples (M) used in the average. For convenience, the inverse ratio (noise/signal) is used as the abscissa. See Equation 6.3 for the definition of $E_d$.
Chapter 7
An Adaptive Sampling Algorithm
7.1 Overview
To completely automate the creation of sound models from measurements, the selection of sampling locations must also be automatic. By sampling location we mean the location on the surface of an object where it is to be struck by the sound effector. The most intuitive approach is to select a uniform grid over the surface of the object. Without knowledge of the object’s shape, a uniform grid in Cartesian space could also be considered. This was demonstrated in [25] and [26].

As part of the Active Measurement facility, we assume that a surface model of test objects is attainable using the Triclops trinocular camera.¹ With a surface model available, a uniform grid can be projected, and the sample locations uniformly distributed.

One question inevitably arises when generating the uniform grid: how finely should we sample? Many everyday objects are composed of different materials (e.g., a glass jar with a metal lid), or have density, volume and thickness variations across their surface. These properties will cause the sound of the object to change perceptually over small regions of the surface. Other objects (e.g., an eraser) may have a relatively constant sound over the entire surface. Consequently, the selection of a multi-purpose sampling density is non-trivial. It cannot be constant, nor can it be pre-determined by object geometry alone.

¹ Currently, software to generate surface models from the Triclops’ range data is not available. This is ongoing work of the ACME project. For the examples in this thesis, surface models of test objects are constructed manually.

1. Select an unsampled vertex as the next sample location.
2. Strike the object at the sample location and create a sound model.
3. Compare the new model to the models at all adjoining sample locations.
4. If the acoustic distance between two adjoining models is greater than a perceptual threshold, add a new vertex between these two.
5. Otherwise, repeat from Step 1 until no vertices are unsampled.

Figure 7.1: Adaptive sampling algorithm.
The algorithm presented in this chapter changes the density of a sampling mesh based on perceived differences in sound over the surface of an object. This algorithm is straightforward and is listed in Figure 7.1. Here, the sample locations are chosen as the vertices of a surface model representing the test object. At each vertex, a sound model is created. If the acoustic distance between this model and one at an adjoining sample location is greater than a perceptual threshold, a new sample location is added between the two (see Figure 7.2). The new sample location should be a vertex in the refinement of the surface. This procedure continues until the acoustic distance between all adjoining models is less than a perceptual threshold.
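
The loop of Figure 7.1 can be sketched as follows. This is a hedged illustration rather than the thesis code: the mesh is reduced to a list of vertices and a set of edges, and `measure`, `distance` and `refine` are hypothetical callbacks standing in for striking the object, the acoustic distance metric and the vertex insertion rule. The `max_samples` guard is an addition of this sketch.

```python
def adaptive_sampling(vertices, edges, measure, distance, threshold,
                      refine, max_samples=1000):
    """Sample vertices, refining any edge whose endpoints sound too different.

    vertices: iterable of hashable vertex identifiers
    edges:    set of frozenset({u, v}) pairs joining adjoining vertices
    measure:  callback returning a sound model for a vertex
    distance: callback returning the acoustic distance between two models
    refine:   callback returning the new vertex inserted on edge (u, v)
    """
    models = {}
    pending = list(vertices)
    edges = set(edges)
    while pending and len(models) < max_samples:
        v = pending.pop(0)              # Step 1: any heuristic may be used
        models[v] = measure(v)          # Step 2: strike and build a model
        # Step 3: compare with every adjoining location already sampled.
        for edge in [e for e in edges if v in e]:
            (u,) = edge - {v}
            if u not in models:
                continue
            if distance(models[v], models[u]) > threshold:
                # Step 4: split the edge and queue the new vertex.
                w = refine(u, v)
                edges.remove(edge)
                edges.add(frozenset((u, w)))
                edges.add(frozenset((w, v)))
                pending.append(w)
    return models, edges
```

On a toy one-dimensional "surface" where the sound model is simply the position, an edge of length 1 with a threshold of 0.3 is refined twice, giving five sample locations.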
Selection of unsampled vertices in the first step of the algorithm is conducted by any heuristic rule. Currently, the vertex nearest the previously sampled vertex is selected. Other possible heuristics include selecting vertices by their connectivity in the mesh or selecting vertices to minimise the amount of movement by the robots.

The algorithm in Figure 7.1 is simplistic, but relies on three complex components: a procedure for adding new vertices to the surface model, an acoustic distance metric and a perceptually-relevant threshold for this metric. This chapter describes one implementation of each of these components. Note that these components are research topics on their own and it is not presumed that the implementations discussed herein are optimal solutions. The algorithm in Figure 7.1 is applicable to any vertex insertion rule, distance metric and threshold, and should be considered the principal result of this chapter.

Figure 7.2: Result of adaptive sampling. The sound models at two adjoining sample locations are compared (left). If the acoustic distance between the vertices is too great, the surface edge is refined by inserting a new vertex (right).
7.2 Related Work
To our knowledge, this is the first algorithm for adaptively selecting locations on the surface of an arbitrary object for purposes of sound modelling. Previous work on modelling of musical instruments from empirical measurements is vague in its selection of sample locations; two examples are [2] and [4]. In [2] a guitar is “thumped” on the bridge by a “sharp instrument” to record measurements for estimates of a body model. A physics-based model was used to estimate the sample location of plucked strings from the recordings. In [4], strings of musical instruments were struck using a force hammer at the point on the instruments’ bridge where the strings made contact.

Rules for the addition of new vertices to a surface have been explored in multi-resolution surface research. Multi-resolution surfaces are appealing in computer graphics because their level-of-detail nature simplifies editing and provides scalable render quality which can speed computer animation development. We have selected a subdivision surface representation, using the Loop [19] scheme for vertex insertion. A brief description is provided in Section 7.4.
Distance metrics for sound have been explored in several domains. Early efforts in this research were in the field of speech recognition. A classic survey of techniques is [11].

Dubnov and Tishby explored comparing musical sounds using a spectral and bispectral acoustic distortion measure [7]. Their approach is to measure statistical similarity using the Kullback-Leibler divergence between models.

More recent work, which focused on creating a “content-based” sound browser (Muscle Fish), uses two levels of features to form a complex parameter space within which distances are computed [15, 33]. One set of features is extracted from each frame of audio data: loudness, pitch, brightness, bandwidth and mel-filtered cepstral coefficients (MFCCs). The time series of these frame values provides a second set of features: the mean, standard deviation and derivative of each frame-level parameter.

To our knowledge, none of the metrics listed above has been used to determine perceptual thresholds on similarity. By “similarity” we mean in the context of perceiving two sounds as being produced by the same object. In our application, we require a metric by which we may decide if two similar sounds are perceived as separate objects and/or materials. Some research has examined the factors by which differences in material and geometry are perceived from audition [8, 9, 12, 16]. While this research does not examine acoustic distances directly, it does provide empirical results indicating which parameters of a sound model are relevant to material and geometric perception. Specifically, the analysis in [16] suggests a metric by which model parameters are related to perceptual similarity. Because empirical thresholds are available for this metric, it was selected for use in this implementation of the adaptive sampling algorithm. Details of the metric are stated in Section 7.5; Section 7.6 contains a discussion relating the results of [16] to thresholds on this metric useful to our application.
7.3 Requirements
The sampling algorithm must select points over the surface of the test object in a dense-enough mesh so as to capture perceptually relevant differences in sound over the entire surface. Satisfaction of this requirement is difficult to quantify and requires future perceptual studies.
7.4 Surface Representation
As mentioned in Section 7.2, a subdivision surface representation is used to facilitate vertex insertion. A subdivision surface represents a smooth surface as the limit of a sequence of successive refinements of a coarse mesh [6]. That is, it represents a coarse approximation to the true (smooth) surface with a finite number of vertices, yet can be systematically refined by adding new vertices until, in the limit, the smooth surface is produced.

Several different schemes exist for specifying the location of vertices added during refinement. Typically, the locations of new vertices are weighted combinations of neighbouring coarse vertices’ positions. We use the Loop scheme as first proposed by Charles Loop [19]. The Loop scheme applies the vertex masks in Figure 7.3 to a coarse set of vertices to calculate the position of vertices in the next level of refinement. The mask on the left is used to insert a new vertex on the edge between two coarse vertices. The mask on the right is used to refine the position of each coarse vertex. The limit position of interior vertices is computed by replacing β in Figure 7.3 with the limit-position coefficient given in [6]. For vertices on the boundary of a surface, the limit position is computed by changing the coefficients of the even-vertex boundary mask to the limit-position values given in [6].

The details of the refinement scheme, and its effect on the geometry of the mesh, are not important to our application; the refinement scheme simply defines a hierarchy of successive edge refinements. Interested readers should consult [6] for a thorough introduction to subdivision surfaces.
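
For readers who want a concrete picture of edge refinement, the sketch below applies the standard Loop edge masks: 3/8 for each endpoint of an interior edge and 1/8 for each of the two vertices opposite it, and the midpoint for a crease or boundary edge. The function names and the tuple representation of points are assumptions of this example.

```python
def loop_edge_vertex(v0, v1, opposite_a, opposite_b):
    """New vertex on an interior edge (v0, v1) under the Loop scheme.

    opposite_a and opposite_b are the third vertices of the two triangles
    sharing the edge. Points are tuples of coordinates.
    """
    return tuple(
        0.375 * (a + b) + 0.125 * (c + d)
        for a, b, c, d in zip(v0, v1, opposite_a, opposite_b)
    )


def loop_boundary_edge_vertex(v0, v1):
    """New vertex on a crease or boundary edge: the edge midpoint."""
    return tuple(0.5 * (a + b) for a, b in zip(v0, v1))
```

In the adaptive sampling algorithm only the existence of this new vertex matters: it is the location that is queued for sampling when the two ends of an edge sound too different.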
[Figure 7.3 panels: interior and crease/boundary masks for edge refinement (odd vertices) and for vertex refinement (even vertices).]
Figure 7.3: Loop scheme refinement masks for subdivision. Masks on the left are used for edge refinement; masks on the right are used for vertex refinement. β is chosen to be $\frac{1}{n}\left(\frac{5}{8} - \left(\frac{3}{8} + \frac{1}{4}\cos\frac{2\pi}{n}\right)^2\right)$.
In our implementation, we begin by sampling the sound at the limit position of each vertex in the coarsest representation of the surface mesh. The acoustic distance between vertices joined by an edge of the mesh is then computed. If the acoustic distance is too great, the vertex which is the refinement of the joining edge is added to the list of vertices to be sampled.

The minimum vertex-spacing required for adequate surface representation and refinement dictates the maximum spacing of the sound sampling mesh. For some simple-sounding objects, this may produce an overly dense sampling mesh. This is not a large concern for applications such as simulation since the denser mesh will rely less on interpolation between sound models during synthesis.
7.5 Acoustic Distance Metrics
As prescribed by the adaptive sampling algorithm (Figure 7.1), an acoustic distance must be calculated between each pair of vertices joined by an edge in the surface. The calculated distance will be compared to a perceptual threshold to make a refinement decision. The selected distance metric must therefore be indicative of the perceptual proximity of two sounds.

Most applications of acoustic distance metrics discussed by the literature in Section 7.2 use distance to make relative comparisons, for example, classifying a new sound by choosing the group of exemplar sounds to which it is nearest. Our application is different because we need to determine if two models will be perceived as different materials or shapes once synthesised. The realism of a sound model requires that the sound produced over the surface of an object is not discontinuous in regions where it should vary smoothly. The distance metric must therefore be applicable to an appropriate perceptual threshold. Further discussion of such a threshold is delayed until Section 7.6.

Most sound metrics calculate a distance directly from waveforms. We chose to use a metric derived from model parameters instead. Much of the work on material perception and discrimination discusses discriminants in terms of model parameters. It is therefore convenient to express the distance metric on these parameters as well.

Specifically, we use a logarithmic ratio of the frequency-independent damping coefficient and a ratio of modal frequencies as metrics. These metrics were suggested by the research of Klatzky et al. in their paper on auditory material perception [16]. Sections 7.5.1 and 7.5.2 describe these metrics in greater detail.
7.5.1 Frequency-Independent Damping Coefficient
As stated previously, and repeated in Eq. 7.1, the sound model is composed of $N_f$ frequency modes, each with a distinct frequency ($f_{k,n}$), initial amplitude ($a_{k,n}$) and damping coefficient ($d_{k,n}$).

$$y_k(t) = \sum_{n=1}^{N_f} a_{k,n}\, e^{-d_{k,n} t} \sin(2\pi f_{k,n} t) \qquad (7.1)$$

It is theorised that the damping coefficient in Eq. 7.1 ($d_{k,n}$) is related to a material’s internal friction parameter ($\phi$) by Equation 7.2 [9, 12, 16].

$$d_{k,n} = \pi f_{k,n} \tan(\phi) \qquad (7.2)$$

Furthermore, it has been demonstrated that the internal friction parameter ($\phi$) is an approximately shape- and frequency-independent material property [18]. A recent study [16] has shown that a frequency-independent damping factor ($\rho = 1/\tan(\phi)$) is a perceptually useful discriminant of material. An estimate of this parameter ($\hat{\rho}_{k,n}$) can be calculated from the estimated damping coefficient ($\hat{d}_{k,n}$) of each frequency mode:

$$\hat{\rho}_{k,n} = \frac{\pi f_{k,n}}{\hat{d}_{k,n}} \qquad (7.3)$$

$\hat{d}_{k,n}$ is only an estimate at one frequency and is subject to noise and variability. A better estimate of the frequency-independent damping coefficient ($\hat{\rho}_k$) is computed as the median of $\hat{\rho}_{k,n}$ for all frequency modes $n$. Given multiple samples at a single vertex, the estimate of the damping coefficient is improved by calculating $\rho_k$ as the median of $\hat{\rho}_k$ over all samples.

The distance between two vertices’ damping coefficients ($\rho_A$ and $\rho_B$) is expressed as a logarithmic ratio:

$$\mathrm{distance}_\rho(A, B) = \left| \log \frac{\rho_A}{\rho_B} \right| \qquad (7.4)$$
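
A minimal sketch of the damping distance follows. It assumes the per-mode factor is taken as π f / d and that the distance is the magnitude of the natural logarithm of the ratio; both are assumptions of this example, as are the function names. Because only the magnitude of the logarithm is used, the result is the same whether the factor or its reciprocal is used, provided the same convention is applied to both vertices.

```python
import math
from statistics import median


def damping_factor(modes):
    """Median frequency-independent damping factor of one sound model.

    modes is a list of (frequency_hz, damping_coefficient) pairs; the
    per-mode factor is assumed to be pi * f / d.
    """
    return median(math.pi * f / d for f, d in modes)


def damping_distance(rho_a, rho_b):
    """Magnitude of the logarithmic ratio of two damping factors."""
    return abs(math.log(rho_a / rho_b))
```

Taking the median rather than the mean keeps a single badly estimated mode, such as one fitted to background noise, from dominating the estimate.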
7.5.2 Frequency Similarity
Using frequency similarity as a distance metric is supported by recent perceptual studies. Klatzky et al. found frequency and damping to be “independent determinants of similarity” [16]. The results of their study show that frequency and decay are used independently when judging the similarity of sounds, but in combination when classifying material. Additionally, experiments by Hermes [12] and Gaver [9] support the theory that frequency is an important perceptual feature for material, shape and size estimation.

We define the frequency distance between two models by Equation 7.5. Modes are matched between models using the algorithm in Figure 7.4. This algorithm guarantees that each mode in model A will be uniquely matched to a mode in model B. Details of the implementation are included in Appendix C.
[Equation 7.5: the frequency distance between models A and B, an average over matched modes weighted by initial amplitude.] (7.5)

where the frequencies compared are those of the modes matched in models A and B by the algorithm in Figure 7.4.
The frequency distance between one model and a collection of models is computed as the frequency distance (Equation 7.5) between the single model and the prototypical model of the collection. Chapter 6 describes the process of creating a prototypical model.
For each frequency mode $f_A$ in model A . . .

1. Find the frequency mode $f_B$ nearest to $f_A$.
2. If $f_B$ is unmatched, match it to $f_A$.
3. Or, if $f_B$ is matched to another mode which is farther than $f_A$, match $f_B$ to $f_A$ instead. Mark $f_B$ as matched, and the mode it replaces as unmatched.
4. Otherwise, loop to step 1, picking the next nearest frequency mode.

Repeat until all frequency modes in model A are matched.

Figure 7.4: Algorithm to find unique frequency mapping between two models.
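
The matching procedure of Figure 7.4 can be sketched as below. This is one possible reading of the listing, not the Appendix C implementation: modes are plain lists of frequencies, a displaced mode is simply put back on a queue to be matched again, and model B is assumed to have at least as many modes as model A. The function name is hypothetical.

```python
from collections import deque


def match_modes(freqs_a, freqs_b):
    """Match every mode of model A to a distinct mode of model B.

    Returns a dict mapping each index in freqs_a to an index in freqs_b.
    """
    if len(freqs_a) > len(freqs_b):
        raise ValueError("model B needs at least as many modes as model A")
    owner = {}                      # index in B -> index in A it is matched to
    pending = deque(range(len(freqs_a)))
    while pending:
        i = pending.popleft()
        fa = freqs_a[i]
        # Candidate modes of B, nearest first (steps 1 and 4).
        by_distance = sorted(range(len(freqs_b)),
                             key=lambda j: abs(freqs_b[j] - fa))
        for j in by_distance:
            if j not in owner:      # step 2: free mode, take it
                owner[j] = i
                break
            k = owner[j]
            if abs(freqs_a[k] - freqs_b[j]) > abs(fa - freqs_b[j]):
                owner[j] = i        # step 3: we are nearer, so replace k
                pending.append(k)   # the displaced mode must be re-matched
                break
    return {i: j for j, i in owner.items()}
```

Each replacement strictly reduces the distance associated with a mode of B, so the loop terminates with every mode of A holding a distinct partner.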
7.6 Perceptual Thresholds
As mentioned in Sections 7.2 and 7.5, the distance metrics selected for this implementation were suggested by the analysis in [16]. The motivation for this selection was the availability of empirical data rating the perceptual similarity of sounds by these metrics.

The study of Klatzky et al. asked subjects to rate the similarity of two synthesised sounds. In their first two experiments, subjects rated how likely two sounds were to have been produced by the same material, regardless of shape. In the third experiment, subjects rated the relative length of the bars that were purported to have produced the synthesised sounds. While these tasks are not identical to ours, we feel they are similar enough to provide suitable thresholds in the absence of more appropriate data. The appropriateness of the metrics and thresholds to our application can be properly evaluated only by a perceptual study of the synthesised models.

The results of Klatzky et al.’s first two experiments are summarised in Table 7.1. The average of the two experiments is listed in the last row of the table. The regression coefficients express the change in perceived similarity for a unit change in the logarithmic ratios of fundamental frequency and decay constant. The similarity ratings are on a scale of 0 to 100, but the perceived rating of identical sounds was found to be approximately 92.45 (mean intercept). Refer to [16] for more details of the experiments and results.

Experiment  Intercept  Freq. Diff. Coeff.  Decay Diff. Coeff.  Product Coeff.  R²
1           88.7       -0.56               -1.02               0.30            0.76
2           96.2       -0.50               -1.06               0.31            0.80
mean        92.45      -0.53               -1.04               0.305           0.78

Table 7.1: Regression of similarity on frequency and decay difference [16]. The mean of the values for the two experiments is also listed.

The mean values in Table 7.1 are scaled by a constant similarity factor (S) to form the thresholds for our distance metrics. The similarity factor adjusts the amount of perceived material dissimilarity permitted before a refinement is required. Table 7.1 suggests three values for a threshold: $\mathrm{distance}_\rho(A, B) = 1.04S$, $\mathrm{distance}_f(A, B) = 0.53S$, and $\mathrm{distance}_\rho(A, B) \times \mathrm{distance}_f(A, B) = 0.305S$. In our implementation, if any of these thresholds is exceeded, a refined vertex is added.
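
The refinement test can be written out directly from the mean coefficients of Table 7.1. The sketch below is an illustration under stated assumptions: the constant and function names are hypothetical, the default S of 0.75 is the value reported for the current implementation, and `similarity_factors` reflects one reading of how the factors in Table 7.2 relate to the distances (each distance divided by the magnitude of its coefficient).

```python
# Magnitudes of the mean regression coefficients in Table 7.1.
COEFF_FREQ = 0.53
COEFF_DECAY = 1.04
COEFF_PRODUCT = 0.305


def needs_refinement(d_freq, d_decay, s=0.75):
    """True if any of the three scaled thresholds is exceeded."""
    return (d_freq > COEFF_FREQ * s
            or d_decay > COEFF_DECAY * s
            or d_freq * d_decay > COEFF_PRODUCT * s)


def similarity_factors(d_freq, d_decay):
    """Value of S at which each threshold would be met exactly.

    Assumed relationship: distance divided by the coefficient magnitude.
    """
    return (d_freq / COEFF_FREQ,
            d_decay / COEFF_DECAY,
            d_freq * d_decay / COEFF_PRODUCT)
```

Under this reading, an edge is refined whenever any of its three similarity factors is larger than the chosen S.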
Currently, an empirical value of S = 0.75 is used as the similarity factor. This value was estimated by reviewing coarse data collections and manually selecting which models required refinement. As an example, Table 7.2 summarises the calculated similarity factors of models collected from three objects: a brass vase, glass wine bottle and plastic speaker. The last row of Table 7.2 compares two locations on the brass vase. Only ten frequency modes were used to model the wine bottle and plastic speaker. Forty modes were used to model the brass vase. Most of the distances reported agree with expectation. Although the distance between the brass vase and wine bottle is low, it is not a surprising result when the brass vase is synthesised with only ten modes, since it then sounds very similar to the wine bottle. Future perceptual studies could be used to determine a less ad hoc value of S.

It should be noted that the frequency distance of Equation 7.5 is not identical to the expression used to compute the coefficients in Table 7.1. In their experiments, Klatzky et al. compared only the fundamental frequency of their synthesised stimuli. Here, however, we compare the average of all frequency modes of the model, but weight their contribution to the average by their initial amplitude. This approximation has proved successful, as will be illustrated by the sample collections in Chapter 8.

                                     Similarity Factor (S)
Object A         Object B          Frequency  Decay     Product
Brass vase       Plastic speaker   1.14       0.781     1.61
Plastic speaker  Wine bottle       1.05       0.585     1.11
Wine bottle      Brass vase        0.267      0.196     0.0943
Brass vase (I)   Brass vase (II)   0.0919     0.000125  0.0000208

Table 7.2: Examples of calculated similarity factors.
Chapter 8
Sample Data Collections
8.1 Overview
This chapter presents the results of four sample data collections. The first collection, a tuning fork, is meant as a calibration experiment. A surface model of the tuning fork is not used, nor is the adaptive sampling algorithm of Chapter 7. For the other three objects, a brass vase, plastic speaker and toy drum, the entire system is used to build a sound model.

The objects were selected to provide examples of a variety of materials. Also, since the shape-acquisition component of ACME is not completed, we required objects with geometries that could be easily modelled manually. Each surface model was constructed from manual measurements using 3D modelling software.

Each test object was mounted on the ACME test station. Objects whose diameters are smaller than the diameter of the test station could not be sampled below heights of 30 mm due to collisions between the sound effector and the test station. For future collections requiring complete coverage, objects could be raised on a narrow pedestal.

The following four sections discuss the test objects, experimental setup, and the results of the data collections.

Object Name      Coarse Vertices  Microphone Distance (mm)  Number of Modes  Similarity Threshold  Number of Samples
Tuning fork      1                5                         5                N/A                   5
Brass vase       136              190                       40               0.75                  5
Plastic speaker  98               40                        10               0.75                  5
Toy drum         31               130                       40               0.75                  5

Table 8.1: Summary of setup parameters for test objects.
Figure 8.1: Setup for acquiring model of tuning fork.

[Spectrogram: Time (0 to 3) against Frequency (0 to 2 x 10^4).]
Figure 8.2: Recorded spectrogram.

[Two spectrograms, Time (sec) against Frequency (Hz): (a) 0 to 3 sec, 0 to 2 x 10^4 Hz; (b) 0.05 to 0.5 sec, 50 to 550 Hz.]
(a) Synthesised spectrogram. (b) Closeup.
Figure 8.3: The figure on the left (a) shows the spectrogram of a sound synthesised from the prototypical model of the tuning fork. The figure on the right (b) is a closeup of the spectrogram showing the estimation of the fundamental frequency at 430.7 Hz and a harmonic at 215.3 Hz.
8.2 Tuning Fork
An A-440 tuning fork was used as a calibration object. The tuning fork was mounted on the ACME test station as illustrated in Figure 8.1. As is also shown, the microphone was located 5 mm from the nearest tine. The tuning fork was struck five times at one location to produce five three-second recordings. A five-mode prototypical model was created from the five recordings. Table 8.1 summarises the setup parameters for all of the test objects.

8.2.1 Estimation Results
Figure 8.2 displays the spectrogram of one recording. The 440 Hz tone is present, as is a harmonic at 220 Hz and several overtones at 9430 Hz and 10120 Hz. When the tuning fork is struck, only the overtones are audible from a distance. At its close proximity, however, the microphone was able to record the low amplitude 440 Hz tone.

Figure 8.3 contains two spectrograms produced by a sound synthesised from the prototypical model. As both spectrograms show, the prototypical model contains relatively accurate estimates of the two harmonics and the overtones present in the original spectrogram. The mode representing the 440 Hz tone was estimated at 430.7 Hz. The frequency was estimated from a 1024-point Discrete Fourier Transform (DFT), with a frequency resolution of 43 Hz. Since the estimation is within 43 Hz of the true frequency, the test was a success.
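
The reported estimates are consistent with simple bin arithmetic, as the sketch below shows. It assumes the 44.1 kHz sampling rate used for the recordings and assumes the estimator reports the centre frequency of the nearest DFT bin; the text does not state the latter, and the function name is hypothetical.

```python
def nearest_bin_frequency(frequency_hz, sample_rate_hz=44100.0, n_points=1024):
    """Centre frequency of the DFT bin nearest to frequency_hz.

    Returns (bin_centre_hz, resolution_hz).
    """
    resolution = sample_rate_hz / n_points      # spacing between bins
    nearest_bin = round(frequency_hz / resolution)
    return nearest_bin * resolution, resolution
```

With these assumptions the bin spacing is about 43.07 Hz, the 440 Hz tone falls in bin 10 (about 430.7 Hz) and the 220 Hz harmonic falls in bin 5 (about 215.3 Hz), matching the frequencies in Figure 8.3.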
8.3 Brass Vase
The brass vase displayed in Figure 8.4 (a) was the first test object for which a complete sound model was generated. The subdivision surface in Figure 8.4 (b) represented the vase for the adaptive sampling algorithm. This coarse mesh contains 136 vertices. Forty frequency modes were estimated at each sample location, and prototypical models were produced using five recordings at each location.

The vase was secured to the ACME test station using thin double-sided tape. The microphone on the field measurement system (FMS) was used to record the samples, and was located 190 mm behind the vase (see Figure 8.5). Early experiments using the microphone mounted on the sound effector produced poor sound models due to the transient effects of clipping and echoes. These effects are diminished by recording in the far field.

At each sample location, the vase was struck normal to the surface by the sound effector.

Figure 8.4: A photo (a) of brass vase and the subdivision surface which represents it (b).

Figure 8.5: Setup for acquiring sound model of brass vase.
8.3.1 Estimation Results
Figure 8.6 is a comparison of spectrograms of synthesised sounds and recorded samples at
three positions on the vase. White noise was added to the synthesised sounds at a
signal-to-noise ratio approximating the signal-to-noise ratio of the recording. The addition of noise
creates spectrograms that are more comparable to the originals. Appendix B discusses this
technique with examples.
The frequency modes were estimated quite accurately at each location in Figure 8.6.
Audibly, the synthesised sounds at most sample locations on the vase were comparable to
the recordings.
Most often, any difference in the sounds was a lower perceived pitch. Evidence of
this is present in the spectrograms of Figure 8.6, particularly at Z = 90 mm. A narrow band
of high energy noise is visible in the recorded spectrogram from approximately 0 to 300 Hz
(Figure 8.7). This band of noise was estimated as a mode in the model at 215 Hz with a
very small damping constant (0.452). In fact, this mode’s damping constant is smaller than
any other of the modes by at least one order of magnitude.
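To see why such a small damping constant matters perceptually, it helps to relate the damping constant to decay time. The sketch below assumes each mode has an exponential envelope of the form exp(-d t), with the damping constant d in units of 1/s; that form and those units are assumptions here, since the model is defined outside this section. The comparison value of 4.52 is hypothetical, chosen only to be one order of magnitude larger than 0.452.

```python
import math


def time_to_decay(damping, drop_db):
    """Seconds for an envelope exp(-damping * t) to fall by drop_db decibels.

    Assumes an exponential envelope with damping in 1/s (an assumption,
    not a definition taken from this section).
    """
    return math.log(10.0) * drop_db / (20.0 * damping)


# Damping constant of the spurious 215 Hz mode, as reported in the text.
spurious = time_to_decay(0.452, 60.0)
# Hypothetical mode whose damping constant is ten times larger.
typical = time_to_decay(4.52, 60.0)
print(f"{spurious:.1f} s vs {typical:.1f} s")
```

Under these assumptions the spurious mode takes roughly ten times longer to decay by 60 dB than a mode with ten times the damping, so a noise band estimated as a mode can ring on long after the genuine modes have died away.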
Though background noise is concentrated between 0 and 300 Hz, a moderate level
of noise is also present in a band from 300 to 600 Hz (Figure 8.7). This band of noise
artificially reduced the estimated damping constants of frequency modes within that range.
As an example, two modes were estimated at 646.0 Hz and 473.7 Hz with damping
constants of 8.6 and 4.5 respectively. Though recorded modes in this range typically lasted
approximately 0.1 seconds, these estimated modes remain at significant amplitude (> -3
dB) for 0.44 and 0.75 seconds (Figure 8.8). This artificial sustain of low-frequency modes
also contributes to the lower perceived pitch.
The signal-to-noise ratio for these recordings was in the range of 30 to 40.
Figure 8.6: Results of brass vase experiment. Spectrograms of recorded samples and those synthesised from prototypical models are compared at three positions on the brass vase: Z = 90, 61 and 45 mm. White noise was added to the synthetic sounds to produce more comparable spectrograms. Panels: (a) recorded and (b) synthesised spectrogram (Z = 90 mm); (c) recorded and (d) synthesised spectrogram (Z = 61 mm); (e) recorded and (f) synthesised spectrogram (Z = 45 mm). Axes: time (sec) against frequency (Hz).
Figure 8.7: Detail of narrow-band low-frequency noise. A high-energy band of noise is visible between 0 and 300 Hz. Moderate levels of noise are also visible from 300 to 600 Hz. Axes: time (sec) against frequency (Hz).
Figure 8.8: Effect of noise on low-frequency modes. Though recorded low-frequency modes are audible for approximately 0.1 seconds, moderate noise levels sustain the estimated modes to approximately 0.44 and 0.75 seconds. Axes: time (sec) against frequency (Hz).
                            Number of vertices
Object Name        Coarse   Rejected   Missed   Added   Sampled
Tuning fork             1          0        0     N/A         1
Brass vase            136         56       14      93       173
Plastic speaker        98         41       14      28        85
Toy drum              31*          0        4     163      190†
* Number of vertices of the top surface.
Table 8.2: Summary of refinement results. This table summarises the number of coarse, rejected, missed, added and sampled vertices for all of the test objects. A rejected vertex is one which is out of the working envelope of the Puma arm. Missed vertices were counted when no force was sensed at a sample location (i.e., a hole).
Figure 8.9: Refinement results of brass vase experiment. The refined sample locations are plotted on the surface model. Despite geometric symmetry, refinement patterns are not symmetric as the two views (a) and (b) illustrate. Vertices are colour-coded by revision number; black, grey and white represent coarse, first and second refinements.
8.3.2 Refinement Results
Refinement of the sampling mesh by the adaptive sampling algorithm is illustrated in
Figure 8.9. Table 8.2 summarises the results of refinement for all the test objects. An interesting
result of the vase’s refinement is its unusual asymmetry. Since the vase is approximately
circularly symmetric, it was expected that the refinement pattern would also be
symmetric. As shown in Figure 8.9, however, refinement varied greatly. The image on the left (a)
shows many refined sampling locations, while the image on the right (b) is almost void of
refinements. There are two possible explanations. First, it is possible that the vase is not
acoustically symmetric. If so, this example is a strong argument for the necessity of an
adaptive sampling algorithm. Alternatively, it may be that the acoustic distance between
coarse models is very close to the threshold. If so, the variability of parameter estimation
may be sufficient to occasionally increase acoustic distances above the threshold. Given the
regular pattern of revision apparent in Figure 8.9 (a), this explanation is unlikely.
One aspect of the vase’s geometry introduced a difficulty for the system: there are
three rows of small holes around the mouth of the vase. The experiment is programmed
to identify missed sample locations if no contact is sensed. With this particular object,
though, the outer case of the solenoid often contacted the sides of a hole even though the
plunger passed through. When this occurred, the system acquired samples of nothing. Of
course, these degenerate samples introduced refinements around these holes. Although this
may result in overly dense sampling of the top rim, it also increases the likelihood that the
surfaces surrounding the holes will be sampled.
8.4 Plastic Speaker
A complete sound model was also generated for the small speaker shown in Figure 8.10 (a).
The speaker is completely plastic, with the exception of a metal grill covering the front face.
A cube mesh with 98 vertices represented the speaker for the adaptive sampling algorithm
(Figure 8.10 (b)). Although the speaker’s surface could be adequately described by fewer
vertices, interior vertices were added to seed the refinement of the adaptive sampling
algorithm. Since the sounds at the corners of the speaker are similar, refinement would be
unlikely if only corner vertices were used to represent the surface.
Ten frequency modes were estimated at each sampling location. Preliminary
experiments determined that ten modes sufficiently represent the sound of the speaker at most
locations. Five recordings at each sample location were used to create prototypical models.
The sound effector struck the speaker normal to the surface at each sample location.
The speaker was secured to the ACME test station using double-sided tape along
the bottom edges. The microphone on the FMS recorded the samples from a distance of
approximately 90 mm. Because of the low amplitude of the impact sounds, the signal would
be dominated by room noise at larger distances.
8.4.1 Estimation Results
The speaker was a problematic test object for two reasons. First, the contact sounds it
produces are quiet and decay quickly. Low amplitude is a concern since it decreases
the signal-to-noise ratio. As shown by the evaluation of the estimation algorithm
(Section 2.4.1), estimation accuracy degrades dramatically with increasing noise levels. Also,
because the sound decays quickly, the sound of the solenoid’s return is sometimes present
in the recordings. When the solenoid returns after impact, the plunger often strikes the side
of its exit hole. Normally, this chatter is quiet enough, or the microphone far enough, that
it is not recorded. Because the microphone needs to be so close to the speaker, however,
the solenoid’s sound is recordable. Choice of an acceptable recording distance is therefore
a trade-off between good signal amplitude and recording the sound of the solenoid.
The problem of microphone distance was further complicated by the hardware
surrounding the microphone on the FMS (i.e., the camera, Triclops and pan/tilt unit).
Frequently, the microphone could not be moved closer to the strike location because the Puma
arm would collide with the FMS hardware.
The second problem with using the speaker as a test object is registration. Because
surface model creation is not yet automatic, the object must be manually registered to the
stage for correspondence with the surface model. With objects that are circularly
symmetric, small imprecision is tolerable. A square object, however, must be more precisely
positioned. During the test it was noted that the speaker was not accurately positioned and
the edges of each face were not reliably struck at a normal angle. It is hoped that this
problem will be eliminated by the development of a shape acquisition module for ACME.
Despite these difficulties, good sound models were produced at many sample
locations. The spectrograms in Figure 8.11 illustrate the similarity between the recorded and
synthesised sounds. The model is clearly an accurate representation of the recorded
sample. Some models suffered the same pitch-lowering effects as discussed in Section 8.3.1.
Additionally, the sound models of the metal grill were generally not audibly similar to the
recordings, since their signal-to-noise ratios were poorer.
In comparison to the brass vase, the speaker’s sound model has a narrower
bandwidth and sharper decay. This result supports the perceptual studies of Klatzky et al. [16].
Figure 8.10: A photo (a) of the plastic speaker and the subdivision surface which represents it (b).
Figure 8.11: Results of plastic speaker experiment. A spectrogram of a sample recorded at the top of the speaker (a) is compared to a spectrogram of a sound synthesised from a ten-mode prototypical model (b). Panels: (a) recorded spectrogram; (b) synthesised spectrogram. Axes: time (sec) against frequency (Hz).
8.4.2 Refinement Results
The sound of the plastic speaker is mostly uniform, though it does vary from the edge
to the middle of each face. Since the middle of each face is unsupported, the sound is
generally lower in frequency than the edges. We expected this variation to trigger some
refinement of the sampling mesh. As is illustrated in Figure 8.12, very little refinement was
required. Future experiments could investigate lower similarity thresholds and a coarser
surface model as stimulants of refinement.
Figure 8.12: Refinement results of plastic speaker experiment. Very few refinements were made on the speaker, with most refinements occurring near the edges. The refinement pattern for the top of the speaker is shown here. Vertices are colour-coded by revision number; black, grey and white represent coarse, first and second refinements.
8.5 Toy Drum
The fourth test object, a toy drum, was selected for its diversity in sound across its surface.
The drum (Figure 8.13 (a)) is a child’s toy, made of plastic with three metal bars suspended
across a slot in the top face. Each metal bar has a different length and therefore a different
frequency.
A complete model of the drum could not be created for the experiment due to
restrictions of our subdivision surface loader. Instead, the drum is approximated by a simple
cylindrical mesh (Figure 8.13 (b)). Because the handle of the drum is not modelled by
the surface, we were unable to create a sound model for the entire drum. We instead
created a sound model of only the top face in order to show the results of refinement on an
acoustically complex object.
Five samples were recorded at each sample location, and forty modes were
estimated for each model. Similarly to the brass vase and speaker, the drum was affixed to
the ACME test station using double-sided tape on its bottom edges. The microphone on
the FMS was again used to record the samples at a distance of 130 mm, and the drum was
struck normal to its surface.
Figure 8.13: A photo (a) of the toy drum and the subdivision surface which represents it (b).
Figure 8.14: When measuring near the edge of the bars, the plunger often caused the bars to pivot on their supports. When the plunger retracted, the bars would return to their nominal positions, reducing the distance between the plunger and the bar.
8.5.1 Estimation Results
The toy drum’s construction introduced two problems which affected the quality of the
sound models. Since the metal bars are supported only along their central axis, they are
able to pivot around that axis (Figure 8.14). Unfortunately, when the sound effector
measured locations near a bar’s edge, the bar moved a few millimetres upon contact, then
returned once the sound effector was retracted to strike. This movement reduced the distance
between the plunger and the metal bars and produced uncharacteristically damped sounds.
The second difficulty arose from the spaces between the metal bars. If a sample location
lay in one of those spaces, no sound model was created. This absence prevented any
refinement between that sample location and adjoining vertices. Since these holes lay between
the bars, adaptive refinement between the bars did not always occur as expected.
Figure 8.15: Results of toy drum experiment (metal). The spectrogram of a recorded sample of the middle metal bar is shown on the left (a). The spectrogram of a sound synthesised from the prototypical model at the same location is shown on the right (b). Panels: (a) recorded spectrogram; (b) synthesised spectrogram. Axes: time (sec) against frequency (Hz).
Apart from the effects just listed, most of the sound models were successful.
Exceptionally good models were produced for the metal bars when they were struck near their
centers. For example, Figure 8.15 illustrates the fidelity of the model for the middle bar.
With the exception of the noise effects mentioned previously, the spectrograms are nearly
identical.
Results of modelling the plastic surface were acceptable, though not as successful as
the metal bars. As Figure 8.16 demonstrates, the frequency spectrum was typically correct,
but the damping parameters were often inaccurate. One additional consequence of the
construction of the drum is that the metal bars often resonated when the plastic was struck.
Though a minor effect, it may have contributed to the sustain of some modes. More likely,
the primary reason for poorer estimation is the lower amplitude response of the plastic.
Figure 8.16: Results of toy drum experiment (plastic). The spectrogram of a recorded sample of the drum’s plastic surface is shown on the left (a). The spectrogram of a sound synthesised from the prototypical model at that location is shown on the right (b). Panels: (a) recorded spectrogram; (b) synthesised spectrogram. Axes: time (sec) against frequency (Hz).
8.5.2 Refinement Results
Results of the refinement are plotted in Figure 8.17. As expected, the model was
refined substantially, especially around the interface between the metal bars and plastic
(Figure 8.17 (b)). Unfortunately, several missed sample locations occurred at gaps between
metal bars on the right side. Slight inaccuracies in the manual registration of the drum
caused some sample locations on the right side to lie between bars, but not on the left side.
Regardless, it is clear from the left side of the diagrams that the sampling mesh is denser
near material boundaries. This result is convincing evidence of the success of the adaptive
sampling algorithm.
Figure 8.17: Refinement results of toy drum. The refined sample locations are plotted on the surface model in (a). Vertices are colour-coded by revision number; black, grey and white represent coarse, first and second refinements. A simplified diagram (b) also shows sample locations that were missed (diamonds). The location of the metal bars is indicated by the vertical rectangle. Here refinement levels are represented as ’o’, ’+’ and ’x’ in ascending order of refinement level. Panels: (a) sample locations plotted on surface; (b) diagram of refinement.
Chapter 9
Conclusions
9.1 Overview
A system to automatically create sound models of everyday objects was designed and
constructed. This system uses a surface model of a test object to adaptively select sample
locations. At each of these locations a device, called a sound effector, strikes the object
to elicit an acoustic impulse response. Multiple impacts are used to create a prototypical
sound model which best represents the sound at that location. The hardware, software and
algorithms required for this system were implemented and tested.
Overall, the sound models produced of the tuning fork, brass vase, speaker and toy drum
are encouraging. These results demonstrate that excellent sound models can be constructed
under favourable noise and impact conditions. For example, the fundamental frequency of
the tuning fork was accurately estimated within the limits of the discrete Fourier transform.
Constructions such as the drum’s metal bars pose a difficult problem. Still, the
system performs reliably for “rigid” objects – a characteristic of many everyday objects.
The Achilles heel of the system is noise. Evaluation of the parameter estimation
algorithm in Chapter 2 revealed large errors in estimation for even modest levels of
background noise. Even models with relatively accurate frequency estimates contained spurious
low-frequency modes and incorrect damping constants due to noise. These effects are
perceived as a lower pitch of the synthesised sounds.
Prototypical models are one way to reduce the effect of noise. Evaluation of the
spectrogram averaging technique showed a significant reduction in mean estimation error.
The adaptive sampling algorithm presented in Chapter 7 suffers a sensitivity to
noisy models and missed sample locations. Its refinement of the vase and drum’s sampling
meshes, however, is evidence of its potential. Improvement of the models and acoustic
distance metric should improve future results.
9.2 Future Work
As with all research, this thesis has created many new opportunities for future work. This
section suggests avenues for future research and development of the system.
Ultimately, environmental noise must be removed at its source. Either a
soundproof enclosure must be constructed around ACME, or it must be located in a room without
machines.
Solenoid noise also presented a problem for “low-amplitude” materials. Since the
sound is produced by chatter between the plunger and the solenoid casing, it is presumed
that the solenoid must be replaced by a special-purpose device. An alternate approach is to
coat the solenoid plunger with rubber.
The original design of the sound effector suggested a replaceable tip. Currently,
a hard steel tip is used to produce a sufficiently impulsive impact. Other tips could be
constructed from other materials to investigate their effect.
Another useful addition to the system would be a force sensor or load cell for the tip
of the effector’s plunger. Knowing the force profile of the impact could yield more accurate
sound models. Recording the back-current through the solenoid coil may also be a solution.
The problem of microphone placement could be addressed by repositioning the
microphone on the FMS for each impact. A sophisticated motion planner would be required
to prevent collisions with the object, test station and contact measurement system.
One issue not addressed by the current design is the effect that attaching objects to the
ACME test station has on the boundary conditions of the sound model. By attaching one
side of an object to the test station, that side is prevented from vibrating freely. The resulting
sound model is therefore applicable only to synthesis of the object’s sound in this
configuration. One useful configuration is an “all-free” boundary condition where all but
the impulse location are free to vibrate. This is useful for simulations of dropped objects.
One possible solution is to hold the object in place by a minimum number of point contacts.
A set of rubber cones may be used for this purpose.
Thorough evaluation of the system, particularly the adaptive sampling algorithm,
requires a playback device. Intelligent audio morphing algorithms for synthesis may reduce
a model’s dependence on adaptive sampling. At the very least, software which morphs
between sample locations to provide a continuous audio map of objects will encourage
perceptual studies evaluating the effectiveness of the acoustic distance metric and perceptual
thresholds. This research will hopefully result in an iterative improvement of the adaptive
sampling algorithm.
Other experiments should also be conducted to investigate the effect of strike angle
relative to surface normal on the sound model. It is clear that the amplitude of the sound
will change as the strike angle approaches 0°. It is not immediately clear whether objects
require anisotropic sound models. The current system is easily programmed to perform
these experiments.
Bibliography
[1] Various Authors. Computer Music Journal Special Issues on Physical Modeling. 16(4) and 17(1), MIT Press, 1996 and 1997.
[2] Kevin Bradley. Synthesis of an acoustic guitar with a digital string model and linear prediction. Master’s thesis, Carnegie Mellon University, 1995.
[3] Antoine Chaigne and Vincent Doutaut. Numerical simulations of xylophones. Journal of Acoustical Society of America, 101(1):539–557, 1997.
[4] Perry R. Cook and Dan Trueman. A database of measured musical instrument body radiation impulse responses, and computer applications for exploring and utilizing the measured filter functions. Available online: http://www.cs.princeton.edu/prc/ism98fin.pdf, 1998.
[5] Creative Technology Ltd. Sound Blaster Live! Hardware Specifications, 2000. Available online: http://www.soundblaster.com.
[6] Tony DeRose. Subdivision surface course notes. SIGGRAPH Course Notes, 1998.
[7] Shlomo Dubnov, Naftali Tishby, and Dalia Cohen. Clustering of musical sounds using polyspectral distance measures. In Proceedings of the 1995 International Computer Music Conference, pages 460–463, 1995.
[8] Robert S. Durst and Eric P. Krotkov. Object classification from analysis of impact acoustics. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, volume 1, pages 90–95, 1995.
[9] W. W. Gaver. Everyday Listening and Auditory Icons. PhD thesis, University of California in San Diego, 1988.
[10] W. W. Gaver. Synthesizing auditory icons. In Proceedings of the ACM INTERCHI ’93, pages 228–235, 1993.
[11] Augustine H. Gray, Jr. and John D. Markel. Distance measures for speech processing. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-24(5):380–391, October 1976.
[12] D. J. Hermes. Auditory material perception. IPO Annual Progress Report 33, Technische Universiteit Eindhoven, 1998.
[13] Wesley H. Huang. A tapping micropositioning cell. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2153–2158, 2000.
[14] InterTAN Inc. Optimus ultra-miniature tie-clip microphone specifications, 1996.
[15] Douglas Keislar, Thom Blum, James Wheaton, and Erling Wold. A content-aware sound browser. In Proceedings of the 1999 International Computer Music Conference, pages 457–459, 1999.
[16] Roberta Klatzky, Dinesh K. Pai, and Eric Krotkov. Perception of material from contact sounds. Presence, (in press).
[17] Eric Krotkov. Robotic perception of material. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 88–94, 1995.
[18] Eric Krotkov, Roberta Klatzky, and Nina Zumel. Robotic perception of material: Experiments with shape-invariant acoustic measures of material type. In O. Khatib and J. K. Salisbury, editors, Experimental Robotics IV, number 223 in Lecture Notes in Control and Information Sciences, pages 204–211. Springer-Verlag, 1996.
[19] Charles Loop. Smooth subdivision surfaces based on triangles. Master’s thesis, University of Utah, 1987.
[20] Robert L. Mott. Sound Effects: Radio, TV, and Film. Butterworth Publishers, 1990.
[21] Dinesh K. Pai, Jochen Lang, John E. Lloyd, and Robert J. Woodham. Acme, a telerobotic active measurement facility. In Proceedings of the Sixth International Symposium on Experimental Robotics, 1999.
[22] Point Grey Research, Vancouver, Canada. Triclops On-line Manual. Available online: http://www.ptgrey.com.
[23] Precision MicroDynamics Inc. Precision MicroDynamics Inc. MC8-DSP-ISA Register Access Library and User’s Manual, 1.3 edition, 1998.
[24] Lawrence Rabiner and Biing-Hwang Juang. Fundamentals of Speech Recognition. PTR Prentice-Hall, Inc., 1993.
[25] Joshua L. Richmond and Dinesh K. Pai. Active measurement of contact sounds. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 2146–2152, 2000.
[26] Joshua L. Richmond and Dinesh K. Pai. Robotic measurement and modeling of contact sounds. In Proceedings of the International Conference on Auditory Display, 2000.
[27] Malcolm Slaney, Michele Covell, and Bud Lassiter. Automatic audio morphing. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1001–1004, 1996.
[28] Ken Steiglitz. A Digital Signal Processing Primer with applications to Digital Audio and Computer Music. Addison-Wesley, 1996.
[29] Mark Ulano. Moving pictures that talk – the early history of film sound. Available online: http://www.filmsound.org/ulano/index.html.
[30] K. van den Doel. Sound Synthesis for Virtual Reality and Computer Games. PhD thesis, University of British Columbia, May 1999.
[31] Kees van den Doel and Dinesh K. Pai. The sounds of physical shapes. Presence, 7(4):382–395, 1998.
[32] Richard P. Wildes and Whitman A. Richards. Recovering material properties from sound. In Whitman Richards, editor, Natural Computation. The MIT Press, 1988.
[33] Erling Wold, Thom Blum, Douglas Keislar, and James Wheaton. Content-based classification, search and retrieval of audio. IEEE Multimedia, 3(3):27–36, 1996. Also available online (www.musclefish.com).
Appendix A
Sound Effector Specifications
A.1 Mounting Bracket
The construction of the sound effector’s mounting bracket deserves a brief description here
for future reference. Constructed from a single piece of Á|ÂÃÄ” x ÃÅ” aluminum, it was
formed on a bending-bar following the schedule in Figure A.1. To allow for the finite
bending radius of the material, an additional ÃÄ” (ÃÅ” x 2) was added to the length of the
material. Following the bending specifications in Figure A.1, the specified spacing between
the two ends (i.e., 2.5”) was maintained. Unfortunately, two artifacts of the bending process
are present in the bracket: fatigue marks and off-centre alignment. The fatigue marks were
produced on the exterior radius of each bend. These occurred because the metal was bent
beyond its permissible stress limit. This might be avoided in future constructions if the
metal was first heated. The alignment of the solenoid is also slightly off-centre following
the bending. This is a flaw of the alignment of the material in the bending vise.
Figure A.1: Bending schedule for sound effector mounting bracket. (Drawing labels: bend lines; nominal spacing; dimensions 1/2", 3/4", 3/4", 1-1/4", 5/8", 2-1/2", 2-1/4", 1/2", 0.25", 0.5625".)
A.2 Control circuit
The circuit interfacing the sound effector to the Precision MicroDynamics MC8 board is
diagrammed in Figure A.2. It is a simple switching circuit, with a 74F245 octal buffer
to isolate the MC8 from the relay. This circuit may be duplicated to control other digital
output devices such as lights.
Figure A.2: Schematic for solenoid control circuit. (Schematic labels: +5 VDC; Pin J4-37; +12 VDC; To Solenoid; resistors 100 Ω and 1 kΩ; 1/8 Q1; Q1: 74F245 Octal Buffer; T1: P2N2222 NPN.)
Appendix B
Effect of White Noise on
Spectrograms
The two spectrograms in Figure B.1 are presented to compare the effect of white noise on
the appearance of a spectrogram. Without background noise (Figure B.1 (a)), the frequency
modes appear as wide bands, and appear to be sustained longer. It therefore becomes
difficult to compare synthesised spectrograms to measured ones. By adding low amplitude
(e.g., SNR = 100) white noise to the synthesised sound, the mapping of colours to intensity
values is scaled more comparably to the original recording. For this reason, all
spectrograms of synthesised sounds in Chapter 8 include white noise added at a signal-to-noise
ratio approximately the same as the measured samples. The white noise is added to the
synthesised signal prior to computing its spectrogram.
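The noise-addition step described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it assumes the SNR is a ratio of mean signal power to mean noise power (the text does not state the units), it uses Gaussian white noise, and the single damped sinusoid standing in for a synthesised sound, along with all function names and parameter values, is hypothetical.

```python
import math
import random


def damped_mode(freq_hz, damping, sample_rate_hz, n_samples):
    """One exponentially damped sinusoid (assumed form of a single mode)."""
    return [
        math.exp(-damping * n / sample_rate_hz)
        * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def add_white_noise(signal, snr, rng):
    """Add Gaussian white noise so that signal power / noise power is about snr.

    snr is treated as a power ratio, which is an assumption.
    """
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_sigma = math.sqrt(signal_power / snr)
    return [x + rng.gauss(0.0, noise_sigma) for x in signal]


rng = random.Random(0)  # fixed seed so the example is repeatable
clean = damped_mode(440.0, 5.0, 44100.0, 4410)
noisy = add_white_noise(clean, 100.0, rng)
```

The noisy signal, rather than the clean one, would then be passed to whatever routine computes the spectrogram, so that the colour mapping is scaled against a noise floor similar to that of a recording.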
Figure B.1: Effect of white noise on spectrograms. The spectrogram in (a) is produced by a sound synthesised from a forty-mode sound model. The spectrogram in (b) is the same spectrogram, but with white noise added at an SNR of 100. White noise scales the mapping of colour to intensity more comparably to spectrograms of recorded sounds. Panels: (a) pure spectrogram; (b) spectrogram with additive white noise. Axes: time (sec) against frequency (Hz).
Appendix C
Details of Unique
Frequency-Mapping Algorithm
A simple algorithm was designed to match frequency modes between two sound models
such that a one-to-one mapping exists. That is, given two models (A and B), each with $N_f$
frequencies, a mapping function $M(i)$ is produced such that $f_{A,i}$ maps to $f_{B,j}$ when
$j = M(i)$. The algorithm is summarised in Chapter 7 (Figure 7.4) and repeated in Figure C.1
for convenience. This appendix describes the implementation of the algorithm in greater
detail.
Five arrays are used to track the matching of modes. The first is a matrix of
frequency differences between all modes in the two models (Figure C.2). This difference
matrix is used to create the order array. This array orders the indices of modes in model
B from nearest to farthest for each mode in model A (Figure C.3). The third array is a list
of indices indicating the next mode to check (Figure C.4). A fourth array (Figure C.5) is
used to track which modes in model A are currently matched. The fifth array is the result of
the algorithm: an array relating the mapping of mode indices in model A to mode indices
in model B (Figure C.6). The index of the mode in model A matched to mode j in model
B is stored as mapping[j]. The elements of the mapping array are initialised to ’-1’,
indicating unmatched modes.
• For each frequency mode $f_{A,i}$ in model A . . .
  1. Find the frequency mode $f_{B,j}$ nearest to $f_{A,i}$.
  2. If $f_{B,j}$ is unmatched, match it to $f_{A,i}$.
  3. Or, if $f_{B,j}$ is matched to another mode which is farther than $f_{A,i}$, match $f_{B,j}$
     to $f_{A,i}$ instead. Mark $f_{A,i}$ as matched, and the mode it replaces as unmatched.
  4. Otherwise, loop to step 1, picking the next nearest frequency mode.
• Repeat until all frequency modes in model A are matched.
Figure C.1: Algorithm to find unique frequency mapping between two models.
2 3B
1 4
3
4
1
2 3 4
8 1
1
1
A
6
6
1
4
A3 and B4 is 6.
0The difference
between modes5
6
1
7
2
FigureC.2: difference matrix. Dif-ferences in frequencies between allmodes in Model A and all modes inModel B are storedin this array. Forexample,difference[3][4] is thedifferencebetweenthe ßA×PØ à and ß Ú Ø á .
larg
er d
iffer
encefa
r
3
4
32A 4
4
3
mode A4.
third nearest to 3
4 4
3
1
Mode B1 is the1
11
1
22
2
2
FigureC.3: order array. Eachcolumnof theorder arraycontainsthe indicesof modesin ModelB from nearestto far-thestof a modein Model A. For exam-ple, order[3][4] is the index of theâ ã ä
nearestmodein ModelB to ßA×Ø á .
Draft of 3:03am,Friday, August18,2000 85
1
A
1
13
4
1
2
3
iteration.
against the
Check A1
mode next
3rd closest
Figure C.4: indexarray. Each elementcontains an indexinto the order ar-ray at which to selectthe next mode fromModel B for compar-ison. The elementsare non-decreasing,thus eliminating cycleswhere two modes arerepeatedly comparedagainstanotherpair.
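[Figure C.5 image: example matched array for the four modes of Model A, holding three false entries and one true entry.]
Figure C.5: matched array. Indicates which modes of Model A have been matched to a mode in Model B. The algorithm terminates when all elements of this array are true.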
[Figure C.6 image: example mapping array for Model B, with unmatched entries set to -1, annotated "Mode B1 is mapped to mode A2."]
Figure C.6: mapping array. When the algorithm terminates, this array maps modes in Model A to modes in Model B. For example, mapping[1] contains the index of the mode in Model A which maps to $f_{B,1}$.
The pseudo-code in Figure C.7 uses these five arrays to implement the algorithm
of Figure C.1. Each unmatched mode i in model A is examined in sequence. The j-th
mode (j = index[i]) listed in column i of the order matrix is checked against the mapping
array. If no mapping exists (i.e., mapping[b] = -1, where b = order[i][j]),
the mapping is set to mode i, mode i is marked as matched (i.e., matched[i] = true)
and the next mode in model A is examined (i.e., i = i + 1). If a mapping exists, the
difference between the currently mapped modes is compared to the difference between
modes i and b (using the difference matrix). If the current mapping's difference is less
than that of the proposed new mapping, index[i] is incremented, and the next nearest mode j
is examined. Otherwise, mapping[b] is set to i, and the previously mapped mode is
marked as unmatched.
This process continues until all modes in model A are matched. If the last mode i
in model A is examined before the mapping is complete, i is reset to the first unmatched
mode in model A and the loop continues.
Since models A and B contain the same number of modes, each of which is uniquely
mapped to one other mode, the algorithm is guaranteed to terminate. The index array
prevents two pairs of modes from being compared twice, thereby eliminating potential
cycles.
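The termination condition implies that the finished mapping pairs every mode of model A with exactly one mode of model B. A small check of that post-condition could look like the sketch below; the helper name `is_unique_mapping` is a hypothetical addition for illustration and does not appear in the thesis.

```python
def is_unique_mapping(mapping, n_modes_a):
    """True if every Model A index 0..n_modes_a-1 appears exactly once."""
    return sorted(mapping) == list(range(n_modes_a))


# A complete one-to-one mapping passes; an unmatched (-1) or
# duplicated entry does not.
complete = is_unique_mapping([1, 0, 2], 3)
incomplete = is_unique_mapping([1, -1, 2], 3)
duplicated = is_unique_mapping([0, 0, 2], 3)
```

Such a check is useful as an assertion after the algorithm returns, since a leftover -1 would indicate that the loop ended before all modes were matched.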
int[] FindUniqueMapping(double[] modesOfA, double[] modesOfB)
    // difference[a][b] is the difference between frequency
    // modes a (from Model A) and b (from Model B)
    difference = calculateDistance(modesOfA, modesOfB);

    // order[a][i] is the index of the ith nearest mode in Model B
    // to mode a in Model A
    order = sort(difference);

    // index[a] is the index of the next mode in the order
    // array for mode a
    int[] index;
    // Initialise all elements of index to 0
    index is all 0;

    // matched[a] indicates whether mode a in Model A has
    // been matched yet
    boolean[] matched;
    // Initialise all elements of matched to false
    matched is all false;

    // mapping[b] is the index of the mode in Model A that best
    // maps to mode b of Model B.
    int[] mapping;
    // Initialise all elements of mapping to -1
    mapping is all -1;

    // Iterate until all modes in Model A are matched
    while (matched is not all true)
        // For each unmatched mode in Model A...
        for (i = 0 to modesOfA.length)
            if (matched[i] == false)
                // Find nearest unmapped mode in Model B
                for (j = index[i] to order[i].length)
                    // b is the index of the next nearest mode in Model B
                    b = order[i][j];
                    // a is the index of the mode in Model A to which b
                    // is mapped (if any)
                    a = mapping[b];
                    // if mode b is unmapped, or if it is mapped
                    // to a mode in Model A which is farther, map it to
                    // mode i
                    if ((a == -1) OR (difference[a][b] > difference[i][b]))
                        mapping[b] = i;
                        if (a > -1) then matched[a] = false;
                        matched[i] = true;
                        index[i] = j + 1;
                        break j loop;
                // end j loop
        // end i loop
    // end while loop
    return mapping;
Figure C.7: FindUniqueMapping pseudo-code
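The pseudo-code can be rendered as runnable Python roughly as follows. This is a sketch under stated assumptions rather than the thesis implementation: the snake_case names, the length check, and the use of an absolute frequency difference for calculateDistance are all assumptions, and indices are 0-based.

```python
def find_unique_mapping(modes_a, modes_b):
    """Return mapping where mapping[b] is the Model A index matched to mode b."""
    if len(modes_a) != len(modes_b):
        # Assumed guard: the text requires equal numbers of modes.
        raise ValueError("both models must contain the same number of modes")
    n = len(modes_a)
    # difference[a][b]: assumed to be the absolute frequency difference.
    difference = [[abs(fa - fb) for fb in modes_b] for fa in modes_a]
    # order[a][k]: index of the k-th nearest Model B mode to mode a.
    order = [sorted(range(n), key=row.__getitem__) for row in difference]
    index = [0] * n        # next position in order[a] to try for mode a
    matched = [False] * n  # whether mode a of Model A is currently matched
    mapping = [-1] * n     # mapping[b] = Model A index, or -1 if unmapped

    while not all(matched):
        for i in range(n):
            if matched[i]:
                continue
            for j in range(index[i], n):
                b = order[i][j]
                a = mapping[b]
                # Take b if it is free, or if its current partner is farther.
                if a == -1 or difference[a][b] > difference[i][b]:
                    mapping[b] = i
                    if a != -1:
                        matched[a] = False
                    matched[i] = True
                    index[i] = j + 1
                    break
    return mapping


# Hypothetical frequencies (Hz): A0 and A1 both lie nearest to B0.
print(find_unique_mapping([100.0, 104.0, 300.0], [103.0, 200.0, 310.0]))
# prints [1, 0, 2]
```

In the usage example, mode A1 displaces A0 from B0 because it is closer, and A0 then falls back to its next nearest candidate, B1.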