Interactive Learning of the Acoustic Properties of Objects by a Robot
Jivko Sinapov, Mark Wiemer, Alexander Stoytchev
{jsinapov|banff|alexs}@iastate.edu
Iowa State University
Motivation (2)
Why should a robot use acoustic information?
• Human environments are cluttered with objects that generate sounds
• Help robot perceive events and objects outside of field of view
• Help robot perceive material properties of objects
Related Work
Krotkov et al. (1996) and Klatzky et al. (2000):
• Perception of material using contact sounds
• Learned sound models for tapping aluminum, brass, glass, wood, and plastic (one object per material)
Richmond and Pai (2000):
• Robotic platform for measuring contact sounds between the robot's end effector and object surfaces
• Models the contact sounds from different materials using spectrogram averaging
[Richmond and Pai, 2000]
Related Work (2)
Torres-Jara, Natale and Fitzpatrick (2005):
• Robot taps objects and records spectrogram of sound
• Recognize objects using spectrogram matching
• Recognized 4 test objects used during training
[Figures: tapping objects; spectrogram of tapping]
Our Study
• Demonstrate object recognition using acoustic features from interaction
• 18 different objects
• 3 different behaviors: push, grasp, drop
• Evaluate different machine learning algorithms
Sound Feature Representation
Step 1: Segment sound wave during interaction
Step 2: Compute Discrete Fourier transform (DFT) of sound wave
Step 3: Compute 2-D histogram of DFT matrix using block averaging: 10 temporal bins × 5 frequency bins
[Figure axes: Time, Frequency]
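The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the sound has already been segmented (Step 1), treats the "DFT matrix" as a short-time DFT over overlapping frames, and uses a hypothetical frame length and hop size. The function names are made up for this example.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-redundant DFT bins of one real-valued frame."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_len=32, hop=16):
    """Step 2 (assumed short-time DFT): rows are time frames, columns are frequency bins."""
    return [dft_magnitudes(samples[s:s + frame_len])
            for s in range(0, len(samples) - frame_len + 1, hop)]


def block_average(matrix, n_time=10, n_freq=5):
    """Step 3: reduce a (frames x bins) matrix to n_time x n_freq block means.

    Requires at least n_time rows and n_freq columns so no block is empty.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = []
    for i in range(n_time):
        r0, r1 = i * rows // n_time, (i + 1) * rows // n_time
        row = []
        for j in range(n_freq):
            c0, c1 = j * cols // n_freq, (j + 1) * cols // n_freq
            cells = [matrix[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


# Synthetic stand-in for a segmented sound: a pure tone, 4 cycles per 32 samples.
samples = [math.sin(2 * math.pi * 4 * t / 32) for t in range(512)]
features = block_average(spectrogram(samples))  # 10 temporal x 5 frequency bins
```

Flattening `features` row by row gives a 50-dimensional vector that a classifier can consume; for the synthetic tone, the energy concentrates in the second frequency bin of every temporal bin.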
Object Recognition using Acoustic Properties of Objects
Problem: given the robot's behavior and the detected sound features from interaction, predict the object.
Example:
Behavior: grasp
Sound Features: [figure]
Object Class: [figure]
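Problem Formulation
• Let $\mathcal{B}$ = {grasp, push, drop} be the set of exploratory behaviors
• Let $\mathcal{O}$ be the set of objects
• Let $(b_i, s_i, o_i)$ be a data point such that $b_i \in \mathcal{B}$, $o_i \in \mathcal{O}$, and $s_i$ is the sound feature vector detected during the interaction
• For each behavior $b \in \mathcal{B}$, learn a model $M_b$ that can estimate $\Pr(o \mid s)$ for each object $o \in \mathcal{O}$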
Learning Algorithms
k-NN
• Simple instance-based algorithm
• Uses Euclidean distance function
Support Vector Machine (SVM)
• Discriminative approach, uses kernel trick
Bayesian Network
• Probabilistic graphical model
• Sound features are discretized into bins
Learning Algorithms: k-NN, SVM, and Bayesian Network
k-NN: memory-based learning algorithm
Example with k = 3 and test point "?":
• 2 red neighbors
• 1 blue neighbor
Therefore, Pr(red) = 2/3 ≈ 0.67 and Pr(blue) = 1/3 ≈ 0.33
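The k-NN example can be reproduced with a short sketch. It assumes that class probabilities are simply the fraction of the k nearest neighbors (by Euclidean distance) carrying each label; the training points and the function name `knn_probabilities` are hypothetical.

```python
import math
from collections import Counter


def knn_probabilities(train, query, k=3):
    """train: list of (feature_vector, label) pairs.

    Returns {label: fraction of the k nearest neighbors with that label}.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: count / len(nearest) for label, count in counts.items()}


# Made-up 2-D points: the three closest to the query are two red and one blue.
train = [((1.0, 1.0), "red"), ((1.2, 0.8), "red"), ((3.0, 3.0), "blue"),
         ((0.0, 3.5), "blue"), ((4.0, 0.5), "red")]
probs = knn_probabilities(train, query=(1.5, 1.5), k=3)
```

Here `probs["red"]` is 2/3 and `probs["blue"]` is 1/3, matching the slide.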
Support Vector Machine: discriminative learning algorithm
1. Finds maximum margin hyperplane that separates two classes
2. Uses Kernel trick to map data points into a feature space in which such a hyperplane exists
[http://www.imtech.res.in/raghava/rbpred/svm.jpg]
Bayesian Network: a probabilistic graphical model
[Figure: example Bayesian network with nodes A, B, C, D, E]
1. Full power of statistical modeling and inference
2. Learning: learns both the structure of the network and the parameters (conditional probability tables)
3. Numerical features are discretized into bins
Using Multiple Behaviors
• Given trained models Mgrasp, Mpush, Mdrop
• Given novel sounds, one from each behavior, performed on the same object
• Assign prediction to the object class that maximizes: [equation]
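One way to combine the per-behavior models is sketched below. The combination rule is an assumption, since the maximized expression is not preserved in this transcript: the sketch sums the per-behavior probability estimates for each object and picks the largest total (multiplying the estimates would be another plausible choice). The object names and probabilities are invented for illustration.

```python
def combine_predictions(per_behavior_probs):
    """per_behavior_probs: {behavior: {object: probability}}.

    Assumed rule: score each object by the sum of its per-behavior
    probabilities, then return (best object, all scores).
    """
    scores = {}
    for probs in per_behavior_probs.values():
        for obj, p in probs.items():
            scores[obj] = scores.get(obj, 0.0) + p
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical outputs of Mgrasp, Mpush, and Mdrop for one unknown object.
estimates = {
    "grasp": {"cup": 0.5, "ball": 0.4, "box": 0.1},
    "push":  {"cup": 0.2, "ball": 0.7, "box": 0.1},
    "drop":  {"cup": 0.4, "ball": 0.3, "box": 0.3},
}
best, scores = combine_predictions(estimates)
```

In this made-up case the grasp and drop models alone would each pick "cup", while the combined score favors "ball", which illustrates how evidence from several behaviors can override a single model's mistake.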
Evaluation
• 6 trials recorded with each of the 18 objects with each of the 3 behaviors
• Leave-one-out cross-validation
• Compared performance of learning algorithms as well as behaviors
• Performance Measure: [equation]
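Leave-one-out cross-validation can be sketched as follows. Two assumptions are made here: one recording is held out at a time, and the performance measure is the fraction of held-out recordings classified correctly. A 1-nearest-neighbor classifier stands in for the learning algorithm, and the data points are invented.

```python
import math


def nearest_neighbor_label(train, query):
    """Stand-in classifier: label of the closest training point."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]


def leave_one_out_accuracy(data, classify):
    """Hold out each point once, train on the rest, and count correct predictions."""
    correct = 0
    for i, (features, label) in enumerate(data):
        held_in = data[:i] + data[i + 1:]
        if classify(held_in, features) == label:
            correct += 1
    return correct / len(data)


# Made-up data: the last "b" point sits inside the "a" cluster and is misclassified.
data = [((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"),
        ((2.0, 2.1), "b"), ((2.2, 1.9), "b"), ((0.4, 0.4), "b")]
accuracy = leave_one_out_accuracy(data, nearest_neighbor_label)
```

Five of the six held-out points are classified correctly, so `accuracy` is 5/6.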
4 - 2 - - - - - - -
- 5 - - - - - - - 1
- - 5 - - - - - - 1
- - - 0 6 - - - - -
- - - 3 3 - - - - -
- - - - - 4 1 - - 1
- - - - - - 6 - - -
- - - - - - - 6 - -
- - - - - - - 1 5 -
- - - - - 1 - - - 5
Confusion Matrix for model Mpush using Bayesian Network
Predicted →
Perfect classification and no false positives for:
Confusion Matrix for model Mcombined using Bayesian Network
6 - - -
1 5 - -
- - 5 1
- - 1 5
Conclusion: The errors made by models Mgrasp, Mpush and Mdrop are uncorrelated.
Predicted →
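A confusion matrix can be summarized numerically. The sketch below takes the four-object matrix shown for Mcombined, reads each "-" as zero, and assumes that rows are actual objects and columns are predicted objects. The resulting numbers describe only the rows displayed here, not the full 18-object experiment.

```python
# Four-object confusion matrix for Mcombined as displayed ("-" read as 0).
combined = [
    [6, 0, 0, 0],
    [1, 5, 0, 0],
    [0, 0, 5, 1],
    [0, 0, 1, 5],
]


def accuracy(matrix):
    """Fraction of trials on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_object_recall(matrix):
    """For each actual object (row), the fraction of its trials predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


overall = accuracy(combined)          # 21 correct out of 24 trials
recalls = per_object_recall(combined)
```

For the displayed rows, `overall` is 0.875, and only the first object is recognized in all six of its trials.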
Learning rate of algorithms
Compare performance of the model Mgrasp as a function of dataset size for:
• k-NN
• Support Vector Machine
• Bayesian Network
Summary and Conclusions
• Accurate acoustic-based object recognition with 18 objects and 3 behaviors
• Using multiple behaviors improves recognition regardless of learning algorithm
• Bayesian network performed best with the given feature representation
• Grasping and pushing interactions produce sound features that are more informative of the object than dropping
Future Work
Scaling up:
• Increase number of objects
• Vary object and robot pose
• Autonomous interaction
Use unsupervised learning to form object sound categories
More powerful feature representations:
• Temporal features (e.g., periodicity) of sounds
Use models to detect events in the world performed by others (humans or other robots)