

Embodied Music Meditation: A Real-time Interactive Audio-Visual System for Buddhist Mudras Exploration

Mark Rau, Yun Zhang, Yijun Zhou

CS 229 Project, Stanford University

Experiment and Results (cont’)

Overlapping gesture classification

Mudra   1     2     3     4     5     6     7     mean
KNN     0.43  0.56  0.63  0.50  0.58  0.77  0.43  0.57
SVM     0.86  0.55  0.50  0.73  0.32  0.31  0.32  0.52
FANN    0.25  0.50  0.25  0.60  0.58  0.50  0.50  0.43

Table 2: comparison of different prediction models on different mudras. Results are on the test set.
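As a quick cross-check of Table 2, the short sketch below recomputes an unweighted mean per model and finds the best-scoring model for each mudra. The variable names are illustrative, and the numbers are copied from the table rather than recomputed from data.

```python
# Per-mudra test results copied from Table 2 (mudras 1 to 7).
table_2 = {
    "KNN": [0.43, 0.56, 0.63, 0.50, 0.58, 0.77, 0.43],
    "SVM": [0.86, 0.55, 0.50, 0.73, 0.32, 0.31, 0.32],
    "FANN": [0.25, 0.50, 0.25, 0.60, 0.58, 0.50, 0.50],
}

# Unweighted mean over the seven mudras for each model.
unweighted_mean = {
    model: sum(scores) / len(scores) for model, scores in table_2.items()
}

# Best-scoring model for each mudra (index 0 is mudra 1).
# On a tie, the model listed first in table_2 is kept.
best_per_mudra = [
    max(table_2, key=lambda model: table_2[model][i]) for i in range(7)
]
```

The unweighted means (about 0.56, 0.51 and 0.45) differ slightly from the table's "mean" column (0.57, 0.52, 0.43). One possible explanation, which is an assumption here, is that the poster reports overall test accuracy, so mudras with more test examples weigh more.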

Model Selection

Hands out of range
• K-nearest Neighbors Algorithm (KNN)
  o k = 20
• Support Vector Machine (SVM)
• Binary Decision Tree (BDT)
• K-means
  o Cluster outside trajectory

Overlapping gesture classification
• KNN
  o k = 8
• SVM
  o Reduced to 3 features to relieve overfitting.
• Fast Artificial Neural Network (FANN)
  o 8 inputs, 7 outputs, 3 layers and 64 hidden neurons.
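The KNN classifier listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation. The function name `knn_predict`, the Euclidean distance metric and the toy data are all assumptions; the poster does not state which metric or library was used.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=8):
    """Return the majority label among the k training points nearest to query.

    Assumes Euclidean distance; the poster does not state the metric.
    """
    # Pair each training point's distance to the query with its label.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    ]
    # Keep the k closest points and take a majority vote over their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated groups of 8-dimensional points,
# standing in for the 8 averaged features used for mudra classification.
train_x = [[0.0] * 8, [0.1] * 8, [0.2] * 8, [1.0] * 8, [1.1] * 8, [1.2] * 8]
train_y = [1, 1, 1, 2, 2, 2]
prediction = knn_predict(train_x, train_y, [0.05] * 8, k=3)
```

Because prediction only requires distance computations against the stored training set, a dataset of a few hundred examples keeps each query cheap, which is consistent with the millisecond-scale response times reported on the poster.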

Motivation

• Collaboration: with J. Cecilia Wu, a PhD candidate at UCSB, on her ongoing project “Embodied Music Meditation”.

• Goal: transform Buddhist Mudras performed with the hands into a real-time audio-visual performance.

• Challenge: two internal problems of the Leap Motion sensor used for hand tracking.

Problem Definition

Problem 1: Hands out of range

• The Leap Motion does not record any data when a hand is out of range. We addressed this by predicting the trajectory outside the range in real time.

• Dataset: 292 examples, split into a training set (80%) and a test set (20%).

Problem 2: Overlapping gestures classification

• The Leap Motion tracks overlapping Buddhist Mudras inconsistently. We used the motion recorded before the hands overlap to classify gestures in real time.

• Dataset: 435 examples (~60 examples for each mudra), split into a training set (70%) and a test set (30%).
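The two train/test splits described above could be produced along the following lines. This is a sketch under stated assumptions: the poster does not say how examples were assigned to each set, so the shuffling, the fixed seed, the rounding rule and the helper name `split_dataset` are all hypothetical.

```python
import random


def split_dataset(examples, train_fraction, seed=0):
    """Shuffle examples and split them into (train, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Rounding the training size down is an assumption, not from the poster.
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder indices stand in for the recorded examples.
train_1, test_1 = split_dataset(range(292), 0.8)  # hands out of range
train_2, test_2 = split_dataset(range(435), 0.7)  # overlapping gestures
sizes = (len(train_1), len(test_1), len(train_2), len(test_2))
```

Under these assumptions the test sets hold roughly 59 and 131 examples, so each per-mudra accuracy in the second task rests on fewer than 20 test examples on average.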

Figure 1: examples of input gestures and output labels (mudra 1 to mudra 7)

Feature Extraction

Hands out of range
• Features (285 = 19*15): hand position, hand velocity, palm normal, finger altitude, finger pan over 15 frames.

Overlapping gestures classification
• Features (8): average of palm normal and finger altitude over 10 frames (1/6 sec).
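The two feature layouts above can be illustrated with a short sketch. The helper names `flatten_window` and `average_window` are hypothetical, and the frames are filled with placeholder numbers. The poster does not say how the 19 per-frame values or the 8 averaged values are ordered, so no particular ordering is assumed.

```python
def flatten_window(frames):
    """Concatenate per-frame feature lists into one vector.

    For the hands-out-of-range task, 15 frames of 19 values each give
    19 * 15 = 285 features.
    """
    return [value for frame in frames for value in frame]


def average_window(frames):
    """Average each feature across frames (column-wise mean).

    For the overlapping-gesture task, 10 frames of 8 values each are
    reduced to 8 averaged features.
    """
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


# Placeholder frames filled with constants; real values would come from
# the Leap Motion tracking data.
trajectory_frames = [[0.0] * 19 for _ in range(15)]
gesture_frames = [[float(i)] * 8 for i in range(10)]

trajectory_features = flatten_window(trajectory_frames)  # length 285
gesture_features = average_window(gesture_frames)        # length 8
```

Note that 10 frames spanning 1/6 sec corresponds to 60 frames per second, so the 15-frame window used for trajectory prediction covers a quarter of a second at that rate.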

Figure 2: sensor system setup

Figure 3: selection of the number of clusters and the number of nearest neighbors

Experiment and Results

Hands out of range

Model   Train Accuracy   Test Accuracy   Predict Time
SVM     0.49 ± 0.09      0.21 ± 0.08     0.038 s
BDT     0.80 ± 0.03      0.20 ± 0.03     0.0043 s
KNN     0.36 ± 0.02      0.30 ± 0.04     0.0069 s

Table 1: comparison of different prediction models

Overlapping gesture classification: see Table 2.

Analysis

Hands out of range
• The KNN model with k = 20 gave the best test accuracy, 30%. This is still low, but it is three times the chance level (10%).
• The response time of the KNN model is 6.9 ms, which is fast enough for real-time performance.

Overlapping gesture classification
• Confusion matrix (figure)
• The KNN model with k = 8 has the highest test accuracy, 57%, about four times the chance level (14%).
• The response time of the KNN model is 3.9 ms, which is fast enough for real-time performance.
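The chance baselines and the real-time claim in the analysis can be checked with simple arithmetic. The sketch below rests on two assumptions that are not stated on the poster: that the 10% chance level corresponds to 10 equally likely trajectory clusters, and that the sensor delivers 60 frames per second (inferred from "10 frames (1/6 sec)").

```python
def chance_accuracy(n_classes):
    """Accuracy of uniform random guessing over n_classes balanced classes."""
    return 1.0 / n_classes


# 7 mudras are given in the poster; 10 trajectory clusters is an assumption
# inferred from the reported 10% chance level.
mudra_chance = chance_accuracy(7)
trajectory_chance = chance_accuracy(10)

# Improvement over chance for the best (KNN) models.
mudra_lift = 0.57 / mudra_chance
trajectory_lift = 0.30 / trajectory_chance

# Frame period at 60 frames per second (10 frames in 1/6 s), in milliseconds.
frame_period_ms = 1000.0 / 60.0
fits_in_frame = {
    "hands out of range (6.9 ms)": 6.9 < frame_period_ms,
    "overlapping gestures (3.9 ms)": 3.9 < frame_period_ms,
}
```

Under these assumptions both KNN response times fit within a single frame period of about 16.7 ms, which supports the poster's statement that the models are fast enough for real-time use.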