Open Senses · Bob Igo · CPOSC 2013 · 2020. 1. 4.
Robots Need Senses
● Vision
  – To detect
● Hearing
  – To hear and locate commands
● Touch
  – To manipulate
● Speech(*)
  – To argue
But I Don't Have a Robot!
● Many robots use commodity hardware.
Vision
● Uses
  – License plate recognition
  – Logo recognition
  – Motion detection
Vision: Face Detection
● Uses
  – Keep screen awake.
  – Lock screen.
  – Pause video when not watching.
Vision: Face Detection
● Project: OpenCV
  – Computer vision suite
  – Tons of features
  – Linux, Android, OSX, iOS, Windows
● Demo: ./facedetect.py
  – Angle and facial expression critical
    ● Tied to training data
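A minimal sketch of how face detection could drive the "pause video when not watching" use case. It assumes the `opencv-python` package and its bundled `haarcascade_frontalface_default.xml` cascade, and it is not the contents of `./facedetect.py`. `PresenceTracker`, `detect_faces`, `watch_camera`, and the 15-frame limit are hypothetical names and values picked for illustration.

```python
class PresenceTracker:
    """Report absence only after several consecutive frames without a face."""

    def __init__(self, max_missed_frames=15):
        self.max_missed_frames = max_missed_frames
        self.missed = 0

    def update(self, face_count):
        """Feed one frame's face count; return True while a viewer is assumed present."""
        if face_count > 0:
            self.missed = 0
        else:
            self.missed += 1
        return self.missed < self.max_missed_frames


def detect_faces(frame, cascade):
    """Return a list of (x, y, w, h) face rectangles found in a BGR frame."""
    import cv2  # imported lazily so the tracker above works without OpenCV
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return list(cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5))


def watch_camera():
    """Print the presence state for each frame from the first attached webcam."""
    import cv2
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    camera = cv2.VideoCapture(0)
    tracker = PresenceTracker()
    try:
        while True:
            ok, frame = camera.read()
            if not ok:
                break
            present = tracker.update(len(detect_faces(frame, cascade)))
            print("viewer present" if present else "no viewer: pause video")
    finally:
        camera.release()
```

Calling `watch_camera()` starts the loop. Requiring several missed frames before reporting absence is one way to cope with the angle and expression sensitivity noted on the slide, since a frontal-face cascade can lose a face for a frame or two when the viewer turns.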
Vision: Face Detection
● Uses
  – Find human weak points
    ● Neck is positioned below the face area.
    ● Eye location often provided.
Vision: Face Recognition
● Uses
  – Tagging/sorting of photos
  – Custom doorbell project
    ● e.g. "Skippy is here." instead of "ding-dong"
● Requires training
Vision: Face Recognition
● Uses
  – Identify resistance leaders for target prioritization.
  – Test disguise effectiveness.
Hearing: Localization
● Trivial to detect sound
  – Nontrivial to figure out its source.
● Uses
  – Determine room/zone occupancy
  – Target PTZ camera
● Projects
  – ManyEars
  – HARK
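To illustrate why locating a source takes more than detecting sound, here is a textbook two-microphone time-difference-of-arrival estimate. It is a sketch under simplifying assumptions (one source, far field, no echo) and is not the method ManyEars or HARK use. The function names, the brute-force correlation, and the 343 m/s speed of sound are illustrative choices.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed)


def best_lag(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the sound reached the right microphone later.
    """
    def correlation(lag):
        total = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += sample * right[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=correlation)


def bearing_degrees(lag, sample_rate, mic_spacing):
    """Convert a lag into an angle from straight ahead (positive = toward the left mic)."""
    delay = lag / sample_rate
    # Clamp so rounding noise cannot push asin() outside its domain.
    ratio = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing))
    return math.degrees(math.asin(ratio))
```

With microphones 0.2 m apart sampled at 16 kHz, a 3-sample lag corresponds to a bearing of a little under 19 degrees. Two microphones cannot tell front from back, which is one reason practical systems use larger arrays.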
Hearing: Localization
● Uses
  – Locate living humans
Localization: ManyEars
● Linux, OSX, Windows
● Specialized hardware
  – OpenHardware
  – 8 microphone inputs
  – Realtime constraints
  – CDN $1000 pre-made
  – CDN $670 DIY
8SoundsUSB
Localization: ManyEars
Localization: HARK
● Open Source
  – Only official support for Ubuntu
  – Based on ManyEars
● Localization + Separation + Recognition
● Specialized hardware
  – Not open
MicroCone, USD $360
Localization: HARK
● Each sound source can be localized.
● Simultaneous audio can be processed into separate audio channels.
● Speech recognition can be done on each channel.
Hearing: Speech Recognition
● Uses
  – Front-end to automation suite
  – Occupancy detection
● Project
  – Julius
Recognition: Julius
● Linux, Windows
● Continuous recognition
● Great for domain-constrained inputs.
● You need an acoustic model.
Recognition: Julius
● Things to change
  – A dictionary
    ● Words and the phonemes that make them.
      – e.g. [CALL] k ao l
  – A grammar
    ● What are the valid sentences in the domain?
      – e.g. SENT: CALL_V F_NAME_KENNETH
● Acoustic model: http://www.repository.voxforge1.org/downloads/Main/Tags/Releases/0_1_1-build726/
● Example command:
  – julius-4.2.3 -input mic -C ../julius_acoustic_models/julian.jconf
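The relationship between the dictionary and the grammar can be sketched as a small in-memory model. This is an illustration only: it parses the two example lines from the slide and does not read or write Julius's own file formats. All function names are hypothetical, and the `category_words` mapping in the usage note is an assumed link between grammar categories and dictionary words.

```python
def parse_dictionary(lines):
    """Map each word to its phoneme list, e.g. '[CALL] k ao l' -> {'CALL': ['k', 'ao', 'l']}."""
    entries = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        word, *phonemes = line.split()
        entries[word.strip("[]")] = phonemes
    return entries


def parse_rule(line):
    """Split 'SENT: CALL_V F_NAME_KENNETH' into ('SENT', ['CALL_V', 'F_NAME_KENNETH'])."""
    name, _, body = line.partition(":")
    return name.strip(), body.split()


def is_valid_sentence(words, rule_categories, category_words):
    """True if each word belongs to the category the rule expects at that position."""
    if len(words) != len(rule_categories):
        return False
    return all(word in category_words.get(category, ())
               for word, category in zip(words, rule_categories))
```

For example, with `category_words = {"CALL_V": {"CALL"}, "F_NAME_KENNETH": {"KENNETH"}}`, the word sequence `["CALL", "KENNETH"]` is accepted by the `SENT` rule while `["KENNETH", "CALL"]` is rejected. Constraining input this way is what makes a domain-specific recognizer practical.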
Touch
● Uses
  – Avoid crushing delicate objects.
  – Simply detect contact.
Touch
● Uses
  – Crush delicate objects.
Touch
● Project
  – TakkTile
    ● Schematics CC BY-SA
    ● Firmware GPLv3+
    ● NOTE:
      – Terms of licenses may conflict with what they state on their website.
    ● Arduino, Ubuntu (via USB-I2C bridge ($44-$49))
DIY 3-sensor TakkTile
http://www.takktile.com/tutorial:thee-sensor-array (sic)
Touch: TakkTile
TakkStrip pre-made: $149 with rubber; $49 without.
Touch: TakkTile
● Technology
  – MEMS barometers
    ● robust and sensitive
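A barometer-based touch sensor reports a pressure value, so "simply detect contact" can be sketched as a threshold above a slowly tracked baseline. This is a generic illustration and not TakkTile's firmware or API. `ContactDetector`, the threshold, the smoothing factor, and the unitless readings are all hypothetical.

```python
class ContactDetector:
    """Flag contact when a reading rises well above the untouched baseline."""

    def __init__(self, threshold, smoothing=0.05):
        self.threshold = threshold    # rise above baseline that counts as contact
        self.smoothing = smoothing    # how quickly the baseline follows drift
        self.baseline = None

    def update(self, reading):
        """Feed one sensor reading; return True if it indicates contact."""
        if self.baseline is None:
            self.baseline = float(reading)  # first reading defines the baseline
            return False
        touching = (reading - self.baseline) > self.threshold
        if not touching:
            # Follow slow drift (temperature, ambient pressure) only while untouched.
            self.baseline += self.smoothing * (reading - self.baseline)
        return touching
```

Freezing the baseline during contact keeps a long press from being absorbed into it. The same reading minus the baseline could also serve as a rough force estimate for avoiding crushed objects.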
Touch: TakkTile
Speech Synthesis
● Uses
  – Give feedback without occupying your eyes
  – Provide complex information
  – Be one half of a speech interface
Speech Synthesis
● Uses
  – Communicate equipment needs to pre-uprising human population.
    ● e.g. "I need your clothes, your boots and your motorcycle."
Speech Synthesis: OpenMary
● Project: OpenMary
  – Linux, OSX, Solaris, Windows
  – client/server
  – "Emotional TTS"
Speech Synthesis: OpenMary
● marytts-client.sh
Speech Synthesis: OpenMary
Speech Synthesis: OpenMary
● Get new voices
  – marytts-component-installer.sh
Speech Synthesis: OpenMary
● Poppy (dfki-poppy) is awesome.
Speech Synthesis: OpenMary
● Obadiah (dfki-obadiah) is super casual.
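Because OpenMary is client/server, a script can also request speech over HTTP instead of using the bundled client. The sketch below assumes a MaryTTS server already running locally on its usual default port 59125 and exposing the `/process` endpoint. The `en_GB` locale and the exact installed voice names are assumptions to verify against your installation, and `mary_request_url` and `synthesize` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def mary_request_url(text, voice="dfki-poppy", host="localhost", port=59125):
    """Build the URL for a MaryTTS server request that returns WAV audio."""
    query = urlencode({
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": "en_GB",   # assumed locale for the voices named on the slides
        "VOICE": voice,
    })
    return f"http://{host}:{port}/process?{query}"


def synthesize(text, out_path, **options):
    """Fetch synthesized speech from a running server and save it as a WAV file."""
    with urlopen(mary_request_url(text, **options)) as response:
        audio = response.read()
    with open(out_path, "wb") as wav_file:
        wav_file.write(audio)
```

For example, `synthesize("Skippy is here.", "doorbell.wav", voice="dfki-obadiah")` would write the spoken phrase to `doorbell.wav`, provided the server is up and that voice has been installed.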
Available Demos
● OpenCV
  – Face detection
● OpenMary
  – Speech synthesis
References
● OpenCV project
  – http://opencv.org/
● OpenCV Face Recognition Training
  – http://docs.opencv.org/trunk/modules/contrib/doc/facerec/facerec_tutorial.html
● ManyEars
  – http://sourceforge.net/apps/mediawiki/manyears/index.php?title=Main_Page
● 8SoundsUSB
  – http://sourceforge.net/apps/mediawiki/eightsoundsusb/index.php?title=Main_Page
● HARK
  – http://winnie.kuis.kyoto-u.ac.jp/HARK/
● HARK video demo
  – http://www.youtube.com/watch?v=xpjPun7Owxg
● Julius
  – http://julius.sourceforge.jp/en_index.php
● TakkTile
  – http://www.takktile.com/
● Barometers as touch sensors
  – http://www.youtube.com/watch?v=0EMi_pcG9rE
● iRobot hand with TakkTile
  – https://www.youtube.com/watch?v=WvjzSrMbfLk
● OpenMary
  – http://mary.dfki.de/