simon listens: non-profit organization for research and training
Post on 14-Jan-2016
Page 1 of 48
simon listens
Non-profit organization for research and training
www.simon-listens.org
Page 2 of 48
Development of simon listens
2005: Identifying the problem
2006/07: Conception and basic programming of the open source software simon by Peter Grasch and a group of students of the HTBLA
2007: Foundation of "simon listens"
2008/09: Programming of the first stable prototype financed by the BMfVIT
Page 3 of 48
Research projects
Page 4 of 48
simon – scientific network
HTBLA – Higher Technical School Kaindorf
Graz University of Technology – Signal Processing & Speech Communication Laboratory – Prof. Kubin
Graz University of Technology – Institute for Software Technology (Robocup) – Prof. Wotawa
University of Graz – Austrian German Research Center – Prof. Muhr
Installation via workshops
Individual commission to the association simon listens or to the company Cyber-Byte EDV Services
Page 5 of 48
Simon expertise
Reasons for the fascination of simon listens
• Open source character
• Synthesis of AI technologies
• Definition of concrete use cases and promotion of research themes
• Professional conceptual work
• Interdisciplinarity
• Preference for pedagogical and social solutions
Page 6 of 48
Simon listens products: Simon
• Open source speech recognition system
• Based on
o Julius, HTK
o KDE4 / C++
• "Use case packages" for download
o Vocabulary, grammar, commands, training texts
• Acoustic model: base models
o Static, adapted, user-generated
Page 7 of 48
Simon listens products: Simon
Who benefits from simon?
1. Elderly persons with sensorimotor disabilities
2. Physically disabled people of every age
3. Quadriplegic people after an accident
Minimum requirements
• Conscious articulation of words or phoneme constellations
• Conscious determination of numbers from 0 to 9
• No computer literacy necessary
Page 8 of 48
Simon listens products: SSC
SSC: simon sample collector
ssc is a tool for large-scale sample acquisition. Using ssc, multiple teams can gather training data from potential end users or professional speakers and collect it on the central sscd server.
Page 9 of 48
Simon listens products: SAM
SAM: Simon Acoustic Modeller
SAM is a tool to create and test acoustic models. It can compile new speech models, use models created by simon and produce models that can be used by simon later on.
Page 10 of 48
Use case: Basic Autonomy
Speech control of a media center
You can listen to music or the radio and watch a slide show, TV or videos using just a few freely selectable words such as “right”, “left”, “up”, “down”, “ok” and “stop”.
Page 11 of 48
Use case: Basic Autonomy
Speech control of the Firefox browser
Reading the daily newspapers and surfing the internet is easy and uncomplicated. The number plug-in lets you click links just by entering numbers.
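The idea behind clicking links by number can be sketched in a few lines of Python. This is only an illustration, not the code of the simon number plug-in: the function names `number_links` and `resolve_spoken_number` and the example URLs are hypothetical, and it assumes that links are numbered in page order and that the user speaks one digit from 0 to 9 at a time.

```python
def number_links(links):
    """Assign 1-based numbers to the links of a page, in page order."""
    return {index: url for index, url in enumerate(links, start=1)}


def resolve_spoken_number(numbered, digits):
    """Map a sequence of spoken digits, e.g. ["1", "2"], to a link.

    Returns None if the input is not a digit sequence or no link has that number.
    """
    valid = digits and all(len(d) == 1 and d in "0123456789" for d in digits)
    if not valid:
        return None
    return numbered.get(int("".join(digits)))


# Hypothetical page with twelve links.
links = ["https://example.org/page%d" % i for i in range(1, 13)]
numbered = number_links(links)
print(resolve_spoken_number(numbered, ["1", "2"]))
```

Because only the digits 0 to 9 have to be recognized, a very small vocabulary is enough to reach any link on a page.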
Page 12 of 48
Use case: Basic Autonomy
Speech control of email clients
You can write to predetermined e-mail addresses by selecting them with numbers, and you can ask basic questions using expandable text modules.
Page 13 of 48
Use case: Basic Autonomy
Speech control of Skype
It is easy to establish connections to relatives and friends using Skype or other Voice-Over-IP solutions.
Page 14 of 48
Use case: Basic Autonomy
Desktop Grid
Move the mouse pointer quickly and easily by voice and perform single clicks, double clicks and similar actions.
Calculator
With the voice-controlled calculator you can carry out everyday arithmetic operations and print either the result or the operation.
Keyboard
The voice-controlled keyboard makes it easy to enter code words, TAN numbers, etc.
Page 15 of 48
Research Projects
• FFG – BENEFIT simon – verbal control of ICT applications for elderly people
http://www.youtube.com/watch?v=35tyZntA9j4
Page 16 of 48
Analysis of dysfunctions
• Movement disorders are dominant
• Visual and auditory disorders occur with medium frequency
• Speech ability remains longest
Page 17 of 48
Current Projects
ibi – I’m informed
Development of dialogues and integration of moving and speaking avatars (persons, comic figures)
Page 18 of 48
Current Projects
App112 – Security Connections using Keyword spotting
Page 19 of 48
Planned Projects
• Voice control via dialogues for clinical rooms (beds, TV, light, etc.)
• Voice control via dialogues for home automation
• Smartphone apps for Android, Windows Mobile and iOS
• Specific Austrian speech model for elderly people
• Voice control via dialogues for set-top boxes and television sets
Page 20 of 48
Project ASTROMOBILE
Page 21 of 48
Astro & one user
Page 22 of 48
ASTROMOBILE: simon tasks
To fulfil our task within the ASTROMOBILE project, we had to work on several sub-tasks with different scientific requirements, not only on technical issues:
• Programming
• Development of scenarios
• Development of dialogues
• Speech modelling
• Signal processing
Page 23 of 48
Programming the D-Bus Interface
The current draft identifies seven dedicated components:
• Navigator: Provides high-level navigation including obstacle avoidance and path planning
• Locator: Locates the robot and the person using the sensor network
• Sensors: Integration of Boolean sensors (bed sensor, smoke sensor, etc.)
• Speech Recognition: Command and control system utilizing simon
• Text-To-Speech: Synthesizes a given text in German, Italian and English
• AstroLogic: Logic layer
Page 24 of 48
Scenarios: User – Robot
General offers when the robot stops in front of the user after being called:
• Weather information
• News based on RSS feeds, read aloud by speech synthesis
• Multimedia offers like:
– Photos
– Music
– Videos
• Communication offers like Skype calls, phone calls, SMS, mail
• Organization offers: scheduler
• Calculator
• Keyboard
Page 25 of 48
Scenarios: User – Robot
Control functions in the user's natural environment, ordered by the user: the robot gives the configured feedback by recording a 10-second video and presenting it to the user when it comes back, for example:
• Control of the water in the bathroom
• Control of the doors in the environment
• Control of the cooker
• Control of the gas and other critical functions
Request functions: Using the simon touch platform, the user should be able to initiate requests such as:
• Request of new medicine
• Request of food
• Request of acute help
• Request of general help by the caregiver
• Request of caregiver transport to the doctor or to other events
• Predefined SMS service using the list plug-in
Page 26 of 48
Scenarios: Robot – User
Reminder functions with a request for help are prepared for the following situations:
• Alarm in the morning
• Reminding of the hygiene and dressing in the morning
• Reminding of the hygiene and facing in the evening
• Reminding of taking the prescribed drugs
• Reminding of periodic drinking
• Reminding of eating in the morning
• Reminding of eating at noon
• Reminding of eating in the evening
• Reminding of coffee time
• Reminding of periodic Skype calls
Page 27 of 48
Scenarios: Robot – User
Simple reminder functions without a request for help are prepared for the following situations:
• Reminding of events (based on the calendar)
• Reminding of birthdays
• Reminding of appointments like
– Meeting with friends
– Consultation with doctors
– Visit of events
– Personal appointments in the calendar
Dialogue actions (Skype and mailing) with a simple reaction, yes or no:
• Incoming Skype calls with the possibility to accept or refuse the call
• Incoming mails with the possibility to allow or refuse that the robot reads the message
• Incoming appointment requests with the possibility to allow or refuse the appointment
Page 28 of 48
Scenarios: Caregiver – Robot – User
Control functions:
• Caregivers have access to the information from the sensors in the environment
• Caregivers can manage the dialogues, appointments and reminder functions for the user in the calendar
• Caregivers can activate the robot to transmit a visual impression of the user in case of emergency
Communication functions:
• Caregivers can call the user using the Skype dialogue
• Caregivers can send an appointment to meet with the user using the robot and the calendar
• Caregivers can send information by e-mail using the mail reading dialogue
Page 29 of 48
Simon Touch Architecture
Like the rest of the developed solution, simon touch uses C++, Qt4 and the KDE libraries. In particular, we are using the Akonadi PIM service, the Nepomuk / Strigi search, the KLocale framework for localization and the Phonon multimedia system.
Page 30 of 48
Simon listens – Simon touch
Simon touch – voice controlled touchscreen interface
• Main screen
• Information center with
– Slideshow, Music, Video, News with speech output
• Optional functions
– Touchscreen keyboard
– Touchscreen calculator
– Touchscreen calendar
Page 31 of 48
Simon listens – Simon touch
• Communication center with
– Skype, Phone, SMS, Mail
• Control center with video recording and playback
– Control of water, doors, cooker, gas
– User can activate the control function
– Caregiver can activate the control function from outside and take a look using the integrated video stream
Page 32 of 48
Simon listens – Simon touch
• Request center with direct phone calls or mail order
– Shopping system, transport and support calls
Page 33 of 48
The dialogue system of Simon
Dialogues in the Astromobile project
Page 34 of 48
The dialogue system of Simon
The dialogue system of simon was implemented as a command plugin. It is essentially a finite-state machine.
States
Internally, every state consists of:
The current dialogue text. Every state can have several texts to give the dialogue a natural flow. Dialogue texts can use bound values and templates (see below).
Page 35 of 48
The dialogue system of Simon
Avatar
A state can be linked with an avatar (e.g. the face of a nurse, an icon, etc.)
Page 36 of 48
The dialogue System of simon
Options
By triggering an option (e.g. with a speech command), a state can change into another state or commands can be executed. Options have a trigger, a name and an optional icon, and they can be activated automatically after a certain time has passed since the state was entered.
Page 37 of 48
The dialogue System of simon
Bound values
Variables in the dialogue system are represented as bound values. For example, the name of the user could be written as $name$. The variable is replaced at run time using the list of configured bound values. There are four types of bound values:
Static
Connects the variable with a fixed text, e.g. the name of a patient.
QtScript
This variable takes the result of the given QtScript (ECMAScript, also known as “JavaScript”), evaluated at run time.
Output options
A dialogue can be shown graphically on the screen or played through the speaker by the integrated speech synthesis system (TTS).
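The substitution of bound values such as $name$ can be illustrated with a short Python sketch. It is not simon's implementation (simon is written in C++ and evaluates QtScript): the function `render` is hypothetical, a static bound value is modelled as a plain value, and a Python callable stands in for a script that is evaluated at run time.

```python
import re


def render(text, bound_values):
    """Replace every $name$ placeholder in a dialogue text.

    bound_values maps a name to either a fixed value (static bound value)
    or a callable evaluated at render time (stand-in for a script).
    Unknown placeholders are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        if name not in bound_values:
            return match.group(0)
        value = bound_values[name]
        return str(value() if callable(value) else value)

    return re.sub(r"\$(\w+)\$", substitute, text)


# Hypothetical dialogue text and bound values.
values = {"name": "Maria", "hour": lambda: 8}
print(render("Good morning $name$, it is $hour$ o'clock.", values))
```

Evaluating the callable on every call means the dialogue text always reflects the situation at the moment it is shown or spoken.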
Page 38 of 48
The dialogue System of simon
Implementation in simon
The dialogue states can be derived from the schematic diagram above. This results in:
• Three states (reminder, already taken, not taken)
• One time-driven trigger (which starts the dialogue at a specific time)
• Two speech transitions (“Yes”, “No”)
• One time-driven transition (renewed reminder after a time lapse)
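The structure listed above can be written down as a small state machine. The following Python sketch is only an illustration under assumptions, not the simon dialogue plugin: the state names, the dialogue texts and the 600-second delay are invented, and since the diagram is not reproduced here, it assumes that the time-driven transition leads from "not taken" back to the reminder.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class State:
    text: str                                              # dialogue text of the state
    speech: Dict[str, str] = field(default_factory=dict)   # spoken word -> next state
    timeout: Optional[Tuple[float, str]] = None            # (seconds, next state)


# Three states, two speech transitions, one time-driven transition (all hypothetical).
STATES = {
    "reminder": State("Have you taken your medicine?",
                      {"yes": "taken", "no": "not_taken"}),
    "taken": State("Very good."),
    "not_taken": State("I will remind you again later.",
                       timeout=(600, "reminder")),
}


class Dialogue:
    def __init__(self, states, start_state):
        self.states = states
        self.start_state = start_state
        self.current = None  # None means the dialogue is not running

    def _state(self):
        if self.current is None:
            raise RuntimeError("dialogue has not been started")
        return self.states[self.current]

    def start(self):
        """Time-driven trigger: a scheduler calls this at the configured time."""
        self.current = self.start_state
        return self._state().text

    def hear(self, word):
        """Speech transition: follow the option triggered by a recognized word."""
        target = self._state().speech.get(word.lower())
        if target is not None:
            self.current = target
        return self._state().text

    def tick(self, seconds_in_state):
        """Time-driven transition: leave the state once its timeout has elapsed."""
        timeout = self._state().timeout
        if timeout is not None and seconds_in_state >= timeout[0]:
            self.current = timeout[1]
        return self._state().text


dialogue = Dialogue(STATES, "reminder")
print(dialogue.start())
print(dialogue.hear("No"))
print(dialogue.tick(600))
print(dialogue.hear("Yes"))
```

Keeping the states in a plain table mirrors the idea that dialogues are configured rather than programmed; a caregiver-facing editor would only need to change the table.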
Page 39 of 48
The dialogue System of simon
Page 40 of 48
The dialogue System of simon
Page 41 of 48
The dialogue System of simon
Schedule based appointments or dialogue actions
Page 42 of 48
Simon listens – simon speech modelling
Speech models in English, German and Italian
• English: Adaptation of the open source speech model of Voxforge
• German: Adaptation of a self-produced speech model of elderly people
• Italian:
– Recording speech data of 46 persons from the region of Pontedera
– Modelling of a specific Italian speech model for elderly people
Page 43 of 48
Simon listens – signal processing
Current solution
Calling Astro with a Nokia N9 MeeGo mobile phone from anywhere in the user's natural environment (a MeeGo client was developed within this project)
Controlling Astro with a gooseneck microphone mounted on the front of the robot
Page 44 of 48
Simon listens – signal processing
Page 45 of 48
Simon listens – signal processing
Natural speech communication from a distance between humans and machines
A very good solution would need:
• A combination of speech recognition, tools of artificial intelligence and speech synthesis
• A combination of identification of the direction of sound, localization of the user, voice identification, face detection, etc.
• Very good sound segmentation
• Different zones of communication (call, communication and comfort zone) with special speech models and intelligent microphone management
The development of this solution would require a very large multidisciplinary project
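The idea of zones with their own speech models can be sketched as a simple lookup. Everything concrete in this Python sketch is an assumption made for illustration: the slides do not define the zone boundaries, their order by distance or any model names, so the thresholds in metres, the ordering (comfort nearest, call farthest) and the names such as `call_zone_model` are hypothetical.

```python
# (maximum distance in metres, zone name, speech model name); all values hypothetical.
ZONES = [
    (1.0, "comfort", "comfort_zone_model"),
    (3.0, "communication", "communication_zone_model"),
    (float("inf"), "call", "call_zone_model"),
]


def select_zone(distance_m):
    """Return (zone, speech model) for the estimated distance to the user."""
    if distance_m < 0:
        raise ValueError("distance must not be negative")
    for max_distance, zone, model in ZONES:
        if distance_m <= max_distance:
            return zone, model


print(select_zone(0.5))
print(select_zone(2.0))
print(select_zone(10.0))
```

In such a design the distance estimate would come from the localization components, and the recognizer would switch to the model returned for the current zone.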
Page 47 of 48
Astro & two users
Page 48 of 48
Synthesis of AI-Approaches
Multisensory natural speech communication between humans and robots / Ambient Assisted Living scenarios would need a high level of interdisciplinarity and a synthesis of AI technologies:
• Indoor localization
• Behaviour analysis
• Microphone installation with logic of call/security, communication and comfort zone
• Specific sound segmentation
• Identification of the direction of the voice
• Voice activity detection
• Voice identification for user identification
• Face detection
• Mouth movement detection
• Context-based speech recognition with specific speech models for call/security, communication and comfort zones
• Tools of artificial intelligence
• Dialogue systems for clarifying situations
• Intelligent, context-based, user-oriented dialogue systems
Robots / Smart homes