
Progress Report

Reihaneh Rabbany

Presented for NLP Group

Computing Science Department

University of Alberta

April 2009

Agenda

• Project Proposal for Guiding Agent by Speech
• Many to Many Alignment by Bayesian Networks
  – Letter to Phoneme Alignment
  – Evaluation of Phylogenetic Trees

Quick RL overview

• An agent interacting with an environment
  – perceives the state
  – performs actions
  – receives rewards
• Agent
  – Computes the value of each action in each state
    • the long-term reward obtainable from this state by performing this action

$$Q(s,a) = E\left[ r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots \mid s_t = s, a_t = a \right], \quad 0 \le \gamma \le 1$$

  – Performs action selection by choosing the best action or, sometimes, a random action
    • exploration-exploitation trade-off

$$a' = \arg\max_a Q(s,a)$$
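
The value definition and the action-selection rule on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's implementation: the dictionary-based Q table, the `epsilon` exploration parameter, the function names, and the toy values are all assumptions introduced here.

```python
import random

def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k]; Q(s, a) is the expectation of this quantity."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

def select_action(q_values, state, actions, epsilon, rng=random):
    """Epsilon-greedy selection (an assumed exploration scheme):
    with probability epsilon pick a random action, otherwise argmax_a Q(state, a)."""
    if rng.random() < epsilon:
        return rng.choice(actions)  # exploration
    # exploitation: unseen (state, action) pairs default to a value of 0.0
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))

# Hypothetical toy values, for illustration only.
q = {("s0", "left"): 0.2, ("s0", "right"): 0.7}
print(select_action(q, "s0", ["left", "right"], epsilon=0.0))  # right
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))           # 1.75
```

With `epsilon=0.0` the agent always exploits; raising `epsilon` trades some of that for exploration.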

Project Proposal for Guiding Agent by Speech

• Accelerate learning using speech
  – The emotion in a speech signal carries a considerable amount of side information
  – Happiness or anger in a speech signal can provide a shaping reinforcement signal
• Develop tools and methods to extract emotion from speech, and design a methodology to use it as a shaping signal

Approaches to use speech signal as a guide for learning

• Extracting prosodic features from speech
• Associating meaning with these features
  – Supervised learning-based approach
    • data set of (prosodic features, emotion) pairs
      – excited, happy, upset, sad, bored
    • assigns a reward to the recognized emotion
  – Pure RL approach
    • inspired by the parent-infant learning process
      – the infant gradually learns to associate value with perceived speech and to use it to guide her exploration of the world

RL Approach

• Two ways of developing this idea
  – Augment the observation space to include the prosodic features
    • Emotion becomes state-dependent
  – Learn a separate instructor module
    • Estimates the value of prosodic features
    • Instructions (learnt instructor values) affect the agent's action selection

Instructions

• Different ways in which these instructions (learnt instructor values) could affect the agent's action selection
  – Balancing exploration and exploitation
    • When the speaker is not happy with what the agent is doing, the agent should explore other actions
  – Using them directly in action selection, with some weight $w$
    • Motivates the agent to keep its previous action if the instructor is satisfied with its current action

$$a' = \arg\max_a \left[ w\, Q(s,a) + (1 - w)\, I(e, a, a_{t-1}) \right]$$

  – Using them as a shaping reward, defining a new reward function by adding the instructor value to the actual reward received from the environment

$$r' = r + I(e)$$
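
The weighted action selection and the shaping reward described on this slide might look as follows in Python. This is a sketch under stated assumptions: the weight name `w`, the function names, and in particular `toy_instructor` (a stand-in for the learnt instructor value I) are hypothetical and not taken from the slides.

```python
def weighted_action(q_values, instructor, state, emotion, prev_action, actions, w):
    """argmax_a [ w * Q(s, a) + (1 - w) * I(e, a, a_prev) ]."""
    def score(a):
        return (w * q_values.get((state, a), 0.0)
                + (1.0 - w) * instructor(emotion, a, prev_action))
    return max(actions, key=score)

def shaped_reward(reward, instructor_value):
    """r' = r + I(e): add the instructor value to the environment reward."""
    return reward + instructor_value

def toy_instructor(emotion, action, prev_action):
    """Hypothetical I(e, a, a_prev): favours repeating the previous action
    when the speaker sounds happy and discourages it otherwise."""
    if action != prev_action:
        return 0.0
    return 1.0 if emotion == "happy" else -1.0

# Hypothetical toy values, for illustration only.
q = {("s0", "left"): 0.6, ("s0", "right"): 0.5}
acts = ["left", "right"]
print(weighted_action(q, toy_instructor, "s0", "happy", "right", acts, w=0.5))  # right
print(weighted_action(q, toy_instructor, "s0", "upset", "right", acts, w=0.5))  # left
print(shaped_reward(1.0, -0.5))                                                 # 0.5
```

In this toy setting a happy speaker keeps the agent on its previous action even though the other action has a slightly higher Q value, while an upset speaker pushes it away.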


Many to Many Alignment by Bayesian Networks

• Finding an alignment between two sequences
  – Assuming the order is preserved
• I have applied it to two tasks
  – Letter-to-phoneme alignment
    • Aligning the entries of a given dictionary
  – Evaluating phylogenetic trees
    • Shows how compatible the tree is with the given taxonomy


Model

• Word: sequence of letters

$$L = l_1 l_2 \dots l_n$$

• Pronunciation: sequence of phonemes

$$P = p_1 p_2 \dots p_m$$

• Alignment: sequence of subalignments

$$A = a_1 a_2 \dots a_k, \quad a_i = \langle L_i, P_i \rangle, \quad |L_i|, |P_i| \le 2$$

• Problem: finding the most probable alignment

$$A_{best} = \arg\max_A P(A \mid L, P)$$

• Assumption: subalignments are independent

$$P(A \mid L, P) = \prod_{i=1}^{k} P(a_i \mid L, P)$$
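
To make the model concrete, the sketch below enumerates every order-preserving alignment of a word and its pronunciation and scores each one as a product of subalignment probabilities. It is an illustration only, under assumptions that are not in the slides: each chunk has one or two symbols (no empty chunks), and the function names, the `unseen` fallback probability, and the toy probability table are hypothetical.

```python
from math import prod

def all_alignments(letters, phonemes, max_len=2):
    """All order-preserving alignments whose chunks hold 1..max_len symbols."""
    if not letters and not phonemes:
        return [[]]   # both sequences consumed: one (empty) way to finish
    if not letters or not phonemes:
        return []     # one side left over: no valid alignment
    out = []
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            out += [[head] + rest
                    for rest in all_alignments(letters[i:], phonemes[j:], max_len)]
    return out

def alignment_probability(alignment, sub_prob, unseen=1e-6):
    """Independence assumption: P(A) is the product of subalignment probabilities."""
    return prod(sub_prob.get(sub, unseen) for sub in alignment)

def best_alignment(letters, phonemes, sub_prob, max_len=2):
    """argmax over all candidates; raises ValueError if no alignment exists."""
    return max(all_alignments(letters, phonemes, max_len),
               key=lambda a: alignment_probability(a, sub_prob))

# Hypothetical toy probabilities, for illustration only.
toy_prob = {(("a",), ("x",)): 0.9, (("b",), ("y",)): 0.9}
print(len(all_alignments("ab", ["x", "y"])))        # 2
print(best_alignment("ab", ["x", "y"], toy_prob))
```

Exhaustive enumeration grows quickly with word length, so it is only practical here because dictionary entries are short.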

Many-to-Many EM

1. Initialize prob(SubAlignments)
// Expectation step
2. For each word in training_set
   2.1. Produce all possible alignments
   2.2. Choose the most probable alignment
// Maximization step
3. For all subalignments
   3.1. Compute new_p(SubAlignments)

$$P(a_i) = \frac{M[l_i, p_i]}{M[l_i]}$$
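
The loop above can be sketched in Python as follows. Because the expectation step keeps only the single most probable alignment per word, this is a "hard" (Viterbi-style) EM. The sketch assumes that M holds counts collected from the chosen alignments, that chunks have one or two symbols, and that unseen subalignments receive a small fallback probability `unseen`; these choices and all function names are assumptions, not details given in the slides.

```python
from collections import Counter
from math import prod

def all_alignments(letters, phonemes, max_len=2):
    """All order-preserving alignments whose chunks hold 1..max_len symbols."""
    if not letters and not phonemes:
        return [[]]
    if not letters or not phonemes:
        return []
    out = []
    for i in range(1, min(max_len, len(letters)) + 1):
        for j in range(1, min(max_len, len(phonemes)) + 1):
            head = (tuple(letters[:i]), tuple(phonemes[:j]))
            out += [[head] + rest
                    for rest in all_alignments(letters[i:], phonemes[j:], max_len)]
    return out

def hard_em(dictionary, iterations=5, max_len=2, unseen=1e-3):
    prob = {}  # step 1: every subalignment starts at the same value `unseen`
    for _ in range(iterations):
        pair_counts, letter_counts = Counter(), Counter()
        for letters, phonemes in dictionary:                         # step 2
            candidates = all_alignments(letters, phonemes, max_len)  # step 2.1
            if not candidates:
                continue  # no alignment fits the chunk-size limit
            best = max(candidates,                                   # step 2.2
                       key=lambda a: prod(prob.get(s, unseen) for s in a))
            for l_chunk, p_chunk in best:
                pair_counts[(l_chunk, p_chunk)] += 1
                letter_counts[l_chunk] += 1
        # step 3: P(a_i) = M[l_i, p_i] / M[l_i]
        prob = {(l, p): c / letter_counts[l] for (l, p), c in pair_counts.items()}
    return prob

# Hypothetical toy dictionary, for illustration only.
toy = [("a", ["x"]), ("a", ["x"]), ("a", ["y"])]
print(hard_em(toy))
```

With a uniform start, the product of probabilities favours alignments with fewer subalignments in the first iteration; a real system would need a more careful initialization.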

Dynamic Bayesian Network

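• Model (network diagram with nodes $l_i$, $P_i$ and $a_i$)

$$P(a_i \mid L, P) = P(a_i \mid L_i, P_i)$$

$$P(A) = \prod_{i=1}^{k} P(a_i \mid L_i, P_i)$$

• Subalignments: hidden variables
• Learn the DBN by EM

$$P(a_i) = \frac{M[a_i]}{M[p_i, l_i]}$$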

Context Dependent DBN

• Context-independence assumption
  – Makes the model simpler
  – Is not always correct
  – Example: P(<h,h>) in "Chat" and "Hat"
• Model (network diagram with nodes $l_i$, $P_i$, $a_i$ and $a_{i-1}$)

$$P(a_i \mid L, P) = P(a_i \mid a_{i-1}, L_i, P_i)$$

$$P(A) = \prod_{i=1}^{k} P(a_i \mid a_{i-1}, L_i, P_i)$$

$$P(a_i) = \frac{M[a_i]}{M[a_{i-1}, p_i, l_i]}$$
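
One way to see the effect of conditioning on the previous subalignment is to estimate P(a_i | a_{i-1}) by relative frequency over a set of aligned words. The sketch below uses a plain bigram-style count ratio, which is a simplification swapped in for illustration and not necessarily the estimator on the slide. The `START` marker, the "-" symbol for a silent letter, and the toy alignments of "hat" and "chat" are all hypothetical.

```python
from collections import Counter

START = ("<s>", "<s>")  # hypothetical marker for "no previous subalignment"

def context_dependent_probs(alignments):
    """Relative-frequency estimate of P(a_i | a_{i-1}) from aligned sequences."""
    pair_counts, context_counts = Counter(), Counter()
    for alignment in alignments:
        prev = START
        for sub in alignment:
            pair_counts[(prev, sub)] += 1
            context_counts[prev] += 1
            prev = sub
    return {(prev, sub): c / context_counts[prev]
            for (prev, sub), c in pair_counts.items()}

# Hypothetical toy alignments, for illustration only.
hat = [("h", "h"), ("a", "a"), ("t", "t")]
chat = [("c", "ch"), ("h", "-"), ("a", "a"), ("t", "t")]
probs = context_dependent_probs([hat, chat])
print(probs[(START, ("h", "h"))])         # 0.5
print(probs[(("c", "ch"), ("h", "-"))])   # 1.0
```

In this toy data <h,h> only ever appears at the start of a word, and after <c,ch> the letter h is always silent, which a context-independent P(<h,h>) cannot express.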


Evaluation of Phylogenetic Trees

• Phylogenetic trees
  – Show the evolution of species
• Taxonomy
  – Caninae; True dogs; Canis; Coyote
  – …
  – Caninae; True foxes; Vulpes; Kit Fox
  – Caninae; True foxes; Vulpes; Fennec Fox
  – …
  – Caninae; Basal Caninae; Otocyon; Bat-eared Fox
  – ...

Tree Evaluation

• Labeling the inner nodes of the tree
• For each species
  – A path in the tree
    • sequence of inner-node labels
  – A taxonomy description
    • taxonomy sequence
  – There should be a many-to-many alignment between these two sequences

Tree Evaluation (Cont.)

• Finding alignments between these sequences for all species
  – Finding the most probable alignments
• Measuring the mean probability of these alignments
  – How probable is this tree given this taxonomy?
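
The tree score described here, the mean probability of the best alignment per species, can be sketched with a small dynamic program instead of exhaustive enumeration. This is an illustrative sketch only: chunk sizes of one or two labels, the `unseen` fallback probability, the node labels `n1` and `n2`, the shortened taxonomy sequences, and the probability table are assumptions introduced for the example.

```python
def best_alignment_prob(path, taxonomy, sub_prob, max_len=2, unseen=1e-6):
    """Probability of the most probable many-to-many alignment (Viterbi-style DP).
    best[i][j] is the best score for aligning path[:i] with taxonomy[:j]."""
    n, m = len(path), len(taxonomy)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for di in range(1, min(max_len, i) + 1):
                for dj in range(1, min(max_len, j) + 1):
                    sub = (tuple(path[i - di:i]), tuple(taxonomy[j - dj:j]))
                    cand = best[i - di][j - dj] * sub_prob.get(sub, unseen)
                    best[i][j] = max(best[i][j], cand)
    return best[n][m]

def tree_score(species, sub_prob):
    """Mean, over species, of the best alignment probability."""
    probs = [best_alignment_prob(path, tax, sub_prob)
             for path, tax in species.values()]
    return sum(probs) / len(probs)

# Hypothetical toy data: inner-node labels n1, n2 and shortened taxonomy sequences.
toy_prob = {(("n1",), ("Caninae",)): 0.5, (("n2",), ("Canis",)): 0.5}
toy_species = {
    "Coyote": (["n1", "n2"], ["Caninae", "Canis"]),
    "Kit Fox": (["n1"], ["Caninae"]),
}
print(tree_score(toy_species, toy_prob))  # 0.375
```

A higher score means the paths in the tree line up more consistently with the taxonomy under the learnt subalignment probabilities.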

• Taxonomy and Trees

• Aligned result


Discussion
