layla el asri, research scientist, maluuba

Teaching AI to Make Decisions and Communicate
Layla El Asri, Research Manager, Maluuba (a Microsoft company), with slides by Paul Gray, Harm Van Seijen, and Adam Trischler

Upload: mlconf

Post on 05-Apr-2017

Maluuba's Vision: Solving AGI by Creating Literate Machines

Machine Reading Comprehension
Teaching artificial agents to read and understand natural language

Advanced Conversational Systems
Building knowledgeable systems that can exchange information with users to help users accomplish tasks or gain knowledge

Reinforcement Learning
Fundamental research in scalability of Reinforcement Learning to allow machines to perform complex tasks in the real world

Maluuba, a Microsoft company

I'm Layla from Maluuba. Our vision is to solve artificial general intelligence by creating machines that can read, think, and communicate like humans. We started in 2011 and operate a deep learning and reinforcement learning lab in Montreal. In January, Maluuba was acquired by Microsoft.

Our work focuses on three areas: MRC, dialogue, and RL (quick intro for each).

Teaching AI to Make Decisions and Communicate

Expectations of AI
Learning to Learn
Learning to Perceive
Learning to Communicate

Expectations of AI

User: When is my appointment with Marc?
AI: You have a meeting with Marc Villeneuve tomorrow at 10am.
User: Ok, where is it again?
AI: At Starbucks on Maisonneuve and Montagne, so you should leave the office at 9:40.
User: Ok, is it the same Starbucks where I met Harry last week?
AI: Yes.
User: I see. Do you know what Marc's been up to lately?
AI: Yes, there was an article in MIT Tech Review yesterday. His company will start commercializing affordable 3D printers.
User: Nice, thanks.

Learning to Communicate
Learning to Learn
Learning to Perceive

Learning to Learn

Human beings decompose tasks into subtasks in an efficient way.

Subtasks are achieved without conscious awareness.

Learning to Learn: Separation of Concerns

Separation between performance metric and learning objective.
Each agent has its own learning objective.
The goal is to find a reasonable policy efficiently.
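One way to picture this separation is a set of small agents, each scoring actions against its own objective, with an aggregator that combines the scores into a single action. The sketch below is only an illustration: the summing aggregator, the class and function names, and the toy Q-values are all assumptions and are not taken from the slides.

```python
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]


class ConcernAgent:
    """Hypothetical agent that keeps Q-values for its own learning objective."""

    def __init__(self):
        self.q = defaultdict(float)  # (local_state, action) -> value

    def q_values(self, state):
        return {a: self.q[(state, a)] for a in ACTIONS}


def aggregate_action(agents, local_states):
    """Sum every agent's Q-values per action and pick the best (assumed rule)."""
    totals = {a: 0.0 for a in ACTIONS}
    for agent, state in zip(agents, local_states):
        for action, value in agent.q_values(state).items():
            totals[action] += value
    return max(ACTIONS, key=lambda a: totals[a])


# Toy values: agent_a likes "left", agent_b strongly dislikes it.
agent_a, agent_b = ConcernAgent(), ConcernAgent()
agent_a.q[("s0", "left")] = 1.0
agent_b.q[("g0", "left")] = -10.0
agent_b.q[("g0", "up")] = 0.5
chosen = aggregate_action([agent_a, agent_b], ["s0", "g0"])
print(chosen)
```

Each agent only sees its own local state, so its table stays small; the combined behaviour comes from the aggregator rather than from any single learner.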


Example of Application


Collecting the fruits

Goal
Get all fruits as quickly as possible

Reward
+1 if all fruits are eaten
0 otherwise

Number of fruits: n

State space: 100 × 100^n = 10^(2n+2)

NP-complete problem

Using one agent per fruit
State space reduced to n × 100
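The gap between the two state-space sizes is easiest to see with a few concrete values of n. The snippet below is a small arithmetic sketch; it reads the flat state space as 100 × 100^n (an assumption about the garbled exponent on the slide), and the function names are illustrative.

```python
GRID_CELLS = 100  # the "100" from the slide, assumed to be the number of grid cells


def flat_state_count(n_fruits: int) -> int:
    """Single learner: agent position times every fruit position (assumed reading)."""
    return GRID_CELLS * GRID_CELLS ** n_fruits


def decomposed_state_count(n_fruits: int) -> int:
    """One agent per fruit, 100 states each, as stated on the slide (n x 100)."""
    return n_fruits * GRID_CELLS


for n in (1, 5, 10):
    print(n, flat_state_count(n), decomposed_state_count(n))
```

The flat count grows exponentially in the number of fruits, while the decomposed count grows linearly.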


Pac-Boy

Reward
+1 for eating a fruit
-10 for each collision with a ghost

The episode ends after all fruits are eaten or after 300 time steps.

State space
Approximately 10^28 states

Configuration
1 agent per fruit
1 agent per ghost
75 fruit agents with 76 states
2 ghost agents with 76x76 states
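The numbers in this configuration can be checked with a short calculation. The decomposed total uses only the figures on the slide; the flat estimate assumes the full state is Pac-Boy's position, the two ghost positions, and one eaten/not-eaten bit per fruit, which is a guess at how the roughly 10^28 figure arises rather than something the slides state.

```python
import math

POSITIONS = 76   # states per fruit agent, from the slide
N_FRUITS = 75
N_GHOSTS = 2

# Decomposed: sum of the per-agent table sizes listed on the slide.
fruit_states = N_FRUITS * POSITIONS
ghost_states = N_GHOSTS * POSITIONS * POSITIONS
decomposed_total = fruit_states + ghost_states

# Flat estimate (assumption): Pac-Boy position, one position per ghost,
# and one eaten/not-eaten bit per fruit.
flat_estimate = POSITIONS * POSITIONS ** N_GHOSTS * 2 ** N_FRUITS
flat_order = math.floor(math.log10(flat_estimate))

print(decomposed_total, flat_order)
```

Under these assumptions the flat state space lands on the order of 10^28, while all the agents together need only a few thousand states.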


Demo: DQN vs. SoC

Results


Learning to Perceive

For living creatures, perception is adapted to task achievement
First living creatures: ability to react
Evolution: ability to foresee
Challenge: correlate sensory inputs with events
Modern human beings: ability to focus

Learning to Perceive: Information Gathering

Guessing Game: tasks that progress in difficulty
Battleship: sink the enemy's ships quickly
Hangman: guess the phrase quickly
Blockworld

We developed a model that achieves super-human performance on these tasks.

Blockworld

Environment
Observations
Model's World Belief
Peeking Policy
Model's Answer Belief

Is the red sphere above the red cross?

Information Gathering Model


Learning to Communicate

Language is the most precise communication tool that we have, but it is still very imprecise
Easier to give orders and strictly define the meaning of words

How to Build a Goal-Driven Dialogue System?

User: I want to go to Rio.
Natural Language Understanding (NLU): Inform(city = Rio)
State tracker: city = Rio, budget = $2000
Database: hotel = Hilton, price = $1950
Dialogue Management (DM): city = Rio, budget = $2000, hotel = Hilton, price = $1950; Offer(name = Hilton, price = $1950)
Natural Language Generation (NLG): You can book the Hilton for $1950.
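The pipeline on this slide can be traced end to end with toy components. The following is a minimal rule-based sketch, assuming a one-row in-memory database and regex-based NLU; every function name here is hypothetical, and a real system would use learned models for each stage.

```python
import re

HOTELS = [  # toy database; the single row mirrors the slide's example
    {"city": "Rio", "hotel": "Hilton", "price": 1950},
]


def nlu(utterance):
    """Toy NLU: map 'go to <city>' to an Inform dialogue act."""
    match = re.search(r"go to (\w+)", utterance)
    return ("Inform", {"city": match.group(1)}) if match else ("Unknown", {})


def track_state(state, act):
    """State tracker: merge the slots of the latest act into the state."""
    _, slots = act
    return {**state, **slots}


def manage(state):
    """Dialogue manager: query the database, offer the first hotel within budget."""
    budget = state.get("budget", float("inf"))
    for row in HOTELS:
        if row["city"] == state.get("city") and row["price"] <= budget:
            return ("Offer", {"name": row["hotel"], "price": row["price"]})
    return ("NoResult", {})


def nlg(act):
    """Template-based NLG."""
    name, slots = act
    if name == "Offer":
        return f"You can book the {slots['name']} for ${slots['price']}."
    return "Sorry, I could not find anything."


state = {"budget": 2000}  # budget assumed known from an earlier turn
state = track_state(state, nlu("I want to go to Rio."))
reply = nlg(manage(state))
print(reply)
```

Keeping each stage as a separate function mirrors the boxes in the diagram, so any one of them can be swapped for a learned component without touching the others.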


Going One Step Further: Modelling Memory


Frames Dataset Overview

15 Turns per Dialogue

268 Hotels

109 Cities

19,986 Turns

1369 Dialogues


Frame Tracking

[Figure: State Tracking compared with Frame Tracking. User turn: "And how much if I were to go to Columbus?" with Request(price). Boxes shown: "Curitiba, August 15th"; "Curitiba, August 15th to August 26th, 4 stars, $2877.68"; "Columbus, August 15th".]

Frame Tracking Model

Input
The NLU labels, the list of frames, the previous active frame, and the user utterance

Output
The current active frame and the frames referred to by the dialogue acts
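To make the input and output concrete, here is a hand-written heuristic sketch of a frame tracker. It is not the model from the talk: the switching rule, the `inherit` parameter, and the slot names are all assumptions, and it uses only the extracted slot values rather than the full utterance.

```python
def track_frames(frames, active, slots, inherit=("start",)):
    """Return (frames, active_index) after one user turn.

    frames:  list of dicts, one per frame
    active:  index of the previously active frame
    slots:   slot values from the NLU labels for the user utterance
    inherit: slots a new frame copies from the previous one (assumed rule)
    """
    current = frames[active]
    conflict = any(k in current and current[k] != v for k, v in slots.items())
    if not conflict:
        frames[active] = {**current, **slots}  # same frame, more detail
        return frames, active
    # Switch to an existing frame that already matches the new values.
    for i, frame in enumerate(frames):
        if all(frame.get(k) == v for k, v in slots.items()):
            return frames, i
    # Otherwise open a new frame, keeping only the inherited slots.
    new_frame = {k: current[k] for k in inherit if k in current}
    new_frame.update(slots)
    frames.append(new_frame)
    return frames, len(frames) - 1


frames = [{"city": "Curitiba", "start": "August 15th", "end": "August 26th",
           "stars": 4, "price": 2877.68}]
# "And how much if I were to go to Columbus?"
frames, active = track_frames(frames, 0, {"city": "Columbus"})
columbus_index = active
# A later turn that mentions Curitiba again returns to the first frame.
frames, active = track_frames(frames, active, {"city": "Curitiba"})
print(columbus_index, active, frames[columbus_index])
```

Unlike a single dialogue state that would be overwritten, the Curitiba frame stays in the list, so the user can come back to it later.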

Model


Thank you!

Papers discussed
Improving Scalability of Reinforcement Learning by Separation of Concerns
Towards Information-Seeking Agents
Frames: A Corpus For Adding Memory To Goal-Oriented Dialogue Systems

We're hiring!
Research Scientists
Research Engineers
Developers
Product/Program Managers

www.maluuba.com/careers
