x.ai // data driven nyc // november 2014


Upload: firstmark

Post on 12-Jul-2015


TRANSCRIPT

Page 1: X.ai // Data Driven NYC // November 2014

a personal assistant who schedules meetings for you

DATA DRIVEN NYC NOVEMBER 2014 NEW YORK CITY

x.ai

Page 2: X.ai // Data Driven NYC // November 2014

Solution is middleware

input

Client

Email / Calendar

Server

Email / Calendar

output

ai trainer

Cc: amy@x.ai

ai

calendar insert

negotiation

Page 3: X.ai // Data Driven NYC // November 2014

Architecture

In order to understand automation we need to understand architecture

Natural Language Processing

Architectural Design

Meeting Preference Analysis

Page 4: X.ai // Data Driven NYC // November 2014

Architectural Design

Cerebellum. Conversation Model

User CC Amy

Amy

Users

New Meeting request

Propose time to participant

Request location from host

Reject time / propose time

Accept Time

Location sent

Page 5: X.ai // Data Driven NYC // November 2014

Intermediate States

Meeting Invite

Accept Time

Accept Location

New Meeting

preferences

Initial State

Architectural Design

Changes in meeting state trigger actions

Meeting invite

Not really a tree structure, but more of a “guided conversation structure”
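
The idea that a change of meeting state triggers an action can be sketched as a small state machine. This is only an illustration: the state names loosely follow the slides, while the event names, the transition table, and the `Meeting` class are hypothetical and are not taken from x.ai's implementation.

```python
from dataclasses import dataclass, field

# Hypothetical transition table: (current state, event) -> (next state, action).
# States and actions loosely follow the slides; the events are assumptions.
TRANSITIONS = {
    ("new_meeting", "host_cc_amy"): ("time_proposed", "propose time to participant"),
    ("time_proposed", "reject_time"): ("time_proposed", "propose time to participant"),
    ("time_proposed", "accept_time"): ("time_accepted", "request location from host"),
    ("time_accepted", "accept_location"): ("meeting_invite", "send meeting invite"),
}


@dataclass
class Meeting:
    state: str = "new_meeting"
    actions: list = field(default_factory=list)

    def handle(self, event: str) -> str:
        """Move to the next state and record the action that the change triggers."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {self.state!r}")
        self.state, action = TRANSITIONS[key]
        self.actions.append(action)
        return action


meeting = Meeting()
for event in ["host_cc_amy", "reject_time", "accept_time", "accept_location"]:
    meeting.handle(event)
print(meeting.state, meeting.actions)
```

In this sketch a rejected time keeps the meeting in the same state and triggers a fresh proposal, which is one way to read a guided conversation that loops rather than branching like a tree.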

Page 6: X.ai // Data Driven NYC // November 2014

Architectural Design

Pituitary gland. Conversation Model

Amy’s goals:
- Reduce ambiguity with each step
- Clear, geared at facilitating data detection from response
- Designed to optimize convergence
- Maximize happiness

Page 7: X.ai // Data Driven NYC // November 2014

Architectural Design

Pituitary gland. Conversation Model

*actual example

Page 8: X.ai // Data Driven NYC // November 2014

Natural language processing

Temporal Lobe. Natural Language Processing

When Amy receives an email:

Classification problem: What is the intent of the email?

Information extraction problem: What is the relevant data?

This email exemplifies the problem perfectly

Page 9: X.ai // Data Driven NYC // November 2014

Natural language processing

The categorization problem

person is accepting a time proposal

Need to detect the intent combination from possible intents
- Similar to article topic modeling but with shorter and noisier documents
- But stronger features…

Examples of features
- Characteristic N-grams
- Characteristic named entities
- Characteristic POS rules
- Correlation to previous emails
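
As a rough illustration of the first feature type, the sketch below counts word n-grams in an email body. The tokenizer, the function name `word_ngrams`, and the sample sentence are assumptions made for illustration; the slides do not say how the features were actually computed.

```python
import re
from collections import Counter


def word_ngrams(text: str, n_max: int = 2) -> Counter:
    """Count word n-grams of length 1..n_max in lowercased text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# A bigram such as "works for" is the kind of n-gram that might
# characterize an email accepting a proposed time.
features = word_ngrams("Tuesday at 2 works for me")
print(features["works for"])
```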

Page 10: X.ai // Data Driven NYC // November 2014

Natural language processing

The categorization problem

One-vs-all non-linear SVMs as a first stab:
- Optimize kernel / kernel params / features per intent
- Optimize for precision
- Calculate Platt confidence level as a function of “prediction score”

Categorization automated after comfortable confidence → further human verification/training

person is accepting a time proposal
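
Platt scaling turns an SVM decision score f into a probability via P(y=1 | f) = 1 / (1 + exp(A·f + B)), where A and B are fitted on held-out data. The sketch below shows how such a confidence could gate automation. The parameter values, the 0.9 threshold, and the three-way routing are assumptions for illustration, not the settings used at x.ai.

```python
import math


def platt_probability(score: float, a: float, b: float) -> float:
    """Platt scaling: map an SVM decision score to a probability.

    P(y=1 | score) = 1 / (1 + exp(a * score + b)); a is normally negative,
    so larger scores give higher probabilities.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def decide(score: float, a: float, b: float, threshold: float = 0.9) -> str:
    """Automate only confident predictions; send uncertain ones to a human."""
    p = platt_probability(score, a, b)
    if p >= threshold:
        return "intent_present"
    if p <= 1.0 - threshold:
        return "intent_absent"
    return "human_review"


# Placeholder parameters: in practice a and b would be fitted per intent classifier.
for score in (2.5, 0.3, -2.5):
    print(score, decide(score, a=-2.0, b=0.0))
```

A high threshold trades coverage for precision: fewer emails are handled automatically, but those that are carry less risk of a wrong intent.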

Page 13: X.ai // Data Driven NYC // November 2014

Natural language processing

Information extraction model

what time is the person accepting

1 Detection (SUTime CRF model – StanfordNLP + x.ai layers / re-training)
- Temporal expressions → “1-2 works”, “early next week”, “lunch”, ...
- Locations → “my office”, “Starbucks”, “Peking”, “Wall Street 48 5th floor”
- People → Marcos, Alex, Dennis, Matt

2 Resolution – x.ai logical email context layers
- “1-2 works” → 2014-11-13T18:00:00 UTC Duration: 1h
- “my office” → Wall Street 48 5th Floor
- “Marcos” → [email protected] userId:u23687khje672876

*can be broken down into two problems of varying difficulty
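
The resolution step for the “1-2 works” example can be sketched as follows. This is a minimal illustration, not the x.ai context layer: it assumes the day (2014-11-13) is already known from the thread, that the host is in New York (UTC-5 in November), and that bare hours below 8 mean the afternoon. The function name `resolve_hour_range` is hypothetical.

```python
import re
from datetime import date, datetime, timedelta, timezone

EST = timezone(timedelta(hours=-5))  # assumed host time zone (New York in November)


def resolve_hour_range(text: str, day: date, tz: timezone = EST):
    """Resolve a bare hour range such as "1-2" to a UTC start time and a duration.

    Assumption: hours below 8 mean the afternoon, since meetings
    usually fall within business hours.
    """
    match = re.search(r"\b(\d{1,2})\s*-\s*(\d{1,2})\b", text)
    if match is None:
        return None
    start_hour, end_hour = (int(h) for h in match.groups())
    if start_hour < 8:
        start_hour += 12
    if end_hour <= start_hour:
        end_hour += 12
    start = datetime(day.year, day.month, day.day, start_hour, tzinfo=tz)
    return start.astimezone(timezone.utc), timedelta(hours=end_hour - start_hour)


start_utc, duration = resolve_hour_range("1-2 works", date(2014, 11, 13))
print(start_utc.isoformat(), duration)
```

Under these assumptions, 1 pm in New York becomes 18:00 UTC with a duration of one hour, matching the resolved value shown on the slide.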

Page 14: X.ai // Data Driven NYC // November 2014

Classifier

New Meeting

Accept Time

Decline Time

Set preferences

Natural language processing

Information extraction model

Information model based on class

Etc..

Class Information Model

Look for time constraints

Find time in thread

Look for locations / times / preferences
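
One way to read “information model based on class” is a dispatch table from the predicted class to an extraction routine. The sketch below is hypothetical: the pairing of classes with routines is an assumption (the slide layout does not make it explicit), and the extractors are simple regular-expression stand-ins rather than the SUTime-based detection described earlier.

```python
import re

# Stand-in detector for clock times such as "1pm" or "10:30am".
TIME_PATTERN = re.compile(r"\b\d{1,2}(?::\d{2})?\s*(?:am|pm)\b", re.IGNORECASE)


def look_for_time_constraints(email: str, thread: list) -> dict:
    """A new request: collect the times mentioned in the email itself."""
    return {"time_constraints": TIME_PATTERN.findall(email)}


def find_time_in_thread(email: str, thread: list) -> dict:
    """An acceptance may not restate the time, so search earlier messages, newest first."""
    for message in reversed(thread):
        times = TIME_PATTERN.findall(message)
        if times:
            return {"accepted_time": times[-1]}
    return {"accepted_time": None}


# Assumed pairing of class -> information model.
INFORMATION_MODELS = {
    "new_meeting": look_for_time_constraints,
    "accept_time": find_time_in_thread,
}


def extract(intent: str, email: str, thread: list) -> dict:
    """Run the information model that belongs to the classifier's predicted class."""
    if intent not in INFORMATION_MODELS:
        raise KeyError(f"no information model for class {intent!r}")
    return INFORMATION_MODELS[intent](email, thread)


result = extract("accept_time", "That works for me.", ["Could we meet Thursday at 1pm?"])
print(result)
```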

Page 15: X.ai // Data Driven NYC // November 2014

User Preferences

Hippocampus: User Preferences

Different ways to obtain user preferences:
- User tells Amy his/her preferences
- We deduce them from the calendar

Page 16: X.ai // Data Driven NYC // November 2014

User Preferences

Hippocampus: User Preferences

Figure out “meeting density”
- Distribution over the week
- Per type of meeting: lunch, “afternoon business”, etc…
- Preferred meeting locations

A beautiful probability model emerges when we combine preferences for several users which
- Simultaneously makes everyone happy
- Does not make anyone sad
- Work in progress… :-)
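
A “meeting density” over the week can be estimated from past calendar entries, as in the sketch below. The slides describe the combined multi-user model only as work in progress, so the `combine` step (multiplying per-user densities and renormalizing, which treats users as independent) is purely an assumption, and the calendars are made up.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def meeting_density(starts: list) -> dict:
    """Fraction of a user's past meetings that fall on each weekday."""
    counts = Counter(WEEKDAYS[s.weekday()] for s in starts)
    total = sum(counts.values())
    return {day: (counts[day] / total if total else 0.0) for day in WEEKDAYS}


def combine(densities: list) -> dict:
    """Multiply per-user densities and renormalize (assumes users are independent)."""
    scores = {day: 1.0 for day in WEEKDAYS}
    for density in densities:
        for day in WEEKDAYS:
            scores[day] *= density[day]
    total = sum(scores.values())
    return {day: (score / total if total else 0.0) for day, score in scores.items()}


# Made-up calendars for two hypothetical users.
alice = meeting_density([datetime(2014, 11, 10, 12), datetime(2014, 11, 11, 15), datetime(2014, 11, 11, 10)])
bob = meeting_density([datetime(2014, 11, 11, 9), datetime(2014, 11, 13, 14)])
joint = combine([alice, bob])
print(max(joint, key=joint.get))
```

The same counting could be repeated per meeting type or per location to cover the other bullets; a product of densities is only one of many possible ways to combine users.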

Page 17: X.ai // Data Driven NYC // November 2014

More modules are involved

Many more modules are involved and under development.

Actively looking for new smart people to join the team at x.ai/jobs :-)

- Marcos Jimenez (PhD, UC Berkeley, CERN)
- Diedi Hu (PhD Columbia, CERN)
- Nikhil Raju (Columbia)
- Sid Chandra (Columbia)
- Angela Zhou (Columbia)

Page 18: X.ai // Data Driven NYC // November 2014

marcos @ x.ai
chief data scientist and co-founder

48 Wall Street, 5th Floor
New York, NY 10005

E: [email protected]
T: @xdotai