
AI for Complex Situations: Beyond Uniform Problem Solving

Michael Witbrock

DRSM, Learning to Reason

Cognitive Computing Research

[email protected]

© 2017 International Business Machines Corporation


Systems that Reason, Learn and Understand

Cognitive Systems

Beyond Data, Beyond Programs

© 2016 International Business Machines Corporation

many, many problems

Can be done.

Worth doing.

Can be programmed.

Worth programming.

Uniform

Diverse

Machine Reasoning Targets

Goal: Professional Level Competence



Professional Competence in Organizations


• Warn and explain at time of compliance risk

• Contextual ID of Relevant Co-workers during meeting planning

• Personalized Medicine based on recent research

• Corporate forms that only ask for new, knowable information

• Compliant re-engineering based on supply chain

• Validity maintenance for documentation and processes

• Code analysis and synthesis based on intent

Kinds of Thought

Neural Networks

• Humans and Animals

• Reactive

• Trained from Data

• Hard to Explain

• Particular Skills

• Data and Processing Driven Advances

Deliberate Reasoning

• Humans only (almost)

• Supervisory

• Trainable from Data or Language

• Explainable and Correctable

• Portable Skills

• Ubiquitous in Enterprise

• Ripe for Rapid Progress



Early, symbolic, small AI systems were impressive


SHRDLU: A program for understanding natural language (Terry Winograd, MIT, 1968-70) that carried on a simple dialog with a user about a small world of objects on a display screen.

http://hci.stanford.edu/~winograd/shrdlu/

AARON - The First Artificial Intelligence Creative Artist (Harold Cohen, UCSD, 1973–2016)

The Aaron system composes and physically paints novel artwork. It is a rule-based expert system using a declarative language.

http://www.viewingspace.com/genetics_culture/pages_genetics_culture/gc_w05/cohen_h.htm

Carnegie Learning’s Algebra Tutor (1999–present): This tutor encodes knowledge about algebra as production rules, infers models of students’ knowledge, and provides them with personalized instruction.

http://www.carnegielearning.com

IBM has developed landmark game playing systems


IBM Researcher Gerald Tesauro (1994) developed a self-teaching backgammon program called TD-Gammon. Starting from a random initial strategy, and learning its strategy almost entirely from self-play, TD-Gammon achieved a human world-champion level of performance.

On May 11, 1997, IBM’s Deep Blue (manned by co-creator Murray Campbell above) beat the world chess champion Garry Kasparov after a six-game match: two wins for IBM, one for the champion and three draws.

Playing checkers on the 701

On February 24, 1956, Arthur Samuel’s Checkers program, which was developed for play on the IBM 701, was demonstrated to the public on television. It is considered a milestone for artificial intelligence, and offered the public in the early 1960s an example of the capabilities of an electronic computer.

… and, of course, Watson for Jeopardy


Rise of Machine Learning

93.4%

“Machine learning models are machines for creating entanglement and making the isolation of improvements effectively impossible”

Machine Learning: The High-Interest Credit Card of Technical Debt (Sculley et al. – via Doug Beeferman, Sift Science)

BUT

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

https://deepmind.com/blog/wavenet-generative-model-raw-audio/

IBM Switchboard : https://arxiv.org/pdf/1604.08242v2.pdf

Parsing https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html

Image Synthesis http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf

NNs in Finance

1985: Modern Neural Nets Invented

1986: HNC founded

1990s: HNC Uses NN for Credit Score

1991: Neural Networks in Banking

1994: IBM Neural Network Utility Released

2002: Fair Isaac Company buys HNC

FICO Score

2017: almost no major applications


Hypothesis

Easy NN learnability depends on simple underlying causal structure overlaid with hard-to-describe variation of limited complexity


Speech

Words Words Words

Phonemes Phonemes

Phones Phones Phones

Prosody

Intonation

Pitch Rate Volume

Textual Utterance

• Variation from:

‒ large vocabulary

‒ subtle interaction effects

• Perhaps: Similar shallow semantic combination structure


I am not a computer

Je ne suis pas un ordinateur

[Figure: parse trees for both sentences (constituents S, NP, VP), with part-of-speech tags PRP VBP RB DET NN for the English sentence]

Shallow NLP

Go, Backgammon, … 80’s Video Games

Backgammon Rules (US Backgammon Federation), ~3 pages: http://usbgf.org/learn-backgammon/backgammon-rules-and-terms/rules-of-backgammon/

How to Play Go (British Go Association), 12 pages of instructions: https://www.britgo.org/files/pubs/playgo.pdf

Object Recognition


• More or less rigid

• Slowly changing

• Recursive structure

• Surface properties

• Illumination

• Optics


Hypothesis

Easy NN learnability depends on simple underlying structure overlaid with hard-to-describe variation of limited complexity

Theory of the World

Statistical Model of Variation

How do Humans Build Large Models of the World?


Explicit, Symbolic, Compositional Theory of the World

Implicit, Tacit, Statistical Models of Variation

Learned Mostly Learned

Statistical ML + Symbolic Reasoning

Statistical Deep Learning has become the engine of machine learning

Rich knowledge graphs and KBs have become the foundation for symbolic reasoning

+

Continuous Online Learning

Real-time Knowledge Fusion

Causal Inference is rapidly becoming more practical and well-founded

e.g. Sara Magliacane, Tom Claassen, Joris M. Mooij. Ancestral Causal Inference. In Proceedings of Advances in Neural Information Processing Systems 29 (NIPS 2016).

http://arxiv.org/abs/1606.07035v3

Methods for Symbolic Evidence Assembly


Inductive Logic Programming

Analogical Mapping

Program Synthesis

NL Text Synthesis

…

Persistent, minimally inconsistent, general purpose knowledge

Task relevant knowledge and data

Probabilistic inference

Logical inference

Textual entailment

Knowledge Source

Task Context

Watson

Cog to Cog

Cog to Human

Human to Human

Influence Diagram

Constructor

Consequence Table

Fact Checker

Rule Elicitation

Lighting

Critical Sites

Objective Identification

Sensitivity Analysis

Personal Avatar

Sentiment Analysis

Sequential Markov Decision Process

Smart Swaps

Collaborative Cognition

Cognitive agents that collectively learn and leverage sophisticated models of users, engaging with us via adaptive multi-modal interfaces



Beyond Data, Beyond Programs, Beyond Narrow Tasks


Humans want to do and can

Can be programmed.

Worth programming.

Uniform

Diverse

Machine Learning & Reasoning Targets

Explicit, Symbolic, Composable Theory of the World

Implicit, Tacit, Statistical Model of Variation

most problems

Rich, Compositional Knowledge Representations

• Logic

• Probabilistic logics

• Trainable programs

• Distributed representations

• Trained Neural Network Reasoners

• logics over continuous mathematical structures

• differentiable logics to allow reliable approximate reasoning
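
One way to read the last bullet, as a minimal sketch: replace Boolean truth values with degrees of truth in [0, 1] and the connectives with smooth functions, so that the truth of a rule can be improved by gradient methods. The product-based connectives and all function names below are illustrative assumptions; the slides do not name a specific logic.

```python
# Soft (differentiable) logic over truth degrees in [0, 1].
# Assumption: product t-norm connectives, chosen only for illustration.

def soft_not(a: float) -> float:
    return 1.0 - a

def soft_and(a: float, b: float) -> float:
    return a * b

def soft_or(a: float, b: float) -> float:
    # Probabilistic sum, the dual of the product t-norm.
    return a + b - a * b

def soft_implies(a: float, b: float) -> float:
    # Material implication: (not a) or b, which simplifies to 1 - a + a*b.
    return soft_or(soft_not(a), b)

def grad_implies_wrt_b(a: float, b: float) -> float:
    # d/db [1 - a + a*b] = a: a smooth signal a learner could follow.
    return a

if __name__ == "__main__":
    # On crisp inputs (0.0 or 1.0) the connectives agree with Boolean logic.
    print(soft_implies(1.0, 0.0), soft_implies(0.0, 0.0))
    # On graded inputs the rule is partly true and has a usable gradient.
    print(soft_implies(0.9, 0.2), grad_implies_wrt_b(0.9, 0.2))
```

Because every connective is a polynomial in its inputs, a rule built from them is differentiable end to end, which is what makes approximate reasoning trainable in this reading.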


Rethink Logic

Rethinking Computation with Reusable, Compositional Learning

Symbolic & Trainable

Logic

Reusable (learned) Representations

Language to Representation to Language

Explainable AI

Integration, testing, deployment and experimentation

Knowledge Extraction

Document Understanding

Entities, Relations

Facts, Rules

Automated Knowledge Base Creation

Unstructured/Semi-structured

Massive Unstructured

Shallow Structured

Whole stack optimization

• Key Insight from the Machine Learning / Deep Learning workloads:

• Machine Learning algorithms are inherently more tolerant to approximations at every level in the stack from algorithm down to the hardware implementations.

• Approximations at the hardware level can be embedded in the architecture

• Reduced Precision for Computes (8-bit vs. 64-bit)

• Relaxed Synchronization between threads.

• Native Devices / Circuits that can add “noise” to computations and help “regularize” parameters

• 10-100X over commodity CPU-GPU clusters can be targeted in the foreseeable time-frame.

• Significant Hardware Speed-Up for Symbolic Reasoning also likely attainable
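
The reduced-precision bullet can be illustrated with a small sketch: quantize two vectors to 8-bit integers, take the dot product with integer multiplies, and compare against the 64-bit floating-point result. The symmetric per-vector scaling scheme and the function names are assumptions made for this example, not a description of any particular hardware.

```python
import random

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    return [round(v / scale) for v in values], scale

def dot_int8(xs, ys):
    """Dot product with 8-bit operands and an integer accumulator."""
    qx, sx = quantize_int8(xs)
    qy, sy = quantize_int8(ys)
    return sum(a * b for a, b in zip(qx, qy)) * sx * sy

def relative_error(n=1000, seed=0):
    """Relative gap between the 8-bit and 64-bit dot products."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [rng.uniform(0.0, 1.0) for _ in range(n)]
    exact = sum(a * b for a, b in zip(xs, ys))  # Python floats are 64-bit
    return abs(dot_int8(xs, ys) - exact) / exact

if __name__ == "__main__":
    print(relative_error())
```

The rounding errors of individual elements largely cancel in the sum, which is one informal reason the dot products at the heart of neural networks tolerate low precision.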

Approximate Numerical Optimization, Stochastic Optimization Methods

Relaxed Synchronization in a Distributed Computing Model: Across nodes or across cores

Non-parametric Learning,

Deep neural networks, Kernel-based methods

Limited Supervision Learning

Active Learning, multi-modal learning

High-Dimensional Learning

Low-rank structure, Dimensionality reduction

Few, expensive iterations v/s Many, cheap iterations

Programming Interface: Language extensions, Probabilistic Programming Languages

Fast Numerical Linear Algebra via Randomized Algorithms: SVD, Eigen-decomposition, Matrix Multiplication etc.

Hardware Acceleration via Approximate Computing: Low precision arithmetic, stochastic computing circuits

Sub-10 nm Si-CMOS: Relax constraints on device variability

Beyond Si-CMOS and Emerging Device Technology: Carbon-based logic, Resistive RAM, PCM etc..

Approximations at the hardware level can tremendously improve the computational efficiencies of Machine Learning Systems that are inherently more resilient to these approximations.
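
As a hedged illustration of the "stochastic computing circuits" mentioned above: a value in [0, 1] can be encoded as a random bitstream whose fraction of ones equals the value, and two such independent streams can then be multiplied with a single AND gate per bit. The sketch below simulates this in software; the function names, stream length, and seed are assumptions for the example only.

```python
import random

def to_bitstream(p, n_bits, rng):
    """Encode p in [0, 1] as random bits that are 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]

def stochastic_multiply(p, q, n_bits=10_000, seed=0):
    """Approximate p * q by AND-ing two independent bitstreams."""
    rng = random.Random(seed)
    a = to_bitstream(p, n_bits, rng)
    b = to_bitstream(q, n_bits, rng)
    # Each output bit is 1 only when both input bits are 1,
    # which happens with probability p * q.
    return sum(x & y for x, y in zip(a, b)) / n_bits

if __name__ == "__main__":
    print(stochastic_multiply(0.5, 0.4))
```

The result is only approximate, and its accuracy improves with stream length. That trade of exactness for very cheap logic is the kind of approximation an error-tolerant learning workload can absorb.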

Cognitive Environment Scenarios

Cognitive Boardroom

Cognitive Design Studio

Diagnostic Theatre

Natural spoken language understanding

Human and object identification and tracking

Gesture understanding

Multi-human user interfaces

Parallel, group cognitive computing

Understanding interpersonal dynamics

Wide-area sensing, actuation, and visualization

Fast multimedia search and understanding

Interpreting human hierarchy and intention

Tracking conversation history and context

Core Cognitive Services

Sensor- and actuator-rich physical immersive environment

Data & Information Services

Crawl, Semantic Indexing

Analytic Services

Modeling, Simulation

Learning Services

Context, Trends, Behavior

Creativity Services

Novelty, Aesthetic Principles

Discovery Services

Inferences, Semantics


Image and video comprehension


Anomaly

• NN Image Processing

• Shallow and Deep Text Processing

• Integrated Symbolically with Medical Background Knowledge

Deep Understanding of Business Artifacts: Compliance

.pdf

Line Plot

Bulleted List

• Create representation for an obligation

• Models for “obligation language”

• Reason about list or data that refines the obligation

• Create document fragments by parsing out chunks

• Document structure models

• Reason about document chunks

Obligation

• Create representation for a fragment

• Document fragment models

• Reason about fragment constituents

fragment

Section

fragment

fragment

• Hierarchical Processing

• Machine-learned models and reasoning at all levels

• Learnability of artifacts, models

• Learn how to specify reasoners

Early Example:

7. All multilateral systems in financial instruments shall operate either in accordance with the provisions of Title II concerning MTFs or OTFs or the provisions of Title III concerning regulated markets.

Any investment firms which, on an organised, frequent, systematic and substantial basis, deal on own account when executing client orders outside a regulated market, an MTF or an OTF shall operate in accordance with Title III of Regulation (EU) No 600/2014.

Without prejudice to Articles 23 and 28 of Regulation (EU) No 600/2014, all transactions in financial instruments as referred to in the first and the second subparagraphs which are not concluded on multilateral systems or systematic internalisers shall comply with the relevant provisions of Title III of Regulation (EU) No 600/2014.

Regulation (EU) No 600/2014: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32014R0600

Compliance Lead: Vijay Saraswat
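
The step "create representation for an obligation" might be sketched as follows for regulatory text of the kind quoted in the early example. This is a deliberately naive illustration: the `Obligation` fields, the `extract_obligation` helper, and the reliance on the modal verb "shall" are all assumptions, and the slides do not describe the actual representation or the machine-learned models used.

```python
import re
from dataclasses import dataclass

@dataclass
class Obligation:
    bearer: str  # who is obliged (text before "shall")
    action: str  # what they must do (text after "shall")

# Hypothetical pattern: a single sentence, split on its first "shall".
_SHALL = re.compile(r"^(?P<bearer>.+?)\s+shall\s+(?P<action>.+?)\.?$")

def extract_obligation(sentence: str):
    """Return an Obligation for a 'shall' sentence, or None otherwise."""
    text = " ".join(sentence.split())  # undo line wrapping
    match = _SHALL.match(text)
    if match is None:
        return None
    return Obligation(match.group("bearer"), match.group("action"))

if __name__ == "__main__":
    sentence = (
        "All multilateral systems in financial instruments shall operate "
        "either in accordance with the provisions of Title II concerning "
        "MTFs or OTFs or the provisions of Title III concerning regulated "
        "markets."
    )
    print(extract_obligation(sentence))
```

A pattern like this captures only the surface form. Qualifiers such as "Without prejudice to Articles 23 and 28" end up inside the bearer text, which suggests why the slides call for learned models and reasoning at every level rather than fixed rules.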

http://research.ibm.com/cognitive-computing/cognitive-horizons-network/

Cognitive Horizons Network

Join Us

Cognitive Systems

Knowledge Representation and Reasoning

Learning to Reason

Michael Witbrock

[email protected]

