AI for Complex Situations: Beyond Uniform Problem Solving
Michael Witbrock
DRSM, Learning to Reason
Cognitive Computing Research
© 2017 International Business Machines Corporation
Systems that Reason, Learn and Understand
Cognitive Systems
Beyond Data, Beyond Programs
many, many problems
Can be done.
Worth doing.
Can be programmed.
Worth programming.
Uniform
Diverse
Machine Reasoning Targets
Goal: Professional Level Competence
Professional Competence in Organizations
• Warn and explain at time of compliance risk
• Contextual ID of Relevant Co-workers during meeting planning
• Personalized Medicine based on recent research
• Corporate forms that only ask for new, knowable information
• Compliant re-engineering based on supply chain
• Validity maintenance for documentation and processes
• Code analysis and synthesis based on intent
Kinds of Thought
Neural Networks
• Humans and Animals
• Reactive
• Trained from Data
• Hard to Explain
• Particular Skills
• Data and Processing Driven Advances
Deliberate Reasoning
• Humans only (almost)
• Supervisory
• Trainable from Data or Language
• Explainable and Correctable
• Portable Skills
• Ubiquitous in Enterprise
• Ripe for Rapid Progress
Early, symbolic, small AI systems were impressive
SHRDLU (Terry Winograd, MIT, 1968-70): A program for understanding natural language that carried on a simple dialog with a user about a small world of objects on a display screen.
http://hci.stanford.edu/~winograd/shrdlu/
AARON, the First Artificial Intelligence Creative Artist (Harold Cohen, UCSD, 1973–2016): The AARON system composes and physically paints novel artwork. It is a rule-based expert system using a declarative language.
http://www.viewingspace.com/genetics_culture/pages_genetics_culture/gc_w05/cohen_h.htm
Carnegie Learning’s Algebra Tutor (1999–present): This tutor encodes knowledge about algebra as production rules, infers models of students’ knowledge, and provides them with personalized instruction.
http://www.carnegielearning.com
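To make "knowledge about algebra as production rules" concrete, here is a minimal sketch of a condition-action rule loop that solves equations of the form a·x + b = c. The rule names, the state layout, and the `solve` helper are illustrative assumptions, not the tutor's actual rule set, and the student-modelling side of the tutor is not covered.

```python
from fractions import Fraction

# Hypothetical state layout: the tuple (a, b, c) stands for the equation a*x + b = c.

def rule_subtract_constant(state):
    """IF there is a constant term on the left THEN subtract it from both sides."""
    a, b, c = state
    if b != 0:
        return (a, Fraction(0), c - b), "subtract the constant from both sides"
    return None

def rule_divide_coefficient(state):
    """IF only a*x remains and a is neither 0 nor 1 THEN divide both sides by a."""
    a, b, c = state
    if b == 0 and a not in (0, 1):
        return (Fraction(1), b, c / a), "divide both sides by the coefficient"
    return None

RULES = [rule_subtract_constant, rule_divide_coefficient]

def solve(a, b, c):
    """Fire the first applicable rule repeatedly until no rule applies."""
    state = (Fraction(a), Fraction(b), Fraction(c))
    trace = []
    fired = True
    while fired:
        fired = False
        for rule in RULES:
            outcome = rule(state)
            if outcome is not None:
                state, step = outcome
                trace.append(step)
                fired = True
                break
    return state, trace

final_state, steps = solve(2, 3, 11)  # the equation 2x + 3 = 11
for step in steps:
    print(step)
print("x =", final_state[2])
```

The recorded trace of fired rules is the kind of artifact a tutoring system could compare against a student's own steps; how the real tutor does this is not described in the slides.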
IBM has developed landmark game playing systems
IBM researcher Gerald Tesauro developed a self-teaching backgammon program called TD-Gammon (1994). Starting from a random initial strategy, and learning almost entirely from self-play, TD-Gammon achieved a human world-champion level of performance.
On May 11, 1997, IBM’s Deep Blue (manned by co-creator Murray Campbell, above) beat the world chess champion Garry Kasparov in a six-game match: two wins for IBM, one for the champion and three draws.
Playing checkers on the 701
On February 24, 1956, Arthur Samuel’s Checkers program, which was developed for play on the IBM 701, was demonstrated to the public on television. It is considered a milestone for artificial intelligence, and offered the public in the early 1960s an example of the capabilities of an electronic computer.
… and, of course, Watson for Jeopardy
Rise of Machine Learning
93.4%
BUT
“Machine learning models are machines for creating entanglement and making the isolation of improvements effectively impossible”
Machine Learning: The High-Interest Credit Card of Technical Debt (Sculley et al. – via Doug Beeferman, Sift Science)
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
IBM Switchboard: https://arxiv.org/pdf/1604.08242v2.pdf
Parsing: https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html
Image Synthesis: http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf
NNs in Finance
1985: Modern Neural Nets Invented
1986: HNC founded
1990s: HNC Uses NN for Credit Score
1991: Neural Networks in Banking
1994: IBM Neural Network Utility Released
2002: Fair Isaac Company buys HNC
FICO Score
2017: almost no major applications
Hypothesis
Easy NN learnability depends on simple underlying causal structure overlaid with hard-to-describe variation of limited complexity
Speech
[Figure: a textual utterance realized as layers of words, phonemes, and phones, with prosody (intonation, pitch, rate, volume)]
• Variation from:
‒ large vocabulary
‒ subtle interaction effects
• Perhaps: Similar shallow semantic combination structure
Shallow NLP
I am not a computer
Je ne suis pas un ordinateur
[Figure: parse trees over the two sentences, with node labels S, NP, VP and part-of-speech tags PRP, VBP, RB, DET, NN]
Go, Backgammon, … 80’s Video Games
Backgammon Rules (US Backgammon Federation), ~3 pages: http://usbgf.org/learn-backgammon/backgammon-rules-and-terms/rules-of-backgammon/
How to Play Go (British Go Association), 12 pages of instructions: https://www.britgo.org/files/pubs/playgo.pdf
Object Recognition
• More or less rigid
• Slowly changing
• Recursive structure
• Surface properties
• Illumination
• Optics
Hypothesis
Easy NN learnability depends on simple underlying structure overlaid with hard-to-describe variation of limited complexity
Theory of the World
Statistical Model of Variation
How do Humans Build Large Models of the World?
Explicit, Symbolic, Compositional Theory of the World
Implicit, Tacit, Statistical Models of Variation
Learned Mostly Learned
Statistical ML + Symbolic Reasoning
Statistical Deep Learning has become the engine of machine learning
+
Rich knowledge graphs and KBs have become the foundation for symbolic reasoning
Continuous Online Learning
Real-time Knowledge Fusion
Causal Inference is rapidly becoming more practical and well-founded
e.g. Sara Magliacane, Tom Claassen, Joris M. Mooij. Ancestral Causal Inference. In Proceedings of Advances in Neural Information Processing Systems 29 (NIPS 2016)
Methods for Symbolic Evidence Assembly
Inductive Logic Programming
Analogical Mapping
Program Synthesis
NL Text Synthesis
…
Persistent, minimally inconsistent, general purpose knowledge
Task relevant knowledge and data
Probabilistic inference
Logical inference
Textual entailment
Knowledge Source
Task Context
Watson
Cog to Cog
Cog to Human
Human to Human
Influence Diagram
Constructor
Consequence Table
Fact Checker
Rule Elicitation
Lighting
Critical Sites
Objective Identification
Sensitivity Analysis
Personal Avatar
Sentiment Analysis
Sequential Markov Decision Process
Smart Swaps
Collaborative Cognition
Cognitive agents that collectively learn and leverage sophisticated models of users, engaging with us via adaptive multi-modal interfaces
Beyond Data, Beyond Programs, Beyond Narrow Tasks
Humans want to do and can
Can be programmed.
Worth programming.
Uniform
Diverse
Machine Learning & Reasoning Targets
Explicit, Symbolic, Composable Theory of the World
Implicit, Tacit, Statistical Model of Variation
most problems
Rich, Compositional Knowledge Representations
• Logic
• Probabilistic logics
• Trainable programs
• Distributed representations
• Trained Neural Network Reasoners
• Logics over continuous mathematical structures
• Differentiable logics to allow reliable approximate reasoning
Rethink Logic
Rethinking Computation with Reusable, Compositional Learning
Symbolic & Trainable Logic
Reusable (learned) Representations
Language to Representation to Language
Explainable AI
Integration, testing, deployment and experimentation
Knowledge Extraction
Document Understanding
Entities, Relations
Facts, Rules
Automated Knowledge Base Creation
Unstructured/Semi-structured
Massive Unstructured
Shallow Structured
Whole stack optimization
• Key Insight from the Machine Learning / Deep Learning workloads:
• Machine Learning algorithms are inherently more tolerant to approximations at every level in the stack, from algorithm down to the hardware implementations.
• Approximations at the hardware level can be embedded in the architecture
‒ Reduced Precision for Computes (8-bit vs. 64-bit)
‒ Relaxed Synchronization between threads.
‒ Native Devices / Circuits that can add “noise” to computations and help “regularize” parameters
• 10-100X over commodity CPU-GPU clusters can be targeted in the foreseeable time-frame.
• Significant Hardware Speed-Up for Symbolic Reasoning also likely attainable
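The tolerance to reduced precision described above can be illustrated with a minimal sketch: the weights of a small linear scorer are quantized to signed 8-bit integers and the resulting score is compared with the full-precision one. The data is random, and the helper names (`quantize_int8`, `dequantize`, `dot`) are assumptions made for illustration; nothing here reflects a specific IBM hardware design.

```python
import random

def quantize_int8(values):
    """Symmetric quantization: map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = max_abs / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [q * scale for q in q_values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(256)]
inputs = [random.uniform(-1.0, 1.0) for _ in range(256)]

q_weights, scale = quantize_int8(weights)
approx_weights = dequantize(q_weights, scale)

exact = dot(weights, inputs)
approx = dot(approx_weights, inputs)
max_weight_error = max(abs(w - a) for w, a in zip(weights, approx_weights))

print("full-precision score:", exact)
print("8-bit score:        ", approx)
print("largest per-weight error:", max_weight_error)
```

Each weight moves by at most half a quantization step, so the error in the score is bounded by the number of weights times half the step times the largest input magnitude. In practice the individual errors tend to cancel, which is one informal reading of why such workloads tolerate low precision.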
Approximate Numerical Optimization, Stochastic Optimization Methods
Relaxed Synchronization in a Distributed Computing Model: Across nodes or across cores
Non-parametric Learning: Deep neural networks, Kernel-based methods
Limited Supervision Learning: Active Learning, multi-modal learning
High-Dimensional Learning: Low-rank structure, Dimensionality reduction
Few, expensive iterations vs. Many, cheap iterations
Programming Interface: Language extensions, Probabilistic Programming Languages
Fast Numerical Linear Algebra via Randomized Algorithms: SVD, Eigen-decomposition, Matrix Multiplication etc.
Hardware Acceleration via Approximate Computing: Low precision arithmetic, stochastic computing circuits
Sub-10 nm Si-CMOS: Relax constraints on device variability
Beyond Si-CMOS and Emerging Device Technology: Carbon-based logic, Resistive RAM, PCM etc.
Approximations at the hardware level can tremendously improve the computational efficiencies of Machine Learning Systems that are inherently more resilient to these approximations.
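As one example of the randomized linear algebra layer listed above, the following sketch approximates a matrix product by sampling column/row pairs instead of summing over all of them. This is a generic textbook-style Monte Carlo estimator chosen for illustration; the slides do not say which randomized algorithms are meant, and the function name `sampled_matmul` is an assumption.

```python
import math
import random

def sampled_matmul(a, b, num_samples, rng):
    """Approximate a @ b (lists of rows) by sampling inner indices.

    Index k is drawn with probability proportional to
    (norm of column k of a) * (norm of row k of b), and each sampled
    outer product is rescaled so that the estimate is unbiased.
    """
    inner = len(b)
    rows, cols = len(a), len(b[0])
    weights = []
    for k in range(inner):
        col_norm = math.sqrt(sum(a[i][k] ** 2 for i in range(rows)))
        row_norm = math.sqrt(sum(x * x for x in b[k]))
        weights.append(col_norm * row_norm)
    total = sum(weights)
    result = [[0.0] * cols for _ in range(rows)]
    if total == 0:
        return result  # every term of the product is zero
    probs = [w / total for w in weights]
    picks = rng.choices(range(inner), weights=probs, k=num_samples)
    for k in picks:
        factor = 1.0 / (num_samples * probs[k])
        for i in range(rows):
            for j in range(cols):
                result[i][j] += factor * a[i][k] * b[k][j]
    return result

rng = random.Random(0)
a = [[rng.uniform(-1, 1) for _ in range(50)] for _ in range(3)]
b = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(50)]
estimate = sampled_matmul(a, b, num_samples=20, rng=rng)
print(estimate)
```

The estimate uses 20 of the 50 inner terms here; more samples reduce the variance at the cost of more work, which is the "many, cheap iterations" trade-off in miniature.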
Cognitive Environment Scenarios
Cognitive Boardroom
Cognitive Design Studio
Diagnostic Theatre
Natural spoken language understanding
Human and object identification and tracking
Gesture understanding
Multi-human user interfaces
Parallel, group cognitive computing
Understanding interpersonal dynamics
Wide-area sensing, actuation, and visualization
Fast multimedia search and understanding
Interpreting human hierarchy and intention
Tracking conversation history and context
Core Cognitive Services
Sensor- and actuator-rich physical immersive environment
Data & Information Services
Crawl, Semantic Indexing
Analytic Services
Modeling, Simulation
Learning Services
Context, Trends, Behavior
Creativity Services
Novelty, Aesthetic Principles
Discovery Services
Inferences, Semantics
Image and video comprehension
Anomaly
• NN Image Processing
• Shallow and Deep Text Processing
• Integrated Symbolically with Medical Background Knowledge
Deep Understanding of Business Artifacts: Compliance
Line Plot
Bulleted List
• Create representation for an obligation
• Models for “obligation language”
• Reason about list or data that refines the obligation
• Create document fragments by parsing out chunks
• Document structure models
• Reason about document chunks
Obligation
• Create representation for a fragment
• Document fragment models
• Reason about fragment constituents
fragment
Section
fragment
fragment
• Hierarchical Processing
• Machine-learned models and reasoning at all levels
• Learnability of artifacts, models
• Learn how to specify reasoners
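The "create representation for an obligation" step above can be sketched in a few lines. This is a deliberately naive, rule-based stand-in: it treats any sentence of the form "<subject> shall <action>" as an obligation, whereas the slides describe machine-learned models of obligation language. The `Obligation` class and both helper functions are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

@dataclass
class Obligation:
    subject: str  # who or what is obliged
    action: str   # what must be done
    source: str   # the sentence the obligation was read from

def split_sentences(text: str) -> List[str]:
    """Naive splitter: collapse whitespace, then break after a period followed by a space."""
    flat = " ".join(text.split())
    return [s.strip() for s in re.split(r"(?<=\.)\s+", flat) if s.strip()]

def extract_obligations(text: str) -> List[Obligation]:
    """Return one Obligation per sentence that matches '<subject> shall <action>'."""
    pattern = re.compile(r"^(?P<subject>.+?)\s+shall\s+(?P<action>.+?)\.?$")
    found = []
    for sentence in split_sentences(text):
        match = pattern.match(sentence)
        if match:
            found.append(Obligation(match.group("subject"), match.group("action"), sentence))
    return found

sample = (
    "All multilateral systems in financial instruments shall operate "
    "in accordance with Title II. This sentence states a fact."
)
for obligation in extract_obligations(sample):
    print(obligation.subject, "->", obligation.action)
```

A structured record like this is what later stages could reason over, for example to check whether a list or table elsewhere in the document refines the obligation.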
Early Example:
7. All multilateral systems in financial instruments shall operate either in accordance with the provisions of Title II concerning MTFs or OTFs or the provisions of Title III concerning regulated markets.
Any investment firms which, on an organised, frequent, systematic and substantial basis, deal on own account when executing client orders outside a regulated market, an MTF or an OTF shall operate in accordance with Title III of Regulation (EU) No 600/2014.
Without prejudice to Articles 23 and 28 of Regulation (EU) No 600/2014, all transactions in financial instruments as referred to in the first and the second subparagraphs which are not concluded on multilateral systems or systematic internalisers shall comply with the relevant provisions of Title III of Regulation (EU) No 600/2014.
Regulation (EU) No 600/2014: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32014R0600
Compliance Lead: Vijay Saraswat
http://research.ibm.com/cognitive-computing/cognitive-horizons-network/
Cognitive Horizons Network
Join Us
Cognitive Systems
Knowledge Representation and Reasoning
Learning to Reason
Michael Witbrock