Adding Common Sense into Artificial Intelligence Common Sense Computing Initiative Software Agents Group MIT Media Lab

Upload: gerald-oliver

Post on 24-Dec-2015

Adding Common Sense into

Artificial Intelligence

Common Sense Computing Initiative

Software Agents Group

MIT Media Lab

Why do computers need common sense?

• Conversation works because of unspoken assumptions

• People tend not to provide information they consider extraneous (Grice, 1975)

• Understanding language requires understanding connections

What can computers do with common sense?

• Understand the context of what the user wants

• Fill in missing information using background knowledge

• Discover trends in what people mean, not just what they say

But how do we collect it?

A Brief Outline

• What is OMCS?

• What is ConceptNet?

• Using AnalogySpace for Inference

• Using Blending for Intuition

• OMCS Applications

Open Mind Common Sense Project

• Collecting common sense from internet volunteers since 2000

• We have over 1,000,000 pieces of English language knowledge from 15,000 contributors

• Multilingual

– Additional resources in Chinese, Portuguese, Korean, Japanese, and Dutch

– In progress: Spanish and Hungarian

• Users consider 87% of statements used in ConceptNet to be true

What kind of knowledge?

• “A coat is used for keeping warm.”

• “People want to be respected.”

• “The sun is very hot.”

• “The last thing you do when you cook dinner is wash your dishes.”

• “People want good coffee.”

Where does the knowledge come from?

• Contributors on our Web site (openmind.media.mit.edu)

• Games that collect knowledge

What is ConceptNet?

• A semantic network representation of the OMCS database (Liu and Singh, 2004)

• Over the years, used for: affect sensing, photo and video storytelling, text prediction, goal-oriented interfaces, speech recognition, task prediction, …

• ConceptNet 4.0

– Over 300,000 connections between ~80,000 concepts

– Natural language processing tools to help line up your data with ConceptNet

An Example

Creation of ConceptNet

• A shallow parser turns natural language sentences into ConceptNet assertions

• 59 top-level patterns for English, such as “You would use {NP} to {VP}”

• {NP} and {VP} candidates identified by a chart parser
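
As a rough illustration of the pattern-matching step, here is a minimal Python sketch. It is not the ConceptNet parser: plain regular expressions stand in for both the shallow parser and the chart parser, and the `PATTERNS` table, the `extract_assertion` helper, and the mapping from each pattern to a relation are hypothetical names and choices made for this example.

```python
import re

# Hypothetical pattern table: (regex, relation name).
# The real system has 59 top-level English patterns and uses a chart parser
# to pick {NP}/{VP} candidates; greedy regex groups are a crude stand-in.
PATTERNS = [
    (re.compile(r"^You would use (?P<left>.+) to (?P<right>.+)$", re.I), "UsedFor"),
    (re.compile(r"^(?P<left>.+) is used for (?P<right>.+)$", re.I), "UsedFor"),
    (re.compile(r"^(?P<left>.+) want(?:s)? (?P<right>.+)$", re.I), "Desires"),
]


def extract_assertion(sentence):
    """Return (left phrase, relation, right phrase), or None if nothing matches."""
    text = sentence.strip().rstrip(".")
    for pattern, relation in PATTERNS:
        match = pattern.match(text)
        if match:
            return (match.group("left"), relation, match.group("right"))
    return None


result = extract_assertion("A coat is used for keeping warm.")
```

The first matching pattern wins, so the order of the table matters in this sketch; a sentence that fits no pattern is simply skipped.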

Representation

• Statement: expresses a fact in natural language

• Assertion: asserts that a relation exists between two concepts

• Concepts: sets of related phrases

– identified by lemmatizing (or stemming) and removing stop words

• Relations: one of 25:

– IsA, UsedFor, HasA, CapableOf, Desires, CreatedBy, AtLocation, CausesDesire, …
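
One way to picture this representation is the Python sketch below. It assumes a tiny illustrative stop-word list and a deliberately crude suffix stripper in place of a real lemmatizer or stemmer; `STOP_WORDS`, `naive_stem`, `normalize_concept`, and the `Assertion` class are hypothetical names for this example, not ConceptNet's own API.

```python
from dataclasses import dataclass

# Illustrative subset only; a real stop-word list is much longer.
STOP_WORDS = {"a", "an", "the", "to", "of", "is", "are", "be"}


def naive_stem(word):
    """Strip a few common suffixes. A stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_concept(phrase):
    """Map a surface phrase to a concept key: lowercase, drop stop words, stem."""
    words = [w for w in phrase.lower().split() if w not in STOP_WORDS]
    return " ".join(naive_stem(w) for w in words)


@dataclass(frozen=True)
class Assertion:
    """A relation between two normalized concepts."""
    left: str
    relation: str
    right: str


coat = Assertion(normalize_concept("A coat"), "UsedFor", normalize_concept("keeping warm"))
```

Because "A coat" and "the coats" normalize to the same key, statements phrased differently by different contributors can land on the same assertion.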

Example

Reliability

• Reliability increases when more users affirm that a statement is true

– by entering equivalent statements independently

– by rating existing statements on the Web

• Each assertion gets a weight according to how many users support it

Polarity

• Allows predicates that express true, negative information: “Pigs cannot fly”

• Negated assertions are represented by negative weights

• Reliability and polarity are independent
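
The sketch below shows one way reliability and polarity could be stored separately and combined into a signed weight. The net-vote formula is an assumption made for illustration (the slides only say that the weight depends on how many users support an assertion), and `ScoredAssertion` and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ScoredAssertion:
    left: str
    relation: str
    right: str
    polarity: int = 1       # +1 for "X can Y", -1 for "X cannot Y"
    votes_for: int = 0      # users who entered or up-rated the statement
    votes_against: int = 0  # users who rated the statement as false

    @property
    def reliability(self):
        # Assumed formula: net supporting votes, floored at zero.
        return max(self.votes_for - self.votes_against, 0)

    @property
    def weight(self):
        # Polarity sets the sign, reliability sets the magnitude.
        return self.polarity * self.reliability


pigs = ScoredAssertion("pig", "CapableOf", "fly", polarity=-1, votes_for=5, votes_against=1)
```

Here "Pigs cannot fly" is a well-supported statement with negative polarity, so it gets a large negative weight rather than a small positive one.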

AnalogySpace

• Technique for learning, reasoning, and analyzing using common sense

• AnalogySpace can:

– generalize from sparsely-collected knowledge

– confirm or question existing knowledge

– classify information in a knowledge base in a variety of ways

• Can use the same technique in other domains: businesses, people, communities, opinions

AnalogySpace Overview

• Finds patterns in knowledge

• Builds a representation in terms of those patterns

• Finds additional knowledge using the combination of those patterns

• Uses dimensionality reduction via Singular Value Decomposition

Input to the SVD

• Input to SVD: matrix of concepts vs. features

• Feature: a concept, a relation, and an open slot, e.g., (. . . , MadeOf, metal)

• Concepts × features = assertions

The Input Matrix

• For consistency, we scale each concept to unit Euclidean magnitude
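
A minimal NumPy sketch of building such a matrix follows. The toy assertions are made up, the `"_"` string marks the open slot, and letting each assertion fill one entry for its left concept and one for its right concept is an assumption of this sketch; `build_matrix` and `normalize_rows` are hypothetical helper names.

```python
import numpy as np

# Toy data: (left concept, relation, right concept, weight).
example = [
    ("dog", "IsA", "pet", 1.0),
    ("dog", "CapableOf", "bark", 1.0),
    ("cat", "IsA", "pet", 1.0),
    ("cat", "CapableOf", "fly", -1.0),  # negated assertion, negative weight
]


def build_matrix(assertions):
    """Return (matrix, concepts, features) with concepts as rows, features as columns."""
    entries = []
    for left, relation, right, weight in assertions:
        entries.append((left, ("_", relation, right), weight))   # e.g. (_, IsA, pet)
        entries.append((right, (left, relation, "_"), weight))   # e.g. (dog, IsA, _)
    concepts = sorted({concept for concept, _, _ in entries})
    features = sorted({feature for _, feature, _ in entries})
    row = {concept: i for i, concept in enumerate(concepts)}
    col = {feature: j for j, feature in enumerate(features)}
    matrix = np.zeros((len(concepts), len(features)))
    for concept, feature, weight in entries:
        matrix[row[concept], col[feature]] += weight
    return matrix, concepts, features


def normalize_rows(matrix):
    """Scale each concept (row) to unit Euclidean length; all-zero rows stay zero."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return matrix / norms


matrix, concepts, features = build_matrix(example)
unit = normalize_rows(matrix)
```

A real knowledge base would need a sparse matrix type; a dense array is used here only to keep the example short.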

Running the SVD

The Truncated SVD

Truncating the SVD smooths over sparse data.
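
The smoothing effect can be seen on a tiny made-up matrix. The sketch below uses `numpy.linalg.svd` and keeps only the top `k` singular values; `truncated_svd` and `reconstruct` are hypothetical helper names, and a full-size knowledge base would call for a sparse SVD routine instead.

```python
import numpy as np


def truncated_svd(matrix, k):
    """Return the top-k factors (U_k, s_k, Vt_k) of the matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def reconstruct(u, s, vt):
    """Multiply the factors back together: U_k * diag(s_k) * Vt_k."""
    return (u * s) @ vt


# Rows are concepts, columns are features. The middle concept is missing
# the third feature (a zero entry) even though it otherwise looks like the others.
sparse = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])

u, s, vt = truncated_svd(sparse, k=1)
smoothed = reconstruct(u, s, vt)
```

In the rank-1 reconstruction the missing entry `smoothed[1, 2]` comes out positive: the dominant pattern shared by all three rows fills the gap, which is the sense in which truncation generalizes from sparse data.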

Good vs. Bad

Reasoning with AnalogySpace

• Similarity represented by dot products of concepts (AA^T)

– Approximately the cosine of their angle

Reasoning with AnalogySpace

• Predictions represented by dot products of concepts with features
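
Both kinds of dot product can be sketched in NumPy as follows. The example assumes concept vectors are taken as the rows of U_k scaled by the singular values and feature vectors as the rows of V_k; `analogy_space`, `similarity`, and `predictions` are hypothetical names, and the input matrix is made up.

```python
import numpy as np


def analogy_space(matrix, k):
    """Return (concept_vecs, feature_vecs) from a rank-k SVD of the matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    concept_vecs = u[:, :k] * s[:k]   # rows of U_k * Sigma_k
    feature_vecs = vt[:k, :].T        # rows of V_k
    return concept_vecs, feature_vecs


def similarity(concept_vecs):
    """Concept-by-concept dot products; approximates A A^T."""
    return concept_vecs @ concept_vecs.T


def predictions(concept_vecs, feature_vecs):
    """Concept-by-feature dot products; approximates A itself."""
    return concept_vecs @ feature_vecs.T


a = np.array([
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
])
concept_vecs, feature_vecs = analogy_space(a, k=2)
sim = similarity(concept_vecs)
pred = predictions(concept_vecs, feature_vecs)
```

When the rows of the input matrix have been scaled to unit length, the entries of `sim` are close to cosines of the angles between concepts. Large entries of `pred` that were zero in the input are candidate new assertions.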

Contributors are in the loop

Ad-hoc Categories

What can we use common sense for?

• A “sanity check” on natural language

• Text prediction

• Affect sensing

• Recommender systems

• “Knowledge management”

Common Sense in Context

• We don’t just use common sense to make more common sense

• Helps a system make sense of everyday life

– Making connections in domain-specific information

– Understanding free text

– Bridging different knowledge sources

Digital Intuition

• Add common sense intuition

• Using similar techniques to make connections and inference between data sets

• Create a shared “Analogy”Space from two data sets using Blending

Blending

• Two data sets are combined in a way to maximize the interaction between the data sets

• They are weighted by a factor:

C = (1 – f)A + fB

Blending Creates a New Representation

• With f = 0 or 1, equivalent to projecting one dataset into the other’s space

• In the middle, representation determined by both datasets.
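
The weighted sum C = (1 – f)A + fB only makes sense once A and B share the same row and column labels. The sketch below assumes each data set arrives as a dense NumPy array with explicit label lists, pads both onto the union of labels with zeros, and then blends; `align` and `blend` are hypothetical helpers and the numbers are made up.

```python
import numpy as np


def align(matrix, rows, cols, all_rows, all_cols):
    """Place the matrix into a zero matrix indexed by the combined labels."""
    out = np.zeros((len(all_rows), len(all_cols)))
    r = [all_rows.index(label) for label in rows]
    c = [all_cols.index(label) for label in cols]
    out[np.ix_(r, c)] = matrix
    return out


def blend(a, a_rows, a_cols, b, b_rows, b_cols, f):
    """Return C = (1 - f) * A + f * B over the union of row and column labels."""
    all_rows = sorted(set(a_rows) | set(b_rows))
    all_cols = sorted(set(a_cols) | set(b_cols))
    a_big = align(a, a_rows, a_cols, all_rows, all_cols)
    b_big = align(b, b_rows, b_cols, all_rows, all_cols)
    return (1 - f) * a_big + f * b_big, all_rows, all_cols


a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[10.0, 20.0], [30.0, 40.0]])
c, rows, cols = blend(a, ["cat", "dog"], ["f1", "f2"],
                      b, ["dog", "fish"], ["f2", "f3"], f=0.5)
```

In this toy case the two data sets overlap only in the row "dog" and the column "f2", so that single cell is the only place where both contribute to C. The blended matrix would then go through the same SVD step as a single data set.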

No overlap = no interaction

A’s singular values B’s singular values

Overlap -> Nonlinear Interaction (Veering)

Overlap -> Nonlinear Interaction

SVD over Multiple Data Sets

• Convert all data sets to matrices

• Find a rough alignment between the matrices

– Some rows or features

• Find a blending factor

– Maximize veering or interaction

• Run the AnalogySpace process jointly

Blends of Multiple Data Sets

• You can blend more than two things

– Simple blending heuristic: scale all your data so that their largest singular values are equal
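
A minimal sketch of this heuristic, assuming it means equalizing each matrix's largest singular value and assuming the matrices have already been aligned to the same shape. `scale_to_unit_top_singular_value` and `blend_many` are hypothetical names; `np.linalg.norm(m, 2)` returns the largest singular value of a 2-D array.

```python
import numpy as np


def scale_to_unit_top_singular_value(matrix):
    """Divide by the largest singular value so every data set tops out at 1."""
    top = np.linalg.norm(matrix, 2)  # spectral norm = largest singular value
    return matrix / top if top > 0 else matrix


def blend_many(matrices):
    """Sum any number of pre-aligned matrices after equalizing their scale."""
    return sum(scale_to_unit_top_singular_value(m) for m in matrices)


a = np.array([[3.0, 0.0], [0.0, 1.0]])     # largest singular value 3
b = np.array([[0.0, 10.0], [10.0, 0.0]])   # largest singular value 10
blended = blend_many([a, b])
```

Without the rescaling, `b` would dominate the blend simply because its entries are larger, not because it carries more information.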

Applications

• Inference over domain-specific data

• Word sense disambiguation

• Data visualization and analysis

• Finance

Tools we Distribute

• The OMCS database

• ConceptNet

• Divisi

• In development: the Luminoso visualizer

The Common Sense Computing Initiative

Web: http://csc.media.mit.edu/

Email: [email protected]

Thank you!