natural language understanding


Upload: gilda

Post on 11-Jan-2016


TRANSCRIPT

Page 1: Natural Language Understanding

Natural Language Understanding

Difficulties:
Large amount of human knowledge assumed – context is key.
Language is pattern-based. Patterns can restrict possible interpretations.
Language is purposeful. There is a goal behind an utterance.

Page 2: Natural Language Understanding

Other Difficulties

Ambiguity (at different levels): word meanings, syntactic structure, referential ambiguity, intentional ambiguity
Imprecision
Idioms, jargon, slang
Language changes

Page 3: Natural Language Understanding

Analysis of Language

Analysis of language occurs at different levels:
Prosody – rhythm and intonation
Phonology – sound formation (from phonemes)
Morphology – word formation (from morphemes)
Syntax – phrase and sentence formation
Semantics – applying meaning to expressions
Pragmatics – how language is used
World knowledge – contextual information

Page 4: Natural Language Understanding

Processing Language

Parsing – analyzing the syntactic structure of sentences, often resulting in a parse tree.
Semantics – analyzing the meaning of sentences, resulting in semantic networks, logical statements, or another knowledge representation (KR).
Integration of world knowledge – adding appropriate knowledge from the domain of discourse.
Use of knowledge learned from the discourse.

Page 5: Natural Language Understanding

Processing (cont'd)

Often, the steps are done sequentially (parse the syntax of sentences, make semantic inferences, add domain knowledge, use the result), with the output of one stage becoming the input to the next stage.

Alternatively, fragments may be passed along as soon as they are determined (incremental parsing).

Feedback between stages may be necessary to resolve references (“I shot the bear in my pajamas”); blackboard systems support this.
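The sequential arrangement can be sketched in Prolog as a chain of goals, each consuming the previous stage's output. This is only an illustration: the predicate names (parse/2, interpret/2, add_knowledge/2, understand/2), the fixed five-word sentence pattern, and the isa/2 facts are assumptions, not part of the original slides.

```prolog
% Stage 1: syntax. A toy parser that only handles
% "art noun verb art noun" sentences (assumption for brevity).
parse([A1,N1,V,A2,N2], s(np(A1,N1), vp(V, np(A2,N2)))) :-
    art(A1), noun(N1), verb(V), art(A2), noun(N2).

art(a).
art(the).
noun(boy).
noun(dog).
verb(pets).

% Stage 2: semantics. Map the parse tree to an event description.
interpret(s(np(_,Agent), vp(Verb, np(_,Object))),
          event(Verb, Agent, Object)).

% Stage 3: world knowledge. Hypothetical domain facts are attached
% to the meaning produced by stage 2.
isa(boy, human).
isa(dog, animal).

add_knowledge(event(V,A,O), [event(V,A,O), isa(A,TA), isa(O,TO)]) :-
    isa(A, TA),
    isa(O, TO).

% The pipeline: the output of each stage is the input to the next.
understand(Words, Result) :-
    parse(Words, Tree),
    interpret(Tree, Meaning),
    add_knowledge(Meaning, Result).
```

A query such as `understand([the,boy,pets,a,dog], R).` runs the three stages in order. A feedback design would instead let a later stage reject or revise an earlier stage's result.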

Page 6: Natural Language Understanding

Context-Free Grammars

A good deal of syntax can be represented using context-free grammars (CFGs). Rules are of the form:

<non-terminal> <- list of <terminals> and <non-terminals>

Non-terminals are syntactic categories; terminals are words (and punctuation). One non-terminal, “sentence”, is the start symbol.

Page 7: Natural Language Understanding

CFG Example

sent <- np, vp.

np <- noun.

np <- art, noun.

np <- art, adj, noun.

vp <- verb.

vp <- verb, np.

noun <- boy.

noun <- dog.

art <- a.

art <- the.

adj <- yellow.

verb <- runs.

verb <- pets.

Page 8: Natural Language Understanding

Prolog Code

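% Each predicate takes a word list X and returns in Y the words
% left over after the phrase has been consumed.
sent(X,Y) :- np(X,Z), vp(Z,Y).
np(X,Y) :- noun(X,Y).
np(X,Y) :- art(X,Z), noun(Z,Y).
np(X,Y) :- art(X,Z), adj(Z,W), noun(W,Y).
vp(X,Y) :- verb(X,Y).
vp(X,Y) :- verb(X,Z), np(Z,Y).
% Lexicon: each word is removed from the front of the list.
noun([boy|Y],Y).
noun([dog|Y],Y).
art([a|Y],Y).
art([the|Y],Y).
adj([yellow|Y],Y).
verb([runs|Y],Y).
verb([pets|Y],Y).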
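Most Prolog systems also accept definite clause grammar (DCG) notation, which hides the two list arguments and is translated by the system into clauses much like the ones above. The sketch below restates the same grammar in that notation. The slides do not use DCGs, so this is an alternative formulation rather than the author's code.

```prolog
% Grammar rules: the --> arrow replaces :- and the list
% arguments are added automatically by the DCG translation.
sent --> np, vp.
np --> noun.
np --> art, noun.
np --> art, adj, noun.
vp --> verb.
vp --> verb, np.

% Lexicon: terminals are written as one-element lists.
noun --> [boy].
noun --> [dog].
art --> [a].
art --> [the].
adj --> [yellow].
verb --> [runs].
verb --> [pets].
```

A sentence is tested with the standard phrase/2 predicate, for example `phrase(sent, [the,boy,pets,a,dog]).`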

Page 9: Natural Language Understanding

Prolog Example

| ?- sent([the,boy,pets,a,dog],[]).

true ? ;

yes
| ?- sent([the,boy,likes,a,dog],[]).

no

Page 10: Natural Language Understanding

Parsing

We can augment the Prolog program so that each clause has a third variable, which contains the parse tree of the phrase. The parse trees are built up recursively.

Page 11: Natural Language Understanding

Parsing Code

sent(X,Y,s(M1,M2)) :- np(X,Z,M1), vp(Z,Y,M2).
np(X,Y,M) :- noun(X,Y,M).
np(X,Y,np(M1,M2)) :- art(X,Z,M1), noun(Z,Y,M2).
np(X,Y,np(M1,M2,M3)) :- art(X,Z,M1), adj(Z,W,M2), noun(W,Y,M3).
vp(X,Y,vp(M)) :- verb(X,Y,M).
vp(X,Y,vp(M1,M2)) :- verb(X,Z,M1), np(Z,Y,M2).
noun([boy|Y],Y,noun(boy)).
noun([dog|Y],Y,noun(dog)).
art([a|Y],Y,art(a)).
art([the|Y],Y,art(the)).
adj([yellow|Y],Y,adj(yellow)).
verb([runs|Y],Y,verb(runs)).
verb([pets|Y],Y,verb(pets)).

Page 12: Natural Language Understanding

Parsing Example

| ?- sent([the,boy,pets,a,dog],[],M).

M = s(np(art(the),noun(boy)),vp(verb(pets),np(art(a),noun(dog)))) ? ;

(1 ms) no
| ?- sent([the,yellow,dog,runs],[],M).

M = s(np(art(the),adj(yellow),noun(dog)),vp(verb(runs))) ? ;

no

Page 13: Natural Language Understanding

Semantics

Since we can use arbitrary Prolog code, it is possible to add tests to the grammar clauses. For example, we could include a type system and only allow parses that are consistent with the types (for example, only animate actors).

In addition, we could return the meaning of the phrase or sentence instead of just a parse tree.
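A minimal sketch of such a test, assuming a hypothetical animate/1 predicate and a hypothetical inanimate noun, rock, neither of which appears in the slides. The noun phrase clauses carry an extra argument holding the head noun so that the sentence clause can check its type.

```prolog
% Hypothetical type facts: which nouns denote animate things.
animate(boy).
animate(dog).

% A sentence is accepted only if its subject (the actor) is animate.
sent(X, Y, s(NP, VP)) :-
    np(X, Z, NP, Head),
    animate(Head),            % the added semantic test
    vp(Z, Y, VP).

% The fourth argument of np and noun is the head noun.
np(X, Y, M, Head) :- noun(X, Y, M, Head).
np(X, Y, np(A, N), Head) :- art(X, Z, A), noun(Z, Y, N, Head).

vp(X, Y, vp(V)) :- verb(X, Y, V).

noun([boy|Y], Y, noun(boy), boy).
noun([dog|Y], Y, noun(dog), dog).
noun([rock|Y], Y, noun(rock), rock).   % hypothetical inanimate noun
art([a|Y], Y, art(a)).
art([the|Y], Y, art(the)).
verb([runs|Y], Y, verb(runs)).
```

With these clauses, `sent([the,dog,runs],[],M)` succeeds, while `sent([the,rock,runs],[],M)` fails because rock is not declared animate, even though it is syntactically well formed.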

Page 14: Natural Language Understanding

Frame and Slot Notation

In this simple example, we will use a frame and slot notation for the meaning of words, phrases, and sentences. A meaning will consist of a pair containing a head item and a list of slots, each of which is an attribute/value pair. Values may be variables to be instantiated at a later time.

Page 15: Natural Language Understanding

Notation Examples

For example, a verb in Simmons' semantic representation scheme has the attributes agent and object. The meaning of a verb, say likes, could be represented by the term

meaning([likes,[agent,X],[object,Y]], [[agent,X],[object,Y]])

X and Y will be instantiated by the meanings of other words and phrases of the sentence.
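Because the slot list shares its variables with the frame, binding a slot fills in the frame at the same time. The sketch below illustrates this with two helper predicates, verb_meaning/2 and fill_slot/3, which are hypothetical names introduced here and not part of the original program.

```prolog
% Lexical meaning of "likes" in frame and slot notation.
verb_meaning(likes,
    meaning([likes,[agent,X],[object,Y]], [[agent,X],[object,Y]])).

% fill_slot(Meaning, Attr, Value): unify the value of the slot named
% Attr with Value. Hypothetical helper for illustration only.
fill_slot(meaning(_, Slots), Attr, Value) :-
    slot_member([Attr, Value], Slots).

% Local list membership, defined here to keep the block self-contained.
slot_member(S, [S|_]).
slot_member(S, [_|Rest]) :- slot_member(S, Rest).
```

For example, the query `verb_meaning(likes, M), fill_slot(M, agent, [boy]), fill_slot(M, object, [dog]).` leaves the frame inside M as `[likes,[agent,[boy]],[object,[dog]]]`.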

Page 16: Natural Language Understanding

More on Slots

The attribute names may be semantic relationships (agent, object), or surface semantic relationships (adjmod – the thing modified by an adjective, or pobj – the object of a preposition). The slot filler must come from an appropriate part of the sentence as indicated by the grammar.

Page 17: Natural Language Understanding

Another Example

prep([over|R], R, meaning([V,[location,[over,X]]|A], [[pmod,[V|A]],[pobj,X]])).

The preposition over will modify the subject of the preposition, V (indicated by pmod), which may already have a list of attributes A. The object of the preposition, X, is added to the list of attributes under the attribute name location and value [over,X].
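To see the unification at work, the clause can be queried with the slots filled by hand. In the sketch below the slot values (a dog frame with a color attribute, and a fence as the object of the preposition) are made-up examples. In the full program they would be supplied by other grammar clauses.

```prolog
% Meaning of the preposition "over": the frame V gains a location
% attribute, and keeps any attributes A it already had.
prep([over|R], R,
     meaning([V,[location,[over,X]]|A], [[pmod,[V|A]],[pobj,X]])).

% Hypothetical demonstration: fill the pmod and pobj slots directly.
demo_over(Frame) :-
    prep([over,the,fence], _Rest, meaning(Frame, Slots)),
    Slots = [[pmod,[dog,[color,yellow]]], [pobj,[fence]]].
```

The query `demo_over(F).` binds V to dog, A to `[[color,yellow]]`, and X to `[fence]`, giving `F = [dog,[location,[over,[fence]]],[color,yellow]]`.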

Page 18: Natural Language Understanding

Semantics - Example

| ?- sent([i,shot,the,bear,in,my,pajamas],[],M).

M = meaning([shoot,[location,[in,[pajamas,[owner,me]]]],[agent,[i]],[object,[bear]],[time,past]],[]) ? ;

;

no
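The example sentence is syntactically ambiguous: the prepositional phrase can attach to the verb phrase (the shooter is in pajamas) or to the object noun phrase (the bear is in pajamas). The following sketch, written in DCG notation with made-up rule names (s, nphrase, vphrase, pphrase), is not the grammar used for the output above. It only shows how a grammar can produce both parse trees on backtracking.

```prolog
s(s(NP,VP)) --> nphrase(NP), vphrase(VP).

nphrase(np(pron(i)))   --> [i].
nphrase(np(Det,N))     --> det(Det), n(N).
nphrase(np(Det,N,PP))  --> det(Det), n(N), pphrase(PP).   % PP modifies the noun

vphrase(vp(V,NP))      --> v(V), nphrase(NP).
vphrase(vp(V,NP,PP))   --> v(V), nphrase(NP), pphrase(PP). % PP modifies the verb

pphrase(pp(P,NP))      --> p(P), nphrase(NP).

det(det(the)) --> [the].
det(det(my))  --> [my].
n(n(bear))    --> [bear].
n(n(pajamas)) --> [pajamas].
v(v(shot))    --> [shot].
p(p(in))      --> [in].

% Collect every parse tree for a word list.
all_parses(Words, Trees) :-
    findall(T, phrase(s(T), Words), Trees).
```

Calling `all_parses([i,shot,the,bear,in,my,pajamas], Ts).` returns two trees, one for each attachment. Choosing between them requires semantic tests or world knowledge.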

Page 19: Natural Language Understanding

Phrase Structured Grammar

These kinds of grammars are called phrase structured grammars. As implemented in Prolog, they have computing power equivalent to that of any Turing-complete system, and yet they are simple to follow.

Page 20: Natural Language Understanding

Alternative Methods

Chart Parsing (Earley) – see book

Transition Network Parser: The grammar is represented as a set of finite state machines (transition diagrams). Each FSM implements a non-terminal. Arcs are labeled with non-terminals or terminals. In the former case, a subprogram is invoked (jump to the network for that non-terminal). A path from the start node to the end node indicates acceptance.
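A transition network parser of this kind can be sketched in a few Prolog clauses. The encoding below (arc/4, final/2, network/1, word/2, and state numbers starting at 0) is an assumption made for illustration. The networks encode the small grammar from the CFG example, with lexical categories standing in for terminal arcs.

```prolog
% arc(Network, FromState, Label, ToState)
arc(s,  0, np,   1).
arc(s,  1, vp,   2).
arc(np, 0, noun, 2).
arc(np, 0, art,  1).
arc(np, 1, noun, 2).
arc(np, 1, adj,  3).
arc(np, 3, noun, 2).
arc(vp, 0, verb, 1).
arc(vp, 1, np,   2).

% final(Network, State): accepting states of each network.
final(s, 2).
final(np, 2).
final(vp, 1).
final(vp, 2).

% Labels that name a network (non-terminals).
network(s).
network(np).
network(vp).

% Lexicon: word(Word, Category).
word(boy, noun).    word(dog, noun).
word(a, art).       word(the, art).
word(yellow, adj).
word(runs, verb).   word(pets, verb).

% rtn_accept(Net, Words, Rest): Net consumes a prefix of Words.
rtn_accept(Net, Words, Rest) :- walk(Net, 0, Words, Rest).

% Stop in a final state, or follow an arc and continue.
walk(Net, State, Words, Words) :- final(Net, State).
walk(Net, State, Words, Rest) :-
    arc(Net, State, Label, Next),
    step(Label, Words, Mid),
    walk(Net, Next, Mid, Rest).

% A non-terminal label invokes its network; otherwise match one word.
step(Label, Words, Rest) :- network(Label), rtn_accept(Label, Words, Rest).
step(Label, [Word|Rest], Rest) :- word(Word, Label).
```

A sentence is accepted when the s network consumes the whole input, for example `rtn_accept(s, [the,boy,pets,a,dog], []).`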

Page 21: Natural Language Understanding

Augmented Transition Networks

Procedures may be attached to arcs and are triggered when the arcs are traversed. A procedure may perform a test, or set a variable to a value for later use. ATNs are often combined with KR schemes to produce a meaning of the sentence or phrase (semantics).
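The idea of arcs that set and test variables can be sketched with a single, non-recursive network that checks subject-verb number agreement. Everything here is a simplifying assumption: the predicate names (atn_arc/4, atn_word/3, act/4), the register list of Name=Value pairs, and the number feature in the lexicon do not come from the slides, and a full ATN would also allow jumps to sub-networks.

```prolog
% atn_word(Word, Category, Number): hypothetical lexicon with number.
atn_word(the,  art,  _).
atn_word(dog,  noun, sg).
atn_word(dogs, noun, pl).
atn_word(runs, verb, sg).
atn_word(run,  verb, pl).

% atn_arc(From, Category, To, Action): Action runs when the arc is taken.
atn_arc(0, art,  1, none).
atn_arc(0, noun, 2, set(num)).     % remember the subject's number
atn_arc(1, noun, 2, set(num)).
atn_arc(2, verb, 3, test(num)).    % the verb must agree with it

atn_final(3).

atn_accept(Words) :- atn_walk(0, Words, []).

% Registers are threaded through the traversal as a list.
atn_walk(State, [], _) :- atn_final(State).
atn_walk(State, [W|Ws], Regs0) :-
    atn_arc(State, Cat, Next, Action),
    atn_word(W, Cat, Num),
    act(Action, Num, Regs0, Regs),
    atn_walk(Next, Ws, Regs).

% act(Action, Value, RegsIn, RegsOut)
act(none, _, Regs, Regs).
act(set(Name), Value, Regs, [Name=Value|Regs]).
act(test(Name), Value, Regs, Regs) :- lookup(Name, Regs, Value).

lookup(Name, [Name=Value|_], Value).
lookup(Name, [Other=_|Rest], Value) :-
    Other \== Name,
    lookup(Name, Rest, Value).
```

Here `atn_accept([the,dog,runs])` succeeds, while `atn_accept([the,dogs,runs])` fails at the verb arc because the stored number does not match.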

Page 22: Natural Language Understanding

Uses of Natural Language

Database front-end
Question answering
Information extraction and summary (Web)
Next generation computing
Better than keyword search – incorporates context