61a lecture 37 - university of california, berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. ·...

21
61A Lecture 37

Upload: others

Post on 14-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

61A Lecture 37

Page 2: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Announcements

Page 3: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Ambiguity

Page 4: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntactic Ambiguity in English

4

Sentence

NounPhrase

Verb Phrase

Subordinate Clause

1Preface of Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Sussman with Julie Sussman

1Programs must be written for people to read

Page 5: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntactic Ambiguity in English

5

Sentence

NounPhrase

Verb Phrase

Subordinate Clause

1Preface of Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Sussman with Julie Sussman

1Programs must be written for people to read

Page 6: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntactic Ambiguity in English

6

Verb Phrase

Verb Phrase

Sentence

NounPhrase

1Preface of Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Sussman with Julie Sussman

1Programs must be written for people to read

Page 7: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntactic Ambiguity in English

7

pro•gram (noun) a series of coded software instructions

pro•gram (verb) provide a computer with coded instructions

must (verb) be obliged to

must (noun) dampness or mold

Definitions from the New Oxford American Dictionary

Programs must be written for people to read

Page 8: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntax Trees

Page 9: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Representing Syntactic Structure

A Tree represents a phrase:

• tag -- What kind of phrase (e.g., S, NP, VP)

• branches -- Sequence of Tree or Leaf components

A Leaf represents a single word:

• tag -- What kind of word (e.g., N, V)

• word -- The word

9

SentenceNoun

Phrase

cows intimidate

NounPhrase

VerbPhrase

Noun Verb

cows

Noun

Photo by Vince O'Sullivan licensed under http://creativecommons.org/licenses/by-nc-nd/2.0/

(Demo)

cows = Leaf('N', 'cows')

intimidate = Leaf('V', 'intimidate')

S, NP, VP = 'S', 'NP', 'VP'

Tree(S, [Tree(NP, [cows]),

Tree(VP, [intimidate,

Tree(NP, [cows])])])

Page 10: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Grammars

Page 11: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Grammar

A Sentence ...

... can be expanded as ...

... a Noun Phrase then a Verb Phrase.

Context-Free Grammar Rules

A grammar rule describes how a tag can be expanded as a sequence of tags or words

11

S NP VP

S

cows intimidate

NP VP

N

NP

V

cows

N

S NP VP

NP N

VP V NP

N cows

V intimidate

(Demo)

Page 12: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Parsing

Page 13: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Exhaustive Parsing

Expand all tags recursively, but constrain words to match input

13

buffalo buffalo buffalo buffalo

S

VPNP

0 1 2 3 4

Page 14: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

V J N

NP

Exhaustive Parsing

Expand all tags recursively, but constrain words to match input

14

S

NP VP

N

Constraint: A Leaf must match the input word

buffalo buffalo buffalo buffalo0 1 2 3 4

Page 15: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Exhaustive Parsing

15

0 1 2 3 4

Expand all tags recursively, but constrain words to match input

S

NP VP

buffalo buffalo buffalo buffalo

Page 16: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Exhaustive Parsing

16

(Demo)

Expand all tags recursively, but constrain words to match input

S

NP VP

buffalo buffalo buffalo buffalo

NP

NVJ N

0 1 2 3 4

Page 17: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Learning

(Demo)

Page 18: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Scoring a Tree Using Relative Frequencies

Not all syntactic structures are equally common

18

teacher strikes idle kids

SNP

NN NNS VB NNS

NP

VP

S NP VP

NP NN NNS

NN teacher

NNS strikes

VB idle

NNS kids

VP VB NP

NP NNS

Rule frequency per 100,000 tags

5

32

25372

1335

6679

4282

26

25

Page 19: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Scoring a Tree Using Relative Frequencies

Not all syntactic structures are equally common

19

VPS

NN VBZ JJ NNS

NP

NP

Rule frequency per 100,000 tags

19

5

18

32

4358

25372

3160

2526

1335

6679

4282

26

25

(Demo)

teacher strikes idle kids

S NP VP

NP NN

NN teacher

VBZ strikes

JJ idle

NNS kids

VP VBZ NP

NP JJ NNS

Page 20: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Translation

Page 21: 61A Lecture 37 - University of California, Berkeleycs61a/sp20/assets/slides/... · 2020. 7. 2. · VP S NN VBZ JJ NNS NP NP Rule frequency per 100,000 tags 19 5 18 32 4358 25372 3160

Syntactic Reordering

English Yoda-English Help you, I can! Yes! Mm!

When 900 years old you reach, look as good, you will not. Hm.

S

NP VP

PRP

I can

MD VP

VB PRP

help you

VB PRP

help you

VP .

,

(Demo)

21