Adapting Text instead of the Model: An Open Domain Approach

Gourab Kundu, Dan Roth
University of Illinois at Urbana-Champaign

Post on 24-Feb-2016

Motivating Example #1

Original Sentence: Scotty gazed at ugly gray slums .
Predicate: gazed. Semantic role assigned: AM-LOC (Wrong)

Transformed Sentence: Scotty looked at ugly gray slums .
Predicate: looked. Semantic role assigned: A1 (Correct!)

Motivating Example #2

Original Sentence: He was discharged from the hospital after a two-day checkup and he and his parents had what Mr. Mckinley described as a “celebration lunch” in the campus.

Labels shown: Predicate, AM-TMP (Wrong)

Transformed Sentence: He was discharged from the hospital after a two-day examination and he and his parents had what Mr. Mckinley described as a “celebration lunch” in the campus.

Labels shown: Predicate, AM-TMP (Correct!)

Research Question

Can text perturbation be done in an automatic way to yield better NLP analysis?

We study this question in the context of semantic role labeling.

We focus on improving the performance of SRL on a domain different from the one it was trained on.

Outline

• Overview of Domain Adaptation
• Overview of Adaptation Using Transformations (ADUT)
• Transformation Functions
• Combination Strategy
• Experimental Results
• Conclusion

Domain Adaptation

• Models trained on one domain perform significantly worse on another domain
  • Semantic Role Labeling: WSJ domain (76%), Fiction domain (65%)
• An important problem for wide-scale NLP
  • Adaptation is a problem for many NLP tasks
  • There are many different domains across which natural language varies
  • Labeling is expensive and time-consuming

Current Approaches to Domain Adaptation

• Labeled Adaptation: uses labeled data from the new domain
  • ChelbaAc04, Adaptation of a maximum entropy capitalizer: Little data can help a lot
  • Daume07, Frustratingly Easy domain adaptation
  • FinkelMa09, Hierarchical Bayesian domain adaptation
• Unlabeled Adaptation: uses unlabeled data from the new domain
  • BlitzerMcPe06, Domain Adaptation with Structural Correspondence Learning
  • HuangYa09, Distributional Representations for Handling Sparsity in Supervised Sequence Labeling
• Combined Adaptation: combines labeled and unlabeled data
  • JiangZh07, Instance Weighting for Domain Adaptation in NLP
  • ChangCoRo10, The necessity of combining adaptation methods

Limitations (Retraining takes time)

Need to retrain the model, which can take a long time (SRL: 20 hours).

[Diagram labels: NLP Tool 1, NLP Tool 2, …, NLP Tool N; Model; Retrain; Source Domain Unlabeled Data; Target Domain Unlabeled Data]

Limitations (Some tools are hard to retrain)

Need to retrain other people’s tools, which may require implementation work.

[Diagram labels: NLP Tool 1, NLP Tool 2, …, NLP Tool N; Model; Source Domain Unlabeled Data; Target Domain Unlabeled Data; No option for retraining]

Limitations (Insufficient Unlabeled Data)

Need a significant amount of unlabeled data, which may not be available (e.g. for a website).

[Diagram labels: NLP Tool 1, NLP Tool 2, …, NLP Tool N; Model; Source Domain Unlabeled Data; Target Domain Unlabeled Data; May not be sufficient]

ADaptation Using Transformations (ADUT)

[Diagram: Sentence s → Transformation Module → Transformed Sentences t1, t2, …, tk → Old System (Tools and Model) → Model Outputs o1, o2, …, ok → Combination Module → Output o]

• Traditional approach: adapt the model for the new text
• Our approach: adapt the text for the old model
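
The ADUT flow can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the names `adut`, `replace_gazed`, `toy_model`, and `average` are hypothetical, the toy scores are made up, and including the untransformed sentence among the candidates is an assumption.

```python
from typing import Callable, Dict, Iterable, List

Sentence = str
Output = Dict[str, float]  # hypothetical output type: label -> score


def adut(sentence: Sentence,
         transformations: Iterable[Callable[[Sentence], List[Sentence]]],
         model: Callable[[Sentence], Output],
         combine: Callable[[List[Output]], Output]) -> Output:
    """Adapt the text, not the model: transform, run the old model, combine."""
    # Transformation module: the input sentence plus every transformed variant
    # (keeping the input itself is an assumption of this sketch).
    candidates = [sentence]
    for transform in transformations:
        candidates.extend(transform(sentence))
    # The old model is applied unchanged to each candidate sentence.
    outputs = [model(t) for t in candidates]
    # Combination module: merge the k outputs into a single output.
    return combine(outputs)


# Toy stand-ins, for illustration only.
def replace_gazed(s: Sentence) -> List[Sentence]:
    return [s.replace("gazed", "looked")] if "gazed" in s else []


def toy_model(s: Sentence) -> Output:
    if "looked" in s:
        return {"A1": 0.6, "AM-LOC": 0.1}
    return {"A1": 0.3, "AM-LOC": 0.4}


def average(outputs: List[Output]) -> Output:
    labels = outputs[0].keys()
    return {lab: sum(o[lab] for o in outputs) / len(outputs) for lab in labels}


result = adut("Scotty gazed at ugly gray slums .",
              [replace_gazed], toy_model, average)
best_label = max(result, key=result.get)
```

The model is only ever called, never modified, which is the point of the approach: any existing tool can be plugged in as `model`.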

Transformation Functions

• Definition: a function that maps an instance to a set of instances
• Example: replacement of a word with synonyms that are common in the training data
• Properties:
  • Label (semantic role) preserving
  • Output examples are more likely to appear in the old domain than the input example
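
As a rough sketch of such a function, the synonym example might look like the Python below. The function name, the `min_count` threshold, the synonym table, and the training counts are all hypothetical; the slides do not specify how "common in the training data" is measured.

```python
from typing import Dict, List, Set, Tuple


def replace_with_frequent_synonyms(
    tokens: List[str],
    synonyms: Dict[str, List[str]],
    train_counts: Dict[str, int],
    min_count: int = 5,
) -> Set[Tuple[str, ...]]:
    """Map one sentence to a set of sentences (possibly empty)."""
    results: Set[Tuple[str, ...]] = set()
    for i, word in enumerate(tokens):
        # Leave a word alone if the training data already covers it well.
        if train_counts.get(word, 0) >= min_count:
            continue
        for candidate in synonyms.get(word, []):
            # Keep only synonyms that are common in the training data.
            if train_counts.get(candidate, 0) >= min_count:
                results.add(tuple(tokens[:i] + [candidate] + tokens[i + 1:]))
    return results


# Hypothetical resources, for illustration only.
synonyms = {"checkup": ["examination", "check"]}
train_counts = {"He": 900, "was": 800, "released": 40, "after": 300,
                "a": 1000, "two-day": 12, "examination": 25, "check": 2,
                ".": 1000}
sentence = "He was released after a two-day checkup .".split()
transformed = replace_with_frequent_synonyms(sentence, synonyms, train_counts)
```

Returning a set matches the definition above: one instance maps to zero or more instances, and replacing a word by a synonym is intended to leave the semantic roles unchanged.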

Categorization of Transformation Functions

• Resource Based Transformations: use resources and prior knowledge
• Learned Transformations: learned from training data

Resource Based Transformation

• Replacement of Infrequent Predicate
• Replacement/Removal of Quoted String
• Replacement of Unknown Word (Word Cluster, WordNet)
• Sentence Simplification

Replacement of Infrequent Predicate (VerbNet)

Intuition: The model makes better predictions for frequent predicates.

Input Sentence: Scotty gazed at ugly gray slums .
Transformed Sentence: Scotty looked at ugly gray slums .

Replacement/Removal of Quoted String

Intuition: The parser works better on simplified quoted sentences.

Input Sentence: “We just sit quiet” , he said .

Transformed Sentences:
• We just sit quiet.
• He said, “This is good”.
• He said, “We just sit quiet”.
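
A simple regular-expression sketch that produces the three variants above is shown below. It is an assumption-laden illustration: `transform_quoted` is a hypothetical name, it handles only one quoted span per sentence, and it emits plain ASCII quotes.

```python
import re
from typing import List

# Matches one quoted span written with straight or curly double quotes.
QUOTE = re.compile(r'[“"](.+?)[”"]')


def transform_quoted(sentence: str, placeholder: str = "This is good") -> List[str]:
    match = QUOTE.search(sentence)
    if match is None:
        return []
    quoted = match.group(1).strip()
    # Text left after removing the quote, e.g. " , he said ." -> "he said".
    rest = (sentence[:match.start()] + sentence[match.end():]).strip(" ,.")
    results = [quoted.rstrip(".") + "."]                # quoted string alone
    if rest:
        clause = rest[0].upper() + rest[1:]
        results.append(f'{clause}, "{placeholder}".')   # quote replaced by a simple one
        results.append(f'{clause}, "{quoted}".')        # quote moved after the clause
    return results


variants = transform_quoted('"We just sit quiet" , he said .')
```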

Replacing Unknown Word (Word Cluster, WordNet)

Intuition: The parser and the model work better on known words.

Input Sentence: He was released after a two-day checkup.
Transformed Sentence: He was released after a two-day examination.

Sentence Simplification (1)

Intuition: The parser and the model work better on simplified sentences.

Input Sentence: The science teacher and the students discussed the issue at the classroom .
Transformation: Delete PP
Transformed Sentence: The science teacher and the students discussed the issue.

Sentence Simplification (2)

Input Sentence: The science teacher and the students discussed the issue.
Transformation: Simplify NP
Transformed Sentence: The teacher discussed the issue.

Learned Transformation Rules

Motivation:
• Identify a specific context in the input sentence
• Transfer the candidate argument to a simpler context in which the SRL is more robust

Input Sentence: Mr. Mckinley was entitled to a discount .

Position   Phrase          Features
-2         Mr. Mckinley    NP, Mckinley (A2)
-1         was             AUX, was
0          entitled
1          to a discount   PP, to
2          .

pattern p=[-2,NP,][-1,AUX,][1,,to]
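
One way to read the pattern is as a list of (position, phrase type, head word) constraints, where an empty slot means "no constraint". The Python sketch below checks such a pattern against a sentence. The data layout, the `matches` name, and the `VBN` tag for the predicate are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# One pattern element: (position relative to the predicate, phrase type, head word).
# None stands for an empty slot, i.e. no constraint.
Element = Tuple[int, Optional[str], Optional[str]]
# A sentence as {position: (phrase type, head word)}.
Phrases = Dict[int, Tuple[str, str]]


def matches(pattern: List[Element], phrases: Phrases) -> bool:
    for position, phrase_type, head in pattern:
        if position not in phrases:
            return False
        actual_type, actual_head = phrases[position]
        if phrase_type is not None and phrase_type != actual_type:
            return False
        if head is not None and head != actual_head:
            return False
    return True


# pattern p=[-2,NP,][-1,AUX,][1,,to]
pattern = [(-2, "NP", None), (-1, "AUX", None), (1, None, "to")]
sentence = {-2: ("NP", "Mckinley"), -1: ("AUX", "was"),
            0: ("VBN", "entitled"),  # tag assumed for illustration
            1: ("PP", "to"), 2: (".", ".")}
other = dict(sentence)
other[1] = ("NP", "success")  # position 1 no longer has head word "to"

hit = matches(pattern, sentence)
miss = matches(pattern, other)
```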

Context Component of Rules

Rule:
• predicate p = entitle
• pattern p = [-2,NP,][-1,AUX,][1,,to]
• Location of Source Phrase ns = -2
• Replacement Sentence st = “But he did not sing.”
• Location of Replacement Phrase nt = -3
• Label Correspondence function f = {(A0,A2), (Ai,Ai), i≠0}

Input Sentence: Mr. Mckinley was entitled to a discount . (phrase positions -2 to 2 relative to the predicate; the source phrase “Mr. Mckinley” is at -2)

Replacement Component of Rules

• Replacement Sentence st = “But he did not sing.”
• Location of Replacement Phrase nt = -3

Replacement Sentence: But he did not sing . (positions -4 to 1 relative to the predicate; the replacement phrase “he” is at -3)

Semantic Role Mapping Component of Rule

• Label Correspondence function f = {(A0,A2), (Ai,Ai), i≠0}

Input Sentence: Mr. Mckinley was entitled to a discount . (Gold Annotation of “Mr. Mckinley”: A2)
Replacement Sentence: But he did not sing .
Transformed Sentence: But Mr. Mckinley did not sing . (Apply SRL System: “Mr. Mckinley” is labeled A0)

Transforming a sentence by using rules

R is the set of rules, learned from training data.

for each phrase p in input sentence s
    for each rule τ ∈ R
        if τ applies to p
            sentence t = transform(τ, p)
            r = semantic role of p in t using SRL model
            semantic role of p in s = map(τ, r)
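
The loop above can be written as runnable Python under a few assumptions. The `Rule` class, its three fields, and the toy SRL function are hypothetical stand-ins; collecting one proposed role per applicable rule (rather than overwriting) is a design choice of this sketch, not something the slides specify.

```python
from typing import Callable, Dict, List


class Rule:
    """Hypothetical container for the three rule components."""

    def __init__(self, applies: Callable[[str, str], bool],
                 transform: Callable[[str], str],
                 role_map: Dict[str, str]) -> None:
        self.applies = applies      # context component: (phrase, sentence) -> bool
        self.transform = transform  # replacement component: phrase -> new sentence
        self.role_map = role_map    # role in transformed sentence -> role in input


def label_with_rules(sentence: str, phrases: List[str], rules: List[Rule],
                     srl: Callable[[str, str], str]) -> Dict[str, List[str]]:
    """Return {phrase: [roles proposed by the applicable rules]}."""
    proposals: Dict[str, List[str]] = {}
    for phrase in phrases:
        for rule in rules:
            if not rule.applies(phrase, sentence):
                continue
            transformed = rule.transform(phrase)
            role = srl(transformed, phrase)
            # Roles without an explicit entry map to themselves (Ai -> Ai).
            proposals.setdefault(phrase, []).append(rule.role_map.get(role, role))
    return proposals


# Toy rule and toy SRL system, for illustration only.
entitle_rule = Rule(
    applies=lambda phrase, s: s.startswith(phrase + " was entitled to"),
    transform=lambda phrase: f"But {phrase} did not sing .",
    role_map={"A0": "A2"},
)


def toy_srl(sentence: str, phrase: str) -> str:
    return "A0"  # pretend the SRL model labels the subject of "sing" as A0


roles = label_with_rules("Mr. Mckinley was entitled to a discount .",
                         ["Mr. Mckinley"], [entitle_rule], toy_srl)
```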

Learning Transformation Rules

Input: Predicate p, Semantic role r
R ← Get Initial Rules (p, r)
repeat
    S ← Expand Rules (R)
    Sort R ∪ S based on accuracy
    R ← Top rules in R ∪ S
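
A minimal Python sketch of this loop follows. The slide gives neither a stopping condition nor the number of rules kept, so `beam_size`, `max_iterations`, and stopping when the rule set no longer changes are assumptions; `expand` and `accuracy` are hypothetical callables supplied by the caller.

```python
from typing import Callable, Hashable, Iterable, List


def learn_rules(initial_rules: Iterable[Hashable],
                expand: Callable[[Hashable], Iterable[Hashable]],
                accuracy: Callable[[Hashable], float],
                beam_size: int = 2,
                max_iterations: int = 10) -> List[Hashable]:
    rules = list(initial_rules)                      # R <- Get Initial Rules
    for _ in range(max_iterations):
        expanded = {n for r in rules for n in expand(r)}          # S <- Expand Rules (R)
        ranked = sorted(set(rules) | expanded, key=accuracy, reverse=True)
        top = ranked[:beam_size]                     # R <- Top rules in R ∪ S
        if set(top) == set(rules):                   # assumed stopping condition
            break
        rules = top
    return rules


# Toy search space: rule names with made-up neighbors and accuracies.
neighbors = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
scores = {"a": 0.5, "b": 0.7, "c": 0.6, "d": 0.9}
learned = learn_rules(["a"], lambda r: neighbors[r], lambda r: scores[r])
```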

Get Initial Rules (entitle, A2)

Input Sentence: Mr. Mckinley was entitled to a discount . (“Mr. Mckinley”: A2)
Replacement Sentence: But he did not sing .
Replacement Sentence: I asked the man .

Expand Rule (τ)

Rule τ:
• st = “But he did not sing .”
• nt = -3
• p = ask
• sp = [-1,NP,I][0,VBD,asked][1,NP,man]
• ns = 1
• f = {(A0,A2), (Ai,Ai), i≠0}

Neighbor Rule of τ:
• st = τ.st
• nt = τ.nt
• p = τ.p
• sp = [-1,NP,][0,VBD,asked][1,NP,man]
• ns = τ.ns
• f = τ.f

Example: He asked the man
• Rule τ: Does not apply
• Neighbor rule of τ: Applies
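
The example suggests that a neighbor rule is obtained by relaxing one constraint in the pattern. The Python sketch below generates neighbors by clearing one filled slot at a time; that this is the full expansion procedure is an assumption, and `neighbor_patterns` and `applies` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

Element = Tuple[int, Optional[str], Optional[str]]  # (position, phrase type, word)


def neighbor_patterns(pattern: List[Element]) -> List[List[Element]]:
    """Relax the pattern by clearing exactly one filled slot."""
    neighbors = []
    for i, (pos, ptype, word) in enumerate(pattern):
        if word is not None:
            neighbors.append(pattern[:i] + [(pos, ptype, None)] + pattern[i + 1:])
        if ptype is not None:
            neighbors.append(pattern[:i] + [(pos, None, word)] + pattern[i + 1:])
    return neighbors


def applies(pattern: List[Element], phrases: Dict[int, Tuple[str, str]]) -> bool:
    for pos, ptype, word in pattern:
        if pos not in phrases:
            return False
        actual_type, actual_word = phrases[pos]
        if ptype not in (None, actual_type) or word not in (None, actual_word):
            return False
    return True


# sp = [-1,NP,I][0,VBD,asked][1,NP,man]
sp = [(-1, "NP", "I"), (0, "VBD", "asked"), (1, "NP", "man")]
he_asked = {-1: ("NP", "He"), 0: ("VBD", "asked"), 1: ("NP", "man")}

neighbors = neighbor_patterns(sp)
original_applies = applies(sp, he_asked)            # word "I" does not match "He"
neighbor_applies = applies(neighbors[0], he_asked)  # word slot at -1 cleared
```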

Calculate Accuracy (τ)

Example: A rule is correct

Input Sentence: Mr. Mckinley was entitled to a discount . (source phrase at position -2)
Replacement Sentence: But he did not sing .
Transformed Sentence: But Mr. Mckinley did not sing .

• Gold Annotation: A2
• Apply SRL System: A0
• A2 = f(A0): Correct!

Calculate Accuracy (τ)

Example: A rule makes a mistake

Input Sentence: The movie was entitled a big success . (source phrase at position -2)
Replacement Sentence: But he did not sing .
Transformed Sentence: But The movie did not sing .

• Gold Annotation: A1
• Apply SRL System: A0
• A2 = f(A0), which differs from the gold annotation: Wrong
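
The two examples can be turned into a small accuracy computation. The sketch below assumes accuracy is the fraction of training instances on which the mapped SRL prediction equals the gold role; the slides show only per-example correctness, so this definition and the `rule_accuracy` name are assumptions.

```python
from typing import Dict, List, Tuple


def rule_accuracy(role_map: Dict[str, str],
                  examples: List[Tuple[str, str]]) -> float:
    """examples: (role predicted on the transformed sentence, gold role in the input)."""
    if not examples:
        return 0.0
    correct = sum(1 for predicted, gold in examples
                  # Roles without an explicit entry map to themselves (Ai -> Ai).
                  if role_map.get(predicted, predicted) == gold)
    return correct / len(examples)


f = {"A0": "A2"}
examples = [
    ("A0", "A2"),  # "Mr. Mckinley was entitled to a discount ."  -> correct
    ("A0", "A1"),  # "The movie was entitled a big success ."     -> wrong
]
acc = rule_accuracy(f, examples)
```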

Combination using Integer Linear Programming

Step 1: Compute the distribution of scores over labels for argument candidates
• Our SRL system classifies each phrase as a semantic role
• The system assigns a probability distribution over semantic roles for each argument
• For the same argument in different sentences, compute the average

Transformed Sentence 1: Scotty looked at ugly gray slums .
Transformed Sentence 2: Scotty gazed at ugly gray slums .

Scores for the same argument in the two sentences:
• A1 0.3, AM-LOC 0.4
• A1 0.6, AM-LOC 0.1
• Average: A1 0.45, AM-LOC 0.25
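
The averaging step is simple enough to check directly. In the sketch below, `average_scores` is a hypothetical helper; treating a label that is missing from one distribution as score 0 is an assumption.

```python
from collections import defaultdict
from typing import Dict, List


def average_scores(distributions: List[Dict[str, float]]) -> Dict[str, float]:
    """Average the per-label scores of one argument across sentences."""
    totals: Dict[str, float] = defaultdict(float)
    for dist in distributions:
        for label, score in dist.items():
            totals[label] += score
    # A label missing from a distribution contributes 0 to its total.
    return {label: total / len(distributions) for label, total in totals.items()}


averaged = average_scores([
    {"A1": 0.3, "AM-LOC": 0.4},
    {"A1": 0.6, "AM-LOC": 0.1},
])
```

Up to floating-point rounding, this reproduces the averages on the slide: A1 = (0.3 + 0.6) / 2 = 0.45 and AM-LOC = (0.4 + 0.1) / 2 = 0.25.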

Inference via Integer Linear Programming

Input Sentence: Scotty gazed at ugly gray slums.

• Goal: find the most likely semantic role assignment to all arguments without violating the constraints
• Solve an ILP
• Example of a constraint: two arguments cannot overlap
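
To make the constrained inference concrete, the sketch below replaces the ILP solver with exhaustive search, which finds the same optimum on tiny inputs but does not scale. The candidate spans, the `NONE` label, the scores other than the averaged 0.45 and 0.25, and the sum-of-scores objective are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices


def overlaps(a: Span, b: Span) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def best_assignment(candidates: List[Tuple[Span, Dict[str, float]]]) -> List[str]:
    """Pick one label per candidate, maximizing the total score subject to
    the constraint that two labeled arguments cannot overlap."""
    spans = [span for span, _ in candidates]
    best_score, best_roles = float("-inf"), []
    for roles in product(*(scores.keys() for _, scores in candidates)):
        # "NONE" means the candidate is not an argument and is exempt.
        chosen = [s for s, r in zip(spans, roles) if r != "NONE"]
        if any(overlaps(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]):
            continue
        score = sum(scores[r] for (_, scores), r in zip(candidates, roles))
        if score > best_score:
            best_score, best_roles = score, list(roles)
    return best_roles


# Tokens: Scotty(0) gazed(1) at(2) ugly(3) gray(4) slums(5) .(6)
candidates = [
    ((0, 0), {"A0": 0.9, "NONE": 0.1}),                   # "Scotty"
    ((2, 5), {"A1": 0.45, "AM-LOC": 0.25, "NONE": 0.3}),  # "at ugly gray slums"
    ((3, 5), {"A1": 0.4, "NONE": 0.6}),                   # "ugly gray slums"
]
assignment = best_assignment(candidates)
```

An actual ILP formulation would use one binary variable per (candidate, label) pair and express the overlap restriction as linear inequalities; the exhaustive version here is only meant to show what is being optimized.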

Results for Single Parse System (F1)

System                      Baseline   ADUT
Charniak Parse based SRL    65.5       69.3 (+3.8)
Stanford Parse based SRL    62.9       65.7 (+2.8)

Results for Multi Parse System (1)

System                 F1
Punyakanok08           67.8 (-2.7)
Toutanova08            68.8 (-1.7)
Surdeanu07             69.2 (-1.3)
(Cons)ADUT-Combined    70.5
Huang10                73.8 (+3.3) (Retrain)

Differences in parentheses are relative to ADUT-Combined (70.5).

Effect of each Transformation

Transformation                  F1
Baseline                        65.5
Replacement of Unknown Words    66.1
Replacement of Predicate        66.8
Replacement of Quotes           67
Sentence Simplification         66.4
Transformation By Rules         66.2
Together                        69.3

Conclusion

Current Work
• We suggested a framework for adapting text to yield better SRL analysis
• We showed that adaptation is possible without retraining and without unlabeled data
• We showed that simple transformations yield 13% error reduction for SRL

Future Work
• Applying the framework to other domains and tasks
• Using unlabeled data to improve transformations

Thank You.