Multi-Task Transfer Learning for Weakly-Supervised Relation Extraction
Jing Jiang
Singapore Management University
ACL-IJCNLP 2009
Aug 5, 2009 ACL-IJCNLP 2009 2
Relation Extraction
• Task definition: to label the semantic relation between a pair of entities in a sentence (fragment)
…[leader arg-1] of a minority [government arg-2]…
PHYS PER-SOC EMP-ORG NIL
PHYS: Physical
PER-SOC: Personal / Social
EMP-ORG: Employment / Membership / Subsidiary
Supervised Learning
• Current solution: supervised machine learning (e.g. [Zhou et al. 2005], [Bunescu & Mooney 2005], [Zhang et al. 2006])
• Training data is needed for each relation type
…[leader arg-1] of a minority [government arg-2]…
arg-1 word: leader arg-2 type: ORG
dependency:arg-1 of arg-2
EMP-ORG PHYS PER-SOC NIL
Challenge in Practice
• New relation type (in a new domain): no training data or a few seed instances
• In this work, we study weakly-supervised relation extraction
– A few seed instances of the target relation type
– Many instances of other auxiliary relation types
– Additional human knowledge about the target relation type
• Main idea: Auxiliary relation types can help!
Syntactic Similarity across Relation Types
…[leader arg-1] of a minority [government arg-2]… (EMP-ORG)
  arg-1 word: leader; arg-2 type: ORG; dependency: arg-1 of arg-2
the youngest [son arg-1] of ex-director [Suharto arg-2] (PER-SOC)
the [Socialist People’s Party arg-1] of [Montenegro arg-2] (GPE-AFF)
Syntactic Similarity
Syntactic Pattern     Relation Instance                                   Relation Type (Subtype)
arg-2 arg-1           Arab leaders                                        OTHER-AFF (Ethnic)
arg-2 arg-1           his father                                          PER-SOC (Family)
arg-2 arg-1           South Jakarta Prosecution Office                    GPE-AFF (Based-in)
arg-1 [verb] arg-2    Yemen [sent] planes to Baghdad                      ART (User-or-Owner)
arg-1 [verb] arg-2    His wife [had] three young children                 PER-SOC (Family)
arg-1 [verb] arg-2    Jody Scheckter [paced] Ferrari to both victories    EMP-ORG (Employ-Staff)
Problem Formulation based on Transfer Learning
• Domain adaptation and transfer learning (e.g. [Blitzer et al. 2006], [Daumé III 2007])
our goal: PER-SOC EMP-ORG
• We apply our previous framework ([Jiang & Zhai 2007b])
– Similar in spirit to [Evgeniou & Pontil 2004] and [Daumé III 2007]
Review of Relation Extraction Basics
Linear classifier
…[leader arg-1] of a minority [government arg-2]…
Feature                       x    w
arg-2 type: ORG               1    4.5
arg-2 type: PER               0    0.3
…                             …    …
dependency: arg-1 of arg-2    1    6.7
…                             …    …
$f(x) = w^\top x$, where $x$ is the feature vector and $w$ is the weight vector in the linear classifier
Output: EMP-ORG
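The scoring step of such a linear classifier can be sketched in a few lines of Python. This is only an illustration: it assumes the three features and the weights 4.5, 0.3 and 6.7 on the slide line up position by position, and the dictionary layout and the name `score` are hypothetical.

```python
# Toy feature vector x for "[leader] of a minority [government]".
# Assumption: feature names and values mirror the slide's example.
features = {
    "arg-2 type: ORG": 1,
    "arg-2 type: PER": 0,
    "dependency: arg-1 of arg-2": 1,
}

# Toy weight vector w for one relation type (assumed pairing of weights to features).
weights = {
    "arg-2 type: ORG": 4.5,
    "arg-2 type: PER": 0.3,
    "dependency: arg-1 of arg-2": 6.7,
}


def score(x, w):
    """Compute the inner product of a sparse feature vector x with weights w."""
    return sum(value * w.get(name, 0.0) for name, value in x.items())


f_x = score(features, weights)  # only the two active features contribute
```

A higher score for one relation type than for the others would lead the classifier to pick that type for the entity pair.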
General vs. Specific Features
Assumption: some features are commonly useful for different relation types, while other features are specific to individual relation types
$$w^T = \mu^T + \nu, \qquad w^k = \mu^k + \nu \quad \text{for } k = 1, \ldots, K$$
• $w^T$: weight vector for target type
• $w^k$: weight vector for k’th auxiliary type
• $\nu$: common weight vector in a lower H-dimensional space
Learning Framework
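$$\left(\hat{\mu}^T, \{\hat{\mu}^k\}_{k=1}^{K}, \hat{\nu}\right) = \arg\min_{\mu^T, \{\mu^k\}, \nu,\; F\nu = 0} \left[ L(D^T, \mu^T + \nu) + \sum_{k=1}^{K} L(D^k, \mu^k + \nu) + \lambda_\mu^T \|\mu^T\|^2 + \sum_{k=1}^{K} \lambda_\mu^k \|\mu^k\|^2 + \lambda_\nu \|\nu\|^2 \right]$$
• $L(D^T, \mu^T + \nu)$: loss function on the target seed instances
• $\sum_{k=1}^{K} L(D^k, \mu^k + \nu)$: loss function on the auxiliary training instances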
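One way to minimize an objective of this shape is sketched below. It is not the author's implementation: it assumes a logistic loss, plain gradient descent, and a boolean mask `general` that marks which features the common weight vector may use (all other entries of the common vector are held at zero). The function names, toy data and small regularization weights are illustrative only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def loss_grad(data, w):
    """Gradient of the summed logistic loss over (x, y) pairs, y in {-1, +1}."""
    grad = [0.0] * len(w)
    for x, y in data:
        coeff = -y / (1.0 + math.exp(y * dot(w, x)))
        for i, xi in enumerate(x):
            grad[i] += coeff * xi
    return grad


def train(target, auxiliaries, general, lam_mu_t=10.0, lam_mu_k=10.0,
          lam_nu=0.1, lr=0.01, steps=2000):
    """Jointly fit type-specific vectors mu and a shared vector nu (w = mu + nu)."""
    dim = len(general)
    tasks = [target] + auxiliaries
    lams = [lam_mu_t] + [lam_mu_k] * len(auxiliaries)
    mus = [[0.0] * dim for _ in tasks]
    nu = [0.0] * dim
    for _ in range(steps):
        nu_grad = [2.0 * lam_nu * v for v in nu]  # gradient of lam_nu * ||nu||^2
        new_mus = []
        for mu, data, lam in zip(mus, tasks, lams):
            w = [m + v for m, v in zip(mu, nu)]
            g = loss_grad(data, w)
            nu_grad = [a + b for a, b in zip(nu_grad, g)]  # nu is shared by all tasks
            new_mus.append([m - lr * (gi + 2.0 * lam * m) for m, gi in zip(mu, g)])
        mus = new_mus
        # Keep nu at zero on features that are not marked as general.
        nu = [v - lr * gv if keep else 0.0 for v, gv, keep in zip(nu, nu_grad, general)]
    return mus[0], mus[1:], nu


# Toy data: feature 0 is a shared syntactic pattern, feature 1 is type-specific.
target = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]          # a few seed instances
auxiliary = [([1.0, 0.0], 1)] * 5 + [([0.0, 1.0], -1)] * 5
mu_t, mu_aux, nu = train(target, [auxiliary], general=[True, False])
w_t = [m + v for m, v in zip(mu_t, nu)]
```

With strong regularization on the type-specific vectors and weak regularization on the shared one, most of the weight on the general feature ends up in the shared vector, which is how the auxiliary instances help the target classifier in this toy setting.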
General Features
Which subset of features should be captured by $\nu$?
$$w^T = \mu^T + \nu, \qquad w^k = \mu^k + \nu \quad \text{for } k = 1, \ldots, K$$
$\nu$: common weight vector in a lower H-dimensional space
Feature Separation
• Automatic separation within the learning framework (see [Jiang & Zhai 2007b])
• Human guidance
– Argument word features: features that contain the head word of an argument
  • E.g. arg-1 word: sister
– Entity type features: features that contain the entity type (subtype) of an argument
  • E.g. arg-2 type: ORG
• Combined
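The human-guided separation can be pictured as building a mask over feature names: argument word features and entity type features are treated as type-specific, and everything else is left available to the common weight vector. The sketch below assumes features are stored as strings in the style of the slide examples; the prefixes and the function name are hypothetical.

```python
# Hypothetical feature-name prefixes, modeled on the examples in the slides.
TYPE_SPECIFIC_PREFIXES = (
    "arg-1 word:", "arg-2 word:",   # argument word features
    "arg-1 type:", "arg-2 type:",   # entity type features
)


def general_feature_mask(feature_names):
    """Return True for features the shared component may use, False otherwise."""
    return [not name.startswith(TYPE_SPECIFIC_PREFIXES) for name in feature_names]


names = ["arg-1 word: sister", "arg-2 type: ORG", "dependency: arg-1 of arg-2"]
mask = general_feature_mask(names)
```

Here only the dependency feature remains general, so only its weight could be shared across relation types.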
Imposing Entity Type Constraint
• Fix the possible entity types for the arguments for the target relation type
• Filter out the relation instances that do not satisfy the constraint in the end
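A post-filtering step of this kind might look as follows. The record layout, the function name, and the example constraint (both arguments of a PER-SOC relation must be PER) are assumptions made for illustration; the slides do not list the actual allowed types.

```python
def apply_entity_type_constraint(predictions, allowed_arg1, allowed_arg2, target):
    """Relabel predicted target instances as NIL when argument types are not allowed."""
    filtered = []
    for p in predictions:
        satisfies = p["arg1_type"] in allowed_arg1 and p["arg2_type"] in allowed_arg2
        label = p["label"] if (p["label"] != target or satisfies) else "NIL"
        filtered.append({**p, "label": label})
    return filtered


# Hypothetical classifier output for two candidate entity pairs.
predictions = [
    {"arg1_type": "PER", "arg2_type": "PER", "label": "PER-SOC"},
    {"arg1_type": "PER", "arg2_type": "ORG", "label": "PER-SOC"},
]
result = apply_entity_type_constraint(
    predictions, allowed_arg1={"PER"}, allowed_arg2={"PER"}, target="PER-SOC"
)
```

Because the filter only removes predicted positives, it can raise precision but cannot raise recall.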
Experiment Setup
• ACE 2004, 7 relation types
– 6 types: auxiliary types
– 1 type: target type
• 5-fold cross validation
• # seed instances: 10
Methods Compared
• BL: train on seed instances only
• BL-A: train on seed and auxiliary training instances together w/o feature separation
• TL-auto: transfer learning w/ automatic feature separation
• TL-guide: transfer learning w/ human-guided feature separation
• TL-comb: automatic feature separation combined with human guidance
• TL-NE: TL-comb + entity type constraint
Comparison
Target Type                         Metric  BL      BL-A    TL-auto  TL-guide  TL-comb  TL-NE
Physical                            P       0.000   0.1692  0.2920   0.2934    0.3325   0.5056
Physical                            R       0.000   0.0848  0.1696   0.1722    0.2383   0.2316
Physical                            F       0.000   0.1130  0.2146   0.2170    0.2777   0.3176
Personal/Social                     P       1.000   0.0804  0.1005   0.3069    0.3214   0.6412
Personal/Social                     R       0.0386  0.1708  0.1598   0.7245    0.7686   0.7631
Personal/Social                     F       0.0743  0.1093  0.1234   0.4311    0.4533   0.6969
Employment/Membership/Subsidiary    P       0.9231  0.3561  0.5230   0.5428    0.5973   0.7145
Employment/Membership/Subsidiary    R       0.0075  0.1850  0.2617   0.2648    0.3632   0.3601
Employment/Membership/Subsidiary    F       0.0148  0.2435  0.3488   0.3559    0.4518   0.4789
Average                             P       0.8124  0.1475  0.2412   0.2703    0.2992   0.4231
Average                             R       0.0212  0.2432  0.3832   0.4764    0.5509   0.5464
Average                             F       0.0406  0.1532  0.2532   0.2958    0.3423   0.4132
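For readers who want to relate the P, R and F rows, the sketch below computes the balanced F-measure as the harmonic mean of precision and recall. That F means balanced F1 here is an assumption, and the function name is hypothetical. Rounded per-type values in the table can differ from this formula in the last digit, and the Average rows should not be expected to satisfy it.

```python
def f1(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Personal/Social with TL-NE, using the P and R values from the table.
per_soc_tl_ne = f1(0.6412, 0.7631)
```

The zero case matters for the BL column on Physical, where both precision and recall are reported as 0.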
Effect of λ
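$\lambda_\mu^T$    100     1000    10000
P                  0.6265  0.3162  0.2992
R                  0.1170  0.3959  0.5509
F                  0.1847  0.2983  0.3423
Performance of TL-comb, with $\lambda_\mu^k = 10^4$ and $\lambda_\nu = 1$.
Regularization terms:
$$\lambda_\mu^T \|\mu^T\|^2 + \sum_{k=1}^{K} \lambda_\mu^k \|\mu^k\|^2 + \lambda_\nu \|\nu\|^2$$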
Conclusions
• We proposed to apply a multi-task transfer learning framework to the weakly-supervised relation extraction problem.
• We defined two kinds of type-specific features.
• Our experiments show that automatic feature separation combined with human guidance and entity type constraint can significantly outperform the baselines.
Related Work
• [Zhou et al. 2008]: Different way of modeling commonality among relation types.
• [Banko & Etzioni, 2008]: Open-domain relation extraction. No target relation type.
• [Xu et al. 2008]: Rule-based adaptation. Same type.