a cross-lingual annotation projection approach for relation detection

A CROSS-LINGUAL ANNOTATION PROJECTION APPROACH FOR RELATION DETECTION

The 23rd International Conference on Computational Linguistics (COLING 2010)

August 24th, 2010, Beijing

Seokhwan Kim (POSTECH), Minwoo Jeong (Saarland University), Jonghoon Lee (POSTECH), Gary Geunbae Lee (POSTECH)

Contents

• Introduction

• Methods

Cross-lingual Annotation Projection for Relation Detection

Noise Reduction Strategies

• Evaluation

• Conclusion


What’s Relation Detection?

• Relation Extraction

To identify semantic relations between a pair of entities

ACE RDC

• Relation Detection (RD)

• Relation Categorization (RC)


Jan Mullins, owner of Computer Recycler Incorporated said that …

Owner-Of

What’s the Problem?

• Many supervised machine learning approaches have been successfully applied to the RDC task (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta and Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al., 2006)

• Datasets for relation detection

Labeled corpora for supervised learning

Available for only a few languages

• English, Chinese, Arabic

No resources for other languages

• Korean


Cross-lingual Annotation Projection

• Goal

To learn the relation detector without significant annotation efforts

• Method

To leverage parallel corpora to project the relation annotation on the source language LS to the target language LT


Cross-lingual Annotation Projection

• Previous Work

Part-of-speech tagging (Yarowsky and Ngai, 2001)

Named-entity tagging (Yarowsky et al., 2001)

Verb classification (Merlo et al., 2002)

Dependency parsing (Hwa et al., 2005)

Mention detection (Zitouni and Florian, 2008)

Semantic role labeling (Pado and Lapata, 2009)

• To the best of our knowledge, no work has reported on the RDC task


Overall Architecture

[Figure: a parallel corpus processed in two stages, Annotation and Projection.]

• Annotation

Sentences in LS → Preprocessing (POS Tagging, Parsing) → NER → Relation Detection → Annotated Sentences in LS

• Projection

Sentences in LT → Preprocessing (POS Tagging, Parsing) → Word Alignment → Projection → Annotated Sentences in LT
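The projection step can be illustrated with a minimal sketch. It assumes word alignments are given as (source index, target index) pairs and entity mentions as inclusive token spans; the names `project_span` and `project_relation` and the toy alignment are hypothetical and do not come from the presentation.

```python
from typing import List, Optional, Tuple

Alignment = List[Tuple[int, int]]  # (source token index, target token index)
Span = Tuple[int, int]             # inclusive (start, end) token indices


def project_span(span: Span, alignment: Alignment) -> Optional[Span]:
    """Map a source-side span to the smallest target span covering its aligned tokens."""
    start, end = span
    targets = [t for s, t in alignment if start <= s <= end]
    if not targets:
        return None  # nothing aligned, so the mention cannot be projected
    return min(targets), max(targets)


def project_relation(arg1: Span, arg2: Span, label: bool,
                     alignment: Alignment) -> Optional[Tuple[Span, Span, bool]]:
    """Project a relation instance (two mentions plus a detection label) onto the target side."""
    t1 = project_span(arg1, alignment)
    t2 = project_span(arg2, alignment)
    if t1 is None or t2 is None:
        return None  # drop instances whose arguments cannot both be projected
    return t1, t2, label


# Toy example (made-up alignment): source tokens 0-1 and 3-4 are entity mentions.
toy_alignment = [(0, 2), (1, 3), (3, 0), (4, 1)]
projected = project_relation((0, 1), (3, 4), True, toy_alignment)
```

Taking the smallest covering span is only one possible design choice; it is the kind of projection that the noise reduction strategies are meant to clean up.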

How to Reduce Noise?

• Error Accumulation

Numerous errors can be generated and accumulated through the annotation projection procedure

• Preprocessing for LS and LT

• NER for LS

• Relation Detection for LS

• Word Alignment between LS and LT

• Noise Reduction

A key factor in improving the performance of annotation projection


• Noise Reduction Strategies (1)

Alignment Filtering

• Based on Heuristics

A projection for an entity mention should be based on alignments between contiguous word sequences
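One way to read the contiguity heuristic is sketched here. It assumes alignments are (source index, target index) pairs and checks only that the target tokens aligned to a mention form an unbroken run; the function name `is_contiguous_projection` is hypothetical, and the authors' exact criterion may differ.

```python
from typing import List, Tuple


def is_contiguous_projection(span: Tuple[int, int],
                             alignment: List[Tuple[int, int]]) -> bool:
    """Accept a mention only if its aligned target tokens form one unbroken run."""
    start, end = span
    targets = sorted({t for s, t in alignment if start <= s <= end})
    if not targets:
        return False  # unaligned mentions are rejected
    # A run of k distinct indices is contiguous when max - min + 1 == k.
    return targets[-1] - targets[0] + 1 == len(targets)


# Made-up alignments for a three-token mention at source positions 0..2.
accepted = is_contiguous_projection((0, 2), [(0, 2), (1, 3), (2, 4)])
rejected = is_contiguous_projection((0, 2), [(0, 0), (1, 3), (2, 4)])
```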


[Figure: two example alignments for an entity mention, one accepted and one rejected.]




Both an entity mention in LS and its projection in LT should include at least one base noun phrase


[Figure: example projections with words labeled N, marked accepted or rejected.]

The projected instance in LT should satisfy the clausal agreement with the original instance in LS
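The base noun phrase condition can be approximated with a rough sketch. It assumes that a mention "includes a base noun phrase" when at least one of its tokens carries a noun POS tag, and that noun tags start with "N"; both are simplifying assumptions, and `has_noun` and `passes_noun_filter` are hypothetical names. The clausal agreement check is not covered here, since it depends on parser output.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def has_noun(tags: List[str], span: Span, noun_prefix: str = "N") -> bool:
    """True if any token inside the span has a POS tag starting with noun_prefix."""
    start, end = span
    return any(tag.startswith(noun_prefix) for tag in tags[start:end + 1])


def passes_noun_filter(src_tags: List[str], src_span: Span,
                       tgt_tags: List[str], tgt_span: Span) -> bool:
    """Accept only if both the source mention and its projection contain a noun."""
    return has_noun(src_tags, src_span) and has_noun(tgt_tags, tgt_span)


# Made-up tag sequences for illustration only.
src_tags = ["NNP", "NNP", "VBD"]
tgt_tags = ["NNP", "JKS", "VV"]
keep = passes_noun_filter(src_tags, (0, 1), tgt_tags, (0, 0))
drop = passes_noun_filter(src_tags, (0, 1), tgt_tags, (1, 2))
```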


[Figure: example projections with words labeled N, one accepted and two rejected.]



Alignment Correction

• Based on a bilingual dictionary for entity mentions

Each entry of the dictionary pairs an entity mention in LS with its translation or transliteration in LT

FOR each entity ES in LS
    RETRIEVE counterpart ET from DICT(E-T)
    SEEK ET from the sentence ST in LT
    IF matched THEN
        MAKE new alignment ES-ET
    ENDIF
ENDFOR

[Figure: source words A B C D E F G aligned to target words; the dictionary entry BCD - βγ yields a corrected alignment.]
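The dictionary lookup procedure can be sketched in Python. This is an illustration under stated assumptions: the dictionary maps a space-joined source mention to a space-joined target phrase, matching is exact token matching, and the names `find_subsequence` and `correct_alignments` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def find_subsequence(tokens: List[str], phrase: List[str]) -> Optional[int]:
    """Return the start index of the first exact occurrence of phrase in tokens."""
    n = len(phrase)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i
    return None


def correct_alignments(src_tokens: List[str], tgt_tokens: List[str],
                       mention_spans: List[Span],
                       dictionary: Dict[str, str]) -> Dict[Span, Span]:
    """For each source mention with a dictionary entry found in the target
    sentence, return a new mention-level alignment (source span -> target span)."""
    corrections: Dict[Span, Span] = {}
    for start, end in mention_spans:
        mention = " ".join(src_tokens[start:end + 1])
        translation = dictionary.get(mention)
        if translation is None:
            continue  # no dictionary entry for this mention
        phrase = translation.split()
        pos = find_subsequence(tgt_tokens, phrase)
        if pos is not None:
            corrections[(start, end)] = (pos, pos + len(phrase) - 1)
    return corrections


# Toy data: the mention "B C D" has the dictionary counterpart "β γ".
src = "A B C D E F G".split()
tgt = ["α", "β", "γ", "δ", "ε"]
fixed = correct_alignments(src, tgt, [(1, 3)], {"B C D": "β γ"})
```

The returned spans would replace whatever the automatic word aligner produced for those mentions.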



Assessment-based Instance Selection

• Based on the reliability of a projected instance in LT

Evaluated by the confidence score of monolingual relation detection for the original counterpart instance in LS

Only instances whose scores exceed the threshold value θ are accepted

Example: with θ = 0.7, an instance with conf = 0.9 is accepted and an instance with conf = 0.6 is rejected
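Assessment-based selection amounts to a threshold filter, sketched below. The dictionary-based instance representation and the name `select_instances` are assumptions made for illustration; the confidence values and θ = 0.7 are the ones shown on the slide.

```python
from typing import Dict, List


def select_instances(instances: List[Dict], theta: float = 0.7) -> List[Dict]:
    """Keep only projected instances whose source-side confidence exceeds theta."""
    return [inst for inst in instances if inst["conf"] > theta]


# Each instance carries the confidence of the monolingual detector on the LS side.
candidates = [
    {"id": "a", "conf": 0.9},
    {"id": "b", "conf": 0.6},
]
selected = select_instances(candidates, theta=0.7)
```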


Experimental Setup

• Dataset

English-Korean parallel corpus

• 454,315 English-Korean sentence pairs

• Aligned by GIZA++

Korean RDC corpus

• Annotated following the LDC guidelines for the ACE RDC corpus

• 100 news documents in Korean

835 sentences

3,331 entity mentions

8,354 relation instances


Experimental Setup

• Preprocessors

English

• Stanford Parser (Klein and Manning, 2003)

• Stanford Named Entity Recognizer (Finkel et al., 2005)

Korean

• Korean POS Tagger (Lee et al., 2002)

• MST Parser (McDonald et al., 2006)

Experimental Setup

• Relation Detection for English Sentences

Tree kernel-based SVM classifier

• Training Dataset

ACE 2003 corpus

• 674 documents

• 9,683 relation instances

• Model

Shortest path enclosed subtrees kernel (Zhang et al., 2006)

• Implementation

SVM-Light (Joachims, 1998)

Tree Kernel Tools (Moschitti, 2006)


Experimental Setup

• Relation Detection for Korean Sentences

Tree kernel-based SVM classifier

• Training Dataset

Half of the Korean RDC corpus (baseline)

Projected instances

• Model

Shortest path dependency kernel (Bunescu and Mooney, 2005)

• Implementation

SVM-Light (Joachims, 1998)

Tree Kernel Tools (Moschitti, 2006)


Experimental Setup

• Experimental Sets

Combinations of noise reduction strategies

• (S1: Heuristic, S2: Dictionary, S3: Assessment)

1. Baseline

Trained with only half of the Korean RDC corpus

2. Baseline + Projections (no noise reduction)

3. Baseline + Projections (S1)

4. Baseline + Projections (S1 + S2)

5. Baseline + Projections (S3)

6. Baseline + Projections (S1 + S3)

7. Baseline + Projections (S1 + S2 + S3)


Experimental Setup

• Evaluation

On the second half of the Korean RDC corpus

• The first half is for the baseline

On true entity mentions with true chaining of coreference

Evaluated by Precision/Recall/F-measure
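For reference, the evaluation metrics can be computed as in this short sketch. It assumes the F-measure is the balanced F1 (the harmonic mean of precision and recall), which the slides do not state explicitly; the function names are hypothetical.

```python
from typing import Tuple


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Sanity check against the reported baseline (P = 60.5, R = 20.4, in percent).
baseline_f = f_measure(60.5, 20.4)
```

Under the F1 assumption, the baseline precision and recall give an F of about 30.5, which agrees with the reported baseline score.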


Experimental Results

Model                                            no assessment       with assessment
                                                 P     R     F       P     R     F
baseline                                         60.5  20.4  30.5    -     -     -
baseline + projection                            22.5   6.5  10.0    29.1  13.2  18.2
baseline + projection (heuristics)               51.4  15.5  23.8    56.1  22.9  32.5
baseline + projection (heuristics + dictionary)  55.3  19.4  28.7    59.8  26.7  36.9

• Non-filtered projections were poor

• Heuristics were helpful

• Much worse than baseline

• Dictionary was also helpful

• Still worse than baseline

• Assessment boosted performance

• Combined strategies achieved better performance than baseline


Conclusion

• Summary

A cross-lingual annotation projection for relation detection

Three strategies for noise reduction

Projected instances from an English-Korean parallel corpus helped to improve the performance of the task

• with the noise reduction strategies

• Future work

A cross-lingual annotation projection for relation categorization

More elaborate strategies for noise reduction to improve the projection performance for relation extraction
