![Page 1: 1 Joint Inference for Knowledge Extraction from Biomedical Literature Hoifung Poon Dept. Computer Science & Eng. University of Washington (Joint work with](https://reader035.vdocuments.us/reader035/viewer/2022062511/551475e6550346414e8b631d/html5/thumbnails/1.jpg)
1
Joint Inference for Knowledge Extraction from
Biomedical Literature
Hoifung Poon
Dept. Computer Science & Eng.
University of Washington
(Joint work with Lucy Vanderwende
at Microsoft Research)
2
Outline
Motivation
Bio-event extraction
Our system
Experimental results
Conclusion
3
Knowledge Extraction From Web……
WWW
4
Knowledge Extraction From Web
If we succeed ……
  Breach the knowledge acquisition bottleneck
  Semantic search, question answering, …
But where should we start?
  More urgent and/or amenable
  General approaches
5
Knowledge Extraction From Biomedical Literature
PubMed: 18 million abstracts; += 2000 / mo.
Success would mean:
  Revolutionize biomedical research
  Dramatic speed-up in drug design
Grammatical English
General challenges:
  Beyond traditional information extraction
  Complex, nested structures
  Naturally call for joint inference
7
BioNLP: An Emerging Field
Protein name recognition
Protein-protein interaction (top F1 ~ 60%)
Bio-event extraction: Shared task of 2009 [Kim et al. 2009] (this talk)
Pathway Network
……
8
This Talk: Bio-Event Extraction
We present the first joint approach that achieves state-of-the-art results
Based on Markov logic [Domingos & Lowd 2009]
Novel formulation that expands the scope of joint inference
Adding a few joint inference formulas to simple logistic regression doubles the F1
10
Bio-Event: State change of bio-molecules
Gene expression
Transcription
Protein catabolism
Localization
Phosphorylation
Binding
Regulation
Positive regulation
Negative regulation
11
Example
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocytes by gp41 envelope protein of human immunodeficiency virus type 1 ...
T1 Protein 15 29 p70(S6)-kinase
T2 Protein 44 49 IL-10
T3 Protein 86 90 gp41
T4 Regulation 0 11 Involvement
T5 Positive_regulation 30 40 activation
E1 Regulation:T4 Theme:E2 Cause:T3
E2 Positive_regulation:T5 Theme:T1
…
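The annotation lines in this example can be read mechanically. The following is a minimal sketch, assuming the whitespace-separated layout shown on the slide (real shared-task files may use a different delimiter); the names `Span`, `Event`, and `parse` are hypothetical helpers, not part of any shared-task tooling. It parses the T (text span) and E (event) lines and checks that each character offset really points at the quoted text.

```python
from dataclasses import dataclass

# The example sentence from the slide (trailing "..." omitted).
TEXT = ("Involvement of p70(S6)-kinase activation in IL-10 up-regulation "
        "in human monocytes by gp41 envelope protein of human "
        "immunodeficiency virus type 1")

# Annotation lines as shown on the slide.
ANNOTATIONS = """\
T1 Protein 15 29 p70(S6)-kinase
T2 Protein 44 49 IL-10
T3 Protein 86 90 gp41
T4 Regulation 0 11 Involvement
T5 Positive_regulation 30 40 activation
E1 Regulation:T4 Theme:E2 Cause:T3
E2 Positive_regulation:T5 Theme:T1
"""


@dataclass
class Span:
    id: str
    type: str
    start: int
    end: int
    text: str


@dataclass
class Event:
    id: str
    type: str
    trigger: str  # id of the trigger span, e.g. "T4"
    args: dict    # role -> id of a span or another event


def parse(annotation_text):
    """Split annotation lines into text spans (T*) and events (E*)."""
    spans, events = {}, {}
    for line in annotation_text.splitlines():
        if not line.strip():
            continue
        ident, rest = line.split(None, 1)
        if ident.startswith("T"):
            typ, start, end, text = rest.split(None, 3)
            spans[ident] = Span(ident, typ, int(start), int(end), text)
        elif ident.startswith("E"):
            head, *arg_parts = rest.split()
            typ, trigger = head.split(":")
            args = dict(part.split(":") for part in arg_parts)
            events[ident] = Event(ident, typ, trigger, args)
    return spans, events


spans, events = parse(ANNOTATIONS)
for span in spans.values():
    # Offsets are character positions into TEXT, end-exclusive.
    assert TEXT[span.start:span.end] == span.text
```

Note that an event argument may point at another event (E1's Theme is E2), which is what makes the structure nested.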
13
Why Is It Hard?
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocytes by gp41 envelope protein of human immunodeficiency virus type 1 ...
[Figure: nested event structure for the sentence. Nodes: involvement, up-regulation, activation, IL-10, human monocyte, gp41, p70(S6)-kinase. Arcs are labeled Theme, Cause, and Site.]
Traditional information extraction ignores this
14
Why Is It Hard?
Variations in denoting the same events
E.g., negative regulation:
532 inhibited, 252 inhibition, 218 inhibit, 207 blocked, 175 inhibits, 157 decreased, 156 reduced, 112 suppressed, 108 decrease, 86 inhibitor, 81 Inhibition, 68 inhibitors, 67 abolished, 66 suppress, 65 block, 63 prevented, 48 suppression, 47 blocks, 44 inhibiting, 42 loss, 39 impaired, 38 reduction, 32 down-regulated, 29 abrogated, 27 prevents, 27 attenuated, 26 repression, 26 decreases, 26 down-regulation, 25 diminished, 25 downregulated, 25 suppresses, 22 interfere, 21 absence, 21 repress ……
15
Why Is It Hard?
Same word denotes different events
E.g., appearance
“in the nucleus” → Localization
“mRNA” → Transcription
“IL-2 activity” → Positive-regulation
……
16
Participants
17
Top System: UTurku
Adopts the pipeline architecture
  First, determines event candidates and types
  Then, classifies for each pair of candidates whether the latter is a theme or cause
No way to feed information back to events given evidence of arguments
Decisions are made independently
18
Joint Inference for Bio-Event Extraction
Complex, nested structures naturally argue for joint inference
However, under-explored for this task
Previous best joint approach [Riedel et al. 2009] still lags UTurku by a large margin
20
Design Desiderata
Jointly predict events and arguments
Incorporate prior knowledge, e.g.,
  Each event has a theme
  Only regulation events can have a cause
Expand scope of joint inference to include individual dependency edges
21
Markov Logic [Domingos & Lowd 2009]
Syntax: Weighted first-order formulas
Semantics: Feature templates for Markov nets
A Markov Logic Network (MLN) is a set of pairs $(F_i, w_i)$ where
  $F_i$ is a formula in first-order logic
  $w_i$ is a real number

$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i N_i(x) \right)$$

where $Z$ is the normalization constant and $N_i(x)$ is the number of true groundings of $F_i$ in $x$
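To make the distribution concrete, here is a minimal sketch that evaluates P(x) = (1/Z) exp(sum_i w_i N_i(x)) by brute-force enumeration over a toy domain. Everything in it is an assumption made for illustration: the two constants, the predicates `Event` and `HasTheme`, the two formulas, and their weights are hypothetical and are not the formulas of the system described in this talk. Real MLN software such as Alchemy does not enumerate worlds like this.

```python
import itertools
import math

# Hypothetical toy domain: two words and two predicates over them.
WORDS = ["a", "b"]
ATOMS = [(pred, w) for pred in ("Event", "HasTheme") for w in WORDS]


def implies_theme(x, w):
    """Formula F1(w): Event(w) => HasTheme(w)."""
    return (not x[("Event", w)]) or x[("HasTheme", w)]


def is_event(x, w):
    """Formula F2(w): Event(w)."""
    return x[("Event", w)]


# (formula, weight) pairs; the weights are made up for this sketch.
FORMULAS = [(implies_theme, 2.0), (is_event, 0.5)]


def n_true(formula, x):
    """N_i(x): number of groundings of the formula that hold in world x."""
    return sum(1 for w in WORDS if formula(x, w))


def score(x):
    """Unnormalized weight exp(sum_i w_i * N_i(x))."""
    return math.exp(sum(wt * n_true(f, x) for f, wt in FORMULAS))


def worlds():
    """Enumerate every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))


Z = sum(score(x) for x in worlds())  # partition function


def prob(x):
    return score(x) / Z
```

In this toy setup, a world in which `Event(a)` holds without `HasTheme(a)` loses one true grounding of the first formula, so it is less probable by a factor of exp(2.0) than the same world with the theme present. Making a weight very large approximates a hard constraint.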
22
Markov Logic
Unifying framework for joint inference
A plethora of efficient algorithms available
Open-source implementation: Alchemy (alchemy.cs.washington.edu)
23
Input: Stanford Dependencies
[Figure: Stanford dependency tree of the example sentence. Nodes: involvement, activation, p70(S6)-kinase, up-regulation, IL-10, human, monocyte, gp41. Edge labels: prep_of, prep_in, prep_by, nn.]
Involvement of p70(S6)-kinase activation in IL-10 up-regulation in human monocyte by gp41 …
24
Joint Predictions
[Figure: Stanford dependency tree of the example sentence (nodes: involvement, activation, p70(S6)-kinase, up-regulation, IL-10, human, monocyte, gp41; edge labels: prep_of, prep_in, prep_by, nn). Each word is annotated with two questions: Trigger word? Event type?]
25
Joint Predictions
[Figure: Stanford dependency tree of the example sentence (edge labels: prep_of, prep_in, prep_by, nn). Each dependency edge is annotated with two questions: In theme path? In cause path?]
26
Why Individual Dependencies?
[Figure: dependency trees for three phrases]
… regulate IL-10 …: regulate -dobj-> IL-10
… regulate IL-10 protein …: regulate -dobj-> protein -nn-> IL-10
… regulate IL-8 and IL-10 …: regulate -dobj-> IL-8 -conj-> IL-10
27
Why Individual Dependencies?
[Figure: dependency trees for three phrases]
… regulate IL-10 …: regulate -dobj-> IL-10
… regulate IL-10 protein …: regulate -dobj-> protein -nn-> IL-10
… regulate IL-8 and IL-10 …: regulate -dobj-> IL-8 -conj-> IL-10
Beginning of theme paths
28
Why Individual Dependencies?
[Figure: dependency trees for three phrases]
… regulate IL-10 …: regulate -dobj-> IL-10
… regulate IL-10 protein …: regulate -dobj-> protein -nn-> IL-10
… regulate IL-8 and IL-10 …: regulate -dobj-> IL-8 -conj-> IL-10
Continuation of a path …
29
MLN For Bio-Event Extraction
Logistic regression
Hard constraints
Linguistically motivated joint formulas
30
Logistic Regression
Lexical evidence
  E.g.: “activation” probably refers to positive-regulation
Syntactic evidence
  E.g.: “nsubj” probably leads to a cause
Lexical-syntactic evidence
  E.g.: “nsubj” from “binds” probably leads to a theme
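The three kinds of evidence can be pictured as feature templates feeding independent logistic-regression decisions. The sketch below is only an illustration under stated assumptions: the feature names, the weight tables, and every numeric weight are hypothetical, chosen so that the three slide examples come out as described. They are not learned values from the system.

```python
import math


def word_features(word):
    """Lexical evidence: the word itself."""
    return [f"word={word}"]


def edge_features(head_word, dep_label):
    """Syntactic evidence (the label) and lexical-syntactic evidence
    (the label conjoined with the head word)."""
    return [f"label={dep_label}", f"head={head_word}&label={dep_label}"]


# Hypothetical weights, one table per binary decision.
W_POSITIVE_REGULATION = {"word=activation": 1.5}
W_CAUSE = {"label=nsubj": 1.0, "head=binds&label=nsubj": -2.5}
W_THEME = {"head=binds&label=nsubj": 2.0}


def probability(features, weights, bias=0.0):
    """Logistic regression: sigmoid of the summed weights of active features."""
    z = bias + sum(weights.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-z))


# "nsubj" alone suggests a cause ...
p_cause_generic = probability(edge_features("increases", "nsubj"), W_CAUSE)
# ... but "nsubj" from "binds" is pushed toward theme by the conjoined feature.
p_cause_binds = probability(edge_features("binds", "nsubj"), W_CAUSE)
p_theme_binds = probability(edge_features("binds", "nsubj"), W_THEME)
p_posreg = probability(word_features("activation"), W_POSITIVE_REGULATION)
```

Each decision here is scored on its own, which is the baseline that the hard constraints and joint formulas are then added to.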
31
Hard Constraints
Events
  E.g.: Event must have a theme
Argument paths
  E.g.: If edge s → t is in a theme path, then either s is an event or there is some p → s in the theme path
Decisions about events and argument edges are interdependent
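The two example constraints can be checked directly on a candidate joint assignment. This is a minimal sketch, assuming an assignment is given as a set of event words plus a set of directed dependency edges marked as lying on a theme path; the function name `violations` and this encoding are hypothetical, and "has a theme" is modeled here as "has an outgoing theme-path edge".

```python
def violations(events, theme_edges):
    """Return human-readable descriptions of violated hard constraints.

    events      -- set of words predicted to be event triggers
    theme_edges -- set of (source, target) dependency edges predicted
                   to lie on a theme path
    """
    problems = []

    # Constraint 1: every event must have a theme, modeled as at least
    # one theme-path edge leaving the event word.
    for e in sorted(events):
        if not any(s == e for s, _ in theme_edges):
            problems.append(f"event {e!r} has no theme")

    # Constraint 2: an edge s -> t on a theme path must either start at
    # an event or continue some edge p -> s that is also on the path.
    for s, t in sorted(theme_edges):
        starts_at_event = s in events
        continues_path = any(t2 == s for _, t2 in theme_edges)
        if not (starts_at_event or continues_path):
            problems.append(
                f"edge {s!r}->{t!r} neither starts at an event "
                f"nor continues a path"
            )
    return problems


# "... regulate IL-10 protein ...": regulate -dobj-> protein -nn-> IL-10
edges = {("regulate", "protein"), ("protein", "IL-10")}
consistent = violations({"regulate"}, edges)   # both constraints hold
dangling = violations(set(), edges)            # path without an event
themeless = violations({"regulate"}, set())    # event without a theme
```

Because each constraint mentions both event and edge variables, changing one decision can force another to change, which is why the decisions cannot be made independently.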
32
Linguistically-Motivated Joint Formulas
Syntactic alternations, e.g.:
  A increases the level of B
  The level of B increases
Add context-specific formula
  E.g., if “increases” signifies an event, and it has both nsubj and dobj dependencies, then nsubj probably leads to a cause
33
Correct Syntactic Error with Semantic Information
Coordination: expression of IL-8 and IL-10
[Figure: two alternative dependency trees for this phrase over the nodes expression, IL-8, and IL-10, with edges labeled prep_of and conj]
34
Correct Syntactic Error with Semantic Information
PP-attachment: involvement of IL-8 in IL-10 regulation
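[Figure: two alternative dependency trees for this phrase over the nodes involvement, IL-8, regulation, and IL-10, with edges labeled prep_of, prep_in, and nn; the trees differ in where the prep_in edge attaches]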
36
Dataset
BioNLP-09 Shared Task (PubMed abstracts)
  Training: 800
  Development: 150
  Test: 260
Main evaluation criteria for the task
  Event-level recall, precision, F1
  Account for nested event structures
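As a rough illustration of event-level scoring, the sketch below computes precision, recall, and F1 over sets of events, where each event is a nested tuple so that an event counts as correct only if its arguments, including any nested events, also match. This strict-equality comparison is a simplifying assumption: the shared task defines its own matching criteria, which are not reproduced here, and the tuple encoding and the name `prf1` are hypothetical.

```python
def prf1(gold, predicted):
    """Precision, recall, and F1 over two sets of hashable events."""
    tp = len(gold & predicted)  # events present in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# An event is encoded as (type, trigger, ((role, argument), ...)),
# where an argument is a protein name or another event tuple.
inner = ("Positive_regulation", "activation", (("Theme", "p70(S6)-kinase"),))
outer_gold = ("Regulation", "Involvement", (("Theme", inner), ("Cause", "gp41")))
outer_pred = ("Regulation", "Involvement", (("Theme", inner),))  # cause missed

gold = {inner, outer_gold}
predicted = {inner, outer_pred}

precision, recall, f1 = prf1(gold, predicted)
```

In this example the inner event matches but the outer one does not, because its Cause argument is missing, so one of two predictions is correct and one of two gold events is found.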
37
Experiment Objectives
Relative contributions of feature components
Identify the bottlenecks for performance
Comparison with state-of-the-art systems
38
Results: Development Set
[Bar chart: F1 on the development set (axis ticks 25 to 55). Bar: LR]
39
Results: Development Set
[Bar chart: F1 on the development set (axis ticks 25 to 55). Bars: LR, LR+HARD]
Add hard joint inference formulas (callout value: 26)
40
Results: Development Set
[Bar chart: F1 on the development set (axis ticks 25 to 55). Bars: LR, LR+HARD, FULL]
Add soft joint inference formulas (callout value: 2)
41
Results: Development Set
[Bar chart: F1 on the development set (axis ticks 25 to 55). Bars: LR, LR+HARD, FULL, NO-SYN-FIX]
Without fixing syntactic errors (callout value: 4)
42
Results: Development Set
[Bar chart: F1 on the development set (axis ticks 25 to 55). Bars: LR, LR+HARD, FULL, NO-SYN-FIX, UTurku]
43
Per-Type Performance
| Event type | Event F1 |
| --- | --- |
| Catabolism | 92 |
| Phosphorylation | 87 |
| Expression | 77 |
| Localization | 75 |
| Transcription | 71 |
| Binding | 48 |
| Negative-Reg. | 46 |
| Positive-Reg. | 46 |
| Regulation | 37 |
44
Per-Type Performance
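| Event type | Event F1 | Trigger-Word F1 |
| --- | --- | --- |
| Catabolism | 92 | 91 |
| Phosphorylation | 87 | 90 |
| Expression | 77 | 80 |
| Localization | 75 | 73 |
| Transcription | 71 | 70 |
| Binding | 48 | 71 |
| Negative-Reg. | 46 | 64 |
| Positive-Reg. | 46 | 68 |
| Regulation | 37 | 51 |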
45
Results: Test Set
[Bar chart: F1 on the test set (axis ticks 25 to 55). Bars: UTurku, JULIELab, ConcordU, Riedel et al., Our MLN]
Reduces F1 error by over 10% compared to the previous best joint approach
46
Future Work
Incorporate more features
More joint inference opportunities
Leverage discourse (e.g., coreference)
Joint syntactic / semantic processing
47
Conclusion
First joint approach for bio-event extraction with state-of-the-art results
Based on Markov Logic
Novel formulation with expanded joint inference
Correcting syntactic errors with semantic information helps