![Page 1: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/1.jpg)
Lars Juhl Jensen
Text mining
>10 km
![Page 2: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/2.jpg)
exponential growth
![Page 3: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/3.jpg)
![Page 4: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/4.jpg)
![Page 5: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/5.jpg)
~45 seconds per paper
![Page 6: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/6.jpg)
corpus
![Page 7: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/7.jpg)
most use abstracts
![Page 8: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/8.jpg)
few use full-text articles
![Page 9: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/9.jpg)
no access
![Page 10: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/10.jpg)
![Page 11: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/11.jpg)
information retrieval
![Page 12: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/12.jpg)
find the relevant papers
![Page 13: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/13.jpg)
ad hoc retrieval
![Page 14: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/14.jpg)
user-specified query
![Page 15: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/15.jpg)
“yeast AND cell cycle”
![Page 16: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/16.jpg)
PubMed
![Page 17: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/17.jpg)
![Page 18: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/18.jpg)
indexing
![Page 19: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/19.jpg)
fast lookup
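A minimal sketch of how indexing gives fast lookup, assuming a toy in-memory corpus. The function names `build_index` and `search_and` and the three example texts are made up for illustration; this is not how PubMed is implemented. The phrase "cell cycle" is treated as two separate AND terms here.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_and(index, terms):
    """Return ids of documents containing every term (Boolean AND)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

# Toy corpus; the texts are invented for this example.
docs = {
    1: "yeast cell cycle regulation",
    2: "human cell cycle checkpoints",
    3: "yeast metabolism",
}
index = build_index(docs)
print(sorted(search_and(index, ["yeast", "cell", "cycle"])))  # [1]
```

Each query term costs one dictionary lookup plus a set intersection, so the documents themselves are never rescanned at query time.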
![Page 20: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/20.jpg)
stemming
![Page 21: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/21.jpg)
word endings
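A rough sketch of stemming by stripping word endings. The suffix list and the three-character minimum are assumptions chosen for this example; real stemmers (for instance the Porter algorithm) use far more careful rules.

```python
def naive_stem(word, suffixes=("ation", "ing", "ed", "es", "s")):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["phosphorylated", "phosphorylates", "phosphorylating", "cyclins"]:
    print(w, "->", naive_stem(w))
```

The three verb forms all reduce to `phosphorylat`, so a query for one form can retrieve documents that use the others.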
![Page 22: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/22.jpg)
dynamic query expansion
![Page 23: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/23.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 24: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/24.jpg)
no tool will find that
![Page 25: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/25.jpg)
still too much to read
![Page 26: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/26.jpg)
computer
![Page 27: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/27.jpg)
as smart as a dog
![Page 28: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/28.jpg)
teach it specific tricks
![Page 29: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/29.jpg)
![Page 30: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/30.jpg)
![Page 31: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/31.jpg)
named entity recognition
![Page 32: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/32.jpg)
identify the concepts
![Page 33: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/33.jpg)
comprehensive lexicon
![Page 34: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/34.jpg)
small molecules
![Page 35: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/35.jpg)
proteins
![Page 36: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/36.jpg)
cellular components
![Page 37: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/37.jpg)
tissues
![Page 38: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/38.jpg)
diseases
![Page 39: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/39.jpg)
environments
![Page 40: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/40.jpg)
organisms
![Page 41: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/41.jpg)
orthographic expansion
![Page 42: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/42.jpg)
prefixes and postfixes
![Page 43: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/43.jpg)
Cdc28 vs. Cdc28p
![Page 44: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/44.jpg)
singular and plural forms
![Page 45: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/45.jpg)
flexible matching
![Page 46: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/46.jpg)
upper- and lower-case
![Page 47: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/47.jpg)
spaces and hyphens
![Page 48: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/48.jpg)
“black list”
![Page 49: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/49.jpg)
SDS
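A hedged sketch of dictionary-based named entity recognition that combines the ideas above: a lexicon, a "p" postfix, flexible matching of case, spaces and hyphens, and a block list. The tiny `LEXICON`, the `BLOCK_LIST`, and the helper names are hypothetical; SDS is assumed to be blocked because the abbreviation more often refers to the detergent than to a gene or protein.

```python
import re

# Hypothetical mini-lexicon: name -> entity type.
LEXICON = {"Clb2": "protein", "Cdc28": "protein", "Cdc5": "protein",
           "Swe1": "protein", "SDS": "protein"}
# Names assumed too ambiguous to tag.
BLOCK_LIST = {"sds"}

def name_pattern(name):
    """Regex for a name: any case, optional space or hyphen between letters
    and digits, optional 'p' postfix (Cdc28 vs. Cdc28p)."""
    parts = re.findall(r"[A-Za-z]+|\d+", name)
    core = r"[\s-]?".join(re.escape(p) for p in parts)
    return re.compile(r"\b" + core + r"p?\b", re.IGNORECASE)

def tag_entities(text):
    """Return sorted (start, matched text, lexicon name, type) tuples."""
    hits = []
    for name, etype in LEXICON.items():
        if name.lower() in BLOCK_LIST:
            continue  # skip block-listed names entirely
        for m in name_pattern(name).finditer(text):
            hits.append((m.start(), m.group(), name, etype))
    return sorted(hits)

sentence = ("Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly "
            "phosphorylated Swe1 and this modification served as a priming "
            "step to promote subsequent Cdc5-dependent Swe1 "
            "hyperphosphorylation and degradation")
print([hit[2] for hit in tag_entities(sentence)])
# ['Clb2', 'Cdc28', 'Swe1', 'Cdc5', 'Swe1']
```

Cdk1 is not tagged only because it is missing from the toy lexicon, which is why a comprehensive lexicon matters.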
![Page 50: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/50.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 51: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/51.jpg)
information extraction
![Page 52: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/52.jpg)
formalize the facts
![Page 53: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/53.jpg)
the starting point
![Page 54: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/54.jpg)
named entity recognition
![Page 55: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/55.jpg)
two approaches
![Page 56: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/56.jpg)
co-mentioning
![Page 57: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/57.jpg)
within documents
![Page 58: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/58.jpg)
within paragraphs
![Page 59: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/59.jpg)
within sentences
![Page 60: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/60.jpg)
weighted counts
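A minimal sketch of co-mention scoring with weighted counts, assuming that a pair earns more when it co-occurs in a tighter context. The weights, the naive substring matching, and the sentence splitter are all assumptions for illustration, not the scoring scheme of any particular tool.

```python
from collections import Counter
from itertools import combinations
import re

# Assumed weights: a tighter context contributes more.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}

def pairs_in(text, names):
    """All unordered pairs of names that occur in the text (substring match)."""
    found = sorted(n for n in names if n in text)
    return combinations(found, 2)

def comention_scores(documents, names):
    """Sum the weights over documents, paragraphs and sentences."""
    scores = Counter()
    for doc in documents:
        for pair in pairs_in(doc, names):
            scores[pair] += WEIGHTS["document"]
        for para in doc.split("\n\n"):
            for pair in pairs_in(para, names):
                scores[pair] += WEIGHTS["paragraph"]
            for sent in re.split(r"(?<=[.!?])\s+", para):
                for pair in pairs_in(sent, names):
                    scores[pair] += WEIGHTS["sentence"]
    return scores

doc = "Cdc28 phosphorylates Swe1. Cdc5 acts later.\n\nClb2 is a cyclin."
scores = comention_scores([doc], {"Cdc28", "Swe1", "Cdc5", "Clb2"})
print(scores[("Cdc28", "Swe1")])  # 7.0 (same sentence)
print(scores[("Cdc28", "Cdc5")])  # 3.0 (same paragraph only)
print(scores[("Cdc28", "Clb2")])  # 1.0 (same document only)
```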
![Page 61: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/61.jpg)
NLP (Natural Language Processing)
![Page 62: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/62.jpg)
part-of-speech tagging
![Page 63: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/63.jpg)
semantic tagging
![Page 64: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/64.jpg)
sentence parsing
![Page 65: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/65.jpg)
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
![Page 66: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/66.jpg)
handle negations
![Page 67: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/67.jpg)
high precision
![Page 68: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/68.jpg)
poor recall
![Page 69: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/69.jpg)
highly domain specific
![Page 70: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/70.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 71: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/71.jpg)
text/data integration
![Page 72: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/72.jpg)
augmented browsing
![Page 73: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/73.jpg)
Reflect
![Page 74: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/74.jpg)
show relevant information
![Page 75: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/75.jpg)
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
O’Donoghue et al., Journal of Web Semantics, 2010
![Page 76: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/76.jpg)
guilt by association
![Page 77: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/77.jpg)
![Page 78: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/78.jpg)
heterogeneous evidence
![Page 79: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/79.jpg)
knowledge
![Page 80: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/80.jpg)
experiments
![Page 81: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/81.jpg)
text mining
![Page 82: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/82.jpg)
predictions
![Page 83: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/83.jpg)
common identifiers
![Page 84: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/84.jpg)
quality scores
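One possible way to merge quality scores from heterogeneous evidence, sketched under the assumption that each score is a probability and that the evidence channels are independent (a "noisy-OR" combination). The function name and the example scores are invented; the deck does not state which formula the resources it names actually use.

```python
from math import prod

def combine_scores(scores):
    """Noisy-OR: probability that at least one evidence channel is correct,
    assuming the channels are independent."""
    return 1.0 - prod(1.0 - s for s in scores)

# Made-up scores for one association, keyed by evidence type.
evidence = {"knowledge": 0.9, "experiments": 0.6,
            "text mining": 0.5, "predictions": 0.2}
print(round(combine_scores(evidence.values()), 3))  # 0.984
```

The combined score is never lower than the strongest single channel, and several weak channels can add up to a confident association.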
![Page 85: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/85.jpg)
web interface
![Page 86: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/86.jpg)
STRING
![Page 87: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/87.jpg)
proteins
![Page 88: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/88.jpg)
Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011
![Page 89: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/89.jpg)
Frishman et al., Modern Genome Annotation, 2009
![Page 90: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/90.jpg)
STITCH
![Page 91: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/91.jpg)
small molecules
![Page 92: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/92.jpg)
Kuhn et al., Nucleic Acids Research, 2012
![Page 93: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/93.jpg)
COMPARTMENTS
![Page 94: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/94.jpg)
subcellular localization
![Page 95: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/95.jpg)
compartments.jensenlab.org
![Page 96: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/96.jpg)
TISSUES
![Page 97: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/97.jpg)
human tissue expression
![Page 98: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/98.jpg)
tissues.jensenlab.org
![Page 99: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/99.jpg)
DISEASES
![Page 100: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/100.jpg)
human diseases
![Page 101: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/101.jpg)
![Page 102: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/102.jpg)
evidence viewers
![Page 103: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/103.jpg)
![Page 104: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/104.jpg)
web services
![Page 105: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/105.jpg)
bulk download
![Page 106: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/106.jpg)
summary
![Page 107: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/107.jpg)
text mining
![Page 108: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/108.jpg)
simpler
![Page 109: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/109.jpg)
more useful
![Page 110: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/110.jpg)
less boring
![Page 111: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/111.jpg)
thank you!
![Page 112: Text mining](https://reader034.vdocuments.us/reader034/viewer/2022052505/554e8ad6b4c905fc368b48be/html5/thumbnails/112.jpg)
questions?