Topic Mapping Tools for Biomedical Corpora
Gully APC Burns, USC/ISI
Dave Newman, UC Irvine
Bruce Herr, IU
‘Snapshots of Neuroscience’
Society for Neuroscience Annual meeting (2000 New Orleans)
~30,000 attendees, ~12,000 posters per year
Basic Idea: Topic Modeling
Erythropoietin (Epo), a hematopoietic cytokine, has recently been demonstrated to provide neuroprotection on nigral dopaminergic neurons. However, there is no information available about whether Epo can protect dopaminergic neurons from the neurotoxicity of 6-hydroxydopamine (6-OHDA) that is most commonly used to create a rat model of Parkinson’s disease (PD). In the present study, we tested the hypothesis that recombinant human Epo (rhEpo) would protect dopaminergic neurons and improve neurobehavioral outcomes in a rat model of progressive PD. rhEpo (20 units in 2μl of vehicle) was stereotaxically injected into one side of the striatum. The 6-OHDA lesion was made into the same side one day after rhEpo treatment. Methamphetamine-induced rotation was measured 3 and 10 weeks after the lesion, and paw reaching was also tested at 10 weeks. After the last time of behavioral test, rats were then sacrificed, and the brains were perfusion-fixed for histology and immunocytochemistry. We observed that intrastriatal administration of rhEpo significantly reduced the degree of rotational asymmetries. The rhEpo-treated animals also showed a better improvement in skilled forelimb use when compared with the control rats. In accompanying with the recovery of neurobehavioral outcomes, tyrosine hydroxylase (TH)-immunoreactive neurons of the substantia nigra were protected from progressive degeneration in the rhEpo-treated rats. TH-immunoreactivity in the 6-OHDA lesioned striatum also significantly increased in the rhEpo-treated rats. To examine if systemic administration of rhEpo could exert the similar biological effects …
... plus all remaining ‘topic mass’ – provides a signature from which we can calculate document-document similarities (~12,000 x ~12,000 matrix)
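The signature idea above can be sketched in a few lines. This is only an illustration, assuming each document is represented by a vector of topic proportions and that cosine similarity is the comparison measure; the slides do not state which measure was used, and the function names and toy vectors are hypothetical.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero signature is treated as similar to nothing
    return dot / (norm_p * norm_q)

def similarity_matrix(doc_topics):
    """Symmetric document-document similarity matrix."""
    n = len(doc_topics)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = cosine_similarity(doc_topics[i], doc_topics[j])
            sim[i][j] = s
            sim[j][i] = s
    return sim

# Hypothetical signatures for three documents over three topics.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
sim = similarity_matrix(doc_topics)
```

With ~12,000 documents the full matrix has about 144 million entries, so a real pipeline would probably store only the strongest similarities per document; that choice is an assumption here, not something the slides specify.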
‘Topic Mapping’ Workflow
Example topic (top words): ischemia, cerebral, ischemic, stroke, brain, occlusion, injury, infarct, mcao, hour, reperfusion, artery, volume, model, middle, transient
1. Literature Corpus
2. Topic Modeling using Gibbs Sampling
3. Topic Model
4. Document-Document Similarity Map
5. Graph Layout Processing with VxOrd / DrL
6. Multi-level image rendering, cluster analysis for label placement
7. Google Maps Application
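The "Topic Modeling using Gibbs Sampling" stage of this workflow can be sketched as a collapsed Gibbs sampler for an LDA-style topic model. This is a toy sketch, not the code used for the SfN maps: the hyperparameters `alpha` and `beta`, the function name `gibbs_lda`, and the six-word vocabulary are all assumptions made for illustration.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for an LDA-style topic model.

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts).
    """
    rng = random.Random(seed)
    ndk = [[0] * num_topics for _ in docs]               # document-topic counts
    nkw = [[0] * vocab_size for _ in range(num_topics)]  # topic-word counts
    nk = [0] * num_topics                                # tokens per topic
    z = []                                               # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(assignments)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1
    return ndk, nkw

# Hypothetical vocabulary and corpus (word ids index into vocab).
vocab = ["ischemia", "stroke", "infarct", "dopamine", "striatum", "parkinson"]
docs = [[0, 1, 2, 1, 0], [3, 4, 5, 3], [0, 2, 1], [5, 4, 3, 5]]
doc_topic_counts, topic_word_counts = gibbs_lda(docs, num_topics=2, vocab_size=len(vocab))
```

Normalizing each row of `doc_topic_counts` (after adding `alpha`) gives a per-document topic signature of the kind used for the similarity map.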
Implementation 1: SfN 2006 Maps @ SfN 2007
Analysis: Dave Newman, UCI
Visualization: Bruce Herr, IU
Lessons Learned
This demonstration had a high impact at SfN 2007
[Shown to Neuroinformatics Committee (NIC), PubMed Plus Panel, Program Committee, General Council]
Why?
1. System emphasizes elegant visualization
2. Application has natural, familiar, intuitive design
3. Criticisms centered on concerns about analysis validity (‘what do clusters actually mean?’) ...but, system focused on utility, not interpretations...
Next Steps
Gary Westbrook [NIC, ex-editor of J Neurosci, external committee of National Institute of Neurological Disorders and Stroke, NINDS]
Edmund Talley [Program Director NINDS, Channels Synapses and Circuits]
Requested a system to examine NINDS grants accessed from CRISP
CRISP: Computer Retrieval of Information on Scientific Projects
Lists all funded DHHS projects from 1972 [including data from NIH, CDC, FDA, HRSA and AHRQ]
Build topic map of NINDS 2006 grants in relation to 13 other NIH institutes involved with funding Neuroscience research.
[Largest Institute: NCI ~ 9373 grants (2006)]
[Smallest Institute: NIAAA ~ 1198 grants (2006)]
Downloaded 10 years of abstracts from NINDS (to weight the distribution in favor of NINDS topics) and 1 year of abstracts from each of the 13 other institutes.
NINDS staff hand-annotated ~2500 grants with SfN categories (theme, sub-theme, topic) to compare with categories generated by the topic model.
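One simple way to compare hand-assigned categories with topic-model output is a purity score: assign each grant to its dominant topic, then ask how often grants sharing a topic also share a hand-assigned label. The slides do not say how the comparison was actually done, so this is a sketch under that assumption; the function names, toy proportions, and "Theme A"/"Theme B" labels are hypothetical.

```python
from collections import Counter, defaultdict

def dominant_topic(topic_proportions):
    """Index of the topic with the largest share in one document."""
    return max(range(len(topic_proportions)), key=lambda k: topic_proportions[k])

def topic_purity(doc_topics, labels):
    """Fraction of documents whose label is the majority label of their dominant topic."""
    if not labels:
        return 0.0, {}
    by_topic = defaultdict(Counter)
    for proportions, label in zip(doc_topics, labels):
        by_topic[dominant_topic(proportions)][label] += 1
    matched = sum(counts.most_common(1)[0][1] for counts in by_topic.values())
    return matched / len(labels), dict(by_topic)

# Hypothetical topic proportions and hand-assigned themes for five grants.
doc_topics = [[0.8, 0.2], [0.7, 0.3], [0.6, 0.4], [0.1, 0.9], [0.2, 0.8]]
labels = ["Theme A", "Theme A", "Theme B", "Theme B", "Theme B"]
purity, table = topic_purity(doc_topics, labels)
```

The returned table is a topic-by-label contingency count, which can also be inspected directly to see which hand-assigned themes a given topic mixes together.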
Additional Features for this implementation
- Improved navigability
- Multiple maps
- Multiple labeling / coloring schemes
- Search
  - Google Map-based flags, etc.
  - full-text search within the HTML application
Implementation 2: NINDS + NIH Maps for 2006
What’s Next?
- All 2007 abstracts from NIH (all institutes)
- Diagnostic functions within browser
  - ‘Heat maps’ of each individual topic
  - ‘Cluster Expansion’
- Trend analysis: Which topics are emergent? Which are in decline?
- Can we perform analysis across corpora?
  - SfN abstracts from 2001-2008
  - Medline (>8 million abstracts)
  - CRISP (funded federal project abstracts)
  - PubMed Central (~1 million full text papers)
  - Other full-text resources
‘Cluster Expansion’
Data across many years allows trend analysis
Medline Data [figure: trends for PD, HIV, p53]
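A minimal way to ask which topics are emergent and which are in decline is to fit a straight line to each topic's yearly share of the corpus and look at the sign of the slope. The sketch below assumes ordinary least squares and a hypothetical tolerance `tol`; the topic names and yearly shares are invented for illustration and are not taken from Medline.

```python
def slope(years, values):
    """Ordinary least-squares slope of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

def classify_trends(years, topic_share_by_year, tol=1e-3):
    """Label each topic 'emergent', 'declining', or 'flat' by its slope."""
    trends = {}
    for topic, shares in topic_share_by_year.items():
        s = slope(years, shares)
        if s > tol:
            trends[topic] = "emergent"
        elif s < -tol:
            trends[topic] = "declining"
        else:
            trends[topic] = "flat"
    return trends

# Hypothetical yearly shares of the corpus for three topics.
years = [2001, 2002, 2003, 2004]
topic_share_by_year = {
    "topic A": [0.10, 0.12, 0.14, 0.16],
    "topic B": [0.20, 0.18, 0.16, 0.14],
    "topic C": [0.10, 0.10, 0.10, 0.10],
}
trends = classify_trends(years, topic_share_by_year)
```

A linear fit is the simplest possible choice; it needs at least two distinct years and will miss topics that rise and then fall within the window.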
Full-text Biomedical Articles
Source                                 Size (# articles, millions)   Type
Medline                                15.8                          Citations
Elsevier’s ScienceDirect               6.75                          Articles
PubMed Central                         0.97                          Articles
Cambridge Journals                     0.18                          Articles
JSTOR                                  1.62                          Articles
SpringerLink (Biomedical / Medical)    1.32 (0.72 / 0.60)            Articles
Wiley Interscience                     1.50                          Articles
Acknowledgements
Funding
- Information Sciences Institute, seed funding
- NSF: IIS-0513650
- NINDS contracts (Ned Talley)
Collaborators
- Dave Newman (UCI)
- Bruce Herr (IU)
Developers
- Tommy Ingulfsen
Contributing Computer Scientists
- Padhraic Smyth (UCI)
- Katy Borner (IU)
- Patrick Pantel (ISI/Yahoo!)