Sam Stewart - Knowledge Linkages
TRANSCRIPT
Knowledge Linkages: Augmenting Online Clinical Care Discussions
with Published Literature
Sam Stewart, Syed Sibte Raza Abidi
NICHE Research Group, Faculty of Computer Science
Dalhousie University, Halifax, Canada
November 30, 2010
Sam Stewart (Dal) Knowledge Linkages November 30, 2010 1 / 35
Outline
Introduction
Problem Description
Knowledge Linkage Framework
Preliminary Results
Conclusion
Introduction
Pediatric pain management is a complex subject.
  - Children lack the cognitive ability to properly express their pain, which can lead to incorrect interventions.
Lack of specialized knowledge or training in pediatric pain management.
  - Because of the temporal and physical restrictions that clinicians face, traditional educational systems are not a plausible solution.
Web 2.0 technologies provide alternate knowledge dissemination mediums for clinicians to converge and share their knowledge.
Rationale for Knowledge Linkages
Pediatric Pain Mailing List (PPML)
  - Brings together over 700 pediatric pain practitioners from around the world to share their clinical experiences and seek advice
The knowledge shared on the PPML is practice-based rather than evidence-based.
It is important to augment the practice-based (tacit) knowledge on the PPML with explicit knowledge.
The goal of this project is to establish knowledge linkages between discussions on the PPML and published literature on PubMed.
Project Objectives
The objective of knowledge linkage is to reaffirm the practice-related recommendations on the PPML with evidence-based literature from PubMed.
The outcome of the project will allow users to
  - Search through PPML archives to find topics of interest
  - Retrieve research articles related to the topics from PubMed, through a "single-click" evidence retrieval strategy
Knowledge Linkage Framework
[Figure: framework diagram with the components PPML Archives, Message Parsing, Filtered Messages, Threading Algorithm, Threads, Mapping to MeSH, MeSH Threads, Information Retrieval, Papers, and Online Discussion Forum]
Project Framework
Step 1: Processing the Archives
The archives are stored in simple ASCII text files, organized by month, starting June 1993 and ending December 2008.
The messages are processed to extract the sender, date, subject line and content of the messages.
The messages are filtered to remove non-substantive content.
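The extraction step might look like the minimal sketch below. It assumes each archived message carries RFC 822-style headers (From, Date, Subject); the slides do not describe the actual archive layout, and `parse_message` and the sample message are hypothetical.

```python
from email import message_from_string


def parse_message(raw: str) -> dict:
    """Extract sender, date, subject line and content from one raw message."""
    msg = message_from_string(raw)
    return {
        "sender": msg.get("From", ""),
        "date": msg.get("Date", ""),
        "subject": msg.get("Subject", ""),
        # For a plain (non-multipart) message the payload is the body text.
        "content": msg.get_payload().strip(),
    }


# Hypothetical message, only to exercise the function.
raw = (
    "From: clinician@example.org\n"
    "Date: Mon, 28 Jun 1993 21:19:36 -0300\n"
    "Subject: Music Therapy\n"
    "\n"
    "Does anyone know of any published reports?\n"
)
parsed = parse_message(raw)
print(parsed["subject"])
```

Filtering out non-substantive content would be a separate pass over the `content` field; its rules are not given in the slides.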
Step 2: Threading
A thread is a series of messages centred around a common subject.
Threads are the embodiment of experiential knowledge on the PPML.
The messages are assigned to threads using their subject lines.
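Subject-line threading can be sketched as follows. The normalization rules (stripping "Re:"/"Fwd:" prefixes, ignoring case and extra whitespace) are assumptions for illustration; the slides do not specify the actual threading algorithm.

```python
import re
from collections import defaultdict

# Assumed reply/forward prefixes; the real rules are not given in the slides.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip reply/forward prefixes, lowercase, and collapse whitespace."""
    stripped = _PREFIX.sub("", subject)
    return " ".join(stripped.lower().split())


def assign_threads(messages):
    """Group messages (dicts with a 'subject' key) by normalized subject."""
    threads = defaultdict(list)
    for message in messages:
        threads[normalize_subject(message["subject"])].append(message)
    return dict(threads)


messages = [
    {"sender": 1, "subject": "Music Therapy"},
    {"sender": 2, "subject": "Re: Music Therapy"},
    {"sender": 3, "subject": "RE: music therapy "},
    {"sender": 4, "subject": "Sucrose analgesia"},
]
threads = assign_threads(messages)
print({subject: len(msgs) for subject, msgs in threads.items()})
```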
Step 3: Mapping to MeSH
The threads are parsed and connected to formal MeSH terms, using the Metamap program.
Metamap creates a mapping score, which is a measure of the strength of the connection.
Each MeSH-based thread is used to query PubMed.
Metamap: Mapping free text to MeSH
Metamap is a program developed by Dr. Alan Aronson at the NLM that maps biomedical text to the MeSH lexicon.
Each mapping is assigned a score that is a measure of the strength of the mapping.
$$\text{Score} = 1000 \times \frac{\text{Centrality} + \text{Variation} + 2 \times \text{Coverage} + 2 \times \text{Cohesiveness}}{6}$$
The scores provide a baseline measure of how well the mapped MeSH term represents the original term in the thread.
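As a quick check of the formula, the sketch below combines the four components for the mapping of "less pain medication" to Pain, using the component values from the worked example in the appendix. The function name is hypothetical, and truncating the result to an integer is an assumption, chosen because it reproduces the reported 966 for components summing to 5.8.

```python
from fractions import Fraction


def metamap_score(centrality, variation, coverage, cohesiveness):
    """Combine the four component scores (each in [0, 1]) as on the slide."""
    total = centrality + variation + 2 * coverage + 2 * cohesiveness
    # Truncation rather than rounding is an assumption (see lead-in).
    return int(1000 * total / 6)


# "less pain medication" -> Pain: centrality 0, variation 1,
# coverage 7/9, cohesiveness 19/27 (values from the appendix example).
pain = metamap_score(0, 1, Fraction(7, 9), Fraction(19, 27))
# "the babies" -> Infant: centrality 1, variation 4/5, coverage 1, cohesiveness 1.
infant = metamap_score(1, Fraction(4, 5), 1, 1)
print(pain, infant)  # 660 966
```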
Example
Sample Statement
"The report stated that when music therapy is used, the babies required less pain medication. Does anyone know of any published reports of empirical research demonstrating the effect?"

Source | MeSH Term | Score
music therapy | Music Therapy | 1000
the babies | Infant | 966
less pain medication | Pain | 660
less pain medication | Pharmaceutical Preparations | 827
published reports | Publishing | 694
empirical research | Empirical Research | 1000
Step 4: Literature Search Strategy
Passively links the threads to published medical literature.
Naive approach: Retrieve all papers that contain every MeSH term in the thread. If no papers exist, the algorithm would drop the lowest-scoring term and reiterate.
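The naive approach can be sketched as the loop below. The `search` callable stands in for a PubMed query and is hypothetical, as is the toy index; scores are taken as positive with higher meaning stronger, as in the earlier example table.

```python
def naive_search(terms, search):
    """terms: list of (mesh_term, score) pairs.
    search: callable that takes a list of MeSH terms and returns the
    papers containing all of them (a stand-in for a PubMed query)."""
    # Highest-scoring terms first, so the weakest mapping is dropped first.
    remaining = sorted(terms, key=lambda t: t[1], reverse=True)
    while remaining:
        papers = search([term for term, _ in remaining])
        if papers:
            return papers
        remaining.pop()  # drop the lowest-scoring term and try again
    return []


# Toy index of paper id -> MeSH terms, purely illustrative.
index = {
    "p1": {"Music Therapy", "Infant"},
    "p2": {"Music Therapy", "Pain"},
}


def toy_search(mesh_terms):
    return [pid for pid, tags in index.items() if set(mesh_terms) <= tags]


terms = [("Music Therapy", 1000), ("Infant", 966), ("Publishing", 694)]
result = naive_search(terms, toy_search)
print(result)  # ['p1']
```

Note that the returned list carries no ranking, and a wrongly mapped term with a high score is never dropped.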
Naive Approach
The naive approach has several problems
  - It doesn’t provide any kind of ordering on the resulting papers
  - It doesn’t fully utilize the MeSH scores
  - It doesn’t take into account the possibility of incorrect mappings
One of the challenges of mapping free text with Metamap is its inaccuracy.
The presence of a false MeSH term with a high MeSH score will prevent the retrieval of useful papers.
Improved Search Strategy
Our improved search strategy makes full use of the Metamap scores.
It also addresses the problem of incorrect mappings.
It is based on the Extended Boolean Information Retrieval (eBIR) algorithm
  - Customizes the algorithm to deal with pediatric pain by adding a specialized filter
Let $(M_i, m_i)$ be MeSH term $i$ and its associated Metamap score.
$$Q = [\text{Infant OR Child OR Adolescent}] \text{ AND } [(M_1, m_1) \text{ OR}_p (M_2, m_2) \text{ OR}_p \dots \text{ OR}_p (M_n, m_n)] \quad (1)$$
Step 5: Discussion Forum
An online forum is being developed that allows practitioners to interact with the PPML discussions and review the research articles for a specific discussion thread.
The forum will be navigated by a standard search function, or by a search function based on MeSH terms.
The threads will also be organized into a hierarchy based on their MeSH terms.
Results
Example
This first example is the first thread ever transmitted on the PPML.
It is on the subject of Music Therapy.
The following slides show the discussion, then the list of MeSH terms mapped, then a sampling of the papers returned.
Music Therapy I
Sender: 1
Subject: Music Therapy
Date: Mon Jun 28 21:19:36 ADT 1993
Thread: 3, false
The last several days, the local NBC station aired a "medical report" about the use of music therapy. The report was from Miami and included a short report on the use of music therapy in a NICU. The report stated that when music therapy was used, the babies required less pain medication. Does anyone know of any published reports of empirical research demonstrating this effect?
Music Therapy II
Message
Sender: 2
Subject: Music Therapy
Date: Tue Jun 29 08:25:12 ADT 1993
Thread: 3, true
I would suggest that you might contact **** ******** in Pediatrics at Washington University Medical School. Her research is on neonatal pain and she might know where the local station picked up the report. I haven’t seen any data on the topic.
Music Therapy III
Message
Sender: 3
Subject: Music Therapy
Date: Tue Jun 29 10:20:41 ADT 1993
Thread: 3, true
I’m not aware of specific studies conducted using music therapy to reduce the need for pain medication (i.e., music therapy to manage pain). However, several cognitive interventions have been used quite effectively to manage pain. Donald Meichenbaum developed a technique in the early 1970s called stress inoculation training which combines aspects of self-instruction training and relaxation training.
Music Therapy MeSH Terms
MeSH | Score | Papers
Music Therapy | -4802 | 1621
Pain | -4215 | 237890
Research | -2000 | 254813
Pharmaceutical Preparations | -1688 | 438290
Infant, Newborn | -1660 | 422011
Education | -1654 | 492705
Teaching | -1320 | 52938
Vaccination | -1320 | 44092
Intensive Care Units, Neonatal | -1000 | 6847
Pediatrics | -1000 | 34612
Empirical Research | -1000 | 8897
Relaxation Therapy | -1000 | 5677
Vision, Ocular | -966 | 18516
Behavior | -966 | 879624
Infant | -966 | 795209
Air | -966 | 16342
Awareness | -861 | 9711
Schools, Medical | -861 | 17436
Biomedical Research | -827 | 28157
Publishing | -694 | 27222
Cognition | -589 | 76048
Music Therapy Papers Returned
Bo LK, Callaghan P. Soothing pain-elicited distress in Chinese neonates. Pediatrics: 2000, 105(4). 10742370.
Cignacco E, Hamers JP, Stoffel L, van Lingen RA, Gessler P, McDougall J, Nelle M. The efficacy of non-pharmacological interventions in the management of procedural pain in preterm and term neonates. A systematic literature review. European journal of pain (London, England): 2006, 11(2). 16580851.
Kemper KJ, Danhauer SC. Music as therapy. Southern medical journal: 2005, 98(3). 15813154.
Tagore T. Why music matters in childbirth. Midwifery today with international midwife: 2009, (89). 19397157.
Pilot Study
A pilot study was conducted on all messages from 2007 and 2008.
100 threads were reviewed to determine
1. The accuracy of the message parsing
2. The accuracy of the thread assignment
3. The accuracy of the papers returned
The message parsing was successful on 74% of the messages.
The threading was successful on 92% of the messages.
Recall, Precision, Utility
Precision and relative recall were compared between the modified search strategy, the eBIR model, and a traditional vector space model (VSM).
  - Relative recall is used to compare search strategies on unannotated databases
$$\text{Precision} = \frac{\text{Number of relevant papers returned by the search}}{\text{Total number of papers returned}}$$
$$\text{Relative recall} = \frac{\text{Number of relevant papers returned by the search}}{\text{Number of relevant papers returned by all searches}}$$
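The two measures can be computed as in the sketch below. The paper identifiers and relevance judgments are invented for illustration only, and the function names are hypothetical.

```python
def precision(returned, relevant):
    """Fraction of the returned papers that were judged relevant."""
    if not returned:
        return 0.0
    return sum(1 for paper in returned if paper in relevant) / len(returned)


def relative_recall(returned, relevant_pool):
    """relevant_pool: relevant papers returned by any of the compared searches."""
    if not relevant_pool:
        return 0.0
    return len(set(returned) & relevant_pool) / len(relevant_pool)


# Illustrative results for one thread from two strategies.
results = {"custom": ["a", "b", "c", "d"], "vsm": ["a", "e", "f", "g"]}
judged_relevant = {"a", "b", "e"}

# Pool the relevant papers found by all strategies together.
all_returned = set().union(*results.values())
pool = judged_relevant & all_returned

custom_precision = precision(results["custom"], judged_relevant)
custom_recall = relative_recall(results["custom"], pool)
print(custom_precision, custom_recall)
```

Because the denominator is the pooled set rather than every relevant paper in PubMed, relative recall only ranks the strategies against each other.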
Precision
[Figure: precision (y-axis, 0.00 to 0.20) against top k papers (x-axis, 2 to 14) for the Custom, VSM and eBIR strategies]
Recall
[Figure: relative recall (y-axis, 0.0 to 0.8) against top k papers (x-axis, 2 to 14) for the Custom, VSM and eBIR strategies]
Precision-Recall
[Figure: precision (y-axis, 0.08 to 0.18) against relative recall (x-axis, 0.2 to 0.8) for the Custom, VSM and eBIR strategies]
Search Strategy Results
The precision of the modified algorithm is significantly higher than that of the other two algorithms at k = 15 (p-values of 0.013 and 0.003 respectively).
The recall, however, is only significantly different between the modified and eBIR models (p < 0.0001) and not with the VSM algorithm (p = 0.351).
Ultimately, a search is "good" if it returns at least one pertinent result.
Utility at level k is an indicator of whether the search returns a relevant paper in the first k results.
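Utility at level k reduces to a simple indicator, sketched below. Averaging the indicator over threads to obtain a curve against k is an assumption about how the plotted values were produced; the sample data is invented.

```python
def utility_at_k(ranked_papers, relevant, k):
    """1 if any of the first k ranked papers is relevant, else 0."""
    return int(any(paper in relevant for paper in ranked_papers[:k]))


def mean_utility(searches, k):
    """searches: list of (ranked_papers, relevant_set), one per thread.
    Averaging over threads is an assumption, not stated in the slides."""
    return sum(utility_at_k(r, rel, k) for r, rel in searches) / len(searches)


# Illustrative ranked results for two threads.
searches = [
    (["a", "b", "c"], {"c"}),   # first relevant paper at rank 3
    (["d", "e", "f"], set()),   # no relevant paper at all
]
print(mean_utility(searches, 2), mean_utility(searches, 3))  # 0.0 0.5
```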
Utility vs. k
[Figure: utility (y-axis, 0.0 to 0.6) against top k papers (x-axis, 2 to 14) for the Custom, VSM and eBIR strategies]
Conclusion
Conclusion
The mapping of experiential to explicit clinical knowledge is critical, given the rapid changes in medical knowledge and its application in specialized domains.
Clinical experiences should be supported by clinical evidence, and this has been achieved through our Knowledge Linkage Framework.
We presented a method of leveraging Web 2.0 techniques by incorporating medical information retrieval strategies to improve the overall medical knowledge base.
Automatic query generation, using clinical terms and contexts, is a unique aspect of the research.
Future Work
The next step is to provide open access to a wide number of users and get their feedback.
More time should be spent looking into the variables within the eBIR algorithm and the modified algorithm.
$$Q = [\text{Infant OR}_{p_1} \text{ Child OR}_{p_1} \text{ Adolescent}] \text{ AND}_{p_2} [M_1 \text{ OR}_{p_3} M_2 \text{ OR}_{p_3} \dots \text{ OR}_{p_3} M_n]$$
Tweaking the Metamap scores, either within the Metamap system or through post-processing, should also be explored.
Acknowledgement
This work is carried out with the aid of a grant from the International Development Research Centre (IDRC), Ottawa, Canada.
Appendix Metamap
Metamap Algorithms
There are three general types of matches:
Simple match: a direct connection between the recognized noun and the UMLS term
Complex match: when a noun phrase can be mapped directly to a combination of UMLS semantic types
Partial match: when part of the noun/noun-phrase does not map to UMLS
The general mapping strategy is, for each term SPECIALIST recognizes: generate all variants of the noun-phrase, form the candidate set of all the UMLS strings that contain one of the variants, sort the candidate set by the strength of mapping, combine candidates for disjoint parts of the noun-phrase, then select the mapping with the best score.
Variants
The variants are all composite parts of a noun-phrase, along with all acronyms, abbreviations and synonyms of those terms, all variants of those variants, etc.
For the term "ocular" the following figure depicts the generation of the variants.
Metamap Scores
The scores lie in the range [-1000, 0], with lower scores being better.
The score is based on 4 metrics: centrality, variation, coverage and cohesiveness. The final score is calculated as:
$$\text{Score} = -1000 \times \frac{\text{Centrality} + \text{Variation} + 2 \times \text{Coverage} + 2 \times \text{Cohesiveness}}{6}$$
Metamap Scores I
Centrality: a 1/0 indicator of whether the match is to the head of the phrase
Variation: a measure of the distance of the matched term from the root word. The distance, $D$, is a sum of the following variations, and the score is calculated as $4/(D + 4)$:
  - spelling: 0
  - inflectional: 1
  - synonym/acronym/abbreviation: 2
  - derivational: 3
Metamap Scores II
Coverage: how much of both the UMLS string and the phrase are involved in the match. The number of words in each phrase is computed, as well as the span of each term, i.e., the length of the matching terms, ignoring non-matching terms. The score is calculated as
$$\text{Coverage} = \frac{2}{3} \cdot \frac{\text{UMLS span}}{\text{UMLS length}} + \frac{1}{3} \cdot \frac{\text{phrase span}}{\text{phrase length}}$$
Cohesiveness: like coverage, but focusing on connected terms. It calculates the length of connected components (the maximal sequence of connected words in both terms), and takes a weighted mean again, this time of the sum of squares (SS).
$$\text{Cohesiveness} = \frac{2}{3} \cdot \frac{\text{SS of UMLS connected components}}{(\text{UMLS length})^2} + \frac{1}{3} \cdot \frac{\text{SS of phrase connected components}}{(\text{phrase length})^2}$$
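The variation, coverage and cohesiveness components can be checked with the sketch below, which uses exact fractions. The function and parameter names are hypothetical; the input values for the mapping of "less pain medication" to Pain (a one-word UMLS string fully matched, one of three phrase words matched) follow the sample scores slide.

```python
from fractions import Fraction


def variation(distance):
    """Variation score 4 / (D + 4) for a total variant distance D."""
    return Fraction(4, distance + 4)


def coverage(umls_span, umls_length, phrase_span, phrase_length):
    """Weighted mean of the matched share of the UMLS string and the phrase."""
    return (Fraction(2, 3) * Fraction(umls_span, umls_length)
            + Fraction(1, 3) * Fraction(phrase_span, phrase_length))


def cohesiveness(umls_components, umls_length, phrase_components, phrase_length):
    """*_components: lengths of the connected components in each string."""
    umls_ss = sum(c * c for c in umls_components)      # sum of squares
    phrase_ss = sum(c * c for c in phrase_components)
    return (Fraction(2, 3) * Fraction(umls_ss, umls_length ** 2)
            + Fraction(1, 3) * Fraction(phrase_ss, phrase_length ** 2))


# "less pain medication" -> Pain
cov = coverage(1, 1, 1, 3)
coh = cohesiveness([1], 1, [1], 3)
print(variation(1), cov, coh)  # 4/5 7/9 19/27
```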
Sample Metamap Scores
From PPARCH.199603

Noun | UMLS | Cent. | Var. | Cov. | Coh.
aired | Air | 1 | D=1; 4/5 | 1 | 1
of music therapy | Music Therapy | 1 | 0; 1 | 1 | 1
a NICU | ICU, Neonatal | 1 | 0; 1 | 1 | 1
the babies | Infant | 1 | D=1; 4/5 | 1 | 1
less pain medication | Pain | 0 | 0; 1 | (2/3)(1/1) + (1/3)(1/3) | (2/3)(1/1) + (1/3)(1/9)
less pain medication | Pharm Prep | 1 | 0; 1 | (2/3)(2/2) + (1/3)(1/3) | (2/3)(2/2) + (1/3)(1/9)
of any published reports | Publishing | 0 | 0; 1 | (2/3)(1/1) + (1/3)(1/2) | (2/3)(1/1) + (1/3)(1/4)
empirical research | Empirical Research | 1 | 0; 1 | 1 | 1

Noun | UMLS | TOTAL
aired | Air | -1000 × (1 + 4/5 + 2(1) + 2(1))/6 = -966
of music therapy | Music Therapy | -1000
a NICU | ICU, Neonatal | -1000
the babies | Infant | -1000 × (1 + 4/5 + 2(1) + 2(1))/6 = -966
less pain medication | Pain | -1000 × (0 + 1 + 2(7/9) + 2(19/27))/6 = -660
less pain medication | Pharm Prep | -1000 × (1 + 1 + 2(7/9) + 2(19/27))/6 = -827
less pain medication | (mean of the two mappings) | (-660 - 827)/2 = -743.5
of any published reports | Publishing | -1000 × (0 + 1 + 2(10/12) + 2(36/48))/6 = -694
empirical research | Empirical Research | -1000
Extended Boolean Information Retrieval (eBIR)
The eBIR system incorporates query weights into the traditional BIR model.
Let the set of query terms be $A = \{(A_1, s_1), \dots, (A_n, s_n)\}$, where $A_i$ is the $i$th query term and $s_i$ is the associated score.
Let the OR and AND queries be
$$Q_{OR(p)} = \{(A_1, s_1) \text{ OR}_p \dots \text{ OR}_p (A_n, s_n)\}$$
$$Q_{AND(p)} = \{(A_1, s_1) \text{ AND}_p \dots \text{ AND}_p (A_n, s_n)\}$$
The selection of $p$ affects the influence of high-scoring terms on the returned query scores.
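The slides do not spell out how these operators are evaluated. The sketch below assumes the standard p-norm similarity functions of the extended Boolean model (Salton, Fox and Wu), with query weights and document term weights both scaled to [0, 1]; whether the project used exactly this form is an assumption.

```python
def or_p(weights, doc, p=1.0):
    """p-norm OR: ((sum a_i^p d_i^p) / (sum a_i^p)) ** (1/p)."""
    num = sum((a ** p) * (d ** p) for a, d in zip(weights, doc))
    den = sum(a ** p for a in weights)
    return (num / den) ** (1.0 / p)


def and_p(weights, doc, p=1.0):
    """p-norm AND: 1 - ((sum a_i^p (1 - d_i)^p) / (sum a_i^p)) ** (1/p)."""
    num = sum((a ** p) * ((1 - d) ** p) for a, d in zip(weights, doc))
    den = sum(a ** p for a in weights)
    return 1 - (num / den) ** (1.0 / p)


# Two query terms with weights 1.0 and 0.5; the paper carries only the first.
weights = [1.0, 0.5]
doc = [1, 0]
for p in (1, 2, 10):
    print(p, round(or_p(weights, doc, p), 3), round(and_p(weights, doc, p), 3))
```

With p = 1 both operators reduce to a weighted average of the matched terms; as p grows, OR_p approaches a strict Boolean OR and AND_p a strict Boolean AND.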
Modified IR Algorithm
The problem with applying the eBIR algorithm to this project is that it doesn’t address the issue of specialized domains.
MeSH keywords such as Pediatrics or Pain could be implicitly representative of all conversations on the list.
Our algorithm modifies the eBIR algorithm by adding a specialized filter
  - Adding an AND operator to the query
Modified IR Algorithm
The new query would modify the search query by adding Infant, Child or Adolescent to the set of MeSH terms, as demonstrated in equation (2)
  - Let $(M_i, m_i)$ be MeSH term $i$ and its associated Metamap score.
$$Q = [\text{Infant OR}_p \text{ Child OR}_p \text{ Adolescent}] \text{ AND}_p [(M_1, m_1) \text{ OR}_p (M_2, m_2) \text{ OR}_p \dots \text{ OR}_p (M_n, m_n)] \quad (2)$$
Using the eBIR algorithm, the next step would be to apply query weights to the terms in the specialized filter and then find a suitable value for $p$.
Modified IR Algorithm
In order to accommodate the filter, the eBIR algorithm was modified, making the AND operator a strict Boolean operator and leaving the query weights on the OR operator.
The decision was also made to set $p = 1$.
$$Q = [\text{Infant OR Child OR Adolescent}] \text{ AND } [(M_1, m_1) \text{ OR}_p (M_2, m_2) \text{ OR}_p \dots \text{ OR}_p (M_n, m_n)] \quad (3)$$
The result is a search strategy customized to the pediatric pain domain that makes full use of the Metamap scores to return a pertinent set of papers.
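A minimal sketch of how equation (3) could rank papers is given below. It assumes binary MeSH tagging of papers and the p-norm form of OR with p = 1, which reduces to the weighted share of matched query terms; the paper identifiers and their tags are invented, and the weights are absolute values of thread-level scores from the Music Therapy example.

```python
# Strict Boolean filter from equation (3).
AGE_FILTER = {"Infant", "Child", "Adolescent"}


def score_paper(paper_mesh, query):
    """query: list of (mesh_term, weight) pairs with positive weights."""
    if not AGE_FILTER & paper_mesh:
        return 0.0  # fails the strict AND, so the paper is excluded
    total = sum(weight for _, weight in query)
    matched = sum(weight for term, weight in query if term in paper_mesh)
    return matched / total  # weighted OR with p = 1


def rank(papers, query, k=15):
    """Return up to k paper ids with a non-zero score, best first."""
    scored = [(score_paper(mesh, query), pid) for pid, mesh in papers.items()]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:k]]


query = [("Music Therapy", 4802), ("Pain", 4215), ("Research", 2000)]
papers = {  # illustrative MeSH annotations
    "A": {"Infant", "Pain"},
    "B": {"Music Therapy", "Adult"},
    "C": {"Infant", "Pain", "Music Therapy"},
}
ranking = rank(papers, query)
print(ranking)  # ['C', 'A']
```

Paper B matches the highest-weighted term but is dropped by the strict filter, which is the behaviour the specialized filter is meant to enforce.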