Automated Summarisation for Evidence Based Medicine
Diego Molla
Centre for Language Technology, Macquarie University
HAIL, 22 March 2012
Evidence Based Medicine Our Corpus for Summarisation Applications
Contents
Evidence Based Medicine
Our Corpus for Summarisation
- Structure of our Corpus
- How we Created the Corpus
- Statistics
Applications
- Possible Uses
- Single-document Summarisation
- Evidence Grading
EBM Summarisation Diego Molla 2/60
About Us: Research Group on Natural Language Processing of Medical Texts
http://web.science.mq.edu.au/~diego/medicalnlp/
Active Members
Diego Molla Senior lecturer at Macquarie University.
Cecile Paris Senior principal research scientist at CSIRO ICT Centre.
Abeed Sarker PhD student at Macquarie University.
Sara Faisal Shash Masters student.
Past Members
María Elena Santiago-Martínez Research programmer.
Patrick Davis-Desmond Masters student.
Andreea Tutos Masters student.
Evidence Based Medicine
http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/
EBM and Natural Language Processing
http://hlwiki.slais.ubc.ca/index.php?title=Five_steps_of_EBM
PICO for Asking the Right Question
Where to search for external evidence?
1. Evidence-based Summaries (Systematic Reviews):
   - EBM Online (http://ebm.bmj.com).
   - UptoDate (http://www.uptodate.com).
   - The Cochrane Library (http://www.thecochranelibrary.com/).
   - ...
2. Search the Medical Literature:
   - E.g. PubMed (http://www.ncbi.nlm.nih.gov/pubmed/).
Searching Cochrane
Searching PubMed
Searching the Trip Database
Appraising the Evidence
The SORT Taxonomy
Level A: Consistent and good-quality patient-oriented evidence.
Level B: Inconsistent or limited-quality patient-oriented evidence.
Level C: Consensus, usual practice, opinion, disease-oriented evidence, or case series for studies of diagnosis, treatment, prevention, or screening.
Levels of Evidence
Study quality, by type of study (diagnosis; treatment / prevention / screening; prognosis):

Level 1: good-quality patient-oriented evidence
- Diagnosis: Validated clinical decision rule; SR/meta-analysis of high-quality studies; high-quality diagnostic cohort study
- Treatment / prevention / screening: SR/meta-analysis of RCTs with consistent findings; high-quality individual RCT; all-or-none study
- Prognosis: SR/meta-analysis of good-quality cohort studies; prospective cohort study with good follow-up

Level 2: limited-quality patient-oriented evidence
- Diagnosis: Unvalidated clinical decision rule; SR/meta-analysis of lower-quality studies or studies with inconsistent findings; lower-quality diagnostic cohort study or diagnostic case-control study
- Treatment / prevention / screening: SR/meta-analysis of lower-quality clinical trials or of studies with inconsistent findings; lower-quality clinical trial; cohort study; case-control study
- Prognosis: SR/meta-analysis of lower-quality cohort studies or with inconsistent results; retrospective cohort study or prospective cohort study with poor follow-up; case-control study; case series

Level 3: other evidence
- All study types: Consensus guidelines, extrapolations from bench research, usual practice, opinion, disease-oriented evidence (intermediate or physiologic outcomes only), or case series for studies of diagnosis, treatment, prevention, or screening
Where can NLP Help?
- Questions:
  - Help to formulate answerable questions.
  - Question analysis and classification.
- Search:
  - Retrieve and rank relevant literature.
  - Extract the evidence-based information.
- Summarise the results.
- Appraisal: Classify the evidence.
Where’s the Corpus for Summarisation?
Summarisation Systems
- CENTRIFUSER/PERSIVAL: Developed and tested using user feedback (iterative design).
- SemRep: Evaluation based on human judgement.
- Demner-Fushman & Lin: ROUGE on original paper abstracts.
- Fiszman: Factoid-based evaluation.
Corpora
- Several corpora of questions/answers available.
- Answers lack explicit pointers to primary literature.
- Medical doctors want to know the primary sources.
Journal of Family Practice’s “Clinical Inquiries”
The XML Contents
<record id="7843">
  <url>http://www.jfponline.com/Pages.asp?AID=7843&amp;issue=September 2009&amp;UID=</url>
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_1">
        <longtext>A retrospective study of 231 patients treated conservatively or surgically found that the 48.5% of patients treated surgically had a lower recurrence rate than the conservative group (number needed to treat [NNT]=2 for recurrence at mean follow-up of 7.6 months) and earlier resolution of symptoms (average 3.9 days compared with 24 days for conservative treatment).</longtext>
        <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon J, Williams SB, Young HA, et al. Thrombosed external hemorrhoids: outcome after conservative or surgical management. Dis Colon Rectum. 2004; 47: 1493-1498.</ref>
      </long>
      <long id="1_2">
        <longtext>A retrospective analysis of 340 patients who underwent outpatient excision of thrombosed external hemorrhoids under local anesthesia reported a low recurrence rate of 6.5% at a mean follow-up of 17.3 months.</longtext>
        <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, Bach S, Stubinger SH, et al. Excision of thrombosed external hemorrhoids under local anesthesia: a retrospective evaluation of 340 patients. Dis Colon Rectum. 2003; 46: 1226-1231.</ref>
      </long>
      <long id="1_3">
        <longtext>A prospective, randomized controlled trial (RCT) of 98 patients treated nonsurgically found improved pain relief with a combination of topical nifedipine 0.3% and lidocaine 1.5% compared with lidocaine alone. The NNT for complete pain relief at 7 days was 3.</longtext>
        <ref id="11289288" abstract="Abstracts/11289288.xml">Perrotti P, Antropoli C, Molino D, et al. Conservative treatment of acute thrombosed external hemorrhoids with topical nifedipine. Dis Colon Rectum. 2001; 44: 405-409.</ref>
      </long>
    </snip>
  </answer>
</record>
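A minimal sketch of how one record in this format might be read, assuming Python's standard `xml.etree.ElementTree` and the element names visible in the listing (record, question, snip, sniptext, sor, long, longtext, ref). The sample string is an abridged stand-in for the record shown, and the function name `load_record` and the returned dictionary layout are illustrative choices, not part of the corpus.

```python
import xml.etree.ElementTree as ET

# Abridged stand-in for the record on the slide (texts shortened by hand).
SAMPLE = """
<record id="7843">
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_1">
        <longtext>A retrospective study of 231 patients (abridged).</longtext>
        <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon J, et al.</ref>
      </long>
      <long id="1_2">
        <longtext>A retrospective analysis of 340 patients (abridged).</longtext>
        <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, et al.</ref>
      </long>
    </snip>
  </answer>
</record>
"""


def load_record(xml_text):
    """Turn one <record> element into a plain dictionary (hypothetical layout)."""
    record = ET.fromstring(xml_text)
    snips = []
    for snip in record.iter("snip"):
        sor = snip.find("sor")
        snips.append({
            "text": snip.findtext("sniptext", default="").strip(),
            # Evidence grade stored in the type attribute of <sor>.
            "grade": sor.get("type") if sor is not None else None,
            "longs": [
                {
                    "text": long_.findtext("longtext", default="").strip(),
                    "pmids": [ref.get("id") for ref in long_.findall("ref")],
                }
                for long_ in snip.findall("long")
            ],
        })
    return {
        "id": record.get("id"),
        "question": record.findtext("question", default="").strip(),
        "snips": snips,
    }


if __name__ == "__main__":
    rec = load_record(SAMPLE)
    print(rec["question"])
    for snip in rec["snips"]:
        print(snip["grade"], snip["text"])
```

Keeping each answer part together with its justifications and PubMed IDs mirrors the nesting of the XML, so the alignment between answers and primary sources is preserved.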
Components of the Corpus
Question: direct extract from the source.
Answer: split from the source and manually checked.
Evidence: extracted from the source.
Additional text: manually extracted from the source and massaged.
References: PMID looked up in PubMed (automatic and manual procedure).
Annotation of Text Justifications
Goal
- Identify the text justifications.
- Align the text justifications with the answer parts.
Method
- Three annotators (members of the research group).
- Annotation tool contains pre-zoned text:
  - answer summary;
  - body text;
  - recommendations;
  - references.
- Annotators need to copy and paste (and massage) the text.
Annotation Tool I
Annotation Tool II
Annotating Answer Justifications
Conventions for text massaging
1. Remove/edit connecting phrases.
2. Remove irrelevant introductory text.
3. If a paragraph has several references, attempt to split the paragraph.
   - May need to massage the text of resulting splits.
4. If a paragraph has no references, attempt to merge with previous or next paragraph.
Finding PubMed IDs
Method
1. Split the reference text into sentences.
2. Remove author and pagination text:
   - Use simple regexps.
3. Perform a sequence of searches with all combinations of sentences.
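The three steps could be sketched as follows. This is an illustration under stated assumptions, not the regexps actually used: it assumes the first chunk of a citation is the author list and that the pagination chunk starts with a four-digit year followed by a semicolon. The function names are hypothetical, and sending each query to PubMed is left out.

```python
import re
from itertools import combinations


def split_reference(reference):
    """Step 1: split a citation into sentence-like chunks."""
    # Split on whitespace that follows ".", "?" or "!".
    parts = re.split(r"(?<=[.?!])\s+", reference.strip())
    return [p.strip() for p in parts if p.strip()]


def clean_reference(reference):
    """Step 2: drop the author chunk and the year/volume/pages chunk."""
    chunks = split_reference(reference)
    chunks = chunks[1:]  # assumption: authors always come first
    # Assumption: pagination looks like "2008; 25: 65-68."
    return [c for c in chunks if not re.match(r"^\d{4}\s*;", c)]


def candidate_queries(sentences):
    """Step 3: every non-empty combination of sentences, longest first."""
    queries = []
    for size in range(len(sentences), 0, -1):
        for idx in combinations(range(len(sentences)), size):
            queries.append(" ".join(sentences[i] for i in idx))
    return queries


if __name__ == "__main__":
    ref = ("Collins NC. Is ice right? Does cryotherapy improve outcome "
           "for acute soft tissue injury? Emerg Med J. 2008; 25: 65-68.")
    for query in candidate_queries(clean_reference(ref)):
        print(query)
```

With three remaining sentences this produces seven queries, one per non-empty subset, tried from the most specific to the least specific.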
Example I
Collins NC. Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J. 2008; 25: 65-68.
- Collins NC.
- Is ice right?
- Does cryotherapy improve outcome for acute soft tissue injury?
- Emerg Med J. 2008; 25: 65-68.
Example II
| list | search | ID | title | match % |
|---|---|---|---|---|
| 1, 2, 3 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J | 18212134 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? | 92 |
| 1, 2 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? | 18212134 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? | 100 |
| 1, 3 | Is ice right? Emerg Med J | 18212134 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? | 39 |
| 2, 3 | Does cryotherapy improve outcome for acute soft tissue injury? Emerg Med J | 18212134 | Is ice right? Does cryotherapy improve outcome for acute soft tissue injury? | 82 |
| 1 | Is ice right? | None | None | 0 |
| 2 | Does cryotherapy improve outcome for acute soft tissue injury? | 15496998 | Does Cryotherapy Improve Outcomes With Soft Tissue Injury? | 78 |
| 3 | Emerg Med J | None | None | 0 |
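The slides do not say how the match percentage is computed. As a stand-in only, the sketch below scores two titles with `difflib.SequenceMatcher` from the Python standard library; it illustrates the idea of a title similarity score and is not expected to reproduce the numbers in the table. The function name `title_match_percent` is hypothetical.

```python
from difflib import SequenceMatcher


def title_match_percent(query_text, pubmed_title):
    """Similarity between the searched text and the returned title, 0 to 100."""
    if not pubmed_title:
        # No ID or no title came back from the search.
        return 0
    # Normalise case and whitespace before comparing.
    a = " ".join(query_text.lower().split())
    b = " ".join(pubmed_title.lower().split())
    return round(100 * SequenceMatcher(None, a, b).ratio())


if __name__ == "__main__":
    title = ("Is ice right? Does cryotherapy improve outcome "
             "for acute soft tissue injury?")
    print(title_match_percent(title, title))
    print(title_match_percent("Is ice right? Emerg Med J", title))
    print(title_match_percent("Emerg Med J", None))
```

A score of this kind lets the searches be ranked and compared against a threshold, whichever similarity measure is actually plugged in.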
Using Amazon Mechanical Turk I
Mechanics
- AMT was used to find the correct IDs.
- An AMT hit had 10 references:
  - 2 known references for checking quality of annotation.
- Each hit was assigned to 5 Turkers.
- There was a preliminary training session.
Using Amazon Mechanical Turk II
Approving and rejecting hits
Reject hit if there are two or more “bad” IDs, i.e. one of:
- A known ID is wrong.
- The ID is invalid:
  - Not found in PubMed;
  - No title is returned.
- The title of the ID does not match the title of our reference:
  - threshold: 50% match.
- The ID does not agree with majority.
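The rejection rule can be written down as a small check. This is a sketch under assumptions: the `Answer` fields and function names are hypothetical, and a title match of exactly 50% is treated as acceptable, which the slide does not specify.

```python
from dataclasses import dataclass
from typing import Optional

MATCH_THRESHOLD = 50  # percent, as stated on the slide


@dataclass
class Answer:
    """One Turker's answer for one reference in a hit (hypothetical layout)."""
    submitted_id: str
    known_id: Optional[str] = None     # set only for the 2 control references
    found_in_pubmed: bool = True
    title_match: Optional[int] = None  # None when no title is returned
    majority_id: Optional[str] = None  # ID given by most Turkers, if any


def is_bad(a: Answer) -> bool:
    """True if the answer meets any of the four 'bad ID' conditions."""
    if a.known_id is not None and a.submitted_id != a.known_id:
        return True  # a known ID is wrong
    if not a.found_in_pubmed or a.title_match is None:
        return True  # the ID is invalid
    if a.title_match < MATCH_THRESHOLD:
        return True  # title does not match our reference
    if a.majority_id is not None and a.submitted_id != a.majority_id:
        return True  # disagrees with the majority
    return False


def reject_hit(answers) -> bool:
    """Reject the hit when two or more of its answers are bad."""
    return sum(is_bad(a) for a in answers) >= 2


if __name__ == "__main__":
    good = Answer("18212134", title_match=100, majority_id="18212134")
    low = Answer("15496998", title_match=39)
    wrong_control = Answer("1", known_id="2", title_match=100)
    print(reject_hit([good] * 9 + [low]))
    print(reject_hit([good] * 8 + [low, wrong_control]))
```

Allowing one bad ID per hit tolerates an occasional honest mistake while still filtering out careless work.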
Using Amazon Mechanical Turk III
Checking validity for final annotation
- Majority wins automatically except when:
  - majority is a “bad” ID;
  - majority is the “nf” ID;
  - the other two are agreeing (“full house”).
- Manual check is done in all other cases.
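One way to read this rule, with 5 Turkers per reference, is sketched below. The interpretation is an assumption: a "full house" is taken to mean a 3 versus 2 split, a vote with no strict majority also goes to manual checking, and the function returns `None` whenever a manual check is needed. The names `resolve` and `bad_ids` are hypothetical.

```python
from collections import Counter

NOT_FOUND = "nf"  # label used when a Turker reports that no ID was found


def resolve(votes, bad_ids=frozenset()):
    """Return the accepted ID, or None when a manual check is needed."""
    counts = Counter(votes).most_common()
    top_id, top_n = counts[0]
    if top_n * 2 <= len(votes):
        return None  # no strict majority
    if top_id in bad_ids or top_id == NOT_FOUND:
        return None  # majority is a "bad" ID or the "nf" ID
    if len(counts) > 1 and counts[1][1] >= 2:
        return None  # "full house": the remaining two votes agree
    return top_id


if __name__ == "__main__":
    print(resolve(["18212134"] * 4 + ["15496998"]))
    print(resolve(["18212134"] * 3 + ["15496998"] * 2))
    print(resolve(["nf"] * 3 + ["18212134", "15496998"]))
```

Routing the full-house case to a person makes sense because two Turkers independently agreeing on a different ID is evidence that the majority may be wrong.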
Corpus Statistics
Size
- 456 questions (“records”).
- 1,396 answers (“snips”).
- 3,036 text explanations (“longs”).
- 3,705 references:
  - 2,908 unique references.
  - 2,657 XML abstracts from PubMed.
Answers per Question
Avg=3.06
Answer justifications per answer
Avg=2.17
References per answer justification
Avg=1.22
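As a quick consistency check, the three averages reported so far follow directly from the corpus size counts. The snippet below only redoes that arithmetic; the variable names are illustrative.

```python
# Counts taken from the Corpus Statistics slide.
questions = 456
answers = 1396      # "snips"
longs = 3036        # text explanations
references = 3705   # total references, not only unique ones

averages = {
    "answers per question": round(answers / questions, 2),
    "justifications per answer": round(longs / answers, 2),
    "references per justification": round(references / longs, 2),
}

for name, value in averages.items():
    print(f"{name}: {value}")
# answers per question: 3.06
# justifications per answer: 2.17
# references per justification: 1.22
```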
References per question
Avg=6.57
Evidence Grade
References
Evidence-based Summarisation
Single Document Summarisation
Input: Question, reference.
Target: Text explanation.
Multi-document Summarisation
Input: Question, group of relevant references.
Target: Answer parts (optional: plus text explanation).
Appraisal, Clustering
Text Classification for Appraisal
Input: Group of references.
Target: Evidence-based grade.
Clustering
Input: Question, group of relevant references.
Target: Cluster groupings (optional: plus answer parts).
Retrieval?
Possible task
Input: Question.
Target: List of references.
However...
- Some of the references are old.
- The references are likely not exhaustive.
Input, Output
Input
- Question.
- Document abstract.
Output
- Extractive summary that answers the question.
- Target summary is the annotated evidence text (“long”).
- Evaluated using ROUGE-L with stemming.
Baselines
plain: Return the last n sentences.
keywords: Return the last n sentences that share any non-stopwords with the question.
umls: Return the last n sentences that share any UMLS concepts with the question.

| System | F | Conf Interval |
| --- | --- | --- |
| baseline plain | 0.193 | [0.190–0.196] |
| baseline keywords | 0.195 | [0.192–0.198] |
| baseline umls | 0.194 | [0.190–0.197] |
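The plain and keywords baselines can be sketched in a few lines. This is only an illustration: the stop-word list, the tokenisation, the default n = 3 and the function names are assumptions, not details given on the slides. The umls baseline would swap `content_words` for a UMLS concept extractor, which is not shown here.

```python
import re

# Assumption: a tiny illustrative stop-word list; the slides do not name the list used.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to", "with"}

def content_words(text):
    """Lowercased word tokens with stop words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def baseline_plain(sentences, n=3):
    """Return the last n sentences of the abstract."""
    return sentences[-n:]

def baseline_keywords(question, sentences, n=3):
    """Return the last n sentences that share a non-stop-word with the question."""
    q = content_words(question)
    matching = [s for s in sentences if content_words(s) & q]
    return matching[-n:]

abstract = [
    "Aspirin is widely used.",
    "We studied 200 patients with migraine.",
    "Patients received aspirin or placebo.",
    "Aspirin reduced migraine pain at two hours.",
    "Further trials are needed.",
]
question = "Does aspirin relieve migraine?"
print(baseline_plain(abstract, n=2))
print(baseline_keywords(question, abstract, n=2))
```

In this sketch the keywords baseline returns an empty list when no sentence matches; the slides do not say how that case was handled.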
Using the Abstract Structure
Preselect sentences and then:
1. Use PubMed’s section tags (background, conclusions, methods, objective, results).
2. Select the first n sentences of the last “conclusions” section.
3. If we have fewer than n sentences, fill from the first sentences of the previous “conclusions” section, and so on until all “conclusions” sections are used up.
4. If we still have fewer than n sentences, fill from the “results” sections.
5. If we still have fewer than n sentences, fill from the “methods” sections.
6. If the abstract has no structure, return the last n sentences.
Abstract
Background: S1.1 S1.2
Methods: S2.1
Results: S3.1 S3.2
Conclusions: S4.1 S4.2
Conclusions: S5.1 S5.2
Summary
S5.1 S5.2 S4.1 S4.2 S3.1
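The selection steps can be sketched as follows. This is a reading of the slide rather than the original implementation: the `(label, sentences)` input format and the function name are assumptions, the preselection step (plain, keywords or umls) is left out, and "results" and "methods" sections are visited in document order because the slide does not specify an order for them.

```python
def select_by_structure(sections, n):
    """Pick up to n sentences from an abstract.

    sections: list of (label, [sentences]) in abstract order;
    label is None when the abstract carries no section tags.
    """
    if not any(label for label, _ in sections):
        # Step 6: no structure, so fall back to the last n sentences.
        flat = [s for _, sents in sections for s in sents]
        return flat[-n:]
    summary = []
    # Steps 2-3: walk the "conclusions" sections from last to first.
    for label, sents in reversed(sections):
        if label == "conclusions":
            summary.extend(sents[: n - len(summary)])
    # Steps 4-5: fill from "results", then from "methods" (document order assumed).
    for wanted in ("results", "methods"):
        for label, sents in sections:
            if label == wanted:
                summary.extend(sents[: n - len(summary)])
    return summary

example = [
    ("background", ["S1.1", "S1.2"]),
    ("methods", ["S2.1"]),
    ("results", ["S3.1", "S3.2"]),
    ("conclusions", ["S4.1", "S4.2"]),
    ("conclusions", ["S5.1", "S5.2"]),
]
print(select_by_structure(example, n=5))
```

With n = 5 the sketch reproduces the summary shown on the slide.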
Results
The F-score is calculated using ROUGE-L with stemming.

| System | F | Conf Interval |
| --- | --- | --- |
| baseline plain | 0.193 | [0.190–0.196] |
| baseline keywords | 0.195 | [0.192–0.198] |
| baseline umls | 0.194 | [0.190–0.197] |
| structure plain | 0.196 | [0.193–0.199] |
| structure keywords | 0.193 | [0.190–0.197] |
| structure umls | 0.192 | [0.189–0.195] |
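For readers unfamiliar with the metric, the sketch below shows how a ROUGE-L F-score can be computed from the longest common subsequence (LCS) of stemmed tokens. It is a simplification under stated assumptions: `crude_stem` is a toy suffix stripper standing in for a real stemmer, β = 1 is assumed, the whole summary is treated as a single token sequence, and the slides do not say which ROUGE implementation or settings produced the figures above.

```python
import re

def crude_stem(word):
    """Toy suffix stripper; an assumption standing in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokens(text):
    """Lowercase, split into word tokens, and stem each token."""
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f(candidate, reference, beta=1.0):
    """F-score from LCS-based precision and recall (beta = 1 assumed)."""
    c, r = tokens(candidate), tokens(reference)
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)

print(rouge_l_f("aspirin reduced pain", "aspirin reduces migraine pain"))
```

In the example, stemming maps "reduced" and "reduces" to the same token, so the LCS has length 3, precision is 1 and recall is 0.75.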
ROUGE-L with Stemming for All 3-Sentence Subsets I
Process
1. Compute the ROUGE-L of all 3-sentence subsets in each abstract.
2. Find the decile boundaries in each abstract.
3. Find the distribution of decile boundaries.

| Decile boundary | Mean | Std Dev |
| --- | --- | --- |
| 0 | 0.094 | 0.060 |
| 1 | 0.136 | 0.062 |
| 2 | 0.153 | 0.065 |
| 3 | 0.164 | 0.067 |
| 4 | 0.176 | 0.070 |
| 5 | 0.188 | 0.073 |
| 6 | 0.200 | 0.076 |
| 7 | 0.213 | 0.081 |
| 8 | 0.229 | 0.087 |
| 9 | 0.249 | 0.094 |
| 10 | 0.299 | 0.112 |
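The three-step process can be sketched as below. The function names are hypothetical, `score_fn` stands for a ROUGE-L F-score against the target summary, and both the interpolation method for the decile boundaries and the use of the sample standard deviation are assumptions, since the slides do not specify them.

```python
from itertools import combinations
from statistics import mean, quantiles, stdev

def decile_boundaries(sentences, score_fn, k=3):
    """Score every k-sentence subset; return the 11 boundaries (0 to 10)."""
    scores = sorted(score_fn(" ".join(subset)) for subset in combinations(sentences, k))
    if len(scores) < 2:
        raise ValueError("need at least two subsets to compute deciles")
    inner = quantiles(scores, n=10, method="inclusive")  # boundaries 1 to 9
    return [scores[0], *inner, scores[-1]]

def boundary_distribution(per_abstract):
    """Mean and sample standard deviation of each boundary across abstracts."""
    return [(mean(col), stdev(col)) for col in zip(*per_abstract)]

# Toy scorer for illustration only: the number of words in the candidate summary.
toy_score = lambda text: len(text.split())
bounds = decile_boundaries(["a", "b b", "c c c", "d d d d"], toy_score)
print(bounds)
```

Boundary 0 is the lowest-scoring subset and boundary 10 the highest, so boundary 10 gives an upper bound for any 3-sentence extractive summary of that abstract under the chosen scorer.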
ROUGE-L with Stemming for All 3-Sentence Subsets II
ALTA 2011 Shared Task
The ALTA Shared Tasks
- Competitions where all participants are evaluated on the same data.
- The ALTA 2011 shared task was based on evidence grading.
The Data
- Clusters of abstracts.
- The SOR grade of each cluster.
Data Sample
Fragment
41711 B 10553790 15265350
53581 C 12804123 16026213 14627885
53583 B 15213586
52401 A 15329425 9058342 11279767
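A fragment in this layout can be read with a short parser. The column interpretation is an assumption drawn from the previous slide: the first field is taken to be a cluster identifier, the second the SOR grade, and the remaining fields the PubMed IDs of the abstracts in the cluster. The function name is hypothetical.

```python
from collections import Counter

def parse_cluster_line(line):
    """Split one line into (cluster_id, grade, [pubmed_ids])."""
    cluster_id, grade, *pmids = line.split()
    return cluster_id, grade, pmids

fragment = """\
41711 B 10553790 15265350
53581 C 12804123 16026213 14627885
53583 B 15213586
52401 A 15329425 9058342 11279767
"""
clusters = [parse_cluster_line(line) for line in fragment.splitlines()]
grade_counts = Counter(grade for _, grade, _ in clusters)
print(grade_counts)
```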
Words as Features
Abstract n-grams
- Generated n-grams (n = 1, 2, 3, 4) for each of the abstracts.
- Replaced specific medical concepts with generic 'sem type' tags using UMLS.
- Stemmed, lowercased, stop words removed.
Title n-grams
- Generated n-grams (n = 1, 2) for each title.
- Processed in the same way as abstract n-grams.
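A minimal sketch of the n-gram features follows. Several details are assumptions: preprocessing is applied before the n-grams are built, the stop-word list is a small illustrative one, `simple_stem` is a toy stand-in for a real stemmer, and the UMLS replacement of medical concepts with semantic-type tags is omitted because it needs an external concept mapper.

```python
import re

# Assumption: illustrative stop-word list, not the one used for the shared task.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

def simple_stem(word):
    """Toy stemmer (assumption): strip a few common English suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop stop words, stem."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [simple_stem(w) for w in words if w not in STOPWORDS]

def ngrams(tokens, max_n):
    """All n-grams for n = 1..max_n, as space-joined strings."""
    return [
        " ".join(tokens[i : i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]

def abstract_features(text):
    return ngrams(preprocess(text), 4)  # n = 1, 2, 3, 4

def title_features(text):
    return ngrams(preprocess(text), 2)  # n = 1, 2

print(title_features("Treatment of migraines in adults"))
```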
Publication Types as Features I
Distribution of publication types in a different corpus.
Publication Types as Features II
Publication types
- Rule-based classifier to detect publication types.
- Simple regular expressions that identify major publication types.
- Used the publication types marked up by PubMed when available.
- If an article has several possible publication types, choose the one with the highest quality.
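The rules above could look roughly like the sketch below. Everything specific in it is hypothetical: the list of publication types, their quality ranking, the regular expressions, and the choice to let PubMed's own labels take precedence over the pattern matches are illustrative assumptions, not the rules used in the original system.

```python
import re

# Hypothetical quality ranking, best first; the slides do not give the actual order.
QUALITY_ORDER = [
    "meta-analysis",
    "systematic review",
    "randomized controlled trial",
    "cohort study",
    "case report",
]

# Hypothetical patterns, one per publication type.
PATTERNS = {
    "meta-analysis": re.compile(r"\bmeta-?analy", re.I),
    "systematic review": re.compile(r"\bsystematic review", re.I),
    "randomized controlled trial": re.compile(r"\brandomi[sz]ed\b.*\btrial", re.I),
    "cohort study": re.compile(r"\bcohort\b", re.I),
    "case report": re.compile(r"\bcase report", re.I),
}

def publication_type(text, pubmed_types=()):
    """Return the highest-quality publication type, or None if nothing matches."""
    # Prefer the types marked up by PubMed when any of them is recognised.
    candidates = {t.lower() for t in pubmed_types} & set(QUALITY_ORDER)
    if not candidates:
        candidates = {name for name, pat in PATTERNS.items() if pat.search(text)}
    for name in QUALITY_ORDER:
        if name in candidates:
            return name
    return None

print(publication_type("A randomised, double-blind trial within a cohort of nurses"))
```

The example text matches both the trial and the cohort patterns, and the ranking resolves the tie in favour of the higher-quality type.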
Cascaded Classification
Process: Cascaded SVMs
1. Default class: B.
2. SVMs with abstract n-grams to identify A and C.
3. SVMs with publication types to identify A and C.
4. SVMs with title n-grams to identify A and C.
Results
| Method | Accuracy | Confidence Interval |
| --- | --- | --- |
| Majority (B) | 48.63% | 41.5–55.83 |
| Cascaded SVMs | 62.84% | |
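The control flow of such a cascade can be sketched without committing to a particular SVM library. The sketch assumes that each stage either proposes grade A or C or abstains, and that the first stage to propose a grade wins; the slide lists the stages but does not spell out how their decisions are combined. The stage functions are toy stand-ins for trained classifiers.

```python
def cascade(stages, item, default="B"):
    """Return the first A or C proposed by a stage, otherwise the default grade."""
    for stage in stages:
        grade = stage(item)
        if grade in ("A", "C"):
            return grade
    return default

# Toy stand-ins for the three trained SVM stages (hypothetical rules).
def by_abstract_ngrams(item):
    return "A" if "meta-analysis" in item["abstract"] else None

def by_publication_type(item):
    return "C" if item["pubtype"] == "case report" else None

def by_title_ngrams(item):
    return None  # this toy stage always abstains

stages = [by_abstract_ngrams, by_publication_type, by_title_ngrams]
print(cascade(stages, {"abstract": "one patient", "pubtype": "case report"}))
```

Starting from the majority class B means the cascade can only improve on the majority baseline where a stage overrides the default correctly.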
Questions?
Further Information
http://web.science.mq.edu.au/~diego/medicalnlp/