Medication Extraction from Clinical Data Using Frame Semantics

DIMITRIOS KOKKINAKIS
Centre for Language Technology
University of Gothenburg
[email protected]

www.svenska.gu.se www.clt.gu.se spraakbanken.gu.se

Post on 05-Jan-2016

Page 1: Medication Extraction from Clinical Data Using Frame Semantics

DIMITRIOS KOKKINAKIS
Centre for Language Technology
University of Gothenburg
[email protected]

Medication Extraction from Clinical Data Using Frame Semantics

Page 2: Medication Extraction from Clinical Data Using Frame Semantics

OVERVIEW

• Motivation
• Semantic Annotation of Corpora and Event-Based Information Extraction (e.g. i2b2 Medication Challenge)
• Frame Semantics
• Medical Frames
• Pilot: Administration_of_Medication
• Design and Resources (so far…)
• Conclusion and Future Work

Page 3: Medication Extraction from Clinical Data Using Frame Semantics

MOTIVATION (EXTRACTION OF FACTS and EVENTS)

• Semantic annotation of corpora for mining complex relations and events has gained considerable and growing attention in the medical domain

• Goal (work in progress): to develop an appropriate infrastructure for automatic event labeling in the clinical domain using hybrid techniques (e.g. supervised machine learning, rules, lexicons)

• Event extraction can be modeled as a sequential tagging problem. Training and test data sets are taken, or will be taken, from Swedish medical corpora, while the Swedish FrameNet++ provides the basis for the event descriptions
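To make the sequential-tagging view concrete, here is a minimal sketch of how per-token labels can be turned back into labeled spans. It assumes the common BIO encoding (B- begins a span, I- continues it, O is outside); the role names Drug, Dosage and Frequency and the English example sentence are illustrative assumptions, not labels or data from the slides.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (role, text) spans from BIO tags such as B-Drug, I-Drug, O."""
    spans: List[Tuple[str, str]] = []
    current_role, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, role = tag.partition("-")
        # An I- tag continues a span only if it carries the same role.
        continues = prefix == "I" and role == current_role
        if not continues and current_role is not None:
            spans.append((current_role, " ".join(current_tokens)))
            current_role, current_tokens = None, []
        if prefix == "B" or (prefix == "I" and not continues):
            # A stray I- tag is treated as the start of a new span.
            current_role, current_tokens = role, [token]
        elif continues:
            current_tokens.append(token)
    if current_role is not None:
        spans.append((current_role, " ".join(current_tokens)))
    return spans


# Hypothetical tagger output for one sentence.
tokens = ["Patient", "received", "warfarin", "2.5", "mg", "once", "daily"]
tags = ["O", "O", "B-Drug", "B-Dosage", "I-Dosage", "B-Frequency", "I-Frequency"]
print(bio_to_spans(tokens, tags))
```

A trained tagger would supply the `tags` list; the decoding step stays the same regardless of which learner produces it.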

Page 4: Medication Extraction from Clinical Data Using Frame Semantics

EVENT-BASED INFORMATION EXTRACTION

Information extraction (IE) is a technology that has a direct correlation with frame-like structures in FrameNet, since templates in the context of IE are frame-like structures with slots representing event information. Most event-based IE approaches are designed to identify role fillers that appear as arguments to event verbs or nouns, either explicitly via syntactic relations or implicitly via proximity
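The "implicitly via proximity" strategy can be sketched as a simple heuristic: for each entity type, take the entity closest to the event trigger within a fixed token window. The function name, the window size and the entity types below are assumptions made for illustration; the slides do not specify how proximity is measured.

```python
from typing import Dict, List, Tuple

Entity = Tuple[str, int]  # (entity type, token index), hypothetical representation


def fill_by_proximity(trigger_idx: int, entities: List[Entity], window: int = 5) -> Dict[str, int]:
    """For each entity type, pick the entity nearest to the trigger within `window` tokens."""
    fillers: Dict[str, int] = {}
    for etype, idx in entities:
        distance = abs(idx - trigger_idx)
        if distance > window:
            continue  # too far from the trigger to count as a role filler
        best = fillers.get(etype)
        if best is None or distance < abs(best - trigger_idx):
            fillers[etype] = idx
    return fillers


# Trigger at token 1; the second Drug and the Frequency lie outside the window.
entities = [("Drug", 2), ("Drug", 9), ("Dosage", 3), ("Frequency", 12)]
print(fill_by_proximity(1, entities))
```

A syntax-based variant would replace the token distance with a path in a parse tree; the proximity version needs no parser, which is why it is often used as a fallback.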

Page 5: Medication Extraction from Clinical Data Using Frame Semantics

The ”Medication Challenge” i2b2… (2009)

The Third i2b2 Workshop on NLP Challenges for Clinical Records (designed as an information extraction task) focused on the extraction of medications and medication-related information from discharge summaries

Page 6: Medication Extraction from Clinical Data Using Frame Semantics

The ”Medication Challenge” i2b2… (2009)

Page 7: Medication Extraction from Clinical Data Using Frame Semantics

ADVANTAGES OF STRUCTURED DATA…

• get an overview of the medication ordered in different dimensions
• help organize and improve the presentation of EHR; advanced graphical presentation of EHR data
• create the basis for data mining, evidence-based medicine; e.g. for the epidemiological analysis of adverse events
• allow the automatic transmission of data to various registries
• aggregate data from many patients in repositories, facilitating e.g. open comparisons
• enable more reliable quality comparisons between different parts of the country / world
• create a database directly accessible to research
• allow the generation of new hypotheses and new (semantic) relationships
• improve patient safety, pharmacovigilance …

Page 8: Medication Extraction from Clinical Data Using Frame Semantics

FRAME SEMANTICS…

The FrameNet approach is based on the linguistic theory of frame semantics, supported by corpus evidence. A semantic frame is a script-like structure of concepts, which are linked to the meanings of linguistic units and associated with a specific event or state

Each frame identifies a set of frame elements, which are frame-specific semantic roles: both so-called core roles (arguments), tightly coupled with the particular meaning of the frame, and more generic non-core ones (adjuncts or modifiers), which to a large extent are event-independent semantic roles

When using computers to extract semantic information for NLP tasks, FrameNet's semantic mapping provides a means for the computer to extract meaning from a string of words
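The split between core and non-core frame elements can be pictured as a small data structure. This is only a sketch: the frame name and every role listed here are invented for illustration and are not taken from FrameNet or SweFN++.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Frame:
    name: str
    core: FrozenSet[str]      # roles tightly coupled to the frame's meaning
    non_core: FrozenSet[str]  # generic, largely event-independent roles

    def missing_core(self, annotation: Dict[str, str]) -> List[str]:
        """Core roles that the annotation leaves unfilled."""
        return sorted(self.core - set(annotation))

    def unknown_roles(self, annotation: Dict[str, str]) -> List[str]:
        """Annotated roles that the frame does not define at all."""
        return sorted(set(annotation) - self.core - self.non_core)


# Hypothetical frame and roles, for illustration only.
toy_frame = Frame(
    name="Example_medication_frame",
    core=frozenset({"Drug", "Patient"}),
    non_core=frozenset({"Dosage", "Frequency", "Time"}),
)
annotation = {"Drug": "warfarin", "Dosage": "2.5 mg", "Route": "oral"}
print(toy_frame.missing_core(annotation))
print(toy_frame.unknown_roles(annotation))
```

Checks like these are useful during manual annotation, where a missing core role often signals an incomplete markup rather than a genuinely absent participant.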

Page 9: Medication Extraction from Clinical Data Using Frame Semantics

FRAME SEMANTICS…

Thus, a word activates, or evokes, a frame of semantic knowledge relating to the specific concept it refers to. A semantic frame is a collection of facts that specify "characteristic features, attributes, and functions of a denotatum, and its characteristic interactions with things necessarily or typically associated with it". A semantic frame can also be defined as a coherent structure of concepts related such that, without knowledge of all of them, one does not have complete knowledge of any one

E.g., one would not be able to understand the word sell without knowing anything about the situation of commercial transfer, which also involves a seller, a buyer, goods, money, the relation between the money and the goods and so on
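Frame evocation can be sketched as a lookup from lexical units to frame names. The tiny dictionary below is an assumption for illustration: it lists inflected forms directly instead of lemmatizing, and it covers only the commercial-transfer example from the slide, using the Berkeley FrameNet frame names Commerce_sell and Commerce_buy.

```python
from typing import Dict, List, Tuple

# Toy lexical-unit index; a real system would lemmatize and disambiguate.
LEXICAL_UNITS: Dict[str, str] = {
    "sell": "Commerce_sell",
    "sold": "Commerce_sell",
    "buy": "Commerce_buy",
    "bought": "Commerce_buy",
}


def evoked_frames(tokens: List[str]) -> List[Tuple[int, str, str]]:
    """Return (token index, token, frame name) for every frame-evoking token."""
    return [
        (i, tok, LEXICAL_UNITS[tok.lower()])
        for i, tok in enumerate(tokens)
        if tok.lower() in LEXICAL_UNITS
    ]


print(evoked_frames("Kim sold the car to Lee".split()))
```

Once the frame is known, the remaining task is to find which words fill its roles (here the seller, the goods and the buyer).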

Page 10: Medication Extraction from Clinical Data Using Frame Semantics

RELEVANT APPLICATIONS…

"FN began collaborations with two industrial partners this year. One is with a defense contractor to develop frames and annotation for reports written by U.S. soldiers after patrols in Afghanistan and Iraq. The other is a partnership with Siemens Research U.S. to develop frames and annotation for medical texts, such as medical textbooks and guidelines for the treatment of diseases."

http://www.icsi.berkeley.edu/pubs/icsi/2011AnnualReport.pdf

Page 11: Medication Extraction from Clinical Data Using Frame Semantics

A slide from an LREC 2012 presentation (closing session)

Page 13: Medication Extraction from Clinical Data Using Frame Semantics

MEDICALLY ORIENTED FRAMES

https://framenet.icsi.berkeley.edu/fndrupal/index.php?q=frame_report&name=Medical_intervention

Page 14: Medication Extraction from Clinical Data Using Frame Semantics

Swedish MEDICALLY ORIENTED FRAMES

Administration_of_medication, Addiction, Birth, Death, Experience_bodily_harm, Falling_ill, Health_response, Institutionalization, Medical_disorders, Medical_instruments, Medical_interaction_scenario, Medical_professionals, Medical_specialties, Medical_treatment, Observable_bodyparts, People_by_disease, Recovery, …

http://spraakbanken.gu.se/eng/research/swefn/development-version

Page 15: Medication Extraction from Clinical Data Using Frame Semantics

Example Frame: CURE

http://spraakbanken.gu.se/eng/research/swefn/development-version

Page 16: Medication Extraction from Clinical Data Using Frame Semantics

http://spraakbanken.gu.se/eng/research/swefn/development-version

Example

Page 17: Medication Extraction from Clinical Data Using Frame Semantics

Frame: Administration_of_Medication

CORE Frame Elements

NON-CORE Frame Elements

Page 18: Medication Extraction from Clinical Data Using Frame Semantics

Design so far… Resources in Use

1. FASS is the Swedish national formulary: contains a list of medicines that are approved for prescription throughout …
2. Swedish SNOMED CT’s Substance hierarchy: contains “concepts that can be used for recording active chemical constituents of drug products, food and chemical allergens, adverse reactions, toxicity or poisoning information, and physicians and nursing orders” <http://www.ihtsdo.org/snomed-ct/snomed-ct0/snomed-ct-hierarchies/substance/>
3. Swedish MeSH’s category D, Chemicals and Drugs (5,886)
4. Drug lexicon extensions (e.g. generic expressions of drugs, detecting misspellings)
5. List of relevant abbreviations + variants: iv, i.v., im, i.m., sc, s.c., po, p.o., vb, v.b., V b, T, inj., tbl, …
6. …
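Two of the lexical resources above lend themselves to a short sketch: collapsing abbreviation variants onto one key, and catching misspelled drug names with approximate string matching. Everything concrete here is an assumption: the expansions are the usual readings of these abbreviations (vb as Swedish "vid behov"), the three-entry drug list stands in for a real lexicon such as FASS, and the similarity cutoff is arbitrary.

```python
import difflib
import re
from typing import Optional

# Assumed expansions of the abbreviations; variants differ only in dots, spaces and case.
ABBREVIATIONS = {
    "iv": "intravenous",
    "im": "intramuscular",
    "sc": "subcutaneous",
    "po": "oral (per os)",
    "vb": "as needed (vid behov)",
    "t": "tablet",
    "tbl": "tablet",
    "inj": "injection",
}

# Toy stand-in for a real drug lexicon.
DRUG_LEXICON = ["paracetamol", "ibuprofen", "warfarin"]


def normalize_abbreviation(token: str) -> Optional[str]:
    """Map variants such as 'i.v.', 'IV' or 'V b' onto a single expansion."""
    key = re.sub(r"[.\s]", "", token).lower()
    return ABBREVIATIONS.get(key)


def match_drug(token: str, cutoff: float = 0.85) -> Optional[str]:
    """Return the closest lexicon entry if it is similar enough, else None."""
    matches = difflib.get_close_matches(token.lower(), DRUG_LEXICON, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize_abbreviation("i.v."), normalize_abbreviation("V b"))
print(match_drug("Paracetamoll"), match_drug("aspirin"))
```

Single-letter abbreviations such as T are highly ambiguous in running text, so in practice a lookup like this would need context (for example a neighbouring drug name or dose) before it is trusted.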

Page 19: Medication Extraction from Clinical Data Using Frame Semantics

Design so far… Resources in Use

1. Named Entity Recognition for the relevant entities:
   1. Drug Names
   2. Time
   3. Frequency
2. Terminology Recognition
   1. MeSH
   2. SNOMED CT
3. (ongoing) Manual annotation with the frame elements
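A rule-based first pass over the three entity types could look like the sketch below. It is not the recognizer used in the project: the two-entry gazetteer, the frequency pattern (Swedish "N gång/gånger dagligen" or "per dag"), the ISO-date pattern for Time and the sample sentence are all assumptions chosen to keep the example small.

```python
import re
from typing import List, Tuple

DRUG_GAZETTEER = {"waran", "alvedon"}  # toy list, stands in for a drug lexicon
FREQUENCY = re.compile(r"\b\d+\s+gång(?:er)?\s+(?:dagligen|per dag)\b", re.IGNORECASE)
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TOKEN = re.compile(r"\w+")


def tag_entities(text: str) -> List[Tuple[str, str, int]]:
    """Return (label, matched text, start offset) triples in text order."""
    entities = []
    # Drug names: exact gazetteer lookup on lowercased tokens.
    for m in TOKEN.finditer(text):
        if m.group().lower() in DRUG_GAZETTEER:
            entities.append(("Drug", m.group(), m.start()))
    # Frequency and Time: regular expressions over the raw text.
    for label, pattern in (("Frequency", FREQUENCY), ("Time", DATE)):
        for m in pattern.finditer(text):
            entities.append((label, m.group(), m.start()))
    return sorted(entities, key=lambda e: e[2])


sample = "2012-05-03: Waran 2,5 mg 1 gång dagligen."
for label, value, start in tag_entities(sample):
    print(label, value, start)
```

The recognized entities then serve as candidate fillers for the frame elements in the manual annotation step.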

Page 25: Medication Extraction from Clinical Data Using Frame Semantics

Richard Johansson, Karin Friberg Heppin, Dimitrios Kokkinakis. Semantic Role Labeling with the Swedish FrameNet. Proceedings of the 8th International Conf on Language Resources and Evaluation (LREC'12), pp. 3697–3700. Istanbul, Turkey, 2012.

Page 26: Medication Extraction from Clinical Data Using Frame Semantics

CONCLUSIONS

The driving force for the experiments is frame semantics, which allows us to work with a more holistic and detailed semantic event description than is possible with, for instance, most traditional approaches based on binary relation extraction

Event extraction is more complicated and challenging than relation extraction, since events usually have internal structure involving several entities as participants, allowing a detailed representation of more complex statements

Preliminary results suggest that SweFN++ is a good starting point for annotating corpora. The role set described is general enough to capture a wide range of phenomena that characterize the majority of semantic arguments of general medical events
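One way to see the difference between an event description and binary relations is to flatten a single event into pairwise relations. The event below is a made-up example with hypothetical role names; the point is only that one event with n participants turns into n(n-1)/2 separate pairs, none of which is tied to the shared trigger any longer.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical event: one trigger with four role fillers.
event = {
    "trigger": "received",
    "roles": {
        "Patient": "the patient",
        "Drug": "warfarin",
        "Dosage": "2.5 mg",
        "Frequency": "once daily",
    },
}


def to_binary_relations(ev: Dict) -> List[Tuple[str, str, str, str]]:
    """Flatten an event into (role_a, filler_a, role_b, filler_b) pairs."""
    return [
        (role_a, filler_a, role_b, filler_b)
        for (role_a, filler_a), (role_b, filler_b) in combinations(ev["roles"].items(), 2)
    ]


relations = to_binary_relations(event)
print(len(relations))
print(relations[0])
```

Reassembling the original event from such pairs requires extra bookkeeping, especially when a sentence mentions two drugs with different dosages, which is where the frame-based representation pays off.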

Page 27: Medication Extraction from Clinical Data Using Frame Semantics

FUTURE WORK

Larger annotated corpora are needed for larger-scale experiments (which are planned…)

We are currently working with:
• extending/refining/encoding new frames according to the BFN descriptions
• manually annotating larger corpora
• investigating how existing frame descriptions can actually capture semantics
• continuing with more experiments (methods, software, larger data sets) for learning to annotate the arguments
• using a richer set of features, particularly syntactic information and the distance between the arguments
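The kind of feature set mentioned in the last bullet can be sketched as a function that describes one candidate argument relative to the frame-evoking target. The feature names are invented for illustration, and part-of-speech tags are used here only as a shallow stand-in for syntactic information; a fuller version would add features from a dependency or constituency parse.

```python
from typing import Dict, List, Tuple


def argument_features(
    tokens: List[str],
    pos_tags: List[str],
    target_idx: int,
    span: Tuple[int, int],
) -> Dict[str, object]:
    """Features for a candidate argument span (end index exclusive) and a target token."""
    start, end = span
    if end <= target_idx:
        distance, position = target_idx - (end - 1), "before"
    elif start > target_idx:
        distance, position = start - target_idx, "after"
    else:
        distance, position = 0, "overlap"
    return {
        "target_word": tokens[target_idx].lower(),
        "first_word": tokens[start].lower(),
        "first_pos": pos_tags[start],
        "span_length": end - start,
        "position": position,
        "distance": distance,  # token distance between argument and target
    }


# Hypothetical sentence with the target verb at index 1.
tokens = ["Patient", "received", "warfarin", "2.5", "mg", "once", "daily"]
pos_tags = ["NOUN", "VERB", "NOUN", "NUM", "NOUN", "ADV", "ADV"]
print(argument_features(tokens, pos_tags, target_idx=1, span=(3, 5)))
```

Dictionaries of this shape can be fed to most supervised learners after one-hot encoding of the string-valued features.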

Page 28: Medication Extraction from Clinical Data Using Frame Semantics

…related REFERENCES

• Sigfried Gold, Noémie Elhadad, Xinxin Zhu, James J. Cimino, and George Hripcsak. Extracting Structured Medication Event Information from Discharge Summaries. AMIA Annu Symp Proc. 2008; 2008: 237–241.

• Jon Patrick, Min Li. High accuracy information extraction of medication information from clinical notes: 2009 i2b2 medication extraction challenge. J Am Med Inf Assoc 2010;17:524–527.

• Louise Deléger, Cyril Grouin, Pierre Zweigenbaum. Extracting medical information from narrative patient records: the case of medication-related information. J Am Med Inf Assoc 2010;17:555–558.

• Son Doan, Lisa Bastarache, Sergio Klimkowski, Joshua C Denny, Hua Xu. Integrating existing natural language processing tools for medication extraction from discharge summaries. J Am Med Inf Assoc 2010;17:528–531.

• Thierry Hamon, Natalia Grabar. Linguistic approach for identification of medication names and related information in clinical narratives. J Am Med Inf Assoc 2010;17:549–554.

• Scott Russell Halgrim, Fei Xia, Imre Solti, Eithon Cadag, Özlem Uzuner. A cascade of classifiers for extracting medication information from discharge summaries. J of Biomed Sem 2011, 2(Suppl 3):S2