Creating Dynamic Groupers Using Overrepresentation of Clinical Terms
DESCRIPTION
Presented at Epic's Research Advisory Council, April 3, 2014, Verona, WI. See a novel approach to query expansion based on pre-existing structured information within the EHR. Presenters adopted over-representation analysis to find statistically significant associations among the clinical terms extracted from Clarity reports. The study population consisted of over 7,000 patients and their 12 million observations, including labs, medications, phenotypes, diseases, and procedures. See the detailed findings and discuss computational and terminology challenges.
TRANSCRIPT
![Page 1: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/1.jpg)
Creating Dynamic Groupers Using Overrepresentation of Clinical Terms
Tomasz Adamusiak MD PhD
Froedtert & Medical College of Wisconsin
![Page 2: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/2.jpg)
![Page 3: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/3.jpg)
Conflict of interest disclosure
Tomasz Adamusiak has no real or apparent conflicts of interest to report
![Page 4: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/4.jpg)
Learning objectives
• Recognize the value of structured clinical information
• Identify computational and terminology challenges in big data analytics
• Evaluate how this approach applies to different use cases
![Page 5: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/5.jpg)
What is a grouper?
Lists of specific values derived from standard vocabularies used to define clinical concepts, e.g. patients with diabetes
• SNOMED CT concepts
• ICD-9/10 codes
• EDG terms
• CQM Value Sets
![Page 6: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/6.jpg)
Diabetes: Eye Exam
CMS eMeasure: CMS131v2
Value Set Name: Diabetes
Type: Grouping
Steward: National Committee for Quality Assurance
Program: CMS, MU2 EP Update 2013-06-14

| Code | Description | Code system |
| --- | --- | --- |
| … | … | … |
| 190330002 | Diabetes mellitus, juvenile type, with hyperosmolar coma (disorder) | SNOMEDCT |
| 250 | Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled | ICD9CM |
| E10.10 | Type 1 diabetes mellitus with ketoacidosis without coma | ICD10CM |
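A grouper of this kind can be pictured as a lookup from code system to a set of member codes. The sketch below is only an illustration of that idea, not the value set's actual storage format: the names `DIABETES_GROUPER`, `in_grouper`, and `patients_in_grouper` are hypothetical, and only the three codes shown on the slide are included.

```python
# Hypothetical in-memory grouper: code system -> set of member codes.
# Only the three example codes from the value set excerpt are listed.
DIABETES_GROUPER = {
    "SNOMEDCT": {"190330002"},
    "ICD9CM": {"250"},
    "ICD10CM": {"E10.10"},
}


def in_grouper(grouper, code_system, code):
    """Return True if (code_system, code) is a member of the grouper."""
    return code in grouper.get(code_system, set())


def patients_in_grouper(grouper, coded_records):
    """Return the ids of patients with at least one code in the grouper.

    coded_records is an iterable of (patient_id, code_system, code) tuples.
    """
    return {
        patient_id
        for patient_id, code_system, code in coded_records
        if in_grouper(grouper, code_system, code)
    }
```

Applied to a list of coded records, `patients_in_grouper` returns the patients matching the clinical concept regardless of which vocabulary their record was coded in.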
![Page 7: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/7.jpg)
Mining associations in EHR data
Diabetes mellitus
Yes No
Glucohemoglobin measurement
Yes 1509 5442
No 881 99
Positive association
Background reference
![Page 8: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/8.jpg)
Dynamic = expansion + association
CPT-4 83036
ICD10 E08-E13
![Page 9: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/9.jpg)
Extract-Load-Transform
![Page 10: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/10.jpg)
Transformation in ClinMiner https://clinminer.hmgc.mcw.edu user:epicdemo pass:epicdemo
This image by Tomasz Adamusiak is licensed under a CC BY 3.0 US license
ClinMiner is non-commercial prototype software
![Page 11: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/11.jpg)
Pilot: test all possible diabetes associations
8k patients
12M observations
Labs (CPT-4/LOINC)
Medications (RxNorm)
Problems (ICD-9)
Procedures (CPT-4)
18 764 terms
162 significant associations
![Page 12: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/12.jpg)
Summarize, but normalize per patient 1 + 1 = 1
Parent Concepts
ICD-10-CM
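The "1 + 1 = 1" rule can be read as counting each patient at most once per concept, however many observations that patient contributes. A minimal sketch under that reading follows; the function name and the sample observations are hypothetical, and the two CUIs are the ones this deck uses for glucohemoglobin measurement and diabetes mellitus.

```python
from collections import defaultdict


def patients_per_concept(observations):
    """Map each concept to the set of distinct patients observed with it."""
    by_concept = defaultdict(set)
    for patient_id, concept in observations:
        # A set silently ignores repeat observations for the same patient.
        by_concept[concept].add(patient_id)
    return dict(by_concept)


# Hypothetical observations: patient p1 has two HbA1c measurements.
observations = [
    ("p1", "C0202054"),
    ("p1", "C0202054"),
    ("p2", "C0202054"),
    ("p2", "C0011849"),
]
counts = {c: len(p) for c, p in patients_per_concept(observations).items()}
```

The same set-based counting would apply after expansion, when several child codes for one patient roll up to a single parent concept.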
![Page 13: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/13.jpg)
Relatively straightforward in ICD
Parent Concepts
ICD-10-CM
![Page 14: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/14.jpg)
Caveat: flat hierarchy results in disconnected clinical contexts
Q: All tuberculosis codes
• 010-018.99 TUBERCULOSIS
• 137 Late effects of tuberculosis
• 647.3 Tuberculosis complicating pregnancy, childbirth, or the puerperium
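Because the three tuberculosis ranges sit in unrelated chapters, a single prefix or range test cannot collect them. The sketch below checks an ICD-9-CM code string against the three ranges listed above; the function name is hypothetical and the string handling is an assumption about how codes are stored.

```python
def is_tuberculosis_icd9(code):
    """Check an ICD-9-CM code string against the three ranges on the slide."""
    code = code.strip()
    category = code.split(".")[0]
    # 010-018.99 TUBERCULOSIS: three-digit numeric category between 010 and 018.
    if category.isdigit() and len(category) == 3 and 10 <= int(category) <= 18:
        return True
    # 137 Late effects of tuberculosis.
    if category == "137":
        return True
    # 647.3 Tuberculosis complicating pregnancy, childbirth, or the puerperium.
    return code.startswith("647.3")
```

Each branch corresponds to one disconnected clinical context, which is the caveat the slide raises about a flat hierarchy.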
![Page 15: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/15.jpg)
Expansion has to take into account multiple inheritance in SNOMED CT
SNOMED CT
Parent Concepts
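With multiple inheritance, a concept can reach the same ancestor through more than one parent, so expansion needs to remember what it has already visited. The sketch below is one way to do that with a breadth-first walk. The `PARENTS` fragment uses illustrative labels, not real SNOMED CT content, and the function name is hypothetical.

```python
from collections import deque


def ancestors(concept, parents):
    """Return all ancestors of `concept` in a hierarchy with multiple parents."""
    seen = set()
    queue = deque(parents.get(concept, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already reached through another parent, or via a cycle
        seen.add(node)
        queue.extend(parents.get(node, ()))
    return seen


# Hypothetical fragment: labels are illustrative, not real SNOMED CT content.
PARENTS = {
    "diabetic retinopathy": ["retinal disorder", "diabetic complication"],
    "retinal disorder": ["eye disorder"],
    "diabetic complication": ["diabetes mellitus"],
    "eye disorder": ["disease"],
    "diabetes mellitus": ["disease"],
}
```

The `seen` set keeps shared ancestors from being counted twice and also guarantees the walk terminates if the merged hierarchy contains a cycle.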
![Page 16: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/16.jpg)
Pieter Brueghel the Elder (1526/1530–1569) [Public domain], via Wikimedia Commons
In pursuit of a single language
![Page 17: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/17.jpg)
Integrating terminologies with UMLS
Donald A.B. Lindberg, M.D.
Clinical Terminologies
UMLS
![Page 18: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/18.jpg)
UMLS is ideal for integration of heterogeneous clinical data
• Single entry point to MU terminologies
• Cross-walk between MU terms
• Terminology-agnostic
• Text-mining
![Page 19: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/19.jpg)
UMLS establishes equivalence mappings across biomedical terminologies
UMLS: Exanthema C0015230
• ICD-10-CM: rash NOS (ICD-10:R21)
• SNOMED CT: Cutaneous eruption (SCT:112625008)
• SNOMED CT: Eruption (SCT:1806006)
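The equivalence mapping can be sketched as a table from (source, code) to a UMLS concept identifier, with a crosswalk that looks for other codes sharing the same identifier. This is a toy illustration built from the three codes on the slide. The dictionary layout, the source labels, and the `crosswalk` function are assumptions and do not reflect how UMLS distributes its data.

```python
# (source, code) -> UMLS CUI, using the three codes shown for Exanthema.
CODE_TO_CUI = {
    ("ICD10CM", "R21"): "C0015230",
    ("SNOMEDCT", "112625008"): "C0015230",
    ("SNOMEDCT", "1806006"): "C0015230",
}


def crosswalk(source, code, target_source, code_to_cui=CODE_TO_CUI):
    """Return codes in target_source that share a CUI with (source, code)."""
    cui = code_to_cui.get((source, code))
    if cui is None:
        return set()  # unmapped code: nothing to cross-walk
    return {
        other_code
        for (other_source, other_code), mapped in code_to_cui.items()
        if mapped == cui and other_source == target_source
    }
```

Going through the shared concept identifier is what makes the approach terminology-agnostic: no pairwise ICD-to-SNOMED table is needed.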
![Page 20: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/20.jpg)
![Page 21: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/21.jpg)
6° of terminological Kevin Bacon
Acute myocardial infarction
Myocardial ischemia
Vascular Diseases
Disorder of soft tissue
Collagen Diseases
Connective Tissue Diseases
Epidermal and dermal conditions
Skin and subcutaneous tissue disorders
Dermatologic disorders
![Page 22: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/22.jpg)
Expansion limited to MU terminologies and by semantic type
Finding
Disease or Syndrome
Ignore
![Page 23: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/23.jpg)
Open issue: cycles due to subtle differences in meaning
Immune System
Endocrine System
![Page 24: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/24.jpg)
Expansion in UMLS across MU sources
Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled
ICD-9
ICD-10
SNOMED CT
NDF-RT
Situation with explicit context
Metabolic diseases
roots:
![Page 25: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/25.jpg)
Statistical methods for establishing over/under-representation
• Serial contingency tables, one per concept pair
• Chi-squared test with Bonferroni correction
• Relative risk (RR) as the estimate of effect size
• Diabetes tested against all 18 764 terms
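The steps above can be sketched for a single 2×2 table as follows. This is a hedged illustration, not the presenter's implementation. It uses the plain Pearson chi-squared statistic without continuity correction, assumes a family-wise alpha of 0.05, and takes the four counts from the glucohemoglobin slide. The assignment of those counts to cells is itself an assumption, because the flattened slide text does not show unambiguously which count belongs to which cell. The reading used here (1509 both, 881 diabetes only, 99 test only, 5442 neither) is the one consistent with the slide's "Positive association" and "Background reference" labels.

```python
import math


def chi2_and_rr(a, b, c, d):
    """Pearson chi-squared (1 df) and relative risk for a 2x2 table.

    a = target and term, b = target only, c = term only, d = neither.
    RR compares how often the term occurs with versus without the target.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with 1 df; may underflow to 0.0.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    rr = (a / (a + b)) / (c / (c + d))
    return chi2, p_value, rr


N_TESTS = 18764          # one test per term paired with diabetes
ALPHA = 0.05             # assumed family-wise error rate
threshold = ALPHA / N_TESTS  # Bonferroni-corrected significance level

# Assumed cell assignment (see the note above the code).
chi2, p_value, rr = chi2_and_rr(1509, 881, 99, 5442)
significant = p_value < threshold
```

An RR above 1 would mark the term as over-represented among diabetic patients and an RR below 1 as under-represented. A real implementation would also need to guard against empty cells before dividing.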
![Page 26: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/26.jpg)
EHR-based association rule mining
Diabetes mellitus (C0011849)
Yes No
Glucohemoglobin measurement
(C0202054)
Yes 1509 5442
No 881 99
Positive association
Background reference
![Page 27: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/27.jpg)
Other positive associations
• C0785704 Blood glucose monitoring equipment
• C0935929 Antidiabetics
• C0304870 Insulin, Long-Acting
• C0770893 Metformin hydrochloride
• C0011882 Diabetic Neuropathies
• C0011880 Diabetic Ketoacidosis
• C0011884 Diabetic Retinopathy
Expansion generalization on class or system level
![Page 28: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/28.jpg)
A non-representative control background can bias the findings
Diabetes inversely associated with
• C1314183 Special EEG tests
• C0242953 Barbiturate hypnotic
• C0064636 lamotrigine
• C1719410 Epilepsy and recurrent seizures
![Page 29: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/29.jpg)
Open issue: reconciling lab orders with results
Clinical Laboratory
Hemoglobin A1c/Hemoglobin.total in Blood by HPLC
LOINC:17856-6
Hemoglobin; glycosylated (A1C)
CPT-4:83036
![Page 30: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/30.jpg)
Challenges
• Availability of correctly and exhaustively coded data
• Expansion with multiple inheritance is memory intensive
• Testing all possible (180M) combinations is computationally expensive
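The 180M figure is consistent with counting every unordered pair among the 18 764 pilot terms. That reading is an assumption, since the slide does not show the calculation. A quick check, with an assumed alpha of 0.05 to show how strict a Bonferroni correction becomes at that scale:

```python
import math

N_TERMS = 18764                    # terms in the pilot
all_pairs = math.comb(N_TERMS, 2)  # unordered pairs of distinct terms

ALPHA = 0.05                       # assumed family-wise error rate
per_test_threshold = ALPHA / all_pairs

print(f"{all_pairs:,} pairs, per-test threshold {per_test_threshold:.2e}")
```

Each pair needs its own contingency table, which is where the computational cost comes from.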
![Page 31: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/31.jpg)
What can we learn from other industries?
![Page 32: Creating Dynamic Groupers Using Overrepresentation of Clinical Terms](https://reader033.vdocuments.us/reader033/viewer/2022060110/555ea734d8b42a6d068b5b88/html5/thumbnails/32.jpg)
Thank You!
Tomasz Adamusiak MD PhD
Human and Molecular Genetics Center
Medical College of Wisconsin
@7omasz
For more information
• Adamusiak T, Shimoyama N, Shimoyama M. Next-generation phenotyping using the Unified Medical Language System (UMLS). JMIR Med Inform. doi:10.2196/medinform.3172
• Adamusiak T, Shimoyama M. EHR-based phenome wide association study in pancreatic cancer. AMIA Summits Transl Sci Proc. 2014 (in press)