Synthea: Massive FHIR Data · 2018/11/15
TRANSCRIPT
HL7®, FHIR® and the flame Design mark are the registered trademarks of Health Level Seven International and are used with permission.
Amsterdam, 14-16 November | @HL7 @FirelyTeam | #fhirdevdays18 | www.fhirdevdays.com
Synthea: Massive FHIR Data
Jason Walonoski
© 2018 The MITRE Corporation. ALL RIGHTS RESERVED. Approved for public release. Distribution unlimited 18-1678-1.
Synthea
• Synthetic Patient Simulation
  • Synthea is an open-source synthetic patient generator that simulates the medical history of synthetic patients.
• High Quality Health Records
  • The system outputs high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of health.
• Freely Available
  • The resulting data is free from legal, cost, privacy, and security restrictions for a variety of secondary uses in academia, research, industry, and government where realistic (but not real) data is sufficient.
Sound useful? Get Started while I’m talking…
• Requirements
  • Java Development Kit 1.8
  • Git Version Control
git clone https://github.com/synthetichealth/synthea.git
cd synthea
./gradlew build check test
./run_synthea
Why synthetic data?
• High demand for EHR datasets
  • Non-clinical or secondary uses including software development, testing, clinical training, policy analysis, where realistic (but not real) data is sufficient
• Lack of Access
  • EHR datasets are difficult to obtain
• Costs and Demand
  • Anonymized records are being bought and sold
• Risks
  • Real patient records carry privacy, confidentiality, consent, policy, and legal risks that effectively prevent use
  • Real patients fear exposure of their intimate health data including lifestyle, family history, and mental health data
• Not Anonymous
  • Deidentified and anonymized records have been successfully reidentified
Not just patient records…
• Access to Care
  • Modeling healthcare facilities and utilization
  • Calculates individual access
• Health Outcomes
  • Calculates Quality Adjusted Life Years (QALY) and Disability Adjusted Life Years (DALY)
  • Quality Measures
• Cost and Price
  • Modeling claims insurance coverage
    • Medicaid, Medicare, Dual-Eligible, Private, None
  • Cost to individual and family burden (annual and lifetime)
  • Health System costs
Diseases
Rank | Top 10 Reasons Patients Visit PCP | Top 10 Years of Life Lost
1 | Routine infant/child health check | Ischemic Heart Disease
2 | Essential Hypertension | Lung Cancer
3 | Diabetes Mellitus | Alzheimer’s Disease
4 | Normal Pregnancy | COPD
5 | Respiratory Infections (Pharyngitis, Bronchitis, Sinusitis) | Cerebrovascular Disease
6 | General Adult Medical Examination | Road Injuries
7 | Disorders of Lipoid Metabolism | Self-Harm
8 | Ear Infections (Otitis Media) | Diabetes Mellitus
9 | Asthma | Colorectal Cancer
10 | Urinary Tract Infections | Drug Use Disorders (limited to Opioids)
90 modules with 722 clinical codes…
Disease modules are state machines…
Disease modules are written in JSON
{
  "name": "Ear Infections",
  "states": {
    "Initial": {
      "type": "Initial",
      "direct_transition": "No_Infection"
    },
    "No_Infection": {
      "type": "Delay",
      "direct_transition": "Gets_Ear_Infection",
      "range": { "low": 1, "high": 2, "unit": "months" }
    },
    "Gets_Ear_Infection": {
      "type": "ConditionOnset",
      "target_encounter": "Ear_Infection_Encounter",
      "codes": [{ "system": "SNOMED-CT", "code": "65363002", "display": "Otitis media" }],
      "direct_transition": "Ear_Infection_Encounter"
    },
    "Ear_Infection_Encounter": {
      "type": "Encounter",
      "encounter_class": "outpatient",
      "reason": "Gets_Ear_Infection",
      "codes": [{ "system": "SNOMED-CT", "code": "185345009", "display": "Encounter for symptom" }],
      "distributed_transition": [
        { "distribution": 0.8, "transition": "Antibiotic" },
        { "distribution": 0.2, "transition": "Painkiller" }
      ]
    },
    "Antibiotic": {
      "type": "MedicationOrder",
      "codes": [{ "system": "RxNorm", "code": 309310, "display": "Ciprofloxacin 100 MG/ML Oral Suspension" }],
      "direct_transition": "Terminal"
    },
    "Painkiller": {
      "type": "MedicationOrder",
      "codes": [{ "system": "RxNorm", "code": 307668, "display": "Acetaminophen 32 MG/ML Oral Suspension" }],
      "direct_transition": "Terminal"
    },
    "Terminal": { "type": "Terminal" }
  }
}
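To make the state-machine idea concrete, here is a minimal Python sketch that walks a trimmed copy of the Ear Infections module from Initial to Terminal. It is not Synthea's engine: the functions `next_state` and `walk` are hypothetical helpers, delays and clinical codes are ignored, and only `direct_transition` and `distributed_transition` are handled.

```python
import random

# Trimmed copy of the Ear Infections module: codes and delay ranges are omitted,
# only the fields that decide the next state are kept.
MODULE = {
    "name": "Ear Infections",
    "states": {
        "Initial": {"type": "Initial", "direct_transition": "No_Infection"},
        "No_Infection": {"type": "Delay", "direct_transition": "Gets_Ear_Infection"},
        "Gets_Ear_Infection": {
            "type": "ConditionOnset",
            "direct_transition": "Ear_Infection_Encounter",
        },
        "Ear_Infection_Encounter": {
            "type": "Encounter",
            "distributed_transition": [
                {"distribution": 0.8, "transition": "Antibiotic"},
                {"distribution": 0.2, "transition": "Painkiller"},
            ],
        },
        "Antibiotic": {"type": "MedicationOrder", "direct_transition": "Terminal"},
        "Painkiller": {"type": "MedicationOrder", "direct_transition": "Terminal"},
        "Terminal": {"type": "Terminal"},
    },
}


def next_state(state, rng):
    """Return the name of the next state, or None when the state has no transition."""
    if "direct_transition" in state:
        return state["direct_transition"]
    if "distributed_transition" in state:
        options = state["distributed_transition"]
        weights = [option["distribution"] for option in options]
        # Weighted random choice among the listed branches.
        return rng.choices(options, weights=weights, k=1)[0]["transition"]
    return None


def walk(module, rng):
    """Return the list of state names visited, starting at Initial."""
    name, path = "Initial", []
    while name is not None:
        path.append(name)
        name = next_state(module["states"][name], rng)
    return path


if __name__ == "__main__":
    print(" -> ".join(walk(MODULE, random.Random(42))))
```

Each walk visits six states; over many walks roughly 80% pass through Antibiotic and 20% through Painkiller, matching the distributions in the module.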
Control States: control the flow
Clinical States: drive disease and care
Examplitis Walk-Through
• 10 minute walk-through
• “Examplitis is a painful condition that affects only males. Most patients can be cured with Examplitol or an Examplotomy but some never recover.”
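As a rough illustration of how that description could map onto module states, here is a hypothetical Examplitis module written as a Python dict, plus a small checker. Everything in it is an assumption made for the sketch: the state names, the 0.5/0.4/0.1 probabilities, and the omission of the encounters and clinical codes a real module would need. The state types and the Guard condition follow the Generic Module Framework as I understand it, so check the Synthea wiki before relying on them. `check_module` is a hypothetical helper, not part of Synthea.

```python
# Hypothetical Examplitis module: names and probabilities are invented for
# illustration; encounters and clinical codes are deliberately left out.
EXAMPLITIS = {
    "name": "Examplitis",
    "states": {
        "Initial": {"type": "Initial", "direct_transition": "Male_Guard"},
        # "affects only males": hold the patient here unless the condition is true.
        "Male_Guard": {
            "type": "Guard",
            "allow": {"condition_type": "Gender", "gender": "M"},
            "direct_transition": "Examplitis_Onset",
        },
        "Examplitis_Onset": {
            "type": "ConditionOnset",
            "distributed_transition": [
                {"distribution": 0.5, "transition": "Examplitol"},
                {"distribution": 0.4, "transition": "Examplotomy"},
                {"distribution": 0.1, "transition": "Terminal"},  # never recovers
            ],
        },
        "Examplitol": {"type": "MedicationOrder", "direct_transition": "Cured"},
        "Examplotomy": {"type": "Procedure", "direct_transition": "Cured"},
        "Cured": {
            "type": "ConditionEnd",
            "condition_onset": "Examplitis_Onset",
            "direct_transition": "Terminal",
        },
        "Terminal": {"type": "Terminal"},
    },
}


def check_module(module):
    """Return a list of problems: unknown transition targets or bad probabilities."""
    states = module["states"]
    problems = []
    for name, state in states.items():
        targets = []
        if "direct_transition" in state:
            targets.append(state["direct_transition"])
        branches = state.get("distributed_transition", [])
        targets.extend(branch["transition"] for branch in branches)
        total = sum(branch["distribution"] for branch in branches)
        if branches and abs(total - 1.0) > 1e-9:
            problems.append(f"{name}: distributions do not sum to 1")
        for target in targets:
            if target not in states:
                problems.append(f"{name}: unknown target {target}")
    return problems


if __name__ == "__main__":
    print(check_module(EXAMPLITIS) or "no problems found")
```

A check like this catches misspelled state names before a module is loaded; the Module Builder linked in this deck is the more practical way to author real modules.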
https://synthetichealth.github.io/module-builder/
Setup
• Requirements
  • Java Development Kit 1.8
  • Git Version Control
git clone https://github.com/synthetichealth/synthea.git
cd synthea
./gradlew build check test
Generating Data
example@hostname ~/synthea $ ./run_synthea
> Task :run
Loading C:\Users\example\synthea\build\resources\main\modules\allergic_rhinitis.json
Loading C:\Users\example\synthea\build\resources\main\modules\allergies\allergy_incidence.json
[... many more lines of Loading ...]
Loading C:\Users\example\synthea\build\resources\main\modules\wellness_encounters.json
Loaded 90 modules.
Running with options:
Population: 1
Seed: 1519063214833
Location: Massachusetts
1 -- Jerilyn993 Parker433 (10 y/o) Lawrence, Massachusetts
Synthea generates FHIR Resources (DSTU2, STU3, R4)
• Bundle
• Patient
• Encounter
• Condition
• AllergyIntolerance
• Observation
• DiagnosticReport
• Procedure
• ImagingStudy
• Immunization
• CarePlan
• MedicationRequest
• Claim
• ExplanationOfBenefit (STU3 only, BB2.0)
• Coverage (STU3 only)
{
  "resourceType": "Observation",
  "id": "15cbce37-e98d-40b8-8ab2-d57907fa3a2b",
  "status": "final",
  "category": [ { "coding": [ {
    "system": "http://hl7.org/fhir/observation-category",
    "code": "vital-signs", "display": "vital-signs"
  } ] } ],
  "code": { "coding": [ {
    "system": "http://loinc.org", "code": "55284-4", "display": "Blood Pressure"
  } ], "text": "Blood Pressure" },
  "subject": { "reference": "urn:uuid:2af35dd1-fb58-40f2-8066-21be17fb420d" },
  "context": { "reference": "urn:uuid:6a1a467c-aeb0-4ca8-9826-af16aaca4dc2" },
  "effectiveDateTime": "2009-01-12T00:03:59-05:00",
  "issued": "2009-01-12T00:03:59.349-05:00",
  "component": [ {
    "code": { "coding": [ {
      "system": "http://loinc.org", "code": "8462-4", "display": "Diastolic Blood Pressure"
    } ], "text": "Diastolic Blood Pressure" },
    "valueQuantity": {
      "value": 85.19988531686474, "unit": "mmHg",
      "system": "http://unitsofmeasure.org", "code": "mmHg"
    }
  }, {
    "code": { "coding": [ {
      "system": "http://loinc.org", "code": "8480-6", "display": "Systolic Blood Pressure"
    } ], "text": "Systolic Blood Pressure" },
    "valueQuantity": {
      "value": 108.84244941704915, "unit": "mmHg",
      "system": "http://unitsofmeasure.org", "code": "mmHg"
    }
  } ]
}
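Reading such an Observation back is mostly a matter of matching LOINC codes on the components. The Python sketch below does that for an abridged copy of the blood pressure Observation above; `component_values` is a hypothetical helper, and it assumes every component carries a `valueQuantity`, which holds for this example but not for every FHIR Observation.

```python
import json

# Abridged copy of the blood pressure Observation, keeping only the fields read below.
OBSERVATION_JSON = """
{
  "resourceType": "Observation",
  "code": {"text": "Blood Pressure"},
  "component": [
    {"code": {"coding": [{"system": "http://loinc.org", "code": "8462-4"}]},
     "valueQuantity": {"value": 85.19988531686474, "unit": "mmHg"}},
    {"code": {"coding": [{"system": "http://loinc.org", "code": "8480-6"}]},
     "valueQuantity": {"value": 108.84244941704915, "unit": "mmHg"}}
  ]
}
"""


def component_values(observation):
    """Map each component's LOINC code to its (value, unit) pair."""
    values = {}
    for component in observation.get("component", []):
        quantity = component["valueQuantity"]
        for coding in component["code"]["coding"]:
            if coding["system"] == "http://loinc.org":
                values[coding["code"]] = (quantity["value"], quantity["unit"])
    return values


if __name__ == "__main__":
    values = component_values(json.loads(OBSERVATION_JSON))
    systolic, unit = values["8480-6"]   # Systolic Blood Pressure
    diastolic, _ = values["8462-4"]     # Diastolic Blood Pressure
    print(f"{systolic:.0f}/{diastolic:.0f} {unit}")
```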
POST the data to a server
curl http://hapi.fhir.org/baseDstu3 --data-binary "@/Users/example/synthea/output/fhir/Maryetta775_Rowe323_2cb7e4dd-9d8b-49cf-b1e4-9839be8bc754.json" -H "Content-Type: application/fhir+json"
{
  "resourceType": "Bundle",
  "id": "ca4d459f-b078-4c04-a152-c7ce76a25179",
  "type": "transaction-response",
  "link": [{ "relation": "self", "url": "http://hapi.fhir.org/baseDstu3" }],
  "entry": [{
    "response": {
      "status": "201 Created",
      "location": "Patient/4147259/_history/1",
      "etag": "1",
      "lastModified": "2018-06-06T18:26:06.038+00:00"
    }
  }]
}
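The same POST can be scripted. The sketch below uses only the Python standard library and mirrors the curl call above: it builds a request with the `application/fhir+json` content type and pulls the created locations out of a transaction-response Bundle. The helper names and the file path in the `__main__` section are hypothetical, and the network call only runs when the file is executed directly.

```python
import json
import urllib.request


def build_transaction_request(base_url, bundle):
    """Build (but do not send) a POST of a transaction Bundle to a FHIR base URL."""
    body = json.dumps(bundle).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/fhir+json"},
        method="POST",
    )


def created_locations(response_bundle):
    """List the 'location' of every entry the server reports as 201 Created."""
    return [
        entry["response"]["location"]
        for entry in response_bundle.get("entry", [])
        if entry.get("response", {}).get("status", "").startswith("201")
    ]


if __name__ == "__main__":
    # Hypothetical path: point it at one of the files Synthea wrote to output/fhir/.
    with open("output/fhir/example_patient.json", encoding="utf-8") as handle:
        bundle = json.load(handle)
    request = build_transaction_request("http://hapi.fhir.org/baseDstu3", bundle)
    with urllib.request.urlopen(request) as response:
        print(created_locations(json.load(response)))
```

Keeping request construction separate from sending makes the script easy to test without touching a public server.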
Configuring FHIR Settings
• The exporting of FHIR can be configured using src/main/resources/synthea.properties
# Abridged synthea.properties file
# default FHIR configuration.
exporter.fhir.export = true
# transaction bundle 'true' produces transaction Bundles
# while 'false' produces collection Bundles.
exporter.fhir.transaction_bundle = true
# Standard Health Record (SHR) extensions for STU3
exporter.fhir.use_shr_extensions = true
# Exporting FHIR DSTU2
exporter.fhir_dstu2.export = false
# Exporting FHIR R4
exporter.fhir_r4.export = false
# Exporting Hospital Provider Data in STU3 or DSTU2
exporter.hospital.fhir.export = true
exporter.hospital.fhir_dstu2.export = false
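A quick way to see which FHIR versions a given configuration will export is to parse the properties file. The Python sketch below is a simplified reader, not a full Java `.properties` parser (it ignores `:` separators and line continuations). Both function names are hypothetical, and it assumes the unsuffixed `exporter.fhir.export` key refers to STU3, as the comments in the abridged file suggest.

```python
def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def enabled_fhir_exports(settings):
    """Return the FHIR versions whose export flag is set to true."""
    # Assumption: the unsuffixed key is the STU3 exporter.
    flags = {
        "STU3": "exporter.fhir.export",
        "DSTU2": "exporter.fhir_dstu2.export",
        "R4": "exporter.fhir_r4.export",
    }
    return [
        version
        for version, key in flags.items()
        if settings.get(key, "false").lower() == "true"
    ]


if __name__ == "__main__":
    with open("src/main/resources/synthea.properties", encoding="utf-8") as handle:
        print(enabled_fhir_exports(parse_properties(handle.read())))
```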
How are other developers using Synthea?
• Healthcare Services Platform Consortium (HSPC)
  • Developer sandbox environment to spin up FHIR servers and load them with Synthea data. They also host a FHIR server preloaded with Synthea data called HSPC Synthea STU3 (3.0.1) (authentication required).
  • https://sandbox.hspconsortium.org/
• SMART Health IT
  • Datasets available for download including Synthea data, which are also available through http://docs.smarthealthit.org/data/stu3-sandbox-data.html
  • They also provide a Docker version of the HAPI FHIR Server preloaded with Synthea data here: https://github.com/smart-on-fhir/hapi
  • Used Synthea data with their Bulk Data Server.
How are other developers using Synthea?
• Algorex Health has used Synthea data to explore Open Clinical Analysis.
  • https://blog.algorexhealth.com/2017/04/open-clinical-analysis-with-mitre-part-2/
• The MITRE Corporation has used Synthea data to create SyntheticMass, a 1/7th scale simulated model of the Commonwealth of Massachusetts, including a FHIR server (FHIR v1.8).
  • https://syntheticmass.mitre.org
• An MSDN blog post illustrates Loading Synthea FHIR Data with Logic Apps and Functions in Azure Government.
  • https://blogs.msdn.microsoft.com/mihansen/2018/05/10/loading-synthea-fhir-data-with-logic-apps-and-functions-in-azure-government/
How are other developers using Synthea?
• Cerner uses Synthea in their Bunsen tutorial, which adds the FHIR data model to Apache Spark queries within Jupyter notebooks
  • https://github.com/cerner/bunsen-tutorial
• Google uses Synthea in their FHIR protobuf examples using BigQuery
  • https://github.com/google/fhir/tree/master/examples/bigquery#example-code-to-upload-fhir-resources-into-bigquery
• What will you use Synthea for?
Synthea Tutorials and Exercises
1. Install, Configure, and Run Synthea
2. Use the Synthetic FHIR Data
3. Explore and Modify the Disease Modules
4. Localize Synthea for Alternative Geographic Locations
Roadmap
• Current FHIR Support
  • DSTU2, STU3, and R4
  • Argonauts IG
  • Blue Button 2.0 IG
  • JSON and Bulk Data (ndjson)
• New FHIR versions
• Additional Implementation Guides
• Terminology Variation
• Splitting the Record
• Clinical Notes
• Updating existing patients periodically (daily?)
• Claims tied to Payers
  • Multiple private/government payers
• Care Seeking Behavior
  • In and Out of Network
• Variable Care
  • Health Disparities and Determinants of Health
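For readers unfamiliar with the Bulk Data (ndjson) format mentioned in the roadmap, it is simply one JSON resource per line, usually with one file per resource type. The Python sketch below shows the idea by regrouping the resources of a Bundle; it is an illustration only, not how Synthea's own exporter works, and `bundle_to_ndjson` is a hypothetical helper.

```python
import json
from collections import defaultdict


def bundle_to_ndjson(bundle):
    """Group a Bundle's resources by type; each value is ndjson text (one JSON object per line)."""
    lines = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        # Compact separators keep each resource on a single line.
        lines[resource["resourceType"]].append(json.dumps(resource, separators=(",", ":")))
    return {rtype: "\n".join(rows) + "\n" for rtype, rows in lines.items()}


if __name__ == "__main__":
    example = {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
            {"resource": {"resourceType": "Patient", "id": "p1"}},
            {"resource": {"resourceType": "Observation", "id": "o1"}},
        ],
    }
    for rtype, text in bundle_to_ndjson(example).items():
        print(f"--- {rtype}.ndjson")
        print(text, end="")
```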
Open Source Resources
• Contact
  • Jason Walonoski
  • [email protected]
• Synthea
  • https://github.com/synthetichealth/synthea
• Module Builder
  • https://synthetichealth.github.io/module-builder/