segmentation of clinical texts

Kavita Ganesan & Michael Subotin

Presented at: 2014 Conference on IEEE Big Data

All sorts of note types!

Admit notes
◦ documenting why patient is being admitted
◦ baseline status, etc.

Progress notes
◦ progress during course of hospitalization

Discharge notes
◦ conclusion of a hospital stay or series of treatments

Others
◦ Operative notes
◦ Procedure notes
◦ Delivery notes
◦ Emergency Department notes, etc.

PRIMARY CARE PHYSICIAN:

Dr. XXXXX XXXXXXXX.

CHIEF COMPLAINT:

Injured right little toe.

HISTORY OF PRESENT ILLNESS:

This is a 63-year-old male with a past medical history of multiple

myeloma who presents today after hitting his fifth toe of the right foot

on a wood panel yesterday……

Review of Systems:

CONSTITUTIONAL: No fever, chills, or weight loss.

RESPIRATORY: No cough, shortness of breath, or wheezing.

CARDIOVASCULAR: No chest pain, chest pressure, or palpitations.

...............

PAST MEDICAL HISTORY

Multiple myeloma, peripheral neuropathy, hypertension..

PAST SURGICAL HISTORY:-

Stem cell transplant.

SOCIAL HISTORY

The patient formerly smoked tobacco; however, quit within the last 10

years.

FAMILY HISTORY:

Hypertension.

ALLERGIES:

ASPIRIN.

………

Purpose of visit

Patient’s current condition in narrative form

Ongoing issues, issues in the past

Information on allergies



This is how most notes look:
• some longer, some shorter
• different sets of headers, etc.



Very unstructured
◦ formatting cues are inconsistent
◦ varies across physicians, notes, hospitals

Hard to analyze specific sections
◦ E.g. analyze allergies in a patient population
◦ Need to segment notes to extract all allergy info

◦ Information collected varies from note type to note type
  E.g. info in progress notes vs. admit notes

◦ Contents & formatting can vary from hospital to hospital
  Even within the same organization, e.g. Kaiser

◦ Contents & formatting vary between physicians
  Different styles, speed of typing, etc.

If you are looking at a single note type from a single hospital, then maybe

Not suitable as a general segmentation approach:

Can easily break:
◦ on unseen note types and minor format variations
◦ Example: regex based on all caps; regex based on seen headers only
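To illustrate the fragility described above, here is a minimal sketch of an all-caps regex applied to a few lines of the sample note. The pattern and the `find_headers` helper are hypothetical and only stand in for the kind of hand-written rule the slide refers to.

```python
import re

# Hypothetical rule: a header is an all-caps line that ends with a colon.
ALL_CAPS_HEADER = re.compile(r"^[A-Z][A-Z /&-]*:\s*$")

def find_headers(lines):
    """Return the lines that the all-caps rule treats as section headers."""
    return [line for line in lines if ALL_CAPS_HEADER.match(line.strip())]

note = [
    "CHIEF COMPLAINT:",
    "Injured right little toe.",
    "Review of Systems:",                                  # mixed case
    "CONSTITUTIONAL: No fever, chills, or weight loss.",
    "PAST MEDICAL HISTORY",                                # no colon
    "Multiple myeloma, peripheral neuropathy, hypertension..",
    "PAST SURGICAL HISTORY:-",                             # trailing dash
    "Stem cell transplant.",
    "ALLERGIES:",
    "ASPIRIN.",
]

print(find_headers(note))  # ['CHIEF COMPLAINT:', 'ALLERGIES:']
```

The rule finds only two of the five headers in this excerpt. Each miss comes from a minor format variation (mixed case, a missing colon, a stray dash), which is the failure mode listed above.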

Several works have explored supervised methods for segmenting clinical notes [Cho et al. 2003, Tepper et al. 2012, Apostolova et al. 2009]

Problem: methods not general!
◦ Cho et al. 2003: one model for each type of note
  20 note types → 20 models! Not practical to maintain each model

◦ Tepper et al. 2012: model had low adaptability to unseen documents
  (features used, training data used, etc.)

General segmentation approach for clinical texts

Requirements:
◦ Single model/approach for most note types
◦ Discount extreme non-standard formatting, e.g. tabular format

Segment:
◦ Header
◦ Top-level sections
◦ Footer


[Figure: the sample note with its lines marked as a Header followed by Top-level sections]

Supervised approach using L1-regularized logistic regression with a constraint combination approach

Idea: scan each line in a clinical document and label it as:
◦ BeginHeader
◦ ContHeader
◦ BeginSection
◦ ContSection
◦ Footer

Labels are predicted with a certain confidence

But there is a problem with using line-wise predictions as is:
◦ Label sequences may not make sense
◦ E.g. there may be a BeginHeader after a BeginSection → incorrect

Post-processing: enforce sequence combination rules:
◦ First line of document: BeginHeader or BeginSection
◦ BeginHeader cannot come right after BeginHeader or ContHeader
◦ ContHeader must come after BeginHeader or ContHeader
◦ ContSection must come after BeginSection or ContSection
◦ Footer cannot come right after BeginHeader or ContHeader

Rules applied after all lines in the document are labeled
◦ Applied to consecutive label pairs
◦ Computed efficiently: Viterbi algorithm
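The post-processing step can be sketched as a Viterbi search over per-line label confidences, restricted to label pairs the rules allow. This is a minimal sketch and not the authors' implementation: it encodes only the five rules listed above, and scoring a sequence by the product of line-wise probabilities is an assumption. The names `allowed_pair` and `constrained_decode` are hypothetical.

```python
import math

LABELS = ["BeginHeader", "ContHeader", "BeginSection", "ContSection", "Footer"]
HEADER = {"BeginHeader", "ContHeader"}
NEG_INF = float("-inf")

def allowed_start(label):
    # Rule: the first line must be BeginHeader or BeginSection.
    return label in ("BeginHeader", "BeginSection")

def allowed_pair(prev, cur):
    # Pairwise rules exactly as listed on the slide.
    if cur == "BeginHeader":
        return prev not in HEADER
    if cur == "ContHeader":
        return prev in HEADER
    if cur == "ContSection":
        return prev in ("BeginSection", "ContSection")
    if cur == "Footer":
        return prev not in HEADER
    return True  # no rule listed for BeginSection

def log_prob(p):
    return math.log(p) if p > 0 else NEG_INF

def constrained_decode(line_probs):
    """line_probs: one dict {label: probability} per line.
    Returns the highest-scoring label sequence that satisfies the rules."""
    if not line_probs:
        return []
    scores = [{l: log_prob(line_probs[0].get(l, 0.0)) if allowed_start(l) else NEG_INF
               for l in LABELS}]
    back = []  # back[i][label] = best previous label for line i + 1
    for probs in line_probs[1:]:
        row, ptr = {}, {}
        for cur in LABELS:
            best_prev, best = None, NEG_INF
            for prev in LABELS:
                if allowed_pair(prev, cur) and scores[-1][prev] > best:
                    best_prev, best = prev, scores[-1][prev]
            row[cur] = best + log_prob(probs.get(cur, 0.0)) if best_prev else NEG_INF
            ptr[cur] = best_prev
        scores.append(row)
        back.append(ptr)
    last = max(LABELS, key=lambda l: scores[-1][l])
    if scores[-1][last] == NEG_INF:
        raise ValueError("no label sequence satisfies the rules")
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Made-up confidences: the line-wise best label for line 2 is ContHeader,
# which may not follow BeginSection.
probs = [
    {"BeginSection": 0.8, "BeginHeader": 0.2},
    {"ContHeader": 0.6, "ContSection": 0.4},
    {"BeginSection": 0.7, "ContSection": 0.3},
]
print(constrained_decode(probs))  # ['BeginSection', 'ContSection', 'BeginSection']
```

Because every rule constrains only a pair of consecutive labels, the search needs just the best score per label at the previous line, so decoding is linear in the number of lines.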

• Notes from 12 different enterprises
• Some large enterprises
• All sorts of note types
• Some noisy sectioning, some clean

Inpatient / Outpatient

• 100 radiology notes
• Fairly clean sections

• One hospital
• All sorts of note types
• Fairly well sectioned
• 35,000 notes in total

• 2000 randomly sampled notes (inpatient)

• 100 radiology notes
• Fairly clean sections

Emphasis on training data

Variation in training data
◦ Use different note types for training
◦ Intuition: helps the model generalize well

Sample training data:
◦ Instead of using all training data from 2100 notes
◦ Generated subsets of training data of varying size and cross-validated on test sets
◦ Intuition: allows us to pick the best model
  Best model used only < 700 notes (out of 2100)
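The sampling strategy can be sketched as a loop over subset sizes that trains on random subsets and keeps the best-scoring model. This is a sketch under assumptions: `train_fn` and `score_fn` are hypothetical callables (for example, fitting the line classifier and averaging held-out accuracy), and the number of trials per size is made up.

```python
import random

def pick_best_subset(train_notes, sizes, train_fn, score_fn, trials=3, seed=0):
    """Train on random subsets of several sizes; keep the best-scoring model.

    train_fn(subset) -> model and score_fn(model) -> float are hypothetical
    callables supplied by the caller.
    """
    rng = random.Random(seed)
    best = None  # (score, subset_size, model)
    for size in sizes:
        for _ in range(trials):
            subset = rng.sample(train_notes, size)
            model = train_fn(subset)
            score = score_fn(model)
            if best is None or score > best[0]:
                best = (score, size, model)
    return best

# Toy stand-ins so the sketch runs: the "model" is just the subset size,
# and the made-up score is highest when 500 notes are used.
notes = list(range(2100))
score, size, model = pick_best_subset(
    notes,
    sizes=[100, 300, 500, 1000, 2100],
    train_fn=lambda subset: len(subset),
    score_fn=lambda m: -abs(m - 500),
)
print(size, score)
```

In practice the scoring function would be the expensive part, so the list of sizes and the number of trials would be kept small.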

5 test sets
◦ 4 of 5 test sets are from hospitals not in the training set → true estimate of accuracy
◦ Covers both inpatient and outpatient notes
◦ Covers different note types
◦ ~12,500 test notes

Primary evaluation metric: line-wise accuracy
◦ percentage of correctly predicted line labels
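The metric is simple enough to state in a few lines. This is a minimal sketch; the function name `linewise_accuracy` and the example labels are illustrative only.

```python
def linewise_accuracy(gold, predicted):
    """Percentage of lines whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have one label per line")
    if not gold:
        raise ValueError("cannot score an empty document")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)

gold = ["BeginHeader", "ContHeader", "BeginSection", "ContSection", "BeginSection"]
pred = ["BeginHeader", "ContHeader", "BeginSection", "ContSection", "ContSection"]
print(linewise_accuracy(gold, pred))  # 80.0
```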

Train set                    3-fold cross-validation    Unseen test accuracy
Inp1HospB (300 - limited)    96.70%                     67.00%
Inp3HospD (300 - varied)     96.58%                     88.23%

Important to have variety in training notes when building a general segmentation model

1st model: limited variety (hp + discharge)

2nd model: variety (11 types - hp, ds, pn…)

3-fold cross-validation accuracy: high in both

Model with variety: higher accuracy on unseen test set

Client/Data          In/Outpatient   # Test Docs   Accuracy
1. Inp1HospB         In              300           92.58%
2. Inp2HospC         In              1000          93.29%
3. Inp3HospD         In              300           95.81%
4. Rad1MixedHosps    Out             9000          92.45%
5. Rad2HospA         Out             1902          93.67%
Average                                            93.56%

Accuracy consistently > 90% across enterprises
• Average accuracy: 93.56%
• Covers inpatient/outpatient

Single model, but performs well across enterprises

Accuracy Breakdown for Inp2HospC

Document Type              Accuracy
1. History and Physical    95.70%
2. Physician Clinicals     93.10%
3. Discharge Summary       94.00%
4. Consult Note            94.60%
5. Short Stay Summary      94.60%
6. Operative Note          92.20%
7. Progress Note           87.80%
8. Cardiac Cath Report     85.40%
9. Procedure Note          83.60%

• Model performs well across note types
• Lowest performance: procedure notes (low recall on segmenting “technique” sections)

Performs very well: > 90%

Reasonable: > 80%

[Figure: # Notes vs. Accuracy. x-axis: # Training Notes (0 to 2000); y-axis: Accuracy (86% to 94%)]

Avg. accuracy peaks at 500 notes on all test sets

No benefit with more notes

No need for big data for a general model. We need good data from all that big data!

Unigrams of each line (LineUnigram)

Relative position of line in document (PosInDoc)
◦ Top, Middle, Bottom

Known header features (KnownHeader)
◦ Find potential headers using a repository of seen headers

◦ Seen headers can have a canonical type
  E.g. Past Medical History, Previous Med History → “PAST_MEDICAL_HISTORY”

◦ If potential headers are found, we include features:
  Canonical type
  Unigrams & char n-grams of potential header
  Caps/colon info: mixed case, all caps, lowercase
  Length of potential header
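The three feature groups can be sketched as a function that maps one line to a sparse feature dictionary. Several details here are assumptions rather than the authors' choices: the tiny `KNOWN_HEADERS` repository, splitting the document into thirds for Top/Middle/Bottom, taking the text before the first colon as the potential header, and character trigrams as the n-gram size.

```python
import re

# Hypothetical repository of seen headers mapped to canonical types.
KNOWN_HEADERS = {
    "past medical history": "PAST_MEDICAL_HISTORY",
    "previous med history": "PAST_MEDICAL_HISTORY",
    "allergies": "ALLERGIES",
}

def position_bucket(index, total):
    # Assumption: split the document into equal thirds.
    if 3 * index < total:
        return "Top"
    if 3 * index < 2 * total:
        return "Middle"
    return "Bottom"

def case_style(text):
    if text.isupper():
        return "ALL_CAPS"
    if text.islower():
        return "lowercase"
    return "mixed"

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def line_features(lines, index):
    """Sparse features for lines[index]: LineUnigram, PosInDoc, KnownHeader."""
    line = lines[index].strip()
    feats = {}
    # LineUnigram: one indicator per lowercased token.
    for tok in re.findall(r"[a-z0-9]+", line.lower()):
        feats["uni=" + tok] = 1.0
    # PosInDoc: relative position of the line.
    feats["pos=" + position_bucket(index, len(lines))] = 1.0
    # KnownHeader: look up the text before the first colon in the repository.
    candidate = line.split(":", 1)[0].strip()
    canonical = KNOWN_HEADERS.get(candidate.lower())
    if canonical:
        feats["hdr_type=" + canonical] = 1.0
        feats["hdr_case=" + case_style(candidate)] = 1.0
        feats["hdr_colon"] = 1.0 if ":" in line else 0.0
        feats["hdr_len"] = float(len(candidate.split()))
        for tok in candidate.lower().split():
            feats["hdr_uni=" + tok] = 1.0
        for gram in char_ngrams(candidate.lower()):
            feats["hdr_chr=" + gram] = 1.0
    return feats

lines = [
    "CHIEF COMPLAINT:",
    "Injured right little toe.",
    "PAST MEDICAL HISTORY",
    "Multiple myeloma, peripheral neuropathy, hypertension..",
    "ALLERGIES:",
    "ASPIRIN.",
]
print(sorted(line_features(lines, 2)))
```

Dictionaries like these can be fed to any linear classifier that accepts sparse features. The canonical type lets differently worded headers share a single feature.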

Feature Set                         Avg. Accuracy   Improvement
LineUnigram                         85.55%
LineUnigram+PosInDoc                88.62%          +3.46%
LineUnigram+PosInDoc+KnownHeader    93.10%          +4.81%

Explored:
◦ Supervised approach to building a very general segmentation model for clinical texts

Evaluation showed:
◦ Model works well on notes across enterprises
◦ Model works across note types

Key to effectiveness:
◦ Variation in training data: all sorts of note types
◦ Training data selection strategy: sample and cross-validate
◦ Feature set: not explored in existing works

Contact:
Kavita Ganesan
ganesan.kavita@gmail.com
http://www.kavita-ganesan.com/
http://www.text-analytics101.com/
