Mining and Visualization of Flow Cytometry Data
ANGELA CHIN

UNIVERSITY OF HOUSTON RESEARCH EXPERIENCE FOR UNDERGRADUATES

JULY 3, 2013

Contents
1. Introduction to Flow Cytometry
2. The Problem
3. Current Approaches & Results
4. Future Work

Flow Cytometry
MEDICAL TECHNIQUE USED FOR CELL COUNTING AND CELL SORTING

How it Works

Picture from: Abcam http://www.abcam.com/index.html?pageconfig=resource&rid=11446

Flow Cytometry Application
Determine whether a person has B-cell lymphoma
Based on the number of clusters that result from flow cytometry
• Two clusters: cancer patient
• Three clusters: healthy individual

Example: Flow Cytometry Results

[Figure panels: Cancer Patient; Healthy Patient]

Problems with Current Methods
The process for determining if there are two or three clusters is manual
Doctors’ time could be better spent on other tasks

The Problem
CREATING AN AUTOMATED METHOD TO DETERMINE THE NUMBER OF CLUSTERS

Past Approaches
Many clustering methods exist
• Most need to know the number of clusters ahead of time
The most popular is k-means, but it has some problems
• Need to give the algorithm the number of clusters beforehand
• Has difficulty when clusters are close, different sizes, etc.
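
To make the first k-means limitation concrete, here is a minimal sketch of Lloyd's iteration on 1D data. It is an illustration only, not the code used in this project; the function name `kmeans_1d` and the deterministic initialisation are assumptions made for the example.

```python
def kmeans_1d(points, k, iters=50):
    """Lloyd's algorithm on 1D data. Note that k must be supplied by the caller."""
    pts = sorted(points)
    # Deterministic initialisation (an assumption): k evenly spaced order statistics.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

The function always returns exactly `k` centers, so asking for three centers on data with two obvious groups still yields three. The algorithm itself gives no indication of which `k` is right.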

Further Defining the Problem
We want to be able to determine the number of clusters when:
The distance between clusters is very small
The ratio of cluster sizes is large (100:1 to 1000:1)

We decided to further constrain the problem such that we could determine:
1 cluster vs 2 clusters when the size ratio was up to 1000:1

Current Approaches & Results

Two Approaches
Approach #1: Transformation
Find the center of the data
Take each point and find its angle from the horizontal line located at the center (new x-value) and distance from the center (new y-value)
Use transformed data to determine number of clusters
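
The transformation in Approach #1 can be sketched as a conversion to polar coordinates about the data center. The slides do not say how the center is found, so this sketch assumes it is the mean of the points; the name `polar_transform` is also hypothetical.

```python
import math

def polar_transform(points):
    """Map (x, y) points to (angle, distance) pairs about the data center."""
    n = len(points)
    # Center taken as the mean of the points (an assumption, not stated in the slides).
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    out = []
    for x, y in points:
        # Angle from the horizontal line through the center; negative angles
        # from atan2 are wrapped into the 0..2*pi range.
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        dist = math.hypot(x - cx, y - cy)
        out.append((angle, dist))  # (new x-value, new y-value)
    return out
```

For example, four points placed symmetrically on the unit circle map to angles 0, π/2, π and 3π/2, each at distance 1.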

Approach #2: Testing Normal Fit
Project 2D data onto line to create 1D data
Apply normal distribution fit
Compare the Bayesian Information Criterion (BIC) of the fit to a cut-off limit
If the BIC is above the limit, there are two clusters; otherwise, there is one
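
The steps of Approach #2 can be sketched as follows. This is a rough illustration under stated assumptions: the projection direction and the cut-off value are not given in the slides, so both are left as caller-supplied parameters, and the function names are hypothetical. The BIC uses the standard definition k·ln(n) − 2·ln(L), with k = 2 free parameters (mean and variance) for a single normal.

```python
import math

def project(points, direction):
    """Project 2D points onto the line through the origin along `direction`."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    return [(x * dx + y * dy) / norm for x, y in points]

def normal_fit_bic(values):
    """BIC of a single normal fitted by maximum likelihood (2 free parameters)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # ML estimate (divides by n)
    # Log-likelihood of the fitted normal, evaluated at the ML estimates.
    log_lik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return 2 * math.log(n) - 2 * log_lik

def count_clusters(points, direction, cutoff):
    """Return 2 if the BIC exceeds the (hypothetical) cut-off, otherwise 1."""
    return 2 if normal_fit_bic(project(points, direction)) > cutoff else 1
```

Because the raw BIC grows with the number of points, a cut-off chosen this way would only be meaningful for data sets of comparable size.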

Approach #1: Transformation

[Figure: plot with angle-axis ticks at π/2, 3π/2, 2π]

Approach #1: Transformation Process

[Figure: plot with angle-axis ticks at π/2, 3π/2, 2π]

Approach #1: Transformation

[Figure: plot with angle-axis ticks at π/2, 3π/2, 2π]

Approach #2: Testing Normal Fit

Approach #2: Testing Normal Fit
3 standard deviations apart, ratio 1:99

[Figure panels: ONE CLUSTER BEST FITS; TWO CLUSTER BEST FITS]

Approach #2: Testing Normal Fit
Comparing the BIC of one cluster versus two clusters
All data was generated using 100,000 points and the same standard deviations
The ratios between clusters and the distance between two clusters (if applicable) were varied
• Ratios: 199:1 to 63:1
• Distance: 1.5 to 5 standard deviations apart
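
Test data of the kind described above could be generated along these lines. This is a sketch only: it assumes 1D normal samples, a unit standard deviation by default, and a fixed random seed, none of which are stated in the slides, and `make_two_clusters` is a hypothetical name.

```python
import random

def make_two_clusters(n_total, ratio, separation_sd, sd=1.0, seed=0):
    """Generate 1D samples from two normals with equal standard deviation.

    ratio: size of the large cluster relative to the small one (199 for 199:1).
    separation_sd: distance between the cluster means, in standard deviations.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n_small = round(n_total / (ratio + 1))
    n_large = n_total - n_small
    large = [rng.gauss(0.0, sd) for _ in range(n_large)]
    small = [rng.gauss(separation_sd * sd, sd) for _ in range(n_small)]
    return large, small
```

With `n_total=100000` and `ratio=199`, the split is 99,500 points in the large cluster and 500 in the small one.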

Future Work

Future Work
Approach #1: Determine if there is a way to detect the second cluster in the transformation
Approach #2: Use real data to see if a cut-off can be determined
Overall: After figuring out how to distinguish one and two clusters, extend the method to two versus three clusters

Limitations
Assumes the data have a Gaussian distribution
Number of clusters limited to two or three

Acknowledgements
I would like to thank my research advisor, Dr. Stephen Huang, and Mitch Shih for their guidance on this project. I would also like to thank the University of Houston Computer Science Department and the National Science Foundation for providing me with the opportunity to participate in the REU.
