scaling unsupervised ciliary motion analysis for actionable biomedical insights with pyspark by...
![Page 1: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/1.jpg)
Scaling unsupervised ciliary motion analysis for actionable biomedical insights with PySpark
Shannon Quinn
University of Georgia
![Page 2: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/2.jpg)
Who am I?
• Georgia Tech alumnus
• Carnegie Mellon University & University of Pittsburgh alumnus
• Assistant Professor of Computer Science & Cellular Biology at University of Georgia
• Public health, imaging, data science, open science, running…
![Page 3: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/3.jpg)
What are cilia?
Scale bars: 10μm
![Page 4: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/4.jpg)
Why do we care about cilia?
• Clinical
– Ciliopathies
– Association with congenital heart disease
• Developmental
– Nodal flow
– Left-right asymmetry
![Page 5: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/5.jpg)
How do we diagnose ciliopathies?
Cheap, fast, inaccurate → Slow, expensive, accurate (?)
• Measure nasal nitric oxide (NO) levels
• Electron microscopy to search for structural defects
• Ciliary beat frequency (CBF) computation
• Manual ciliary beat pattern analysis
“Gold standard”
![Page 6: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/6.jpg)
What is our goal?
• Input: high-speed video of ciliary biopsy
• Output: quantitative properties of observed motion
Curly!
![Page 7: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/7.jpg)
Strategy for quantifying motion
![Page 8: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/8.jpg)
From videos to features
![Page 9: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/9.jpg)
Features of motion
Scaling
Deformation (biaxial shear)
Rotation (curl)
Not useful in 2D
Novel use of differential image velocity invariants to categorize ciliary motion defects. Quinn SP, Francis R, Lo C, Chennubhotla CS. Proceedings of the Biomedical Science and Engineering Conference (BSEC) 2011.
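As a rough illustration of the features named on this slide, the sketch below computes scaling (divergence), rotation (curl), and deformation from a dense 2D flow field with NumPy. The function name `differential_invariants`, the finite-difference scheme, and the synthetic rotating field are assumptions made for the example; the slides do not show how the actual pipeline computes these quantities.

```python
import numpy as np

def differential_invariants(u, v, spacing=1.0):
    """Return (divergence, curl, deformation) of a 2D flow field.

    u, v: 2D arrays holding the x- and y-components of the flow at each pixel.
    Hypothetical helper for illustration only.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    du_dy, du_dx = np.gradient(u, spacing)
    dv_dy, dv_dx = np.gradient(v, spacing)

    divergence = du_dx + dv_dy    # scaling
    curl = dv_dx - du_dy          # rotation
    # Magnitude of the two shear components (biaxial shear).
    deformation = np.hypot(du_dx - dv_dy, du_dy + dv_dx)
    return divergence, curl, deformation

# Synthetic test field: rigid rotation about the origin, u = -omega * y, v = omega * x.
omega = 0.5
y, x = np.mgrid[-5:6, -5:6].astype(float)
div, curl, deform = differential_invariants(-omega * y, omega * x)
```

For a rigid rotation the curl is constant at twice the angular rate, while the divergence and deformation vanish, which makes this field a convenient sanity check.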
![Page 10: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/10.jpg)
What do these features look like?
Rotation (rad/s)
![Page 11: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/11.jpg)
How do we model the features?
$$\vec{y}_t = C\vec{x}_t$$
$$\vec{x}_t = A_1\vec{x}_{t-1} + A_2\vec{x}_{t-2} + \dots + A_d\vec{x}_{t-d}$$
Feature vectors!
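One common way to estimate a model of this form is to take the subspace C from an SVD of the feature matrix and then fit the A matrices by least squares on the projected states. The sketch below follows that recipe on synthetic data. The function name `fit_ar`, the dimensions, and the generating matrices are all assumptions for illustration, not the author's code.

```python
import numpy as np

def fit_ar(Y, q, d):
    """Fit y_t = C x_t, x_t = A_1 x_{t-1} + ... + A_d x_{t-d} by SVD + least squares.

    Y: (n_features, T) array, one column per frame.
    Returns C with shape (n_features, q) and a list of d matrices of shape (q, q).
    Hypothetical helper for illustration only.
    """
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :q]                  # orthonormal basis of the q-dimensional subspace
    X = C.T @ Y                   # low-dimensional states, shape (q, T)
    T = X.shape[1]
    # Stack lagged states: row block i holds x_{t-i} for t = d .. T-1.
    lags = np.vstack([X[:, d - i:T - i] for i in range(1, d + 1)])  # (q*d, T-d)
    target = X[:, d:]                                               # (q, T-d)
    # Solve target ~= A @ lags in the least-squares sense.
    A = np.linalg.lstsq(lags.T, target.T, rcond=None)[0].T          # (q, q*d)
    return C, [A[:, i * q:(i + 1) * q] for i in range(d)]

# Synthetic, noise-free data generated from a known AR(2) process.
rng = np.random.default_rng(0)
q, d, T = 2, 2, 60
A_true = [np.array([[0.9, -0.3], [0.3, 0.9]]), np.array([[-0.2, 0.0], [0.0, -0.2]])]
X_true = np.zeros((q, T))
X_true[:, :d] = rng.standard_normal((q, d))
for t in range(d, T):
    X_true[:, t] = A_true[0] @ X_true[:, t - 1] + A_true[1] @ X_true[:, t - 2]
C_true = np.linalg.qr(rng.standard_normal((10, q)))[0]
Y = C_true @ X_true

C, A = fit_ar(Y, q, d)
```

The recovered C spans the same subspace as the generating basis but may differ from it by a rotation or sign flip, so the fitted A matrices are only comparable to the true ones up to that change of basis.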
![Page 12: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/12.jpg)
What can we do with these features?
93% accuracy
Automated identification of abnormal respiratory ciliary motion in nasal biopsies. Quinn SP, Zahid M, Durkin J, Francis R, Lo C, Chennubhotla CS. Science Translational Medicine 2015.
![Page 13: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/13.jpg)
Great, but…
![Page 14: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/14.jpg)
…definitely more than two motion types
![Page 15: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/15.jpg)
Subtypes likely have clinical implications
• Primary ciliary dyskinesia
– Genetic disorder directly affecting cilia
• Other disorders highly correlated with ciliary dysfunction
– Congenital heart disease
– Heterotaxy / situs inversus
– Cognitive defects
– Developmental defects
![Page 16: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/16.jpg)
Short answer: Yes! Clustering!
• AR parameters $A_1, A_2, \dots, A_d$
• Nonlinear space
• Geodesic distance metrics
– “Vanilla” K-means is out
$$\vec{y}_t = C\vec{x}_t$$
$$\vec{x}_t = A_1\vec{x}_{t-1} + A_2\vec{x}_{t-2} + \dots + A_d\vec{x}_{t-d}$$
![Page 17: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/17.jpg)
Dataset(s)
2015 Classification Study
• 291 videos
Unsupervised subtyping
• 291 from previous study
• 431 left out (artifacts)
• 628 from internal collaborators
• 1000+ from external collaborators
• ~200MB / video
• ~500GB raw data
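The raw-data figure can be sanity-checked from the counts on the slide. The snippet below assumes the four groups are disjoint, takes "1000+" at its lower bound, and uses decimal gigabytes (1 GB = 1000 MB); those choices are assumptions, not statements from the talk.

```python
# Video counts as listed on the slide; "1000+" is taken at its lower bound.
videos = {
    "previous study": 291,
    "left out (artifacts)": 431,
    "internal collaborators": 628,
    "external collaborators": 1000,
}
mb_per_video = 200  # approximate size quoted on the slide

total_videos = sum(videos.values())
total_gb = total_videos * mb_per_video / 1000  # decimal GB, an assumption
print(total_videos, total_gb)  # 2350 470.0
```

At least 2350 videos at roughly 200MB each comes to about 470GB, which agrees with the quoted ~500GB.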
![Page 18: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/18.jpg)
Data Acquisition
http://ciliaweb.csb.pitt.edu
![Page 19: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/19.jpg)
Spark Pipeline
• Preprocess videos
– Identify regions of interest (patches)
– Compute optical flow & motion features (rotation, deformation)
rdd = raw.flatMap(find_rois) \
    .map(flow_features)
Preprocess Features Clustering
(OpenCV, scikit-image, PCA-flow)
![Page 20: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/20.jpg)
Spark Pipeline
• Derive AR subspace
– Principal components
– Compute AR motion parameters A1…Ad
svd = rdd.computeSVD()
_svd_ = sc.broadcast(svd)
ar = rdd.map(ar_params)
Preprocess Features Clustering
(SciPy, thunder, bolt)
![Page 21: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/21.jpg)
Spark Pipeline
• Cluster parameters
– Pairwise similarity
– Eigendecomposition of graph Laplacian
L = ar.cartesian(ar) \
    .map(pairwise)
X = L.computeSVD()
DON’T DO THIS. EVER.
Preprocess Features Clustering
(scikit-learn)
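The clustering step on this slide (pairwise similarity, then eigendecomposition of the graph Laplacian) can be sketched on a single machine with NumPy. Forming all pairs, as `ar.cartesian(ar)` does, grows quadratically with the number of items, which is presumably what the warning refers to. In the sketch below, the function name `spectral_embedding`, the Gaussian affinity, and the Euclidean distance are assumptions. The Euclidean distance in particular is only a stand-in for the geodesic metrics the talk calls for.

```python
import numpy as np

def spectral_embedding(params, k, sigma=1.0):
    """Return the k smallest eigenpairs of the normalized graph Laplacian.

    params: (n, p) array, one flattened parameter vector per item.
    Hypothetical helper for illustration only.
    """
    # Pairwise squared Euclidean distances (stand-in for a geodesic metric).
    sq = np.sum(params**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * params @ params.T, 0.0)
    W = np.exp(-d2 / (2 * sigma**2))   # Gaussian affinity
    np.fill_diagonal(W, 0.0)           # no self-loops
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

# Two synthetic groups of parameter vectors, well separated along the first axis.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.2, size=(10, 2))
group_b = rng.normal(0.0, 0.2, size=(10, 2)) + np.array([4.0, 0.0])
eigvals, eigvecs = spectral_embedding(np.vstack([group_a, group_b]), k=2)

# With two groups, the sign of the second eigenvector gives a two-way split.
labels = eigvecs[:, 1] > 0
```

For more than two groups, the usual next step is to run k-means on the rows of the eigenvector matrix rather than thresholding a single eigenvector.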
![Page 22: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/22.jpg)
Eigenvectors of L
![Page 23: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/23.jpg)
Conclusions
• 93% classification accuracy: methods are sound
– Dynamic texture representation is accurate
• Low-dim embeddings of AR motion parameters
– Definitely more complicated than normal / abnormal
• Need lots of data!
![Page 24: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/24.jpg)
Big picture
• Blackbox tool for clinicians
– Web front-end + Python middleware + Spark back-end
• Upload video -> Get analysis
– Assist experts with diagnostics
• Expert input
– Phenotype annotations, regions of interest
![Page 25: Scaling Unsupervised Ciliary Motion Analysis for Actionable Biomedical Insights with PySpark by Shannon Quinn](https://reader031.vdocuments.us/reader031/viewer/2022021500/587155691a28ab8e5b8b509f/html5/thumbnails/25.jpg)
THANK YOU.
• [email protected]
• @SpectralFilter
• https://magsol.github.io/