© Trustees of Indiana University. Released under Creative Commons 3.0 unported license; license terms on last slide.
XSEDE-enabled High-throughput Caries Lesion Activity Assessment
Hui Zhang, Guangchen Ruan, Hongwei Shen, Michael Boyles, Huian Li, Masatoshi Ando
XSEDE'13, San Diego, July 24th, 2013
Outline
• Background
  – What is caries lesion activity
  – Scientific goal and computing objective
• Dataset and Methods
  – Computing task implemented serially
  – How the MapReduce framework can be applied
• Assessment Examples
  – Visualization and analysis
  – Qualitative and quantitative lesion activity assessment
• Conclusion and Future Work
Introduction
• Dental caries management project at IUSD (since 2010)
  – Scientific goal: reduce or reverse the prevalence of dental caries lesions (active → inactive → reversed)
• An active lesion is a caries lesion that exhibits evidence of progression over a specific period of time
  » losing mineral content (demineralization)
• An inactive (arrested) lesion is a caries lesion that exhibits no evidence of progression over a specific period of time
• A reversed lesion (with treatment)
  » gaining mineral content (remineralization)
Introduction
• Lesion activity assessment (arrested or active) is important
  – essential in dental studies
  – critical impact on dental treatment decision-making
  – an incorrect determination can easily result in the wrong treatment
Introduction
• But in today's dental clinical practice, visual and tactile inspections are commonly used:
  – subjective
  – accuracy depends on the observer's experience
  – results are often inconsistent
    » tracking
    » temporal comparison
Visual Assessment
Tactile Sensation
Introduction
• (Dental) Computing objective
  – Bring computers and computing technologies to dentistry research
    » dental imaging technology (µ-CT imaging → cross-sectional dental scans)
    » image segmentation (cross-sectional scans → ROIs)
    » visualization and analysis (lesion activity assessment → 3D-time series analysis)
– Design methods not only for "marking" on dental scans, but also quantifying the volumetric information in the assessment
– Use HPC and parallel computing to scale to larger datasets
Datasets and Methods
• The study reported 195 ground/polished 3×3×2 mm blocks prepared from extracted human teeth collected from Indiana dental practitioners (approved by IU IRB#0306-64)
[Figure: schematic diagrams showing specimen dimension (a) and region of interest (ROI) (b).]
Datasets and Methods
• Longitudinal dental experiment
  • uses a 5-phase demineralization/remineralization (dem./rem.) model
  • healthy (1) → dem (2) → dem (3) → dem (4) → rem (5)
• Temporal evaluation
  – µ-CT scans
  – specimen/phase
Datasets and Methods
• µ-CT Dental Scans
  – ~1000 scans per specimen per time point
  – each µ-CT scan
    • 16-bit gray-scale image
    • 1548×1120 resolution
    • ~1.65 MB size
    • the lesion on a µ-CT scan shows an observable gray-scale difference
Datasets and Methods
• 3D-Time Series Analysis Workflow (to quantify and compare volumetric lesion information over time)
  – Pre-analysis training
    • threshold, pivot values (based on histograms)
  – Region-of-interest (ROI) segmentation
    • blob detection, morphological operation
  – 3D construction
    • stacking ROIs, generating isosurface and geometry
  – Visual analysis (on volumetric models)
    • temporal comparison
      – How the lesion evolves on the same specimen
    • cross-conditional comparison
      – How the lesion evolves with different treatments
Datasets and Methods
• The Serial Implementation Model
  – A small collection of representative dental scans
    • threshold, valley grayscales, pivot values
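The slides do not say how the threshold and valley grayscales are derived from the histograms. As a rough sketch of one common approach (an assumption, not necessarily the authors' procedure), the threshold can be taken as the emptiest gray level between the two tallest histogram peaks. The function names `histogram` and `valley_threshold` are hypothetical.

```python
def histogram(pixels, levels):
    """Count how many pixels fall on each gray level 0..levels-1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    return counts


def valley_threshold(counts):
    """Return the gray level of the lowest bin between the two tallest peaks.

    A peak is an interior bin strictly higher than its left neighbour and at
    least as high as its right neighbour (the first and last bins are ignored).
    This peak rule is an illustrative assumption.
    """
    peaks = [i for i in range(1, len(counts) - 1)
             if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]]
    if len(peaks) < 2:
        raise ValueError("histogram is not bimodal")
    # Keep the two tallest peaks, ordered by gray level.
    tallest = sorted(peaks, key=lambda i: counts[i], reverse=True)[:2]
    lo, hi = sorted(tallest)
    # The valley is the emptiest bin strictly between them.
    return min(range(lo + 1, hi), key=lambda i: counts[i])


# Toy 8-level histogram with peaks at levels 2 and 6.
toy_counts = [0, 5, 9, 3, 1, 4, 8, 2]
toy_threshold = valley_threshold(toy_counts)
```

On real 16-bit scans the histogram would usually be smoothed first, since noise creates many small local peaks.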
Datasets and Methods
• The Serial Implementation Model
  – Select representative dental scans
    • threshold, pivot values
  – Segment ROIs on all scans (with established parameters)
    • binary image conversion
    • apply morphological operations (erosion and dilation) to remove false ROI candidates
    • blob detection → ROI boundary
    • process images to keep only relevant pixels
  – 3D construction
    • stack ROIs and visual analysis
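The segmentation steps above can be sketched in plain Python on a tiny image. This is a minimal illustration under stated assumptions, not the authors' code: foreground is taken to be pixels at or above the threshold (flip the comparison for the opposite polarity), the structuring element is 3×3, and the ROI is assumed to be the largest connected blob. All function names are hypothetical.

```python
from collections import deque


def binarize(img, threshold):
    """Binary image conversion: 1 where the pixel is >= threshold, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def _morph(mask, rule):
    """Apply a 3x3 neighbourhood rule; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[y + dy][x + dx]
                      if 0 <= y + dy < h and 0 <= x + dx < w else 0
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(mask):
    return _morph(mask, all)   # keep a pixel only if its whole window is set


def dilate(mask):
    return _morph(mask, any)   # set a pixel if anything in its window is set


def largest_blob(mask):
    """Blob detection: return the (y, x) pixels of the largest 4-connected blob."""
    h, w = len(mask), len(mask[0])
    seen, best = set(), set()
    for y in range(h):
        for x in range(w):
            if mask[y][x] and (y, x) not in seen:
                blob, queue = set(), deque([(y, x)])
                seen.add((y, x))
                while queue:
                    cy, cx = queue.popleft()
                    blob.add((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                if len(blob) > len(best):
                    best = blob
    return best


def segment_roi(img, threshold):
    """Erosion then dilation removes small false candidates; keep only ROI pixels."""
    mask = dilate(erode(binarize(img, threshold)))
    blob = largest_blob(mask)
    return [[p if (y, x) in blob else 0 for x, p in enumerate(row)]
            for y, row in enumerate(img)]


# Toy scan: a 4x4 bright block plus one isolated bright speck in the corner.
toy = [[0, 0, 0, 0, 0, 0, 9]] + [[0, 5, 5, 5, 5, 0, 0] for _ in range(4)] + [[0] * 7]
roi = segment_roi(toy, threshold=5)
```

In the toy image the speck disappears during erosion, while dilation restores the 4×4 block to its original extent.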
Datasets and Methods
• The Parallel Model
  • MapReduce centers on 2 functions to represent domain problems
  • General pattern: Map(Di) → list(Ki,Vi); Reduce(Ki, list(Vi)) → list(Vf)
  • Divide the dataset D into individual data values Di
  • Map(Di) is applied to each individual value, producing many lists of key-value pairs list(Ki,Vi)
  • Data produced by Map operations is grouped by key Ki, producing the associated values list(Vi)
  • Reduce(Ki, list(Vi)) takes each key Ki and its associated list of values list(Vi) to produce a list of final output values list(Vf)
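The general pattern can be illustrated with a single-process Python sketch. It only mimics the Map, group-by-key, and Reduce stages in memory; a real framework such as Hadoop distributes these stages across nodes. The `map_reduce` helper and the word-count example are illustrative assumptions, not part of the project.

```python
from collections import defaultdict


def map_reduce(dataset, map_fn, reduce_fn):
    """Run Map on every Di, group values by key Ki, then Reduce each group."""
    groups = defaultdict(list)
    for d in dataset:                      # D is divided into values Di
        for key, value in map_fn(d):       # Map(Di) -> list(Ki, Vi)
            groups[key].append(value)      # group values by key Ki
    # Reduce(Ki, list(Vi)) -> list(Vf), one call per key
    return {key: reduce_fn(key, groups[key]) for key in sorted(groups)}


def count_words_map(line):
    return [(word, 1) for word in line.split()]


def count_words_reduce(word, ones):
    return [sum(ones)]


word_counts = map_reduce(["a b a", "b c"], count_words_map, count_words_reduce)
```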
Datasets and Methods
• Lesion activity assessment using MapReduce

  Symbol   Value in this workflow
  D        ∑ Ii
  Di       Ii
  Ki       PhaseID
  Vi       roiByteArray
  Vf       3DModelByteArray

• Map(Di) → list(Ki,Vi):
  – performs ROI segmentation;
  – extracts the image phaseID (encoded in the filename);
  – produces (phaseID, roiByteArray) as a key-value pair
• Reduce(Ki, list(Vi)) → list(Vf):
  – receives ROI collections keyed to phaseID;
  – performs 3D construction;
  – produces a (phaseID, 3DModelByteArray) pair
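A toy, single-process sketch of this mapping follows. Several details are assumptions for illustration only: the filename pattern (the slides say only that the phase ID is encoded in the filename), the slice index carried alongside each ROI so slices can be ordered, the simple threshold used in place of full ROI segmentation, and 8-bit pixel values (the real scans are 16-bit). The "3D construction" here is just slice stacking; isosurface generation is not shown.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme, e.g. "spec012_phase3_slice0457.tif".
NAME = re.compile(r"phase(?P<phase>\d+)_slice(?P<slice>\d+)")


def map_scan(filename, pixels, threshold):
    """Map(Di): segment one scan and emit (phaseID, (sliceIndex, roiBytes))."""
    m = NAME.search(filename)
    if m is None:
        raise ValueError("no phase ID in filename: " + filename)
    phase, index = int(m.group("phase")), int(m.group("slice"))
    # Stand-in for ROI segmentation: zero out pixels below the threshold.
    roi = bytes(p if p >= threshold else 0 for p in pixels)
    return [(phase, (index, roi))]


def reduce_phase(phase, values):
    """Reduce(Ki, list(Vi)): stack the ROIs of one phase in slice order."""
    stacked = b"".join(roi for _, roi in sorted(values))
    return [(phase, stacked)]


def run(scans, threshold):
    groups = defaultdict(list)
    for filename, pixels in scans:
        for key, value in map_scan(filename, pixels, threshold):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reduce_phase(key, groups[key]))
    return results


scans = [
    ("s1_phase2_slice0001.tif", [0, 200, 10]),
    ("s1_phase1_slice0002.tif", [150, 0, 0]),
    ("s1_phase1_slice0001.tif", [0, 120, 130]),
]
models = run(scans, threshold=100)
```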
Datasets and Methods
• Better performance with sequence files and data compression
  • Hadoop excels at processing a small number of large files
  • Too many I/O operations → extra burden
• Implementation
  – Data packing before the 3D-time series workflow
  – Map task loads images
  – Reduce task
    » produces sequence files
    » applies compression
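The idea of packing many small image files into one large compressed container can be sketched with the Python standard library. This is not Hadoop's SequenceFile format or API. It is a simplified stand-in that stores length-prefixed (name, payload) records and compresses the whole block with `zlib`, which implements DEFLATE. The record layout and function names are assumptions.

```python
import struct
import zlib


def pack(records):
    """Pack (name, payload) records into one DEFLATE-compressed blob."""
    raw = bytearray()
    for name, payload in records:
        key = name.encode("utf-8")
        # Each record: 4-byte key length, 4-byte payload length, key, payload.
        raw += struct.pack(">II", len(key), len(payload)) + key + payload
    return zlib.compress(bytes(raw))


def unpack(blob):
    """Yield the (name, payload) records stored by pack()."""
    raw = zlib.decompress(blob)
    offset = 0
    while offset < len(raw):
        key_len, payload_len = struct.unpack_from(">II", raw, offset)
        offset += 8
        name = raw[offset:offset + key_len].decode("utf-8")
        offset += key_len
        yield name, raw[offset:offset + payload_len]
        offset += payload_len


records = [("a.tif", b"\x00" * 1000), ("b.tif", b"\x01" * 1000)]
blob = pack(records)
restored = list(unpack(blob))
```

Reading one large blob replaces many small file reads, which is the I/O burden the slide refers to.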
Datasets and Methods
• Computing setup and parameters
  – 64-node cluster on SDSC-Gordon
    • 8 Map slots, 4 Reduce slots
  – Used DEFLATE codec and block compression for sequence files
  – 40,000 images in 12.62 minutes
  – More performance and scalability data reported in "Exploiting MapReduce and Data Compression for Data-intensive Applications"
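For a rough sense of scale, the reported figures can be turned into throughput numbers. The calculation assumes each of the 40,000 images is about 1.65 MB (the per-scan size quoted earlier) and that 12.62 minutes is the wall-clock time for the whole batch. The slides do not state which workflow stages that time covers.

```python
images = 40_000
minutes = 12.62
mb_per_scan = 1.65            # approximate per-scan size from the dataset slide

seconds = minutes * 60
images_per_second = images / seconds           # about 52.8 images per second
total_gb = images * mb_per_scan / 1000         # about 66 GB of input
mb_per_second = images * mb_per_scan / seconds # about 87 MB per second
```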
Lesion Activity Assessment
• Quantitative Assessment
  – lesion and its volumetric change measured in pixel^3
  – objective and consistent comparisons across specimens and across different experimental conditions
  – scalable to larger datasets
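A minimal sketch of the volumetric measure, assuming the lesion volume is the count of nonzero voxels in the stacked ROI masks and that slice spacing equals the pixel size (so one voxel is one pixel^3). The function names and the sample volumes are hypothetical, not results from the study.

```python
def lesion_volume(roi_stack):
    """Count nonzero voxels in a stack of 2D ROI masks (units: pixel^3)."""
    return sum(1 for mask in roi_stack for row in mask for p in row if p)


def volume_changes(volumes_by_phase):
    """Return {(phase_a, phase_b): change} for consecutive recorded phases."""
    phases = sorted(volumes_by_phase)
    return {(a, b): volumes_by_phase[b] - volumes_by_phase[a]
            for a, b in zip(phases, phases[1:])}


# Two 2x2 slices with 3 and 1 lesion pixels respectively.
toy_volume = lesion_volume([[[0, 1], [1, 1]], [[0, 0], [1, 0]]])

# Made-up volumes per phase, for illustration only.
toy_changes = volume_changes({1: 0, 2: 40, 3: 55, 5: 30})
```

Under these assumptions a positive change means the lesion grew between phases and a negative change means it shrank.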
Lesion Activity Assessment
• 3D-Time Series Visualization
  – highlight the lesion's volumetric changes before/after (B/A) treatment
Lesion Activity Assessment
• 3D-Time Series Visualization
  – show the lesion's volumetric changes before/after (B/A) treatment
  – combine dem. and rem. enamel in an integrated view with transparency
Lesion Activity Assessment
• Shape Generation and Depth Measure
  – some studies seek the association between lesion depth and treatment variables
  – previous effort: approximate lesion depth based on grayscale in QLF images
Lesion Activity Assessment
• Shape Generation and Depth Measure
  – some studies seek the association between lesion depth and treatment variables
  – 3D Poisson surfaces constructed for interactive depth measurement and comparison
Conclusion
• Dental computing gives rise to a broad range of educational and treatment planning applications for dentistry;
• A promising research approach that lets users combine imaging technology, computational algorithms, and visualization methods to make lesion activity assessment faster and more accurate;
• The workflow can be supported computationally, implemented using a parallel programming model such as MapReduce, and further automated using HPC resources.
Future Work
• Provide templates to other domains with similar computing tasks
• Potential improvement of the workflow
  – The final result is much lighter than the raw inputs
    • Data transfer with ROI boundary vectors instead of heavy image arrays
    • Compression of intermediate analysis results
Thank you! Questions?