Performance in Medical Image Computing
Dr Daniel Rueckert
Department of Computing
Imperial College London
Introduction
Distributed data storage
Distributed image analysis
The IXI project applies e-science to medical imaging research
Distributed image acquisition
Workflow
Aims
• To build a grid infrastructure for medical image analysis
• Apply it to exemplars relevant to:
  – biomedical research
  – drug discovery
  – healthcare
• Use e-science as a driver for novel algorithm development
IXI System: Overview
• How does IXI work?
• What sort of images?
• How big are images?
• How long do the algorithms take?
• What are the end-user applications?
IXI System: Aims
• Make large remote compute resources available via Grid Services
  – Dedicated service for each algorithm
  – Be able to compose one service with another to form a workflow
• Hide the complexity from the user
  – Seamless integration with interface
  – Invisible and secure transfer of files
• Make the results easily available
  – Store in a database
IXI System: Architecture
• Local network
  – Web portal
  – Relational database (image and meta data are directly imported from DICOM)
  – XML database (stored workflows)
  – File system (image files)
  – Locally hosted grid service (reinsertion)
• Remotely
  – Registry Service (index)
  – Workflow Service (workflow execution)
  – IXI Core Service (delegation)
Matching subjects located
Retrieve workflow & parse XML
Web Portal
User
Relational Database
XML Database
E-mail user
Workflow Service
Condor GT3 GRAM Execute locally
File transfer via Grid FTP
Insert derived data and copy to file system
Instantiate workflow and submit
Clicks link in e-mail
Retrieve workflow and local file locations
Select workflow template
Select files to keep
Reinsertion Service
Initiate reinsertion
IXI Core Service
Co-ordinate and delegate
File transfer via Grid FTP
Submit workflow
User enters search criteria
Finished
Workbench
Query
Review results
How it works
IXI: Dynamic brain atlas demonstrator
Database of medical images
rigid/non-rigid registration
Statistical or probabilistic atlas
classification
Age 16-35
Age 35-65
Age 65+
Web Portal
User
Relational Database
XML Database
Workflow Service
Workbench
How it works: Problems
Submit workflow
E-mail user?
What sorts of images?
2D images (e.g. x-ray)
3D images (e.g. CT, MR, PET)
4D images (e.g. CT, MR)
How big are images?
• Current clinical routine:
  – MRI examination: 200 – 300 slices of 256 x 256 pixels x 2 bytes per pixel ~30 Mbytes
  – CT examination: 10 – 30 slices of 512 x 512 pixels x 2 bytes per pixel ~10 Mbytes
  – Digital x-ray: 512 x 512 pixels x 2 bytes x 8 – 25 fps x 100 – 500 seconds ~1.5 Gbytes
    • but only a small fraction of this is used for measurement or archive
How big will images be soon?
• Latest technology:
  – MRI examination: 300 – 500 slices of 512 x 512 pixels x 2 bytes per pixel ~150 Mbytes
  – CT examination: 100 – 300 slices of 512 x 512 pixels x 2 bytes per pixel ~100 Mbytes
  – And can be dynamic, e.g. 10 – 50 cardiac phases
• The raw data problem:
  – Latest techniques manipulate raw data, e.g. 32 complex channels, which is 128x larger than reconstructed data ~20 Gbytes
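The size figures quoted for current and upcoming examinations can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope check: it assumes uncompressed storage and 1 Mbyte = 2^20 bytes, and the helper names `volume_mbytes` and `sequence_mbytes` are made up for illustration. It evaluates the low and high end of each quoted range.

```python
# Back-of-the-envelope storage estimates for the examinations listed above.
# Assumptions: uncompressed pixel data, 1 Mbyte = 2**20 bytes.

MBYTE = 2 ** 20


def volume_mbytes(slices, rows, cols, bytes_per_pixel=2):
    """Uncompressed size of a stack of slices, in Mbytes."""
    return slices * rows * cols * bytes_per_pixel / MBYTE


def sequence_mbytes(rows, cols, fps, seconds, bytes_per_pixel=2):
    """Uncompressed size of a 2D image sequence, in Mbytes."""
    return rows * cols * bytes_per_pixel * fps * seconds / MBYTE


# Current clinical routine: (low end, high end) of each quoted range
mri_now = (volume_mbytes(200, 256, 256), volume_mbytes(300, 256, 256))
ct_now = (volume_mbytes(10, 512, 512), volume_mbytes(30, 512, 512))
xray_now = (sequence_mbytes(512, 512, 8, 100), sequence_mbytes(512, 512, 25, 500))

# Latest technology
mri_new = (volume_mbytes(300, 512, 512), volume_mbytes(500, 512, 512))
ct_new = (volume_mbytes(100, 512, 512), volume_mbytes(300, 512, 512))

# Raw data: 128x the reconstructed size, applied to a 150 Mbyte MRI volume
raw_gbytes = 150 * 128 / 1024

for name, (low, high) in [
    ("MRI (current)", mri_now),
    ("CT (current)", ct_now),
    ("Digital x-ray (current)", xray_now),
    ("MRI (latest)", mri_new),
    ("CT (latest)", ct_new),
]:
    print(f"{name}: {low:.0f} to {high:.0f} Mbytes")
print(f"Raw MRI data: about {raw_gbytes:.1f} Gbytes")
```

The quoted approximations (~30, ~10, ~150 and ~100 Mbytes, ~1.5 and ~20 Gbytes) all fall inside or close to the computed ranges.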
How long do the algorithms take to run?
• Segmentation
  – tissue segmentation: between 30 secs and 10 minutes
  – anatomical segmentation: between several minutes and hours
• Registration
  – rigid and affine: between 30 secs and 5 minutes
  – non-rigid: between 10 minutes and 24 hours
• Visualisation:
  – rendering: near real-time even on standard PCs
Broad categories of IXI applications
• Accessing, Collecting and Mining Image Data:
  – Genomics, proteomics, gene expression
  – Drug discovery
  – Clinical trials
• Large Scale Simulation and Analysis
  – Simulation of cardiac blood flow using CFD
• Large image based databases
  – Interpretation, training
• Support of multidisciplinary and collaborative environments requiring complex planning and guidance tasks
  – Diagnosis
  – Treatment planning
  – Treatment verification
Biomedical Research Healthcare
Why does performance matter?
• Performance is mainly dependent on:
  – Computing time
  – Data transfer time
  – Reliability and availability of services
• Performance has different priority for different applications:
  – Drug discovery study with 100 subjects
  – Computer assisted surgery
Biomedical Research: Drug discovery
• Image mining:
  – Statistical parametric maps of volume change in patients with schizophrenia undergoing drug treatment
population, time t = 1
population, time t = 2
reference
intersubject registration
intrasubject registration
TBM
Why does performance matter?
• Drug discovery study with 100 subjects
  – End user: Researcher
  – Computing time for each job: ca. 8 hours
  – Total computing time: 100 x 8 hours, but jobs can run in parallel
  – Data transfer time for each job: ca. 1-2 minutes
  – Total transfer time: 100 x 2 minutes, however transfers can't run in parallel (complications: firewalls slow data transfer down significantly)
• Reliability is more important than run-time
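The time budget for the 100-subject study can be sketched as follows. This is a rough model and not part of the IXI system: it assumes every transfer completes before computing starts, that jobs run in equal batches across the available workers, and that the function name `study_wall_clock_hours` and the worker counts are hypothetical.

```python
import math


def study_wall_clock_hours(subjects, compute_hours, transfer_minutes, workers):
    """Estimate wall-clock hours when computing is parallel but transfers are serial.

    Simplifying assumption: all transfers complete before computing starts.
    """
    transfer_hours = subjects * transfer_minutes / 60  # serial transfers
    batches = math.ceil(subjects / workers)            # parallel compute batches
    return transfer_hours + batches * compute_hours


# Figures from the slide: 100 subjects, ca. 8 hours per job, 2 minutes per transfer.
# The worker counts are illustrative only.
for workers in (1, 25, 100):
    hours = study_wall_clock_hours(100, 8, 2, workers)
    print(f"{workers:3d} workers: about {hours:.1f} hours")
```

With enough workers the computing time collapses to a single 8-hour batch, while the 200 minutes of serial transfer stays fixed and becomes a visible share of the total.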
Healthcare: Computer-assisted interventions
Use non-rigid registration to update pre-operative plan
Ideally real-time; however, 10-20 minutes are acceptable
Why does performance matter?
• Computer-assisted surgery
  – End user: Clinicians & Surgeons
  – Computing time: ca. 1 – 8 hours on a workstation
  – Total computing time: depending on the available machine, between 10 mins (cluster) and several hours (single workstation)
  – Data transfer time: Can be neglected
• Reliability is important, but performance prediction is far more important:
  – On which machine should I run the job?
  – How long will it take on that machine?
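The two questions above amount to choosing a machine from predicted runtimes. The sketch below shows one way to express that choice. The function `pick_machine` is hypothetical, the 240 minutes stands in for "several hours" on a single workstation, and the 20-minute deadline is taken from the acceptable delay quoted for interventions.

```python
def pick_machine(predicted_minutes, deadline_minutes):
    """Return the fastest machine whose predicted runtime meets the deadline, else None."""
    feasible = {m: t for m, t in predicted_minutes.items() if t <= deadline_minutes}
    if not feasible:
        return None
    return min(feasible, key=feasible.get)


# Predicted runtimes in minutes (illustrative values, not measurements)
predicted = {"cluster": 10, "single workstation": 240}

choice = pick_machine(predicted, deadline_minutes=20)
print(choice)
```

A real scheduler would also need the uncertainty of each prediction, which this sketch ignores.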
Performance modelling for image registration
source
target
Rueckert et al., IEEE TMI 1999
Performance modelling for image registration
Initial transformation T
Final transformation T
Calculate cost function C for transformation T
Generate new estimate of T by minimizing C
Is new transformation an improvement?
Update transformation T
Non-linear optimization
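The loop in the flowchart above can be illustrated with a toy example. This is not the non-rigid registration algorithm referenced on the slide: it assumes a 1D signal, a transformation T that is a single integer shift, a sum-of-squared-differences cost C, and a simple neighbour search in place of a real non-linear optimizer. All function names are made up for illustration.

```python
import math


def ssd_cost(source, target, shift):
    """Cost C: mean squared difference between target and source shifted by `shift`."""
    total, count = 0.0, 0
    for i, t in enumerate(target):
        j = i - shift
        if 0 <= j < len(source):
            total += (t - source[j]) ** 2
            count += 1
    return total / count if count else float("inf")


def register(source, target, initial_shift=0, max_iterations=100):
    """Refine T (an integer shift) until no neighbouring estimate lowers C."""
    shift = initial_shift                      # initial transformation T
    cost = ssd_cost(source, target, shift)     # calculate cost function C
    for _ in range(max_iterations):
        # Generate new estimates of T and keep the one with the lowest cost
        candidate = min((shift - 1, shift + 1),
                        key=lambda s: ssd_cost(source, target, s))
        candidate_cost = ssd_cost(source, target, candidate)
        if candidate_cost >= cost:             # not an improvement: T is final
            break
        shift, cost = candidate, candidate_cost  # update transformation T
    return shift, cost


# Synthetic data: the target is the source bump moved three samples to the right
source = [math.exp(-((i - 20) / 6) ** 2) for i in range(60)]
target = [math.exp(-((i - 23) / 6) ** 2) for i in range(60)]

final_shift, final_cost = register(source, target)
print(final_shift, final_cost)
```

The number of iterations depends on how far the initial estimate is from the optimum, which is one reason the runtime of registration varies with the image pair.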
Performance modelling
• Analytical performance modelling:
  – Seems impossible
  – Not desirable, since it often takes more time than developing the algorithms
• Experimental performance modelling:
  – Run algorithms with different parameters and datasets
Work by Stephen Jarvis, Dan Spooner, Brian Foley, University of Warwick
Performance modelling
• Highly variable runtime - a factor of 16 between fastest and slowest at the same image size
• Two classes of registration. Depends on destination image.
• Self registration is fast.
• Significant speedup using MPI cluster implementation
• Prediction based on timing of subsampled images
Work by Stephen Jarvis, Dan Spooner, Brian Foley, University of Warwick
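Prediction from subsampled images could look like the sketch below. The slides do not say which model the Warwick group used. This example assumes a power law t = a * n^b in the voxel count n, fitted by least squares in log-log space, and the timings are synthetic values generated from that same law, not measurements.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of log(t) = log(a) + b * log(n); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_runtime(a, b, size):
    """Predicted runtime for an image with `size` voxels under the fitted law."""
    return a * size ** b


# Synthetic timings (seconds) for subsampled versions of one image pair.
# The constants 2e-5 and 1.1 are arbitrary illustration values.
subsampled_sizes = [32 ** 3, 64 ** 3, 128 ** 3]
subsampled_times = [2e-5 * n ** 1.1 for n in subsampled_sizes]

a, b = fit_power_law(subsampled_sizes, subsampled_times)
predicted_full = predict_runtime(a, b, 256 ** 3)
print(f"Predicted full-resolution runtime: {predicted_full:.0f} s")
```

Because runtime varies by a factor of 16 at the same image size, such a fit would have to be made per image pair and not once per image size.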
Modelling systems and applications
What next?
• Incorporate performance modelling and prediction into the IXI workflow (with help from S. Jarvis, Warwick):
  – to enable the user to tune parameters of the workflow with respect to the predicted performance
  – to enable the user to specify performance constraints
  – to inform the user about progress of the workflow and provide updated measures of predicted performance
  – to implement different scheduling policies for different IXI applications and end users
What next: Challenges
• Data transfer can affect performance significantly:
  – Model data transfer times
  – Model bottlenecks such as firewalls or database servers
• Performance modelling for different algorithms is a time-consuming, tedious task:
  – Large number of different algorithms and different implementations
  – Can this be automated?
• Reliability and availability are generally more important than performance; however, this will change as the grid middleware and infrastructure become more mature
• Future projects require near real-time performance
  – Analyze data while the patient is inside the scanner (Neurogrid)
Acknowledgements
• IXI team
  – Imperial College: Jo Hajnal, Andrew Rowland, Raj Chandrashekara, Michael Burns, Dimitrios Perperidis
  – University College: Derek Hill, Kelvin Leung, Bea Sneller
  – University of Oxford: Steve Smith, John Vickers
• Stephen Jarvis, Dan Spooner, Brian Foley
  High Performance Systems Group
  Department of Computer Science
  University of Warwick