Transcript
Page 1: Matthias Vogelgesang, Suren Chilingaryan and Andreas Kopmann · Matthias Vogelgesang, Suren Chilingaryan and Andreas Kopmann X-Ray Tomography at Synchrotrons Image Processing Toolkit

Flexible X-Ray Image Processing on GPUsMatthias Vogelgesang, Suren Chilingaryan and Andreas Kopmann

X-Ray Tomography at Synchrotrons

Image Processing Toolkit

X-ray tomography is a non-destructive technique to analyze and visualize the inner structure of organisms and materials in three and four dimensions.

• Core written in C• Uses hardware-dependent OpenCL kernels• Python and JS bindings• Implicit buffer management• Multi-threading and implicit synchronisation using asychronous queues • Automatic node-to-GPU mapping using heuristics

from gi.repository import Ufo

g = Ufo.Graph()for f in ['reader', 'writer', \ 'fft', 'ifft', 'ramp']: globals()[f] = g.get_filter(f)

bp = g.get_filter('backproject')fft.set_properties(dimensions=1)reader.connect_to(fft)fft.connect_to(ramp)ramp.connect_to(ifft)ifft.connect_to(bp)bp.connect_to(writer)

g.run()

High pixel resolutions, bit depths and frame rates of state-of-the-art cameras increase the size of a single data set to several gigabytes. In order to process sequences of recon-structed volumes during acquisition, GPU processing is in-evitable.

Our software toolkit is used for pre-processing, image re-construction and post-processing in online and offline environ-ments. Most algorithms can be expressed as a composition of smaller sub-operations such as convolutions, filters and complex arithmetics. Thus, we map each of the processing stages to a computation graph representing the full algorithm. Each node in the graph is connected using a queue that is used to push processed data to its successor and scheduled according to its workload.

Some of its key features are:

ReferenceChilingaryan, Mirone, Hammersley, Ferrero, Helfen, Kopmann, dos Santos Rolo and Vagovic. “A GPU-Based Architecture for Real-Time Data Assess-ment at Synchrotron Experiments” IEEE Transactions on Nuclear Science, vol. 58, pp. 1447–1455, Aug. 2011.

Results

Tesla S1070

0 200 400 600 800 1000 1200

1099

269

75

64

38

Reconstruction time in seconds

2x GTX 295

2x Xeon E5472

1x GTX 280

4x GTX 580

1 2 3 4

Number of GPUs

200

400

600

800

GFL

OPS

Scalability of Back-Projection on Tesla S1070Time for reconstructing an 11 GB 3D Volume from 2000 Projections (~ 24 GB)

Acquisition of Projections

Example: Reconstruction Graph

Sinogram Generation

FFT Filter IFFT

Back-Projection onone or more GPUs

Storage Segmentation/Meshing

Except for acquisition and storage, each node is executed on one of theavailable GPUs according to a heuristic.

Hardware Setup

100 fps, 5.5 Megapixel

330 fps, 2.2 Megapixel

Sychrotron Radiation10–65 keV Energy range Data Processing

w/ up to 6 GPUsHadoop Storage Cluster2 PB disk + 2.5 PB tape

Online evaluation allows us to adjust motor and acquisition parameters

Truelinearity

Top Related