Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
TRANSCRIPT
![Page 1: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/1.jpg)
Implementing 3D SPHARM Surfaces Registration on Cell Processor
Huian Li ([email protected]), Mi Yan ([email protected]), Robert Henschel (rhensche@indiana.edu), Li Shen (shenli@iupui.edu)
July 29, 2009
![Page 2: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/2.jpg)
Contents
• SPHARM registration
• Matlab implementation
• Cell implementation
• Performance Analysis
• Conclusion
![Page 3: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/3.jpg)
SPHARM Surfaces
• Radial and stellar surfaces
• Simply connected, arbitrarily shaped
• Vision, graphics, imaging, bioinformatics
![Page 4: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/4.jpg)
SPHARM Expansion
Area-preserving mapping: $(\theta, \phi) \rightarrow (x, y, z)$
![Page 5: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/5.jpg)
SHREC
(a) template, (b) object, (c) after ICP, (d) after registration of parameterization
![Page 6: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/6.jpg)
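Calculation of coefficients
• After rotating the parameter net on the surface by Euler angles (α, β, γ), the new coefficients will be:

$$c_l^m(\alpha\beta\gamma) = \sum_{n=-l}^{l} D_{mn}^{l}(\alpha\beta\gamma)\, c_l^n$$

where

$$D_{mn}^{l}(\alpha\beta\gamma) = e^{-i\gamma n}\, d_{mn}^{l}(\beta)\, e^{-i\alpha m}, \qquad d_{mn}^{l}(\beta) = \sum_{t=\max(0,\,n-m)}^{\min(l+n,\,l-m)} (-1)^t\, d_{mnt}^{l}(\beta)$$

and

$$d_{mnt}^{l}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$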
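The rotation formulas on this slide can be sketched in plain Python as follows. This is a minimal, unoptimized illustration and not the authors' Matlab or Cell code. The function names (`wigner_d_term`, `wigner_d`, `rotate_coefficients`) and the dict layout mapping `(l, m)` to one complex number are assumptions made for the example; in a SPHARM model each coefficient usually has x, y, and z components, and the same rotation would be applied to each component.

```python
import cmath
import math


def wigner_d_term(l, m, n, t, beta):
    """One summand of d^l_{mn}(beta), without the (-1)^t sign."""
    f = math.factorial
    # Square roots are taken per factor to keep intermediate floats small.
    num = (math.sqrt(f(l + n)) * math.sqrt(f(l - n))
           * math.sqrt(f(l + m)) * math.sqrt(f(l - m)))
    den = f(l + n - t) * f(l - m - t) * f(t + m - n) * f(t)
    cos_part = math.cos(beta / 2) ** (2 * l + n - m - 2 * t)
    sin_part = math.sin(beta / 2) ** (2 * t + m - n)
    return num / den * cos_part * sin_part


def wigner_d(l, m, n, beta):
    """d^l_{mn}(beta) as an alternating sum over t."""
    # These bounds keep every factorial argument non-negative.
    t_min = max(0, n - m)
    t_max = min(l + n, l - m)
    return sum((-1) ** t * wigner_d_term(l, m, n, t, beta)
               for t in range(t_min, t_max + 1))


def rotate_coefficients(coeffs, l_max, alpha, beta, gamma):
    """Return rotated coefficients; coeffs maps (l, m) -> complex (assumed layout)."""
    rotated = {}
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            total = 0j
            for n in range(-l, l + 1):
                big_d = (cmath.exp(-1j * gamma * n)
                         * wigner_d(l, m, n, beta)
                         * cmath.exp(-1j * alpha * m))
                total += big_d * coeffs[(l, n)]
            rotated[(l, m)] = total
    return rotated
```

The direct use of factorials is only practical for modest `l`; it is meant to mirror the formulas term by term rather than to be fast.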
![Page 7: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/7.jpg)
RMSD
• RMSD (Root Mean Square Distance): distance between two SPHARM models

$$\mathrm{RMSD} = \sqrt{\frac{1}{4\pi} \sum_{l=0}^{L_{\max}} \sum_{m=-l}^{l} \left\lVert c_{1,l}^{m} - c_{2,l}^{m} \right\rVert^{2}}$$

where $c_{1,l}^{m}$ and $c_{2,l}^{m}$ are the coefficients of the two SPHARM models
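As a rough illustration of the RMSD definition on this slide, the sketch below sums the squared coefficient differences over all (l, m) and scales by 1/(4π). The function name `spharm_rmsd` and the storage layout (a dict mapping `(l, m)` to a tuple of three complex numbers for x, y, z) are assumptions for the example, not something the slides specify.

```python
import math


def spharm_rmsd(c1, c2, l_max):
    """RMSD between two coefficient sets given as {(l, m): (cx, cy, cz)}."""
    total = 0.0
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            # Squared Euclidean norm of the complex difference vector.
            total += sum(abs(a - b) ** 2
                         for a, b in zip(c1[(l, m)], c2[(l, m)]))
    return math.sqrt(total / (4 * math.pi))
```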
![Page 8: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/8.jpg)
Matlab implementation
• A straightforward implementation in Matlab:

    for l = 0, Lmax
        for m = -l, l
            for n = -l, l
                for t = max(0, n-m), min(l+n, l-m)
                    ... performing calculations ...

• One rotation for Lmax = 50 took 823 seconds on a 2 GHz quad-core Intel Xeon E5335
![Page 9: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/9.jpg)
Cell B.E.
![Page 10: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/10.jpg)
Cell implementation
• Domain decomposition:

    for l = 0, Lmax
        for m = -l, l
            for n = -l, l
                for t = max(0, n-m), min(l+n, l-m)
                    ... calculations ...

• Decomposition along l leads to workload imbalance among SPUs
• Decomposition along m creates unnecessary data communication
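To see why splitting along l is unbalanced, one can count the innermost iterations per value of l. The sketch below is an illustration under assumptions: the helper names `inner_iterations` and `loads_by_l_blocks` are hypothetical, the t bounds are those that keep the factorial arguments of the rotation formula non-negative, and handing contiguous blocks of l to each worker is just one possible decomposition, not necessarily the one the authors tried.

```python
def inner_iterations(l):
    """Count innermost (t-loop) iterations for a single value of l."""
    count = 0
    for m in range(-l, l + 1):
        for n in range(-l, l + 1):
            t_min = max(0, n - m)
            t_max = min(l + n, l - m)
            count += t_max - t_min + 1
    return count


def loads_by_l_blocks(l_max, workers):
    """Total inner iterations per worker when l is split into contiguous blocks."""
    per_l = [inner_iterations(l) for l in range(l_max + 1)]
    block = -(-len(per_l) // workers)  # ceiling division
    # May yield fewer blocks than workers when the split is uneven.
    return [sum(per_l[i:i + block]) for i in range(0, len(per_l), block)]
```

Because the work per l grows quickly with l, the worker holding the largest l values ends up with far more iterations than the one holding the smallest.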
![Page 11: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/11.jpg)
Cell implementation
• Loop fusion:

    for l = 0, Lmax
        for m = -l, l
            for n = -l, l
                for t = max(0, n-m), min(l+n, l-m)
                    ... calculations ...

• Unique index for combined loop: f(l, m) = l² + m + l
• Workload for each SPE: (Lmax + 1)² / (total # of SPEs)
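The fused index and the per-SPE split can be sketched as below. The names `fused_index`, `unfuse_index`, and `spe_ranges` are hypothetical, and giving the leftover indices to the first SPEs is an assumption; the slides only state the index formula and the average workload.

```python
import math


def fused_index(l, m):
    """Map (l, m) with -l <= m <= l to a unique index in [0, (Lmax + 1)^2)."""
    return l * l + l + m


def unfuse_index(k):
    """Invert fused_index: recover (l, m) from the combined index k."""
    l = math.isqrt(k)       # indices l^2 .. (l+1)^2 - 1 all belong to this l
    m = k - l * l - l
    return l, m


def spe_ranges(l_max, num_spes):
    """Split the (Lmax + 1)^2 fused indices into near-equal half-open ranges."""
    total = (l_max + 1) ** 2
    base, extra = divmod(total, num_spes)
    ranges = []
    start = 0
    for spe in range(num_spes):
        size = base + (1 if spe < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

For Lmax = 50 there are 2601 fused indices, so with 8 SPEs each range holds 325 or 326 of them.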
![Page 12: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/12.jpg)
Cell implementation
• Lookup table T for factorial
• Transform exponentials & multiplications into multiplications & additions, respectively.

$$d_{mnt}^{l}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$

$$= \exp\Big( \tfrac{1}{2}\big(T(l+n) + T(l-n) + T(l+m) + T(l-m)\big) - T(l+n-t) - T(l-m-t) - T(t+m-n) - T(t) + (2l+n-m-2t)\log\cos\frac{\beta}{2} + (2t+m-n)\log\sin\frac{\beta}{2} \Big)$$

where $T(k) = \log(k!)$.
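A minimal sketch of the lookup-table idea follows, assuming the table stores log(k!) for k = 0 .. 2·Lmax and that 0 < β < π so both logarithms are defined. The names `build_log_factorial_table` and `d_term_log` are hypothetical; the actual Cell code is not shown in the slides.

```python
import math


def build_log_factorial_table(l_max):
    """Return T with T[k] = log(k!) for k = 0 .. 2 * l_max."""
    table = [0.0] * (2 * l_max + 1)
    for k in range(1, 2 * l_max + 1):
        table[k] = table[k - 1] + math.log(k)  # log(k!) = log((k-1)!) + log(k)
    return table


def d_term_log(T, l, m, n, t, log_cos, log_sin):
    """One summand of d^l_{mn}(beta), evaluated in the log domain.

    log_cos and log_sin are log(cos(beta/2)) and log(sin(beta/2)),
    computed once per rotation and reused for every (l, m, n, t).
    """
    exponent = (
        0.5 * (T[l + n] + T[l - n] + T[l + m] + T[l - m])
        - T[l + n - t] - T[l - m - t] - T[t + m - n] - T[t]
        + (2 * l + n - m - 2 * t) * log_cos
        + (2 * t + m - n) * log_sin
    )
    return math.exp(exponent)
```

Working in the log domain replaces the factorial products and powers with table lookups, additions, and two multiplications, and it also avoids forming very large factorials such as 100! explicitly.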
![Page 13: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/13.jpg)
Cell implementation
• Others that are specific to Cell:
  • Vectorization & data alignment
  • DMA data transfer between main memory & local store
  • SPU decrementer
![Page 14: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/14.jpg)
Cell implementation
• Single precision vs. double precision: all data in single precision
![Page 15: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/15.jpg)
Cell implementation
• Single precision vs. double precision: partial data in double precision
![Page 16: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/16.jpg)
Cell implementation
• Single precision vs. double precision: all critical data in double precision
![Page 17: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/17.jpg)
Performance analysis
[Figure: Performance of one rotation on Cell BE. x-axis: Number of SPEs (1, 2, 4, 8, 16); y-axis: Time (seconds), 0 to 1.8]
![Page 18: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/18.jpg)
Performance analysis
[Figure: Performance of finding the shortest distance at Level 3 on Cell BE. x-axis: Number of SPEs (4, 8, 12, 16); y-axis: Time (seconds), 0 to 7000; series: GNU gcc, IBM xlc]
![Page 19: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/19.jpg)
Conclusion
• Performance increases dramatically on Cell due to its unique architecture and algorithm optimization.
• Care must be taken with data placement due to the limited local store.
• Care must also be taken with data transfer between local store and main memory.
![Page 20: Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor](https://reader033.vdocuments.us/reader033/viewer/2022060119/558cb611d8b42ae1408b45b1/html5/thumbnails/20.jpg)
The End
Questions?