
Speed-up of the ring recognition algorithm

Semeon Lebedev GSI, Darmstadt, Germany and LIT JINR, Dubna, Russia

Gennady Ososkov LIT JINR, Dubna, Russia


Speed up of the ring finder Dubna, 21.05.2009 2

Motivation

• Fast algorithm -> lower computing requirements
• Possibility to use the algorithm in on-line reconstruction
• Many-core CPUs -> algorithms can be parallelized

I. Kisel, March 2009, CBM Coll. Meeting


Ring recognition algorithm

Standalone ring finder. Two steps:

• Local search of ring candidates, based on local selection of hits and Hough Transform.
• Global search. Filter: the algorithm compares all ring candidates and keeps only good rings, rejecting clone and fake rings.

Time consumption: 99% (local search), 1% (global search)


Ring recognition algorithm, local search

• Preliminary selection of hits
• Histogram of ring centers
• Hough Transform
• Ellipse fitter
• Ring quality calculation
• Remove hits of found ring (only best matched hits)
• Ring array


Time consumption

• Define local area and hits: 30%
– Hits search
– Arrays initialization
– Where to optimize: hits search, array sizes and dimensions; remove dynamic memory allocation

• Hough Transform: 69%
– Triple loop of ring parameters calculation
– Where to optimize: calculations inside loops; reduce combinatorics

• Peak finder: 1%
– Peak finding in 2D and 1D arrays


Optimization of Hough Transform

• Divide hits into several parts
• Make the Hough Transform of each part independently
• Sum up the histograms

First part of hits -> Hough Transform
Second part of hits -> Hough Transform
Both histograms -> Sum up histogram
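Why splitting helps: a triple loop over N hits visits N(N-1)(N-2)/6 triplets, so two halves of N/2 hits each visit roughly a quarter as many in total. A minimal C++ sketch of the idea follows. The names `Hit`, `FillCentersFromTriplets` and `HoughTransformInParts` are hypothetical, the interleaved split (hit i goes to part i % nParts) is an assumption, and only the 2D histogram of ring centers is filled; the 1D radius histogram is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Hit { float x, y; };  // hypothetical hit type

// Triple loop: every triplet of hits defines a circle; its center is
// accumulated in an nBins x nBins histogram covering [lo, hi) in x and y.
void FillCentersFromTriplets(const std::vector<Hit>& hits, int nBins,
                             float lo, float hi, std::vector<int>& histogram) {
    const float scale = nBins / (hi - lo);  // bins per unit length
    const std::size_t n = hits.size();
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            for (std::size_t k = j + 1; k < n; ++k) {
                // Coordinates relative to hit i.
                const float a1 = hits[j].x - hits[i].x, b1 = hits[j].y - hits[i].y;
                const float a2 = hits[k].x - hits[i].x, b2 = hits[k].y - hits[i].y;
                const float det = a1 * b2 - a2 * b1;
                if (std::fabs(det) < 1e-12f) continue;  // (nearly) collinear hits
                const float c1 = 0.5f * (a1 * a1 + b1 * b1);
                const float c2 = 0.5f * (a2 * a2 + b2 * b2);
                const float xc = hits[i].x + (c1 * b2 - c2 * b1) / det;
                const float yc = hits[i].y + (a1 * c2 - a2 * c1) / det;
                if (xc < lo || xc >= hi || yc < lo || yc >= hi) continue;
                const int ix = std::min(static_cast<int>((xc - lo) * scale), nBins - 1);
                const int iy = std::min(static_cast<int>((yc - lo) * scale), nBins - 1);
                ++histogram[static_cast<std::size_t>(iy) * nBins + ix];
            }
}

// Splits the hits into nParts groups, transforms each group independently,
// and sums the per-part histograms.
std::vector<int> HoughTransformInParts(const std::vector<Hit>& hits, int nParts,
                                       int nBins, float lo, float hi) {
    assert(nParts > 0 && nBins > 0 && hi > lo);
    std::vector<std::vector<Hit>> parts(static_cast<std::size_t>(nParts));
    for (std::size_t i = 0; i < hits.size(); ++i)
        parts[i % static_cast<std::size_t>(nParts)].push_back(hits[i]);

    const std::size_t size = static_cast<std::size_t>(nBins) * nBins;
    std::vector<int> sum(size, 0);
    for (const std::vector<Hit>& part : parts) {
        std::vector<int> local(size, 0);  // histogram of this part only
        FillCentersFromTriplets(part, nBins, lo, hi, local);
        for (std::size_t b = 0; b < size; ++b) sum[b] += local[b];
    }
    return sum;
}
```

For 12 hits on one ring, a single part visits 220 triplets, while two parts of 6 hits visit 20 each. The per-part histograms are independent, so the same split could later be given to separate cores.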


Optimization of Hough Transform: SIMD and SSE2

SSE 128-bit registers can represent:
• sixteen 8-bit signed or unsigned chars,
• eight 16-bit signed or unsigned shorts,
• four 32-bit integers, or
• four 32-bit floating point variables.

128 bit register

Four concurrent add operations

Algorithm must work with single precision type (float)
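The "four concurrent add operations" can be illustrated with the standard SSE intrinsics from `<xmmintrin.h>`. This is a small sketch and not code from the ring finder: the function name `AddFloats` and the guard macro `RING_SKETCH_HAVE_SSE` are made up for the example, and a scalar fallback is included so it also builds on non-x86 targets.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define RING_SKETCH_HAVE_SSE 1
#endif

// Element-wise sum of two float arrays; n must be a multiple of 4.
void AddFloats(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
#ifdef RING_SKETCH_HAVE_SSE
        // One 128-bit register holds four floats; _mm_add_ps adds all four at once.
        const __m128 va = _mm_loadu_ps(a + i);
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
#else
        // Scalar fallback with the same result, one float at a time.
        for (std::size_t k = 0; k < 4; ++k) out[i + k] = a[i + k] + b[i + k];
#endif
    }
}
```

With `double` only two values fit into one register, which is why the single precision requirement matters for the gain.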


Ring finder and SIMD

SIMD version of CalculateRingParameters(x[3], y[3], &xc, &yc, &r) was implemented.

Scalar version: CalculateRingParameters(x[3], y[3], &xc, &yc, &r), where x, y, xc, yc, r are floats

SIMD version: CalculateRingParameters(xv[3], yv[3], &xcv, &ycv, &rv), where xv, yv, xcv, ycv, rv are F32vec4
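One way to obtain both versions from the same source is to write the function as a template over the number type. The sketch below is an assumption about how this could look, not the authors' implementation: it computes the circle through three points, and `Float4` is a hypothetical plain C++ stand-in for F32vec4 (four lanes, no intrinsics) so that the example compiles anywhere.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical 4-lane float type standing in for F32vec4.
struct Float4 {
    std::array<float, 4> v;
    explicit Float4(float s = 0.0f) { v.fill(s); }  // broadcast one value to all lanes
};
inline Float4 operator+(Float4 a, const Float4& b) { for (int i = 0; i < 4; ++i) a.v[i] += b.v[i]; return a; }
inline Float4 operator-(Float4 a, const Float4& b) { for (int i = 0; i < 4; ++i) a.v[i] -= b.v[i]; return a; }
inline Float4 operator*(Float4 a, const Float4& b) { for (int i = 0; i < 4; ++i) a.v[i] *= b.v[i]; return a; }
inline Float4 operator/(Float4 a, const Float4& b) { for (int i = 0; i < 4; ++i) a.v[i] /= b.v[i]; return a; }
inline Float4 sqrt(Float4 a) { for (int i = 0; i < 4; ++i) a.v[i] = std::sqrt(a.v[i]); return a; }

// Circle through three points. With T = float one triplet is processed,
// with T = Float4 four triplets are processed lane by lane.
// There are no branches: collinear points give inf/NaN in that lane.
template <typename T>
void CalculateRingParameters(const T x[3], const T y[3], T* xc, T* yc, T* r) {
    assert(xc && yc && r);
    using std::sqrt;  // float overload; sqrt(Float4) is found by argument-dependent lookup
    const T half(0.5f);
    // Coordinates relative to the first point.
    const T a1 = x[1] - x[0], b1 = y[1] - y[0];
    const T a2 = x[2] - x[0], b2 = y[2] - y[0];
    const T c1 = half * (a1 * a1 + b1 * b1);
    const T c2 = half * (a2 * a2 + b2 * b2);
    const T det = a1 * b2 - a2 * b1;
    const T dx = (c1 * b2 - c2 * b1) / det;  // center relative to the first point
    const T dy = (a1 * c2 - a2 * c1) / det;
    *xc = x[0] + dx;
    *yc = y[0] + dy;
    *r = sqrt(dx * dx + dy * dy);
}
```

Called with `float` arrays for the points (1, 0), (0, 1), (-1, 0), this gives a center at the origin and radius 1. Called with `Float4` arrays, each lane carries its own triplet.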


Optimization and performance

Optimization step                     Time per event, ms
Initial version                       750.4
Double -> float                       673.4
Refactoring + base class for HT       632.0
Remove modf()                         613.0
SIMD                                  507.4
Hits presearch                        533.6
Remove dynamic allocation             413.0
SIMD                                  299.4
Divide hits into several parts        167.6
SIMD                                  146.1
Calc. ring params inside loop         133.4
Change division to multiplication     115.6
RF parameters optimization            97.0

Speed-up factor: 7.7

Processor: Intel Pentium Core2 6400 2.13 GHz
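The slides do not show how the "Remove modf()" and "Change division to multiplication" steps were applied. A common place for both in a Hough Transform is the histogram bin lookup, so the sketch below uses that as an assumed example. The `Binning` struct and its members are hypothetical names.

```cpp
#include <cassert>

// Hypothetical bin lookup for a histogram with nBins bins over [lo, hi).
struct Binning {
    float lo;
    float invWidth;  // 1 / bin width, computed once outside the hot loop
    int nBins;

    Binning(float lo_, float hi_, int n)
        : lo(lo_), invWidth(n / (hi_ - lo_)), nBins(n) {
        assert(n > 0 && hi_ > lo_);
    }

    // Returns the bin index of v, or -1 if v is outside [lo, hi).
    int Index(float v) const {
        // Multiplication by the stored reciprocal replaces a division per call.
        const float t = (v - lo) * invWidth;
        if (!(t >= 0.0f && t < nBins)) return -1;  // also rejects NaN
        // For non-negative t the cast truncates to the integer part,
        // which is what a modf() call would otherwise be used for.
        return static_cast<int>(t);
    }
};
```

With `Binning b(-2.0f, 2.0f, 8)`, the value 0.3 falls into bin 4 and 2.0 is reported as out of range.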


Electron ring finding efficiency

Au-Au central collisions at 25 AGeV plus 5 e+ and 5 e-

Compact RICH geometry


Summary

• The ring finder was significantly optimized in terms of calculation speed without losing efficiency

• Next steps:
– HT parameters optimization
– Parallelization on multi-core CPUs
– Continue investigation of the SIMD version