parallel accelerator project
Final presentation, Summer 2008
Student: Vitaly Zakharenko
Supervisor: Inna Rivkin
Duration: one semester
System functionality: the large picture
◦ Multiple signal sources share the same medium.
◦ Each source produces a periodic pulse sequence in the medium.
◦ An observer of the medium senses the superposed pulse sequences with added noise.
◦ A preprocessor detects pulses in the signal and stores each pulse as a pulse TOA (time of arrival).
◦ The TOA array produced by the preprocessor is conveyed to the system.
◦ The system separates the pulses into the original signals (i.e. into periodic pulse sequences).
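The observer's view can be sketched in Python (used here for brevity; the project code itself was written in C). The function name `observed_toas`, the periods and phases, and the pulse-drop probability are illustrative assumptions, and noise is modeled only as randomly missed pulses.

```python
import random


def observed_toas(sources, duration, drop_prob=0.0, seed=0):
    """Merge periodic pulse trains into one sorted list of (toa, source_index).

    sources   -- list of (period, phase) pairs, one per source (assumed values)
    duration  -- length of the observation window, same unit as the periods
    drop_prob -- probability that a pulse is missed (assumed noise model)
    The source index is ground truth for checking only; the system never sees it.
    """
    rng = random.Random(seed)
    pulses = []
    for index, (period, phase) in enumerate(sources):
        toa = phase
        while toa < duration:
            if rng.random() >= drop_prob:   # keep the pulse unless it is dropped
                pulses.append((toa, index))
            toa += period
    pulses.sort()
    return pulses


# Two sources sharing the medium: periods 10 and 7, phases 0 and 4.
merged = observed_toas([(10, 0), (7, 4)], duration=40)
toa_array = [toa for toa, _ in merged]
print(toa_array)  # [0, 4, 10, 11, 18, 20, 25, 30, 32, 39]
```

The resulting `toa_array` is the only input the separation system receives; recovering the two periodic sequences from it is the task described above.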
[Figure: signal produced by source #1; signal produced by source #2; signal as seen by the observer. Data structure for signal representation: an array of pulse TOAs (TOA1, TOA2, ...), shown with the missing pulse effect. System output: pulses separated by source.]
System components
◦ Simulator: on a PC, constructs datagrams.
◦ Datagram switch: on the FPGA, manages the flow of datagrams between the simulator and the processing units.
◦ Data processing units: on the FPGA, each unit processes datagrams.
Main system components
[Block diagram: the simulator on the PC; the switch and the processing units on the FPGA.]
Data processing units
Each unit contains a Nios II processor and C2H-generated hardware accelerators.
[Block diagram: Nios II embedded processor, sequence search C2H-generated accelerator, and histogram builder C2H-generated accelerator, connected through the Avalon switch fabric.]
Data processing algorithm
for {level} := 1 up to {maximum level} do
    1. Build histogram of differences (SDIF) of level := {level}.
    2. Add SDIF to cumulative histogram (CDIF).
    3. Find lowest periodicity column of CDIF above threshold.
    4. if {column found} = TRUE then
        4.1. Detect all pulse sequences of the periodicity.
        4.2. Mark pulses as associated.
    end if
    5. Check whether to break the loop.
end for
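The loop above can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the slides: TOAs are distinct integers, sequences are matched with no jitter tolerance, histograms are built over all pulses, the threshold is passed in as a function of the periodicity, and the names `sdif`, `find_sequences` and `separate` are hypothetical.

```python
from collections import Counter


def sdif(toas, level):
    """Histogram of TOA differences between pulses `level` positions apart."""
    return Counter(toas[i + level] - toas[i] for i in range(len(toas) - level))


def find_sequences(toas, free, pri, min_len):
    """Chain unassociated pulses spaced exactly `pri` apart (assumed: no jitter)."""
    index_of = {toa: i for i, toa in enumerate(toas)}
    sequences = []
    for start in range(len(toas)):
        chain, toa = [], toas[start]
        while toa in index_of and free[index_of[toa]]:
            chain.append(index_of[toa])
            toa += pri
        if len(chain) >= min_len:
            for i in chain:
                free[i] = False                    # step 4.2: mark as associated
            sequences.append(chain)
    return sequences


def separate(toas, max_level, threshold, min_len=3):
    """Return {periodicity: [pulse-index chains]} for a sorted TOA list."""
    free = [True] * len(toas)
    cdif, tried, found = Counter(), set(), {}
    for level in range(1, max_level + 1):
        cdif.update(sdif(toas, level))             # steps 1 and 2
        candidates = [p for p in sorted(cdif)
                      if p not in tried and cdif[p] > threshold(p)]
        if candidates:                             # steps 3 and 4
            pri = candidates[0]
            tried.add(pri)
            sequences = find_sequences(toas, free, pri, min_len)
            if sequences:
                found[pri] = sequences
        if not any(free):                          # step 5: all pulses associated
            break
    return found


# Three sources of period 12 at phases 0, 3 and 7 (assumed test data).
toas = [0, 3, 7, 12, 15, 19, 24, 27, 31, 36, 39, 43, 48, 51, 55]
span = toas[-1] - toas[0]
result = separate(toas, max_level=4, threshold=lambda pri: 0.8 * span / pri)
print(result)
```

With the assumed threshold (0.8 times the number of pulses a source of that periodicity could fit in the observation span), no column passes at levels 1 and 2, and periodicity 12 passes at level 3, yielding three sequences of five pulses each.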
[Figure: Source 1 signal, Source 2 signal, Source 3 signal, and the observed signal. The intervals between successive pulses of the observed signal repeat as a b c a b c a b c a b c a b c.]
Data processing example
Cumulative histogram (CDIF) update, level 1
[Figure: observed signal with pulse intervals a b c a b c a b c a b c a b c. SDIF (level = 1) has columns at a, b and c; after the update, CDIF has the same columns.]
Data processing example
Threshold crossing check, level 1
[Figure: CDIF columns at a, b and c plotted against the threshold function.]
No periodicity candidate, so no sequence search.
Data processing example
Cumulative histogram (CDIF) update, level 2
[Figure: SDIF (level = 2) has columns at a+b, b+c and c+a; after the update, CDIF has columns at a, b, c, a+b, b+c and c+a.]
Data processing example
Threshold crossing check, level 2
[Figure: CDIF columns at a, b, c, a+b, b+c and c+a plotted against the threshold function.]
No periodicity candidate, so no sequence search.
Data processing example
Cumulative histogram (CDIF) update, level 3
[Figure: SDIF (level = 3) has a column at a+b+c; after the update, CDIF has columns at a, b, c, a+b, b+c, c+a and a+b+c.]
Data processing example
Threshold crossing check, level 3
[Figure: CDIF columns at a, b, c, a+b, b+c, c+a and a+b+c plotted against the threshold function.]
Threshold satisfied by periodicity (a+b+c): search for all sequences of periodicity (a+b+c).
Data processing example
Sequence search results (final results)
[Figure: detected sequence #1, detected sequence #2, detected sequence #3.]
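The column heights in this example can be checked numerically. The sketch below assumes interval values a = 3, b = 4, c = 5 and five repetitions of the pattern; the slides give only the symbols, so the numbers are illustrative.

```python
from collections import Counter

a, b, c = 3, 4, 5                      # assumed interval values
intervals = [a, b, c] * 5              # a b c repeated five times
toas = [0]
for gap in intervals:
    toas.append(toas[-1] + gap)        # 16 pulses in total

cdif = Counter()
for level in (1, 2, 3):
    level_hist = Counter(toas[i + level] - toas[i]
                         for i in range(len(toas) - level))
    cdif.update(level_hist)            # accumulate SDIF into CDIF
    print(level, sorted(level_hist.items()))
# 1 [(3, 5), (4, 5), (5, 5)]
# 2 [(7, 5), (8, 4), (9, 5)]
# 3 [(12, 13)]
```

At levels 1 and 2 the counts are spread over three columns each, while at level 3 every difference falls into the single column a+b+c, which is why only that column can cross a threshold that the earlier columns miss.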
Input datagram format
Each row is 64 bits wide.
ID | Control bits | Len
TOA 1
TOA 2
...
TOA N
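A sketch of building and parsing such an input datagram is shown below. The slide gives only the field names and the 64-bit row width, so the split of the header row (32-bit ID, 16-bit control bits, 16-bit Len), the meaning of Len as the number of TOAs, the little-endian byte order, and the function names are all assumptions.

```python
import struct

# Assumed header layout: ID (32 bits), control bits (16 bits), Len (16 bits).
HEADER = struct.Struct("<IHH")


def pack_datagram(datagram_id, control, toas):
    """Build one header row followed by one unsigned 64-bit row per TOA."""
    header = HEADER.pack(datagram_id, control, len(toas))
    return header + struct.pack(f"<{len(toas)}Q", *toas)


def unpack_datagram(data):
    """Inverse of pack_datagram; returns (id, control, list of TOAs)."""
    datagram_id, control, length = HEADER.unpack_from(data, 0)
    toas = list(struct.unpack_from(f"<{length}Q", data, HEADER.size))
    return datagram_id, control, toas


raw = pack_datagram(7, 0x0001, [100, 250, 400])
print(len(raw))              # 32 bytes: one header row plus three TOA rows
print(unpack_datagram(raw))  # (7, 1, [100, 250, 400])
```

Keeping every row at 64 bits means the receiver can locate TOA k at a fixed offset of 8 * k bytes past the header.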
Output datagram format
Field names:
◦ Control fields set
◦ Length
◦ ID
◦ Total pulses associated
◦ Total sequences detected
◦ Association of pulse 1, Association of pulse 2, …, Association of pulse N
◦ Total pulses associated with sequence 1
◦ PRI of sequence 1
◦ Jitter of sequence 1
◦ Confidence level 1 of sequence 1
◦ Confidence level 3 of sequence 1
◦ PRI of sequence 2, …
Sizes (bytes), in the order they appear on the slide: 2, 2, 4, 4, 4, 2, 4, 2, 1, 1, …, 1, 4, 4, 4, …
Implementation for Nios II: testing and profiling
• In Visual Studio (VS), floating-point calculations were replaced by fixed-point calculations.
• The C code of the algorithm was ported from VS to the Nios II IDE.
• The algorithm was profiled on Nios II.
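The floating-point to fixed-point replacement can be illustrated with a small sketch. The slides do not state which fixed-point format was used; the Q16.16 format and the helper names below are assumptions, and Python integers stand in for what would be 32-bit values with 64-bit intermediates in C.

```python
FRAC_BITS = 16                  # assumed Q16.16: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS


def to_fixed(x):
    """Convert a float to the nearest Q16.16 value."""
    return int(round(x * ONE))


def fixed_mul(a, b):
    """Multiply two Q16.16 values; the raw product has 32 fraction bits."""
    return (a * b) >> FRAC_BITS


def fixed_div(a, b):
    """Divide two Q16.16 values; pre-shift the numerator to keep precision."""
    return (a << FRAC_BITS) // b


def to_float(a):
    """Convert a Q16.16 value back to a float (for inspection only)."""
    return a / ONE


# Example: evaluate 0.75 * 55 / 12 using integer operations only.
value = fixed_div(fixed_mul(to_fixed(0.75), to_fixed(55)), to_fixed(12))
print(to_float(value))  # 3.4375
```

On a soft processor without a hardware floating-point unit, integer shifts and multiplies like these are typically much cheaper than emulated floating-point operations, which is the usual motivation for such a conversion.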
SoPC system generation
◦ The hardware design was generated in the Altera SoPC Builder environment.
◦ Different SoPC system configurations were compared.
◦ The SoPC system was optimized:
    ◦ multiple clock domains were provided
    ◦ interconnect was minimized
    ◦ different processor types were compared
C2H acceleration
C2H hardware accelerators were generated for two blocks of the algorithm:
◦ Sequence search function (FindSeqs)
◦ Histogram builder function (BuildHist)
C2H accelerators: performance optimization
Sequence search (FindSeqs) function acceleration
◦ Accelerator results were unsatisfactory.
◦ Consumes a large amount of FPGA logic.
◦ Low acceleration gain (4x at most).
◦ Discarded after much optimization effort.
C2H accelerators: performance optimization
Histogram builder (BuildHist) function acceleration
◦ Good acceleration results
◦ 50x acceleration gain
◦ Moderate FPGA logic consumption
Design performance: FPGA resources
◦ 6% logic consumption
◦ 5% memory consumption
Design performance: timing
◦ Processing time of 1 to 7 ms
◦ 3 Nios systems significantly outperform a Pentium 4 processor