Ongoing developments of FPGAs and PCIe technologies
- CFD acceleration -
Gabriel Caffarena
Laboratory of Integrated Systems (LSI)
Universidad Politécnica de Madrid
CFD on Future Architectures
C2A2S2E - DLR Braunschweig
October 2009
16/10/2009 2FPGA and PCIe technologies - Gabriel Caffarena - LSI-UPM
Agenda
CFD with FPGAs
Hardware Design Methodology
Current results
Future work
1 Senior researcher
3 Researchers (PhD)
4 PhD candidates
4 Students
Scientific applications acceleration: CFD, bioinformatics
CAD tools (Precision analysis, C-to-HW, power, etc.)
FPGA prototyping (Wireless, Cryptanalysis, etc.)
Universidad Politécnica de Madrid: www.upm.es
LSI: www.lsi.die.upm.es
gabriel@die.upm.es
CFD with FPGAs
CFD is essential for the aeronautics industry
A huge amount of computational power is required
Use of custom hardware
  Field-Programmable Gate Arrays (FPGAs)
[Figure: PC host with FPGA attachment options: PCIe, in-socket]
FPGAs
  ASIC: fixed hardware
  FPGA: configurable hardware
  Microprocessor: programmable hardware
CFD with FPGAs: design flow
Analyze CFD code (C)
Design custom processor
  Computation: CFD, mathematical precision
  Communications: Host-FPGA, RAM-FPGA
  Intellectual Property (IP)
Implement HW: VHDL, CAD tools
Develop HW-SW interface: API
Integrate and validate
CFD with FPGAs
Total control
  CFD HW processor
  Custom precision
  High speedups
Many decisions to make
  Complexity
  Longer design times
Solution: high-level in-house and commercial tools
Hardware design methodology
Analyze CFD code (C)
HW-oriented C code
Precision analysis
  Fixed-point
VHDL code and CFD processor architecture
FPGA programming file
  Synthesis + debugging
    VHDL-to-gates
  Place and route
    Location and interconnection
  Post-place and route debugging
  Bitstream generation
Hardware design methodology
Analyze CFD code (C): MANUAL
HW-oriented C code: MANUAL
Precision analysis: AUTOMATIC
VHDL code
  Computation: AUTOMATIC
  Communications: MANUAL
FPGA programming: standard FPGA flow
Hardware Design Methodology: Precision Analysis
Floating-point vs fixed-point
  Fixed-point results in faster, low-power, low-area designs
Automatic precision analysis tool
  Design time reduction: from months to hours
Custom precision
  Custom precision for each variable/block
  Control on results accuracy: 10^-5, 10^-6, ...
  "Smaller" HW resources
  Faster processor
Detailed precision requirements for CFD?
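As a rough illustration of the floating-point vs fixed-point trade-off, the sketch below evaluates z = (a + b) * c with signed fixed-point integers and reports the absolute error against double precision. The Q-format, the round-to-nearest policy, the function names and the choice of expression are assumptions made for this example; the slides do not describe the arithmetic used in the actual CFD processor.

```c
#include <assert.h>
#include <stdint.h>

/* Convert x to signed fixed point with frac_bits fractional bits,
 * rounding half away from zero. Range limits are an assumption that
 * keeps every intermediate inside 64 bits for small operands. */
static int64_t to_fixed(double x, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 30);
    double scaled = x * (double)((int64_t)1 << frac_bits);
    return (int64_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a fixed-point value back to double for comparison. */
static double to_double(int64_t q, int frac_bits)
{
    return (double)q / (double)((int64_t)1 << frac_bits);
}

/* Fixed-point multiply: the raw product carries 2*frac_bits fractional
 * bits, so it is rounded back down to frac_bits. */
static int64_t fixed_mul(int64_t a, int64_t b, int frac_bits)
{
    int64_t one  = (int64_t)1 << frac_bits;
    int64_t half = one / 2;
    int64_t prod = a * b;
    return prod >= 0 ? (prod + half) / one : -((-prod + half) / one);
}

/* Absolute error of (a + b) * c computed in fixed point, relative to
 * the same expression in double precision. */
static double fixed_point_error(double a, double b, double c, int frac_bits)
{
    int64_t qa = to_fixed(a, frac_bits);
    int64_t qb = to_fixed(b, frac_bits);
    int64_t qc = to_fixed(c, frac_bits);
    int64_t qz = fixed_mul(qa + qb, qc, frac_bits); /* adder is exact */
    double err = to_double(qz, frac_bits) - (a + b) * c;
    return err < 0.0 ? -err : err;
}
```

Calling `fixed_point_error(0.3, 0.45, 0.7, 8)` and then the same call with 20 fractional bits shows how the error shrinks as bits are added, which is the knob that the "control on results accuracy" bullet refers to.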
Hardware Design Methodology: Precision Analysis
[Figure: precision analysis flow. Blocks: CFD code; Sensitivity analysis; Error = f(parameters, bits); Fast WL (word-length) optimization; Accuracy check (double-precision vs fixed-point); Error parameters]
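The word-length optimization and accuracy check in this flow can be pictured with a brute-force stand-in: simulate the kernel at increasing numbers of fractional bits and keep the smallest width whose worst error, measured against double precision, meets the target. This is only a sketch under stated assumptions. The in-house tool is described as using an error model Error = f(parameters, bits), which is not reproduced here; the kernel, the single uniform word length and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round x to the nearest multiple of 2^-frac_bits. */
static double quantize(double x, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 52);
    double scale  = (double)(1ULL << frac_bits);
    double scaled = x * scale;
    long long q = (long long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    return (double)q / scale;
}

/* Reference kernel in double precision. */
static double kernel_ref(double a, double b, double c)
{
    return (a + b) * c;
}

/* Same kernel with inputs and result quantized to frac_bits. */
static double kernel_fixed(double a, double b, double c, int frac_bits)
{
    double qa = quantize(a, frac_bits);
    double qb = quantize(b, frac_bits);
    double qc = quantize(c, frac_bits);
    return quantize((qa + qb) * qc, frac_bits);
}

/* Accuracy check: worst absolute error over n test vectors {a, b, c}. */
static double worst_error(const double (*v)[3], size_t n, int frac_bits)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double e = kernel_fixed(v[i][0], v[i][1], v[i][2], frac_bits)
                 - kernel_ref(v[i][0], v[i][1], v[i][2]);
        if (e < 0.0) e = -e;
        if (e > worst) worst = e;
    }
    return worst;
}

/* Smallest frac_bits in [1, max_bits] meeting tol, or -1 if none does. */
static int min_frac_bits(const double (*v)[3], size_t n,
                         double tol, int max_bits)
{
    for (int bits = 1; bits <= max_bits; ++bits)
        if (worst_error(v, n, bits) <= tol)
            return bits;
    return -1;
}
```

A per-variable optimization, as the slides describe, would search one width per signal instead of a single shared width; the exhaustive simulation used here is the slow baseline that an analytical error model is meant to replace.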
Hardware Design Methodology: C to VHDL
Automation of process
  C -> Intermediate Language -> VHDL
  Example fragment: if (z != b - c) z = (a + b) * c;
Design time reduction (x20)
Reduction of human errors
FPGA optimized arithmetic blocks
High-speed approach: full pipeline
Ad-hoc methodology for CFD
Improvement over general purpose commercial tools
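One way to picture what a fully pipelined translation of the slide's fragment implies is to rewrite the branch as straight-line code: every operator is evaluated unconditionally and a final selection picks the result, much as a multiplexer would in hardware. The sketch below is an assumption about the general technique (if-conversion), not a description of the in-house C-to-VHDL tool; the `int` operand type, the function names and the stage comments are illustrative only.

```c
#include <assert.h>

/* Branching form, as in the slide's fragment. */
static int update_branch(int z, int a, int b, int c)
{
    if (z != b - c)
        z = (a + b) * c;
    return z;
}

/* Straight-line form: all operators run on every input and a final
 * select chooses between the old and the new value. */
static int update_select(int z, int a, int b, int c)
{
    int diff = b - c;        /* subtractor                      */
    int sum  = a + b;        /* adder, independent of diff      */
    int prod = sum * c;      /* multiplier                      */
    int take = (z != diff);  /* comparator                      */
    return take ? prod : z;  /* 2:1 multiplexer on the result   */
}
```

Both functions return the same value for every input; the second form simply has no control flow, so a new set of operands could in principle enter the datapath on every clock cycle.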
Hardware Design Methodology: FPGA flow (bitstream generation)
Xilinx (www.xilinx.com)
  ISE
  EDK
Altera (www.altera.com)
  Quartus II
  Nios-II IDE
Current results
HiTech Global board (HTG-V5-PCIE-110)
  Xilinx FPGA (Virtex-5 LX)
  1x512 MB DDR-2 RAM
Implementation of Sod shock tube
  Procedural: Euler 1D (SW) + Roe 1D (HW)
  Algorithmic: Euler 1D + Roe 1D (HW)
CFD coding style guidelines for FPGAs
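For readers unfamiliar with the test case, the sketch below sets up the classic Sod shock tube initial state in primitive variables (rho, u, p) and evaluates the physical flux of the 1D Euler equations. It is a minimal illustration, not the presented implementation: the Roe 1D solver itself is not reproduced, and the type names, function names and the ideal-gas value gamma = 1.4 are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define GAMMA 1.4  /* ratio of specific heats, ideal gas assumed */

typedef struct { double rho, u, p; } prim_t;  /* primitive variables   */
typedef struct { double f[3]; } flux_t;       /* mass, momentum, energy */

/* Classic Sod initial state: a diaphragm at the midpoint separates
 * (rho, u, p) = (1, 0, 1) on the left from (0.125, 0, 0.1) on the right. */
static void sod_init(prim_t *cells, size_t n)
{
    assert(cells != NULL);
    for (size_t i = 0; i < n; ++i) {
        int left = (i < n / 2);
        cells[i].rho = left ? 1.0 : 0.125;
        cells[i].u   = 0.0;
        cells[i].p   = left ? 1.0 : 0.1;
    }
}

/* Total energy per unit volume: E = p / (gamma - 1) + rho * u^2 / 2. */
static double total_energy(prim_t w)
{
    return w.p / (GAMMA - 1.0) + 0.5 * w.rho * w.u * w.u;
}

/* Physical flux of the 1D Euler equations for one cell state. */
static flux_t euler_flux(prim_t w)
{
    flux_t F;
    F.f[0] = w.rho * w.u;                    /* mass     */
    F.f[1] = w.rho * w.u * w.u + w.p;        /* momentum */
    F.f[2] = w.u * (total_energy(w) + w.p);  /* energy   */
    return F;
}
```

A Roe-type interface flux combines `euler_flux` of the left and right states with an upwind correction built from averaged wave speeds; that correction is the part the slides assign to hardware.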
Current results: procedural approach
[Figure: block diagram. Host (CPU, Memory, PCI-e) and Accelerator Board (FPGA with PCI-e, Roe Interface and Roe 1D blocks); data labels: (rho, u, p)_i and (fRoe)_1,2,3]
Speedup x1.6 (theoretical limit x2.8: Amdahl's law)
Precision 10^-5
PCIe bottleneck (16 Gbps -> 6-8 Gbps)
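The relation between the measured x1.6 and the x2.8 limit can be worked through with Amdahl's law. The sketch below derives the accelerated fraction of runtime implied by the limit, then the effective acceleration of the offloaded part implied by the measured speedup. The function names are hypothetical, and the derived numbers follow only from the simple Amdahl model, not from measurements reported in the slides.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction f of the runtime is
 * accelerated by a factor s and the rest is unchanged. */
static double amdahl_speedup(double f, double s)
{
    assert(f >= 0.0 && f < 1.0 && s > 0.0);
    return 1.0 / ((1.0 - f) + f / s);
}

/* Accelerated fraction implied by a theoretical limit, since the
 * limit for s -> infinity is 1 / (1 - f). */
static double fraction_from_limit(double limit)
{
    assert(limit > 1.0);
    return 1.0 - 1.0 / limit;
}

/* Effective acceleration of the offloaded part (transfers included)
 * implied by a measured overall speedup. */
static double effective_accel(double f, double measured)
{
    assert(f > 0.0 && f < 1.0);
    assert(measured > 1.0 && measured < 1.0 / (1.0 - f));
    return f / (1.0 / measured - (1.0 - f));
}
```

With the slide's figures, `fraction_from_limit(2.8)` gives roughly 0.64, and `effective_accel` with a measured speedup of 1.6 gives 2.4: under this model the offloaded Roe computation, PCIe transfers included, runs only about 2.4 times faster than in software, which is consistent with the PCIe link being named as the bottleneck.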
Current results: algorithmic approach
Speedup x7.5 (theoretical limit x30)
Precision 10^-4
Single memory and alternate read/write bottleneck
Clock frequency limitations (current board and commercial tools)
  100 MHz vs 300 MHz
[Figure: block diagram with labels FPGA, Euler 1D, DDR, FIFOs, DMA, RAM, PCI-e, CPU, [rho, u, p]_in, [rho, u, p]_t, [rho, u, p]_out]
Current results: lessons learnt
Algorithmic better than procedural approach
  x2.8 -> x30 (theoretical limits)
More than one memory
  x30 -> x60 (theoretical limits)
Larger DSP-oriented FPGAs
Communications IPs
  Design time reduction
On-going research on in-house tools
  Design time reduction
New FPGA boards
  DINI - XILINX
  GIDEL - ALTERA
Future work
Roe 2D, Euler 2D
  Speedup depends on actual code
  Algorithmic approach is the main target
HPC FPGA board
  2 large FPGAs
  2x4 GB DDR-2 memories per FPGA
  Communication IPs
  Evaluation of DDR-3 memories
Future work
Reduce precision analysis time
Create real C-to-VHDL compiler
Research on architecture optimization
  CFD processor
  Memory accesses
  Multiple FPGAs
Integration in cluster nodes
  Communications between nodes
Future work
Analyze CFD code (C)
HW-oriented C code: AUTOMATIC
Precision analysis: AUTOMATIC
VHDL code
  Computation: AUTOMATIC
  Communications: IP-based
FPGA programming: standard FPGA flow
Projects and collaborations
DOVRES (Fusim-E)
DOMINO
AMEBA 3
AMURA
Airbus Spain
Universidad Politécnica de Madrid
Universidad Autónoma de Madrid
INTA
Future lines
[Figure: block diagram with labels FPGA, Euler, DDR, FIFOs, DMA, RAM (three instances), PCI-e, CPU, [rho, u, p]_in, [rho, u, p]_t, [rho, u, p]_out]
Gidel PROCStar III