June 2007 RAMP Tutorial
BEE3 Update
Chuck Thacker
John Davis
Microsoft Research
10 June, 2007
Outline
• What is BEE3?
• BEE3 board properties
• BEE3 gateware
• BEE3 schedule
What is BEE3?
• Follow-on to BEE2 (BWRC, 2004)
• Board with several highly-connected FPGAs
• Vehicle for computer architecture research
– Microsoft’s primary interest
• Potential platform for high performance DSP applications
– Astronomers, and perhaps others.
• Allows large scale architectural experiments
– Although perhaps not as large as originally hoped
– And certainly not at the speed of a real implementation
• Can scale smoothly from a single board to 64 boards (256 FPGAs)
BEE2
BEE2 – BEE3 Differences
• 4 Xilinx Virtex 5 vs 5 Virtex 2 Pro FPGAs
– We use XC5VLX110T-ff1136
– V2Pro is now obsolete (130nm)
– V5 is a major improvement (65nm)
    • 6-input LUT (64 bit DP RAM)
    • Better Block RAMs
    • Improved interconnect
    • Better signal integrity
• 8 Infiniband/CX4 channels vs 18
• 4 x8 PCI Express Low Profile slots
BEE2 – BEE3 Differences (2)
• 2 Banks DDR2 x 2 vs 4 Banks DDR2 x 1
– 64 GB capacity with 4 GB DIMMs
– Lower total bandwidth, but higher per-channel rate
    • 500 MT/s vs 400
– Mandated by fewer signal pins on V5
• 4 10/100/1000 Ethernet channels
• No PowerPCs
– The V5 version with PowerPCs has not yet been released by Xilinx
    • When it is, we can use it
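The capacity and bandwidth claims on this slide can be checked with a little arithmetic. The sketch below is illustrative only: it assumes "2 Banks DDR2 x 2" means two memory channels per FPGA with two DIMMs each, that "4 Banks DDR2 x 1" means four channels per FPGA on BEE2, and that both generations use channels of equal width. The 64-board figure comes from the "What is BEE3?" slide. All constant names are hypothetical.

```python
# Back-of-the-envelope check of the memory figures on this slide.
# ASSUMPTIONS (not stated outright in the slides):
#   - "2 Banks DDR2 x 2" = 2 channels per FPGA, 2 DIMMs per channel
#   - "4 Banks DDR2 x 1" = 4 channels per FPGA on BEE2
#   - channel width is the same on BEE2 and BEE3

FPGAS_PER_BOARD = 4      # "4 Xilinx Virtex 5"
CHANNELS_PER_FPGA = 2    # assumed reading of "2 Banks"
DIMMS_PER_CHANNEL = 2    # assumed reading of "x 2"
DIMM_GB = 4              # "4 GB DIMMs"
MAX_BOARDS = 64          # "single board to 64 boards"

dimms_per_board = FPGAS_PER_BOARD * CHANNELS_PER_FPGA * DIMMS_PER_CHANNEL
board_capacity_gb = dimms_per_board * DIMM_GB
max_fpgas = MAX_BOARDS * FPGAS_PER_BOARD
max_system_capacity_gb = MAX_BOARDS * board_capacity_gb

# Aggregate transfer rate per FPGA, in MT/s summed over channels.
bee2_total_mts = 4 * 400   # four channels at 400 MT/s (assumed)
bee3_total_mts = 2 * 500   # two channels at 500 MT/s

print(f"DIMMs per board:        {dimms_per_board}")
print(f"Capacity per board:     {board_capacity_gb} GB")
print(f"FPGAs at {MAX_BOARDS} boards:     {max_fpgas}")
print(f"Capacity at {MAX_BOARDS} boards:  {max_system_capacity_gb} GB")
print(f"Per-FPGA aggregate rate: BEE2 {bee2_total_mts} MT/s, BEE3 {bee3_total_mts} MT/s")
```

Under these assumptions the board holds 16 DIMMs for 64 GB, matching the slide, and the per-FPGA aggregate drops from 1600 to 1000 MT/s even though each channel runs faster, which is one way to read "lower total bandwidth, but higher per-channel rate".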
BEE2 – BEE3 Differences (3)
• Divided the system into two boards, Main and Control
– Main board has FPGAs, all high speed logic
– Control board handles downloading, monitoring
    • Being designed at BWRC
– Simplifies main board engineering – can design control board in parallel
– Initially, will have a simplified control board. System bring-up uses JTAG.
• Smaller main board
– 211 vs 374 in²
– Fewer layers for lower cost
• Much more “PC-like”
• Fewer on-board peripheral interfaces
– Those that are there will work
• Uses PC power supplies, peripherals
• Schematic is complete, layout is in progress
– Fits in 2U enclosure
– Much more attention is being given to thermal design
– Must pass UL, FCC
BEE3 Main Board
[Figure: BEE3 main board block diagram. Four user FPGAs (User1 to User4, each a 5VLXT), each shown with four DDR2 DIMMs (DIMM0 to DIMM3), one 40x2 QSH-DP-040 connector, one PCI-E 8X slot, and two CX4 ports. Other labels in the diagram: 72* (four instances) and 133.]
BEE3 Board Layout
[Figure: BEE3 board layout drawing. Labeled parts: four 5VLXT FF1136 FPGAs, sixteen 4 GB DDR2-667 DRAM DIMMs, two Fujitsu 2x2 CX4 connectors, four PCI-Express 8x slots, four QSH-DP-040 connectors, two RJ45 jacks, a 24-pin ATX power connector, 12V 8-pin and 12V 4-pin connectors, a 50-pin 2mm header, JTAG, and supply labels for 1.0V, 1.8V, and 2.5V. Dimension annotations omitted.]
BEE3 Package
BEE3 Package Front View
Bandwidths (per-FPGA)
• Memory
– 500 MT/s * 9 B/T * 2 channels: 9.0 GB/s
• Ring
– 500 MT/s * 9 B/T * 2 channels: 9.0 GB/s
• QSH
– 400 MT/s * 10 B/T: 4 GB/s
• Ethernet
– 125 MB/s
• CX4
– 1.25 GB/s * 2 directions * 2 channels: 5 GB/s
• PCI Express
– Same as CX4
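The per-FPGA figures above are products of transfer rate, transfer width, and channel count. A minimal sketch that recomputes them from the slide's inputs follows. The helper `link_gb_per_s` and the dictionary keys are hypothetical names introduced for illustration, and 1 GB is taken as 1000 MB. The 9 B/T width is presumably a 72-bit path, though the slide does not say so.

```python
# Recompute the per-FPGA peak bandwidths from the slide's inputs.
# MT/s = million transfers per second, B/T = bytes per transfer.
# Names below are illustrative, not from the BEE3 project.

def link_gb_per_s(mega_transfers_per_s, bytes_per_transfer, channels=1):
    """Peak bandwidth in GB/s, taking 1 GB = 1000 MB."""
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0

bandwidth_gb_s = {
    "memory": link_gb_per_s(500, 9, channels=2),
    "ring": link_gb_per_s(500, 9, channels=2),
    "qsh": link_gb_per_s(400, 10),
    "ethernet": 0.125,             # 125 MB/s, as given on the slide
    "cx4": 1.25 * 2 * 2,           # GB/s * directions * channels
    "pci_express": 1.25 * 2 * 2,   # slide: "same as CX4"
}

for name, gb_s in bandwidth_gb_s.items():
    print(f"{name:12s} {gb_s:6.3f} GB/s")
```

These are peak numbers derived from the stated rates; sustained throughput on any of these links would be lower.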
Initial Gateware
• Mostly things required for production testing and board characterization:
• Signal connectivity checks
– Some are AC coupled, must test at speed
– QSH tested with a crossover card
• Temperature and power supply monitoring
• DDR-2 Controller
– Useful in other designs
• CX4 and PCI Express for at-speed tests
• Xilinx and others have lots of IP
Project Participants and Roles
• Microsoft Research (Silicon Valley)
– Funds, manages system engineering, does some gateware
• Celestica (Ottawa and Shanghai)
– Does main board engineering, prototype fabrication
– Microsoft has a very deep relationship with Celestica
• TBD (Maybe Celestica, maybe ???)
– Builds and delivers functioning systems
• Function Engineering (Palo Alto)
– Does thermal and mechanical engineering
• Xilinx (San Jose)
– Provides FPGAs for academic machines (slowest grade)
– Provides FPGA application expertise
• Ramp Group (BWRC)
– Control board, basic software
• Ramp Community
– Uses the systems for research
– Expanding to industrial users (e.g., us)
Schedule
• Generate Specification – Done
• Schematic Entry – Done
• Board Layout – Started
• Thermal modeling, heat sink design – Started
• Chassis design – Started
• Signal Integrity – Imminent
• Prototypes: Late Summer – Bring-up starts
• Production: Start winter ’07
Why is Microsoft interested?
• We believe the overall RAMP effort can have significant impact, and want to support it in the most effective way we can.
– Simply paying for grad students seems suboptimal
• We observe that universities aren’t very good at this sort of system engineering and production.
– Grad students are great for many things, but doing things like board layout isn’t among them.
– Requires deep understanding of tools and production processes. Pros have this.
– We can open doors that academia can’t.
– We have experience in managing this sort of program.
• We want the systems ourselves
– As infrastructure for our new effort in computer architecture (yes, this is a recruiting pitch).
• We also want systems to be available to other industrial users
– This might be more difficult if the systems came from academia.
– But we don’t want to be in the hardware business.