Large-Scale Reconfigurable Computing in a Microsoft Datacenter
Capabilities, Costs ∝ Performance/Watt
ASICs FPGAs
Source: Bob Brodersen, Berkeley Wireless group
Xeon CPU NIC
Xeon CPU NIC Search Acc. (FPGA)
Search Acc. (ASIC)
Wasted Power, Holds back SW
Xeon CPU NIC Search Acc. v2 (FPGA)
NIC Xeon CPU Math Accelerator
Wasted Power, One more thing that can break
• 1U, 2U, or 4U rack-mounted
• 1/2/4 x 10 GbE ports
• Up to 4 PCIe x16 slots
• 2 sockets, 6-core Intel Westmere
http://www.globalfoundationservices.com/posts/2014/january/27/microsoft-contributes-cloud-server-specification-to-open-compute-project.aspx
• Two 8-core Xeon 2.1 GHz CPUs
• 64 GB DRAM
• 4 HDDs @ 2 TB, 2 SSDs @ 512 GB
• 10 Gb Ethernet
• No cable attachments to server
68 °C
• Altera Stratix V GS D5
  • 172k ALMs, 2,014 M20Ks, 1,590 DSPs
• 8 GB DDR3-1333
• 32 MB Configuration Flash
• PCIe Gen 3 x8
• 8 lanes to Mini-SAS SFF-8088 connectors
• Powered by PCIe slot
Stratix V
8GB DDR3
PCIe Gen3 x8
4x 20 Gbps Torus Network
Config Flash
FPGA
Mezz Conn.
1U
Data Center Server (1U, ½ width)
FPGA FPGA FPGA FPGA
Math Acceleration Service
Comp. Vision Service
Physics Engine
Web Search Pipeline
[Figure: FPGA Shell and Role block diagram. Shell blocks: North, South, East, and West SLIII link cores (2 lanes each), Inter-FPGA Router, x8 PCIe Core, DMA Engine, Config Flash (RSU), DDR3 Core 0 and DDR3 Core 1, JTAG, LEDs, Temp Sensors, I2C, xcvr reconfig, SEU. Role: Application. Off-chip: Host CPU (8 PCIe lanes), 256 Mb QSPI Config Flash (4 lines), two 4 GB DDR3-1333 ECC SO-DIMMs (72 bits each).]
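The North, South, East, and West links, together with the "4x 20 Gbps Torus Network" label on the board slide, suggest that each FPGA talks to four neighbors with wraparound at the edges. The following is a minimal sketch of that addressing. The 6 x 8 grid size and the (row, column) coordinate scheme are assumptions made for illustration and are not stated on these slides; the function names are hypothetical.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, col); this addressing scheme is an assumption


def torus_neighbors(node: Coord, rows: int = 6, cols: int = 8) -> Dict[str, Coord]:
    """Return the four wraparound neighbors of `node` in a rows x cols 2D torus."""
    r, c = node
    if not (0 <= r < rows and 0 <= c < cols):
        raise ValueError(f"node {node} is outside a {rows}x{cols} grid")
    return {
        "north": ((r - 1) % rows, c),
        "south": ((r + 1) % rows, c),
        "west": (r, (c - 1) % cols),
        "east": (r, (c + 1) % cols),
    }


def torus_hops(a: Coord, b: Coord, rows: int = 6, cols: int = 8) -> int:
    """Minimum number of link hops between two nodes, using wraparound links."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)
```

With these defaults, the corner node `(0, 0)` has `(5, 0)` as its north neighbor and `(0, 7)` as its west neighbor, which is the wraparound that distinguishes a torus from a plain mesh.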
Selection-as-a-Service (SaaS)
- Find all docs that contain query terms
- Filter and select candidate documents for ranking
Ranking-as-a-Service (RaaS)
- Compute scores for how relevant each selected document is for the search query
- Sort the scores and return the results
[Figure: Query → Selection as a Service (SaaS 1 to SaaS 48; IFM 1 to IFM 44) → Selected Documents → Ranking as a Service (RaaS 1 to RaaS 48; IFM 1 to IFM 44) → 10 blue links. Also labeled: "Ported to Catapult".]
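The two-stage flow described on this slide (select candidate documents, then score and sort them) can be sketched in a few lines. This is only an illustration under stated assumptions: `select`, `rank`, and `term_count_score` are hypothetical names, "contain query terms" is read as "contain every query term", and the scoring function is a placeholder rather than the production ranking model.

```python
from typing import Callable, Dict, List, Tuple


def select(query_terms: List[str], corpus: Dict[str, str]) -> List[str]:
    """Selection: keep the ids of documents that contain every query term."""
    terms = [t.lower() for t in query_terms]
    selected = []
    for doc_id, text in corpus.items():
        words = set(text.lower().split())
        if all(t in words for t in terms):
            selected.append(doc_id)
    return selected


def term_count_score(query_terms: List[str], text: str) -> float:
    """Placeholder score (an assumption): total occurrences of the query terms."""
    words = text.lower().split()
    return float(sum(words.count(t.lower()) for t in query_terms))


def rank(
    query_terms: List[str],
    doc_ids: List[str],
    corpus: Dict[str, str],
    score_fn: Callable[[List[str], str], float] = term_count_score,
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Ranking: score each selected document, sort descending, return the top k."""
    scored = [(doc_id, score_fn(query_terms, corpus[doc_id])) for doc_id in doc_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


corpus = {
    "a": "fpga configuration flash stores the fpga configuration",
    "b": "fpga board power",
    "c": "configuration of an fpga",
}
query = ["FPGA", "Configuration"]
candidates = select(query, corpus)        # documents "a" and "c"
results = rank(query, candidates, corpus)  # "a" ranks ahead of "c"
```

The default `top_k=10` mirrors the "10 blue links" output in the figure.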
Query: “FPGA Configuration”
NumberOfOccurrences_0 = 7
NumberOfOccurrences_1 = 4
NumberOfTuples_0_1 = 1
{Query, Document}
L2 Score
Document
Score
FFE #1 = (2*NumberOfOccurrences_0 + NumberOfOccurrences_1) / (2*NumberOfTuples_0_1)
{Query, Document}
L2 Score
Document
Score
NumberOfOccurrences_0 = 7
NumberOfOccurrences_1 = 4
NumberOfTuples_0_1 = 1
Metafeature #1 = 9
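As a check on the arithmetic, the expression for FFE #1 can be evaluated directly with the feature values shown on this slide. The function name `ffe_1` and the dictionary layout are illustrative choices, and the zero-denominator behavior is an assumption, since the slides do not say how the hardware handles that case.

```python
from typing import Dict


def ffe_1(features: Dict[str, float]) -> float:
    """Evaluate FFE #1 as written on the slide."""
    numerator = 2 * features["NumberOfOccurrences_0"] + features["NumberOfOccurrences_1"]
    denominator = 2 * features["NumberOfTuples_0_1"]
    if denominator == 0:
        # Assumption: the slides do not specify this case, so fail loudly here.
        raise ZeroDivisionError("NumberOfTuples_0_1 is zero")
    return numerator / denominator


features = {
    "NumberOfOccurrences_0": 7,
    "NumberOfOccurrences_1": 4,
    "NumberOfTuples_0_1": 1,
}
metafeature_1 = ffe_1(features)  # (2*7 + 4) / (2*1) = 18 / 2 = 9.0
```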
[Figure: feature extraction block diagram. Labels: PCIe, Compressed Document, Stream Preprocessing FSM, Control/Data Tokens, Distribution latches, Feature Gathering Network, Free Form Expression (FFE).]
• 196 feature families
• 54 state machines
• 2.6K dynamic features extracted in less than 4 µs (~600 µs in SW)
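To make the idea of state machines extracting features from a token stream concrete, here is a minimal single-pass sketch in software. It is not the hardware design: it models one tiny state machine whose only state is which query term the previous token matched. The reading of `NumberOfTuples_i_j` as "term i immediately followed by term j" is an assumption, and `extract_features` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Optional


def extract_features(tokens: Iterable[str], query_terms: List[str]) -> Dict[str, int]:
    """Count term occurrences and adjacent-term tuples in one pass over the stream."""
    term_index = {t.lower(): i for i, t in enumerate(query_terms)}
    features = {f"NumberOfOccurrences_{i}": 0 for i in range(len(query_terms))}
    for i in range(len(query_terms) - 1):
        features[f"NumberOfTuples_{i}_{i + 1}"] = 0

    prev: Optional[int] = None  # state: query-term index of the previous token, if any
    for token in tokens:
        cur = term_index.get(token.lower())
        if cur is not None:
            features[f"NumberOfOccurrences_{cur}"] += 1
            # Assumed tuple definition: term i directly followed by term i + 1.
            if prev is not None and cur == prev + 1:
                features[f"NumberOfTuples_{prev}_{cur}"] += 1
        prev = cur
    return features


doc = "the fpga configuration flash holds one fpga image and configuration data".split()
feats = extract_features(doc, ["FPGA", "Configuration"])
```

Because every counter is updated as each token arrives, no second pass over the document is needed, which is the property that lets many such machines run side by side on the same stream.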
[Figure: processor cluster block diagram. Labels: Thread 0 to Thread 3; F, E, M, W, D; Feature Store; I-Mem; Scheduler; Core 0 to Core 5; Complex FST; Output; Cluster 0.]
FE: Feature Extraction
FFE: Free-Form Expressions
[Figure: 8-Stage Pipeline mapped onto FPGA 0 through FPGA 7, one FPGA per server. Request flow from the RaaS Servers: Document Scoring Request, Route to Head, Compute Score, Return Score (Document Score).]
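The request flow on this slide (route the scoring request to the head of the pipeline, compute the score stage by stage, return the score to the requesting server) can be sketched as follows. Only the FE and FFE stage names come from the slide; the remaining six stage names, the `make_stage` and `score_document` helpers, and the trace format are hypothetical placeholders, and each stage here only records that it ran.

```python
from typing import Callable, Dict, List, Tuple

Stage = Callable[[Dict], Dict]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that records its name in the payload."""
    def stage(payload: Dict) -> Dict:
        out = dict(payload)
        out["visited"] = list(payload.get("visited", [])) + [name]
        return out
    stage.__name__ = name
    return stage


# Eight stages, one per FPGA. Names after "FFE" are assumptions.
STAGES: List[Stage] = [make_stage("FE"), make_stage("FFE")] + [
    make_stage(f"stage_{i}") for i in range(2, 8)
]


def score_document(
    request: Dict, origin_server: int, stages: List[Stage] = STAGES, head: int = 0
) -> Tuple[Dict, List[str]]:
    """Route a request to the head FPGA, run every stage in order, return to origin."""
    trace = [f"server {origin_server} -> FPGA {head} (route to head)"]
    payload = dict(request)
    for offset, stage in enumerate(stages):
        payload = stage(payload)
        trace.append(f"FPGA {head + offset}: {stage.__name__}")
    tail = head + len(stages) - 1
    trace.append(f"FPGA {tail} -> server {origin_server} (return score)")
    return payload, trace


result, trace = score_document({"query": "FPGA Configuration", "doc_id": 42}, origin_server=5)
```

Printing `trace` shows one routing hop to FPGA 0, eight stage hops, and one return hop to server 5.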
Accelerating Large-Scale Services – Bing Search
1,632 Servers with FPGAs Running Bing Page Ranking Service (~30,000 lines of C++)
More compute time for improving relevance
Reduced # of servers
Huge thanks to our partners at
Top Row: Eric Peterson, Scott Hauck, Aaron Smith, Jan Gray, Adrian M. Caulfield, Phillip Yi Xiao, Michael Haselman, Doug Burger
Bottom Row: Joo-Young Kim, Stephen Heil, Derek Chiou, Sitaram Lanka, Andrew Putnam, Eric S. Chung
Not Pictured: Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Amir Hormati, James Larus, Simon Pope, Jason Thong