
Page 1:

Theia: Networking for Ultra-Dense Data Centers

meg walraed-sullivan, Jitendra Padhye, David A. Maltz (Microsoft)

HotNets 2014

Simple and Cheap

Page 2:

Ultra-Dense Data Centers (UDDCs)

• Data centers are expensive to build
• So we try to pack more hardware into existing data centers
• One way: pack more CPUs into a rack

[Images: SeaMicro, FireBox, HP Moonshot, Intel RSA hardware]

Page 3:

UDDC Challenges

• System management
• Power and cooling
• Failure recovery
• How to tailor applications
• Networking

Traditional ToR-based architectures are no longer appropriate due to monetary cost, physical space requirements, and oversubscription.

Page 4:

Data Center Networks Today

[Figure: many racks, each containing servers connected to a Top-of-Rack (ToR) switch, repeated across the data center]

Page 5:

Why Rethink the Architecture?

[Figure: a traditional rack holds fewer servers behind a single ToR; a UDDC rack holds hundreds or thousands of servers or SoCs, which would require many ToRs]

Page 6:

Why Rethink the Architecture?

• Problem: need to connect hundreds/thousands of servers
  • To each other
  • To the rest of the data center

• Naïve solutions won't work (cost, power, space)
  • Can't build a thousand-port ToR
  • Can't add many ToRs per rack

• Trade the star topology for a fixed, direct-connect topology
  • Upside: cheap, no power, small physical space
  • Downside: lose full bisection bandwidth and topology flexibility

[Figure: a traditional rack with a single ToR vs. a UDDC rack with hundreds or thousands of servers or SoCs]

Page 7:

Theia

• Preliminary design for UDDC network architecture
  • Building out and evaluating new vendor hardware
  • Design will undoubtedly change as we progress

• Goal is a simple, practical, cheap design
  • Beg, borrow, and steal from existing technologies
  • Throw hardware at the problem when it is cheap, software when not

• Theia is meant to start a conversation about UDDCs

Page 8:

The Theia Architecture

• Start with traditional rack

[Figure: a rack of hundreds of servers with a ToR]

Page 9:

The Theia Architecture

• Start with traditional rack
• Divide servers into SubRacks

[Figure: the rack's servers grouped into SubRacks, still behind a ToR]

Page 10:

The Theia Architecture

• Start with traditional rack
• Divide servers into SubRacks
• Replace ToR with fixed circuit interconnect (patch panel)

[Figure: the rack's SubRacks connected by a patch panel in place of the ToR]

Page 11:

The Theia Architecture

• Start with traditional rack
• Divide servers into SubRacks
• Replace ToR with fixed circuit interconnect (patch panel)
• Connect racks to one another using spare patch panel ports

[Figure: racks connected to one another through spare patch panel ports]

Page 12:

Theia Architecture: SubRacks

• SubRack ≈ 1-2 rack units
• CPUs connected via In-Chassis Switch (ICS)

• Like our own "mini ToR", but:
  • ICS-to-CPU connections are copper, not cable
  • Very little physical space required

[Figure: tens of SubRacks per rack; each SubRack has tens of CPUs connected to its ICS]

Page 13:

Theia Architecture: SubRacks

• Can tune (at deployment time) the number of downlinks (ICS-CPU) vs. uplinks (ICS-patch panel and rest of rack)

• Tradeoff at the ICS: aggregation vs. oversubscription
  • Oversubscription ratio: # uplinks : # CPUs
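For example (illustrative numbers, not from the talk): a SubRack with 64 CPUs and 8 ICS uplinks would have an oversubscription ratio of 8 : 64, i.e. 1 : 8.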

[Figure: each SubRack's ICS connects down to its CPUs and up to the patch panel; initially, ≤ ten uplinks]

Page 14:

Theia Architecture: Patch Panel

• Patch panel connects SubRacks to one another
  • 10s of SubRacks with ~10 uplinks = hundreds of ports

• Optical patch panel implements a fixed circuit topology
  • No active components
  • Draws no power
  • Compact
  • Adds no queuing delay
  • Cabling is simple (underlying topology is hidden)

• Tradeoff: cost (power, space, $) vs. fixed, direct topology

[Figure: a rack of SubRacks connected by the patch panel]

Page 15:

Theia Architecture: Inter-Rack Connectivity

• Repurpose "leftover" patch panel ports to interconnect racks
  • A link between 2 racks may be a group of multiple links

[Figure: multiple racks joined by inter-rack links]

Page 16:

Theia Architecture: Inter-Rack Connectivity

• Repurpose "leftover" patch panel ports to interconnect racks
  • A link between 2 racks may be a group of multiple links
• Build a larger topology with each rack as a "super node"

Page 17:

What about oversubscription?

• At this scale, oversubscription is unavoidable
• More rack-locality can be expected

[Figure: two racks; patch panel ports are split between the in-rack interconnect (purple) and the inter-rack interconnect (red)]

Tune this oversubscription by allocating patch panel ports to the in-rack interconnect (purple) or the inter-rack interconnect (red).
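For instance (illustrative numbers only, not from the talk): a 400-port panel might devote 320 ports to in-rack links and 80 to inter-rack links; shifting that split trades in-rack capacity against inter-rack bandwidth.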

Page 18:

What about through-traffic?

• Traffic passes through intermediate racks
• Traffic traverses the patch panel (and therefore ICSs)

[Figure: traffic between two racks passing through an intermediate rack]

Page 19:

Patch Panel Topology

• A graph in which:
  • Each node is an ICS (and its corresponding SubRack)
  • Links are implemented by the patch panel internals

• What we care about:
  • Minimize through-traffic: latency and failure resilience
  • Support a wide range of graph sizes: UDDCs are still new
  • No dependency between the number of nodes and ports per node
  • Reduce disruptions caused by failures and miscablings

Page 20:

Patch Panel Topology Options

Hypercube: constraints on the number of nodes and port counts, and a dependency between the two (similar for torus, DCell, BCube, etc.)

Jellyfish: allows for organic growth, but this is not needed with a fixed-topology patch panel

Circulant graph: can build a performant graph with any number of nodes and port counts

Page 21:

Initial Topology: Circulant Graph

• N nodes, labeled {0, …, N-1}
• p ports per node

• Strides S = {…, s, …} such that node i connects to nodes (i ± s) mod N
  • … a ring with "shortcuts"

• Key is to pick good shortcuts given N and p

Example: S = {1, 6}: avg. path length = 1.933; about half the paths are 2 hops; worst case = 3 hops
Example: S = {3, 8}: avg. path length = 2.6; roughly even split between 1-, 2-, 3-, and 4-hop paths
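The path-length figures quoted above can be checked with a short breadth-first search. The sketch below is mine, not from the paper, and assumes the two examples use N = 16 nodes; the slide does not say, but N = 16 reproduces both sets of numbers exactly.

```python
from collections import deque

def path_length_stats(n, strides):
    """Shortest-path statistics for the circulant graph on n nodes with the given strides.

    Circulant graphs are vertex-transitive, so BFS from node 0 is enough.
    Returns (average hop count, worst-case hop count) over all destinations.
    """
    neighbors = lambda u: {(u + s) % n for s in strides} | {(u - s) % n for s in strides}
    dist = {0: 0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    assert len(dist) == n, "stride set does not connect the graph"
    hops = [d for d in dist.values() if d > 0]
    return sum(hops) / len(hops), max(hops)

# Assuming N = 16 (not stated on the slide, but it reproduces the quoted numbers):
print(path_length_stats(16, {1, 6}))  # -> (1.933..., 3): avg 1.933, worst case 3 hops
print(path_length_stats(16, {3, 8}))  # -> (2.6, 4): avg 2.6, worst case 4 hops
```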

Page 22:

Circulant Graph

• Initial reasons for choosing: flexibility
  • Wide range of graph sizes
  • No dependency between port count and number of nodes

• Turns out to be quite performant
  • Low amount of "through" traffic
  • Resilient to failure in connectivity, performance, and consistency
  • Simple, elegant routing and forwarding (see the sketch below)
  • Miswirings likely to cause isomorphic graphs
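The "simple, elegant routing and forwarding" bullet can be made concrete with a sketch. The code below is mine, with hypothetical function names; the talk does not spell out a routing algorithm. It relies on the fact that circulant graphs are vertex-transitive, so dist(a, b) = dist(0, (b - a) mod N), and a single BFS from node 0 gives every node enough information to pick shortest-path next hops.

```python
from collections import deque

def distances_from_zero(n, strides):
    """BFS distances from node 0 in the circulant graph with the given strides."""
    neighbors = lambda u: {(u + s) % n for s in strides} | {(u - s) % n for s in strides}
    dist, queue = {0: 0}, deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def next_hops(n, strides, src, dst):
    """Neighbors of src that lie on some shortest path to dst.

    Uses vertex-transitivity: dist(a, b) = dist(0, (b - a) mod n), so one BFS
    from node 0 is enough to make forwarding decisions anywhere in the graph.
    """
    d0 = distances_from_zero(n, strides)
    dist = lambda a, b: d0[(b - a) % n]
    neighbors = {(src + s) % n for s in strides} | {(src - s) % n for s in strides}
    return sorted(nb for nb in neighbors if dist(nb, dst) == dist(src, dst) - 1)

# Hypothetical example with N = 16, S = {1, 6}: node 8 is 3 hops from node 0,
# and every neighbor of node 0 lies on some shortest path to it.
print(next_hops(16, {1, 6}, 0, 8))  # -> [1, 6, 10, 15]
```

Since several neighbors often lie on a shortest path, the same table also gives multipath (ECMP-style) forwarding choices for free.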

                                                                         

Page 23:

Circulant Graph Average Path Lengths

[Chart: best average path length across stride sets, plotted against circulant graph size <# Nodes, # Strides>, for 16 to 64 nodes and 1 to 5 strides]

Latency and Through-Traffic
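The chart's "best average path length across stride sets" values could be approximated with a brute-force sweep like the one below. This sketch is mine, not the authors'; it reuses the BFS idea from the earlier sketches and simply tries every stride set of a given size, which is feasible at these graph sizes.

```python
from collections import deque
from itertools import combinations

def avg_path_length(n, strides):
    """Average shortest-path hop count in the circulant graph; inf if disconnected."""
    neighbors = lambda u: {(u + s) % n for s in strides} | {(u - s) % n for s in strides}
    dist, queue = {0: 0}, deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) < n:
        return float("inf")
    return sum(dist.values()) / (n - 1)

def best_avg_path_length(n, num_strides):
    """Minimum average path length over all stride sets of the given size (brute force)."""
    candidates = combinations(range(1, n // 2 + 1), num_strides)
    return min(avg_path_length(n, set(s)) for s in candidates)

# e.g. best_avg_path_length(16, 2) -> 1.933..., matching the S = {1, 6} example above
```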

Page 24:

Summary

• ToR-based architecture won't work for UDDCs

• Theia: preliminary architecture to support 1000s of CPUs/rack
  • Flexibility of a packet-switched network over a fixed circuit topology

• Just the beginning of this conversation:
  • Other in-rack topologies…
  • Inter-rack connectivity: will our proposal scale to data center size?
  • Routing and addressing: different protocols for inter- and intra-rack?
  • Tailoring topology to workload and workload to (dense) topology