ISC 12 BoF: InfiniBand? Problems? Do you care?
science + computing ag
IT services for sophisticated computer environments
Tübingen | München | Berlin | Düsseldorf
InfiniBand? Problems? Do you care?
Christian Kniep / Jan Wender
© 2012 science + computing ag
Page
BoF InfiniBand | 2012-06-19
Agenda
This is an interactive session!
▪ Who is on the podium?
▪ Living Histogram?
▪ Getting some statistics
  ▪ Living Histogram
▪ Existing Monitoring Solutions
▪ Discussion
  ▪ Quick and Dirty Analysis
  ▪ Conclusions
On the podium
science + computing at a glance
Founding year: 1989
Locations: Tübingen, München, Berlin, Düsseldorf
Employees: 270
Shareholder: Bull S.A. (100%)
Revenue 10/11: 27 million euros
Partners: Daikin Industries (Japan), NICE srl (Italy), Exa Corporation (USA), Platform Computing (Canada)
Living Histogram?
Brian L. Joiner, International Statistical Review / Revue Internationale de Statistique, Vol. 43, No. 3 (Dec. 1975), pp. 339-340.
Living Histogram
Size of Fabric
▪ <10
▪ <50
▪ <500
▪ >500
Switch Structure
▪ Switch size
  ▪ single switch (mlx4036, qlogic12300)
  ▪ modular switch (mlx5600, qlogic12800)
▪ Amount
  ▪ few
  ▪ many
Focus
▪ Stability
  ➡ maintenance cost
▪ High performance
  ➡ extremely optimized
Type of Use
▪ Cluster Purpose
  ▪ Single Purpose Cluster
  ▪ Multi Purpose Cluster
▪ Usage
  ▪ One Job at a time
  ▪ Multiple Jobs
Kind/Amount of Problems
▪ Impact
  ▪ minor
  ▪ major
▪ Amount
  ▪ few
  ▪ many
Problem solving
▪ Iterative
  ➡ reseat / reboot
▪ Analytic
  ➡ dig into the problem
  ➡ try to wipe it out
Monitoring Solutions
stable (but not useful to admins?)
▪ infiniband-diags
  ▪ ibcheckerrors
  ▪ ibdiagpath
▪ plugin to non-IB systems
  ▪ nagios
  ▪ collectl
▪ hardware vendor suites
  ▪ Unified Fabric Manager (Mellanox)
  ▪ InfiniBand Fabric Suites (QLogic)
unstable (individually carved)
▪ wrapper of infiniband-diags
  ▪ INAM (Ohio State University)
  ▪ QNIB
  ▪ ...
not listed stuff
▪ ...
Modular Switches
switchguid=0xac1(ac1)   # Spine 1
Switch  36 "S-ac1"      # "A1" enhanced port 0 lid 11 lmc 0
[1]     "S-bc1"[1]      # "B1" lid 21 4xQDR
[2]     "S-bc2"[1]      # "B2" lid 22 4xQDR
[3]     "S-bc3"[1]      # "B3" lid 23 4xQDR

switchguid=0xac2(ac2)   # Spine 2
Switch  36 "S-ac2"      # "A2" enhanced port 0 lid 12 lmc 0
[1]     "S-bc1"[2]      # "B1" lid 21 4xQDR
[2]     "S-bc2"[2]      # "B2" lid 22 4xQDR
[3]     "S-bc3"[2]      # "B3" lid 23 4xQDR

switchguid=0xbc1(bc1)   # Line 1
Switch  36 "S-bc1"      # "B1" enhanced port 0 lid 21 lmc 0
[1]     "S-ac1"[1]      # "A1" lid 11 4xQDR
[2]     "S-ac2"[1]      # "A2" lid 12 4xQDR
[3]     "H-1"[1](f1)    # "Host1" lid 101 4xQDR

switchguid=0xbc2(bc2)   # Line 2
Switch  36 "S-bc2"      # "B2" enhanced port 0 lid 22 lmc 0
[1]     "S-ac1"[2]      # "A1" lid 11 4xQDR
[2]     "S-ac2"[2]      # "A2" lid 12 4xQDR
[3]     "H-2"[1](f2)    # "Host2" lid 102 4xQDR

switchguid=0xbc3(bc3)   # Line 3
Switch  36 "S-bc3"      # "B3" enhanced port 0 lid 23 lmc 0
[1]     "S-ac1"[3]      # "A1" lid 11 4xQDR
[2]     "S-ac2"[3]      # "A2" lid 12 4xQDR
[3]     "H-3"[1](f3)    # "Host3" lid 103 4xQDR
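A listing like the one above can be turned into a link table with a few lines of Python. The following is only a minimal sketch: it assumes the simplified line layout shown on this slide (a `Switch` line followed by `[port] "peer"[peer port]` lines), and the names `parse_links` and `one_sided_links` are hypothetical helpers, not part of any existing tool. The embedded sample is a trimmed copy of the slide data.

```python
import re

# Trimmed sample in the layout of the slide (Spine 1 and Line 1 only).
LISTING = '''\
switchguid=0xac1(ac1)   # Spine 1
Switch  36 "S-ac1"      # "A1" enhanced port 0 lid 11 lmc 0
[1]     "S-bc1"[1]      # "B1" lid 21 4xQDR
switchguid=0xbc1(bc1)   # Line 1
Switch  36 "S-bc1"      # "B1" enhanced port 0 lid 21 lmc 0
[1]     "S-ac1"[1]      # "A1" lid 11 4xQDR
[3]     "H-1"[1](f1)    # "Host1" lid 101 4xQDR
'''

# Assumed patterns for the two line types that carry topology information.
SWITCH_RE = re.compile(r'^Switch\s+(\d+)\s+"([^"]+)"')
LINK_RE = re.compile(r'^\[(\d+)\]\s+"([^"]+)"\[(\d+)\]')


def parse_links(text):
    """Return {(switch, port): (peer, peer_port)} for every port line."""
    links = {}
    current = None  # name of the switch whose ports are being read
    for line in text.splitlines():
        line = line.strip()
        match = SWITCH_RE.match(line)
        if match:
            current = match.group(2)
            continue
        match = LINK_RE.match(line)
        if match and current is not None:
            port, peer, peer_port = match.groups()
            links[(current, int(port))] = (peer, int(peer_port))
    return links


def one_sided_links(links):
    """Return switch-to-switch links whose far end does not point back."""
    switches = {name for name, _ in links}
    return [
        (src, dst)
        for src, dst in links.items()
        if dst[0] in switches and links.get(dst) != src
    ]


if __name__ == "__main__":
    table = parse_links(LISTING)
    for (name, port), (peer, peer_port) in sorted(table.items()):
        print(f"{name}[{port}] <-> {peer}[{peer_port}]")
    print("one-sided links:", one_sided_links(table))
```

Host ports are skipped in the symmetry check because the listing only contains switch entries, so the host side of a link is never described.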
[Figure: Chassis1 groups Spine1, Spine2, Line1, Line2 and Line3; Host1, Host2 and Host3 attach to the line switches.]
[Figure: collapsed view showing Chassis1 as a single unit with Host1, Host2 and Host3 attached.]
Discussion - Quick Analysis
Fabric size
▪ small -> easy as pie?
▪ big -> critical mass for real analysis?
Switch structure
▪ what is your routing algorithm?
Focus
▪ 80:20 rule?
[Chart: performance vs. maintenance]
Type of use
▪ willing/forced to share
Problem type / amount
▪ runs smoothly enough
Problem solving
▪ learning curve starts steep
Discussion - Conclusions
Monitoring
▪ what approach?
Do we scare you?
▪ not intending to spread Fear, Uncertainty and Doubt
Our conclusions
Your conclusions
Thank you for your attention and participation!
science + computing ag
www.science-computing.de
Phone: +49 (0)7071 9457-0
E-Mail: [email protected]