advancing the art of internet edge outage detection (slides) › richterp ›...

49
Advancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai Ramakrishna Padmanabhan University of Maryland Neil Spring University of Maryland Arthur Berger Akamai / MIT David Clark MIT ACM Internet Measurement Conference 2018

Upload: others

Post on 06-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Advancing the Art of Internet Edge Outage Detection

Philipp Richter MIT / Akamai

Ramakrishna Padmanabhan University of Maryland

Neil Spring University of Maryland

Arthur Berger Akamai / MIT

David Clark MIT

ACM Internet Measurement Conference 2018

Page 2: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Internet Edge Outages

1

Page 3: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Internet Edge Outages

➡ Uninterrupted Internet availability becomes increasingly critical ➡ Growing interest in systems to detect Internet edge outages➡ Regulatory Bodies, Governments, ISPs, Academic Research

1

Page 4: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Internet Edge Outages

➡ Uninterrupted Internet availability becomes increasingly critical ➡ Growing interest in systems to detect Internet edge outages➡ Regulatory Bodies, Governments, ISPs, Academic Research

Goal: Track Internet edge outages on (i) a broad scale and (ii) a detailed level

1

Page 5: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Edge Outage Detection: Existing Approaches

• Detecting outages in the control plane? ➡ Edge outages often invisible in BGP

• Deploying hardware in end user premises? ➡ Potentially highly accurate, but does not scale

• Actively probing addresses? ➡ Challenging to scale, unresponsive addresses, difficult to interpret

2

Page 6: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Edge Outage Detection: Existing Approaches

• Detecting outages in the control plane? ➡ Edge outages often invisible in BGP

• Deploying hardware in end user premises? ➡ Potentially highly accurate, but does not scale

• Actively probing addresses? ➡ Challenging to scale, unresponsive addresses, difficult to interpret

This work: Passive edge outage detection based on CDN request patterns

2

Page 7: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Outline

•Disruption Detection

•Global View on Disruptions

•Device-Centric View on Disruptions

•U.S. Broadband Case Study

Page 8: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Disruption Detection using CDN Access Logs

200,000+ servers1500+ ASes120+ countries> 3 trillion daily requests

Dataset: Hourly request counts per IPv4 /24 address block

3

Page 9: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Disruption Detection using CDN Access Logs

200,000+ servers1500+ ASes120+ countries> 3 trillion daily requests

Dataset: Hourly request counts per IPv4 /24 address block

Assumption: Edge outages will be reflected in absence/reduction of CDN requests.

3

Page 10: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Active IP Addresses per /24

hours

activ

e IP

v4 a

ddre

sses

per

hou

r

1 168 336 504 672

050

100

150

200

250

typical residential /24 address block

4

Page 11: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Active IP Addresses per /24: Baseline Activity

hours

activ

e IP

v4 a

ddre

sses

per

hou

r

1 168 336 504 672

050

100

150

200

250

typical residential /24 address block

baseline value:93 IP addresses

Number of addresses contacting the CDN every hour never drops below ‘baseline value’ value per block.

4

Page 12: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Active IP Addresses per /24: Baseline Activity

hours

activ

e IP

v4 a

ddre

sses

per

hou

r

1 168 336 504 672

050

100

150

200

250

typical residential /24 address block

baseline value:93 IP addresses

home networkCDN

content requests,update requests,

widgets, beacons, etc.

icons: mavadee / flaticon4

Page 13: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Active IP Addresses per /24: Baseline Activity

hours

activ

e IP

v4 a

ddre

sses

per

hou

r

1 168 336 504 672

050

100

150

200

250

typical residential /24 address block

baseline value:93 IP addresses

home networkCDN

content requests,update requests,

widgets, beacons, etc.

• Some 2.3M /24 address blocks (83% of all client IPs) baseline > 40• Robust signal, largely independent of user-triggered activity• Dependent on a functioning network 4

Page 14: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

Disruption Detection

5

Page 15: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

Disruption Detection

5

Page 16: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

activity in next hour violates baseline criterion

Disruption Detection

5

Page 17: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

baseline b1: sliding window

beta * min [last 168 hours]

Disruption Detection

5

Page 18: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

Disruption Detection

baseline b1: sliding window

beta * min [last 168 hours]

5

Page 19: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

new baseline found

Disruption Detection

baseline b1: sliding window

beta * min [last 168 hours]

5

Page 20: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

}

non-steadystate

Disruption Detection

baseline b1: sliding window

beta * min [last 168 hours]

5

Page 21: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

baseline b0: sliding window

alpha * min [last 168 hours]

}

Disruption Detection

non-steadystate

baseline b1: sliding window

beta * min [last 168 hours]

5

Page 22: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Did this Address Block really go offline?

hour [starting 2017−08−30]

activ

e IP

v4 a

ddre

sses

0 50 100 150 200 250 300 350

050

100

150

ICMP responsiveCDN active

6

Page 23: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Did this Address Block really go offline?

hour [starting 2017−08−30]

activ

e IP

v4 a

ddre

sses

0 50 100 150 200 250 300 350

050

100

150

ICMP responsiveCDN active

➡ “Address Space Survey Data” (provided by ISI) ➡ Ping every address in 1% of the allocated IPv4 /24s every 11 mins➡ Some address blocks show a very steady number of responsive IPs

6

Page 24: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

hour [starting 2017−08−30]

activ

e IP

v4 a

ddre

sses

0 50 100 150 200 250 300 350

050

100

150

ICMP responsiveCDN active

Did this Address Block really go offline?

➡ “Address Space Survey Data” (provided by ISI) ➡ Ping every address in 1% of the allocated IPv4 /24s every 11 mins➡ Some address blocks show a very steady number of responsive IPs

6

Page 25: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

CDN Disruptions vs. ICMP180.253.118.0/24

hour

activ

e IP

v4 a

ddre

sses

CDN active IPsICMP responsive IPs

0 168 336

050

100

150

200

250

170.250.45.0/24

hour

activ

e IP

v4 a

ddre

sses

CDN active IPsICMP responsive IPs

0 168 336

050

100

150

200

250

68.200.4.0/24

hour

activ

e IP

v4 a

ddre

sses

CDN active IPsICMP responsive IPs

0 168 336

050

100

150

200

250

73.204.202.0/24

hour

activ

e IP

v4 a

ddre

sses

CDN active IPsICMP responsive IPs

0 168 336

050

100

150

200

250

6

Page 26: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Global View on Disruptions

Apr 2017 May 2017 Jun 2017 Jul 2017 Aug 2017 Sep 2017 Oct 2017 Nov 2017 Dec 2017 Jan 2018 Feb 2018

0

4K(~0.35%)

8K(~0.7%)

hour

ly di

srup

ted

/24s partial /24 disrupted

entire /24 disrupted

1.5M disruption events over the course of one year

8

Page 27: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Global View on Disruptions

• At least 0.2% of the monitored space faces a disruption• Natural disasters, intended Internet shutdowns• Weekly pattern, mostly absent during Christmas/NYE

Apr 2017 May 2017 Jun 2017 Jul 2017 Aug 2017 Sep 2017 Oct 2017 Nov 2017 Dec 2017 Jan 2018 Feb 2018

0

4K(~0.35%)

8K(~0.7%)

hour

ly di

srup

ted

/24s partial /24 disrupted

entire /24 disrupted

1.5M disruption events over the course of one year

8

Page 28: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Global View on Disruptions

• At least 0.2% of the monitored space faces a disruption• Natural disasters, intended Internet shutdowns• Weekly pattern, mostly absent during Christmas/NYE

Apr 2017 May 2017 Jun 2017 Jul 2017 Aug 2017 Sep 2017 Oct 2017 Nov 2017 Dec 2017 Jan 2018 Feb 2018

0

4K(~0.35%)

8K(~0.7%)

hour

ly di

srup

ted

/24s partial /24 disrupted

entire /24 disrupted

1.5M disruption events over the course of one year

Hurricane Irma

8

Page 29: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Weekly Pattern

Mon Tue Wed Thu Fri Sat Sun

day of the week [local time]

fract

ion

of d

isru

ptio

n ev

ents

0.00

0.05

0.10

0.15

0.20

allentire /24

9

Page 30: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Weekly Pattern

Mon Tue Wed Thu Fri Sat Sun

day of the week [local time]

fract

ion

of d

isru

ptio

n ev

ents

0.00

0.05

0.10

0.15

0.20

allentire /24

●●

●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

start hour [local time]

fract

ion

of d

isru

ptio

n ev

ents

1 3 5 7 9 11 13 15 17 19 21 23

0.00

0.05

0.10

0.15

●● ●

● ●● ●

● ● ● ●● ● ● ● ● ●

allentire /24

9

Page 31: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Weekly Pattern: Scheduled Maintenance Window

Mon Tue Wed Thu Fri Sat Sun

day of the week [local time]

fract

ion

of d

isru

ptio

n ev

ents

0.00

0.05

0.10

0.15

0.20

allentire /24

●●

●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

start hour [local time]

fract

ion

of d

isru

ptio

n ev

ents

1 3 5 7 9 11 13 15 17 19 21 23

0.00

0.05

0.10

0.15

●● ●

● ●● ●

● ● ● ●● ● ● ● ● ●

allentire /24

9

Page 32: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Outline

•Disruption Detection

•Global View on Disruptions

•Device-Centric View on Disruptions

•U.S. Broadband Case Study

Page 33: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

hours

activ

e IP

v4 a

ddre

sses

1 168 336 504 672

050

100

150

200

➡ Established that this address block lost connectivity➡ Do devices really lose Internet connectivity? ➡ Or can they connect from somewhere else?

?

10

Page 34: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device-Specific Dataset for Subset of Users

softwareID A software

ID Bsoftware

ID C

10

Page 35: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

softwareID A software

ID Bsoftware

ID C

➡ Device-information for 52K “entire /24” disruption events➡ Only consider disruptions which affected an entire /24

(no activity during the disruption)

Device-Specific Dataset for Subset of Users

disruption in1.2.3.0/24

IPbefore

∈ 1.2.3.0/24

time

IPduring

∉ 1.2.3.0/24IPafter

∈/∉ 1.2.3.0/24

10

Page 36: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

disruptions with active IDs < 1hr before

N=52K

11

Page 37: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

disruptions with active IDs < 1hr before

N=52K

no active IDs during disruption

86%

expected case, devicesdo not have connectivity

11

Page 38: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

disruptions with active IDs < 1hr before

N=52K

no active IDs during disruption

86%

expected case, devicesdo not have connectivity

active IDs during disruption

from different IP address

14%

11

Page 39: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

disruptions with active IDs < 1hr before

N=52K

no active IDs during disruption

86%

expected case, devicesdo not have connectivity

active IDs during disruption

from different IP address

14%

switch to/from cellular switch AS

20% 13%

mobility and/or multi-homed devices

11

Page 40: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Device Perspective on Disruptions

disruptions with active IDs < 1hr before

N=52K

no active IDs during disruption

86%

expected case, devicesdo not have connectivity

active IDs during disruption

from different IP address

14%

switch to/from cellular switch AS same AS

20% 13%

mobility and/or multi-homed devices

67%

? (10% of disruptions)

11

Page 41: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Anti-Disruptions: Temporary Surges in Address Activity

IPs

in d

isru

pted

/alte

rnat

e /2

4

0 20 40 60 80 100

200

100

010

020

0

disrupted /24alternate /24

time [hours]

ID XYZ

ID XYZ

12

Page 42: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Anti-Disruptions: Temporary Surges in Address Activity

IPs

in d

isru

pted

/alte

rnat

e /2

4

0 20 40 60 80 100

200

100

010

020

0

disrupted /24alternate /24

time [hours]

➡ Some disruptions not service outages but address reassignment!

➡ Developed mechanism to detect anti-disruptions➡ Rank ASes by correlation of disruptions with anti-disruptions

ID XYZ

ID XYZ

12

Page 43: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

AS-Level Disruptions and Anti-Disruptions

(ant

i−) d

isru

pted

IPs

010

0K25

0K

(ant

i−) d

isru

pted

IPs

40K

040

K

major US cable ISP: No correlation between disruptions and anti-disruptions (pearson r= 0.02)

Uruguayan ISP: Strong correlation between disruptions and anti-disruptions (pearson r= 0.63)

13

Page 44: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

AS-Level Disruptions and Anti-Disruptions

(ant

i−) d

isru

pted

IPs

010

0K25

0K

(ant

i−) d

isru

pted

IPs

40K

040

K

major US cable ISP: No correlation between disruptions and anti-disruptions (pearson r= 0.02)

Uruguayan ISP: Strong correlation between disruptions and anti-disruptions (pearson r= 0.63)

Anti-Disruptions can heavily skew per-AS and per-country assessment of Internet reliability 13

Page 45: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Case Study: US Broadband ISPs

ISP A Cable

ISP BCable

ISP CCable

ISP DDSL

ISP EDSL

ISP FDSL

ISP GDSL

anti-disruptioncorrelation 0.22 0.03 -0.02 0.03 0.00 -0.04 0.05

% disruptions w/ intermittent activity 4% 1% 1% 0% 3% 6% 14%

% /24s only disruptedduring Hurricane Irma 11% 1% 2% 23% 1% 0% 3%

% /24s only disrupted maintenance window 67% 54% 75% 29% 60% 71% 62%

14

Page 46: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Case Study: US Broadband ISPs

ISP A Cable

ISP BCable

ISP CCable

ISP DDSL

ISP EDSL

ISP FDSL

ISP GDSL

anti-disruptioncorrelation 0.22 0.03 -0.02 0.03 0.00 -0.04 0.05

% disruptions w/ intermittent activity 4% 1% 1% 0% 3% 6% 14%

% /24s only disruptedduring Hurricane Irma 11% 1% 2% 23% 1% 0% 3%

% /24s only disrupted maintenance window 67% 54% 75% 29% 60% 71% 62%

Most major US broadband ISPs do not show strong signs of anti-disruptions. 14

Page 47: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Case Study: US Broadband ISPs

ISP A Cable

ISP BCable

ISP CCable

ISP DDSL

ISP EDSL

ISP FDSL

ISP GDSL

anti-disruptioncorrelation 0.22 0.03 -0.02 0.03 0.00 -0.04 0.05

% disruptions w/ intermittent activity 4% 1% 1% 0% 3% 6% 14%

% /24s only disrupted maintenance window 67% 54% 75% 29% 60% 71% 62%

% /24s only disruptedduring Hurricane Irma 11% 1% 2% 23% 1% 0% 3%

All (but one) ISP has majority of all disrupted /24s duringscheduled maintenance window (Mo-Fr Midnight-6AM). 14

Page 48: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Case Study: US Broadband ISPs

ISP A Cable

ISP BCable

ISP CCable

ISP DDSL

ISP EDSL

ISP FDSL

ISP GDSL

anti-disruptioncorrelation 0.22 0.03 -0.02 0.03 0.00 -0.04 0.05

% disruptions w/ intermittent activity 4% 1% 1% 0% 3% 6% 14%

% /24s only disrupted maintenance window 67% 54% 75% 29% 60% 71% 62%

% /24s only disruptedduring Hurricane Irma 11% 1% 2% 23% 1% 0% 3%

ISP A: Some 80% of all disruptions fall in the maintenance window, or are caused by force majeure (Hurricane Irma).

Relevant for SLAs, policies, and reliability assessment. 14

Page 49: Advancing the Art of Internet Edge Outage Detection (slides) › richterp › outages_imc18_slides.pdfAdvancing the Art of Internet Edge Outage Detection Philipp Richter MIT / Akamai

Implications

Apr 2017 May 2017 Jun 2017 Jul 2017 Aug 2017 Sep 2017 Oct 2017 Nov 2017 Dec 2017 Jan 2018 Feb 2018

0

4K(~0.35%)

8K(~0.7%)

hour

ly di

srup

ted

/24s

partial /24 disruptedentire /24 disrupted

• Outage Detection: Methodological Insights • Baseline activity enables fine-granular detection of disruptions• Anti-disruptions due to reassignment

➡ Can bias active and passive outage detection systems

• Interpreting Service Outages • Majority of outages (for many ISPs) during scheduled maintenance

➡ Implications for SLAs, reporting requirements, regulations

15