disappointing disasters & other y2k · 2020-06-11 · microservices and loose coupling...

35
@wiredferret Y2K & Other Disappointing Disasters Risk Reduction and Harm Mitigation

Upload: others

Post on 24-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Y2K & Other Disappointing Disasters

Risk Reduction and Harm Mitigation

Page 2: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Page 3: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

An actual pager

Page 4: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Ready for the big night

Page 5: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

This was a vital job

Page 6: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

No one remembers the crisis averted.

Page 7: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Risk Reduction

Page 8: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Risk Reduction

● Stay-home orders

● Vaccination● Train gates

Page 9: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

How to reduce risk

Page 10: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Secure your zone

Page 11: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Page 12: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Predict states

Page 13: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Accept Risk

● You can’t prevent everything

● Decide what matters to save

● Make mindful tradeoffs basedon data

Page 14: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Harm Mitigation

Page 15: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

When Things Go

Wrong

● Seatbelts● Building codes● RAID

Page 16: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Page 17: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

How to mitigate harm

Page 18: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Fail safe or fail secure?

● What are you protecting?

● What is your risk?

Page 19: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Failure is inevitable.

Disaster is not.

Page 20: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

What’s a disaster?

Page 21: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Page 22: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Page 23: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Let’s design for that

Page 24: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

Fizzle factors

● Microservices and loose coupling

● Fault-tolerant messaging● Data duplication● Canary launches● Kill switches and circuit

breakers● Automatic recovery● Testing for load, stress,

and outage

Page 25: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Rubber bands, not rigid joints

Microservices and loose coupling

Page 26: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Fault-tolerant messaging

The internet routes around censorship as damage

Page 27: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Data duplication

It’s not a backup if you haven’t tested restoration

Page 29: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Kill switches and circuit breakers

Stick a fork in it, it’s done

Page 30: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Automatic recovery

Take two reboots and call me in the morning

Page 31: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

Testing for load, stress and outage

Elastic scaling only gets you so far

Page 32: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

@wiredferret

tl;rt● Expect failure● Make systems less

rigid● Plan for disaster● Degrade gracefully

Page 33: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

https://tinyurl.com/failoverconf-heidi

Page 34: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

THE careful text-books measure

(Let all who build beware!)

The load, the shock, the pressure

Material can bear.

So, when the buckled girder

Lets down the grinding span,

'The blame of loss, or murder,

Is laid upon the man.

Not on the Stuff - the Man!

The Hymn of Breaking Strain

But in our daily dealing

With stone and steel, we find

The Gods have no such feeling

Of justice toward mankind.

To no set gauge they make us-

For no laid course prepare-

And presently o'ertake us

With loads we cannot bear:

Too merciless to bear.

The prudent text-books give it

In tables at the end

'The stress that shears a rivet

Or makes a tie-bar bend-

'What traffic wrecks macadam-

What concrete should endure-

but we, poor Sons of Adam

Have no such literature,

To warn us or make sure!

Page 35: Disappointing Disasters & Other Y2K · 2020-06-11 · Microservices and loose coupling Fault-tolerant messaging Data ... Automatic recovery Testing for load, stress, and outage. Rubber

We hold all Earth to plunder -

All Time and Space as well-

Too wonder-stale to wonder

At each new miracle;

Till, in the mid-illusion

Of Godhead 'neath our hand,

Falls multiple confusion

On all we did or planned-

The mighty works we planned.

We only of Creation

(0h, luckier bridge and rail)

Abide the twin damnation-

To fail and know we fail.

Yet we - by which sole token

We know we once were Gods-

Take shame in being broken

However great the odds-

The burden of the Odds.

Oh, veiled and secret Power

Whose paths we seek in vain,

Be with us in our hour

Of overthrow and pain;

That we - by which sure token

We know Thy ways are true -

In spite of being broken,

Because of being broken

May rise and build anew

Stand up and build anew.

-- Rudyard Kipling