failure after failure

54
LURE AFTER FAILUREAF ailure AFTERFA FTER f ailure AFTERFA FTER f APRESENTATIONBY @timangladE FROMYOURFRIENDS @cloudant

Upload: tim-anglade

Post on 16-May-2015

1.364 views

Category:

Technology


4 download

TRANSCRIPT

Page 1: Failure after Failure

FAILUREAFTERFAILUREAFTERFARailureAFTERFAILUREAFTERAFTERf

ailureAFTERFAILUREAFTERAFTERf

A!PRESENTATION!BY @timangladEFROM!YOUR!FRIENDS @cloudant

Page 2: Failure after Failure

• Founded by 3 MIT grad students

• YCombinator S08

• Based in Boston, MA with employees in California, Washington State & the UK

• Hosted, distributed database service,~compatible with CouchDB

• Open Core BigCouch

• Value-added technologies are available as licenses

Cloudant?

Page 3: Failure after Failure

attachments:image, audio, ...

JSON, typed

complex relationships

MVCC:provenance,replication

and cache-ready

Page 4: Failure after Failure

• Written in Erlang

• REST API

• Bulk upload/edit

• Feed of changes

• Append-only, B+Tree, copy-on-write

• Durable MapReduce views (indices), in javascript, ruby, python, java, etc.

• ACID at the single document level

CouchDB Basics

Page 5: Failure after Failure

REPEATEDLY.We have Failed.

we’re not necessarily bad at our jobs though…

Page 6: Failure after Failure

Back in 2007…

Page 7: Failure after Failure

“The worldneeds a databaselike this”

— Adam Kocoloski, Cloudant CTO(in a moment of weakness)

Page 8: Failure after Failure

C luster of U nreliable C ommodity H ardware

Page 9: Failure after Failure

C luster of U nreliable C ommodity H ardware

without the ‘C’ it’s just “ouchDB”

Page 10: Failure after Failure

➞ BigCOuchPutting The ‘C’back inCouchDB

Page 11: Failure after Failure

Starting a newCode project:Good idea,right?

Page 12: Failure after Failure

Disregardingprior art

Failure #1?

Page 13: Failure after Failure

Y et A nother W anking N osql S olution

Page 14: Failure after Failure

DistributedSystems are HARDLet’s goshopping!

Page 15: Failure after Failure

DynAmo!

Werner, I♥you — why don’t you return my calls?

Page 16: Failure after Failure

D-D-D-Dynomite!

Cliff, I♥you, but don’t ever call me again.

Page 17: Failure after Failure

Perusingprior Art:Good idea,right?

Page 18: Failure after Failure

Failure #2

Usingprior art

Page 19: Failure after Failure

Your projecT≠

My project

Page 20: Failure after Failure

➞ MEM3ridiculouslysimple shardmanagement

Page 21: Failure after Failure

DistributedSystemsmeansdistributedTasks

Page 22: Failure after Failure

Executing CodeRemotelyNot cool, man.

Page 23: Failure after Failure

Science says:82% of coders’TimE is spentDesigning an RPCMECHANISMThe other 18% is spent at the coffee machine

Page 24: Failure after Failure

Using your language’snative RPCMechanismGood idea,right?

Page 25: Failure after Failure

Failure #3

PREMATUREImmatureoptimization

Page 26: Failure after Failure

➞ RexilightweightRPC server

Page 27: Failure after Failure

2xthroughputimprovementWell, alright. I can live with that, I guess…

Page 28: Failure after Failure

Meanwhile…

Page 29: Failure after Failure

Scrum’ing it upGood idea,right?

Page 30: Failure after Failure

Failure #4

ITerate OFTENITERATE FASTHIT&ATE the WALL

Page 31: Failure after Failure

➞ FabricDB OPS abstraction

Page 32: Failure after Failure

Meanwhile…

Page 33: Failure after Failure

“we need to bewebscale”

— ALAN HOFFMAN, Cloudant CEOFebruary 31, 2009

Page 34: Failure after Failure

Using stateof the artcloud hostingGood idea,right?

Page 35: Failure after Failure

Failure #5

AmazonlulzServices™

Page 36: Failure after Failure

The MOST INTENSETECH BOOTCAMPyou CAN EVERPUT DISTRIBUTED CODE THROUGH

Page 37: Failure after Failure

Meanwhile…

Page 38: Failure after Failure

They see meCompactin’They Hatin’

Page 39: Failure after Failure

Being GoodApache CitizensGood idea,right?

Page 40: Failure after Failure

Failure #6

Oh,Apache…

Page 41: Failure after Failure

Meanwhile…

Page 42: Failure after Failure

It’s all so Clear!

Page 43: Failure after Failure

Automatingyour monitoringGood idea,right?

Page 44: Failure after Failure

Failure #7There is nosuch thingas automatedmOnitoring

Page 45: Failure after Failure

Meanwhile…

Page 46: Failure after Failure

Which onewould yourather Serve?

Page 47: Failure after Failure

Large UserStory

Page 48: Failure after Failure

Small UserStory

Page 49: Failure after Failure

Failure #8Hope your usersare smart.

(Plan for whenThey’re STUPID.)

Page 50: Failure after Failure

Recap! 1. Don’t disregard prior art 2. Disregard prior art 3. Don’t be afraid to relax the rules 4. Clean up regularly 5. Get hosted on AWS (or not) 6. Learn tHAT APACHE CAN DO WRONG 7. Monitor what matters 8. There are no “nice” users

Page 51: Failure after Failure

What’s Next?version 0.4More PackagesGeoCouch?

Page 53: Failure after Failure

cloudant.com/europe

FaiLINGNear you,SOON!

(because we care.)