![Page 1: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/1.jpg)
Building a cloud with Erlang and SmartOS
How hard could it possibly be?
![Page 2: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/2.jpg)
Spoiler
![Page 3: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/3.jpg)
Spoiler
Quite hard!
![Page 4: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/4.jpg)
Who am I
• Writing Project FiFo
• Twitter: @heinz_gies
• Github: https://github.com/Licenser & https://github.com/project-fifo
• IRC: Licenser
![Page 5: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/5.jpg)
Disclaimer
• This is time travel! Situations might have changed by today.
• This is about my experience, not the total truth - yes, there is a chance I was double wrong!
• I don’t want to shame any technology; this is just about my experience applying them to a specific problem.
• No dogs were harmed in the making!
![Page 6: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/6.jpg)
Intro
• What is FiFo? - Open Source Cloud orchestration
• For SmartOS: ZFS, DTrace, Crossbow, Zones, …
• In Erlang: Distributed, fault tolerant, fun to write, …
![Page 7: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/7.jpg)
The fail of ClojureScript
![Page 8: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/8.jpg)
What was done
• CLJS app in GZ
• HTTP API
![Page 9: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/9.jpg)
Reason
• Existing client for the API
• node.js was already on the GZ (looked like no additional deps).
• Wanted to try ClojureScript.
• No idea of what Project FiFo would become.
![Page 10: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/10.jpg)
The problem
• Lots of dependencies (version conflicts, missing libraries).
• At that time very hard to debug (no source maps etc., lack of visibility/horrible stack traces).
• Everything in the Global Zone (big footprint).
• Only one system.
![Page 11: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/11.jpg)
What I learned
• Try to plan what you do before you do it.
• Rewriting is no shame!
• What seems easy in the beginning is not always the right thing.
![Page 12: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/12.jpg)
The fail of a single host
![Page 13: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/13.jpg)
What changed
• Added wiggle, an API endpoint over multiple cljs applications
• Running in a zone
• Allow more than 1 hypervisor!
![Page 14: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/14.jpg)
Reason
• Needed a good abstraction over the existing code.
• A web interface for the ClojureScript code.
• Wanted to work with Erlang.
![Page 15: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/15.jpg)
The problem
• HTTP between wiggle and cljs-app.
• Single point of failure.
• Did not simplify the code on the hypervisor, it just forwarded.
• Still not enough separation.
• Authentication handled downstream in cljs.
• Synchronization is a pain.
![Page 16: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/16.jpg)
What I learned
• HTTP is not the silver bullet.
• Split out applications.
• Modularize (not only in code, but in applications).
• Handle things like authentication as high up as possible.
• Remove work from leaves that should be handled in a different layer.
![Page 17: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/17.jpg)
The fail of distribution
![Page 18: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/18.jpg)
What changed
• Split out authentication -> snarl
• Split out most logic -> sniffle
• Reduced GZ footprint -> scrap cljs, replace it with a minimal Erlang app
![Page 19: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/19.jpg)
Reason
• Erlang apps are wonderfully self-contained (releases)
• Distributing systems protects against SPOF
• Separating concerns:
• Management on system
• Authentication
• API
• Management of hypervisors
![Page 20: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/20.jpg)
Problem
• Synchronization is really hard
• 1st try: gproc had problems with multiple nodes[1]
• 2nd try: wrapper around gproc -> had a SPOF
• Lots of configuration needed to connect all the systems
![Page 21: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/21.jpg)
What I learned
• Distributed systems are hard, who would have thought that!
• Managing configuration is annoying, especially in a multi-node environment.
• In Erlang land there are great libraries for distribution.
• riak_core rocks!
![Page 22: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/22.jpg)
The fail of storing JSON
![Page 23: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/23.jpg)
Reason
• It’s “easy”, no schema, good library support for serializing and deserializing
• The frontend/UI used it anyway
• Everyone uses JSON, so it must be good, right?
![Page 24: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/24.jpg)
Problem
• Choice based on popularity, not common sense
• No pattern matching
• No good libraries for manipulating JSX-JSON
• Verbose and ‘big’
• Hard to represent data in Erlang (esp. maps/objects)
• Hard to synchronize/merge (statebox[2] is only a partial solution)
![Page 25: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/25.jpg)
What I learned
• Model data around the backend, not the front-end
• JSON is no silver bullet; it has the same problem XML had: it is used for the sake of being used
• CRDTs are a lovely thing[4]
• Records are not perfect but a very nice storage for structured information
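To make the CRDT point concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is written in Python for brevity and is not FiFo code; the function names and the node names in the usage note are made up for illustration. The point is that replicas can be merged in any order, any number of times, and still converge.

```python
def increment(counter, node, amount=1):
    """Return a new counter with `node`'s own slot increased by `amount`."""
    if amount < 0:
        raise ValueError("a grow-only counter cannot be decremented")
    updated = dict(counter)
    updated[node] = updated.get(node, 0) + amount
    return updated


def merge(left, right):
    """Merge two replicas by taking the per-node maximum.

    The merge is commutative, associative and idempotent, so replicas
    converge no matter how often or in which order they are merged.
    """
    return {
        node: max(left.get(node, 0), right.get(node, 0))
        for node in left.keys() | right.keys()
    }


def value(counter):
    """The counter's value is the sum of all per-node slots."""
    return sum(counter.values())
```

Two replicas that diverged, say `a = increment(increment({}, "node_a"), "node_a")` and `b = increment({}, "node_b", 3)`, merge to the same state from either side, and `value(merge(a, b))` is the sum of both sides' increments. Plain JSON blobs offer no such merge rule, which is what makes them hard to synchronize.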
![Page 26: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/26.jpg)
The fail of CAP
![Page 27: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/27.jpg)
Reason
• riak_core really rocks!
• Eventual consistency is a very tempting concept
• Availability is more important than consistency when managing a cloud
![Page 28: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/28.jpg)
Problem
• Except when it is not, like IP assignment or memory constraints on a server :(
• Globally locking those things would break availability
• Not beating CAP anytime soon [3] :(
![Page 29: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/29.jpg)
What I learned
• The more control you have over your data the further you can push the ‘eventual’ in eventual consistency
• Locks don’t have to be global; they just need to cover enough to ensure consistency
• The lock’s location matters:
• Hypervisor memory on the hypervisor itself
• IPs ‘sharded’ over the ring
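A rough sketch of what "locks sharded over the ring" can look like, in Python rather than Erlang and inside a single process. This is an illustration, not FiFo's or riak_core's actual implementation: the class name, the partition count and the hashing scheme are all assumptions. Each network hashes to exactly one partition, and only that partition's lock is taken when an IP is claimed, so claims on unrelated networks never wait on each other.

```python
import hashlib
import threading


class ShardedIpAllocator:
    """Hypothetical allocator: one lock per partition instead of one global lock."""

    def __init__(self, partitions=8):
        self._locks = [threading.Lock() for _ in range(partitions)]
        self._claimed = [set() for _ in range(partitions)]

    def partition_for(self, network):
        # Hash the network name onto a fixed set of partitions. In a real
        # ring each partition would live on some node of the cluster.
        digest = hashlib.sha1(network.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._locks)

    def claim(self, network, ip):
        """Claim `ip` in `network`; return False if it is already taken."""
        index = self.partition_for(network)
        with self._locks[index]:  # serializes claims for this partition only
            key = (network, ip)
            if key in self._claimed[index]:
                return False
            self._claimed[index].add(key)
            return True
```

Claiming the same address twice in one network fails the second time, while the same address in a different network is an independent claim. Hypervisor memory follows the same idea with an even smaller scope: the hypervisor itself is the only place that has to agree on how much memory is left.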
![Page 30: Fi fo euc 2014](https://reader034.vdocuments.us/reader034/viewer/2022052522/554a0732b4c905e56c8b57af/html5/thumbnails/30.jpg)
Links
• https://project-fifo.net
• https://docs.project-fifo.net
• [1] http://christophermeiklejohn.com/erlang/2013/06/05/erlang-gproc-failure-semantics.html
• [2] https://github.com/mochi/statebox
• [3] http://ferd.ca/beating-the-cap-theorem-checklist.html
• [4] http://aphyr.com/posts/285-call-me-maybe-riak