Post on 15-Jan-2015
How Hailo fuels its growth using NoSQL Storage and Analytics
David Gardner, Architect @ Hailo
#NoSQLNow
What is Hailo?
• The world’s highest-rated taxi app – over 10,000 five-star reviews
• Over 500,000 registered passengers
• A Hailo e-hail is accepted by a driver every four seconds around the world
• Hailo operates in ten cities, from Tokyo to Toronto, after just over eighteen months of operation
Hailo is growing
• Hailo is a marketplace that facilitates over $100M in run-rate transactions and is making the world a better place for passengers and drivers
• Hailo has raised over $50M in financing from the world's best investors including Union Square Ventures, Accel, the founder of Skype (via Atomico), Wellington Partners (Spotify), Sir Richard Branson, and our CEO's mother, Janice
What this talk is about
• Why Hailo are using NoSQL
• How we use Cassandra
• How we use Acunu Analytics
• Challenges of NoSQL
Why choose NoSQL?
“NoSQL DBs trade off traditional features to better support new and emerging use cases”
Andy Gross, Riak
http://www.slideshare.net/argv0/riak-use-cases-dissecting-the-solutions-to-hard-problems
What are we trading off?
• More widely used, tested and documented software
• Ad-hoc querying
• Talent pool with direct experience
What do we get back in return?
• High availability
• Scalability
• Operational simplicity
The path to adoption at Hailo
Hailo launched in London in November 2011
• Launched on AWS
• Two PHP/MySQL web apps plus a Java backend
• Mostly built by a team of 3 or 4 backend engineers
• MySQL multi-master for single AZ resilience
Why Cassandra?
• A desire for greater resilience – “become a utility”
  Cassandra is designed for high availability
• Plans for international expansion around a single consumer app
  Cassandra is good at global replication
• Expected growth
  Cassandra scales linearly for both reads and writes
• Prior experience
  I had experience with Cassandra and could recommend it
The path to adoption
• Largely unilateral decision by developers – a result of a startup culture
• Replacement of key consumer app functionality, splitting up the PHP/MySQL web app into a mixture of global PHP/Java services backed by a Cassandra data store
• Launched into production in September 2012 – originally just powering North American expansion, before gradually switching over Dublin and London
Cassandra at Hailo
“Cassandra just works”
Dom W, Senior Engineer
Use cases
1. Entity storage
2. Time series data
CF = customers
126007613634425612:
  createdTimestamp: 1370465412
  email: dave@cruft.co
  givenName: Dave
  familyName: Gardner
  locale: en_GB
  phone: +447911111111
Considerations for entity storage
• Do not read the entire entity, update one property and then write back a mutation containing every column
• Only mutate columns that have been set
• This avoids read-before-write race conditions
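The race described above can be sketched with an in-memory dictionary standing in for the customers column family. This is only an illustration under assumptions: `simulate`, `KEY` and the replacement values are hypothetical names, and no real Cassandra client is involved. Two writers read the same snapshot, then each changes one property; the sketch compares writing back every column with writing only the column that was set.

```python
KEY = "126007613634425612"  # row key taken from the customers example

def simulate(full_writeback):
    """Run two concurrent-style updates and return the final row.

    full_writeback=True  -> each writer sends every column it read earlier
    full_writeback=False -> each writer sends only the column it changed
    """
    # In-memory stand-in for the column family: row key -> {column: value}.
    store = {KEY: {"email": "dave@cruft.co", "locale": "en_GB"}}

    # Both writers read the entity before either of them writes.
    snapshot_a = dict(store[KEY])
    snapshot_b = dict(store[KEY])

    def mutation(snapshot, column, value):
        if full_writeback:
            # Anti-pattern: stale values for untouched columns ride along.
            return {**snapshot, column: value}
        # Only the column that was actually set.
        return {column: value}

    # Writer A changes the locale, then writer B changes the email.
    store[KEY].update(mutation(snapshot_a, "locale", "en_US"))
    store[KEY].update(mutation(snapshot_b, "email", "dave@example.com"))
    return store[KEY]

lost_update = simulate(full_writeback=True)    # locale falls back to en_GB
both_kept = simulate(full_writeback=False)     # both changes survive
```

With full write-back, writer B's stale copy of `locale` overwrites writer A's change; with per-column mutations neither writer touches the other's column.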
CF = comms
2013-06-01:
  55374fa0-ce2b-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  a48bd800-ce2b-11e2-8b8b-0800200c9a66: {"to":"foo@ex…
  b0e15850-ce2b-11e2-8b8b-0800200c9a66: {"to":"bar@ho…
  bfac6c80-ce2b-11e2-8b8b-0800200c9a66: {"to":"baz@fo…
CF = comms
dave@cruft.co:
  13b247f0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  20f70a40-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  2b44d3b0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
  338a22f0-ce2c-11e2-8b8b-0800200c9a66: {"to":"dave@c…
Considerations for time series storage
• Choose row key carefully, since this partitions the records
• Think about how many records you want in a single row
• Denormalise on write into many indexes
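A minimal sketch of denormalising on write, modelled on the two comms rows shown earlier (one row per day, one row per recipient). It uses a plain dictionary instead of Cassandra; `record_comm`, `read_row` and the `subject` field are hypothetical and exist only for illustration. Each record is written once per index row, under a time-based UUID column name, so a reader can fetch a whole row without any join.

```python
import json
import uuid
from collections import defaultdict
from datetime import datetime, timezone

# In-memory stand-in for the comms column family:
# row key -> {time-based UUID column name: JSON value}
comms = defaultdict(dict)

def record_comm(to, subject):
    """Write one communication into every index row it belongs to."""
    now = datetime.now(timezone.utc)
    column = uuid.uuid1()  # version 1 UUID, which embeds a timestamp
    value = json.dumps({"to": to, "subject": subject})

    # The row key decides the partition: a day bucket bounds row size,
    # a recipient row answers "everything sent to this address".
    day_key = now.strftime("%Y-%m-%d")
    for row_key in (day_key, to):
        comms[row_key][column] = value
    return day_key, column

def read_row(row_key):
    """Return a row's values ordered by the timestamp inside each column name."""
    row = comms[row_key]
    return [row[c] for c in sorted(row, key=lambda u: u.time)]
```

The day bucket is one possible answer to "how many records in a single row"; a busier stream might need hourly buckets, a quieter one monthly.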
Client libraries
• Astyanax (Java)
• phpcassa (PHP)
• github.com/carloscm/gossie (Go)
2 clusters
6 machines per region
3 regions
(stats cluster pending addition of third DC)
Operational Cluster: ap-southeast-1, us-east-1, eu-west-1
Stats Cluster: us-east-1, eu-west-1
AWS VPCs with OpenVPN links
3 AZs per region
m1.large machines
Provisioned IOPS EBS
[Diagram: Operational Cluster and Stats Cluster; node sizes shown as ~ 600GB/node and ~ 100GB/node]
Multi DC
• Something that Cassandra makes trivial
• Would have been very difficult to accomplish active-active inter-DC replication with a team of 2 without Cassandra
• Rolling repair needed to make it safe (we use LOCAL_QUORUM)
• We schedule “narrow repairs” on different nodes in our cluster each night
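LOCAL_QUORUM waits only for a majority of replicas in the local data centre, so replicas in remote data centres depend on asynchronous replication and on repair to converge. The sketch below shows the quorum arithmetic and one simple way to rotate a nightly repair across nodes. The replication factor, the node names and the round-robin policy are all assumptions for illustration; the talk does not say how Hailo's schedule is implemented.

```python
def local_quorum(replication_factor):
    """Replicas that must respond in the local DC: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def node_for_night(nodes, night_index):
    """Pick the node to repair on a given night, round-robin."""
    return nodes[night_index % len(nodes)]

# Hypothetical names for the 6 machines in one region.
nodes = [f"cassandra-{i}" for i in range(1, 7)]

# With an assumed replication factor of 3 per data centre,
# LOCAL_QUORUM needs 2 local replicas.
needed = local_quorum(3)

# One repair per night visits every node within 6 nights.
first_week = [node_for_night(nodes, night) for night in range(7)]
```

Whatever the policy, the full rotation should complete within the tables' `gc_grace_seconds` window, since repair is what stops deleted data from reappearing.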
Acunu Analytics at Hailo
Analytics
• With Cassandra we lost the ability to carry out analytics, e.g. COUNT, SUM, AVG, GROUP BY
• We use Acunu Analytics to give us this ability in real time, for pre-planned query templates
• It is backed by Cassandra and therefore highly available, resilient and globally distributed
• Integration is straightforward
events → NSQ → Acunu → C*
AQL
SELECT SUM(accepted), SUM(ignored), SUM(declined), SUM(withdrawn)
FROM Allocations
WHERE timestamp BETWEEN '1 week ago' AND 'now' AND driver='LON123456789'
GROUP BY timestamp(day)
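One way a pre-planned query like this can be answered in real time is to roll events up into counters as they arrive, keyed by the dimensions the template groups on. The sketch below does that in memory for the (driver, day) grouping. It is an illustration of the general idea, not a description of how Acunu Analytics works internally; `ingest`, `query` and the event shape are hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timezone

OUTCOMES = ("accepted", "ignored", "declined", "withdrawn")

# (driver, day) -> running totals per outcome, updated at ingest time.
counters = defaultdict(lambda: dict.fromkeys(OUTCOMES, 0))

def ingest(driver, timestamp, outcome):
    """Roll one allocation event into its (driver, day) bucket."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    day = datetime.fromtimestamp(timestamp, tz=timezone.utc).date()
    counters[(driver, day)][outcome] += 1

def query(driver, first_day, last_day):
    """Per-day sums for one driver, like GROUP BY timestamp(day)."""
    return {
        day: dict(counters[(d, day)])
        for d, day in sorted(counters)
        if d == driver and first_day <= day <= last_day
    }
```

Reads then touch one small counter row per day instead of scanning raw events, which is why the query shape has to be known in advance.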
Challenges
[Chart: Average years experience per team member, MySQL vs Cassandra (axis tick: 10)]
[Diagram: People who can attempt to query MySQL vs People who can attempt to query Cassandra]
Lessons learned
• Have an advocate - get someone who will sell the vision internally
• Teach team members the fundamentals of how the solution works
• Don’t cause yourself a “big data” problem unnecessarily
• Explain trade-offs in choosing NoSQL to all parts of the business
• Provide solutions!
[Diagram: People who can attempt to query MySQL vs People who can attempt to query Cassandra]
Conclusion
We like Cassandra
• Solid design
• HA characteristics
• Easy multi-DC setup
• Simplicity of operation
The future
• We will continue to invest in Cassandra as we expand globally
• We will hire people with experience running Cassandra
• We will focus on expanding our reporting facilities
• We aspire to extend our network (1M consumer installs, wallet) beyond cabs
• We will continue to hire the best engineers in London, NYC and Asia