The Future of Big Data is Relational (or Why You Can't Escape SQL)
![Page 1: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/1.jpg)
The Future of Big Data is Relational (or Why You Can't Escape SQL)
Twitter: @tobrien
Thursday, February 28, 13
![Page 2: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/2.jpg)
In this session...
•Ouroboros
•Copernican Revolution
•Ptolemaic Entrenchment
•Janus
•A two minute summary of the last 15 years
•Google Magic
•The Future of SQL
![Page 3: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/3.jpg)
Tim O’Brien
I’m a developer who also writes
Twitter: @tobrien
![Page 4: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/4.jpg)
![Page 5: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/5.jpg)
![Page 6: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/6.jpg)
Revolution
![Page 7: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/7.jpg)
Remember all that Big Data Stuff?
![Page 8: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/8.jpg)
Remember when we all thought it was time to give up schemas?
Man, wasn’t that a lot of work.
![Page 9: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/9.jpg)
What if the relational database “catches up”?
What then?
![Page 10: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/10.jpg)
How we market Big Data:
Big Data == Paradigm Shift
“singularity” > “disruptor”
![Page 11: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/11.jpg)
![Page 12: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/12.jpg)
![Page 13: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/13.jpg)
“Big Data” is to “Traditional Databases” as...
Copernicus is to Ptolemy
![Page 14: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/14.jpg)
Out with the “old”
In with the “new”
![Page 15: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/15.jpg)
Copernicus’ model, 1543 AD
Claudius Ptolemy, ~150 AD
![Page 16: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/16.jpg)
Google’s BigTable Paper - 2006
Edgar F. Codd, “A Relational Model of Data for Large Shared Data Banks”, 1970
Hadoop - 2007
![Page 17: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/17.jpg)
![Page 18: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/18.jpg)
Google’s BigTable Paper - 2006
Hadoop - 2007
+
Codd
=
Google F1, Spanner, Translattice, Impala, Drawn-to-Scale, NuoDB, Akiban, many more NewSQL products
![Page 19: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/19.jpg)
![Page 20: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/20.jpg)
![Page 21: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/21.jpg)
Youth: Looking Forward
Age: Looking Backward
![Page 22: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/22.jpg)
Whatever.
Haven’t you heard?
Databases don’t scale.
Let’s create a schema.
Ok?
![Page 23: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/23.jpg)
And, both are right...
![Page 24: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/24.jpg)
![Page 25: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/25.jpg)
![Page 26: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/26.jpg)
![Page 27: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/27.jpg)
2000 In the beginning...
Proprietary app servers
Big Oracle database
![Page 28: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/28.jpg)
2001
More traffic?
Specialized application servers
Throw hardware at the database
![Page 29: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/29.jpg)
2002-2005 More traffic?
Specialized application servers
Throw hardware at the database
![Page 30: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/30.jpg)
2005 Even More Traffic?
Sharding.... ugh.
Everything else was scaling horizontally except the database.
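For readers who never had to shard by hand, here is a minimal Python sketch of what application-level sharding looked like. It is an illustration only: the shard names, `shard_index`, `route`, and `moved_after_resharding` are all hypothetical, and real deployments used many different routing schemes.

```python
import hashlib

# Hypothetical shard names; in practice each would be a separate database server.
SHARDS = ["db0", "db1", "db2", "db3"]


def shard_index(key: str, shard_count: int) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(user_id: str) -> str:
    """Pick the shard that owns this user's rows."""
    return SHARDS[shard_index(user_id, len(SHARDS))]


def moved_after_resharding(keys, old_count: int, new_count: int) -> int:
    """Count keys whose shard changes when the shard count changes."""
    return sum(
        1 for k in keys
        if shard_index(k, old_count) != shard_index(k, new_count)
    )
```

With this simple modulo scheme, going from four shards to five reassigns most keys, and any query that spans users has to fan out to every shard. That is a large part of the "ugh".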
![Page 31: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/31.jpg)
2006 - New Reality of Big Data
Google’s BigTable Paper - 2006
Hadoop - 2007
Q: What would Google do?
A: Not use an RDBMS
![Page 32: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/32.jpg)
2006
Big Data for a few
vs.
RDBMS for most
![Page 33: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/33.jpg)
2007
Who needs Foreign Keys? Transactions? Just Simplify
•The rise of Database “Luddites”
![Page 34: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/34.jpg)
2007
•The rise of Database “Luddites”
Rails hacked away at database “orthodoxy”
Opened the door to alternative approaches
![Page 35: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/35.jpg)
•Although, Basecamp is still a single RDBMS…
![Page 36: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/36.jpg)
2007 - present == Alternatives
•Documents
–MongoDB – Started in 2007, OSS in 2009
–CouchDB – Started in 2005
•Graphs
–Neo4j
•Key-Value Stores
–Cassandra
–Riak
–Tokyo Cabinet
•Memory
–Memcached / Redis
•Tabular
–HBase
![Page 37: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/37.jpg)
2012
Q: What database do you use?
A: All of them
Oracle, Mongo, MySQL, Impala, Riak, some memcache, and some Hadoop thrown in for fun
![Page 38: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/38.jpg)
![Page 39: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/39.jpg)
Big Data a Necessity at Largest Scale
Most development still RDBMS
“A certain kind of developer at a certain kind of company”
![Page 40: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/40.jpg)
•There’s this company that sells advertising
–~96% of revenue came from advertising in 2011
–~75% of the US Search Advert Market in 2011
–~44% share of overall online ad market
•One of the most important applications at Google ran on MySQL
–AdWords missed the NoSQL revolution
![Page 41: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/41.jpg)
Digging into the evolution of Storage at Google
•Google’s BigTable – 2006
–Tabular
–Sparse, distributed, multi-dimensional sorted map
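To make "sparse, distributed, multi-dimensional sorted map" concrete, here is a toy, single-process Python model of the data shape: cells addressed by (row, column, timestamp) and kept in sorted key order. It is only a sketch of the idea. `SparseSortedMap` and its methods are hypothetical names, and nothing here is distributed or persistent.

```python
from bisect import bisect_left, insort


class SparseSortedMap:
    """Toy model: (row, column, timestamp) -> value, kept in sorted key order."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key -> value; absent cells take no space (sparse)

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negate so newer versions sort first
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get_latest(self, row, column):
        # Timestamps are stored negated, so the first key at or after
        # (row, column, -inf) is the newest version of that cell.
        i = bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_rows(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for rows in [start_row, end_row)."""
        i = bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Because rows are sorted, a range scan over neighboring row keys is cheap. Notice what is missing compared to a relational database: no joins, no secondary indexes, no multi-row transactions.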
![Page 42: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/42.jpg)
Digging into the evolution of Storage at Google
•Google’s BigTable – 2006
–“New users [] uncertain of how to best use the BigTable interface, particularly if they are accustomed to using relational databases that support general-purpose transactions.”
![Page 43: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/43.jpg)
Digging into the evolution of Storage at Google
•Google’s Megastore – 2010
–Hierarchical “schemas”
–Positioned as a NoSQL store
–ACID within partitions
![Page 44: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/44.jpg)
Digging into the evolution of Storage at Google
•Google’s Megastore – 2010
–“Supports two-phase commit for atomic updates [] these transactions have much higher latency and increase the risk of contention, we generally discourage applications from using the feature”
![Page 45: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/45.jpg)
Digging into the evolution of Storage at Google
•Google’s Spanner & F1 – 2012
•Paper published in 2012
–Hierarchical, Semi-relational Schemas
–ACID across continents possible - 14ms transaction overhead in a data-center with clock uncertainty of 1ms.
–SQL
–Focus on Performance
•Gated by Clock Uncertainty
•Consensus: Paxos
![Page 46: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/46.jpg)
What Differentiates Google Spanner?
•Transactions are only possible because of Paxos
•Forget NTP, Google has “Reified Clock Uncertainty”
•Epsilon, the clock uncertainty, is the gating factor for gaining consensus on transaction timestamps.
•It’s all about Time
•“as the underlying system enforces tighter bounds on clock uncertainty, the overhead of the stronger semantics decreases. As a community, we should no longer depend on loosely synchronized clocks and weak time APIs in designing distributed algorithms.”
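Why epsilon gates performance can be sketched in a few lines of Python. This is a toy model of the "commit wait" idea and not Google's implementation: `TTInterval`, `SimulatedTrueTime`, and `commit` are hypothetical names, the clock is a local clock with an assumed fixed epsilon, and Paxos replication is left out entirely.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # ...and no later than this


class SimulatedTrueTime:
    """Stand-in for a clock API that admits its uncertainty (epsilon, in seconds)."""

    def __init__(self, epsilon: float, clock=time.monotonic):
        self.epsilon = epsilon
        self._clock = clock

    def now(self) -> TTInterval:
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)


def commit(tt: SimulatedTrueTime, apply_writes, sleep=time.sleep) -> float:
    """Assign a commit timestamp, apply writes, then wait out the uncertainty."""
    commit_ts = tt.now().latest  # a timestamp no earlier than the true "now"
    apply_writes(commit_ts)
    # Commit wait: do not report success until commit_ts is certainly in the past.
    while tt.now().earliest <= commit_ts:
        sleep(tt.epsilon / 10)
    return commit_ts
```

In this sketch every commit waits roughly two epsilons before it can be acknowledged, so the tighter the bound on clock uncertainty, the cheaper the stronger semantics get.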
![Page 47: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/47.jpg)
Let me reiterate: Google has Mastered Time
![Page 48: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/48.jpg)
What Differentiates Google Spanner?
•Hierarchical, Schematized Tables
•Similar to Akiban’s approach.
•Leads to some interesting possibilities.
•Nested Subqueries and Tree Results
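One way to picture hierarchical tables and tree results is a Python sketch in which child rows carry their parent's key as a prefix, so a whole tree sits together in sorted order. The table names, the sample rows, and the `subtree` helper are all made up for illustration, and this is not how Spanner or Akiban actually lay out storage.

```python
from bisect import bisect_left

# Hypothetical rows: a child's key starts with its parent's key.
ROWS = sorted(
    [
        ((1,), {"customer": "Ada"}),
        ((2,), {"customer": "Bob"}),
        ((1, "orders", 10), {"total": 25}),
        ((1, "orders", 11), {"total": 40}),
        ((1, "orders", 10, "items", 1), {"sku": "widget"}),
    ],
    key=lambda row: row[0],
)


def subtree(sorted_rows, prefix):
    """Return every row whose key starts with prefix, via one contiguous scan."""
    keys = [key for key, _ in sorted_rows]
    i = bisect_left(keys, prefix)
    result = []
    while i < len(keys) and keys[i][:len(prefix)] == prefix:
        result.append(sorted_rows[i])
        i += 1
    return result
```

Fetching a customer with all of their orders and items is then a single range read over adjacent rows rather than a multi-way join, which is what makes nested results natural in this layout.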
![Page 49: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/49.jpg)
What Differentiates Google Spanner?
To reiterate:
* hierarchical, schematized tables
* distributed “compute fabric” for data
* Google has mastered Time
* Google built a warp reactor
![Page 50: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/50.jpg)
As goes Google, so goes the world...
Translattice, Drawn-to-Scale, Akiban, Impala
Several NewSQL companies quickly jumped on this train:
- NuoDB
- VoltDB
Yes, we’ve had Hive for a while, but these new initiatives represent a more robust effort.
![Page 51: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/51.jpg)
Translattice
Translattice identifies itself as a database that resembles F1.
It is a hosted database service which provides distributed transactions.
Translattice uses Paxos.
They’ve extended PostgreSQL and emphasize customer control over data.
A distributed, cloud-based database.
![Page 52: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/52.jpg)
Akiban
Akiban’s approach to storage almost *exactly* matches the strategy Google uses in Spanner.
Akiban lacks the distributed transaction capability of Spanner and F1, but they are working on developing the capability.
Akiban has implemented a query parser, optimizer, and execution engine atop a hierarchical approach to storage.
![Page 53: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/53.jpg)
Drawn-to-Scale
Reportedly the most similar to F1 on the market. Fault-tolerant in distributed environments.
Created a Query Parser + Optimizer + Execution Engine atop a distributed “compute fabric”
No Paxos or Transactions... yet. To be released shortly. Stay tuned.
Drawn to Scale aims to be an “installable” database. Not going the hosted route.
Data stored in HDFS/HBase.
![Page 54: The Future of Big Data is Relational (or why you can't escape SQL)](https://reader034.vdocuments.us/reader034/viewer/2022052505/5551521db4c905f2288b55ff/html5/thumbnails/54.jpg)
So there.
Big Data is turning into a Big Relational Database