when to no sql and when to know sql javaone
DESCRIPTION
An introduction to the different types of NoSQL and some guidance on when to choose them, and when to use plain old SQL. Focuses on developer productivity, intuitive code, and system issues including scaling and usage patterns. As delivered at JavaOne 2014 in San FranciscoTRANSCRIPT
![Page 1: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/1.jpg)
Simon Elliston Ball Head of Big Data
@sireb
When to NoSQL and When to Know SQL
#noSQLknowSQL
http://nosqlknowsql.io
![Page 2: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/2.jpg)
Not only SQL
SQL
what is NoSQL?
Many many things
NoSQL
No, SQL
![Page 3: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/3.jpg)
files
before SQL
multi-value
ur… hash maps?
![Page 4: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/4.jpg)
everything is relational
after SQL
ORMs fill in the other data structures
scale up rules
data first design
![Page 5: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/5.jpg)
datastores that suit applications
and now NoSQL
polyglot persistence: the right tools
scale out rules
APIs not EDWs
![Page 6: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/6.jpg)
data growth
why should you care?
machine learning
social
rapid development
fewer migration headaches… maybe
![Page 7: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/7.jpg)
big bucks.
O’Reilly 2013 Data Science Salary Survey
![Page 8: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/8.jpg)
So many NoSQLs…
![Page 9: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/9.jpg)
![Page 10: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/10.jpg)
document databases
![Page 11: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/11.jpg)
document databases
known access pattern
JSON docs
complex, variable models
rapid development
![Page 12: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/12.jpg)
document databases
JUST DON’Tjoins?
learn a very new query language
denormalize
http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
document form
![Page 13: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/13.jpg)
document vs SQLwhat can SQL do?
query all the angles
sure, you can use blobs…
… but you can’t get into them
![Page 14: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/14.jpg)
documents in SQLSQL xml fields
mapping xquery paths is painful
native JSON
but still structured
![Page 15: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/15.jpg)
class of database database
query everything: search
full-text indexing
![Page 16: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/16.jpg)
range query
you know google, right…
span query
keyword query
![Page 17: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/17.jpg)
scores
you know the score"query": { "function_score": { "query": { "match": { "title": "NoSQL"} }, "functions": [ "boost": 1, "gauss": { "timestamp": { "scale": "4w" } }, "script_score" : { "script" : "_score * doc['important_document'].value ? 2 : 1" } ], "score_mode": "sum" }}
![Page 18: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/18.jpg)
scores
SQL knows the score toodeclare @origin float = 0;declare @delay_weeks float = 4;
SELECT TOP 10 * FROM ( SELECT title, score * CASE WHEN p.important = 1 THEN 2.0 WHEN p.important = 0 THEN 1.0 END * exp(-power(timestamp-@origin,2)/(2*@delay*7*24*3600)) + 1 AS score FROM posts p WHERE title LIKE '%NoSQL%') as foundORDER BY score
![Page 19: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/19.jpg)
more like this: instant tf-idf
you know google, right…
{ "more_like_this" : { "fields" : ["name.first", "name.last"], "like_text" : "text like this one", "min_term_freq" : 1, "max_query_terms" : 12 }}
![Page 20: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/20.jpg)
Facets
![Page 21: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/21.jpg)
Facets
![Page 22: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/22.jpg)
FacetsSQL:
x lots
SELECT a.name, count(p.id) FROM people p JOIN industry a on a.id = p.industry_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ...GROUP BY a.name
SELECT a.name, count(p.id) FROM people p JOIN area a on a.id = p.area_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ...GROUP BY a.name
![Page 23: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/23.jpg)
FacetsElastic search:{ "query": { “query_string": { “default_field”: “content”, “query”: “keywords” } }, "facets": { “myTerms": { "terms": { "field" : "lang", "all_terms" : true } } }}
![Page 24: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/24.jpg)
untyped free-text documents
logs
timestamped
semi-structured
discovery
aggregation and statistics
![Page 25: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/25.jpg)
close to your programming model
key: value
distributed map | list | set
keys can be objects
![Page 26: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/26.jpg)
hash types
SQL extensions
hstore
![Page 27: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/27.jpg)
inheritance
SQL and polymorphism
ORMs hide the horror
![Page 28: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/28.jpg)
columnar databases
turning round the rows
physical layout matters
![Page 29: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/29.jpg)
turning round the rowskey value type
1 A Home
2 B Work
3 C Work
4 D Work
00001 1 A Home 00002 2 B Work 00003 3 C Work …
Row storage
A B C D Home Work Work Work …
Column storage
![Page 30: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/30.jpg)
teaching an old SQL new tricksMySQL InfoBright
SQL Server Columnar Indexes
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_col ON Orders (OrderDate, DueDate, ShipDate)
Great for your data warehouse, but no use for OLTP
![Page 31: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/31.jpg)
Parquet
ORC files
column for hadoop and other animals
http://parquet.io
![Page 32: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/32.jpg)
wide column databases
column families
![Page 33: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/33.jpg)
millions of columns
eventually consistent
CQL
http://cassandra.apache.org/http://www.datastax.com/
set | list | map types
![Page 34: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/34.jpg)
SQL: so many views, so much confusion
cell level security
accumulo https://accumulo.apache.org/
![Page 35: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/35.jpg)
Tim
e se
ries
![Page 36: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/36.jpg)
timeretrieving time series and graphs
window functionsSELECT business_date, ticker,
close, close /
LAG(close,1) OVER (PARTITION BY ticker ORDER BY business_date ASC)
- 1 AS ret FROM sp500
![Page 37: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/37.jpg)
timeOpen TSDB
Influx DB
key:sub-key:metric
key:*
key:*:metric
![Page 38: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/38.jpg)
Queues
![Page 39: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/39.jpg)
CREATE procedure [dbo].[Dequeue]AS
set nocount on
declare @BatchSize intset @BatchSize = 10
declare @Batch table (QueueID int, QueueDateTime datetime, Title nvarchar(255))
begin tran
insert into @Batchselect Top (@BatchSize) QueueID, QueueDateTime, Title from QueueMetaWITH (UPDLOCK, HOLDLOCK)where Status = 0order by QueueDateTime ASC
declare @ItemsToUpdate intset @ItemsToUpdate = @@ROWCOUNT
update QueueMetaSET Status = 1WHERE QueueID IN (select QueueID from @Batch)AND Status = 0
if @@ROWCOUNT = @ItemsToUpdatebegin
commit transelect b.*, q.TextData from @Batch binner join QueueData q on q.QueueID = b.QueueIDprint 'SUCCESS'
endelsebegin
rollback tranprint 'FAILED'
end
queues in SQL
![Page 40: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/40.jpg)
index fragmentation is a problem
queues in SQL
but built in logs of a sort
![Page 41: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/41.jpg)
specialised apis
message queues
capabilities like fan-out
routing
acknowledgement
![Page 42: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/42.jpg)
relationships countGraph databases
![Page 43: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/43.jpg)
relationships counttrees and hierarchies
overloaded relationships
fancy algorithms
![Page 44: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/44.jpg)
hierarchies with SQLadjacency lists
nested sets (MPTT)
materialised path
CONSTRAIN fk_parent_id_id FOREIGN KEY parent_id REFERENCES some_table.id
path = 1.2.23.55.786.33425
Node Left Right Depth
A 1 22 1
B 2 9 2
C 10 21 2
D 3 4 3
E 5 6 3
F 7 8 3
G 11 18 3
H 19 20 3
I 12 13 4
J 14 15 4
K 16 17 4
![Page 45: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/45.jpg)
OrientDB
Consolidation
ArangoDB
SQLs with NoSQL
Postgres
MySQL
https://www.arangodb.org/
http://www.orientechnologies.com/
![Page 46: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/46.jpg)
Big Data
![Page 47: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/47.jpg)
SQL on Hadoop
![Page 48: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/48.jpg)
Shark
More than SQL
Drill
Cascading
Map Reduce
![Page 49: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/49.jpg)
System issues, Speed issues,Soft issues
![Page 50: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/50.jpg)
Atomic
the ACID, BASE litmus
Consistent
Isolated
Durable
Basically Available
Soft-state
Eventually consistent
what matters to you?
![Page 51: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/51.jpg)
CAP it all
Consistency
AvailabilityPartition
![Page 52: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/52.jpg)
SQL writes cost a lot
write fast, ask questions later
mainly write workload: NoSQL
low latency write workload: NoSQL
![Page 53: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/53.jpg)
most NoSQL scales well
is it web scale?
but clusters still need management
are you facebook? one machine is easier than n
can ops handle it? app developers make bad admins
![Page 54: When to no sql and when to know sql javaone](https://reader038.vdocuments.us/reader038/viewer/2022103016/5560e798d8b42a016e8b514d/html5/thumbnails/54.jpg)
analysts: they want SQL
who is going to use it?
developers: they want applications
data scientists: they want access