sneak peek: couchbase analytics
TRANSCRIPT
©2016CouchbaseInc. 1
The Couchbase Connect16mobile appTake our in-app survey!
©2016CouchbaseInc.
Sneak Peek: Couchbase AnalyticsTill Westmann
Senior Director Engineering
Yingyi BuSenior Software Engineer
©2016CouchbaseInc.©2016CouchbaseInc.
Agenda
• Vision• How to use it?•What is new?•What is in the Developer Preview?• Demo
3
©2016CouchbaseInc. 4
What is Couchbase Analytics?
• Extend Couchbase Platform to power real-time analytics
• Ad-hoc queries (“Ask me anything!”)
• Workload isolation
• Independent scaling
UNIFIEDPROGRAMMINGINTERFACE
DATA QUERY INDEX TRANSPORT SEARCH ANALYTICS
UNIFIEDADMINISTRATION
COUCHBASECLUSTER
• Common programming model & data model
• Unified management
• Fast data synchronization
©2016CouchbaseInc. 5
How to use it?SQL++
©2016CouchbaseInc.©2016CouchbaseInc.
SQL++ Data: Beer Sample
{"name": "Piranha Pale Ale","abv": 5.7,"ibu": 0,"srm": 0,"upc": 0,"type": "beer","brewery_id": "110f04166d","updated": "2010-07-22 20:00:20","description": "","style": "American-Style Pale Ale","category": "North American Ale"
}
{"name": "Commonwealth Brewing #1","city": "Boston","state": "Massachusetts","code": "","country": "United States","phone": "","website": "","type": "brewery","updated": "2010-07-22 20:00:20","description": "","address": [ ],"geo": {
"accuracy": "APPROXIMATE","lat": 42.3584,"lng": -71.0598
}}
6
©2016CouchbaseInc.©2016CouchbaseInc.
SQL++: Simple Join
"Get 3 beers with their breweries"
SELECT bw.name AS brewer, br.name AS beer
FROM breweries bw, beers br
WHERE br.brewery_id = meta(bw).id
ORDER BY bw.name, br.name
LIMIT 3;
[{"brewer": "(512) Brewing Company","beer": "(512) ALT"
}, {"brewer": "(512) Brewing Company","beer": "(512) Bruin"
}, {"brewer": "(512) Brewing Company","beer": "(512) IPA"
}]
7
©2016CouchbaseInc.©2016CouchbaseInc.
SQL++: Non-key Self Join
"Get 3 beer names used by different breweries"
SELECT b1.name AS beer,
b1.brewery_id AS brewer1,
b2.brewery_id AS brewer2
FROM beers b1, beers b2
WHERE b1.name = b2.name
AND b1.brewery_id != b2.brewery_id
ORDER BY b1.brewery_id
LIMIT 3;
[{"brewer1": "aberdeen_brewing","brewer2": "hoffbrau_steaks_brewery_2","beer": "Scottish Ale"
},{"brewer1": "aberdeen_brewing","brewer2": "carlyle_brewing","beer": "Scottish Ale"
},{"brewer1": "aberdeen_brewing","brewer2": "belhaven_brewery","beer": "Scottish Ale"
}]
8
©2016CouchbaseInc.©2016CouchbaseInc.
SQL++: Nested Outer Join
"Get 2 breweries and the list of their beers"
SELECT bw.name AS brewer, (
SELECT br.name, br.abv
FROM beers br
WHERE br.brewery_id = meta(bw).id
) AS beers
FROM breweries bw
ORDER BY bw.name
LIMIT 2;
[{"beers": [
{ "abv": 8.2, "name": "(512) Pecan Porter" },{ "abv": 5.8, "name": "(512) Pale" }, ...
],"brewer": "(512) Brewing Company"
},{"beers": [
{ "abv": 7.2, "name": "21A IPA" },{ "abv": 5.8, "name": "North Star Red" }, ...
],"brewer": "21st Amendment Brewery Cafe"
}]
9
©2016CouchbaseInc.©2016CouchbaseInc.
SQL++: Grouping and Aggregation
"Get all breweries that produce more than 37 beers"
SELECT br.brewery_id, COUNT(*) AS num_beers
FROM beers br
GROUP BY br.brewery_id
HAVING num_beers > 37
ORDER BY num_beers DESC;
[{"num_beers": 57,"brewery_id": "midnight_sun_brewing_co"
},{"num_beers": 49,"brewery_id": "rogue_ales"
},{"num_beers": 38,"brewery_id": "anheuser_busch"
}]
10
©2016CouchbaseInc.©2016CouchbaseInc.
Couchbase Analytics DDL: Lifecycle
• Analytics extension for shadow datasets
CREATE BUCKET beerbucket WITH { "name": "beer-sample", "nodes":"127.0.0.1" };
CREATE SHADOW DATASET beers ON beerbucket WHERE `type` = "beer";
CREATE SHADOW DATASET breweries ON beerbucket WHERE `type` = "brewery";
CONNECT BUCKET beerbucket WITH { "password": "!@#", "timeout": 2000 };
SELECT * FROM beers ORDER BY abv DESC LIMIT 12;
DISCONNECT BUCKET beerbucket;
DROP DATASET breweries ;
DROP DATASET beers;
DROP BUCKET beerbucket;
11
©2016CouchbaseInc. 12
What is new?Why another service?
©2016CouchbaseInc. 13
Why another service?
• Extend Couchbase Platform to power real-time analytics
• Ad-hoc queries (“Ask me anything!”)
• Workload isolation
• Independent scaling
UNIFIEDPROGRAMMINGINTERFACE
DATA QUERY INDEX TRANSPORT SEARCH ANALYTICS
UNIFIEDADMINISTRATION
COUCHBASECLUSTER
• Common programming model & data model
• Unified management
• Fast data synchronization
©2016CouchbaseInc.©2016CouchbaseInc.
Couchbase Query and Analytics
14
§ Many queries § Eachtouchesalittledata § Fewerqueries § Eachtouchesalotofdata
Couchbase4.xQuery CouchbaseAnalytics
Optimizedfor
Analytics(OLAP)
Optimizedfor
Operations(OLTP)
©2016CouchbaseInc.©2016CouchbaseInc.
Example: Join, Grouping, and Aggregation
15
"Get the 10 chattiest users in a timeframe"
SELECT user.id, COUNT(message) AS count
FROM gbook_messages AS message, gbook_users AS user
WHERE message.author_id = user.id
AND message.send_time BETWEEN "2001-11-28T09:57:13" AND "2001-11-29T09:57:13"
GROUP BY user.id
ORDER BY count DESC
LIMIT 10;
©2016CouchbaseInc.©2016CouchbaseInc.
Couchbase Query and Analytics
1m(<10) 1h(<500) 1d(<5000)
JoinGByCBA JoinGByN1QLGSI
1w(<25K) 1mo(<100K) 3mo(<300K) 6mo(<600K)
JoinGByCBA JoinGByN1QLGSI
©2016CouchbaseInc.©2016CouchbaseInc.
Parallel Processing Example: Generalized Aggregation
17
SELECT br.brewery_id, COUNT(*) AS num_beersFROM beers brGROUP BY br.brewery_id;
{ "name": "dubbel", "brewery_id": 3 }{ "name": "saison", "brewery_id": 3 }
{ "name": "pils", "brewery_id": 2 }
{ "num_beers": 1, "brewery_id": 2 }{ "num_beers": 2, "brewery_id": 3 }
{ "name": "kölsch", "brewery_id": 2 }{ "name": "lager", "brewery_id": 1 }
{ "name": "ale", "brewery_id": 1 }
{ "num_beers": 2, "brewery_id": 1 }{ "num_beers": 1, "brewery_id": 2 }
{ "name": "tripel", "brewery_id": 3 }{ "name": "stout", "brewery_id": 1 }
{ "name": "weizen", "brewery_id": 2 }
{ "num_beers": 1, "brewery_id": 1 }{ "num_beers": 1, "brewery_id": 2 }{ "num_beers": 1, "brewery_id": 3 }
{ "num_beers": 2, "brewery_id": 3 }{ "num_beers": 2, "brewery_id": 1 }{ "num_beers": 1, "brewery_id": 1 }{ "num_beers": 1, "brewery_id": 3 }
{ "num_beers": 3, "brewery_id": 1 }{ "num_beers": 3, "brewery_id": 3 }
{ "num_beers": 1, "brewery_id": 2 }{ "num_beers": 1, "brewery_id": 2 }{ "num_beers": 1, "brewery_id": 2 }
{ "num_beers": 3, "brewery_id": 2 }
local aggregation
repartition
final aggregation
©2016CouchbaseInc.©2016CouchbaseInc.
Couchbase Analytics Coupling
• Separate services, separate nodes• Multi-Dimensional Scaling• Workload isolation
• Parallel shadowing of data(sets) via DCP• Low impact on data nodes• Low latency
18
ANALYTICS
ANALYTICS
ANALYTICS
ANALYTICS
DATA
DATA
DATA
©2016CouchbaseInc. 19
What is in the Developer Preview?
What works today?
©2016CouchbaseInc. 20
What is in the DP?
• Extend Couchbase Platform to power real-time analytics
• Ad-hoc queries (“Ask me anything!”)
• Workload isolation
• Independent scaling
UNIFIEDPROGRAMMINGINTERFACE
DATA QUERY INDEX TRANSPORT SEARCH ANALYTICS
UNIFIEDADMINISTRATION
COUCHBASECLUSTER
• Common programming model & data model
• Unified management
• Fast data synchronization
✘
✔
✔
✔
✔
✔
©2016CouchbaseInc.©2016CouchbaseInc.
What is in the DP?
• Stand-alone service
• HTTP APIs• SQL++ queries • Data shadowing • Minimal administration
• Query workbench
• "Installer"• Zip-file• Shell script for single machine
21
CCDriver
NCService
NCDriver NCDriver
NCService
SQL++Compiler
HTTPAPI Workbench
MetadataManager
StorageManager
DataflowEngine
StorageManager
DataflowEngine
©2016CouchbaseInc.©2016CouchbaseInc.
What is in the DP?
• Stand-alone service
• HTTP APIs• SQL++ queries • Data shadowing • Minimal administration
• Query workbench
• "Installer"• Zip-file• Shell script for single machine
22
CCDriver
NCService
NCDriver NCDriver
NCService
1 2 3 4
Partition Partition Partition Partition
1-4
©2016CouchbaseInc.
Demo
23
©2016CouchbaseInc.
after the break (3:10 PM) in this room
SQL++: SQL for NoSQLProf. Yannis Papakonstantinou
24
©2016CouchbaseInc.
Thank You!
Q & A
25
©2016CouchbaseInc. 26
Share your opinion on Couchbase1. Go here: http://gtnr.it/2eRxYWn
2. Create a profile
3. Provide feedback (~15 minutes)