![Page 1: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/1.jpg)
From NoSQL to Mo’SQL
Gordon Guthrie
![Page 2: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/2.jpg)
Lots of people have worked on Riak TSAndrei ZavadaAndy TillBill SoudanBrett HazenBrian McClainBryce KerleyDerek SomogyiErik Johnson
Erik LeitchHeather McKelveyJohn DailyLauren RotherPaul HaganPavel HardakSeema Jethani
![Page 3: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/3.jpg)
SQL and NoSQL?
a match made in hell?
![Page 4: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/4.jpg)
select key, value from bucket where key=‘ModelT’;
![Page 5: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/5.jpg)
![Page 6: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/6.jpg)
The ring itself is a pain right up yer bahooky in slides
Cate Alice Alice Alice Bob
Bob Bob Cate Cate
1 2 3 4 5 6 7 8 9 10 11 12
Key = Alice
Hash Fn
Key = BobKey = Cate
![Page 7: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/7.jpg)
The ring is logical on physical nodes
Cate Alice Alice Alice Bob
Bob Bob Cate Cate
1 2 3 4 5 6 7 8 9 10 11 12
Server 1 Server 2 Server 3
![Page 8: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/8.jpg)
This give you two query modes1 2 3 4 5 6 7 8 9 10 11 12
I have a key I don’t have a key
![Page 9: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/9.jpg)
And its not just about the travelling1 2 3 4 5 6 7 8 9 10 11 12
I don’t have a key
![Page 10: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/10.jpg)
In summary
• You can talk to 2 servers if you have a key
• You must talk to all servers if you don’t
![Page 11: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/11.jpg)
Lets see how TS works
8:01 8:03 8:04
8:01 8:03 8:04
8:01 8:03 8:04 9:01 9:09
9:01 9:09
9:01 9:09
7:03 7:07 7:09
7:03 7:07 7:09
7:03 7:07 7:09
1 2 3 4 5 6 7 8 9 10 11 12
Key = (Alice, quantum(Time, 10, ’s'))
Hash Fn
![Page 12: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/12.jpg)
Different access patterns
• You can talk to 2 servers if you want to query the data across 1 quantum
• Add another 2 for 2 quanta• eventually must talk to all
servers
![Page 13: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/13.jpg)
There are trade-offs
• You can make the quanta bigger which means less trips to read more data
• but your write pattern gets lumpier with higher risk of hot spots
![Page 14: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/14.jpg)
Can we improve that?
![Page 15: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/15.jpg)
We have 2i indices1 2 3 4 5 6 7 8 9 10 11 12
Key
Index
To get a list of keys that match an index you visit 1/3 of nodes +1 and make an index read
![Page 16: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/16.jpg)
You’ve seen the movie
The SQL
![Page 17: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/17.jpg)
Why SQL
•Everybody knows it/low barrier to entry
•Good tooling• Its a declarative language, but extendable
![Page 18: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/18.jpg)
High level architecture
Riak KV
Riak Core
leveldb leveldb leveldb
Riak QL
![Page 19: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/19.jpg)
KV stores are know-nothing wrt valuesCREATE TABLE GeoCheckin
( id SINT64 NOT NULL,
time TIMESTAMP NOT NULL, region VARCHAR NOT NULL,
state VARCHAR NOT NULL, weather VARCHAR NOT NULL,
temperature DOUBLE, PRIMARY KEY (
(id, QUANTUM(time, 15, 'm')), id, time
))
Erlang data structure(a record)
leex/yecc
riak core metadata
distribute round cluster with riak_core
load Erlang ddl helper module
as .beam
merl
![Page 20: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/20.jpg)
The query system is distributed
Co-ordinator maybe remote vnode
maybe remote vnode
Init
new query
get data
get data
read data
process read data
process
process reply
![Page 21: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/21.jpg)
Quite complex query validation process
| DDL DDL DDL Function Helper | Helper Helper Type Sigs + + + + | | | +-----+-----+ | | | | | | | v v vDeclatory Are fields YES Is valid TS YES Is query YES To QuerySQL Select +----> in table? +----------> WHERE clause? +-----> typesafe? +-----> RewriterRecord + + + NO | NO | NO | | | | | v v v Error | Error Error
|
Lexer/Parser - QL | Query Compiler KV
![Page 22: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/22.jpg)
Unroll all the SQLSELECT AVG(temp) FROM mytimeseries WHERE family = ‘myfamily' AND series = ‘myseries' AND timestamp > 1233 AND timestamp < 6789 AND temp > 18;
<----Erlang Coordinator-----> <-----LeveldDB C++ Code-----> <---Network--->+ FROM <-----------------------+ + FROM mytable on vnode X| | || SELECT SUM(STemp)/SUM(NoTemp) | | SELECT SUM(temp) AS STemp, COUNT(temp) AS NoTemp| | Chunk1 || GROUP BY [] +--------+ GROUP BY []| | || ORDER BY [] | | ORDER BY []| | |+ WHERE [] | + WHERE + start_key = {myfamily, myseries, 1233} | | end_key = {myfamily, myseries, 4000} | + temp > 18 | | + FROM mytable on vnode Y | | | | SELECT SUM(temp) AS STemp, COUNT(temp) AS NoTemp | Chunk2 | +--------+ GROUP BY [] | | ORDER BY [] | + WHERE + start_key = {myfamily, myseries, 4001} | end_key = {myfamily, myseries, 6789} + temp > 18
![Page 23: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/23.jpg)
declarative SQL(decorated with execution hints)
Query rewriting in a nutshell
Query Plan(executable fragments)
transform syntax
preserve semantics
![Page 24: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/24.jpg)
Schematic SQL Operations
Table In Shell Data On Disk
+-------+-------+ +-------+-------+-------+| ColX | ColY | | Col1 | Col2 | Col3 || Type1 | Type2 | | Type1 | Type2 | Type3 |+-------+-------+ SQL Query +-------+-------+-------+ <--------------++-------+-------+ +-------+-------+-------+| Val1X | Val1Y | | Val1a | Val1b | Val1c |+---------------+ +-----------------------+| Val2X | Val2Y | | Val2a | Val2b | Val2c |+-------+-------+ +-----------------------+ | Val3a | Val3b | Val3c | +-------+-------+-------+
![Page 25: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/25.jpg)
All the fragments meet this pattern - row ops
+-------+-------+-------+ +-------+-------+-------+| Col1 | Col2 | Col3 | | Col1 | Col2 | Col3 || Type1 | Type2 | Type3 | | Type1 | Type2 | Type3 |+-------+-------+-------+ Operation +-------+-------+-------+ <-------------++-------+-------+-------+ +-------+-------+-------+| Val1a | Val1b | Val1c | WHERE | Val1a | Val1b | Val1c |+-----------------------+ GROUP BY +-----------------------+| Val3a | Val3b | Val3c | ORDER BY | Val2a | Val2b | Val2c |+-----------------------+ LIMIT +-----------------------+| Val6a | Val6b | Val6c | DISTINCT | Val3a | Val3b | Val3c |+-----------------------+ HAVING +-----------------------+| Val5a | Val5b | Val5c | | Val4a | Val4b | Val4c |+-------+-------+-------+ +-----------------------+ | Val5a | Val5b | Val5c | +-----------------------+ | Val6a | Val6b | Val6c | +-------+-------+-------+
![Page 26: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/26.jpg)
Row and column operations
+-------+-------+ +-------+-------+-------+| ColX | ColY | | Col1 | Col2 | Col3 || Type1 | Type2 | | Type1 | Type2 | Type3 |+-------+-------+ Operation +-------+-------+-------+ <-------------++-------+-------+ +-------+-------+-------+| Val1X | Val1Y | SELECT | Val1a | Val1b | Val1c |+---------------+ +-----------------------+| Val2X | Val2Y | | Val2a | Val2b | Val2c |+-------+-------+ +-----------------------+ | Val3a | Val3b | Val3c | +-------+-------+-------+
![Page 27: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/27.jpg)
Column Name Vector Ops
+-------+-------+-------+ +-------+-------+-------+| ColX | ColY | ColZ | | Col1 | Col2 | Col3 || Type1 | Type2 | Type3 | | Type1 | Type2 | Type3 |+-------+-------+-------+ Operation +-------+-------+-------+ <-------------++-------+-------+-------+ +-------+-------+-------+| Val1a | Val1b | Val1c | AS | Val1a | Val1b | Val1c |+-----------------------+ +-----------------------+| Val2a | Val2b | Val2c | | Val2a | Val2b | Val2c |+-----------------------+ +-----------------------+| Val3a | Val3b | Val3c | | Val3a | Val3b | Val3c |+-----------------------+ +-----------------------+| Val4a | Val4b | Val4c | | Val4a | Val4b | Val4c |+-------+-------+-------+ +-----------------------+
![Page 28: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/28.jpg)
Executable Fragments
{where, [ {and_, {'=', <<"sequence_number">>, {integer, 2321}}, {'=', <<"time">>, {integer, 1400497861762723}} } ]}
YASL’s all the way down
![Page 29: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/29.jpg)
How much SQL?• SELECT• WHERE• GROUP BY• ORDER BY/LIMIT is being worked on• functions:
–AVG/MEAN–MAX–MIN–SUM–COUNT–STDDEV/STDDEV_SAMP–STDDEV_POP
![Page 30: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/30.jpg)
What does ‘decorated with execution hints mean’?
SELECT * FROM mytable;SELECT * FROM mytable LIMIT 1000;SELECT * FROM mytable WITH frobulate=on;
Standard SQL works in Tools
Extensions Set as table defaults
![Page 31: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/31.jpg)
What does the future hold?
![Page 32: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/32.jpg)
Riak has other sorts of co-located data• CRDT sets look like colocation
– 100,000 elements in a CRDT set– written to a vnode under a key
• performance issues– monolithic object– read 100,000 element set from disk– operate on it– write it back to disk
Enter big sets!
![Page 33: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/33.jpg)
How would that work?
Table In Shell Data On Disk
+-------+-------+ +-------+-------+-------+| ColX | ColY | | Col1 | Col2 | Col3 || Type1 | Type2 | | Type1 | Type2 | Type3 |+-------+-------+ SQL Query +-------+-------+-------+ <--------------++-------+-------+ +-------+-------+-------+| Val1X | Val1Y | | Val1a | Val1b | Val1c |+---------------+ +-----------------------+| Val2X | Val2Y | | Val2a | Val2b | Val2c |+-------+-------+ +-----------------------+ | Val3a | Val3b | Val3c | +-------+-------+-------+
Table Schemas
Set of Rows
Maps
Set of Maps
![Page 34: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/34.jpg)
High level architecture
Big Sets/Big Maps
Riak Core
leveldb leveldb leveldb
Riak QL
![Page 35: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/35.jpg)
…and because maps are recursive and can contain sets which can be maps
•we have prototyped subsets of relational queries–left or inner joins
Much excites!
![Page 36: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/36.jpg)
You still have the 2 query paths…
Talk to all servers
Talk to 2 servers
Can we ‘steal’ some of the causality information from Delta ops and use that to build single point access eventually consistent indices?
Dunno!
![Page 37: From NoSQL to Mo’SQL€¦ · riak_core load Erlang ddl helper module as .beam merl. The query system is distributed Co-ordinator maybe remote vnode maybe remote vnode Init new query](https://reader033.vdocuments.us/reader033/viewer/2022050400/5f7dbe4f0dd9c05f674b01c7/html5/thumbnails/37.jpg)
Fin