sql gene in nosql

Post on 09-Jan-2017

223 Views

Category:

Engineering

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

SQL Gene in NoSQL

How SQL Extensions are making flexible data structures Query-able

Cihan Biyikoglucihanb@live.com - @cihangirb

NoSQL Now – Aug 2015 – San Jose

Agenda Nature of Data

Relational vs NoSQL Database Approach Extending SQL for JSON

1. Ingest JSON and Return JSON2. Accessing JSON Hierarchies – the dot notation3. Bring full syntax – including JOINs and Sub-Queries4. Handle missing values – MISSING , UNSET5. Fold & Unfold JSON Hierarchies6. Native Set/Collection Handling7. Adapt Indexing to JSON

Nature of Real-World Data

Rich structure Attributes, Sub-structure

Relationships To other data

Value evolution Data is updated

Structure evolution Data is reshaped

Models for Real-World Data

Data Concern Tabular Model Document Model

Rich Structure Multiple flat tables Constant assembly /

disassembly Documents No assembly required!

Relationships Represented Queried (SQL)

Represented Queried with limitations

Value Evolution Data can be updated Data can be updated

Structure Evolution

Uniform and rigid Manual change

(disruptive) Flexible Dynamic change

Models for Real World Data

Data Concern Tabular Model Document Model JSON + SQL

Rich Structure

Multiple flat tables Constant

assembly / disassembly

Documents No assembly

required! Document

Relationships

Represented Queried (SQL)

Represented Queried with

limitations SQL

Value Evolution

Data can be updated

Data can be updated

Data can be updated

Structure Evolution

Uniform and rigid Manual change

(disruptive) Flexible Dynamic change

Flexible & Dynamic

For the Best of Both, Bring SQL to NoSQL From JSON

Rich structure, no assembly Schema flexibility

From SQL General, proven, and pervasive query capabilities Querying across relationships

SQL + JSON

For the Best of Both, Bring SQL to NoSQL SQL + JSON isn’t new and many are doing it

already Couchbase Server, SQL++ , DocumentDB, Apache Spark, Datastax, Relational Guys – SQL Server,

Postgress, MySQL, Memsql And so on…

However these implementation are very different!

SQL ExtensionsRequirements for the Best SQL+JSON

Implementation

Bring SQL to JSON Requirements for Full SQL + JSON Solution

1. Ingest JSON and Return JSON2. Accessing JSON Hierarchies – the dot notation3. Bring full syntax – including JOINs and Sub-Queries4. Handle missing values – MISSING , UNSET5. Fold & Unfold JSON Hierarchies6. Native Set/Collection Handling

#1 Ingest JSON and Return JSON

Tables and Rows require constant assembly and disassembly…

VS

DEMO #1

"default": { "Company": "Some Company", "Organization": { "Research": { "Robotics": { "AI": { "People": [ { "manager": true, "name": "p2" }, { "manager": false, "name": "p3" }, { "manager": false, "name": "p4" } ]

#2 Access the JSON Hierarchies - Dot Notation

<>.Company

<>.Organization.Research

<>.Organization.Research.Robotics.AI.People[0]

#3 Bring full syntax – including JOINs and Sub-

Queries

…You don’t need JOINs & Subqueries in Document

Models…

#3Embedded vs. Referenced

{ “Name”: “Pike Brewing”, …

“City”:”Seattle”, “Country”:”USA”}

“Beers”: [{“Name”: “Blond”, …

},{“Name”:”Dark”,…}

],

Key: Brewery_Pike_Brewing{ “Name”: “Pike Brewing”,“Beers”:[“Pike_Brewing_Blond”, “Pike_Brewing_Dark”], … “City”:”Seattle”, “Country”:”USA”}

Key: Brewery_Pike_Brewing

{ “Name”:”Pike Brewing Blond”, “Type”:”Blond Beer” …}

Key: Pike_Brewing_Blond

#4 Handling Missing Values

Schema evolves fast and SQL should accommodate

SELECT * FROM …WHERE (trips IS MISSING AND departing_code = …) OR (trips IS NOT MISSING AND trips[0].departing_code = …)

v1 v2{ “flight_plan_id”:…, “departing_code”…, “arriving_code”:…, …}

{ “flight_plan_id”:…, “trips”:[ {“departing_code”…, “arriving_code”:…}, {“departing_code”…, “arriving_code”:…}, ] …}

#5 Fold and Unfold JSON Hierarchies

{ “flight_plan_id”:…, “trips”:[ {“departing_code”:…, “arriving_code”:…}, {“departing_code”:…, “arriving_code”:…}, ] …}

DEMO #2

#6 Native Set/Collection Handling

Search a Level or Sub-Hierarchy

SELECT * FROM …WHERE ANY item IN/WITHIN RoboticsSATISFY item.name=“P3”

Research

Robotics

AI

P1 P2

Speech

Listen

P3 P4

Speak

Group 1

P5

Group 2

P6

P3

DEMO #3

Conclusion

Models for Real World Data

Data Concern Tabular Model Document Model JSON + SQL

Rich Structure

Multiple flat tables Constant

assembly / disassembly

Documents No assembly

required! Document

Relationships

Represented Queried (SQL)

Represented Queried with

limitations SQL

Value Evolution

Data can be updated

Data can be updated

Data can be updated

Structure Evolution

Uniform and rigid Manual change

(disruptive) Flexible Dynamic change

Flexible & Dynamic

Thank You@cihangirb

cihanb@live.com

Cihan Biyikoglu

top related