sql and nosql are two sides of the same coin

21
© 2012 Microsoft SQL AND NOSQL ARE TWO SIDES OF THE SAME COIN Michael Rys, Microsoft Corp. @SQLServerMike Strata 2012 Conference, March 2012

Upload: woody

Post on 25-Feb-2016

52 views

Category:

Documents


1 download

DESCRIPTION

SQL and NoSQL Are Two Sides Of The Same Coin. Michael Rys, Microsoft Corp. @ SQLServerMike. Strata 2012 Conference, March 2012. Agenda. Scaling out your business is important! NoSQL Paradigms and NoSQL Platforms SQL learns from NoSQL (with a demo of SQL Azure Federations) - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: SQL and  NoSQL  Are Two Sides Of The Same Coin

© 2012 Microsoft

SQL AND NOSQL ARE TWO SIDES OF THE SAME COIN

Michael Rys, Microsoft Corp.@SQLServerMike

Strata 2012 Conference, March 2012

Page 2: SQL and  NoSQL  Are Two Sides Of The Same Coin

AGENDA• Scaling out your business is important!• NoSQL Paradigms and NoSQL Platforms• SQL learns from NoSQL (with a demo of SQL Azure Federations)

• NoSQL learns from SQL• Scalable Data Processing Platform of the Future

Page 3: SQL and  NoSQL  Are Two Sides Of The Same Coin

Attract Individual Consumers:- Provide

interesting service

- Provide mobility

- Provide social

Monetize Individual:- Upsell service

- VIP- Speed- Extra

Capabilities

Monetize the Social:- Improve individual

experience- Re-sell Aggregate

Data (e.g., Advertisers)

THE WEB 2.0 BUSINESS ARCHITECTURE

Page 4: SQL and  NoSQL  Are Two Sides Of The Same Coin

SOCIAL NETWORKING: THE BUSINESS PROBLEM• 100s of million of users

• 10s of million of users concurrently• Terabytes to petabytes of data

• Structured and unstructured• Required (eventual) data consistency across users• E.g. show your updated state in your friends’ profile pages

Page 5: SQL and  NoSQL  Are Two Sides Of The Same Coin

SOLUTION• Shard/Partition user data across

hundreds to thousands of SQL Databases• Propagate data changes from one DB to

other DBs using reliable, async Message Service• Managing routes from each DB to every other

DB would be too complex• Global Transactions would hinder scale and

availability• Provide a caching layer for performance• And also used for

oClean-up state (e.g. on account close)

oDeploy business logic (stored procedures)

Page 6: SQL and  NoSQL  Are Two Sides Of The Same Coin

EXAMPLE ARCHITECTURE

1-1000

3001-4000

2001-3000 1001-

2000

4001-5000

5001-6000

I change my status

userId=1024

Web TierData Tier

ServiceDispatch

er

AsyncMessage

AsyncMessage

TX2

My DBgets updatedTX

1AsyncMessage

TX3

TX4

TX5

Page 7: SQL and  NoSQL  Are Two Sides Of The Same Coin

MANY LARGE SCALE CUSTOMERS USING SIMILAR PATTERNS• Patterns

• Sharding and reliable messaging• Sharding and fan/out query layer• Caching layer

• Customer Examples• Social Networking: Facebook, MySpace, etc• Online electronic stores (cannot give names )• Travel reservation systems (e.g. Choice International)• MSN Casual Gaming• etc.

Page 8: SQL and  NoSQL  Are Two Sides Of The Same Coin

LESSONS LEARNED FROM THESE SCENARIOS• Require high availability• Be able to scale out:

• Functional and Data Partitioning Architecture• Provide scale-out processing:

o Function shippingo Fanout and Map/Reduce processing

• Be able to deal with failures:o Quorumo Retrieso Eventual Consistency (similar to Read-consistent Snapshot Isolation)

• Be able to quickly grow and change:• Elastic scale• Flexible, open schema• Multi-version schema support

Move better support for these patterns into the Data Platform!

Page 9: SQL and  NoSQL  Are Two Sides Of The Same Coin

WHAT IS NOSQL ABOUT?• NoSQL = operational and developer agility at low CapEx and OpEx!

• Low Cost• Free Open Source Stores• Scale CapEx cost below customer growth rate• Web friendly developer model and tool chain

• Processing Paradigms• High Availability (scalable Replication, Fast Failover, DR/GeoDR, tunable latency)• Scale-out (Sharding, Map-Reduce, Elasticity)• Performance (tuned for specific workloads, Caching, co-located compute with partitioned state)• Tunable/Eventual Consistency

• Data Model Paradigms• Data first: Flexible Schema• Low-impedance mismatch between programming and data model:

o Key-Documents and Objects (BLOBS, JSON, XML, POJO)o Key-Wide Sparse Column Setso Graphs (e.g., RDF)

• Range from devices, over OLTP Web 2.0 applications to BigData Analytics

Page 10: SQL and  NoSQL  Are Two Sides Of The Same Coin

DATA MODELSData Model Example Stores (apologies to the ones I did not

list)Simple Key-Value Pairs Memcache, Redis, Dynamo, Voldermort, LevelDB, Azure

CachingWide Sparse Column Sets HyperTable, Big Table, Cassandra, HBASE, Hyperbase,

Amazon DynamoDB, Windows Azure Tables, SQL Server/Azure Sparse columns

BLOBs Amazon S3, Oracle Berkeley NoSQL, Windows Azure Blob Store, SQL Server RBS/FileTable

JSON Documents MongoDB, CouchBase, Riak, RavenDB

Graph Neo4J, GraphDB, HypergraphDB, Stig, Intellidimension

Objects and XML Documents

Versant, Oracle Berkeley NoSQL, MarkLogic, existDB, EMC HiveDB, SQL Server/Azure, Oracle, IBM DB2

Extended Relational Oracle, EMC SQLFire, IBM DB2, MySQL, Postgres, SQL Server/Azure

Page 11: SQL and  NoSQL  Are Two Sides Of The Same Coin

WHAT CAN SQL LEARN FROM NOSQL?

• Low CapEx, Low OpEx • Built-in tunable High-Availability• Data scale-out (Sharding)• Processing scale-out (Map-Reduce, Fan-Out, tunable consistency) • Flexible Data Models

• JSON (& XML) support• Sparse columns/Column sets

• Integrate with BigData Analytics (e.g., Hadoop)

Many Relational Database Systems are incorporating these learning!

Page 12: SQL and  NoSQL  Are Two Sides Of The Same Coin

EXAMPLE: SQL AZURE FEDERATIONS• Provides Data Partitioning/Sharding at the Data Platform• Enables applications to build elastic scale-out applications• Provides non-blocking SPLIT/DROP for shards (MERGE to

come later)• Auto-connect to right shard based on sharding keyvalue• Provides SPLIT resilient query mode

Page 13: SQL and  NoSQL  Are Two Sides Of The Same Coin

SQL AZURE FEDERATION CONCEPTS

16

Federation “Orders_Fed”

ShardedApplication

Azure DB with Federation RootFederation Directories, Federation Users, Federation Distributions, …

Federation- Represents the data being sharded

Federation Root- Database that logically houses federations,

contains federation meta data Federation Key

- Value that determines the routing of a piece of data (defines a Federation Distribution)

Atomic Unit- All rows with the same federation key value:

always together! Federation Member (aka Shard)

- A physical container for a set of federated tables for a specific key range and reference tables

Federated Table- Table that contains only atomic units for the

member’s key range Reference Table

- Non-sharded table

Member: PK [min, 100)

Member: PK [100, 488)

Member: PK [488, max)

(Federation Key: CustomerID)

AUPK=

5

AUPK=25

AUPK=35

AUPK=105

AUPK=235

AUPK=365

AUPK=555

AUPK=254

5

AUPK=356

5Connectio

n G

ateway

Page 14: SQL and  NoSQL  Are Two Sides Of The Same Coin

17

DEMOMAP-REDUCE SCALE-OUT OVER SQL AZURE FEDERATIONS

• Sharded GamesInfo table using SQL Azure Federations

• Use a C# library that does implement a Map/Reduce processor on top SQL Azure Federations

• Mapper and Reducer are specified using SQL

Page 15: SQL and  NoSQL  Are Two Sides Of The Same Coin

WHAT CAN NOSQL LEARN FROM SQL?• Flexible data is good, but:

• Provide optional schema in data platform to help with constraints and optimizations• Procedural Scale-Out processing is good, but:

• Develop a declarative language suited for and across the data models (e.g., coSQL)• Standardize suitable abstractions and languages

• Eventual Consistency is good, but:• Provide users the choice

• Simple Queries are good, but:• Provide me with secondary indexes• it will be more efficient to join between two collections of JSON documents in the

query engine than in the Application layer

Many NoSQL Database Systems are starting to incorporate these learnings!

Page 16: SQL and  NoSQL  Are Two Sides Of The Same Coin

Attract Individual Consumers:- Provide

interesting service

- Provide mobility

- Provide social

Monetize Individual:- Upsell service

- VIP- Speed- Extra

Capabilities

Monetize the Social:- Improve individual

experience- Re-sell Aggregate

Data (e.g., Advertisers)

THE WEB 2.0 BUSINESS ARCHITECTURE

Page 17: SQL and  NoSQL  Are Two Sides Of The Same Coin

OLTP Workloads

Highly AvailableHigh ScaleHigh Flexibility

mostly touching 1 to low number of shards

Dynamic OLAP Workloads

3Vs (Volume, Velocity, Variety) Exploratory

Scale-out queries, often using eventual consistent scale-out frameworks like Hadoop

SCALE-OUT DATA PLATFORM ARCHITECTURE

Traditional OLAP Workloadsknown schemaData warehouse, “Star joins”

Copy

Query

SQL or NoSQL Store

Page 18: SQL and  NoSQL  Are Two Sides Of The Same Coin

BIG DATA REQUIRES AN END-TO-END APPROACH

21

Page 19: SQL and  NoSQL  Are Two Sides Of The Same Coin

22

CALL TO ACTION• Familiarize yourself with the NoSQL genes in the Microsoft Online Platform

• Free 3-Month Trial for Windows and SQL Azure: http://www.windowsazure.com

• Engage with us throughout Strata

• Download slides with additional information and related resources: http://.../....

Presentation Speaker Date and TimeDo We Have the Tools We Need to Navigate the New World of Data?

Dave Campbell 2/29 9:00am PST

Onsite Interview * Tim O’Reilly, Dave Campbell 2/29 10:15am PST

Unleash Insights on All Data With Microsoft Big Data

Alexander Stojanovic 2/29 11:30am PSTOffice Hours (Q&A session) Dave Campbell 2/29 1:30pm PSTHadoop + Javascript: What We Learned

Asad Khan 2/29 2:20pm PSTDemocratizing BI at Microsoft: 40,000 Users and Counting

Kirkland Barrett 3/1 10:40am PSTData Marketplaces For Your Extended Enterprise

Piyush Lumba 3/1 2:20pm PST

Page 20: SQL and  NoSQL  Are Two Sides Of The Same Coin

23

APPENDIX

Page 21: SQL and  NoSQL  Are Two Sides Of The Same Coin

RELATED RESOURCES• Scale-Out with SQL Databases

• http://gigaom.com/cloud/facebook-shares-some-secrets-on-making-mysql-scale/ • Windows Gaming Experience Case Study: http://

www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=4000008310 • Scalable SQL: http://cacm.acm.org/magazines/2011/6/108663-scalable-sql• http://www.slideshare.net/MichaelRys/scaling-with-sql-server-and-sql-azure-federations

• NoSQL and the Windows Azure Platform• Whitepaper:

http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf

• SQL Federation blog: http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx

• Contact me• @SQLServerMike• http://sqlblog.com/blogs/michael_rys/default.aspx