webinar - moving from mysql to couchbase
DESCRIPTION
Many organizations are making the shift from relational databases to NoSQL. In this webinar we'll cover the important things you need to keep in mind in moving from MySQL to Couchbase for your application.TRANSCRIPT
Moving from MySQLto Couchbase
Don Pinto
Product Marketing Manager
Why transition to NoSQL?
You are here because..
You want to build an app
and currently have a relational database
It’s probably gotten a little bit big..
http://www.seoclerks.com/imagedb/2005/BIG-CAT-FOUND-Spoh
Or maybe huge
http://www.2pep.com/extreme-funny-stuff/
And complicated
You’re not aloneTop two big drivers for NoSQL
adoption
Lack of flexibility/rigid schemas
Inability to scale out data
Performance challenges
Cost All of these Other
49%
35%
29%
16% 12% 11%
Source: Couchbase Survey, December 2011, n = 1351.
NoSQL catalog
Key-Value
memcached redis
Data Structure Document Column Graph
mongoDB
couchbase cassandra
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
Neo4j
Distributed DOCUMENT DATABASES
Document Databases
• Each record in the database is a self-describing document
• Each document has an independent structure
• Documents can be complex • All databases require a unique key• Documents are stored using JSON or
XML or their derivatives• Content can be indexed and queried • Offer auto-sharding for scaling and
replication for high-availability
{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:
{“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:
[“SERVER”, “US-West”, “API”]
}}
Comparison of DATA Models
Relational vs Document data model
Relational data model Document data modelCollection of complex documents with
arbitrary, nested data formats andvarying “record” format.
Highly-structured table organization with rigidly-defined data formats and
record structure.
JSONJSON
JSON
C1 C2 C3 C4
{
}
Example: User Profile
Address Info
1 DEN 30303CO
2 MV 94040CA
3 CHI 60609IL
User Info
KEY First ZIP_idLast
4 NY 10010NY
1 Dipti 2Borkar
2 Joe 2Smith
3 Ali 2Dodson
4 John 3Doe
ZIP_id CITY ZIPSTATE
1 2
2 MV 94040CA
To get information about specific user, you perform a join across two tables
All data in a single document
Document Example: User Profile
{ “ID”: 1, “FIRST”: “Dipti”, “LAST”: “Borkar”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA” }
JSON
= +
User ID First Last Zip
1 Dipti Borkar 94040
2 Joe Smith 94040
3 Ali Dodson 94040
4 Sarah Gorin NW1
5 Bob Young 30303
6 Nancy Baker 10010
7 Ray Jones 31311
8 Lee Chen V5V3M
• • •
50000 Doug Moore 04252
50001 Mary White SW195
50002 Lisa Clark 12425
Country ID
TEL3
001
Country ID Country name
001 USA
002 UK
003 Argentina
004 Australia
005 Aruba
006 Austria
007 Brazil
008 Canada
009 Chile
• • •
130 Portugal
131 Romania
132 Russia
133 Spain
134 Sweden
User ID Photo ID Comment
2 d043 NYC
2 b054 Bday
5 c036 Miami
7 d072 Sunset
5002 e086 Spain
Photo Table
001
007
001
133
133
User ID Status ID Text
1 a42 At conf
4 b26 excited
5 c32 hockey
12 d83 Go A’s
5000 e34 sailing
Status Table
134
007
008
001
005
Country Table
User ID Affl ID Affl Name
2 a42 Cal
4 b96 USC
7 c14 UW
8 e22 Oxford
Affiliations TableCountry
ID
001
001
001
002
Country ID
Country ID
001
001
002
001
001
001
008
001
002
001
User Table
...
Making a Change Using RDBMS
Making the Same Change with a Document Database
{ “ID”: 1, “FIRST”: “Don”, “LAST”: “Pinto”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA”, “STATUS”: { “TEXT”: “At Conf” }
}
“GEO_LOC”: “134” },“COUNTRY”: ”USA”
Just add information to a document
JSON
,}
Where is NoSQL a good fit?
Market AdoptionInternet Companies Enterprises
• Social Gaming• Ad Networks• Social Networks• Online Business
Services• E-Commerce• Online Media• Content Management• Cloud Services
• Communications
• Retail
• Financial Services
• Health Care
• Automotive/Airline
• Agriculture
• Consumer Electronics
• Business Systems
Market Adoption – CustomersInternet Companies Enterprises
More than 300 customers -- 5,000 production deployments worldwide
Application Characteristics - Data driven
• 3rd party or user defined structure (Twitter feeds)
• Support for unlimited data growth (Viral apps)
• Data with non-homogenous structure
• Need to quickly and often change data structure
• Variable length documents
• Sparse data records
• Hierarchical data
Couchbase is a good fit
Application Characteristics - Performance driven
• Low latency critical (ex. 1millisecond)
• High throughput (ex. 200000 ops / sec)
• Large number of users
• Unknown demand with sudden growth of users/data
• Predominantly direct document access
• Read / Mixed / Write heavy workloads
Couchbase is a good fit
Common Use CasesSocial Gaming• Couchbase stores
player and game data
• Examples customers include: Zynga
• Tapjoy, Ubisoft, Tencent
Mobile Apps• Couchbase stores user
info and app content
• Examples customers include: Kobo, Playtika
Ad Targeting• Couchbase stores
user information for fast access
• Examples customers include: AOL, Mediamind, Convertro
Session store• Couchbase Server as a key-
value store
• Examples customers include: Concur, Sabre
User Profile Store• Couchbase Server as a
key-value store
• Examples customers include: Tunewiki
High availability cache• Couchbase Server used as a cache tier replacement
• Examples customers include: Orbitz
Content & Metadata Store
• Couchbase document store with Elastic Search
• Examples customers include: McGraw Hill
3rd party data aggregation • Couchbase stores social media and
data feeds• Examples customers include:
Sambacloud
From SQL to NoSQL
Mental Adjustments
• In SQL we tend to want to avoid hitting the database as much as possible
• In Couchbase, gets and sets on documents are in memory. This is fast.
• The key to finding data is the key of the document.
• Many newcomers see views as a replacement for key design, because it seems more “SQL” –like.
Complex Joins vs. Multiple Gets
Document modeling
When considering how to model data for a given application
• Think of a logical container for the data
• Think of how data groups together
• Start by creating documents from application-level objects
• Documents that grow continuously or under write contention should be split
Q• Are these separate objects in the model layer? • Are these objects accessed together? • Do you need updates to these objects to be atomic?• Are multiple people editing these objects concurrently?
Document Design Options
• One document that contains all related data Data is de-normalized Better performance and scale Eliminate client-side joins But if the document grows continuously or has write
contention, it should be split
• Separate documents for different object types with cross references Data duplication is reduced Objects may not be co-located Transactions supported only on a document boundary Most document databases do not support joins
Document ID / Key selection• Similar to primary keys in relational databases
• Documents are partitioned based on the document ID
• ID based document lookup is extremely fast
• Usually an ID can only appear once in a bucket
Options• UUIDs, date-based IDs, numeric IDs • Hand-crafted (human readable) • Matching prefixes (for multiple related objects)• Creative Keying (on next few slides )
Q • Do you have a unique way of referencing objects?• Are related objects stored in separate documents?
Example: Entities for a Blog
• User profile
The main pointer into the user data• Blog entries• Badge settings, like a twitter badge
• Blog posts
Contains the blogs themselves
• Blog comments
• Comments from other users
BLOG
{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:
{“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:
[“SERVER”, “US-West”, “API”]
}}
Blog Document – Option 1 – Single document
{ “_id”: “Couchbase_Hello_World”,“author”: “dborkar”, “type”: “post”“title”: “Hello World”,“format”: “markdown”, “body”: “Hello from [Couchbase](http://couchbase.com).”, “html”: “<p>Hello from <a href=\“http: …“comments”:[ [“format”: “markdown”, “body”:”Awesome post!”], [“format”: “markdown”, “body”:”Like it.” ] ]}
Creative KeyingCounter-ID pattern
Uses of this pattern
• Creating an index based list
• Globally referenced documents with unique key
Creative KeyingLookup pattern
Uses of this pattern
• Categorical references
• Single document referred by multiple keys (Lookup user by username, TwitterID, FacebookID)
Blog Document – Option 2 - Split into multiple docs
{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:
{“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:
[“SERVER”, “US-West”, “API”]
}}
{ “_id”: “Coucbase_Hello_World”,“author”: “dborkar”, “type”: “post”“title”: “Hello World”,“format”: “markdown”, “body”: “Hello from [Couchbase](http://couchbase.com).”, “html”: “<p>Hello from <a href=\“http: …“comments”:[
“comment1_Couchbase_Hello_world”
]}
{ “UUID”: “21f7f8de-8051-5b89-86“Time”: “2011-04-01T13:01:02.42“Server”: “A2223E”,“Calling Server”: “A2213W”,“Type”: “E100”,“Initiating User”: “[email protected]”,“Details”:
{“IP”: “10.1.1.22”,“API”: “InsertDVDQueueItem”,“Trace”: “cleansed”,“Tags”:
[“SERVER”, “US-West”, “API”]
}}
{“_id”: “comment1_Couchbase_Hello_World”,“format”: “markdown”, “body”:”Awesome post!” }
BLOG DOC
COMMENT
Threaded Comments• You can imagine how to take this to a threaded list
BlogFirst comment
Reply to comment
More Comments
List
List
Advantages• Only fetch the data when you need it• For example, rendering part of a web page
• Spread the data and load across the entire cluster
Java PetStore App Example
Pet Store ER Diagram
Documents Modeling in CouchbaseCustomer Document{"id":"customer_marc","type":"customer","login":"marc","password":"marc","firstname":"Marc","lastname":"Fleury","telephone":null,"email":"[email protected]","homeAddress":{"street1":"65 Ritherdon Road","street2":null,"city":"Los Angeles","state":null,"zipcode":"56421","country":"USA"},"dateOfBirth":1363794557891,"age":null}
Order Document{"id":"Marc","type":"order","orderDate":null,"customer":{"id":1,"login":"marc","password":"marc","firstname":"Marc",…},},"orderLines":[{"id":null,"quantity":1,"item":{"id":"item_Goldfish_Male Puppy","type":"item","name":"Male Puppy",}},
Category Document{"id":"category_Birds","type":"category","name":"Birds","description":"Any of ...","products":[{"id":"product_Amazon Parrot","type":"product","name":"Amazon Parrot","description":"Great companion for up to 75 years","items":[{"id":"item_Male Adult","type":"item","name":"Male Adult","description":"Lorem ...",
Pet store queries in Couchbase
Define a primary index on the bucket
• Lookup the document ID / key by key, range, prefix, suffixIndex definition
Define a secondary index on the bucket
• Lookup an attribute by value, range, prefix, suffixIndex definition
List all the item categories• If a document if of type category, we want to get the doc IDs
Find items by name• Let’s find all the items
Application code changesDerby callsfind : TypedQuery<Category> typedQuery = em.createNamedQuery(Category.FIND_BY_NAME, Category.class);
save : em.persist(category);
update : em.merge(category);
delete : em.remove(em.merge(category));
Couchbase callsfind : client.get(categoryName);
save : client.set(category.getName(), EXP_TIME, mapper.writeValueAsString(category));
update : client.replace(category.getName(), EXP_TIME, mapper.writeValueAsString(category));
delete : client.delete(category.getName());
Questions?
Thank [email protected]
@NoSQLDon