Clustrix Big Data Podcast

The Leading Scale-out SQL Database Engineered for the Cloud. Robin Purohit, CEO and President

Upload: insidehpc

Post on 07-May-2015

DESCRIPTION

In this slidecast, Robin Purohit of Clustrix describes the company's leading scale-out SQL database engineered for the cloud. "Clustrix provides the scale, flexibility, simplicity, availability, and raw power that have given both enterprise and fast-growth organizations the ability to innovate faster -- and drive those innovations to market sooner than their competition. As the most mature of the primary databases, Clustrix is the leading scale-out SQL database engineered for the cloud. With Clustrix, organizations can scale transactions, run real-time analytics, and simplify operations." Learn more: http://www.clustrix.com Watch the presentation video: http://inside-bigdata.com/2013/09/06/clustrix-scaleout-sql-database-engineered-cloud/

TRANSCRIPT

Page 1: Clustrix Big Data Podcast

The Leading Scale-out SQL Database Engineered for the Cloud

Robin Purohit

CEO and President

Page 2: Clustrix Big Data Podcast

SCALE-OUT DATABASES ARE THE RIGHT APPROACH

UNLESS YOU HAVE UNLIMITED MONEY TO SPEND

NoSQL NewSQL Hadoop

Page 3: Clustrix Big Data Podcast

FOR HYPER-SCALE WEB AND MOBILE APPLICATIONS

Cloud Makes It Possible to Do This Quickly and Pay-as-you-go

Great Idea Billions of Transactions and Rows

Smarter Application

Ad Hoc Reporting

Page 4: Clustrix Big Data Podcast

SCALE-OUT SQL DATABASE FOR OPERATIONAL DATA

MASSIVE TRANSACTION VOLUME

REAL-TIME ANALYTICS

ACID, SQL AND MYSQL

SELF-MANAGING

BUILT-IN INSTRUMENTATION

SCALE-OUT SQL

Add nodes as demand grows

Automated recovery on failure

OPERATIONAL DATABASE

Page 5: Clustrix Big Data Podcast

E-commerce

EXAMPLE APPLICATION SEGMENTS

BATTLE-TESTED LESSONS

Consumer Web Advertising Analytics

Page 6: Clustrix Big Data Podcast

BUSTING THE MYTH - SQL CAN SCALE

• 20 million+ users / 70,000+ TPS
• Write-heavy workload; 1 TB+ writes / day

Massive Transaction Scale Real-Time Analytics

MIXED WORKLOADS
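To put the figures on this slide in daily terms, here is a back-of-the-envelope sketch. It assumes the 70,000 TPS rate is sustained for a full 24 hours and that 1 TB means 10^12 bytes; neither assumption comes from the slide, and since both figures are quoted as "+", the results are lower bounds. The function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def transactions_per_day(tps):
    """Transactions per day at a sustained per-second rate (assumed constant)."""
    return tps * SECONDS_PER_DAY


def average_write_rate_mb_per_s(terabytes_per_day):
    """Average write rate in MB/s, using decimal units (1 TB = 10**12 bytes)."""
    bytes_per_day = terabytes_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**6


if __name__ == "__main__":
    print(transactions_per_day(70_000))              # about 6 billion per day
    print(round(average_write_rate_mb_per_s(1), 1))  # about 11.6 MB/s on average
```

Under those assumptions, 70,000 TPS works out to roughly 6 billion transactions a day, which is in line with the "billions of transactions" framing used earlier in the deck.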

Page 7: Clustrix Big Data Podcast

IF YOU DON’T BELIEVE US – BELIEVE GOOGLE

F1 Based on “SPANNER” for AdWords

http://www.theregister.co.uk/2013/08/30/google_f1_deepdive/

“100s of applications on over 100TB serving up 100s of thousands of requests per second + SQL queries that scan tens of trillions of data rows a day”

Page 8: Clustrix Big Data Podcast

HOW TO CHOOSE THE RIGHT TOOL FOR THE JOB?

Page 9: Clustrix Big Data Podcast

E-COMMERCE EXAMPLE (SQL NORMALIZATION + JOIN = GOOD)

Customers (many)

Products (many or few, and may require flexibility)

Orders (many)

Reviews (many)

Problem is naturally relational - Orders, Reviews are for products by customers

What questions do you have?
• Do you want to know all reviews for a product along with the customer who wrote each one (Product X Review X Customer)?
• What about the most popular products in San Francisco, or the last 10 orders by a customer?

What flexibility do you need?
• Maybe all products have different attributes

WHAT DATA and WHAT QUESTIONS?
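The three questions on this slide map directly onto joins over a normalized schema. The sketch below is a minimal illustration, not Clustrix code: it uses Python's built-in sqlite3 module as a stand-in SQL engine, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def build_shop_db():
    """Create a tiny in-memory shop database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE products  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders    (id INTEGER PRIMARY KEY,
                                customer_id INTEGER REFERENCES customers(id),
                                product_id  INTEGER REFERENCES products(id),
                                placed_at   TEXT);
        CREATE TABLE reviews   (id INTEGER PRIMARY KEY,
                                customer_id INTEGER REFERENCES customers(id),
                                product_id  INTEGER REFERENCES products(id),
                                body        TEXT);
        INSERT INTO customers VALUES
            (1, 'Ana', 'San Francisco'), (2, 'Bo', 'San Francisco'), (3, 'Cy', 'Austin');
        INSERT INTO products VALUES (1, 'Lamp'), (2, 'Desk');
        INSERT INTO orders VALUES
            (1, 1, 1, '2013-09-01'), (2, 2, 1, '2013-09-02'),
            (3, 1, 2, '2013-09-03'), (4, 3, 2, '2013-09-04');
        INSERT INTO reviews VALUES (1, 1, 1, 'Bright'), (2, 3, 1, 'Too small');
    """)
    return conn


def reviews_for_product(conn, product_name):
    """All reviews for a product with the reviewer's name (Product x Review x Customer)."""
    return conn.execute("""
        SELECT c.name, r.body
        FROM products p
        JOIN reviews r   ON r.product_id = p.id
        JOIN customers c ON c.id = r.customer_id
        WHERE p.name = ?
        ORDER BY r.id
    """, (product_name,)).fetchall()


def popular_products_in_city(conn, city):
    """Products ranked by number of orders placed by customers in one city."""
    return conn.execute("""
        SELECT p.name, COUNT(*) AS order_count
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN products p  ON p.id = o.product_id
        WHERE c.city = ?
        GROUP BY p.id, p.name
        ORDER BY order_count DESC, p.name
    """, (city,)).fetchall()


def last_orders(conn, customer_name, limit=10):
    """The most recent orders for one customer, newest first."""
    return conn.execute("""
        SELECT o.id, p.name, o.placed_at
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN products p  ON p.id = o.product_id
        WHERE c.name = ?
        ORDER BY o.placed_at DESC
        LIMIT ?
    """, (customer_name, limit)).fetchall()
```

Each question is a single declarative query; the engine, not the application, decides how to execute the joins.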

Page 10: Clustrix Big Data Podcast

How SIMPLY do the QUESTIONS need to be answered?

MAP REDUCE OR SQL?

And how many lines of code?
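One way to make the "how many lines of code" question concrete is to answer the same question both ways. The sketch below counts orders per product, first with hand-written map, shuffle, and reduce steps in plain Python, then with a single SQL GROUP BY. It is a toy, single-process illustration under stated assumptions: the order data is hypothetical, the map/reduce functions only imitate the shape of a MapReduce job (no Hadoop API is used), and sqlite3 stands in for a SQL database.

```python
import sqlite3
from collections import defaultdict

# Hypothetical order records: (order_id, product_name)
ORDERS = [(1, "Lamp"), (2, "Lamp"), (3, "Desk"), (4, "Lamp"), (5, "Desk"), (6, "Chair")]


def map_phase(records):
    """Emit a (product, 1) pair for every order."""
    for _order_id, product in records:
        yield product, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def orders_per_product_mapreduce(records):
    return reduce_phase(shuffle_phase(map_phase(records)))


def orders_per_product_sql(records):
    """The same question as one declarative query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
    rows = conn.execute("SELECT product, COUNT(*) FROM orders GROUP BY product").fetchall()
    conn.close()
    return dict(rows)
```

Both functions return the same counts. The SQL version expresses the question in one statement, while the map/reduce version spells out each processing step by hand.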

Page 11: Clustrix Big Data Podcast

WHEN do you want the QUESTIONS answered?

How COMPLEX is the Question?

NoSQL: Key-Value, Document

NewSQL: e.g. Clustrix

Warehousing Analytics: Hadoop, Vertica, Redshift

Query Complexity

In-Memory Analytics

Reads and Writes, Real-Time Analytics, Batch Analytics

milliseconds, seconds, minutes, hours

ETL

Page 12: Clustrix Big Data Podcast

SIZE and FLEXIBILITY and QUERIES

Technologies shown: Hadoop, Key-Value, SQL Warehousing, Vertica, NewSQL, Document / Tabular

SIZE

10s of TBs (NewSQL)

100s of TBs

Petabytes (Key-Value, Hadoop)

FLEXIBILITY

Relational schema, online schema changes

Rows with different columns

Schema-less

QUERY ABILITY

Simple lookup

Indexed lookup

Joins and complex analytics

With flexibility, you lose the sophisticated SQL query optimizer
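The flexibility trade-off on this slide can be sketched with a small example. In the schema-less version, records carry whatever attributes they like and the filtering logic lives in application code. In the relational version, a column is added to an existing table and the filter is a SQL predicate. All names and data here are hypothetical, and SQLite's ALTER TABLE only stands in for the idea of a schema change; it says nothing about how Clustrix performs online schema changes.

```python
import sqlite3

# Schema-less style: each product is a free-form record (hypothetical data).
DOCUMENTS = [
    {"name": "Lamp", "watts": 40},
    {"name": "Desk", "width_cm": 120},
    {"name": "Bulb", "watts": 9},
]


def low_power_documents(docs, max_watts):
    """Filter in application code; records lacking the attribute must be handled by hand."""
    return sorted(
        d["name"] for d in docs
        if d.get("watts") is not None and d["watts"] <= max_watts
    )


def low_power_rows(max_watts):
    """Add a column to a populated table, then let the SQL engine do the filtering."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO products VALUES (?)", [("Lamp",), ("Desk",), ("Bulb",)])
    # Schema change after the table already holds data; existing rows get NULL.
    conn.execute("ALTER TABLE products ADD COLUMN watts INTEGER")
    conn.executemany("UPDATE products SET watts = ? WHERE name = ?", [(40, "Lamp"), (9, "Bulb")])
    rows = conn.execute(
        "SELECT name FROM products WHERE watts <= ? ORDER BY name", (max_watts,)
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

Both approaches return the same product names. The difference is where the query logic lives: in hand-written application code, or in a declarative statement that a query optimizer can plan.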

Page 13: Clustrix Big Data Podcast

RIGHT TOOL FOR THE JOB

NoSQL NewSQL Hadoop Columnar

OPERATIONAL DATA BATCH ANALYSIS

With A Lot More SQL

Page 14: Clustrix Big Data Podcast

Clustrix Technical Resources

docs.clustrix.com