tecfinal 451 webinar deck

NoSQL: The Challenges Beyond Multi-Model and Integrating into Big Data Applications

Upload: basho-technologies

Post on 07-Aug-2015


TRANSCRIPT

Page 1: tecFinal 451 webinar deck

NoSQL: The Challenges Beyond Multi-Model and Integrating into Big Data Applications

Page 2: tecFinal 451 webinar deck

Presenters:

Matthew Aslett, Research Director, 451 Research
• NoSQL Beyond Polyglot Persistence

Peter Coppola, VP Product & Marketing, Basho Technologies
• How a Data Platform solves the challenges of integrating NoSQL into Big Data applications

Page 3: tecFinal 451 webinar deck

NoSQL: Beyond polyglot persistence

Matthew Aslett, research director

Page 4: tecFinal 451 webinar deck

451 Research is an information technology research & advisory company

Founded in 2000

210+ employees, including over 100 analysts

1,000+ clients: Technology & Service providers, corporate advisory, finance, professional services, and IT decision makers

12,500+ senior IT professionals in our research community

Over 52 million data points each quarter

4,500+ reports published each year covering 2,000+ innovative technology & service providers

Headquartered in New York City with offices in London, Boston, San Francisco, and Washington D.C.

451 Research and its sister company Uptime Institute comprise the two divisions of The 451 Group

Research & Data

Advisory Services

Events


Copyright (C) 2015 451 Research LLC

Page 5: tecFinal 451 webinar deck

The birth of NoSQL

• The genesis of much, although by no means all, of the momentum behind the NoSQL database movement can be attributed to two research papers:

• Google’s BigTable: A Distributed Storage System for Structured Data, presented at the Seventh Symposium on Operating System Design and Implementation, in November 2006

• Amazon’s Dynamo: Amazon’s Highly Available Key-Value Store, presented at the 21st ACM Symposium on Operating Systems Principles, in October 2007

• The term itself was coined by Johan Oskarsson as the name for a June 2009 meeting of developers, users and others interested in a group of loosely related data technologies

Page 6: tecFinal 451 webinar deck

SPRAINED RELATIONAL DATABASES

Photo credit: Foxtongue on Flickr http://www.flickr.com/photos/foxtongue/4844016087/

Page 7: tecFinal 451 webinar deck

The traditional relational database has been stretched beyond its normal capacity by the needs of high-volume, highly distributed or highly complex applications.

Database SPRAIN: Scalability, Performance, Relaxed consistency, Agility, Intricacy, Necessity

Increased willingness to look towards emerging alternatives

Page 8: tecFinal 451 webinar deck

The traditional relational database has been stretched beyond its normal capacity by the needs of high-volume, highly distributed or highly complex applications.

Database SPRAIN: Scalability, Performance, Relaxed consistency, Agility, Intricacy, Necessity

A diverse array of NoSQL projects serving a range of use-cases

Page 9: tecFinal 451 webinar deck

[Figure: 451 Research Data Platforms Map, June 2015 (https://451research.com/dashboard/dpa). The map places individual database and data platform products in a relational zone, a non-relational zone and a grid/cache zone, with edges labelled "Towards e-discovery", "Towards enterprise search" and "Towards SIEM". Key: General purpose, Specialist analytic, BigTables, Graph, Document, Key value stores, -as-a-Service, Key value direct access, Hadoop, MySQL ecosystem, Advanced clustering/sharding, New SQL databases, Data caching, Data grid, Search, Appliances, In-memory, Stream processing. © 2015 by 451 Research LLC. All rights reserved]

Page 10: tecFinal 451 webinar deck

The NoSQL database landscape

[Figure: the NoSQL portion of the Data Platforms Map. Products shown: MarkLogic, ArangoDB, Neo4J, InfiniteGraph, Apache CouchDB, Oracle NoSQL, Redis, Handlersocket, RavenDB, RethinkDB, LevelDB, Apache Accumulo, Apache Cassandra, Apache HBase, Riak, Couchbase, Voldemort, Aerospike, Urika-GD, FlockDB, Allegrograph, HypergraphDB, AffinityDB, OrientDB, Sparksee, HyperDex, DataStax Enterprise, FatDB, Google Cloud Datastore, GrapheneDB, Instacluster, Hypertable, BerkeleyDB, Sqrrl Enterprise, JumboDB, Stardog, MongoDB, Cloudant, Iris Couch, MongoLab, Compose, ObjectRocket, CloudBird, Azure DocumentDB, AWS DynamoDB, AWS SimpleDB, Redis Labs Redis Cloud, RedisGreen, AWS ElastiCache with Redis, MagnetoDB, ObjectRocket with Redis, TokuMX, CortexDB, MapR-DB, Cloudant Local, MongoDirector, Redis-to-go, GraphHost, Redis Labs Enterprise Cluster, Azure Redis Cache, Modulus, Orchestrate, Google Cloud BigTable, Titan, Trinity, Ontotext GraphDB]

Page 11: tecFinal 451 webinar deck

Polyglot persistence

The idea that different data storage models have their own strengths and should be used in combination to solve the various data processing needs of a complex application.

• Wide-column: Data is mapped by a row key, column key and time stamp.
• Key Value: Store keys and associated values.
• Graph: Store data and the relationships between data.
• Document: Store all data related to a specific key as a single document.

DATA MODEL COMPLEXITY
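To make the four data models above concrete, here is a minimal sketch that imitates each one with plain Python structures. It is an illustration only: the names (`put_cell`, `latest_cell`, `relate`, `neighbours`) and the sample data are hypothetical and do not correspond to the API of any product named in this deck.

```python
from collections import defaultdict

# Key Value: store keys and associated (opaque) values.
key_value = {"user:42": b"opaque-session-bytes"}

# Document: all data related to a specific key lives in one document.
documents = {
    "user:42": {"name": "Ada", "emails": ["ada@example.com"], "plan": "pro"},
}

# Wide-column: data is mapped by (row key, column key, time stamp).
wide_column = {}

def put_cell(row, column, timestamp, value):
    """Write one versioned cell."""
    wide_column[(row, column, timestamp)] = value

def latest_cell(row, column):
    """Return the value with the highest time stamp for (row, column)."""
    versions = [
        (ts, value)
        for (r, c, ts), value in wide_column.items()
        if r == row and c == column
    ]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]

# Graph: store data (nodes) and the relationships between data (edges).
graph_nodes = {"ada": {"kind": "person"}, "riak": {"kind": "database"}}
graph_edges = defaultdict(list)

def relate(source, label, target):
    """Add a labelled, directed edge from source to target."""
    graph_edges[source].append((label, target))

def neighbours(source, label):
    """Return the targets reachable from source over edges with this label."""
    return [target for (edge_label, target) in graph_edges[source] if edge_label == label]

put_cell("user:42", "plan", 1, "free")
put_cell("user:42", "plan", 2, "pro")
relate("ada", "uses", "riak")
```

In a polyglot persistence architecture each of these shapes would typically live in a separate database, which is where the operational complexity discussed on the following slides comes from.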

Page 12: tecFinal 451 webinar deck

Polyglot persistence

Wide-column, Key Value, Graph, Document

Page 13: tecFinal 451 webinar deck

Polyglot persistence

Wide-column, Key Value, Graph, Document

Search, Analytics, Cache

Page 14: tecFinal 451 webinar deck

Multi-model

Wide-column stores, Key Value, Graph, Document stores

Search, Analytics, Cache

Multi-model databases

Support a combination of the various individual NoSQL data models:
• avoid operational complexity
• maintain developer agility
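As a rough illustration of the multi-model idea, the sketch below exposes a key-value view and a document view over one shared storage engine, so there is a single system to operate. The class `MultiModelStore` and its methods are hypothetical names invented for this example, and the in-memory dictionary is only a stand-in for a real engine.

```python
import json

class MultiModelStore:
    """One storage engine, two data models layered on top of it."""

    def __init__(self):
        # Single underlying engine: key -> raw bytes.
        self._data = {}

    # Key-value view: opaque bytes in, opaque bytes out.
    def kv_put(self, key, value):
        self._data[key] = value

    def kv_get(self, key):
        return self._data.get(key)

    # Document view: dictionaries serialized as JSON into the same engine.
    def doc_put(self, key, document):
        self.kv_put(key, json.dumps(document).encode("utf-8"))

    def doc_get(self, key):
        raw = self.kv_get(key)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def doc_find(self, field, value):
        """Return the sorted keys of documents whose field equals value."""
        matches = []
        for key, raw in self._data.items():
            try:
                document = json.loads(raw.decode("utf-8"))
            except ValueError:
                continue  # value is not a JSON document; skip it
            if isinstance(document, dict) and document.get(field) == value:
                matches.append(key)
        return sorted(matches)

store = MultiModelStore()
store.kv_put("blob:1", b"\xff\x00")          # plain key-value data
store.doc_put("user:1", {"plan": "pro"})     # document data
store.doc_put("user:2", {"plan": "free"})
```

The design point is that both views share one cluster, one replication scheme and one set of operational tooling, while developers still pick the model that suits each piece of data.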

Page 15: tecFinal 451 webinar deck

Multi-model

Wide-column stores, Key Value, Graph, Document stores

Search, Analytics, Cache

Multi-model databases

Page 16: tecFinal 451 webinar deck

Multi-model

Wide-column stores, Key Value, Graph, Document stores

Search, Analytics, Cache

Multi-model databases

Page 17: tecFinal 451 webinar deck

Multi-model data platform

Wide-column, Key Value, Graph, Document

Search, Analytics, Cache

Page 18: tecFinal 451 webinar deck

Thank you

[email protected]
@maslett
www.451research.com

Page 19: tecFinal 451 webinar deck

Delivering on a Data Platform

Peter Coppola
VP, Product & Marketing

Page 20: tecFinal 451 webinar deck

THE EVOLUTION OF NOSQL

Unstructured Data Platforms

Multi-Model Solutions

Point Solutions

Basho Technologies | CONFIDENTIAL

Page 21: tecFinal 451 webinar deck

“42% of database decision makers admit they struggle to manage the NoSQL solutions deployed in their environments”

COMPLEX TECHNOLOGY STACK

Riak

Spark


Page 22: tecFinal 451 webinar deck

OUR CUSTOMERS ARE INTEGRATING

NoSQL, Caching, Real-time Analytics and Search

Page 23: tecFinal 451 webinar deck

“Big data, hybrid cloud architectures and IoT require developers to integrate, replicate and synchronize information across functions”

Mac Devine, Vice President and CTO, IBM Cloud Services

Page 24: tecFinal 451 webinar deck

Enterprises building Big Data, IoT and Hybrid Cloud applications are struggling with complexity

• Distributed workload challenges: availability, scale and geo-location
• Proliferation of data models: Key-Value, In-Memory, Document, etc.
• High costs to ensure data accuracy: replication, synchronization and integration
• High operational costs: architectural and management simplicity & efficiency
• Lack of available developer expertise

[Diagram labels: Big Data, Hybrid Cloud, IoT; Database(s), Storage, Caches, Analytics, Queues, Search, Log Mgmt.]

Page 25: tecFinal 451 webinar deck

Current Operational Challenges

• Managing separate clusters for Riak KV, Redis and Spark

• Manually synchronizing data across the applications

• Using Zookeeper for Spark cluster management

• Manually sharding data in Redis

• Manually managing failures of Redis instances
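To show why manually sharding a cache is an operational burden, here is a minimal sketch of client-side sharding, where the application itself decides which cache instance holds each key. The instance addresses and the `shard_for` helper are hypothetical, and no Redis client library is used; the sketch only computes the key placement.

```python
import hashlib

def shard_for(key, shards):
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# Hypothetical cache instances managed by hand.
shards = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
keys = [f"session:{i}" for i in range(1000)]

# Placement with three instances, then after adding a fourth by hand.
before = {key: shard_for(key, shards) for key in keys}
after = {key: shard_for(key, shards + ["redis-d:6379"]) for key in keys}

# Every key whose placement changed must be migrated or re-fetched.
moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved} of {len(keys)} keys changed shard")
```

With modulo placement, growing from three to four instances relocates roughly three quarters of the keys under uniform hashing, and every client must be updated with the new instance list at the same time. That is the kind of work automatic sharding is meant to take off the operator.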

Customers manually integrating

“Big data applications like ours need to integrate and then deploy many different technology components”

Martin Davies, CEO of Technology

Page 26: tecFinal 451 webinar deck

BASHO DATA PLATFORM

SERVICE INSTANCES: Spark, Redis (Caching), Solr, ElasticSearch, Web Services, 3rd Party Web Services & Integrations

STORAGE INSTANCES: Riak Key/Value, Riak Object Storage; Riak coming soon: Document Store, Columnar, Graph

CORE SERVICES: Replication & Synchronization, Message Routing, Cluster Management & Monitoring, Logging & Analytics, Internal Data Store

Page 27: tecFinal 451 webinar deck

BASHO DATA PLATFORM: CORE SERVICES

Data Replication and Synchronization
Replicate and synchronize data across and between storage instances and service instances to ensure data accuracy with no data loss and high availability.

Cluster Management
Integrated cluster management automates deployment and configuration of Riak KV, Riak S2, Spark and Redis. Once deployed in production, it auto-detects issues and restarts Redis instances or Spark clusters. Cluster management eliminates the need for Zookeeper.

Internal Data Store
A built-in, distributed data store, designed for speed, fault-tolerance and ease-of-operations, is used to persist static and dynamic configuration data (port number and IP address) across the Basho Data Platform.

Message Routing
A high-throughput, distributed message system for speed, scalability and high availability. This message system will have the ability to persist and route messages across platform clusters.

Logging and Analytics
Event logs provide valuable information that can help tune clusters and accurately analyze dataflow across the cluster.

Page 28: tecFinal 451 webinar deck

BASHO DATA PLATFORM: SERVICE INSTANCES

Apache Spark Add-On: Zookeeper not required
Real-Time Analytics
• Move data from Riak KV to Spark for batch and real-time analytics and store results back in Riak KV for future processing
• Cluster management eliminates the need for Zookeeper

Redis Add-On: Availability w/ auto-sharding
Integrated Caching
• Redis is now enterprise grade with high availability, data synchronization with Riak KV and cluster management
• Automatic data sharding across multiple cache servers simplifies operations

Apache Solr Add-On: Query like Solr
Enriched Search
• Powerful full-text search of Solr with the availability and scalability of Riak KV
• As data changes, search indexes are automatically synchronized
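The integrated caching bullets above describe keeping a cache synchronized with the durable store. One common way to reason about that is a read-through, write-through cache, sketched below with plain dictionaries standing in for Redis and Riak KV. This is an assumption-laden illustration of the general pattern, not a description of how the Basho Data Platform implements synchronization; `CachedStore` and its fields are hypothetical names.

```python
class CachedStore:
    """Read-through / write-through cache in front of a durable store."""

    def __init__(self, backing, cache):
        self.backing = backing    # stand-in for the durable key-value store
        self.cache = cache        # stand-in for the cache tier
        self.backing_reads = 0    # counts reads that missed the cache

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read the durable store and populate the cache.
        self.backing_reads += 1
        value = self.backing.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write the durable store first, then keep the cache in step
        # so later reads never return a stale value.
        self.backing[key] = value
        self.cache[key] = value

store = CachedStore(backing={"user:1": "Ada"}, cache={})
first = store.get("user:1")    # miss: read from the backing store
second = store.get("user:1")   # hit: served from the cache
store.put("user:1", "Grace")   # update both tiers together
```

When an application hand-rolls this logic against separate clusters, it also has to handle cache failures and writes that bypass the cache, which is the manual integration burden the earlier slides describe.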

Page 29: tecFinal 451 webinar deck

BASHO DIFFERENCE

• Ease of Scale
• Optimized for High Availability
• Data Correctness
• Solving data distribution challenge
• Operational Simplicity

“We are excited that Basho is stepping forward and simplifying our daunting technology stack”

Jason Ordway, CTO

Page 30: tecFinal 451 webinar deck


RIAK DEPLOYED WORLDWIDE

Page 31: tecFinal 451 webinar deck

QUESTIONS?