NoSQL Simplified: Schema vs. Schema-less


Upload: infinitegraph

Post on 14-Jan-2015


DESCRIPTION

A look at the many facets of schema-less approaches vs a rich schema approach, ranging from performance and query support to heterogeneity and code/data migration issues. Presented by Leon Guzenda, Founder, Objectivity

TRANSCRIPT

Page 1: NoSQL Simplified: Schema vs. Schema-less

The Database for Big Data Solutions

© Objectivity, Inc. 2014

NoSQL Simplified: Schema vs Schema-less

Leon Guzenda & Nick Quinn Meetup - February 20, 2014


Page 2: NoSQL Simplified: Schema vs. Schema-less

Overview

• Objectivity, Inc.
• Pros & Cons:
  • Schema
  • Schema-less
• What We Provide
• A Compromise

Page 3: NoSQL Simplified: Schema vs. Schema-less

Objectivity, Inc.

• Headquartered in San Jose, CA
• Over two decades of NoSQL and Big Data experience
• Enables complex data virtualization and Big Data solutions for the enterprise
• Software products:
  • Objectivity/DB
  • InfiniteGraph
  • InfiniteGraph Social App
• Embedded in hundreds of enterprises, government organizations and products, with millions of deployments.

Page 4: NoSQL Simplified: Schema vs. Schema-less

Objectivity/DB

• Fully distributed object database.
• Handles complex, highly inter-related data.
• Extremely fast navigational access.
• Scalable collections and B-Tree indices.
• ACID transactions plus Multi-Reader, One-Writer mode.
• Highly scalable: Single Logical View plus simple servers.
• Parallel Query Engine and Relationship Analytics.
• Fully interoperable C++, C#, Java, Python and SQL++ on Windows, Unix, Linux and Mac OS X.

Page 5: NoSQL Simplified: Schema vs. Schema-less

ODBMS Deployments

• Monitoring & Response
• Telecom Infrastructure
• Big Science
• Complex Financial Systems
• Data Fusion

Page 6: NoSQL Simplified: Schema vs. Schema-less

InfiniteGraph

• Fully distributed graph database
• High throughput and scalability
• Extremely fast navigational access
• ACID transactions for online operation
• Relaxed consistency during batch-mode parallel ingest
• Parallel queries
• Flexible indexing, including Lucene for text
• Java API and Gremlin support

Page 7: NoSQL Simplified: Schema vs. Schema-less

Graph DBMS - Finding The Links

[Figure: finding the links in OTHER DATABASE(S) compared with a GRAPH DATABASE]

Page 8: NoSQL Simplified: Schema vs. Schema-less

Objectivity’s Disruptive Big Data Architecture

Uses Data Virtualization to hide the nodes and focus on the connections

Page 9: NoSQL Simplified: Schema vs. Schema-less

Schema: Pros & Cons

Page 10: NoSQL Simplified: Schema vs. Schema-less

Who's Who?

• Schema:
  • Network [CODASYL] databases - DDL [1972]
  • Relational Databases - Data Dictionary
  • Object Databases - ODMG'93
  • Most Graph Databases
• Schema-less:
  • KSAM/ISAM/DSAM/ESAM
  • IMS (hierarchical)
  • Pick OS Database (hash-tables)
  • MUMPS (hierarchical array-storage)
  • MongoDB - a specialized JSON (and JSON-like) document store.
  • CouchDB - a JSON document store.

Page 11: NoSQL Simplified: Schema vs. Schema-less

Schema: Pros...

• Global data definitions
• Optimal access
• Enables Query By Example
• Interoperability
• Schema change control
• Schema contents can be manipulated via standard APIs and tools

Page 12: NoSQL Simplified: Schema vs. Schema-less

...Schema: Pros

• Global data definitions:
  • Data types and the relationships between them
  • Makes queries more efficient
  • Actions can be restricted by data type, field values, relationship types
• Optimal access:
  • Used to determine how best to store, manage and access particular data types
• Enables Query By Example by showing:
  • Types of information available
  • Relationships between them
• Interoperability:
  • DBMS can change the shape of data items to suit the language/environment
• Schema change control:
  • Can be used to enforce workflows that will keep applications and data in sync.
• Schema contents can be manipulated via standard APIs and tools:
  • Easier learning curve
  • Uniform security controls:
    • The schema can use the same security controls as the data
  • Query and visualization tools can be used for both data and schema
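To make the "global data definitions" point concrete, here is a minimal Python sketch of a toy schema that records data types and the relationships between them, and uses them to restrict actions. It is an illustration only, not Objectivity/DB code: `ClassDef`, `SCHEMA`, `validate`, `can_relate` and the `Person`/`Company` classes are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDef:
    """One entry in a toy schema (hypothetical, for illustration only)."""
    name: str
    fields: dict                                        # field name -> required Python type
    relationships: dict = field(default_factory=dict)   # relationship name -> target class


# The "global data definitions": every application shares this one description.
SCHEMA = {
    "Person": ClassDef("Person", {"name": str, "age": int}, {"works_for": "Company"}),
    "Company": ClassDef("Company", {"name": str}),
}


def validate(class_name, values):
    """Reject an object whose fields are missing or have the wrong type."""
    definition = SCHEMA[class_name]
    for fname, ftype in definition.fields.items():
        if fname not in values:
            raise ValueError(f"{class_name}.{fname} is missing")
        if not isinstance(values[fname], ftype):
            raise TypeError(f"{class_name}.{fname} must be {ftype.__name__}")
    return True


def can_relate(source_class, relationship, target_class):
    """Allow a link only if the schema declares that relationship type."""
    return SCHEMA[source_class].relationships.get(relationship) == target_class
```

With this in place, `validate("Person", {"name": "Ada", "age": "36"})` raises a `TypeError` instead of silently storing a string where an integer is expected, and `can_relate("Company", "works_for", "Person")` is refused because the schema does not define that relationship.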

Page 13: NoSQL Simplified: Schema vs. Schema-less

Schema: Cons

• The database designer and application developers have to create and maintain the schema.
• Applications have to be kept in sync with schema changes.
• Applications and programmers have to be aware of data types
  • Though this is one of the major claimed advantages of object-oriented programming.
• There is a perceived loss of flexibility
  • Though this is more a function of the user interface to the database than the underlying mechanisms.

Page 14: NoSQL Simplified: Schema vs. Schema-less

Schema-less: Pros…

• Flexibility
• Can be more tolerant of variable Acidity and Consistency models
• Ease of use and maintenance

Page 15: NoSQL Simplified: Schema vs. Schema-less

…Schema-less: Pros

• Flexibility - Users can, in theory:
  • Put any kind of data into the system
  • Create new kinds of relationships between things (in a few products)
  • Find data without worrying about the types of data involved.
• Can be more tolerant of variable Acidity and Consistency models
• Ease of use and maintenance:
  • No need to worry about data types
  • No need for a DBA
  • Applications will [probably] work when new data arrives
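The flexibility claims can be sketched in a few lines of Python, using a plain list of dictionaries as a stand-in for a schema-less document collection. This is an assumption-laden toy, not the API of MongoDB, CouchDB or any other product; `store`, `put`, `find` and `display_name` are hypothetical names.

```python
store = []  # stands in for a schema-less document collection


def put(doc):
    """Accept any document: no type or field declarations are checked."""
    store.append(doc)


def find(key, value):
    """Match on a field without knowing what kind of document holds it."""
    return [d for d in store if d.get(key) == value]


put({"kind": "person", "name": "Ada"})
put({"kind": "sensor", "id": 7, "readings": [1.5, 2.0]})
put({"kind": "person", "name": "Bob", "nickname": "Bobby"})  # new field, no migration step


def display_name(doc):
    """An application written before 'nickname' existed would only read 'name';
    this version tolerates documents with or without either field."""
    return doc.get("nickname") or doc.get("name", "<unknown>")
```

The "[probably]" in the last bullet shows up in `display_name`: the code keeps working when new fields arrive only because it was written defensively with `dict.get` and a fallback value.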

Page 16: NoSQL Simplified: Schema vs. Schema-less

Schema-less: Cons…

• Confusion
• Performance suffers
• Poor integrity
• Ambiguity

Page 19: NoSQL Simplified: Schema vs. Schema-less

…Schema-less: Cons

• Apparent tolerance of variable CAP models is actually orthogonal to the schema vs schema-less debate [as is support for sharding].
• Performance suffers
• Integrity is practically non-existent:
  • Maintaining referential integrity is hard
  • Queries may misinterpret values within an object:
    • 54686973206973206120737472696e6720706c7573206120666c6f6174696e6720706f696e74206e756d62657258585858706c757320616e6f7468657220737472696e67
    • A ZIP code may be stored as an integer (1234, because a JSON number cannot keep the leading zero of 01234) or as a string (“01234”) in JSON, causing query and display problems.
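The ZIP code problem is easy to reproduce with Python's standard `json` module. The sketch below is illustrative only: the two documents and the `find_by_zip` helper are hypothetical, and the city values are placeholders.

```python
import json

# Two documents describing the same ZIP code, written by different applications.
docs = [
    json.loads('{"city": "A", "zip": 1234}'),      # stored as a number
    json.loads('{"city": "B", "zip": "01234"}'),   # stored as a string
]


def find_by_zip(value):
    """Naive equality query, as a schema-less store would evaluate it."""
    return [d["city"] for d in docs if d["zip"] == value]


def leading_zero_is_valid_json():
    """JSON numbers may not have leading zeros, so 01234 cannot be stored as written."""
    try:
        json.loads('{"zip": 01234}')
        return True
    except json.JSONDecodeError:
        return False
```

Here `find_by_zip("01234")` finds only the string-typed document and `find_by_zip(1234)` finds only the numeric one, so neither query returns both records, and the numeric record displays as `1234` rather than `01234`.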

[Slide callout: the bytes 58585858 in the hex string mark the position of the embedded “Floating Point” value.]
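Decoding the hex string from the slide shows why values can be misread when nothing describes the layout. The sketch below uses only the Python standard library; the little-endian 32-bit float interpretation of the four placeholder bytes is an assumption made for illustration, since the slide does not state the actual encoding.

```python
import struct

# The hex string from the slide, split at the four placeholder bytes.
HEX = (
    "54686973206973206120737472696e6720706c7573206120666c6f6174696e6720706f696e74206e756d626572"
    "58585858"
    "706c757320616e6f7468657220737472696e67"
)

raw = bytes.fromhex(HEX)

# Reading 1: treat every byte as ASCII text.
as_text = raw.decode("ascii")

# Reading 2: treat the four placeholder bytes as a binary float
# (little-endian IEEE 754 single precision is an assumption here).
start = raw.index(b"XXXX")
as_float = struct.unpack("<f", raw[start:start + 4])[0]
```

`as_text` reads "This is a string plus a floating point numberXXXXplus another string". The same four bytes are valid text and a valid float, so without a schema a query has no way to know which reading is intended.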

Page 20: NoSQL Simplified: Schema vs. Schema-less

The NoSQL Players

[Figure: NoSQL products grouped by category; asterisks mark products that are fully or partially schema-less.]

• Operational: Intersystems, MarkLogic, McObject
• Document: AppEngine, Cloudant, CouchDB, MongoDB, RavenDB
• Object/Graph: Objectivity/DB, Progress, Versant, AllegroGraph, InfiniteGraph, Neo4j, Titan
• Key-Value: Berkeley DB, Cassandra, Redis, Riak, Voldemort, Couchbase
• Column Family: HBase, HyperTable, SimpleDB

Page 21: NoSQL Simplified: Schema vs. Schema-less

A Compromise

Provide Flexibility With The Advantages Of Having A Schema

Page 22: NoSQL Simplified: Schema vs. Schema-less

Objectivity/DB Schema Usage

• Has an internal schema in its system database (the Federated DB).
• User schemas are created and updated by:
  • Creating .ddl files and pre-processing them with the DDL processor.
  • Creating and compiling Java, C# or Python header files.
  • Declaring or dynamically creating/modifying Smalltalk classes (defunct).
  • Declaring and changing table definitions with Objectivity/SQL++.
• SQL++ table/column definitions are updated automatically when classes are declared or modified using other languages.
  • This allows SQL++ to access C#, C++, Java and Python objects and vice-versa.
• A Federated Database can contain multiple named Schemas:
  • Reduces re-compilation and re-building after a localized schema change.
  • May facilitate security mechanisms in the future.

Page 23: NoSQL Simplified: Schema vs. Schema-less

Objectivity Active Schema

• API and tools for creating, modifying, reading and deleting class definitions, which include association (relationship) definitions.
  • If used with a dynamic language, such as Smalltalk, creating or modifying a class doesn't need to affect existing programs.
  • In general, only generic access (via the ooObj base class) can be used without creating the files needed to recompile programs and methods for accessing the new object types.
• Helps application developers build tools that need to access the schema, e.g.:
  • Graphical query tools
  • Highly flexible object modeling capabilities for end users.
• An end-user, such as a field technician or an analyst:
  • Can add local object classes, populate, maintain and query them, but...
  • Cannot interfere with the correct operation of the pre-built applications.
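The general idea of a schema that can be changed at run time, while protecting the classes that pre-built applications rely on, can be sketched as follows. This is not the Objectivity Active Schema API, whose calls are not shown in these slides; `SchemaRegistry` and all of its methods are hypothetical names, and locking the core classes is just one assumed way to keep end-user changes from interfering with existing applications.

```python
class SchemaRegistry:
    """Toy run-time schema: class definitions can be created, read, modified and deleted."""

    def __init__(self, core_classes):
        # Core definitions belong to the pre-built applications and are locked.
        self._classes = {name: dict(fields) for name, fields in core_classes.items()}
        self._locked = set(core_classes)

    def create_class(self, name, fields):
        """Add a new (local) class definition."""
        if name in self._classes:
            raise ValueError(f"class {name!r} already exists")
        self._classes[name] = dict(fields)

    def add_field(self, name, field_name, field_type):
        """Modify an existing definition, unless it is part of the core schema."""
        self._check_unlocked(name)
        self._classes[name][field_name] = field_type

    def delete_class(self, name):
        """Remove a local class definition."""
        self._check_unlocked(name)
        del self._classes[name]

    def describe(self, name):
        """Read a definition; returns a copy so callers cannot edit the schema directly."""
        return dict(self._classes[name])

    def _check_unlocked(self, name):
        if name in self._locked:
            raise PermissionError(f"class {name!r} is part of the core schema")
```

For example, after `registry = SchemaRegistry({"Site": {"name": str}})`, a field technician could call `registry.create_class("FieldNote", {"text": str})` and later extend it with `add_field`, while an attempt to alter `Site` raises `PermissionError`.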

Page 24: NoSQL Simplified: Schema vs. Schema-less

Use Cases

Page 25: NoSQL Simplified: Schema vs. Schema-less

Use Case 1 - Intelligence Gathering Framework… (1 of 2)

• An integrated application development framework that focuses on adaptability.
• Dynamic modeling of entities, services and workflows.
• Versioning and temporality features support system evolution.

The screenshots show a location that is under surveillance and everything known about it in the database.

Page 26: NoSQL Simplified: Schema vs. Schema-less

…Use Case 1 - Intelligence Gathering Framework (2 of 2)

• Eliminates the mapping layer between the user-defined objects and the database.
• Performance and scalability.
• Active Schema facilitates object migration.

[Figure: diagram with the labels Design and Information Feeds, Users, and Database]

Page 27: NoSQL Simplified: Schema vs. Schema-less

Use Case 2 - GDMO Framework

• Operations, Administration, and Maintenance interface for the CDMA system RF infrastructure
• Controls the Base Station Controller and Base Station Transceiver Subsystem
• GDMO* Schema and CMIP agent-manager messaging
• A SPARC-based BSC rack supports a peak load of 150,000 simultaneous callers
• Deployed in CDMA networks worldwide, including SprintPCS

* GDMO stands for Guidelines for the Definition of Managed Objects

Page 28: NoSQL Simplified: Schema vs. Schema-less

Use Case 3 - Ontology Framework

• Uses standard objects to define a meta-schema
• It is used to define concept templates
• They can be inherited from, combined or extended to support a “class specification”
• The data is combined with Horn Logic to build complex ontologies.

[Figure: meta-schema diagram with the labels SCHEMA, CONCEPT, CLASS, COMPONENTS, STRUCT, ARRAY, FIELD, RELATIONSHIP and LOGIC]

Page 29: NoSQL Simplified: Schema vs. Schema-less

Summary

• Don’t confuse CAP issues with Schema considerations
• Schemas make the DBMS more powerful
• Schema-less architectures are more flexible
• It’s possible to build flexible systems with Schema-based infrastructure

Page 30: NoSQL Simplified: Schema vs. Schema-less

THANK YOU

• Please visit objectivity.com for:
  • Features
  • Use Cases
  • White Papers
  • Free downloads (60 day evaluation)
  • Sample Applications
  • Application Developer’s Wiki
• For further information:
  • Email: [email protected]