Accelerating Insight with High Octane Graph Fueled Data

©2017 Cambridge Semantics Inc. All rights reserved. Company Confidential Anzo Smart Data Lake™ - Accelerating Insight Disrupting the Analytics Time-to-Value Function Barry Zane Vice President, Engineering [email protected] Ben Szekely Vice President, Solution Engineering [email protected]

Upload: cambridge-semantics

Post on 11-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: Accelerating Insight with High Octane Graph Fueled Data

©2017 Cambridge Semantics Inc. All rights reserved. Company Confidential

Anzo Smart Data Lake™ - Accelerating Insight
Disrupting the Analytics Time-to-Value Function

Barry Zane, Vice President, Engineering [email protected]

Ben Szekely, Vice President, Solution Engineering [email protected]

Page 2: Accelerating Insight with High Octane Graph Fueled Data

Big Data and Analytics Industry Trends

• We are graduating from pieced-together ETL, Hadoop and BI solutions and consolidating around complete end-to-end solutions
– Forward-thinking customers are looking to product vendors for innovation, delivery and accountability for value.
– Consolidation of partnerships and acquisitions

Page 3: Accelerating Insight with High Octane Graph Fueled Data

Cloud Computing Trends

• Cloud Computing is a transformative cost saver for analytics as demand for access to all data grows
– Think beyond infrastructure balance sheet savings
– Pay only for the analytics compute you use, as business needs demand and peak.

Page 4: Accelerating Insight with High Octane Graph Fueled Data

The importance of “Time-to-Value”

• Time-to-Value from data is becoming the key driver for analytics strategy, with an assumption of self-service

[Chart: axes Effort and Time to Value. Stages shown: Analyst Request; IT Data Prep; IT Data Extraction; IT Data Enrichment; Data Discovery]

Page 5: Accelerating Insight with High Octane Graph Fueled Data

Key Risks

Costs rising from vendor lock-in of data format/storage, analytics tools and cloud infrastructure.

Page 6: Accelerating Insight with High Octane Graph Fueled Data

Anzo Smart Data Lake: Accelerating Insight

[Diagram labels: Disparate Sources; Insight; Exploratory Analytics; Knowledge Discovery; Data on Demand; Automated Ingestion; Rich Models; Scalability; Security; Enterprise Knowledge Graph; Governance]

Page 7: Accelerating Insight with High Octane Graph Fueled Data

Disrupting the Time-to-Value Function

[Chart: Time and Resource Investments plotted against Insights and Value, comparing Traditional BI and Analytics Tool Chains with Anzo Smart Data Lake. Annotations: IT Build and Deployment; Add New Data (repeated four times)]

Page 8: Accelerating Insight with High Octane Graph Fueled Data

Anzo Smart Data Lake
A Graph-based Platform to Disrupt the Analytics Time-to-Value Function

Connectors Models Rules Analytics & Tools

ASDL Customer Fingerprint - Intellectual Property

[Diagram labels: Data Ingestion & Mapping; Automated ETL Generation; Collaborative Mapping; Text Processing; Data Cataloging; Data & Model Governance; Active Metadata Management; Role-Based Security; Discovery & Analytics; Automated Query Generation; User Dashboards and Custom UI/UX; Self-Serve Live Extracts; In-Memory MPP Query; Graphmarts on Demand; ELT, Model Based Data Integration; Document Search; Actionable Insights; Enterprise Data Sources; Enterprise Data Lakes; “Last Mile” Analytics]

Page 9: Accelerating Insight with High Octane Graph Fueled Data

Anzo Smart Data Lake – Cloud Deployment

ASDL cloud deployment in Amazon Web Services or Google Cloud Platform

Cloud automation is a significant and strategic component of the Cambridge Semantics roadmap including deployment, elastic scale and high-availability. Our cloud mission is to offer customers lower costs in development, maintenance and operations – using cloud resources efficiently as business needs determine.

Cloud-delivered ASDL offers faster deployment and on-demand scale

[Diagram labels: Data Ingestion & Mapping; Automated ETL Generation; Collaborative Mapping; Text Processing; Data Cataloging; Data & Model Governance; Active Metadata Management; Role-Based Security; Discovery & Analytics; Automated Query Generation; Custom User Dashboards; Self-Serve Live Extracts; In-Memory MPP Query; Graphmarts on Demand; ELT, Model Based Data Integration; Document Search; Actionable Insights; “Last Mile” Analytics; Elastically Scaled Analytics; Scalable Encrypted Storage; Elastically Scaled Ingestion; Enterprise Data Lakes; Enterprise Data Sources]

Page 10: Accelerating Insight with High Octane Graph Fueled Data

Large Scale Graph Analytics

PROBLEM

Graph is a simple, clean model for standard analytic queries and allows you to do more.

But Graph has had terrible performance for standard analytics queries against large-scale data.

If you can’t do the standard “data warehouse” queries at scale, you won’t get to the algorithms that only Graph can perform!

SOLUTION

Build a Graph engine designed for large-scale analytics.

Leverage parallel computing - lots of hardware. Scale to hundreds of servers.

Extend the SPARQL language to backfill functionality present in SQL.

Deploy through a user interface that automatically writes the SPARQL and visualizes the results.

Page 11: Accelerating Insight with High Octane Graph Fueled Data

Analytic Landscape

ROLAP - Relational online analytics
• Broad adoption, 45 years of technology evolution
• Based on declarative SQL for business analysts
• Formal ANSI/ISO standard since 1986

GOLAP - Graph based online analytics
• Narrow adoption, accelerating over past 15 years
• Based on declarative SPARQL for business analysts
• Formal W3C standard since 2008

Hadoop (Spark) - Offline batch analytics
• Growing adoption since created in 2005 (2012)
• All queries programmed in Java/Scala/Python…
• Apache and community standards
• Limited only by programmer’s talents and available APIs

Page 12: Accelerating Insight with High Octane Graph Fueled Data

GOLAP is Real Relational Data Warehouse, Really

Relational Databases are predefined “rectangular” tables of rows with columns.
– Very natural for subjects (aka rows) with a number of known attributes common to all/most of the subjects.
– Allows columns to be links (aka keys) to other tables’ subjects.

Challenged by:
– Sparsity
– One-to-many needs a separate “join table”
– You need to understand the data in advance

Graphs are real relational, really. Just a little different than the points above!
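To make the contrast concrete, here is a minimal sketch in plain Python (not RDF tooling and not Anzo code) of how a graph stores the same facts as subject/predicate/object triples. All names, predicates and helper functions are made up for illustration. A sparse attribute is simply a triple that exists for one subject and not another, and a one-to-many relationship is just a second triple rather than a join table.

```python
# Hypothetical sample data: every subject, predicate and value is illustrative.
triples = [
    ("joe", "name", "Joe"),
    ("joe", "hobby", "chess"),
    ("joe", "hobby", "sailing"),   # one-to-many: another triple, no join table
    ("mary", "name", "Mary"),
    ("mary", "employer", "acme"),  # sparse: only Mary has this attribute
    ("mary", "hobby", "chess"),
]


def values(subject, predicate):
    """Return every object recorded for (subject, predicate), in stored order."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def predicates(subject):
    """List the attributes a subject actually has, with no schema known in advance."""
    return sorted({p for s, p, _ in triples if s == subject})
```

Calling `predicates("mary")` lets the data describe itself, which is the point of the "understand the data in advance" bullet: nothing had to be declared before the `employer` fact was stored.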

Page 13: Accelerating Insight with High Octane Graph Fueled Data

RDF/SPARQL… like RDB/SQL, but...

Standard SQL aggregates, joins, etc., plus simple and powerful relationship capabilities.

“How is Joe related to Mary?”

– In SQL Relational
• Are they spouses?
• Are they siblings?
• Are they friends?
• Do they have the same hobby?
• … enumerate the choices, EXPLODES with degrees of separation

– In SPARQL Graph
• How is Joe related to Mary?
• … you can directly specify degrees of separation

Pretty exciting, essentially all the power of SQL, but you can do more, with more diverse data, where the data tells you about itself, rather than you knowing in advance.
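The "How is Joe related to Mary?" question can be sketched as a bounded path search. The example below is plain Python, not SPARQL and not the Anzo engine. The edge data, the `how_related` function and the `^` marker for walking an edge backwards are all assumptions made for illustration. The point is that the caller names only the two people and a maximum number of degrees of separation, never the relationship types.

```python
from collections import deque

# Hypothetical edges (subject, predicate, object); not taken from the slides.
EDGES = [
    ("joe", "siblingOf", "ann"),
    ("ann", "spouseOf", "mary"),
    ("joe", "hasHobby", "chess"),
    ("mary", "hasHobby", "chess"),
    ("joe", "friendOf", "bob"),
]


def build_adjacency(edges):
    """Index edges in both directions so relationships can be walked either way."""
    adj = {}
    for s, p, o in edges:
        adj.setdefault(s, []).append((p, o))
        adj.setdefault(o, []).append(("^" + p, s))  # "^" marks an inverse step
    return adj


def how_related(edges, start, goal, max_degrees):
    """Return every simple path from start to goal using at most max_degrees edges."""
    adj = build_adjacency(edges)
    found = []
    queue = deque([(start, [], {start})])
    while queue:
        node, path, seen = queue.popleft()
        if node == goal and path:
            found.append(path)
            continue
        if len(path) == max_degrees:
            continue  # degree-of-separation limit reached
        for pred, nxt in adj.get(node, []):
            if nxt not in seen:
                queue.append((nxt, path + [(node, pred, nxt)], seen | {nxt}))
    return found
```

With this sample data, `how_related(EDGES, "joe", "mary", 2)` finds two connections (through a sibling's spouse and through a shared hobby) without either relationship being listed in the query, while a limit of 1 finds none.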

Page 14: Accelerating Insight with High Octane Graph Fueled Data

The Smart Data Lake is the “database”

• Data cached in HDFS, AWS/GCP buckets
• Multiple Graph Query Engine instances, usually on subsets
• Ephemeral in-memory operation
• Short term instances - load, query, toss
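The "load, query, toss" pattern can be pictured with a small Python sketch. This is an illustration only and does not reflect how Anzo is implemented: `CACHED_TRIPLES` stands in for data cached in HDFS or a cloud bucket, and `ephemeral_engine` and `total_sales` are hypothetical names. A short-lived in-memory instance loads just the subset it needs, answers a query, and is then discarded.

```python
from contextlib import contextmanager

# Hypothetical stand-in for data cached in HDFS or an AWS/GCP bucket.
CACHED_TRIPLES = [
    ("order1", "customer", "joe"),
    ("order1", "total", 40),
    ("order2", "customer", "mary"),
    ("order2", "total", 25),
    ("joe", "hobby", "chess"),
]


@contextmanager
def ephemeral_engine(predicates):
    """Load only the requested subset into memory and discard it on exit."""
    subset = [t for t in CACHED_TRIPLES if t[1] in predicates]  # load
    try:
        yield subset  # query
    finally:
        subset.clear()  # toss


def total_sales():
    """Sum order totals using a short-lived instance over the 'total' subset."""
    with ephemeral_engine({"total"}) as graph:
        return sum(o for _, _, o in graph)
```

The cached data is never modified, so several such instances could run over different subsets at the same time, which is the arrangement the second bullet describes.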

Page 15: Accelerating Insight with High Octane Graph Fueled Data

Thank You

Click here to request a demo