making the most of in-memory: more than speed

The Briefing Room Making the Most of In-Memory: More than Speed

Upload: inside-analysis

Post on 21-Jun-2015


Category:

Technology



DESCRIPTION

The Briefing Room with Robin Bloor and Kognitio. Live Webcast, Oct. 1, 2013. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?AT=pb&SP=EC&rID=7539482&rKey=bc304aa8dac7b781

Everyone’s talking about in-memory these days, and the term has become synonymous with speed. But pinning data into memory is just the beginning, and it’s about more than speed. In-memory solutions need a tailored architecture, one that can take full advantage of RAM processing from every aspect, and this requires an approach that considers memory and CPU from the ground up. Register for this episode of The Briefing Room to hear from veteran Analyst Robin Bloor as he explains how memory is on the fast track to supersede disk, at least with respect to advanced analytics. He’ll be briefed by Kognitio CTO Roger Gaskell, who has helped pioneer the in-memory analytical platform since its inception in 1989. He will also discuss how this type of solution changes the landscape for the modern data architecture and its impact on advanced analytical capabilities. Visit InsideAnalysis.com for more information.

TRANSCRIPT

Page 1: Making the Most of In-Memory: More than Speed

The Briefing Room

Making the Most of In-Memory: More than Speed

Page 2: Making the Most of In-Memory: More than Speed

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh


Page 3: Making the Most of In-Memory: More than Speed

Mission

•  Reveal the essential characteristics of enterprise software, good and bad

•  Provide a forum for detailed analysis of today’s innovative technologies

•  Give vendors a chance to explain their product to savvy analysts

•  Allow audience members to pose serious questions... and get answers!

Page 4: Making the Most of In-Memory: More than Speed

Topics

This Month: DATA PROCESSING

November: DATA DISCOVERY & VISUALIZATION

December: INNOVATORS

Page 5: Making the Most of In-Memory: More than Speed

Data Processing

“Efficiency is doing things right; effectiveness is doing the right things.”

~Peter Drucker

Page 6: Making the Most of In-Memory: More than Speed

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group


Page 7: Making the Most of In-Memory: More than Speed

Kognitio

•  Founded in 1989, Kognitio is both an in-memory database and an analytical engine

•  The Kognitio Analytical Platform can be deployed as software, as an appliance, or in the cloud

•  The platform enables flexible, ad hoc queries on complex data sets, including data from Hadoop, and it offers scale-up and scale-out capabilities

Page 8: Making the Most of In-Memory: More than Speed

Guest: Roger Gaskell

Roger Gaskell is the Chief Technology Officer and one of the founding members of the Kognitio Development Team. He has overall responsibility for all product development, strategic direction and the roadmap of new innovation for the Kognitio Analytical Platform. Roger has been instrumental in all generations of the product to date. Over this time, it has evolved from an appliance-based system in the original beta offering in 1989, to hardware-independent software for x86 processing, then to a cloud-based Platform-as-a-Service offering in the mid-1990s. Prior to Kognitio, Roger was test and development manager at AB Electronics. During this time his primary responsibility was for the famous BBC Micro Computer and the development and testing of the first mass production of personal computers for IBM.

Page 9: Making the Most of In-Memory: More than Speed

Making the most of in-memory platforms

October 2013

Page 10: Making the Most of In-Memory: More than Speed

What is an “in-memory” analytical platform?

A database where queries are run from data held in computer memory (RAM) rather than on mechanical disk

Memory = Fast / Disk = Slow

Analytics go much quicker – SIMPLE? Unfortunately, it’s not as simple as that…

Page 11: Making the Most of In-Memory: More than Speed

Why in-memory: RAM is faster than disk (really!)

Actually, this is only part of the story: analytics completely changes the workload characteristics on the database

Simple reporting & transactional processing is all about “filtering” the data of interest

Analytics is all about complex “crunching” of the data once it is filtered

Crunching needs processing power & consumes CPU cycles

Storing data on physical disks severely limits the rate at which data can be provided to the CPUs

Accessing data directly from RAM allows much more CPU power to be deployed
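
The point about data supply limiting CPU deployment can be put as simple arithmetic: a data source can only keep as many cores busy as its delivery rate divided by the rate at which one core consumes data. The sketch below is illustrative only. The function name `cores_kept_busy` and every figure in it are placeholder assumptions, not measurements and not Kognitio numbers.

```python
def cores_kept_busy(supply_mb_per_s, per_core_mb_per_s, cores_available):
    """Number of cores a data source can keep fed, capped by the cores present."""
    if per_core_mb_per_s <= 0:
        raise ValueError("per_core_mb_per_s must be positive")
    return min(cores_available, int(supply_mb_per_s // per_core_mb_per_s))


# Placeholder figures, chosen only to show the shape of the calculation:
# a 16-core server, each core able to crunch 500 MB/s,
# a disk subsystem delivering 1,000 MB/s and memory delivering 20,000 MB/s.
disk_fed = cores_kept_busy(1_000, 500, 16)
ram_fed = cores_kept_busy(20_000, 500, 16)

print(disk_fed, ram_fed)
```

With these assumed inputs the disk-fed server has work for only a couple of its cores, while the memory-fed server can use all of them; real ratios depend entirely on the hardware and the query.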

Page 12: Making the Most of In-Memory: More than Speed

Analytics is about “CRUNCHING” through data

•  To understand what is happening in the data

Joins

Sorts

Aggregations

Grouping

Analytical Functions

Crunching is CPU cycle-intensive & CPU-bound

•  In-memory analytical platforms are therefore CPU-bound
   –  Assume disk I/O speeds not a bottleneck
   –  In-memory removes the disk I/O bottleneck

More complex analytics = More pronounced this becomes
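
To make the crunching operations on this slide concrete, here is a minimal Python sketch that runs a join, a grouping with aggregation, a sort, and a simple analytical function (a running total) over data already held in memory. The tables, column names, and the `crunch` function are hypothetical examples; this is not how any particular platform implements these operations.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical in-memory tables; names and values are made up for illustration.
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "South"},
    {"customer_id": 3, "region": "North"},
]
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 80.0},
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 3, "amount": 200.0},
]


def crunch(customers, orders):
    # Join: build a hash index on the customer key, then probe it per order.
    region_by_customer = {c["customer_id"]: c["region"] for c in customers}
    joined = [
        {"region": region_by_customer[o["customer_id"]], "amount": o["amount"]}
        for o in orders
        if o["customer_id"] in region_by_customer
    ]

    # Grouping and aggregation: total amount per region.
    totals = defaultdict(float)
    for row in joined:
        totals[row["region"]] += row["amount"]

    # Sort: regions ordered by total, largest first.
    ranked = sorted(totals.items(), key=itemgetter(1), reverse=True)

    # Analytical function: running (cumulative) total over the sorted rows.
    result, running = [], 0.0
    for region, total in ranked:
        running += total
        result.append({"region": region, "total": total, "running_total": running})
    return result


report = crunch(customers, orders)
for line in report:
    print(line)
```

Every step here is pure computation on rows that are already in RAM, which is why this kind of workload ends up limited by CPU cycles rather than by I/O.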

Page 13: Making the Most of In-Memory: More than Speed

For analytics, the CPU is king

Being CPU-bound fundamentally changes a system’s design philosophy

Interactive / ad hoc analytics: THINK data to core ratios ≈ <10GB data per CPU core

Disk I/O Bound

•  CPUs wait for data from disk

•  No need for efficient coding

•  Parallelization ineffective

CPU Bound

•  Every CPU cycle is precious – efficient coding

•  Parallelization = scalable performance

•  Advanced techniques minimize CPU cycles
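
The data-to-core guideline on this slide (roughly under 10 GB of data per CPU core for interactive or ad hoc analytics) translates into a simple sizing calculation. The sketch below assumes that guideline as a default; the function names and the 2 TB / 32-core example are hypothetical and only show how the ratio would be applied.

```python
import math

# Rule of thumb from the slide: roughly under 10 GB of data per CPU core.
GB_PER_CORE_GUIDELINE = 10


def min_cores_for(data_gb, gb_per_core=GB_PER_CORE_GUIDELINE):
    """Smallest core count that keeps the data-to-core ratio at or below the guideline."""
    if data_gb < 0 or gb_per_core <= 0:
        raise ValueError("data_gb must be >= 0 and gb_per_core must be > 0")
    return math.ceil(data_gb / gb_per_core)


def servers_needed(data_gb, cores_per_server, gb_per_core=GB_PER_CORE_GUIDELINE):
    """Servers required if each one contributes cores_per_server cores."""
    if cores_per_server <= 0:
        raise ValueError("cores_per_server must be > 0")
    return math.ceil(min_cores_for(data_gb, gb_per_core) / cores_per_server)


# Hypothetical example: 2,000 GB held in memory on 32-core servers.
cores = min_cores_for(2000)
servers = servers_needed(2000, 32)
print(cores, servers)
```

The ratio is a starting point for interactive workloads, not a hard limit; heavier analytics may call for less data per core.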

Page 14: Making the Most of In-Memory: More than Speed

Why now?

[Chart: price of RAM (logarithmic scale, base 10) and interest in in-memory, 1987 to 2010]

Page 15: Making the Most of In-Memory: More than Speed

Mature BI being overtaken

Numbers, tables, charts, indicators …accessed with ease and simplicity

Historical information, latency

Decision Support

But BI and BI tools have plateaued!

Progression into advanced analytics & data science

It’s now all about doing more math …a lot more math

Page 16: Making the Most of In-Memory: More than Speed

Machine learning algorithms

Dynamic Simulation

Statistical Analysis

Clustering

Behaviour modelling

Thus more complex methods – real-time

Reporting & BPM

Fraud detection

Dynamic Interaction

Campaign Management

[Chart axes: Analytical Complexity versus Technology/Automation]

Page 17: Making the Most of In-Memory: More than Speed

How to efficiently exploit RAM

•  A large cache is not in-memory
   –  In-memory platforms hold data in structures that take advantage of the properties of RAM
   –  Caches are copies of frequently used disk blocks

•  Platform designed to specifically exploit the random access nature of memory
   –  Different algorithms
   –  CPU cycles are precious – code efficiency paramount
   –  Advanced techniques used to reduce code path length
      •  Dynamic Machine Code Generation
      •  Extended CPU instruction sets

•  Parallelize everything
   –  Scale-out and Scale-up
   –  Fully and efficiently use every CPU core, in every CPU, in every server
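
The idea behind dynamic code generation, shortening the code path executed per row, can be illustrated by analogy in Python. The sketch below evaluates the same filter expression two ways: by walking an expression tree for every row, and by generating source for the expression once and compiling it into a single function. This is only an analogy under stated assumptions: it produces Python bytecode rather than machine code, the expression format and function names are invented for the example, and it does not describe Kognitio's implementation.

```python
import operator

# Operators supported by this toy expression language (an assumption of the sketch).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       ">": operator.gt, "<": operator.lt}

# Expression tree nodes: ("col", name), ("const", value), or (op, left, right).


def interpret(expr, row):
    """Generic path: re-walks the tree and dispatches on node type for every row."""
    kind = expr[0]
    if kind == "col":
        return row[expr[1]]
    if kind == "const":
        return expr[1]
    return OPS[kind](interpret(expr[1], row), interpret(expr[2], row))


def to_source(expr):
    """Translate the tree into a Python expression string."""
    kind = expr[0]
    if kind == "col":
        return f"row[{expr[1]!r}]"
    if kind == "const":
        return repr(expr[1])
    if kind not in OPS:
        raise ValueError(f"unsupported operator: {kind}")
    return f"({to_source(expr[1])} {kind} {to_source(expr[2])})"


def generate(expr):
    """Specialized path: compile the expression once, then call it per row."""
    source = f"lambda row: {to_source(expr)}"
    return eval(compile(source, "<generated>", "eval"))


# Hypothetical filter: price * qty > 100
expr = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))
rows = [{"price": 30, "qty": 2}, {"price": 60, "qty": 2}, {"price": 10, "qty": 20}]

predicate = generate(expr)
interpreted = [interpret(expr, r) for r in rows]
generated = [predicate(r) for r in rows]
print(interpreted, generated)
```

Both paths return the same answers; the generated function simply avoids the per-row tree walk and dispatch, which is the kind of saving that matters when every CPU cycle is precious.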

Page 18: Making the Most of In-Memory: More than Speed

Analytical Platform Reference Architecture

Application & Client Layer: All BI Tools, All OLAP Clients, Excel, Reporting

Analytical Platform Layer

Near-line Storage (optional): Kognitio Storage

Persistence Layer: Hadoop Clusters, Enterprise Data Warehouses, Legacy Systems, Cloud Storage

Page 19: Making the Most of In-Memory: More than Speed

Perceptions & Questions

Analyst: Robin Bloor

Page 20: Making the Most of In-Memory: More than Speed
Page 21: Making the Most of In-Memory: More than Speed

Big Data, Maybe — Big Parallelism, Yes

Many latency-reducing changes are afoot:

•  Hadoop is a data lake – It’s about latency

•  CPU and memory rule – The old database is dying

•  Grids, not clusters – A server is now a cluster

•  Scaling Up AND Scaling Out – “Only scaling out” is last year’s story

•  SSD will replace spinning disk – But it will never compete with RAM

Page 22: Making the Most of In-Memory: More than Speed

Why the Excitement?

What are the “new” applications?

BIG DATA capture and staging

BIG DATA ANALYTICS

LITTLE DATA ANALYTICS

OPERATIONAL INTELLIGENCE

Page 23: Making the Most of In-Memory: More than Speed

A “Modern” Workload

Query Light & Math Heavy

Page 24: Making the Most of In-Memory: More than Speed

Where the Rubber Meets the Road

It isn’t really about application latency any more, it’s about business process latency (business time!). This can have many aspects:

•  The collapse of data flows – take the processing to the data

•  Data warehouse offload

•  Full process automation

•  Lower latency = NEW BUSINESS PROCESSES

Page 25: Making the Most of In-Memory: More than Speed

The Question

The question for most organizations is:

Exactly how do we take advantage of these changes?

This is a BUSINESS question AND a TECHNICAL question.

Page 26: Making the Most of In-Memory: More than Speed

•  Low latency is exciting, but where do you see the clear business opportunities?

•  There seems to be a conundrum about where to store “slow” data:
   –  Hadoop?
   –  Traditional data warehouse?
   –  New data warehouse?

•  Is the split between the application and the data real any more?

Page 27: Making the Most of In-Memory: More than Speed

•  In your opinion, does the Enterprise need a new architecture?

•  How is it possible to define and monitor service levels with in-memory applications?

•  Whither data governance?

Page 28: Making the Most of In-Memory: More than Speed

Page 29: Making the Most of In-Memory: More than Speed

Upcoming Topics

www.insideanalysis.com

This Month: DATA PROCESSING

November: DATA DISCOVERY & VISUALIZATION

December: INNOVATORS

Page 30: Making the Most of In-Memory: More than Speed

Thank You for Your Attention