Accelerating Big Data Analytics with Microsoft APS and Attunity Replicate

Upload: attunity

Post on 15-Jul-2015

The traditional data warehouse

Data sources

Non-relational data

Hadoop

Relational data warehouse

Data Platform

Analytics Platform System

SQL Server 2014

Azure HDInsight

Roadblocks to evolving to a modern data warehouse

Keep legacy investment

Buy new tier-one hardware appliance

Acquire Big Data solution

Acquire business intelligence

Limited scalability and ability to handle new data types

Significant training and data silos

High acquisition and migration costs

Complex with low adoption

Introducing the Microsoft Analytics Platform System

The turnkey modern data warehouse appliance

• Relational and non-relational data in a single appliance

• Enterprise-ready Hadoop

• Integrated querying across Hadoop and PDW using T-SQL

• Direct integration with Microsoft BI tools such as Microsoft Excel

• Near real-time performance with In-Memory Columnstore

• Ability to scale out to accommodate growing data

• Removal of data warehouse bottlenecks with MPP SQL Server

• Concurrency that fuels rapid adoption

• Industry’s lowest data warehouse appliance price per terabyte

• Value through a single appliance solution

• Value with flexible hardware options using commodity hardware

Microsoft Analytics Platform System

The turnkey modern data warehouse appliance

Hadoop alone is not the answer to all Big Data challenges

Steep learning curve, slow and inefficient

Move HDFS into the warehouse before analysis

ETL

Learn new skills

T-SQL

Build, integrate, manage, maintain, support

Hadoop ecosystem

New data sources

Connecting islands of data with PolyBase

Bringing Hadoop point solutions and the data warehouse together for users and IT

Provides a single T-SQL query model for PDW and Hadoop with rich features of T-SQL, including joins without ETL

Uses the power of MPP to enhance query execution performance

Supports Windows Azure HDInsight to enable new hybrid cloud scenarios

Provides the ability to query non-Microsoft Hadoop distributions, such as Hortonworks and Cloudera

SQL Server Parallel Data Warehouse

Microsoft Azure HDInsight

PolyBase

Microsoft HDInsight

Hortonworks for Windows and Linux

Cloudera

Select…

Result set

Use cases where PolyBase simplifies using Hadoop data

Bringing islands of Hadoop data together

Running high performance queries against Hadoop data

Archiving data warehouse data to Hadoop (move)

Exporting relational data to Hadoop (copy)

Importing Hadoop data into a data warehouse (copy)

Big Data insights for anyone

New insights with familiar tools through native Microsoft BI integration

Minimizes IT intervention for discovering data with tools such as Microsoft Excel

Enables DBAs and power users to join relational and Hadoop data with T-SQL

Offers Hadoop tools like MapReduce, Hive, and Pig for data scientists

Takes advantage of high adoption of Excel, Power View, PowerPivot, and SQL Server Analysis Services

Power users

Data scientists

Everyone else using Microsoft BI tools

Shinsegae Corporation, a major department store chain in Korea, needed better performance for customer data mining and basket purchase analysis. Shinsegae took advantage of the integration of PDW and Hadoop to combine 450 terabytes of data, and was pleased to see PolyBase performing nearly twice as fast as their best Hive/Hadoop environment.

#1 Retail company in Korea

“We are really satisfied with the performance of PolyBase to allow us to join relational and Hadoop data (weather data, board data, text data) faster and easier. PolyBase is a really powerful feature of PDW to deploy a Big Data system. PolyBase is one of the reasons we selected PDW as our Big Data platform.”

The Royal Bank of Scotland, the leading UK provider of corporate banking services, needed a powerful analytics platform to improve performance and customer services. The bank implemented a Microsoft SQL Server Parallel Data Warehouse appliance to increase productivity by 40 percent for faster response to business needs.

“I knew that it would be easy for my team to transition from managing SQL Server databases to SQL Server PDW, and the solution cost about 85 percent less than products from other vendors.”

Microsoft Analytics Platform System

No-compromise modern data warehouse solution

Meeting today’s Big Data analytics requirements

Enterprise-ready Hadoop with HDInsight and the simplicity of PolyBase

Optimized performance with MPP technology and In-Memory Columnstore

Providing value with a low TCO

To Use Data, You Must Move It!

Data Needs to Be Moved to Be Useful

• 80% of the work that data scientists put into big data projects is spent on data integration and resolving data quality issues.

Source: “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights,” by Steve Lohr, New York Times, August 17, 2014

Data Integration Remains a Major Challenge

1. Long rollout

2. Lots of personnel

3. Mixed systems

4. Hard to maintain

5. Not real-time

Attunity Replicate for Microsoft APS

More Data

Less Time

Less Cost

Data Value

• Easy, no coding, less complexity

• Pre-automated, optimized process

• Fast, high performance integration

• Real-time CDC with low overhead

• Optimized for large volumes in LAN and WAN

Use Cases

Getting Data into Microsoft APS and SQL Server

1. ELT – accelerate new data feeds to your data warehouse

2. CDC – load data in real time for operational analytics

3. Query Offload – into an ODS and for BI on SQL Server

4. Migrate – from another database or data warehouse

5. Hadoop – load data into and out of Hadoop

Attunity Replicate for Microsoft APS

Monitoring and Control.

Complete confidence at a glance

Turbo-Stream CDC and Optimizations for loading Microsoft APS

High performance, low latency, low impact, and scalability

Zero Footprint Architecture.

Nothing to install on source database for Oracle, SQL Server, DB2, Sybase, MySQL

Click-2-Load.

Drag. Drop. Done.

Complete, Heterogeneous Data Loading/Replication.

Automating Schema Generation, Full Load and Change Data Capture

Attunity Replicate for Microsoft APS

High Level Architecture

Web-based Designer and Management Console

Replication Server

In Memory Stream Processing

Persistent Store

Source Database

• Oracle

• SQL Server

• DB2

• DB2 for iSeries & z/OS

• Sybase

• MySQL

• Informix

• Files (CSV)

• Mainframe VSAM, IMS

• ODBC (e.g. Teradata, other DW)

Transaction Log

Data / Metadata

CDC

Bulk Reader

Transform

Filter

Bulk Loader

Stream Loader

Optimized Integration

PDW

HDInsight

PolyBase
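
The flow named in the architecture diagram (bulk reader for the initial load, CDC from the transaction log, transform and filter in memory, then loading into the target) can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the names `Change`, `full_load`, and `apply_changes`, the in-memory dictionary standing in for the target, and the `id` key column are all hypothetical and are not part of Attunity Replicate or Microsoft APS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional

Row = Dict[str, object]


@dataclass
class Change:
    """One change record, as a CDC reader might emit it (hypothetical shape)."""
    op: str                    # "insert", "update", or "delete"
    key: int                   # primary key of the affected row
    row: Optional[Row] = None  # full row image; None for deletes


def full_load(source_rows: Iterable[Row], target: Dict[int, Row],
              keep: Callable[[Row], bool],
              transform: Callable[[Row], Row]) -> None:
    """Bulk path: read every source row, filter, transform, load."""
    for row in source_rows:
        if keep(row):
            target[row["id"]] = transform(row)


def apply_changes(log: Iterable[Change], target: Dict[int, Row],
                  keep: Callable[[Row], bool],
                  transform: Callable[[Row], Row]) -> None:
    """CDC path: apply logged changes to the target in log order."""
    for change in log:
        # A delete, or a row that no longer passes the filter,
        # removes the row from the target (a design choice of this sketch).
        if change.op == "delete" or not keep(change.row):
            target.pop(change.key, None)
        else:
            target[change.key] = transform(change.row)


if __name__ == "__main__":
    source = [{"id": 1, "region": "EU", "amount": 10},
              {"id": 2, "region": "US", "amount": 20}]
    log = [Change("update", 1, {"id": 1, "region": "EU", "amount": 15}),
           Change("insert", 3, {"id": 3, "region": "EU", "amount": 5}),
           Change("delete", 2)]
    keep_eu = lambda r: r["region"] == "EU"
    add_cents = lambda r: {**r, "amount_cents": r["amount"] * 100}

    target: Dict[int, Row] = {}
    full_load(source, target, keep_eu, add_cents)    # initial bulk load
    apply_changes(log, target, keep_eu, add_cents)   # then ongoing changes
    print(sorted(target))
```

The same filter and transform functions are passed to both paths so that rows arriving through the initial load and rows arriving later as changes end up in the same shape on the target.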

Optimized Performance

Attunity TurboStream CDC for DW and Microsoft APS

In Memory Stream Processing

Attunity TurboStream CDC

Transactional CDC: transactions applied in real time, in order (SQL 1, SQL 2, ... SQL n)

High-Volume CDC: consolidation of change records to minimize transactions applied to target (R1 R1 R2 R1 R2 consolidated to R1 R2)

CDC DW Loader

PDW

HDInsight

PolyBase
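
The consolidation idea on this slide (several change records for the same row collapsed before they reach the target) can be sketched in Python. This is a minimal illustration, not Attunity's implementation: the `Change` record, the `consolidate` function, the assumption that every change carries a full row image, and the assumption that changes to different rows are independent are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    """One change record in a batch (hypothetical shape)."""
    op: str                     # "insert", "update", or "delete"
    key: str                    # row identifier, e.g. "R1"
    row: Optional[dict] = None  # full row image; None for deletes


def consolidate(changes: List[Change]) -> List[Change]:
    """Collapse an ordered batch to at most one net change per row."""
    net: Dict[str, Change] = {}
    for change in changes:
        prev = net.get(change.key)
        if prev is None:
            net[change.key] = change
        elif prev.op == "insert" and change.op == "delete":
            # Created and removed inside the batch: nothing to apply.
            del net[change.key]
        elif prev.op == "insert" and change.op == "update":
            # Still a new row for the target, with the latest values.
            net[change.key] = Change("insert", change.key, change.row)
        elif prev.op == "delete" and change.op == "insert":
            # The row existed before the batch, so the net effect is an update.
            net[change.key] = Change("update", change.key, change.row)
        else:
            # update+update, update+delete, and so on: the last change wins.
            net[change.key] = change
    return list(net.values())


if __name__ == "__main__":
    batch = [Change("update", "R1", {"v": 1}),
             Change("update", "R1", {"v": 2}),
             Change("update", "R2", {"v": 10}),
             Change("update", "R1", {"v": 3}),
             Change("update", "R2", {"v": 20})]
    for c in consolidate(batch):
        print(c.op, c.key, c.row)
```

With the slide's example sequence R1 R1 R2 R1 R2, five change records reduce to two statements against the target. The trade-off is that intermediate states are never visible on the target, which is why the slide lists this separately from transactional CDC, where every transaction is applied in order.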

Attunity Replicate for Hadoop

Ad-hoc Analytics

Bulk Load

Change Data

Click-2-Replicate Design. Drag. Drop. Done.

Databases

Data Feed Sources

CSV

BI Reporting

Visualization & Analytics

DB/DW

Data Refresh

Data Append

Attunity Replicate for Microsoft APS Benefits

1. High Performance – high volumes, low latency, low impact

2. Heterogeneous – supports many source databases

3. Fast time-to-value – automated turnkey solution

4. Less impact on IT – fewer development resources required

5. Lower TCO – for both software licenses and implementation services

For more information, go to:

www.Attunity.com/aps

• Read the Attunity Replicate for Microsoft APS Solution Sheet

• Download the "Accelerating Big Data Analytics with Microsoft APS and Attunity Replicate" Whitepaper

• Watch the "Accelerating Big Data Analytics with Microsoft APS and Attunity Replicate" Webinar