Web Briefing: Unlock the power of Hadoop to enable interactive analytics
Unlock the power of Hadoop to enable interactive analytics & real-time Business Intelligence July 10, 2013

Upload: kognitio

Post on 27-Jan-2015


Page 1: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Unlock the power of Hadoop to enable interactive analytics & real-time Business Intelligence July 10, 2013

Page 2: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Web Briefing: Unlock the power of Hadoop to enable interactive analytics

• Thank you for joining today’s session!
• The web briefing will start momentarily.
• We will use the WebEx Q & A feature

Today’s Slides are available at www.slideshare.net/kognitio

Follow the conversation on Twitter: @Hortonworks @Kognitio

Teleconference: Use your computer, or call:
US +1 631 267 4890
UK +44-203-478-5289
Passcode: 841 203 797

Page 3: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Web Briefing Agenda

Unlock the power of Hadoop to enable interactive analytics
July 10, 2013

• Modern Data Architectures - John Kreisa
• Hadoop meets Mature BI: Interactive Analytics - Michael Hiskey
• Demonstration: SQL and Hadoop with in-memory MPP Acceleration - Stuart Watt

Page 4: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

© Hortonworks Inc. 2013

Modern Data Architectures
Big data drivers and patterns

John Kreisa – VP Strategic Marketing, Hortonworks
@marked_man

Page 5: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Existing Data Architecture

[Diagram: three layers]
APPLICATIONS: Business Analytics; Custom Applications; Enterprise Applications
DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); OPERATIONAL TOOLS (MANAGE & MONITOR); DEV & DATA TOOLS (BUILD & TEST)
DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS

Page 6: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

6 Common Types of Hadoop Data

1. Sentiment: Understand how your customers feel about your brand and products – right now
2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website
3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
4. Geographic: Analyze location-based data to manage operations where they occur
5. Server Logs: Research logs to diagnose process failures and prevent security breaches
6. Unstructured (txt, video, pictures, etc.): Understand patterns in text across millions of web pages, emails, and documents

Page 7: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Next-Generation Data Architecture

[Diagram: three layers]
APPLICATIONS: Microsoft Applications; BI Tools & OLAP Clients
DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); In-memory MPP Accelerator; HORTONWORKS DATA PLATFORM; OPERATIONAL TOOLS (MANAGE & MONITOR); DEV & DATA TOOLS (BUILD & TEST)
DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensors, social media)

Page 8: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Interoperating With Your Data Tools

[Diagram: three layers]
APPLICATIONS: Microsoft Applications
DATA SYSTEMS: TRADITIONAL REPOS; In-memory MPP Accelerator; HORTONWORKS DATA PLATFORM; OPERATIONAL TOOLS (Viewpoint); DEV & DATA TOOLS
DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensors, social media)

Page 9: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Hadoop Common Patterns of Use

Big Data: Transactions, Interactions, Observations

Business Cases

HORTONWORKS DATA PLATFORM
Refine, Explore, Enrich
Batch, Interactive, Online

“Right-time” Access to Data

Page 10: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Hadoop as a Shared Data Lake

Infrastructure - Data Lake
Modern Data Architecture

• A more mature organization will have this as a goal for Hadoop
• Store all data and build/enable applications on shared “data lake”
• Delivers broad value across the enterprise
• Seamless SQL access with interactive analytics

[Diagram: three layers]
Applications: Custom Analytic App; Packaged Analytic App
Data Systems: TRADITIONAL REPOS (RDBMS, EDW, MPP); ENTERPRISE HADOOP PLATFORM (HORTONWORKS DATA PLATFORM); In-memory MPP Accelerator
Sources: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (logs, clicks, social media, sensors)

Page 11: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Hadoop for New Targeted Applications

Business Application
Catalyst: Type of Data

• Many organizations start here & expand usage
• Driven by a type of data that could not be analyzed before Hadoop
• Delivers explicit value for a business case or an individual LOB
• Complementary to existing applications that use SQL
• Interactive analytics with MPP in-memory execution of R, Python, Perl, etc.

[Diagram: three layers]
Applications: Custom Analytic App; Packaged Analytic App
Data Systems: TRADITIONAL REPOS (RDBMS, EDW, MPP); ENTERPRISE HADOOP PLATFORM (HORTONWORKS DATA PLATFORM); In-memory MPP Accelerator
Sources: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (logs, clicks, social media, sensors)

Page 12: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

HDP: Enterprise Hadoop Distribution

Hortonworks Data Platform (HDP)
Enterprise Hadoop

• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP) stack]
Layers: OPERATIONAL SERVICES; DATA SERVICES; HADOOP CORE; PLATFORM SERVICES
Components: AMBARI; OOZIE; HIVE & HCATALOG; PIG; HBASE; HDFS; MAP REDUCE
LOAD & EXTRACT: SQOOP; FLUME; NFS; WebHDFS
Enterprise Readiness: High Availability, Disaster Recovery, Security and Snapshots
OS, Cloud, VM, Appliance

Page 13: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Hadoop meets Mature BI: Interactive Analytics

Michael Hiskey
VP of Marketing & Business Development

@mphnyc

Page 14: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Mature Business Intelligence and Reporting

Numbers, tables, charts, indicators

…accessed with ease and simplicity

Historical information, latency

BI tools have plateaued

Decision Support

Advanced analytics and data science

More math…a lot more math

Page 15: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Drive for a deeper level of understanding

Dynamic Simulation

Statistical Analysis

Behavior modelling

Reporting

Fraud detection

[Code excerpt shown on slide; lines are cut off at the slide edge]

create external script LM_PRODUCT_FORECAST environment rsint
receives ( SALEDATE DATE, DOW INTEGER, ROW_ID INTEGER, PRODNO INTEGER, DA
partition by PRODNO order by PRODNO, ROW_ID
sends ( R_OUTPUT varchar )
isolate partitions
script S'endofr(
# Simple R script to run a linear fit on daily sales
prod1<-read.csv(file=file("stdin"), header
colnames(prod1)<-c("DOW","ID","PRODNO","DA
dim1<-dim(prod1)
daily1<-aggregate(prod1$DAILYSALES, list(D
daily1[,2]<-daily1[,2]/sum(daily1[,2])
basesales<-array(0,c(dim1[1],2))
basesales[,1]<-prod1$ID
basesales[,2]<-(prod1$DAILYSALES/daily1[pr
colnames(basesales)<-c("ID","BASESALES")
fit1=lm(BASESALES ~ ID,as.data.frame(bases
forecast<-array(0,c(dim1[1]+28,4))
colnames(forecast)<-c("ID","ACTUAL","PREDI
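The excerpt above is cut off on the slide, so it cannot be run as shown. As a rough illustration of the pattern it points at (rows arrive grouped by product, and each partition gets its own linear fit and a 28-day extension), here is a minimal Python sketch. The column names ROW_ID, PRODNO and DAILYSALES echo the excerpt; the function names, the CSV layout, the sample data, and the plain least-squares fit on raw daily sales (without the day-of-week normalization the R code appears to do) are assumptions for illustration, not the presenter's script.

```python
import csv
import io
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit: returns (slope, intercept) for y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def forecast_by_product(rows, horizon=28):
    """Group rows by PRODNO and extend each product's fitted trend by `horizon` rows."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["PRODNO"]].append((int(row["ROW_ID"]), float(row["DAILYSALES"])))

    forecasts = {}
    for prodno, points in partitions.items():
        points.sort()  # mirror "order by PRODNO, ROW_ID" within the partition
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        slope, intercept = fit_line(xs, ys)
        last_id = xs[-1]
        forecasts[prodno] = [
            (row_id, slope * row_id + intercept)
            for row_id in range(last_id + 1, last_id + 1 + horizon)
        ]
    return forecasts


# Hypothetical sample data: one product whose sales grow by 2 units per day.
sample = io.StringIO(
    "ROW_ID,PRODNO,DAILYSALES\n"
    "1,100,10\n2,100,12\n3,100,14\n4,100,16\n"
)
result = forecast_by_product(csv.DictReader(sample), horizon=2)
print(result["100"])
```

Because each product is fitted independently, the partitions could be processed in parallel, which is the point of the "partition by PRODNO" clause in the excerpt.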

select Trans_Year, Num_Trans,
  count(distinct Account_ID) Num_Accts,
  sum(count( distinct Account_ID)) over (partition by Trans_Year order by Num_Tran
  cast(sum(total_spend)/1000 as int) Total_Spend,
  cast(sum(total_spend)/1000 as int) / count(distinct Account_ID) Avg_Yearly_Spend
  rank() over (partition by Trans_Year order by count(distinct Account_ID) desc) R
  rank() over (partition by Trans_Year order by sum(total_spend) desc) Rank_by_Tot
from( select Account_ID,
    Extract(Year from Effective_Date) Trans_Year,
    count(Transaction_ID) Num_Trans,
    sum(Transaction_Amount) Total_Spend,
    avg(Transaction_Amount) Avg_Spend
  from Transaction_fact
  where extract(year from Effective_Date)<2009
  and Trans_Type='D' and Account_ID<>9025011
  and actionid in (select actionid from DEMO_FS.V_FIN_actions
    where actionoriginid =1)
  group by Account_ID, Extract(Year from Effective_Date) ) Acc_Summary
group by Trans_Year, Num_Trans
order by Trans_Year desc, Num_Trans;

select dept, sum(sales) from sales_fact
Where period between date '01-05-2006' a
group by dept
having sum(sales) > 50000;

select sum(sales) from sales_history
where year = 2006 and month = 5 and regi

select total_sales
from summary where year = 2006 and month = 5 and regi
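The department query above filters detail rows, groups them, and then filters the groups with HAVING. A minimal runnable sketch of that shape, using Python's built-in sqlite3 module, might look like the following. The table and column names echo the slide; the sample rows, the text date format, and the date range are made up for illustration (the slide cuts off the upper bound of the original range, so the bounds here are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (dept TEXT, period TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("toys", "2006-05-03", 30000),
        ("toys", "2006-05-17", 25000),
        ("books", "2006-05-09", 20000),
        ("books", "2006-06-02", 40000),  # outside the hypothetical date range
    ],
)

# WHERE filters rows before grouping; HAVING filters the grouped totals.
# The date bounds are hypothetical: the slide truncates the original range.
rows = conn.execute(
    """
    SELECT dept, SUM(sales) AS total_sales
    FROM sales_fact
    WHERE period BETWEEN '2006-05-01' AND '2006-05-31'
    GROUP BY dept
    HAVING SUM(sales) > 50000
    ORDER BY dept
    """
).fetchall()
print(rows)
conn.close()
```

Only departments whose in-range total exceeds 50000 survive the HAVING clause; rows outside the date range never reach the aggregation.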

Page 16: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

The Analytical Enterprise

Business Analyst

Systems Admin

Data Scientist

Sexiest job of the 21st Century?

Key: “Graduation”
• Projects will need to easily Graduate from the Data Science Lab and become part of Business as Usual

Page 17: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Your goal:

PRESS HERE…and really cool Big Data stuff happens!

Page 18: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Big Data: Bring the Analytics TO the Data

Kognitio Hadoop Integration
• Kognitio Map/Reduce Agent uploads itself to Hadoop nodes
• Query passes selections, relevant predicates
• Data filtering & projection locally on each node
• Data filtered as it is read from file(s)
• Only data of interest is transferred and loaded into memory via parallel load streams
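The integration points above amount to pushing selection and projection down to the nodes that hold the data, so that only matching rows and requested columns travel to memory. A minimal Python sketch of that idea follows. It is not Kognitio's agent: the per-node data, the function names, and the thread pool standing in for parallel load streams are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node data: each "node" holds its own slice of a log file.
NODE_FILES = {
    "node1": [
        {"user": "a", "status": 200, "bytes": 512, "path": "/home"},
        {"user": "b", "status": 500, "bytes": 128, "path": "/cart"},
    ],
    "node2": [
        {"user": "c", "status": 500, "bytes": 256, "path": "/pay"},
        {"user": "d", "status": 404, "bytes": 64, "path": "/old"},
    ],
}


def scan_node(node, predicate, columns):
    """Runs "on the node": filter rows as they are read, keep only requested columns."""
    return [
        tuple(row[c] for c in columns)
        for row in NODE_FILES[node]
        if predicate(row)
    ]


def load_into_memory(predicate, columns):
    """Send the same predicate and column list to every node in parallel,
    then gather only the reduced results into one in-memory table."""
    nodes = sorted(NODE_FILES)
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        parts = pool.map(lambda n: scan_node(n, predicate, columns), nodes)
        table = [row for part in parts for row in part]
    return table


# Roughly equivalent to: SELECT user, path FROM logs WHERE status = 500
in_memory = load_into_memory(lambda r: r["status"] == 500, ["user", "path"])
print(in_memory)
```

The design point is that `scan_node` discards non-matching rows and unused columns before anything is returned, so the volume moved scales with the query's result rather than with the size of the files.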

Page 19: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Demonstration: SQL & Hadoop with in-memory MPP Acceleration

Stuart Watt
Senior Systems Engineer
@Kognitio

Page 20: Web Briefing: Unlock the power of Hadoop to enable interactive analytics


Hortonworks Snapshot

• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform

• We engineer, test & certify HDP for enterprise usage

• We employ the core architects, builders and operators of Apache Hadoop

• We drive innovation within Apache Software Foundation projects

• We are uniquely positioned to deliver the highest quality of Hadoop support

• We enable the ecosystem to work better with Hadoop

Develop Distribute Support

We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution

Endorsed by Strategic Partners

Headquarters: Palo Alto, CA
Employees: 200+ and growing
Investors: Benchmark, Index, Yahoo, Tenaya, Dragoneer

Page 21: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Kognitio Snapshot: Mature SQL atop Hadoop

Kognitio is an in-memory analytical platform that is tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or engrained tools.

• Privately held
• Invented the in-memory analytical platform
• Labs in the UK - HQ in New York, NY
• Powering advanced analytics at organizations worldwide, such as: [customer logos on slide]

Page 22: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Interactive analytics with Hadoop: Getting Started

Request a Meeting
• Assess your environment and use case for Hortonworks Data Platform + Kognitio Analytical Platform
www.kognitio.com/hadoop

ZERO to big data in 15 minutes:
Download Hortonworks Sandbox
www.hortonworks.com/sandbox

Sign up for Training for in-depth learning
hortonworks.com/hadoop-training/

Download the Kognitio Analytical Platform
• No registration required
• Perpetual license - No time limits
www.kognitio.com/free

Page 23: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Question & Answer session will be conducted electronically, using the panel to the right of your screen

Today’s Slides available at: www.slideshare.net/kognitio

Download Hortonworks Sandbox
www.hortonworks.com/sandbox

Download the Kognitio Analytical Platform
• No registration required
• Perpetual license - No time limits
www.kognitio.com/free

Unlock the power of Hadoop to enable interactive analytics

Request a Meeting
www.kognitio.com/hadoop

Page 24: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

connect

www.kognitio.com
twitter.com/kognitio
linkedin.com/companies/kognitio
tinyurl.com/kognitio
youtube.com/kognitio

+1 855 KOGNITIO

Page 25: Web Briefing: Unlock the power of Hadoop to enable interactive analytics

Hortonworks Sandbox
Fastest onramp to Apache Hadoop

• What is it?
  – Free, virtualized single-node version of Hortonworks Data Platform
  – A personal Hadoop environment
  – An integrated learning environment with hands-on step-by-step tutorials
• What does it do?
  – Dramatically accelerates the process of learning Apache Hadoop
  – Accelerates & validates the use of Hadoop within your unique data architecture
  – Use your data to explore and investigate your use cases
• ZERO to big data in 15 minutes
• Get Started!

Download Hortonworks Sandbox
www.hortonworks.com/sandbox

Sign up for Training for in-depth learning
hortonworks.com/hadoop-training/