Pentaho Big Data Analytics with Vertica and Hadoop

© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho

Upload: mark-kromer

Post on 10-May-2015

Category:

Technology


DESCRIPTION

Overview of the Pentaho Big Data Analytics Suite from the Pentaho + Vertica presentation at Big Data Techcon 2014 in Boston for the session called "The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho"

TRANSCRIPT

Page 1: Pentaho Big Data Analytics with Vertica and Hadoop

The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho

Page 2: Pentaho Big Data Analytics with Vertica and Hadoop

© 2014, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555

The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho

Pentaho Big Data Analytics

Mark Kromer, Pentaho Big Data Analytics Product Manager

Page 3: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Business Analytics Platform

[Diagram labels]
Users: DBA; ETL/BI Developer; Business Users & Executives; Analysts & Data Scientists
Pentaho Business Analytics: Enterprise & Interactive Reporting; Interactive Analysis; Dashboards; Predictive Analytics
Data Integration: Instaview | Visual MapReduce
Direct Access
Data sources: Operational Data; Big Data; Data Stream; Public/Private Clouds

Page 4: Pentaho Big Data Analytics with Vertica and Hadoop

Product Components

Pentaho Data Integration
• Visual development for big data
• Broad connectivity
• Data quality & enrichment
• Integrated scheduling
• Security integration

Pentaho Analyzer
• Visual data exploration
• Ad hoc analysis
• Interactive charts & visualizations

Pentaho Dashboards
• Self-service dashboard builder
• Content linking & drill through
• Highly customized mash-ups

Pentaho Data Mining & Predictive Analytics
• Model construction & evaluation
• Learning schemes
• Integration with 3rd-party models using PMML

Pentaho Enterprise & Interactive Reports
• Both ad hoc & distributed reporting
• Drag & drop interactive reporting
• Pixel-perfect enterprise reports

Pentaho for Big Data MapReduce & Instaview
• Visual interface for developing MapReduce (MR)
• Self-service big data discovery
• Big data access for data analysts

Page 5: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Interactive Analysis & Data Discovery
Highly Flexible Advanced Visualizations

❯ Simple, easy-to-use visual data exploration
❯ Web-based thin client; in-memory caching
❯ Rich library of interactive visualizations
  • Geo-mapping, heat grids, scatter plots, bubble charts, line over bar and more
  • Pluggable visualizations
❯ Java ROLAP engine to analyze structured and unstructured data, with SQL dialects for querying data from RDBMSs
❯ Pluggable cache integrating with leading caching architectures: Infinispan (JBoss Data Grid) & Memcached
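
Note on the ROLAP bullet on this slide: a ROLAP engine answers a dimensional request ("total amount by region") by generating SQL against relational tables rather than reading a pre-built cube. The sketch below illustrates that general idea only; it is not Pentaho's engine. The `sales` table, its columns, and the helper names are hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3


def build_rollup_sql(table, dimensions, measure):
    """Translate a dimensional request into a SQL GROUP BY statement.

    Identifiers are interpolated directly for brevity; a real engine
    would validate them against its schema metadata first.
    """
    dims = ", ".join(dimensions)
    return (
        f"SELECT {dims}, SUM({measure}) AS total "
        f"FROM {table} GROUP BY {dims} ORDER BY {dims}"
    )


def run_rollup(conn, table, dimensions, measure):
    """Generate the SQL and let the database do the aggregation."""
    return conn.execute(build_rollup_sql(table, dimensions, measure)).fetchall()


def demo_connection():
    """Build a tiny in-memory fact table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("East", "Widget", 100.0),
            ("East", "Gadget", 50.0),
            ("West", "Widget", 75.0),
            ("West", "Widget", 25.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for row in run_rollup(conn, "sales", ["region"], "amount"):
        print(row)
```

Adding a second dimension, for example `["region", "product"]`, changes only the generated `GROUP BY` clause, which is the property that lets such an engine sit on top of different SQL dialects.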

Page 6: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Data Integration

Easy to Use, Highly Scalable

❯ Graphical ETL designer
❯ Data agnostic
  • Structured, unstructured, web services, packaged apps (Google, SAS, SFDC, etc.), big data sources, traditional sources, JSON, XML, HL7, etc.
❯ Batch, low-latency & real time processing
❯ Scale-out architecture, deployable to PDI clusters, Hadoop clusters
❯ 100% Java engine; plug-in architecture for extensibility
❯ Workflow, alerting, monitoring

Integration, Manipulation & Enrichment

Use Cases:
• Classic ETL – data warehouse creation, population & maintenance
• Information Delivery – extraction from multiple data sources, transformation and streaming to a report
• MapReduce Applications – implementing “code-free” transformation pipelines within Hadoop
• Extensibility – adding 3rd-party functionality that automatically works within any of the above use cases.
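
Note on the "Classic ETL" use case: the extract, transform, and load stages that a graphical ETL designer wires together can be sketched by hand as three small functions. This is an illustrative sketch only, not PDI output. The CSV layout, the `fact_orders` table, and all function names are hypothetical, and SQLite stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: one row has a missing amount.
RAW_CSV = """order_id,region,amount
1,east,100.50
2,WEST,75
3,East,
4,west,24.50
"""


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount and normalize region case."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # data-quality rule: skip rows missing the measure
        clean.append(
            (int(row["order_id"]), row["region"].strip().title(), float(row["amount"]))
        )
    return clean


def load(conn, rows):
    """Load: upsert into the target table so reruns stay idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn, text=RAW_CSV):
    """Run all three stages, then return totals per region."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT region, SUM(amount) FROM fact_orders "
        "GROUP BY region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

Keeping the three stages as separate functions mirrors how a visual designer treats each step as an independent, reusable component.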

Page 7: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Big Data Analytics
Accelerate the time to big data value

• Full continuity from data access to decisions – complete data integration & analytics for any big data store
• Faster development, faster runtime – visual development, distributed execution
• Instant and interactive analysis – no coding and no ETL required

Page 8: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Visual Development
Eliminates the Need for Complex Coding

Would you rather do this?

Scheduling Modeling

Ingestion / Manipulation / Integration

… or this?

Page 9: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Visual MapReduce
Drag & Drop, Then Run in the Cluster

Parallel Execution as MapReduce in the Hadoop Cluster

As Much as 15x Faster Than Hand-Written Code
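
For readers unfamiliar with the programming model this slide refers to, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example job. It is an illustration of the general MapReduce pattern under simplifying assumptions: it does not use Hadoop, it is not code generated by Pentaho, and all function names are hypothetical. On a cluster, the map and reduce calls would run in parallel across nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Chain the three phases in one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(["big data", "Big analytics"]))
```

Only `map_phase` and `reduce_phase` carry job-specific logic; the grouping step is generic, which is why a visual tool can expose the job as two configurable steps.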

Page 10: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Predictive Analytics
Full Predictive Analytics Lifecycle Support

• Major sponsor of the open source project Weka
• Data exploration/visualization, model construction and export, preliminary evaluation
• Numerous classification/regression and clustering algorithms
• Integration with Pentaho Data Integration
❯ Import 3rd-party models using Predictive Modeling Markup Language (PMML)
❯ Operationalize models inside or outside of a Hadoop Cluster
❯ Incorporate algorithms into Pentaho visual interface; store and version models using the Pentaho repository

Page 11: Pentaho Big Data Analytics with Vertica and Hadoop

Streamlined Data Refinery
Drive a Sustainable Analytics Strategy with Big Data Orchestration at Scale

[Diagram labels]
Sources: Transactions – Batch & Real-time; Enrollments & Redemptions; Location, Email, Other Data
Hadoop Cluster
Data Orchestration
Outputs: Analyzer; Reports

Page 12: Pentaho Big Data Analytics with Vertica and Hadoop

JOIN THE CONVERSATION. YOU CAN FIND US ON:

blog.pentaho.com

@Pentaho

Facebook.com/Pentaho

Pentaho Business Analytics