05nov13 webinar: introducing revolution r enterprise 7 - the big data big analytics platform
Announcing: Release 7
Revolution R Enterprise
Michele Chambers, Chief Strategy Officer and VP Product Management
Thomas W. Dinsmore, Director of Product Management
Tuesday, November 5
Agenda
• Introduction
– Demystifying R
– Revolution Analytics at a Glance
– Revolution R Enterprise
– Revolution Analytics Partner Ecosystem
– Customer Testimonials
• What’s New in RRE 7?
• More Information
• Questions
Demystifying R
What is R & why is it so darn popular?
R is exploding in popularity & function
• Web Site Popularity (number of links to main web site): R, SAS, SPSS, S-Plus, Stata
• Scholarly Activity (Google Scholar hits, ’05-’09 CAGR): R 46%, SAS -11%, SPSS -27%, S-Plus 0%, Stata 10%
• Internet Discussion (mean monthly traffic on email discussion list): R, SAS, Stata, SPSS, S-Plus
• Package Growth (number of R packages listed on CRAN): 4,332 as of Feb 2013
Latest survey shows significant growth in R adoption
“A key benefit of R is that it provides near-instant availability of new and experimental methods created by its user base, without waiting for the development/release cycle of commercial software. SAS recognizes the value of R to our customer base…”
- Product Marketing Manager, SAS Institute, Inc.
“I’ve been astonished by the rate at which R has been adopted. Four years ago, everyone in my economics department [at the University of Chicago] was using Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”
- Deputy Editor for New Products at Forbes
R Usage Growth: Rexer Data Miner Survey, 2007-2013
70% of data miners report using R
24% use R as primary tool
Source: www.rexeranalytics.com
Revolution Analytics at a Glance
Who We Are
• Only provider of commercial big data big analytics platform based on open source R statistical computing language
Our Software Delivers
• Scalable Performance: Distributed & parallelized analytics
• Cross Platform: Write once, deploy anywhere
• Productivity: Easily build & deploy with latest modern analytics
Our Services Deliver
• Knowledge: Our experts enable you to be experts
• Time-to-Value: Our Quickstart program gives you a jumpstart
• Guidance: Our customer support team is here to help you
Global Industries Served
Financial Services
Digital Media
Government
Health & Life Sciences
High Tech
Manufacturing
Retail
Telco
Customers: 200+ Global 2000
Global Presence: North America / EMEA / APAC
Revolution R Enterprise is the only big data big analytics platform based on open source R, the de facto statistical computing language for modern analytics
• High Performance, Scalable Analytics
• Portable Across Enterprise Platforms
• Easier to Build & Deploy Analytics
R is open source and drives analytic innovation but has some limitations for enterprises
Big Data
– Open source R: In-memory bound
– Revolution R Enterprise: Hybrid memory & disk scalability
– Operates on bigger volumes & factors
Speed of Analysis
– Open source R: Single threaded
– Revolution R Enterprise: Parallel threading
– Shrinks analysis time
Enterprise Readiness
– Open source R: Community support
– Revolution R Enterprise: Commercial support
– Delivers full service production support
Analytic Breadth & Depth
– Open source R: 5000+ innovative analytic packages
– Revolution R Enterprise: Leverage open source packages plus Big Data ready packages
– Supercharges R
Commercial Viability
– Open source R: Risk of deployment of open source
– Revolution R Enterprise: Commercial license
– Eliminate risk with open source
Introducing Revolution R Enterprise (RRE): The Big Data Big Analytics Platform
[Platform diagram: R+CRAN, RevoR, DistributedR, ScaleR, ConnectR, DevelopR, DeployR]
Big Data Big Analytics Ready
– Enterprise readiness
– High performance analytics
– Multi-platform architecture
– Data source integration
– Development tools
– Deployment tools
The Platform Step by Step: R Capabilities
R+CRAN
• Open source R interpreter
• Freely-available R algorithms
• Algorithms callable by RevoR
• Embeddable in R scripts
• 100% Compatible with existing R scripts, functions and packages
RevoR
• Performance enhanced R interpreter
• Based on open source R
• Adds high-performance math
The Platform Step by Step: Parallelization & Data Sourcing
ConnectR
• High-speed & direct connectors
ScaleR
• Ready-to-Use high-performance big data big analytics
• Fully-parallelized analytics
• Data prep & data distillation
• Descriptive statistics & statistical tests
• Correlation & covariance matrices
• Predictive Models – linear, logistic, GLM
• Machine learning
• Monte Carlo simulation
• Tools for distributing customized algorithms across nodes
DistributedR
• Distributed computing framework
• Delivers portability across platforms
The Platform Step by Step: Tools & Deployment
DevelopR
• Integrated development environment for R
• Visual ‘step-into’ debugger
DeployR
• Web services software development kit for integrating analytics via Java, JavaScript or .NET APIs
• Integrates R into application infrastructures
Capabilities:
• Invokes R scripts from web services calls
• RESTful interface for easy integration
• Works with web & mobile apps, leading BI & Visualization tools and business rules engines
Write Once. Deploy Anywhere.
DESIGNED FOR SCALE, PORTABILITY & PERFORMANCE
• In the Cloud: Microsoft Azure Burst, Amazon AWS
• Workstations & Servers: Desktop, Server
• Clustered Systems: IBM Platform LSF, Microsoft HPC
• EDW: IBM Netezza, Teradata
• Hadoop: Hortonworks, Cloudera
The Power of Revolution R Enterprise: Performance & Scalability
Capabilities
• R + CRAN
• Fast Math Libraries
• Memory Management
• Multi-Threaded Execution
• Grid Processing
• Parallelized Algorithms
• Parallelized User Code
• In-Database Execution
• In-Hadoop Execution
Value
• Open Source: Leverage latest innovation
• RevoR: 3-50X faster
• DistributedR: Effective memory utilization
• DistributedR: Powerful divide & conquer
• DistributedR: Maximizes computation
• ScaleR: Labor saving power
• ScaleR: Leverage CRAN
• ScaleR: Moves computation to data
• ScaleR: Moves computation to data
Revolution R Enterprise: Powering Next Generation Analytics
[Diagram: COMBINE INTERMEDIATE RESULTS]
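The "combine intermediate results" step can be illustrated with a small sketch. This is not ScaleR code. It is a plain Python illustration, assuming the pattern is the common one of computing a partial summary for each chunk of data and then merging the partial summaries. The names `Partial`, `summarize_chunk`, `combine` and `mean_and_variance` are hypothetical, and the data is made up.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Partial:
    """Intermediate result for one chunk: count, sum, and sum of squares."""
    n: int
    total: float
    total_sq: float


def summarize_chunk(chunk: List[float]) -> Partial:
    # Each chunk is summarized independently, so this step could run
    # on a separate core or node without seeing the other chunks.
    return Partial(len(chunk), sum(chunk), sum(x * x for x in chunk))


def combine(parts: Iterable[Partial]) -> Partial:
    # Merging is plain addition, so the order in which chunks finish
    # does not affect the final result.
    n, total, total_sq = 0, 0.0, 0.0
    for p in parts:
        n += p.n
        total += p.total
        total_sq += p.total_sq
    return Partial(n, total, total_sq)


def mean_and_variance(p: Partial) -> Tuple[float, float]:
    mean = p.total / p.n
    variance = (p.total_sq - p.n * mean * mean) / (p.n - 1)  # sample variance
    return mean, variance


# Made-up data split into chunks, standing in for blocks read from disk.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = mean_and_variance(combine(summarize_chunk(c) for c in chunks))
print(mean, variance)
```

Only the small `Partial` records need to be held together at the end, which is why this style of computation is not bound by the memory of a single machine. The sum-of-squares formula is used here for brevity; it can lose precision on large or badly scaled data.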
Revolution R Enterprise RevoR: Performance Enhanced R

Computation (4-core laptop)         Open Source R   Revolution R   Speedup
Linear Algebra (1)
  Matrix Multiply                   176 sec         9.3 sec        18x
  Cholesky Factorization            25.5 sec        1.3 sec        19x
  Linear Discriminant Analysis      189 sec         74 sec         3x
General R Benchmarks (2)
  R Benchmarks (Matrix Functions)   22 sec          3.5 sec        5x
  R Benchmarks (Program Control)    5.6 sec         5.4 sec        Not appreciable

1. http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
2. http://r.research.att.com/benchmarks/

Customers report 3-50x performance improvements compared to Open Source R, without changing any code
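As a worked check of the Speedup column, the factor is simply the open source R time divided by the Revolution R time. The sketch below uses only the timings printed on the slide; the function name `speedup` is illustrative. Because the printed timings are rounded, the ratios do not all match the slide's column exactly (the Matrix Functions row works out to roughly 6x rather than 5x), so the slide's figures were presumably derived from different or unrounded measurements.

```python
# Timings in seconds as (open source R, Revolution R), copied from the slide.
timings = {
    "Matrix Multiply": (176.0, 9.3),
    "Cholesky Factorization": (25.5, 1.3),
    "Linear Discriminant Analysis": (189.0, 74.0),
    "R Benchmarks (Matrix Functions)": (22.0, 3.5),
    "R Benchmarks (Program Control)": (5.6, 5.4),
}


def speedup(baseline_sec: float, improved_sec: float) -> float:
    """Return how many times faster the improved run is than the baseline."""
    return baseline_sec / improved_sec


speedups = {name: speedup(*secs) for name, secs in timings.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```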
RRE ScaleR outperforms SAS HPA – at a fraction of the cost
Logistic Regression:

                SAS HPA*        Revolution R Enterprise   Difference
Rows of data    1 billion       1 billion
Parameters      “just a few”    7                         Double
Time            80 seconds      44 seconds                45%
Data location   In memory       On disk
Nodes           32              5                         1/6th
Cores           384             20                        5%
RAM             1,536 GB        80 GB                     5%

*As published by SAS in HPC Wire, April 21, 2011

Revolution R is faster on the same amount of data, despite using approximately a 20th as many cores, a 20th as much RAM, a 6th as many nodes, and not pre-loading data into RAM.
Bottom Line: Revolution R Enterprise Performance = Greatly Reduced TCO
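The "20th" and "6th" figures quoted in the comparison can be reproduced from the numbers on the slide. The sketch below assumes that the second value in each pair is the Revolution R Enterprise run (the one using 5 nodes and reading from disk); the dictionary keys are illustrative names. The time ratio comes out at 55%, that is, 45% less time, which may be what the slide's "45%" callout refers to.

```python
# (SAS HPA, Revolution R Enterprise) figures from the comparison.
# Assumption: the second value in each pair is the Revolution R Enterprise run.
resources = {
    "time_sec": (80, 44),
    "nodes": (32, 5),
    "cores": (384, 20),
    "ram_gb": (1536, 80),
}

# Fraction of the SAS HPA figure that the Revolution R Enterprise run used.
ratios = {name: rre / sas for name, (sas, rre) in resources.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of the SAS HPA figure (about 1/{1 / ratio:.1f})")
```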
R + Revolution R Enterprise: Unequaled Big Data Big Analytics
• R: Open Source Analytics
• Performance Enhanced R
• Big Data Distributed Analytics
• Deploy Analytics: Web, Mobile, Data Visualization, BI
Revolution R Enterprise Ecosystem: Power of Integration
Deployment / Consumption
Data / Infrastructure
Advanced Analytics
ETL
SI / Service
Corios
MSP / DSP
Customers Revolutionize their Business
Power
“We’ve combined Revolution R Enterprise and Hadoop to build and deploy customized exploratory data analysis and GAM survival models for our marketing performance management and attribution platform. Given that our data sets are already in the terabytes and are growing rapidly, we depend on Revolution R Enterprise’s scalability and power – we saw about a 4x performance improvement on 50 million records. It works brilliantly.” - CEO, John Wallace, DataSong
• 4X performance
• 50M records scored daily
Scalability
“We’ve been able to scale our solution to a problem that’s so big that most companies could not address it. If we had to go with a different solution we wouldn’t be as efficient as we are now.” - SVP Analytics, Kevin Lyons, eXelate
• TBs of data from 200+ data sources
• 10s of thousands of attributes
• 100s of millions of scores daily
• 2X data, 2X attributes, no impact on performance
Performance
“We need a high-performance analytics infrastructure because marketing optimization is a lot like financial trading. By watching the market constantly for data or market condition updates, we can now identify opportunities for our clients that would otherwise be lost.” - Chief Analytics Officer, Leon Zemel, [x+1]
What’s New in RRE 7
The Power of R
• Most widely used analytics tool
• Preferred by working analysts
• More than 6,000 packages
• Global footprint
New
• R 3.0.2
Scalable Statistical Modeling
• Linear Regression
• Stepwise Linear
• Logistic Regression
• Generalized Linear Models
New
• Stepwise Logistic
• Stepwise GLM
Scalable Machine Learning
• Decision Trees
New
• Decision Forests
• Tree Visualization
Data Source Integration
• Fixed/delimited text
• SAS, SPSS
• ODBC
• HDFS and HBase
• Teradata
Tested
• HP Vertica
• Teradata Aster
New: Model Integration
BI Integration
• Custom web reports
• QlikView accelerator
New
• Excel Accelerator
• Tableau Integration
New: Business User Interface
Choice of Operating Systems
New: Inside-Hadoop Deployment
[Diagram: Hadoop cluster with Name Node, Job Tracker, five Data Nodes, five Task Trackers, MapReduce, HDFS]
Multi-Node Package Manager
ScaleR in Hadoop
In-Database Deployment
Summary: What’s New in RRE 7.0
R 3.0.2
Stepwise Logistic
Stepwise GLM
Decision Forests
Tree Visualizer
PMML Export
www.revolutionanalytics.com
www.revolutionanalytics.com/contact-us