What's New in Revolution R Enterprise 6.2
Revolution R Enterprise Release 6.2: New Features
David M. Smith, Thomas W. Dinsmore
May 1, 2013
Poll Question
Which stats package do you use most?
In today’s webcast:
About Revolution R Enterprise
New Features in Revolution R Enterprise 6.2
Resources for getting more from R
Q&A
Revolution Analytics is the leading commercial provider of software and support for the
open-source R statistical computing language
Revolution R Enterprise is:
Enterprise-ready
Multi-platform
Scalable from desktop to big data
Delivers high performance analytics
Easier to build and deploy analytic applications
What is R?
Data analysis software
A powerful programming language: development platform designed by and for statisticians
A complete environment: huge library of algorithms for data access, data manipulation, analysis and graphics
An open-source software project: free, open, and active
A vibrant community: thousands of contributors, 2 million users; resources and help in every domain
Download the white paper "R is Hot": bit.ly/r-is-hot
Revolution Analytics Scales R to the Enterprise
Power: Distributed high performance analytics
Productivity: Build & deploy analytics applications easily
Enterprise Readiness: Enterprise landscape; full-service customer support, consulting and training
Revolution R Enterprise: High Performance, Multi-Platform Analytics Platform
ScaleR: High Performance Big Data Analytics
RevoR: Performance Enhanced Open Source R; Open Source R packages
ConnectR: High Speed Connectors (HDFS, HBase, ODBC, SAS)
PlatformR: Parallel Distributed Computing (IBM/Netezza, IBM/Platform LSF, MS HPC Server, MS Azure Burst)
DevelopR: Integrated Development Environment
DeployR: Web Services
ScaleR brings the power of Big Data to R
Distributed Statistical Algorithms: Parallel External Memory Algorithms exploit available compute resources (cores & computers) independent of platform
Communications Framework: Abstracted communications layer provides portability of code between platforms: server, cluster, or in-database
Data Source API: Use the high-speed local data mart (XDF), or stream data from SAS, ODBC, HDFS or other remote data sources
R Language Interface: Familiar, high-productivity programming environment for R users
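The idea behind an external memory algorithm can be sketched in a few lines: data is read one block at a time, each block is reduced to a small summary, and the summaries are combined. The Python sketch below is only an illustration of that pattern and is not ScaleR code; the function names (`chunks`, `partial_stats`, `combine`) and the in-memory list standing in for a file are assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Tuple

Stats = Tuple[int, float, float]  # (count, sum, sum of squares)


def chunks(values: List[float], size: int) -> Iterator[List[float]]:
    """Yield successive blocks; a stand-in for blocks read from disk."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def partial_stats(block: Iterable[float]) -> Stats:
    """Reduce one block to a small summary that fits in memory."""
    n, s, ss = 0, 0.0, 0.0
    for x in block:
        n += 1
        s += x
        ss += x * x
    return n, s, ss


def combine(a: Stats, b: Stats) -> Stats:
    """Merge two summaries; order does not matter, so blocks could be
    processed on different cores or machines and merged afterwards."""
    return a[0] + b[0], a[1] + b[1], a[2] + b[2]


def mean_and_variance(blocks: Iterable[Iterable[float]]) -> Tuple[float, float]:
    """Compute the mean and sample variance without holding all data at once."""
    total: Stats = (0, 0.0, 0.0)
    for block in blocks:
        total = combine(total, partial_stats(block))
    n, s, ss = total
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance


data = [float(i) for i in range(1, 11)]
mean, variance = mean_and_variance(chunks(data, 3))
print(mean, variance)
```

The sum-of-squares formula is kept for brevity; it can lose precision on large or poorly scaled data, where a numerically stable merge of per-block means and variances would be a better choice.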
ScaleR Addresses Performance and Capacity Limitations of Open Source R
DeployR integrates R with applications
Seamless: Bring the power of R to any web enabled application
Simple: Leverage common APIs including JS, Java, .NET
Scalable: Robustly scale user and compute workloads
Secure: Manage enterprise security with LDAP & SSO
[Diagram labels: R / Statistical Modeling Expert; DeployR; Business Intelligence; Mobile Web Apps; Cloud / SaaS; Deployment Expert]
Revolution R Enterprise Architecture
Use a connected MPP server or cluster for:
Data exploration
On-demand R applications
Big-data predictive models
Offline (batch) operations
Code generation for real-time deployment
RRE 6.2 New Features
Open Source R 2.15.3
High Speed Teradata Connector
New Analytics
Improved Performance
Enhanced DeployR API
Support for Open Source R 2.15.3
89 new features
11 performance enhancements
139 bug fixes
137 package dependencies
High-Speed Teradata Connector
[Chart: Data Transfer Speed, in rows per second (axis 0 to 300,000), for 1MM, 10MM and 100MM rows; series: ODBC, TPT]
Source: Teradata Center of Excellence
Six times faster
New Analytics (1): Stepwise Regression
Automates the model fitting process
Uses statistical criteria to select features
Multiple options:
  Methods: Forwards, backwards, bidirectional
  Criteria: AIC, BIC, Mallows’ Cp
Linear models (working on Logistic, GLM)
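As a rough illustration of what "forwards" selection with the AIC criterion means, the Python sketch below adds one feature at a time to a linear model and stops when no candidate lowers the AIC. It is a toy under stated assumptions, not the RRE implementation: the helper names (`solve`, `rss`, `aic`, `forward_stepwise`), the Gaussian AIC form n·log(RSS/n) + 2k, and the small data set are all made up for the example.

```python
import math
from typing import Dict, List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the matrix is nonsingular (no collinear features)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * y[i] for i, row in enumerate(design)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(b * v for b, v in zip(beta, design[i]))) ** 2 for i in range(n))


def aic(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Gaussian AIC up to a constant: n * log(RSS / n) + 2 * (number of coefficients)."""
    n = len(y)
    return n * math.log(rss(columns, y) / n) + 2 * (len(columns) + 1)


def forward_stepwise(features: Dict[str, Sequence[float]], y: Sequence[float]) -> List[str]:
    """Start from the intercept-only model; add the feature that lowers AIC most."""
    selected: List[str] = []
    best = aic([], y)
    while True:
        candidates = [name for name in features if name not in selected]
        if not candidates:
            break
        scores = {
            name: aic([features[s] for s in selected + [name]], y)
            for name in candidates
        }
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break  # no candidate improves the criterion
        selected.append(name)
        best = scores[name]
    return selected


# Toy data: y depends on x1 only; x2 is an unrelated alternating pattern.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
y = [3.0 + 2.0 * a + e for a, e in zip(x1, noise)]
features = {"x1": x1, "x2": x2}

print(forward_stepwise(features, y))  # ['x1']
```

Backward and bidirectional selection follow the same loop with a removal step; swapping the `2 * k` penalty for `log(n) * k` would give a BIC-based variant.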
New Analytics (2): PRNG
Parallel Random Number Generator
R interface to MKL PRNGs
Needed for “Forests”
Useful in simulation, Monte Carlo analysis
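To show why parallel simulation needs per-worker random streams, here is a minimal Python sketch of a Monte Carlo estimate of pi in which every worker draws from its own generator. It does not use MKL or the RRE interface; the names `stream_seed`, `count_hits` and `parallel_pi` are hypothetical, and deriving seeds by hashing is an assumption chosen for simplicity. Hashed seeds make streams distinct and reproducible but do not by themselves guarantee non-overlapping streams, which is the kind of guarantee a purpose-built parallel generator is meant to provide.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def stream_seed(base_seed: int, stream: int) -> int:
    """Derive a distinct, reproducible seed for each worker's stream."""
    digest = hashlib.sha256(f"{base_seed}:{stream}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def count_hits(base_seed: int, stream: int, draws: int) -> int:
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(stream_seed(base_seed, stream))  # private generator per worker
    hits = 0
    for _ in range(draws):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def parallel_pi(base_seed: int, workers: int, draws_per_worker: int) -> float:
    """Estimate pi from independent per-worker streams.

    Threads are used only to keep the sketch self-contained; in CPython they
    do not speed up this CPU-bound loop, so a process pool would be the
    natural next step.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(
            pool.map(lambda s: count_hits(base_seed, s, draws_per_worker), range(workers))
        )
    return 4.0 * sum(hits) / (workers * draws_per_worker)


estimate = parallel_pi(base_seed=42, workers=4, draws_per_worker=20000)
print(estimate)
```

Because each stream depends only on the base seed and the worker index, the result is the same from run to run regardless of the order in which workers finish.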
Other Analytic Enhancements
Improved control over model-fitting
  Interaction terms
  Linear models, GLM, logit
Output to XDF:
  “By-Group” summary stats
  Cube results
Performance Enhancements
Fixed format text input:
  Previously used Stat/Transfer
  With RRE 6.2: ~2x faster on Windows, ~1.3x faster on Linux
Sort and merge
Performance Enhancements: Sort
[Chart: Sort Speed, in rows per second (axis 0 to 16,000), for Test A and Test B; series: RRE 6.1, RRE 6.2]
Ten times faster, nine times faster
15MM rows, 100 columns
7.5MM rows, 200 columns
Sort on five keys
DeployR Enhancements
Priority scheduling for asynchronous jobs
Execute external scripts
  Git
  SVN
Repository-managed files and scripts
  Lifecycle management
  Enhanced versioning
Updated Java, JavaScript, .NET client libraries
Revolution R Enterprise 6.2
Updated R distribution
Enhanced Teradata integration
More analytic capabilities
Faster import, sort and merge
Improved enterprise deployability
Poll Question
What interests you most about Revolution R Enterprise?
Thank You!
Download slides, replay: http://bit.ly/RRE62-webinar
Resources for getting started with R: http://bit.ly/ZnZGt2
Get Revolution R Enterprise
Contact Sales: http://bit.ly/hey-revo
Free to Academics: www.revolutionanalytics.com/academic
We’re Hiring!
www.revolutionanalytics.com/careers
www.revolutionanalytics.com +1 650 646 9545 Twitter: @RevolutionR
The leading commercial provider of software and support for the popular open source R statistics language.
Thank you