SAS, JMP and R for Big Data D. Zeitler April 13, 2016



Modern computer architecture

Analytics Components

· Processors
· Cache
· Main memory
· Graphics Processing Unit (GPU)
· Disk drive(s)
· Network interface

Analytics platforms

· Notebook/tablet
· Desktop
· Standalone server
· Server farm
· Cloud

Three different approaches

SAS 9.4, JMP Pro 12, R 3.2.3

SAS 9.4

Pros

· Backward compatibility (extreme)
· Enterprise-level support
· Respected and accepted
· Big data on limited hardware (sort of)
· Supports high-end hardware
· Academic credentials
· Lots of experts available (for a price)

Cons

· Cost and slow adoption

JMP Pro 12

Pros

· Highly visual
· Advanced GUI
· Very fast for most analyses
· Cross-platform (Win/Mac/Linux)
· Good OO programmability
· High integration with SAS

Cons

· Cost (maybe more than SAS)
· In-memory operation limits size
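A back-of-the-envelope estimate makes the in-memory limit concrete. The sketch below assumes every cell is an 8-byte double and ignores all per-column and per-table overhead. The function names, the 50% headroom rule and the example sizes are illustrative assumptions, not figures from JMP's documentation. Python is used only for illustration.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def estimated_table_bytes(n_rows: int, n_cols: int, bytes_per_cell: int = 8) -> int:
    """Rough size of an all-numeric table held fully in memory.

    Assumption: each cell is an 8-byte double; real tools add overhead.
    """
    return n_rows * n_cols * bytes_per_cell


def fits_in_memory(n_rows: int, n_cols: int, ram_bytes: int, headroom: float = 0.5) -> bool:
    """True if the table needs at most `headroom` of the available RAM.

    The 0.5 default is a hypothetical rule of thumb that leaves room
    for working copies made during an analysis.
    """
    return estimated_table_bytes(n_rows, n_cols) <= headroom * ram_bytes


if __name__ == "__main__":
    size = estimated_table_bytes(100_000_000, 50)
    print(f"{size / GIB:.1f} GiB needed")
    print("fits on a 16 GiB notebook:", fits_in_memory(100_000_000, 50, 16 * GIB))
```

Under these assumptions, 100 million rows by 50 numeric columns needs roughly 37 GiB. That is beyond a typical notebook or desktop and is the point at which sampling, chunking or a server becomes necessary.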

R 3.2.3

Pro / Con

· Advanced language capabilities
· Unlimited visualization capability
· Low cost (even for enterprise versions)
· Quick adoption of new techniques
· Interactive environment
· Global free support community

Why not Python?

· Even more 'roll your own' than R
· I usually opt for C++ rather than Python
· This is about statistical software

Head to head marketing

Big Data Analytics | Benchmarking SAS®, R, and Mahout. Source (http://support.sas.com/resources/papers/Benchmark_R_Mahout_SAS.pdf)

Revolution R Enterprise: Faster Than SAS. Source (http://info.revolutionanalytics.com/SAS-Benchmark-White-Paper.html). You'll need to give them your contact info to look at it.

So we have the gist in the next two slides.


These are the tasks.


Here's the punchline.


Handling Big data

General approaches

· Sampling (rows/cases and columns/variables)
· External database and SQL
· Big hardware
· Chunking
· External calls (Rcpp)
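Two of the approaches above, chunking and row sampling, can be sketched in a few lines. This is a minimal illustration rather than anything from the talk. It is written in Python with only the standard library, but the same pattern applies in R or SAS. The function names (`chunks`, `chunked_mean`, `reservoir_sample`) and the chunk sizes are assumptions made for the example. The sampler is the classic reservoir method (Algorithm R), which draws a uniform sample from a stream whose length is not known in advance.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(rows: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def chunked_mean(rows: Iterable[float], size: int = 10_000) -> float:
    """Mean computed chunk by chunk; only one chunk is held in memory."""
    total, count = 0.0, 0
    for chunk in chunks(rows, size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("no rows")
    return total / count


def reservoir_sample(rows: Iterable[float], k: int, seed: int = 0) -> List[float]:
    """Uniform random sample of up to k rows from a stream (Algorithm R)."""
    rng = random.Random(seed)
    sample: List[float] = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)          # fill the reservoir first
        else:
            j = rng.randint(0, i)       # inclusive on both ends
            if j < k:
                sample[j] = row         # keep row with probability k/(i+1)
    return sample


if __name__ == "__main__":
    stream = (float(x) for x in range(1_000_000))  # stands in for a large file
    print(chunked_mean(stream, size=50_000))
    print(len(reservoir_sample(range(1_000_000), k=100)))
```

Both functions read the data once and never hold more than one chunk or `k` rows, which is why they work on limited hardware. Statistics that cannot be combined from per-chunk summaries as easily as a mean (a median, for example) need a different algorithm or a sample.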

SAS 9.4

· Developed vintage 1960's on mainframes (think punch cards and tapes)
· Searched for 'When was SAS first written'. Second entry was 'When was the Bible written?'

JMP Pro 12

· Vintage 1980's on Mac platform
· GUI oriented
· Highly graphics oriented
· Expensive
· Well supported and integrated with SAS 9.4

R/RStudio 3.2.3

· Vintage 1980's on Unix systems
· Under constant development
· Freeware
· Commercialized versions available

SAS-specific big data approaches

If you can afford it, SAS is probably the quickest way to solutions.

So pay the bucks for:

· Enterprise Miner
· Visual Analytics
· Factory Miner
· Contextual Analysis
· SAS/OR
· Simulation Studio
· – the list goes on… –

JMP-specific big data approaches

· SAS JMP Pro (instead of the cheaper JMP offering)
· SAS server
· SAS code

R-specific big data approaches

· Alternative interpreters: pqR (http://radfordneal.github.io/pqR/), Renjin (http://www.renjin.org/), TERR (http://spotfire.tibco.com/en/discover-spotfire/what-does-spotfire-do/predictive-analytics/tibco-enterprise-runtime-for-r-terr.aspx), Oracle R (http://www.oracle.com/technetwork/indexes/downloads/r-distribution-1532464.html)
· Tessera (http://tessera.io)
· Microsoft R Server - XDF files (https://www.microsoft.com/en-us/server-cloud/products/r-server/)
· Spark/R (https://spark.apache.org/docs/latest/sparkr.html)
· The "Programming with Big Data in R" project (pbdR)