Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data

Consultancy – Pragmatic Analytics Irish Centre for High End Computing Dr. Eoin Brazil www.ichec.ie/consultancy


DESCRIPTION

Science Foundation Ireland Digital Content Workshop, Monday, July 25th 2011, Guinness Storehouse, Dublin. Session 4 - Data Analytics, Mining and Visualisation. Dr Eoin Brazil, Senior Software Developer and Tech Transfer Manager, Irish Centre for High End Computing (NUIG): Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data.

TRANSCRIPT

Page 1: Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data

Consultancy – Pragmatic Analytics

Irish Centre for High End Computing

Dr. Eoin Brazil

www.ichec.ie/consultancy

Page 2

Technology Transfer @ ICHEC

• Started just over eighteen months ago

• Core competencies include:

– Performance Optimization

– Data Mining/Analytics (e.g. Computational Finance)

• Consultancy

• Training (e.g. R - & TSA / & AC, CUDA, HPC, etc.)

SFI Enterprise Workshop - 25th July 2011 2


Page 4

Visual Exploration

Page 5

Example – Wine Vintage

• Hot, dry summers give higher prices in mature wines

• Château Pétrus 2000: ~$60,000 (liv-ex.com)

• Bordeaux Equation:

• Wine quality = 12.145 + 0.00117 × Winter Rainfall + 0.0614 × Average Growing Season Temperature – 0.00386 × Harvest Rainfall
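
The Bordeaux Equation is a plain linear model, so it can be evaluated in a few lines. The sketch below uses only the coefficients quoted on the slide; the function name, argument names and the sample weather values are hypothetical, and the slide does not state the units of the inputs or the scale of the resulting score.

```python
def bordeaux_quality(winter_rainfall, growing_season_temp, harvest_rainfall):
    """Evaluate the Bordeaux Equation with the coefficients from the slide.

    Input units and the scale of the score are not given on the slide,
    so the values passed in below are purely illustrative.
    """
    return (12.145
            + 0.00117 * winter_rainfall
            + 0.0614 * growing_season_temp
            - 0.00386 * harvest_rainfall)


# Hypothetical vintages: same winter, one with a hot dry summer,
# one with a cooler growing season and a wet harvest.
hot_dry = bordeaux_quality(600, 17.5, 50)
cool_wet = bordeaux_quality(600, 15.5, 250)

print(f"hot, dry vintage : {hot_dry:.4f}")
print(f"cool, wet vintage: {cool_wet:.4f}")
```

The signs of the coefficients carry the message of the slide: a warmer growing season raises the score, while rain at harvest lowers it.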

Page 6

Financial services – Computational Finance

Page 7

Real-World Constraints

• My application / workflow:

– Deal with 2B+ transactions per day per site

– Less than 50ms for end-to-end processing

– Need real-time detection of fraud

– Multiple coupled models in ensemble

– Production platform is X

– Cannot incorrectly classify a good client as a fraudster

– Data size is too large for my infrastructure
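
To make the first two constraints concrete, the back-of-envelope sketch below converts the daily volume into a sustained rate and, via Little's law, into the number of transactions in flight at once. It assumes arrivals are spread evenly over 24 hours, which the slide does not claim; real traffic peaks would be higher. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(tx_per_day):
    """Average transactions per second, assuming an even spread over the day."""
    return tx_per_day / SECONDS_PER_DAY


def transactions_in_flight(tx_per_day, latency_seconds):
    """Little's law: items in the system = arrival rate x time in the system."""
    return required_throughput(tx_per_day) * latency_seconds


tx_per_day = 2_000_000_000   # "2B+ transactions per day per site"
latency = 0.050              # "less than 50ms" end to end

print(f"{required_throughput(tx_per_day):,.0f} transactions/second")    # 23,148
print(f"{transactions_in_flight(tx_per_day, latency):,.0f} in flight")  # 1,157
```

Even under this optimistic even-load assumption, each site has to sustain roughly 23,000 scored transactions per second, with over a thousand in the pipeline at any instant.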

Page 8

Are you ready for Big Data?

• Hadoop can be x50+ slower on relational data and x1000+ slower on graph data

• Make sure you hone the tool first:

– MCMC x53 faster using Rcpp versus R

– Linear regression x8 faster using Eigen via R

– BLAS/LAPACK x15 faster in R with ICC flags and hardware

– Rmpi / multicore / MKL / pnmath / MR / gputools

Page 9

What are GPGPUs?

• Disruptive Innovation in Parallel Computing

– HPC from desktop to supercomputers (10-generation leap)

Page 12

Typical Business Results

Domain: Result

• Computational Finance: 1 or 8 cards (x121 / x950) = do in 1 second what used to take 2 / 16 minutes, 10 generations of processor

• Oil and Gas: Data processing = x2 – x6 (profiling at this stage), e.g. a volume that took 44 mins could be done in 22 – 7 ½ mins

• Life Sciences: Patient analytics, initial prototype for cardiovascular disease detection (~72% accuracy), ongoing work

• Telecoms: Fraud detection prototype for subscription fraud (~99% detection accuracy), avoided predicting good clients as fraudsters*

• Electronic Commerce: Demand forecasting & customer segmentation = using historic data to predict future demand (~90% accuracy) & identifying valuable clients (~80% accuracy)
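
The speedup figures for Computational Finance and Oil and Gas translate into wall-clock times by simple multiplication or division. The sketch below reproduces that arithmetic from the numbers on the slide; the helper names are hypothetical.

```python
def baseline_seconds(accelerated_seconds, speedup):
    """Time the original run took, given the accelerated time and the speedup."""
    return accelerated_seconds * speedup


def accelerated_minutes(baseline_minutes, speedup):
    """Time the accelerated run takes, given the original time and the speedup."""
    return baseline_minutes / speedup


# Computational Finance: 1 second of GPU work at x121 (1 card) or x950 (8 cards)
for cards, speedup in ((1, 121), (8, 950)):
    minutes = baseline_seconds(1, speedup) / 60
    print(f"{cards} card(s): 1 s now = {minutes:.1f} min before")  # 2.0 and 15.8

# Oil and Gas: a 44-minute volume at x2 and x6
for speedup in (2, 6):
    print(f"x{speedup}: {accelerated_minutes(44, speedup):.1f} min")  # 22.0 and 7.3
```

The x6 case works out to about 7.3 minutes, which the slide rounds to 7 ½.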

Page 13

Acknowledgements

Supported by Science Foundation Ireland under grant 08/HEC/I1450 and by HEA’s PRTLI-C4.