Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data

Post on 14-May-2015

DESCRIPTION

SCIENCE FOUNDATION IRELAND DIGITAL CONTENT WORKSHOP. Monday, July 25th 2011, Guinness Storehouse, Dublin. Session 4 - Data Analytics, Mining and Visualisation. Dr Eoin Brazil, Senior Software Developer and Tech Transfer Manager, Irish Centre for High End Computing (NUIG): Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data.

TRANSCRIPT

Consultancy – Pragmatic Analytics

Irish Centre for High End Computing

Dr. Eoin Brazil

www.ichec.ie/consultancy

Technology Transfer @ ICHEC

• Started just over eighteen months ago

• Core competencies include:

– Performance Optimization

– Data Mining/Analytics (e.g. Computational Finance)

• Consultancy

• Training (e.g. R - & TSA / & AC, CUDA, HPC, etc.)

SFI Enterprise Workshop - 25th July 2011 2

Visual Exploration

Example – Wine Vintage

• Hot, dry summers give higher prices in mature wines

• Château Pétrus 2000 - ~$60,000 (liv-ex.com)

• Bordeaux Equation

• Wine quality = 12.145 + 0.00117 × Winter Rainfall + 0.0614 × Average Growing Season Temperature – 0.00386 × Harvest Rainfall
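As a minimal sketch, the Bordeaux equation above can be evaluated directly. The function name, parameter names, and sample inputs are illustrative assumptions. The slide does not state units; millimetres of rainfall and degrees Celsius are assumed here.

```python
def bordeaux_quality(winter_rain, growing_temp, harvest_rain):
    """Evaluate the Bordeaux equation with the coefficients from the slide.

    Units are an assumption (rainfall in mm, temperature in degrees Celsius).
    """
    return (12.145
            + 0.00117 * winter_rain
            + 0.0614 * growing_temp
            - 0.00386 * harvest_rain)


# Hypothetical vintage: 600 mm winter rain, 17.0 C season average,
# 100 mm harvest rain. These inputs are made up for illustration.
score = bordeaux_quality(600.0, 17.0, 100.0)
print(round(score, 4))
```

The signs carry the message of the slide: a warmer growing season and a wetter winter raise the score, while rain at harvest lowers it.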

Financial services – Computational Finance

Real-World Constraints

• My application / workflow:

– Deal with 2B+ transactions per day per site

– Less than 50ms for end-to-end processing

– Need real-time detection of fraud

– Multiple coupled models in ensemble

– Production platform is X

– Cannot incorrectly classify a good client as a fraudster

– Data size is too large for my infrastructure
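To put the volume and latency constraints in perspective, a back-of-envelope calculation helps. This sketch assumes transactions arrive evenly across a 24-hour day (real traffic will peak higher) and applies Little's law (items in flight = arrival rate × time in system). The variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

transactions_per_day = 2_000_000_000   # 2B transactions per day per site
latency_budget_s = 0.050               # 50 ms end-to-end

# Average arrival rate, assuming a perfectly even spread over the day.
avg_rate_per_s = transactions_per_day / SECONDS_PER_DAY

# Little's law: average number of transactions in flight at any moment
# if each one spends the whole latency budget in the system.
in_flight = avg_rate_per_s * latency_budget_s

print(f"{avg_rate_per_s:,.0f} transactions/s on average")
print(f"{in_flight:,.0f} transactions in flight at once")
```

Under these assumptions a site has to sustain a little over twenty-three thousand fraud decisions per second, with more than a thousand in progress at any instant, before any allowance for peaks.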

Are you ready for Big Data?

• Hadoop is x50+ slower on relational data, can be x1000+ slower on graph data

• Make sure you hone the tool first:

– MCMC x53 faster using Rcpp versus R

– Linear regression x8 faster using Eigen via R

– BLAS/LAPACK x15 faster in R with ICC flags and hardware

– Rmpi / multicore / MKL / pnmath / MR / gputools

What are GPGPUs?

• Disruptive Innovation in Parallel Computing

– HPC from desktop to supercomputers (10-generation leap)

Typical Business Results

Domain | Result

Computational Finance | 1 or 8 cards (x121 / x950) = do in 1 second what used to take 2 / 16 minutes; a leap of 10 processor generations

Oil and Gas | Data processing = x2 – x6 (profiling at this stage), e.g. a volume that took 44 mins could be done in 22 – 7 ½ mins

Life Sciences | Patient analytics: initial prototype for cardiovascular disease detection (~72% accuracy), ongoing work

Telecoms | Fraud detection: prototype for subscription fraud detection (~99% accuracy), avoided predicting good clients as fraudsters*

Electronic Commerce | Demand forecasting & customer segmentation = using historic data to predict future demand (~90% accuracy) & identifying valuable clients (~80% accuracy)
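The speed-up factors quoted in these results translate into wall-clock time by simple division. A minimal sketch follows. The helper name is an assumption, and the inputs are the figures quoted in the results.

```python
def time_after_speedup(original_seconds, speedup):
    """Return the run time in seconds after applying a speed-up factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return original_seconds / speedup


# Computational finance: a job that now takes 1 second after a x121 or
# x950 speed-up originally took 121 or 950 seconds.
for factor in (121, 950):
    print(f"x{factor}: {factor / 60:.1f} min before, 1 s after")

# Oil and gas: a 44-minute volume at the low and high end of x2 to x6.
for factor in (2, 6):
    minutes = time_after_speedup(44 * 60, factor) / 60
    print(f"x{factor}: {minutes:.1f} min")
```

At x6 the 44-minute job comes out at roughly 7.3 minutes, which the quoted figure rounds to 7 ½.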

Acknowledgements

Supported by Science Foundation Ireland under grant 08/HEC/I1450 and by HEA’s PRTLI-C4.