Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data
DESCRIPTION
SCIENCE FOUNDATION IRELAND DIGITAL CONTENT WORKSHOP
Monday, July 25th 2011, Guinness Storehouse, Dublin
Session 4 - Data Analytics, Mining and Visualisation
Dr Eoin Brazil, Senior Software Developer and Tech Transfer Manager, Irish Centre for High End Computing (NUIG)
Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data.
TRANSCRIPT
Consultancy – Pragmatic Analytics
Irish Centre for High End Computing
Dr. Eoin Brazil
www.ichec.ie/consultancy
Technology Transfer @ ICHEC
• Started just over eighteen months ago
• Core competencies include:
– Performance Optimization
– Data Mining/Analytics (e.g. Computational Finance)
• Consultancy
• Training (e.g. R - & TSA / & AC, CUDA, HPC, etc.)
SFI Enterprise Workshop - 25th July 2011 2
Visual Exploration
Example – Wine Vintage
• Hot, dry summers give higher prices in mature wines
• Château Pétrus 2000 - ~$60,000 (liv-ex.com)
• Bordeaux Equation
• Wine quality = 12.145 + 0.00117 × Winter Rainfall + 0.0614 × Average Growing Season Temperature – 0.00386 × Harvest Rainfall
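The Bordeaux equation can be evaluated directly. The following is a minimal sketch in Python using the coefficients from the slide. The units (rainfall in millimetres, temperature in degrees Celsius) are an assumption, since the slide does not state them, and the input values are hypothetical.

```python
def wine_quality(winter_rainfall, avg_growing_temp, harvest_rainfall):
    """Evaluate the Bordeaux equation with the coefficients from the slide.

    Assumed units (not stated on the slide): rainfall in mm, temperature in deg C.
    """
    return (12.145
            + 0.00117 * winter_rainfall
            + 0.0614 * avg_growing_temp
            - 0.00386 * harvest_rainfall)


# Hypothetical inputs, for illustration only (not real vintage data).
dry_harvest = wine_quality(winter_rainfall=600, avg_growing_temp=17.0, harvest_rainfall=100)
wet_harvest = wine_quality(winter_rainfall=600, avg_growing_temp=17.0, harvest_rainfall=300)

print(round(dry_harvest, 4))
print(round(wet_harvest, 4))
```

The negative harvest-rainfall coefficient means the wetter harvest scores lower, which matches the first bullet: hot, dry summers give higher prices.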
Financial services – Computational Finance
Real-World Constraints
• My application / workflow:
– Deal with 2B+ transactions per day per site
– Less than 50 ms for end-to-end processing
– Need real-time detection of fraud
– Multiple coupled models in ensemble
– Production platform is X
– Cannot incorrectly classify good client as fraudster
– Data size is too large for my infrastructure
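The constraint that a good client must never be classified as a fraudster amounts to requiring zero false positives. One way to express it, sketched here in Python, is to pick the lowest score threshold that flags no good client in a labelled sample. The function names, the scoring scale and the toy data are all hypothetical and are not taken from the talk.

```python
def confusion_counts(labels, scores, threshold):
    """Count (tp, fp, tn, fn) when every score >= threshold is flagged as fraud."""
    tp = fp = tn = fn = 0
    for is_fraud, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged:
            fp += 1  # good client wrongly flagged
        elif is_fraud:
            fn += 1  # fraud case missed
        else:
            tn += 1
    return tp, fp, tn, fn


def lowest_zero_fp_threshold(labels, scores):
    """Return (threshold, tp, fn) for the lowest threshold with no false positives."""
    for threshold in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
        if fp == 0:
            return threshold, tp, fn
    return None  # unreachable for non-empty input with this search order


# Toy data: True marks a known fraud case; scores are hypothetical model outputs.
labels = [False, False, True, False, True, True]
scores = [0.10, 0.35, 0.40, 0.55, 0.80, 0.95]

result = lowest_zero_fp_threshold(labels, scores)
print(result)
```

On this toy sample the zero-false-positive threshold still catches two of the three fraud cases but misses one, which illustrates the trade-off the constraint imposes.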
Are you ready for Big Data?
• Hadoop is x50+ slower on relational data, can be x1000+ slower on graph data
• Make sure you hone the tool first:
– MCMC x53 faster using Rcpp versus plain R
– Linear regression x8 faster using Eigen via R
– BLAS/LAPACK x15 faster in R with ICC flags and hardware
– Rmpi / multicore / MKL / pnmath / MR / gputools
What are GPGPUs?
• Disruptive Innovation in Parallel Computing
– HPC from desktop to supercomputers (a 10-generation leap)
Typical Business Results
Domain | Result
Computational Finance | 1 or 8 cards (x121 / x950): do in 1 second what used to take 2 / 16 minutes, a leap of 10 processor generations
Oil and Gas | Data processing: x2 to x6 (profiling at this stage), e.g. a volume that took 44 mins could be done in 22 to 7 ½ mins
Life Sciences | Patient analytics: initial prototype for cardiovascular disease detection (~72% accuracy), ongoing work
Telecoms | Fraud detection: prototype for subscription fraud detection (~99% accuracy), avoided predicting good clients as fraudsters*
Electronic Commerce | Demand forecasting & customer segmentation: used historic data to predict future demand (~90% accuracy) and identified valuable clients (~80% accuracy)
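The speedup figures quoted in the results translate into wall-clock time by simple division. The sketch below, in Python, reproduces that arithmetic; the helper name is hypothetical, and the only inputs are the numbers quoted in the results.

```python
def time_after_speedup(baseline, speedup):
    """Return the run time after applying a speedup factor to a baseline time."""
    return baseline / speedup


# Oil and gas: a 44-minute job at x2 and at x6.
oil_gas_minutes = [time_after_speedup(44, s) for s in (2, 6)]

# Computational finance: 1 second of accelerated work replaces `speedup`
# seconds of baseline work, converted here to minutes.
finance_baseline_minutes = [s / 60 for s in (121, 950)]

print([round(m, 2) for m in oil_gas_minutes])
print([round(m, 2) for m in finance_baseline_minutes])
```

The x6 case works out to roughly 7.3 minutes, which the results round to 7 ½, and the x121 and x950 cases correspond to roughly 2 and 16 minutes of baseline work per accelerated second.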
Acknowledgements
Supported by Science Foundation Ireland under grant 08/HEC/I1450 and by HEA’s PRTLI-C4.