![Page 1: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/1.jpg)
Using POWER at Warwick University
Dugan Witherick
8th July 2019 / University of Birmingham / Second PowerAI User Group Meeting
![Page 2: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/2.jpg)
• Established in the 1960s.
• Creation supported by University of Birmingham Vice Chancellor.
• ~27,000 students (undergrad and postgrad)
• Ranked
• 9th in the UK (Guardian 2020 league table)
• 62nd in the world (QS World University Rankings 2020)
• within the UK top 10 for highest earnings in over 11 subjects 5 years after graduating (UK Gov 2018 LEO Dataset).
Who is Warwick University
![Page 3: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/3.jpg)
Where is Warwick University (not in Warwick)
![Page 4: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/4.jpg)
• One of the UK's leading research universities.
• Theme focused research e.g.:
• The Engineered World: from Molecules to Machines
• Life Sciences and Health
• Strong history of collaboration and partnerships including:
• The Monash Warwick Alliance
• National Automotive Innovation Centre (WMG, JLR, Tata)
Research at Warwick University
![Page 5: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/5.jpg)
• World-class technologies and expertise.
• Ready access to research-critical tools.
• Responsibility of Pro-Vice Chancellor (Research).
• Includes:
• Advanced Bioimaging
• Electron Microscopy
• X-ray diffraction
• And...
Where do we fit in?
Research Technology Platforms
![Page 6: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/6.jpg)
• Located in the Department of Computer Science.
• Providing:
• Scientific desktop (based on Linux).
• Two local HPC clusters providing ~6000 cores (Tinis and Orac).
• Access to the HPC Midlands+ Tier2 (Athena) system.
• One SCRTP Director.
• Four "computing" staff.
• Two dedicated RSEs with additional Project Associate RSEs.
Scientific Computing (SCRTP)
![Page 7: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/7.jpg)
• Centre for Scientific Computing (CSC)
• Interdisciplinary research community based around the sharing of knowledge and expertise in computer modelling and simulation.
• Representatives from Departments including Physics, Maths, WMG and Warwick Medical School
• Department of Computer Science
Close Working Relationships
![Page 8: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/8.jpg)
• UK's national institute for data science and artificial intelligence.
• University of Warwick one of the five founding partners.
• Over twenty Turing Fellows at Warwick.
• Data science tools for high-performance computing (ATI Research Project).
The Alan Turing Institute and Warwick
![Page 9: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/9.jpg)
• Tissue Image Analytics (TIA) Lab
• Deep Learning for Imaging Data.
• Using/developing ML algorithms and data science platforms to understand and improve air quality over London.
• Crowd blackspot intelligence for 5G rollout (COCKPIT-5G).
AI at Warwick
![Page 10: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/10.jpg)
• 4 x 16 core Haswell with 2 x Xeon Phi 7120P (co-processor)
• 4 x 16 core Haswell with 2 x K80 dual-GPU
• CentOS 6, QDR Infiniband
• 4 x Xeon Phi 7250F (Knights Landing)
• CentOS 7, Omnipath
Accelerators for supporting AI workloads
![Page 11: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/11.jpg)
• Power8 Minsky S822LC
• 2 x IBM POWER8 3.259 GHz 8-core processors
• 16 cores per node
• 256 GB DDR4 memory
• 4 x NVIDIA P100 GPGPUs (SXM2 NVLink-enabled)
• Part of the Orac HPC Cluster (Broadwell, NetApp/Spectrum Scale, Omni-Path, CentOS 7, xCAT)
OpenPower TestBed
![Page 12: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/12.jpg)
• Version 1.5.4
• OS reinstall number one: CentOS 7.3 -> 7.5
• Not the simplest installation procedure.
• Non-relocatable dependencies!
• I think my "simplified" instructions to users may have put them off!
• User Question: "Where's theano?"
PowerAI Attempt 1
![Page 13: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/13.jpg)
• Version 1.6.0
• OS reinstall number two: CentOS 7.5 -> 7.6
• Much simpler installation
• All in a Conda channel (thank you).
• Some actual usage!
PowerAI Attempt 2
![Page 14: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/14.jpg)
• Demetris Marnerides
• Warwick Centre for Predictive Modelling
• Converting Low Dynamic Range (LDR) images to High Dynamic Range
• HDR displays readily available but most content still LDR.
Deep Learning for HDR Imaging
![Page 15: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/15.jpg)
• Convolutional Neural Networks to learn mapping from LDR to HDR
• PyTorch and OpenCV
• https://github.com/dmarnerides/hdr-expandnet
• https://arxiv.org/abs/1803.02266 (Initial Research)
ExpandNet
![Page 16: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/16.jpg)
• Kieran Kalair
• Mathematics for Real-World Systems Centre for Doctoral Training
• Analysing traffic data, particularly extreme cases:
• Accident
• Breakdown
• Random perturbation that causes a cascade of flow breakdown
Large Scale Traffic Data Analysis Problems
![Page 17: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/17.jpg)
• Time-Series models poor predictors for very short horizons.
• Neural Networks to improve predictions using UK motorway data.
• PyTorch
Improving Predictions
Image Credit: Jaroslaw Kilian / Shutterstock.com
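One common way to judge whether a model "improves predictions" at a given horizon is to score it against a naive persistence baseline (predict that the value `horizon` steps ahead equals the current value). The slides do not say which baselines or metrics were used, so the sketch below is only an illustration under that assumption; the function name and the flow values are hypothetical.

```python
def persistence_mae(series, horizon):
    """Mean absolute error of a persistence forecast.

    The forecast for series[t + horizon] is simply series[t].
    This baseline is an assumption for illustration, not the
    method described in the talk.
    """
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    predicted = series[:-horizon]   # values known at forecast time
    actual = series[horizon:]       # values observed horizon steps later
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Made-up flow readings with a sudden drop, standing in for an incident.
flows = [100, 104, 98, 60, 55, 90, 101]

print(persistence_mae(flows, 1))  # 16.5
print(persistence_mae(flows, 2))  # 33.0
```

A learned model would be scored on the same `actual` values, so that any gain over the baseline can be read off per horizon.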
![Page 18: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/18.jpg)
• Version 1.6.1.
• Sorry, Watson Machine Learning Community Edition.
• Please stop renaming your products!
• On my todo list.
PowerAI Attempt 3
![Page 19: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/19.jpg)
• How easy is it to build HPC applications currently used on Orac and Tinis on OpenPower?
• Affects support load (manual/ad-hoc builds take time).
• Currently use EasyBuild.
• Once built, do they produce the expected output?
• How well do these builds perform?
Migrating Non-AI Work to OpenPower
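The "expected output" question above can be checked mechanically by comparing a few summary quantities from a run on the new platform against a reference run, within a tolerance. The following is a minimal sketch of that idea, not the procedure used at Warwick; the function name, quantity names, values and tolerances are all hypothetical.

```python
import math


def compare_outputs(reference, candidate, rel_tol=1e-6, abs_tol=1e-12):
    """Return (name, reference_value, candidate_value) for every quantity
    that is missing from the candidate run or differs beyond tolerance."""
    mismatches = []
    for name, ref_value in reference.items():
        cand_value = candidate.get(name)
        if cand_value is None or not math.isclose(
            ref_value, cand_value, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append((name, ref_value, cand_value))
    return mismatches


# Made-up summary values from two builds of the same benchmark.
x86_run = {"total_energy": -6.7733681, "pressure": 0.5104123}
power_run = {"total_energy": -6.7733684, "pressure": 0.5104125}

print(compare_outputs(x86_run, power_run))                     # []
print(len(compare_outputs(x86_run, power_run, rel_tol=1e-8)))  # 2
```

Bitwise-identical output is rarely a realistic target across architectures and compilers, so the tolerance has to be chosen per application.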
![Page 20: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/20.jpg)
• Use EasyBuild to build GPU accelerated HPC applications (commonly used at Warwick).
• EasyBuild fosscuda 2018b toolchain:
• GCC 7.3.0, CUDA 9.2, OpenMPI 3.1.1
• Test "identical" build on Haswell/K80 for baseline performance.
Testing Non-AI Support and Workloads
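Comparing against a baseline build comes down to a ratio, with the direction depending on whether the metric is a throughput (higher is better) or a run time (lower is better). A small sketch, assuming a hypothetical helper name and placeholder numbers that are not measurements from this talk:

```python
def relative_performance(candidate, baseline, higher_is_better=True):
    """Ratio of candidate to baseline performance.

    A result above 1.0 means the candidate outperforms the baseline,
    whichever direction the underlying metric runs in.
    """
    if candidate <= 0 or baseline <= 0:
        raise ValueError("measurements must be positive")
    if higher_is_better:
        return candidate / baseline   # e.g. time-steps per second
    return baseline / candidate       # e.g. wall-clock minutes


# Placeholder values only.
print(relative_performance(3.0, 2.0))                            # 1.5
print(relative_performance(90.0, 120.0, higher_is_better=False)) # about 1.33
```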
![Page 21: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/21.jpg)
• Large-scale Atomic/Molecular Massively Parallel Simulator.
• Classical Molecular Dynamics Code.
• Distributed by Sandia National Laboratories.
• Used by Warwick Physics.
LAMMPS
![Page 22: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/22.jpg)
• “Freeze” internal benchmark:
• 50 x 50 x 50 crystal in lattice units.
• Lennard-Jones Interactions.
• 100,000 steps.
• Patch Release 18 June 2019.
• GPU Package built with DOUBLE_DOUBLE precision.
• Built entirely by EasyBuild (no noticeable issues).
LAMMPS Testing
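For orientation, the benchmark description above can be turned into a few numbers. The sketch below evaluates the standard 12-6 Lennard-Jones pair potential in reduced units and estimates system size and wall time. The slide gives only the box size, so the fcc lattice (four atoms per unit cell) is an assumption, as are the function names and the example rate.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones pair energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def atom_count(cells_per_side, atoms_per_cell=4):
    """Atoms in a cubic box of unit cells.

    atoms_per_cell=4 corresponds to an fcc lattice, which is an
    assumption; the slide does not state the lattice type.
    """
    return atoms_per_cell * cells_per_side ** 3


def wall_time_seconds(steps, steps_per_second):
    """Wall-clock time implied by a time-steps-per-second rate."""
    return steps / steps_per_second


r_min = 2 ** (1 / 6)          # location of the potential minimum for sigma = 1
print(lj_energy(r_min))       # approximately -1 (the well depth, -epsilon)
print(atom_count(50))         # 500000 under the fcc assumption
print(wall_time_seconds(100_000, 4.0))  # 25000.0 for a hypothetical 4 steps/s
```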
![Page 23: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/23.jpg)
[Chart: time-steps per second (higher is better), CPU vs CPU+GPU, series Power8 and K80]
LAMMPS Freeze Test
![Page 24: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/24.jpg)
• Still using EasyBuild fosscuda 2018b toolchain.
• Manual build (toolchain loaded but LAMMPS built manually).
LAMMPS Testing Attempt 2
![Page 25: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/25.jpg)
[Chart: time-steps per second (higher is better), Manual vs EasyBuild, series Power8 and K80]
LAMMPS Freeze GPU Test (EasyBuild vs Manual Build)
![Page 26: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/26.jpg)
• Empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).
• Developed at MRC Laboratory of Molecular Biology (Cambridge).
• Used by Warwick Life Sciences.
• Class3D standard benchmark using v. 3.0.6.
• Built entirely by EasyBuild.
RELION
![Page 27: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/27.jpg)
[Chart: run time in HH:MM (lower is better), K80 vs Power8]
RELION Class3D Test
![Page 28: Using POWER at Warwick University](https://reader030.vdocuments.us/reader030/viewer/2022012806/61bd404d61276e740b10e345/html5/thumbnails/28.jpg)
• Increase in "manual" or ad-hoc builds to get performance for some applications.
• Partial EasyBuild.
• Using toolchain but final build by hand.
• "Ad-hoc" builds using XL Compilers.
Non-AI Testing (conclusions so far)