An IT Services and Oxford e-Research Centre Partner facility
Advanced Research Computing (previously known as the Oxford Supercomputing Centre)
Introduction Training
The ARC Training Programme
Course 1:
An introduction to Supercomputing at Oxford
– Introduction to the ARC
– Introduction to Parallel Computing & HPC
– Job Submission, Account Management
Course 2:
Introduction to Scientific Programming
Mihai Duta and Ian Bush
Introduction to ARC
Who are we?
What do we offer?
What is High Performance Computing?
Who are we?
Helping researchers access advanced research computing facilities, locally, nationally and internationally:
Central University HPC
User Support and Training
Access to range of machines
Priority Usage Service
Support for Commercial Customers
ARC Team: [email protected]
Andrew Richards
Head of ARC
Steven Young
Head of Technical Services
Dugan Witherick
Senior Research Computing Specialist
Mihai Duta
Scientific Software Advisor
Ian Bush
Scientific Application Specialist
ARC Governance Structure
Executive Committee
IT Committee
User Group
Operations Group
Technical Advisory Group
Where we are
ARC Data Centre & Offices
Data Centre
Provide access to:
Hardware
Software
User Support
Beyond Personal Computing…
Computing to Compete
• IT in all its forms increasingly underpins research activities
• Resources of the required scale cannot always be provided locally by research institutions
• There is a clear need for regional, national and international collaboration on e-infrastructure
Research Computing Pyramid
[Pyramid diagram: Tier 0 at the apex down to Tier 3 at the base]
Tier 0: Europe-wide facilities with users from multiple countries, e.g. PRACE. Tier-0 for the particle physics community is the HPC Data Centre at CERN.
Tier 1: National facilities, e.g. the ARCHER facility. For particle physics users it is the LHC Tier-1 Centre at RAL.
Tier 2: Regional centres, e.g. the SES Centre for Innovation.
Tier 3: Main institutional computing services, such as ARC.
HPC Landscape in the UK
ARC Services
• Local HPC
– Range of clusters, shared-memory and development systems: ARCUS, ARCUS-GPU, SAL, HAL, CARIBOU, RUBY, PHILEAS…
– Dedicated storage infrastructure
• Regional HPC
– SES-5 e-Infrastructure: EMERALD (GPU) and IRIDIS (x86)
• Training, Support, Consultancy
What is Linux?
• Linux is an Operating System
– Software loaded into your computer's memory when you switch it on
– Software which manages the hardware
• Linux is “Unix-like”
– Unix is an operating system originally developed in the 1970s to run large, central computers with many users
– Runs everything from phones to supercomputers today
Operating Systems
Brief history of UNIX and Linux
• A multi-user, multi-tasking operating system
• 1960s
– MIT, AT&T and General Electric develop Multics (Multiplexed Information and Computing Service)
• 1970s
– Unics (Uniplexed Information and Computing Service) – renamed to Unix
• 1972
– rewritten in C, resulting in software more portable across different hardware
• 1980
– AT&T licenses UNIX System III – consolidated differences into UNIX System V release 1
• 1983
– AT&T commercialise UNIX System V
– Academia continue to develop BSD Unix as an alternative
– BSD Unix main contributor of TCP/IP network code to the mainstream Unix kernel
• 1990
– Open Software Foundation released OSF/1 – standard Unix implementation (Mach and BSD)
• 1991
– Linus Torvalds starts Linux (originally Freax). Mixture of BSD and SYS-V
Why do we use Linux?
• Linux is very popular on critical servers due to:
– Stability
– Security
– Performance
– Price
– Flexibility & Open Source nature
• All of the above are applicable to high performance computing
Storage
ARC provides three types of storage:
• Home areas (/home)
– mid-term storage; 20GB quota per user
• General purpose (/data)
– mid-term storage; 5TB per group/project
• Scratch space (/scratch)
– high-performance storage, specific to each machine
– jobs which perform significant disk read/write should use the scratch disks and not /home
– short term (duration of the job)
Larger quotas are available on request (charges may apply).
This is active data storage for ARC projects only; backups of home and data are limited.
Storage Management: http://www.arc.ox.ac.uk/content/storage-management
Data Storage Policy
The ARC makes every effort to ensure the integrity of data stored on our facilities
• However, we are under no obligation to guarantee the integrity or availability of data - this is the responsibility of the individual user.
No Backups (limited snapshots of home and data)
• The ARC does not accept any liability, financial or otherwise for loss of data.
• We recommend that users employ standard industry practice for their important data and store it at sites other than the ARC, for example in their department.
Installed Software
Libraries & Compilers
• Intel C/C++
• Intel Fortran
• Intel MKL
• Portland
• BLACS
• CGNS
• HDF5
• ParMetis
• PETSc
• ScaLAPACK
• VTK
• MUMPS
• NetCDF
• DDT
Statistics
• Stata MP4
• R
• Rmpi
Chemistry
• ADF
• Gaussian03
• Gaussian09
• CPMD
• SIESTA
• CP2K
• GAMESS-US
• GROMACS
• NAMD
• Abaqus – FE
• CFD-ACE
The ‘module avail’ command on ARC services will show the applications available.
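As a minimal sketch of typical module usage on such systems (the module name loaded below is a hypothetical example, not a guarantee of what is installed on ARC):

    # List all applications and libraries available on the current system
    module avail

    # Load a package into your environment (hypothetical module name)
    module load intel-compilers

    # Show which modules are currently loaded
    module list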
How much computing knowledge do I need?
User Support
Training Courses
ARC User Guide
User Forum
Seminars on parallel Matlab & Mathematica
Advice on costing grants
Installing, testing software applications
Advice on using software, best practice, examples
What else would be of use in the future?
How do I get on the Systems?
ARC is open to all researchers at the University of Oxford.
• To use the ARC you must first set up a new project and then apply for an individual user account.
Projects can be research groups, individuals, or whatever best suits your local setup.
• For example, a research group may have two areas of research with different funding streams.
• It might make sense to create separate projects and have different user accounts for each project.
• Users can decide on the most sensible way to approach this.
Logging onto ARC Systems
Remote access
• Linux, Mac (and other Unix/Unix-like) users should use ssh to connect to the ARC systems from a terminal:
ssh -X [email protected]
• Windows users should download and install an application called PuTTY, plus Xming for X11 support
Batch jobs
• Never run intensive jobs on ARC systems without using the job scheduler
• Jobs are submitted to the scheduler using the "qsub" command and a job submission script (a sketch follows below)
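For illustration, a minimal submission script might look like the sketch below. The directives shown are typical of PBS-style schedulers that use qsub; the resource requests, job name and program are hypothetical examples rather than ARC-specific values:

    #!/bin/bash
    # Request 1 node with 16 cores and 1 hour of walltime (PBS-style syntax)
    #PBS -l nodes=1:ppn=16
    #PBS -l walltime=01:00:00
    #PBS -N example_job

    # Move to the directory the job was submitted from
    cd $PBS_O_WORKDIR

    # Run the (hypothetical) program
    ./my_program

Saved as, say, myjob.sh, this would be submitted with 'qsub myjob.sh'; the scheduler then runs it when the requested resources become free.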
User Account Management using ‘OSCtool’
Passwords and other user functions can be managed via the osctool (to be replaced soon).
How is usage funded?
Access Policy
Direct
• Purchase of time via research grants
• For larger, sustained use of ARC systems
• Higher-priority job queue
Indirect
• Basic service
• Usage charges distributed across Divisions and Departments
• Contact us to request credits
ARC costs around £0.5M a year to run and is currently supported through a hybrid funding model: the basic provision of the service is funded through indirect charges, with additional direct costs from grants for the large users of the facility.
ARC charges
Compute Time
• Funds deposited for usage on ARC systems are converted to credits, where 1 credit = 1 processor core × 1 second of walltime.
• The current Full Economic Cost is 3,600 credits for £0.02 (i.e. 2p per core-hour); pricing is reviewed annually (see the worked example below).
• EU grants are charged differently (please speak to us first).
• Please contact us for the current ‘price’ of credits.
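As a worked example, with a hypothetical job size and the quoted rate of 2p per core-hour: a job using 16 cores for 24 hours of walltime consumes

\[
16 \times 24 = 384 \text{ core-hours}, \qquad 384 \times £0.02 = £7.68
\]

so roughly £7.68 worth of credits.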
Storage
• £250 per TB per year above 5TB space limit provided to projects
Consultancy
• Please contact us.
Parallel Computing
High Performance Computing
High Throughput Computing
Supercomputing
What is High Performance Computing?
• No single, all-encompassing definition
• For the purposes of this course, it is computing which:
– you can’t do on a desktop or server,
– requires specialised resources, and
– is carried out on multiple processors in ‘parallel’
• We’re going to be working at the “sports car” level of computing...
A useful analogy
Local desktop
• Easy to buy
• Cheap
• Low running costs
• Easy to use
• No training required (but available)
• Low performance
Servers
• Very common but more expensive
• Can be expensive to run
• Some local training may be required to use properly.
• Relatively easy to get the best out of them with some practice
• Mid-range performance
A useful analogy
Campus HPC (Tier 3)
• Less common
• Fairly expensive
• High running costs
• Training can build on previous experience BUT to squeeze out the best performance, more advanced training and years of experience needed
• New users can have accidents easily
A useful analogy
National HPC
• A few per country
• Extremely expensive
• High running costs
• Specialist training and experience required to begin usage
• New users not allowed near them
A useful analogy
Impact of HPC
• Waiting for computers is dead time.
• Take any one thing which takes a week to do in your place of work and bring it down to a few hours... how big is the impact?
• Performance also limits the scale of the problems we can tackle.
• Take a problem which you could tackle in ten times more detail, ten times the size (or both)... how big is the impact?
How fast is fast enough?
A home PC can manage on the order of a few billion calculations per second – isn't this fast enough?
To run weather and climate models in a sensible amount of time, the Met Office needs a machine which can reach 1 PFlop (a million billion calculations per second) by 2011.
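To put that in perspective, taking "a few billion" to mean roughly $4 \times 10^{9}$ calculations per second (an assumed figure for a home PC), a 1 PFlop machine is equivalent to

\[
\frac{10^{15}}{4 \times 10^{9}} \approx 250{,}000
\]

such PCs working flat out.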
Why can't we just make our PCs that fast?
Computational Performance
• Computers are an amalgamation of components, some faster than others
• Processors have enjoyed the majority of progress, but increasingly software is bottlenecked by other components:
– Processor (FAST)
– Bus (SLOW)
– Memory (SLOW)
– Network (SLOW)
– Storage (SLOW)
Making Processors Faster
1980s to early 00's – just make the processor clocks tick faster!
Higher clock speeds increase heat production; chips experience thermal limits/damage
Power consumption getting out of control
The “Megahertz Myth”
Make components smaller, faster
Runs into physical limitations
Physical Limitations
• Size of transistors:
– 2003 90nm
– 2009 32nm
– 2014 14nm
– 2015 8-11nm (TBC)
• Size of Silicon atom = 0.22nm
• Analogy:
– Silicon atom = football
– 2015 transistor = centre circle
– 2009 transistor = width of pitch
What do we do when we're near 1nm?
Solution
• Use more processors!
– 1950s: multiple processor computers
– 1980s: possible to link multiple computers together (clusters)
– Late 1990s: multiple chip PCs
– Early 00's: multiple “cores” on single PC chip
• Now: hundreds of cores on specialist hardware
But: multiple cores bring additional problems for software design and for optimal ‘placement’ of code on cores.
How does it work?
• We achieve higher performance by looking for parallelism – Items of work you can do independently at the same time.
• A Problem can have:
– lots of parallelism (little communication)
– no parallelism!
– some parts parallelise, others don't (typical)
– trivial parallelism (no communication except start/end)
Analogy: Building a House
• Building the foundation and the roof – no parallelism
• Laying the bricks – partial parallelism
• Building the walls and laying the driveway – trivially parallel
Too good to be true ?
• Splitting the task up isn't always straightforward
– Sometimes done by the application automatically
– Sometimes done by the ARC staff
– Rarely done by the end user
• Not a magic bullet approach – speedup and scaling
• Communication much slower than computation...
[Figure: example speedups – performance vs. number of processors (1–16), showing a linear and a sub-linear scaling curve]
Program Speedup: Scaling
• Algorithm may not parallelise fully (Amdahl's law)
• If 20% of a problem won't parallelise, you can never make it more than 5 times faster
http://en.wikipedia.org/wiki/Amdahl%27s_law
The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of the program which takes one hour to execute cannot be parallelized, while the remaining 19 hours (95%) of execution time can be parallelized, then regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour. Hence the speedup is limited to at most 20×.
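In symbols: if a fraction $p$ of a program's runtime can be parallelised and $N$ processors are used, Amdahl's law gives the speedup

\[
S(N) = \frac{1}{(1-p) + p/N}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1-p}
\]

With $p = 0.95$ as in the example above, the limit is $1/0.05 = 20\times$; with $p = 0.8$ (20% serial), it is $1/0.2 = 5\times$, as stated earlier.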
Reasons for sub-linear scaling
Remember that computers are multi-component systems whose parts run at different speeds:
• Processors – very fast
• Cache – fast
• Main memory – slow
• Interconnect – slow
• Disk – very slow
Different applications place different loads on each subsystem
HPC Terminology?
• High Throughput Computing (Capacity)
• High Performance Computing (Capability)
Capacity vs Capability Computing
Capacity
• Using computing power to process many problems simultaneously
• Individual tasks, no communication between processes
• No specialist hardware interconnects needed between nodes in a cluster
Capability
• Using maximum computing power to solve a large problem that no other computer can
• One single task comprised of many child processes, all communicating with each other
• Specialist, high-performance, low-latency hardware interconnects needed between nodes in a cluster
Clusters (Distributed memory)
• Problem split across smaller machines with software, typically MPI
• Sub-tasks occasionally exchange information
• Cheap to build
• Hard to program
• Good for partially self-contained problems, e.g. fluid flow (see the sketch below)
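As a rough sketch of the distributed-memory workflow (the module and file names are hypothetical, not ARC-specific), an MPI program is compiled with the MPI compiler wrapper and launched as many communicating processes:

    # Make an MPI implementation available (hypothetical module name)
    module load mpi

    # Compile the program with the MPI compiler wrapper
    mpicc -O2 flow.c -o flow

    # Launch 64 communicating processes across the allocated nodes
    mpirun -np 64 ./flow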
Shared Memory Systems
• Lots of processors and memory in a box
• All sub processes in one box
• Acts like a large ‘desktop PC’
• Applications split into “threads”
• Usually programmed in something like OpenMP
• Easy to program
• Financially expensive
• Good for processes with common data, e.g. stats, genomics (see the sketch below)
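By way of contrast, a minimal shared-memory workflow (again a sketch, with a hypothetical file name) compiles with OpenMP support and sets the thread count through an environment variable:

    # Compile with OpenMP enabled (GCC shown; other compilers use similar flags)
    gcc -fopenmp stats.c -o stats

    # Run with 16 threads sharing one machine's memory
    export OMP_NUM_THREADS=16
    ./stats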
Other HPC Resources
SES: Centre for Innovation
– EMERALD: 372 Tesla GPUs
• 114 Teraflops
• InfiniBand and 10Gb Ethernet/Gnodal interconnect
– IRIDIS: 12,000 core Intel based cluster
• Capability/highly scaling work which cannot be accommodated on local systems
UK e-Infrastructure
National and Regional HPC Centres
MidPlus
HPC Matters?
• Analysing data - see structure in the chaos
• Modelling endless combinations of molecules to find the cure for <insert disease>
• Turn back the clock 14 billion years
• Modelling the path of storms
• Everyday life – investments, cars, shopping
• HPC Matters – http://sc14.supercomputing.org/about-sc14
Contact us
• http://www.arc.ox.ac.uk
Problems
Ideas
Questions