lecture 2 tth 03:30am-04:45pm dr. jianjun hu csce569 parallel computing university of south...
Lecture 2
TTH 03:30AM-04:45PM
Dr. Jianjun Hu
http://mleg.cse.sc.edu/edu/csce569/
CSCE569 Parallel Computing
University of South Carolina
Department of Computer Science and Engineering
Outline
• Clusters and SMP systems at USC CEC
• Ways of using High Performance Systems
• PBS Job Queuing System
• How to write Job file
• How to submit, delete, manage jobs submitted to Linux Cluster
• How to submit a large Number of Jobs
Systems: NICK
Linux OS
Hardware
• 76 Compute Nodes w/ dual 3.4 GHz XEON 2ML2, 4GB RAM
• 1 Master Node w/ dual 3.2 GHz 2ML2, 4GB RAM
• Topspin Infiniband Interconnect
• Storage: 1 terabyte network storage
Software
• Rocks 4.3 CentOS Base, OpenMPI, OpenPBS/Torque
• Absoft Compilers
• Intel Compilers
• Bio Roll that includes the following bio-informatics packages: HMMER, NCBI BLAST, MpiBLAST, biopython, ClustalW, MrBayes, T_Coffee, Emboss, Phylip, fasta, Glimmer, and CPAN
• Intel Math Kernel Library
• TURBOMOLE
• VASP
• STAR-CD
Systems: Optimus
Hardware
• 64 Nodes: Dual CPU, 2.0 GHz Dual-Core AMD Opterons, Totaling 256 Cores
• 8GB RAM
• 1 Terabyte of Storage in Headnode
• Gigabit Ethernet Interconnect
Software
• ROCKS 5.1
• OpenMPI
• OpenPBS Scheduler
• GNU Compilers
Systems: ZIA
SGI Altix 4700 Shared-memory system
Hardware
• 128 Itanium Cores @ 1.6 GHz / 8MB Cache
• 256 GB RAM
• 8TB storage
• NUMAlink Interconnect Fabric
Software
• SUSE10 w/ SGI PROPACK
• Intel C/C++ and Fortran Compilers
• VASP
• PBSPro scheduling software
• Message Passing Toolkit
• Intel Math Kernel Library
• GNU Scientific Library
• Boost library
Other Systems
Nataku
• 8 Nodes: Dual CPU, 2.0 GHz Dual-Core AMD Opterons, Totaling 32 Cores
• 16 GB RAM in Headnode, 8GB RAM in compute nodes
• Chemical Engineering machine for Star-CD
Jaws2
• 8 Compute Nodes w/ dual XEON 2.6 GHz, 2GB RAM
• Remaining parts of original Jaws cluster, currently being rebuilt
• 1 Terabyte attached storage
Dr. Flora’s 12 CPU VASP Cluster
Dr. Heyden’s MAC Cluster for VASP
Distributed Multiprocessor Cluster
[Figure: cluster diagram with the labels Front End Node, NFS, HD1, HD2, HD3]
Question
How can we utilize large high performance machines like these to speed up applications?
Ways of using Linux Clusters
App. Type 1:
[Figure: a regular program is applied to data1, data2, data3, ..., dataK, and the results are collected]
Each data set is processed independently as its own job, and each job can run independently on one CPU.
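A minimal sketch of the Type 1 pattern on a single machine, using plain bash background processes rather than PBS jobs: every data set is handled by its own independent process, and the results are collected afterwards. The function name `process_one`, the sample data, and the use of `wc -w` as the "regular program" are illustrative assumptions, not part of the lecture.

```bash
#!/bin/bash
# Type 1 sketch: K independent data sets, one process each, results collected at the end.
# Assumption: "wc -w" stands in for the regular program; the data files are made up here.

workdir=$(mktemp -d)

# Create three small sample data sets (hypothetical content).
printf 'a b c\n'   > "$workdir/data1"
printf 'd e\n'     > "$workdir/data2"
printf 'f g h i\n' > "$workdir/data3"

# The "regular program": reads one data set, writes one result file.
process_one() {
    local infile=$1
    wc -w < "$infile" | tr -d ' ' > "$infile.result"
}

# Launch one background process per data set; no process depends on another.
for f in "$workdir/data1" "$workdir/data2" "$workdir/data3"; do
    process_one "$f" &
done
wait    # block until every background process has finished

# Collect results: add up the per-data-set word counts.
total=0
for r in "$workdir"/*.result; do
    total=$(( total + $(cat "$r") ))
done
echo "total words: $total"
```

On a cluster, each call to `process_one` would become its own PBS job instead of a background process, and the collection step would run once all jobs have finished.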
Ways of using Linux Clusters
App. Type 2:
[Figure: a parallel program distributes data to compute1, compute2, compute3, compute4, with communication between processes, and produces one result]
Parallel processes can be executed on multiple CPUs, and their partial results are combined in the main process.
PBS System for Clusters
PBS is a workload management system for Linux clusters.
It supplies commands for
◦ job submission
◦ job monitoring (tracing)
◦ job deletion
It consists of the following components:
◦ Job server (pbs_server) provides the basic batch services:
  receiving/creating a batch job
  modifying the job
  protecting the job against system crashes
  running the job
PBS System for Clusters
◦ Job Executor (pbs_mom)
  receives a copy of the job from the job server
  sets the job into execution
  creates a new session that is as identical to a user login session as possible
  returns the job's output to the user
◦ Job Scheduler (pbs_sched)
  runs the site's policy controlling which job is run, and where and when it is run
  PBS allows each site to create its own scheduler
  Currently Nick uses the Torque/Maui Scheduler
OpenPBS Batch Processing
Maui communicates
◦ with Moms: monitoring the state of a system's resources
◦ with Server: retrieving information about the availability of jobs to execute
Steps needed to run your first production code
Suppose your application experiments are:
$ myprog data1 2 30
$ myprog data2 3 12
...
Steps to use PBS:
1. Create a job script for each experiment, containing the PBS options that request the needed resources (e.g. number of processors, wall-clock time) and the user commands that prepare for running the executable (e.g. cd to the working directory).
2. Submit the job script file to the PBS queue: qsub prog1.sh
3. Monitor the job
First example: job1.sh jobfile
#!/bin/bash
#PBS -N MyAppName
#PBS -l nodes=1
#PBS -l walltime=00:01:00
#PBS -e /home/dgtest/dgtest0200/test.err
#PBS -o /home/dgtest/dgtest0200/test.out
#PBS -V
export PATH=$PATH:yourdir/bin
myprog data1 2 30
Where is your output file located?
Where is the screen output?
Jobfile for use on ZIA
#!/bin/s
#PBS -N helloMPI
#PBS -o hello.out
#PBS -e hello.err
#PBS -l select=1:ncpus=4
#PBS -l place=free:shared
cd /home/<username>/test
mpirun -np 4 /home/<username>/test/hello
PBS Options
#PBS -N myJob
◦ Assigns a job name. The default is the name of the PBS job script.
#PBS -l nodes=4:ppn=2
◦ The number of nodes and processors per node.
#PBS -l walltime=01:00:00
◦ The maximum wall-clock time during which this job can run.
#PBS -o mypath/my.out
◦ The path and file name for standard output.
#PBS -e mypath/my.err
◦ The path and file name for standard error.
#PBS -j oe
◦ Join option that merges the standard error stream with the standard output stream.
PBS Options
#PBS -k oe
◦ Defines which output of the batch job to retain on the execution host.
#PBS -W stagein=file_list
◦ Copies the listed files onto the execution host before the job starts.
#PBS -W stageout=file_list
◦ Copies the listed files from the execution host after the job completes.
#PBS -r n
◦ Indicates that a job should not rerun if it fails.
#PBS -V
◦ Exports all environment variables to the job.
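The options on the two slides above can be combined in one job file. The following is a sketch, assuming a Torque/OpenPBS-style installation; the resource values are placeholders, and `./myprog` in the final comment is a hypothetical program. `PBS_O_WORKDIR`, `PBS_NODEFILE`, and `PBS_JOBID` are environment variables that PBS sets inside a running job; the fallbacks let the script also run by hand outside PBS.

```bash
#!/bin/bash
#PBS -N myJob
#PBS -l nodes=4:ppn=2
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -r n
#PBS -V

# PBS_O_WORKDIR is the directory qsub was run from.
# The fallback to $PWD only matters when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}"

# PBS_NODEFILE typically holds one line per allocated processor slot.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    nprocs=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
else
    nprocs=1
fi
echo "job ${PBS_JOBID:-interactive} running with $nprocs processor slot(s)"

# The real work would go here, for example (hypothetical program name):
# mpirun -np "$nprocs" ./myprog
```

Because of `-j oe`, standard error ends up in the same file as standard output, so no separate `-e` path is given here.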
Procedure
Use command line
◦ Use an editor to create an executable script: vi myExample.sh (use the first example code)
◦ Make myExample.sh executable: chmod +x myExample.sh
◦ Test your script: ./myExample.sh
Submit your script:
◦ qsub myExample.sh
◦ remember your job identifier, e.g. 96682
Monitor / Control a Job
Check whether your job runs
qstat
qstat -a
◦ check status of jobs, queues, and the PBS server
qstat -f
◦ get all the information about a job, e.g. resources requested, resource limits, owner, source, destination, queue, etc.
qdel job.ID
◦ delete a job from the queue
qhold job.ID
◦ hold a job if it is in the queue
qrls job.ID
◦ release a job from hold
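Monitoring can also be scripted. The sketch below polls `qstat` until a job is no longer listed. The function name `wait_for_job` is hypothetical, and the loop assumes that `qstat <jobid>` exits with a non-zero status once the server no longer knows the job, which is common on Torque/OpenPBS but depends on site settings (some sites keep completed jobs visible for a while).

```bash
#!/bin/bash
# Poll the queue until a job is no longer listed.
# Assumption: "qstat <jobid>" returns non-zero once the job is unknown to the server.

wait_for_job() {
    local jobid=$1
    local interval=${2:-30}     # seconds between checks
    while qstat "$jobid" > /dev/null 2>&1; do
        sleep "$interval"
    done
    echo "job $jobid is no longer in the queue"
}

# Usage on a cluster with PBS installed:
#   jobid=$(qsub myExample.sh)
#   wait_for_job "$jobid" 60
```

A long interval keeps the load on the PBS server low; polling every few seconds from many users is usually discouraged.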
Exercise
Problem: Given 10000 html pages, count the frequency of all words and report it as:
keyword frequency
Keyword1 frequency1
...
Use PBS to submit 100 jobs
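One possible starting point for the per-page counting step, as a sketch only: the function name `count_words` is hypothetical, tags are removed with a crude `sed` pattern that does not handle tags spanning several lines, and words are lower-cased so that "Hello" and "hello" count as the same keyword. The output uses the "keyword frequency" layout from the problem statement.

```bash
#!/bin/bash
# Count word frequencies in one html page and print "keyword frequency" lines.
# Assumption: a word is a run of letters; anything inside <...> on one line is dropped.

count_words() {
    sed 's/<[^>]*>/ /g' "$1" |          # crude removal of html tags
        tr -cs '[:alpha:]' '\n' |       # each run of non-letters becomes one newline
        tr '[:upper:]' '[:lower:]' |    # fold case
        sed '/^$/d' |                   # drop empty lines
        sort | uniq -c |                # count identical words
        awk '{ print $2, $1 }'          # reorder to "keyword frequency"
}

# Example: count_words page1.html > page1.freq
```

Each PBS job would then call this function for its share of the pages, and a final step would merge the per-page counts.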
How to submit 100 jobs
Typical ways:
1. read file list
2. for each file, create a job file, and submit it to the PBS queue
Write a bash script which submits jobs for different datasets
Write a perl script to submit jobs
Write a C program to submit jobs
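The bash-script route could look like the sketch below. It assumes a text file with one input file name per line, cuts it into chunks, writes one job file per chunk, and submits each job only where `qsub` is installed. The names `make_jobs` and `count_words.sh` and the resource values are hypothetical.

```bash
#!/bin/bash
# Sketch: split a list of input files into chunks and create one PBS job file per chunk.
# Assumptions: "count_words.sh" is a hypothetical per-file program; resources are placeholders.
# Note: if the list does not divide evenly, fewer than num_jobs job files may be created.

make_jobs() {
    local list=$1 num_jobs=$2 job_dir=$3
    local total per_job n=0 chunk job

    mkdir -p "$job_dir"
    total=$(wc -l < "$list" | tr -d ' ')
    per_job=$(( (total + num_jobs - 1) / num_jobs ))   # ceiling division

    # 1. read the file list and cut it into pieces of per_job lines each
    split -l "$per_job" "$list" "$job_dir/chunk_"

    # 2. for each piece, create a job file and submit it
    for chunk in "$job_dir"/chunk_*; do
        n=$(( n + 1 ))
        job="$job_dir/job_$n.sh"
        cat > "$job" <<EOF
#!/bin/bash
#PBS -N wordcount_$n
#PBS -l nodes=1
#PBS -l walltime=00:10:00
#PBS -j oe
cd "\${PBS_O_WORKDIR:-\$PWD}"
while read -r page; do
    ./count_words.sh "\$page"
done < "$chunk"
EOF
        chmod +x "$job"
        if command -v qsub > /dev/null 2>&1; then
            qsub "$job"          # submit only where PBS is installed
        fi
    done
    echo "$n job files created in $job_dir"
}

# Example: make_jobs filelist.txt 100 jobs
```

With 10000 file names and 100 jobs, each job file would process 100 pages. If `job_dir` is a relative path, the jobs need to be submitted from the directory in which `make_jobs` was run.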
Quick psub
Psub is a perl script that can wrap a command line program into a job file and submit it to the cluster queue
> psub jobname.sh "prog.pl -i=1"
This will create a job file "jobname.sh" and submit it to the server for running. No need to edit a job file anymore.
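The source of psub is not shown in the slides, so the following is not psub itself, only a shell sketch of the same idea: take a job file name and a command line, write a job file around the command, and submit it. The function name `wrap_job` and the PBS options chosen are assumptions.

```bash
#!/bin/bash
# Shell sketch of the idea behind psub (NOT the actual psub script):
# wrap a command line into a job file, then submit it if qsub is available.

wrap_job() {
    local jobfile=$1
    local cmd=$2
    {
        printf '%s\n' '#!/bin/bash'
        printf '%s\n' "#PBS -N $(basename "$jobfile" .sh)"
        printf '%s\n' '#PBS -l nodes=1'
        printf '%s\n' '#PBS -j oe'
        printf '%s\n' 'cd "${PBS_O_WORKDIR:-$PWD}"'
        printf '%s\n' "$cmd"
    } > "$jobfile"
    chmod +x "$jobfile"

    if command -v qsub > /dev/null 2>&1; then
        qsub "$jobfile"
    else
        echo "qsub not found; wrote $jobfile without submitting" >&2
    fi
}

# Example: wrap_job jobname.sh "prog.pl -i=1"
```

Keeping the generated job file on disk makes it possible to inspect or resubmit it later.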
Local Disk of Computing Node
Normally, the computing nodes of a cluster can directly read and write files on NFS storage space.
If your program has intensive read/write operations, reading and writing to an NFS directory will cause heavy network traffic.
Solution: direct your input and output to local directories on the computing nodes and, after execution, copy the result files to the NFS directory.
Local directories: /temp, /tmp, /state/partition1
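The staging pattern can be sketched as a shell function for use inside a job file. The name `run_on_local_disk`, the variable `SCRATCH_BASE`, and the use of `sort` as a stand-in for an I/O-intensive program are assumptions; the scratch location defaults to /tmp and could be pointed at /state/partition1 instead.

```bash
#!/bin/bash
# Sketch: do the I/O-heavy work on the node's local disk, then copy the result to NFS once.
# Assumptions: SCRATCH_BASE defaults to /tmp; "sort" stands in for the real program.

run_on_local_disk() {
    local nfs_dir=$1 input=$2
    local scratch

    # private local directory on the compute node
    scratch=$(mktemp -d "${SCRATCH_BASE:-/tmp}/job.XXXXXX") || return 1

    cp "$nfs_dir/$input" "$scratch/" || return 1        # one read from NFS
    ( cd "$scratch" && sort "$input" > result.txt )     # intermediate I/O stays local
    cp "$scratch/result.txt" "$nfs_dir/result.txt"      # one write back to NFS
    rm -rf "$scratch"                                   # free the local disk for other jobs
}

# In a job file this could be called as, for example:
#   run_on_local_disk "$PBS_O_WORKDIR" input.txt
```

Cleaning up the scratch directory matters because local disks on compute nodes are shared by all jobs that land on that node.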
Summary
Type I parallel computing application
How PBS works in Linux Cluster Computers
How to submit jobs to Linux clusters
Homework
Programming Problem: Given an html page, count the frequency of all words and report it as:
keyword frequency
Keyword1 frequency1
...
Use PBS to submit 100 jobs to count frequency for 10000 html pages in next Lab session.
Homework
Learn how to compile C programs on Linux
Learn how to create PBS job file
Learn how to submit jobs
Learn how to submit multiple jobs
Learn how to compile and run MPI program on NICK