
Unix performance benchmarking

Isolating application performance issues

Establishing performance benchmarks

Why bother?

• Identify issues with shared Unix resources

• Understanding your SAS processes

• Helping System Administrators who don’t understand SAS

• Prove that something is wrong

Measurements

• Disk Read/Write (I/O)

• Memory

• CPU

SAS vs non-SAS

• Disk IO – Not many options within SAS code (bufno, bufsize)

• Memory – no Unix equivalent to some SAS features (realmemsize, sortsize)

• CPU – SAS threaded kernel too complex to replicate in Unix

Basic SAS Disk IO testing

options fullstimer ;

/* Write performance */
libname outlib '/disk_output_path' ;
data outlib.mybigfile ;
  do n = 1 to 100000 ;
    randnum = ranuni(0) ;
    output ;
  end ;
run ;

/* Read performance */
data _null_ ;
  set outlib.mybigfile ;
run ;

Basic Unix IO testing

$ dd if=/path_to_big_file of=/path_to_write_location bs=64k

Or

$ ./iotest.sh -t /path_to_write_location
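A minimal dd-based sketch of the write and read tests above; the block size, count, and /tmp scratch path are examples only, and dd prints its own transfer-rate summary on stderr when it finishes:

```shell
# Minimal dd write/read timing; bs, count, and the scratch path are examples.
TESTFILE=${TESTFILE:-/tmp/iotest_scratch.$$}

# Write test: stream zeros to the target filesystem, syncing at the end.
dd if=/dev/zero of="$TESTFILE" bs=64k count=160 conv=fsync

# Read test: stream the file back to /dev/null.
dd if="$TESTFILE" of=/dev/null bs=64k

rm -f "$TESTFILE"
```

Note the read immediately after the write will largely hit the filesystem cache, which is exactly the distortion the caching discussion below addresses.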

Disk caching

• Most Unix servers are 64-bit and have plenty of memory.

• Disk IO is almost always the bottleneck

• Modern Unix uses spare memory as a disk cache

• Performance is always great after a reboot

• Caching distorts IO performance measurement

Real Unix IO testing

• Disable caching (Linux)

• Lots of concurrent processes

• Large files (20 GB) created by each process

• Capture results
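On Linux, disabling the cache before each read test means flushing it via the kernel's drop_caches control file; a sketch (actually writing the control file requires root):

```shell
# Sketch: flush the Linux filesystem cache so read tests measure the disk,
# not memory. Writing /proc/sys/vm/drop_caches requires root.
sync                                    # flush dirty pages to disk first
if ( echo 3 > /proc/sys/vm/drop_caches ) 2>/dev/null; then
    echo "cache dropped"                # 3 = pagecache + dentries + inodes
else
    echo "could not drop caches (need root)"
fi
```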

File systems

• Local (ext4)

• Shared (gfs2, gpfs)

• External (xfs) – includes Fusion IO

• Temp (tmpfs)

On Linux, use df -hT

Typical results

Minimum SAS requirements

• ETL – 50-75 MB/s per CPU core

• Ad hoc analytics – 15-25 MB/s per CPU core

• SASWORK – 50-75 MB/s per CPU core
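The per-core minimums above translate directly into an aggregate bandwidth target for a given server. A quick shell sketch, where the 16-core count is an invented example:

```shell
# Rough aggregate-bandwidth sizing from the per-core minimums above.
# CORES=16 is a hypothetical server; substitute your own core count.
CORES=16
ETL_MBS=75        # upper end of the ETL range, MB/s per core
ADHOC_MBS=25      # upper end of the ad hoc analytics range
echo "ETL filesystem:   $((CORES * ETL_MBS)) MB/s"     # 16 * 75 = 1200
echo "Ad hoc analytics: $((CORES * ADHOC_MBS)) MB/s"   # 16 * 25 = 400
```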

Comparing channels

Running tests

• Schedule a quiet time

• Start small: 3 concurrent tests of 2 GB against /saswork:

$ ./iotestv1.sh -i 3 -t /saswork -b 15625 -s 128

• $ df /write_location_mountpoint

• $ du -s /path_to_write_location

• Clean up afterwards:

$ find /saswork -type f -name 'iotest*' -user $USER -exec rm -f {} \; 2>&1 | grep -v 'Permission denied'
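The run/check/clean steps above could be wrapped into a single sweep. A hypothetical driver, assuming iotestv1.sh sits in the current directory and takes the options shown earlier; TARGET defaults to /tmp here so the sketch is safe to try anywhere:

```shell
# Hypothetical sweep driver for iotestv1.sh; the stream counts, block sizes,
# and /tmp default are examples. Point TARGET at /saswork for real runs.
TARGET=${TARGET:-/tmp}
for streams in 3 4 5; do
  for blksize in 64 128; do
    df "$TARGET"                           # free space before this run
    if [ -x ./iotestv1.sh ]; then          # run only if the script is present
      ./iotestv1.sh -i "$streams" -t "$TARGET" -b 156250 -s "$blksize"
    fi
    du -s "$TARGET" 2>/dev/null            # space consumed by test files
  done
done
# Remove the generated files, ignoring other users' leftovers.
find "$TARGET" -maxdepth 1 -type f -name 'iotest*' -user "$USER" \
     -exec rm -f {} \; 2>/dev/null
echo "sweep complete"
```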

Gathering results

Example script output in listing

dc5cad,07Jan2014:00:09:37,3,64,312500,60,30.17,/saswork,iotestv1.sh-writetest.out.2

dc5cad,07Jan2014:00:09:37,3,64,312500,60,2.02,/saswork,iotestv1.sh-readtest.out.2
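These comma-separated lines can be split with awk; a sketch, where the meaning assigned to each column (host, timestamp, streams, block size in KB, block count, target path, output file) is inferred from the sample lines, not documented:

```shell
# Split the iotest CSV output into labelled fields.
# Field interpretation is an assumption based on the sample lines above.
printf '%s\n' \
  'dc5cad,07Jan2014:00:09:37,3,64,312500,60,30.17,/saswork,iotestv1.sh-writetest.out.2' \
  'dc5cad,07Jan2014:00:09:37,3,64,312500,60,2.02,/saswork,iotestv1.sh-readtest.out.2' |
awk -F, '{ printf "host=%s streams=%s blksize=%sKB blocks=%s target=%s file=%s\n",
           $1, $3, $4, $5, $8, $9 }'
```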

Processing results

hostname  streams  blksize  blocks  target    mode  dtime             iteratn  filesz    elapsed  thruput

dc5cad    3        64       312500  /saswork  R     07JAN14:00:09:37  1        20000000  62.04    322372.66
dc5cad    3        64       312500  /saswork  R     07JAN14:00:09:37  2        20000000  62.02    322476.62
dc5cad    3        64       312500  /saswork  R     07JAN14:00:09:37  3        20000000  64.36    310752.02
dc5cad    3        64       312500  /saswork  W     07JAN14:00:09:37  1        20000000  90.18    221778.66
dc5cad    3        64       312500  /saswork  W     07JAN14:00:09:37  2        20000000  90.17    221803.26
dc5cad    3        64       312500  /saswork  W     07JAN14:00:09:37  3        20000000  48       416666.67
dc5cad    3        128      156250  /saswork  R     07JAN14:00:11:48  1        20000000  50.15    398803.59
dc5cad    3        128      156250  /saswork  R     07JAN14:00:11:48  2        20000000  50.1     399201.6
dc5cad    3        128      156250  /saswork  R     07JAN14:00:11:48  3        20000000  50.03    399760.14
dc5cad    3        128      156250  /saswork  W     07JAN14:00:11:48  1        20000000  79.7     250941.03
dc5cad    3        128      156250  /saswork  W     07JAN14:00:11:48  2        20000000  79.13    252748.64
dc5cad    3        128      156250  /saswork  W     07JAN14:00:11:48  3        20000000  79.14    252716.7
dc5cad    4        64       312500  /saswork  R     07JAN14:00:14:49  1        20000000  70.21    284859.71

Results for interpretation and analysis

At this point, the testing has produced a set of results: groups of tests, varying by:

– The server used for execution;

– How many concurrent streams were executed;

– The block size of the files being transferred;

– Whether the data were being written to or read from disk.

File size is constant (20 GB).

The measurement available for analysis is the elapsed time.

Diagrammatically, it looks like this:

• If we have two concurrent streams, each reading 20 GB, in 80 seconds, our throughput is logically 40 GB in 80 seconds, or 0.5 GB/second (500 MB/second).

• In the case of four concurrent streams, it would be 80 GB in 80 seconds, or 1 GB / second.

Therefore, for each group, we must collapse the iterations, selecting the MAX of elapsed time, the SUM of the data volumes (individually always 20 GB in our case), and keeping the host name, number of streams, block size, and mode as classification variables.

[Diagram: one 20 GB block per concurrent stream, all spanning the same 80-second window]

And we need to calculate Throughput as Volume / Elapsed, to use as the analysis variable in our graphs.
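The collapse described above can be sketched with awk: per (host, streams, blksize, mode) group, sum the volumes, take the max elapsed time, and divide. The input rows here are a simplified cut of the first read group from the listing, with filesz taken to be in KB:

```shell
# Collapse iterations per group: SUM of volumes, MAX of elapsed times,
# throughput = volume / elapsed. Columns: host streams blksize mode filesz elapsed.
printf '%s\n' \
  'dc5cad 3 64 R 20000000 62.04' \
  'dc5cad 3 64 R 20000000 62.02' \
  'dc5cad 3 64 R 20000000 64.36' |
awk '{
    key = $1 " " $2 " " $3 " " $4
    vol[key] += $5                       # SUM of data volumes
    if ($6 > el[key]) el[key] = $6       # MAX of elapsed times
  }
  END {
    for (k in vol)
      printf "%s volume=%dKB elapsed=%.2fs thruput=%.0fKB/s\n",
             k, vol[k], el[k], vol[k] / el[k]
  }'
```

For this group the result is 60,000,000 KB in 64.36 s, about 932,256 KB/s, which becomes the analysis variable in the graphs.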

Impact of block size, 64 KB and 128 KB

Duration increases with concurrent streams

Throughput is independent of number of concurrent streams

Reading data

Writing data

Throughput reading data is much higher than writing

Does server performance vary?

Reading data

Writing data

References

• Margaret Crevar’s definitive guide to performance tuning: http://support.sas.com/rnd/papers/sgf07/sgf2007-iosubsystem.pdf

Authors

• Tom Kari: tom.kari.consulting@bell.net

• Andrew Farrer: acfarrer@gmail.com

Acknowledgements

• Dan Gelinas, IBM Canada for deep insights into the filesystem cache

• Clifford Myers, SAS Institute. The original author of iotest.sh
