IBM ATS Deep Computing

© 2007 IBM Corporation

High Performance IO

HPC Workshop – University of Kentucky

May 9, 2007 – May 10, 2007

Andrew Komornicki, Ph.D.

Balaji Veeraraghavan, Ph.D.

Agenda

Introduction

General IO performance

Results of some small tests.

Modular IO libraries, Linux and AIX

I/O Optimization

Analyze the IO pattern

Determine optimization method

Optimize in user space

Minimize source code changes

Possibly relink with libtkio.so

General I/O Performance

C: Do not use fopen(), fread(), or fwrite()
• These are inefficient due to small (4KB) IO blocks and extra memory copies.

Use instead:
• POSIX open(), read(), write()
• Direct (raw) IO will eliminate an additional memory copy

FORTRAN: Use unformatted IO
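
The advice for C can be illustrated with a minimal sketch of a file copy that calls POSIX open(), read(), and write() directly with a large block. The function name copy_file_posix and the 1 MB block size are assumptions made for this example and do not come from the slides; a suitable block size depends on the file system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical block size for this sketch; tune it for the file system. */
#define COPY_BLOCK_SIZE (1024 * 1024)

/* Copy src to dst with POSIX read()/write() in large blocks.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_file_posix(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    long long total = -1;
    int in = -1, out = -1;
    char *buf = malloc(COPY_BLOCK_SIZE);
    if (buf == NULL)
        return -1;

    in = open(src, O_RDONLY);
    if (in < 0)
        goto done;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0)
        goto done;

    total = 0;
    for (;;) {
        ssize_t n = read(in, buf, COPY_BLOCK_SIZE);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            total = -1;
            break;
        }
        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry the write */
                total = -1;
                goto done;
            }
            off += w;
        }
        total += n;
    }

done:
    if (in >= 0)
        close(in);
    if (out >= 0)
        close(out);
    free(buf);
    return total;
}
```

A caller would use it as copy_file_posix("input.dat", "output.dat") and treat a return value of -1 as failure.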

Asynchronous IO, an example

Non-blocking IO
• aio_read(), aio_write(), aio_return()

Completion notification
• Polling with aio_error()
• Block until complete with aio_suspend()

Cancellation of IO requests
• aio_cancel()

Large file enabled
• Removes the 2GB file size limitation

POSIX conforming
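
As a rough sketch of how the calls listed above fit together, the following function queues one read with aio_read(), waits with aio_suspend(), checks the status with aio_error(), and collects the byte count with aio_return(). The function name aio_read_at is an assumption made for this example. On older glibc versions the program may need to be linked with -lrt.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Read up to len bytes at offset from path with POSIX asynchronous I/O.
 * Returns the number of bytes delivered, or -1 on error. */
ssize_t aio_read_at(const char *path, void *buf, size_t len, off_t offset)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = offset;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* no signal on completion */

    if (aio_read(&cb) != 0) {   /* queue the request; returns immediately */
        close(fd);
        return -1;
    }

    /* A real program would overlap computation here; this sketch only waits. */
    const struct aiocb *const list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS) {
        /* Block until the request completes or a signal interrupts the wait. */
        (void)aio_suspend(list, 1, NULL);
    }

    ssize_t n = -1;
    if (aio_error(&cb) == 0)
        n = aio_return(&cb);    /* collect the result exactly once */
    close(fd);
    return n;
}
```

The buffer must stay valid and untouched until aio_error() stops reporting EINPROGRESS, which is why the function does not return before the wait loop finishes.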

Results of Bonnie IO test

Run on Blade system in San Mateo Lab

System Memory, 5 Gbytes

File systems, ext2, and ext3

All tests are done in five stages:
• Writing with putc()
• Rewriting
• Writing intelligently
• Reading with getc()
• Reading intelligently

Results of Bonnie IO test, Block IO performance

Size (MB)   Write (Kbytes/sec)   Read (Kbytes/sec)
_________   __________________   _________________
2000        841,524              2,233,282
4000        83,237               1,658,013
8000        56,599               50,974
16000       49,656               50,677

Results of Bonnie IO test

Results for ext2 file system, time in seconds

Size (MB)   User    System   Elapsed
_________   _____   ______   _______
2000        48.7    10.8     84.3
4000        97.6    23.9     252.6
8000        194.9   56.4     1009.2
16000       388.4   111.8    2088.3

Results of Bonnie IO test

Results for ext3 file system, time in seconds

Size (MB)   User    System   Elapsed
_________   _____   ______   _______
2000        48.7    19.2     96.3
4000        97.7    45.5     265.8
8000        194.6   90.1     1016.9
16000       396.9   201.8    2058.3

Modular I/O (MIO)

Familiar and flexible runtime interface

MIO modules
• mio
• trace
• pf

MIO available on both Linux and AIX

MIO user code interface

open MIO_open

read MIO_read

write MIO_write

close MIO_close

lseek MIO_lseek

fcntl MIO_fcntl

ftruncate MIO_ftruncate

MIO run time interface

MIO_STATS="file name"
MIO_FILES=" *.dat* [trace|pf ] *.inp [aix]"
MIO_DEBUG="ALL"
MIO_DEFAULTS="trace/mbytes , pf/cache=10m"

trace module

summary of file activity

binary events file

low cpu overhead

typical options
• /stats
• /mbytes /gbytes /tbytes
• /events=mio.evt

pf module

User selectable cache size

User selectable page size

User selectable prefetch depth

Direct or system buffered IO

Global or private cache

Usage summary

pf module

detects sequential I/O

user memory buffering

options
• /global
• /cache_size=10m
• /page_size=1m
• /prefetch=1
• /stride=1
• /direct
• /stats

Relink with libtkio.a

libtkio.a has shared object members
• tkio.so, 32 bit and 64 bit

Entry points for
• open, open64, close, read, write, lseek, lseek64
• fcntl, ffinfo, fstat, fstat64, fstatfs, fsync
• ftruncate, ftruncate64
• unlink, aio_...

Default tkio behavior

Uses dlopen and dlsym for runtime linking

tkio entry   calls
__________   _____________________
open64       open64    libc(shr.o)
close        close     libc(shr.o)
read         read      libc(shr.o)
write        write     libc(shr.o)
lseek64      lseek64   libc(shr.o)
fsync        fsync     libc(shr.o)
…            …
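
The run-time linking step can be sketched in a few lines: a table of function pointers is filled by looking up the libc symbols with dlopen() and dlsym(). This is only an illustration of the general mechanism, not the tkio source; the structure io_entry_points and the function load_default_entry_points are hypothetical names. On older glibc versions the program may need to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical table of I/O entry points; not the real tkio structure. */
struct io_entry_points {
    ssize_t (*write_fn)(int, const void *, size_t);
    int     (*close_fn)(int);
};

/* Fill the table by resolving the symbols at run time.
 * Returns 0 on success, -1 if a symbol cannot be resolved. */
int load_default_entry_points(struct io_entry_points *tbl)
{
    assert(tbl != NULL);

    /* A NULL file name yields a handle for the main program and the
     * libraries already loaded with it, which includes libc. */
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL)
        return -1;

    /* POSIX permits converting the void * from dlsym() to a function pointer. */
    tbl->write_fn = (ssize_t (*)(int, const void *, size_t))dlsym(handle, "write");
    tbl->close_fn = (int (*)(int))dlsym(handle, "close");

    return (tbl->write_fn != NULL && tbl->close_fn != NULL) ? 0 : -1;
}
```

Once the table is filled, calls such as tbl.write_fn(fd, buf, n) go through the pointers, so a different library can be substituted by filling the table with other addresses.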

tkio runtime interface

setenv TKIO_ALTLIB so_name/print/abort

export TKIO_ALTLIB=so_name/print/abort

so_name is the name of a shared library
• Either name.so or libname.a(name.so)

tkio calls a function in so_name that returns a structure filled with I/O entry points to replace the default entry points

/print option prints a message to stderr indicating success of the load

/abort issues exit(-1) if the load is not successful

tkio using MIO

setenv TKIO_ALTLIB get_mio_ptrs_64.so

tkio entry   calls
__________   ___________________________
open64       MIO_open64    libmio(mio.o)
close        MIO_close     libmio(mio.o)
read         MIO_read      libmio(mio.o)
write        MIO_write     libmio(mio.o)
lseek64      MIO_lseek64   libmio(mio.o)
fsync        MIO_fsync     libmio(mio.o)
…

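[Diagram, labeled "Demonstration only". Boxes and labels: Application; Fortran I/O; stdio (fopen, fwrite, fread, fclose); libc (open64, write, read, lseek64, close); libtkio (->open64, ->write, ->read, ->lseek64, ->close); libmio (->MIO_open64, ->MIO_write, ->MIO_read, ->MIO_lseek64, ->MIO_close); kernel. One path is marked with an X.]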

[Diagram. Boxes and labels: libc (open64, write, read, lseek64, close); libtkio (->open64, ->write, ->read, ->lseek64, ->close); libmio (->MIO_open64, ->MIO_write, ->MIO_read, ->MIO_lseek64, ->MIO_close) with modules trace, pf, aix; kernel.]

System buffered Data Movement

[Diagram labels: user space, MIO space, kernel, system buffers, 256kb]

pf cached Data Movement

[Diagram labels: user space, MIO space, kernel, system buffers, 256kb, 5 x 2mb]

O_DIRECT Data Movement

[Diagram labels: user space, MIO space, kernel, system buffers, O_DIRECT, 256kb, 5 x 2mb]

Asynchronous Data Movement

[Diagram labels: user space, MIO space, kernel, system buffers, O_DIRECT, 256kb, 5 x 2mb]

MSC.NASTRAN trace output from program <-> pf

Trace close : program <-> pf : /bmwfs/cdh108.T20536_13.SCR300 : (281946/2162.61)=130.37 mbytes/s
current size=0 max_size=16277
mode =0777 sector size=4096
oflags =0x302=RDWR CREAT TRUNC
open         1      0.01
write   478193    462.10    59774    59774   131072   131072
read   1777376   1700.48   222172   222172   131072   131072
seek    911572      2.83
fcntl        3      0.00
trunc       16      0.40
close        1      0.03
size    127787

Callouts on the slide:
• Number of occurrences
• Mbytes requested and Mbytes delivered
• Min/Max request size in bytes
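
The rate in the trace header can be reproduced from the table: the Mbytes delivered for write and read add up to 59774 + 222172 = 281946, and dividing by the 2162.61 seconds shown in the header gives 130.37 mbytes/s. A minimal sketch of that arithmetic follows; the function name trace_rate_mbytes_per_s is an assumption made for this example and is not part of MIO.

```c
#include <assert.h>

/* Aggregate rate in Mbytes/s: Mbytes moved divided by the elapsed seconds. */
double trace_rate_mbytes_per_s(double mbytes_written, double mbytes_read,
                               double seconds)
{
    assert(seconds > 0.0);
    return (mbytes_written + mbytes_read) / seconds;
}
```

Called as trace_rate_mbytes_per_s(59774.0, 222172.0, 2162.61), it returns a value that rounds to the 130.37 reported in the header.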

MSC.NASTRAN trace output from pf <-> aix

Trace close : pf <-> aix : /bmwfs/cdh108.T20536_13.SCR300 : (276645/1460.73)=189.39 mbytes/s
current size=0 max_size=16276
mode =0777 sector size=4096
oflags =0x8000302=RDWR CREAT TRUNC DIRECT
open          1     0.01
write      4382   154.86      684      684    131072   2097152
awrite    33390     1.42    58491    58491    131072   2097152
suspend   33390   240.00   242.27 mbytes/s
read       5178   272.71    10354    10354   1048576   2097152
aread    103560     5.70   207115   207115    524288   2097152
suspend  103560   786.04   261.59 mbytes/s
seek     136950     0.00
fcntl         3     0.00
trunc        16     0.40
close         1     0.00
size      11013
pages    138477

MSC.NASTRAN pf output

pf close for /bmwfs/cdh108.T20536_13.SCR300
global cache 0: 150 pages of 2097152 bytes
29739/29749 pages not preread for write
138316/139754 prefetches : prefetch=3
29576 write behinds
478193 writes
1777376 reads
page writes 37772/33124
mbytes transferred
program --> 59774 --> pf --> 59176 --> aix
program <-- 222172 <-- pf <-- 217469 <-- aix

DataView file activity plot

[Plot axes: time (seconds), file position (bytes)]

DataView file activity plot

[Plot axes: time (seconds), file position (bytes)]

Asynchronous I/O plotting

[Plot axes: time (seconds), file position (bytes). Annotations: suspend time, hidden time, queuing time]

cache page activity

[Plot axes: time (seconds), file position (bytes)]

MSC.Nastran performance gains

16 cpu 32GB NH2 node

2.2M dof, 767GB I/O, 8 copies

2GB memory per copy

114MB/sec 198MB/sec

8 SSA, 16 loops, 4 disk/loop

MIO Summary

Demonstrated performance gains

Simple to implement

Flexible run time interface

Delivered as a shared object library

Contact: bauerj@us.ibm.com
