disk directed i/o for mimd multiprocessors
Post on 03-Feb-2016
Disk Directed I/O for MIMD Multiprocessors
David Kotz
Department of Computer Science
Dartmouth College
Overview
• Introduction
• Collective I/O
• Implementation
• Experiments
• Results
• Example
• Conclusions
• Questions
Introduction
Scientific applications
Traditional I/O
Disk directed I/O
MIMD Architecture with CPs and IOPs (Message Passing)
Collective I/O
Drawbacks of traditional file system
• Independent requests
• Separate file-system call
• Data declustering
Collective I/O Interface
• CPs cooperate
• Provides a high-level interface
• Enhances performance
Collective I/O (Contd.)
Implementation alternatives
• Traditional caching
• Two-phase I/O
• Disk-directed I/O
Traditional caching
• No explicit interface
• Each application calls the IOP
Two-phase I/O
• CPs collectively determine and carry out the optimized approach
• Data layout must be the same in the processors and on the disks
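The two-phase idea can be sketched in a few lines. This is a minimal single-process illustration, not the implementation from the talk: it assumes the file is a flat list, that phase one has each CP read one contiguous chunk (a layout that matches the file), and that phase two permutes the data among CPs into a CYCLIC distribution. The function name `two_phase_read` and its parameters are hypothetical.

```python
def two_phase_read(file_data, num_cps):
    """Simulate a two-phase read that ends in a CYCLIC distribution."""
    n = len(file_data)
    chunk = -(-n // num_cps)  # ceiling division: elements per CP in phase one

    # Phase 1: each CP reads one contiguous chunk of the file.
    staged = [file_data[cp * chunk:(cp + 1) * chunk] for cp in range(num_cps)]

    # Phase 2: CPs exchange data so element i ends up on CP (i mod num_cps).
    final = [[] for _ in range(num_cps)]
    for cp, part in enumerate(staged):
        for offset, value in enumerate(part):
            index = cp * chunk + offset
            final[index % num_cps].append(value)
    return final


if __name__ == "__main__":
    print(two_phase_read(list(range(8)), num_cps=4))
```

In a real system the second phase would be message passing among CPs; here it is only a loop over lists.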
Collective I/O (Contd.)
Traditional Caching
Collective I/O (Contd.)
Two-Phase I/O
Collective I/O (Contd.)
Disk-directed I/O
Collective I/O (Contd.)
Disk-directed I/O
• The I/O can conform not only to the logical layout but also to the physical layout.
• If the disks are RAIDs, the I/O can be organized to perform full-stripe writes for maximum performance.
• Only one I/O request to each IOP.
• There is no communication among the IOPs.
• Disk scheduling is improved by sorting the block list for each disk.
• Two buffers per CP per disk per file.
Implementation
• Files were striped across all disks, block by block
• Each IOP served more than one disk
• Message passing and DMA
• Each message was encoded
• Each request contained a reply action
• Memget and Memput messages
• Proteus simulator on a DEC-5000 workstation
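Block-by-block striping can be written as a small address calculation. The sketch below assumes round-robin placement of file blocks across disks and that consecutive disks are attached to the same IOP; the slide does not state the disk-to-IOP assignment, so that part, along with the name `locate_block`, is an assumption.

```python
def locate_block(file_block, num_disks, disks_per_iop):
    """Map a file block number to (iop, disk, block index on that disk)."""
    disk = file_block % num_disks          # round-robin across all disks
    local_block = file_block // num_disks  # position within that disk's share
    iop = disk // disks_per_iop            # assumed: consecutive disks share an IOP
    return iop, disk, local_block


if __name__ == "__main__":
    for b in range(6):
        print(b, locate_block(b, num_disks=4, disks_per_iop=2))
```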
Implementation (Contd.)
Simulation parameters
Implementation (Contd.)
Implementing DDIO
• The IOP creates a new thread for each request
• The thread computes the disk blocks, sorts them by location, and informs the disk threads
• Allocates two one-block buffers for each local disk
• Creates a thread to manage each buffer
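The block-list step of the DDIO request handling above might look like the following. It is a sketch only: threads and the two-buffer pipeline are left out, striping is assumed to be round-robin, and `physical_location` is a hypothetical table giving each file block's position on its disk.

```python
def plan_disk_directed_read(file_blocks, num_disks, physical_location):
    """Group requested file blocks by disk, then sort each list by location."""
    per_disk = {}
    for block in file_blocks:
        # Assumed round-robin striping: block b lives on disk (b mod num_disks).
        per_disk.setdefault(block % num_disks, []).append(block)
    for blocks in per_disk.values():
        # Sorting by physical location lets the disk head sweep in one direction.
        blocks.sort(key=lambda blk: physical_location[blk])
    return per_disk


if __name__ == "__main__":
    # Hypothetical random layout: file block -> physical address on its disk.
    layout = {0: 70, 1: 5, 2: 10, 3: 90, 4: 40, 5: 20, 6: 30, 7: 1}
    print(plan_disk_directed_read(range(8), num_disks=2, physical_location=layout))
```

With a contiguous layout the sort leaves each list in file order; the benefit shows up with a random layout, where file order and disk order differ.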
Implementing traditional caching
• CPs didn't cache or prefetch data
• CPs send concurrent requests to the relevant IOPs
• Each IOP maintained a double buffer to satisfy requests from each CP to each disk
Experiments
Different configurations
• File access patterns
• Disk layout
• Number of CPs
• Number of IOPs
• Number of disks
File and disk layout
• A 10 MB file was striped across the disks block by block
• Both contiguous and random layouts were used
Experiments (Contd.)
1D and 2D matrices are used
Access patterns
• NONE: a dimension of the array is mapped entirely to one CP
• BLOCK: distributed among the CPs in contiguous blocks
• CYCLIC: distributed round-robin among the CPs
Record sizes of 8 bytes and 8192 bytes were used
HPF array distribution
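For a one-dimensional array, the three distributions reduce to a simple owner function. The sketch below is illustrative; it assumes BLOCK uses equal blocks of ceil(n/p) elements and that NONE places everything on CP 0. The function name `owner` is hypothetical.

```python
def owner(index, num_elements, num_cps, distribution):
    """Return the CP that holds element `index` of a 1D array."""
    if distribution == "NONE":
        return 0  # assumed: the undistributed dimension lives on CP 0
    if distribution == "BLOCK":
        block_size = -(-num_elements // num_cps)  # ceiling division
        return index // block_size
    if distribution == "CYCLIC":
        return index % num_cps
    raise ValueError(f"unknown distribution: {distribution}")


if __name__ == "__main__":
    for kind in ("NONE", "BLOCK", "CYCLIC"):
        print(kind, [owner(i, 8, 4, kind) for i in range(8)])
```

With 8-byte records, a CYCLIC pattern makes each CP touch many small, scattered pieces of the file; with 8192-byte records the pieces are as large as or larger than the block sizes listed for the LU example.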
Experiments (Contd.)
Contiguous disk layout
Example
LU Decomposition
• Used in solving linear systems of equations
• An N x N matrix M is decomposed into L (lower triangular) and U (upper triangular), such that LU = M
• Columns are stored in the processors' memory
• Each processor's subset of columns is called a "slab"
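To make LU = M concrete, here is a small in-memory factorization. It is a plain Doolittle elimination without pivoting, chosen for brevity; it ignores the slab-based, file-backed organization of the example in the talk and assumes no zero pivots occur. The name `lu_decompose` is hypothetical.

```python
def lu_decompose(m):
    """Factor a square matrix into unit-lower L and upper U so that L*U = m."""
    n = len(m)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(x) for x in row] for row in m]
    for k in range(n):
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]  # assumes a nonzero pivot
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


if __name__ == "__main__":
    L, U = lu_decompose([[4, 3], [6, 3]])
    print(L)
    print(U)
```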
Example (Contd.)
Performance measurement
• 8 CPs, 8 IOPs and 8 disks, one for each IOP
• 4 MB of matrix data
• Slab size of 16, 32 or 128 columns
• Random or contiguous layout
• Block size of 1 KB, 4 KB or 8 KB
• The traditional file system used 128 blocks of total cache
• The disk-directed file system used 16 blocks of total buffer space
Results
• DDIO always improved the performance of the LU decomposition, with both contiguous and random layouts
Related work
PIFS
• Data flow is controlled by the file system rather than by the application.
Jovian collective-I/O library
• Combines fragmented requests from many CPs into larger requests that can be passed to the IOPs.
Transparent Informed Prefetching (TIP)
• Enables applications to submit detailed hints about their future file activity to the file system, which can use the hints for accurate, aggressive prefetching.
TickerTAIP RAID controller
• Uses collaborative execution similar to the disk-directed file system.
Conclusions
Disk-directed I/O avoided many of the pitfalls inherent in the traditional caching method, such as thrashing and extraneous disk-head movements.
It presorts disk requests to optimize head movement and has smaller buffer requirements.
It is most valuable when making large, collective transfers of data between multiple disks and multiple memories.
Questions
What is collective I/O?
What are the advantages of disk-directed I/O?