Andrew Hanushevsky 7-Feb-2000 1
Disk Cache Management in Large-Scale Object Oriented Databases
http://www.slac.stanford.edu/~abh/CHEP2000/Cache/
Andrew Hanushevsky
Stanford Linear Accelerator Center
Produced under contract DE-AC03-76SF00515 between Stanford University and the Department of Energy
Motivation
Problem
  More data (>2 PB) than affordable disk space (<300 TB)
Realization
  Only about 10% of the data is used at any one time
Solution
  Hierarchical Mass Storage System
  Most data on tape (cheap), in-use data on disk (expensive)
Problem (it’s all circular)
  Effectively manage the disk cache to keep the most useful data
  Disk cache performance
Basic Disk Caching Architecture
[Figure: architecture diagram with labels Control, Data, Database Management, Cache Management]
The Direct Solution: One Big Filesystem
Volume Manager + Journaled File System (e.g., Veritas)
  Catenates disk devices to form very large capacity logical devices
  High performance (60+ MB/sec) journaled file system for fast recovery
  Allows for fast streaming I/O and efficient small block transfers
Problems
  Low random access performance
  Limited to 1 TB of cache/filesystem in most implementations
  Unpredictable load balancing
The Indirect Solution: Multiple Smaller Filesystems
Still need a Volume Manager + Journaled File System
  But can spread the load across multiple heads and I/O adapters
  Virtually unlimited cache size
Problems
  Need to manage multiple filesystems
  Need tools to balance the load
    If not done automatically
Supporting Multiple Filesystems
[Figure: the index entry /databases/mydbfile is a symlink to the data file /cache1/databases:mydbfile; /cache2 and /cache3 are further data areas]
Index area
  Optional data cache
  Default data area
Data area
  Any number
  Any size
  Chosen based on free space in LRU order
Multiple independent filesystems
Naming convention allows for audit and index recovery
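
The index-to-data mapping in the figure can be sketched in Python. This is a minimal illustration, not the actual implementation: the helper names `data_file_name`, `logical_path_from_data_file`, and `link_into_index`, and the `index_root` parameter, are hypothetical. The rule "drop the leading slash and replace `/` with `:`" is inferred from the single example on the slide (/databases/mydbfile stored as /cache1/databases:mydbfile).

```python
import os


def data_file_name(logical_path):
    """Flatten a logical path into one file name (assumed rule: '/' becomes ':')."""
    return logical_path.strip("/").replace("/", ":")


def logical_path_from_data_file(name):
    """Invert data_file_name; this is what makes index recovery possible.

    Assumes logical paths never contain ':' themselves.
    """
    return "/" + name.replace(":", "/")


def link_into_index(index_root, data_area, logical_path):
    """Create the index symlink that points the wanted name at the data file."""
    target = os.path.join(data_area, data_file_name(logical_path))
    link = os.path.join(index_root, logical_path.lstrip("/"))
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(target, link)
    return link, target
```

Because the data file name encodes the full logical path, a lost index entry can be rebuilt from the data areas alone, and an audit can check each data file against the link that should point to it.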
Staging Manager
Copies files into the cache
  Uses index space to link wanted name to actual file location
  Uses allocation manager to select target filesystem
  Uses lock manager to serialize access to target files & directories
  Uses resource manager to control tape drive usage
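
One way the four steps could fit together is sketched below. Everything here is an assumption made for illustration: the `StagingManager` class, the `fetch` callback standing in for the copy from mass storage, a semaphore as the resource manager, one in-process lock per name as the lock manager, and "most free space" as the allocation rule. The real components are separate services and are not described in this detail on the slides.

```python
import os
import shutil
import tempfile
import threading


class StagingManager:
    """Hypothetical sketch of the staging steps; not the SLAC implementation."""

    def __init__(self, index_root, data_areas, tape_drives=1):
        self.index_root = index_root
        self.data_areas = list(data_areas)
        # Resource-manager stand-in: at most `tape_drives` concurrent fetches.
        self._drives = threading.BoundedSemaphore(tape_drives)
        # Lock-manager stand-in: one lock per wanted name.
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, logical_path):
        with self._locks_guard:
            return self._locks.setdefault(logical_path, threading.Lock())

    def _pick_area(self):
        # Allocation-manager stand-in: data area with the most free space.
        return max(self.data_areas, key=lambda a: shutil.disk_usage(a).free)

    def stage(self, logical_path, fetch):
        """Stage one file; `fetch(dest)` must copy it from mass storage to dest."""
        link = os.path.join(self.index_root, logical_path.lstrip("/"))
        with self._lock_for(logical_path):
            if os.path.exists(link):      # follows the symlink: already staged
                return link
            target = os.path.join(
                self._pick_area(), logical_path.strip("/").replace("/", ":"))
            with self._drives:
                fetch(target)
            os.makedirs(os.path.dirname(link), exist_ok=True)
            if os.path.islink(link):      # stale link whose data file is gone
                os.remove(link)
            os.symlink(target, link)
            return link


def demo():
    """Stage a fake file twice; the second call must not fetch again."""
    with tempfile.TemporaryDirectory() as root:
        areas = [os.path.join(root, n) for n in ("cache1", "cache2")]
        for area in areas:
            os.makedirs(area)
        fetched = []

        def fake_fetch(dest):
            fetched.append(dest)
            with open(dest, "w") as f:
                f.write("payload")

        mgr = StagingManager(os.path.join(root, "index"), areas)
        link = mgr.stage("/databases/mydbfile", fake_fetch)
        mgr.stage("/databases/mydbfile", fake_fetch)
        with open(link) as f:
            return len(fetched), os.path.basename(os.readlink(link)), f.read()
```

Holding the per-name lock across the whole operation means two jobs asking for the same file trigger only one fetch, while the semaphore bounds how many fetches compete for tape drives.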
File Placement (i.e., filesystem selection)
Round-robin allocation
  Good for spreading the load
Maximum fit (fuzz == 0)
  Filesystem with largest amount of free space
  Good when size not known
Maximal fit (0 < fuzz < 1)
  Filesystem with largest amount of free space within a delta
  Good when size unknown but want to keep round-robin allocation
First fit (fuzz == 1)
  First filesystem that can accommodate the file
  Good when size known and want to spread the load
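
The three fuzz settings can be read as one rule with a sliding tolerance. The sketch below assumes that reading: walking the filesystems in round-robin order, it accepts the first one that can hold the file and whose free space is at least `(1 - fuzz)` times the largest free space. The `Filesystem` type, the `select_filesystem` function, and this exact formula are illustrative assumptions; the slides do not give the formula.

```python
from dataclasses import dataclass


@dataclass
class Filesystem:
    path: str
    free_bytes: int


def select_filesystem(filesystems, size, fuzz, start=0):
    """Return the index of the chosen filesystem, or None if none qualifies.

    fuzz == 0     -> only the filesystem(s) with the most free space qualify
    0 < fuzz < 1  -> any filesystem within a delta of the maximum qualifies
    fuzz == 1     -> the first filesystem that can hold the file qualifies
    `start` is the round-robin cursor: the search begins there and wraps.
    """
    if not 0.0 <= fuzz <= 1.0:
        raise ValueError("fuzz must be between 0 and 1")
    if not filesystems:
        return None
    max_free = max(fs.free_bytes for fs in filesystems)
    threshold = max_free * (1.0 - fuzz)  # assumed meaning of "within a delta"
    count = len(filesystems)
    for offset in range(count):
        index = (start + offset) % count
        fs = filesystems[index]
        if fs.free_bytes >= size and fs.free_bytes >= threshold:
            return index
    return None


# Usage: three data areas with different amounts of free space.
areas = [
    Filesystem("/cache1", 50),
    Filesystem("/cache2", 100),
    Filesystem("/cache3", 90),
]
```

A caller would typically pass the index after the previous choice as `start`, so that with a nonzero fuzz successive files rotate among the filesystems that are close to the maximum.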
Pre-Staging Manager
Asks the staging manager to pre-fetch files
  Allows user to transparently map objects to files
  Avoids resource wait time (i.e., files available when job runs)
  Notifies user synchronously or asynchronously when request completes
  Uses client/server model of implementation for isolation
Migration Manager
Copies modified files from cache to Mass Storage System
  File must not have been changed for x seconds
    Reduces chance of multiple migrations of same file prior to purge
  Specific files can be migrated on a priority basis by request
    Uses client/server model of implementation for isolation
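
The selection rule above can be sketched as a function that builds the migration queue. The `CachedFile` record, the `migration_queue` function, and the `quiet_seconds` parameter (the slide's "x seconds") are hypothetical names. Two behaviors are assumptions rather than facts from the slides: priority requests skip the quiet-time hold, and the remaining files are ordered oldest modification first.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    path: str
    mtime: float    # time of last modification, in seconds
    modified: bool  # True if changed since it was last copied to mass storage


def migration_queue(files, now, quiet_seconds, priority_requests=()):
    """Return paths to migrate, in the order they should be copied."""
    by_path = {f.path: f for f in files}
    queue = []
    # Assumption: explicit requests go first and skip the quiet-time hold.
    for path in priority_requests:
        f = by_path.get(path)
        if f is not None and f.modified and path not in queue:
            queue.append(path)
    # Other modified files qualify once unchanged for quiet_seconds.
    eligible = [
        f for f in files
        if f.modified and now - f.mtime >= quiet_seconds and f.path not in queue
    ]
    queue.extend(f.path for f in sorted(eligible, key=lambda f: f.mtime))
    return queue


# Usage: "b" was written 10 seconds ago, "c" is already on mass storage.
cache = [
    CachedFile("a", mtime=100, modified=True),
    CachedFile("b", mtime=990, modified=True),
    CachedFile("c", mtime=50, modified=False),
    CachedFile("d", mtime=10, modified=True),
]
```

The quiet-time check is what keeps a file that is still being written from being copied to tape several times before it is purged.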
Purge Manager
Removes unused migrated files from the cache
  Files purged in LRU order across all filesystems
    File must not have been used for at least x seconds
  Tries to maintain free space in each filesystem at a target amount
    Purging starts when free space falls below a specified filesystem threshold
    Targets are specific to a filesystem but may be the same for all
      Either a space percentage or absolute value, and a global file count
  Specific files can be purged on a priority basis by request
    Uses client/server model of implementation for isolation
      Implementation identical to migration priority queue
Files can also be pinned in the cache (i.e., not removable)
  For a specific period of time
  Until a certain date plus optional non-use time
  Indefinitely
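
A minimal sketch of the purge selection follows, under several simplifying assumptions: free space, thresholds, and targets are given in bytes per filesystem, a pin is a single expiry timestamp (`float("inf")` for an indefinite pin), and the global file-count target and priority requests are left out. The `CacheEntry` record and `select_purge_victims` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    path: str
    filesystem: str
    size: int
    atime: float                          # time of last use, in seconds
    migrated: bool                        # True if a copy exists in mass storage
    pinned_until: Optional[float] = None  # None means not pinned


def select_purge_victims(entries, free, threshold, target, now, min_idle):
    """Pick files to remove, least recently used first across all filesystems.

    free, threshold, target: dicts mapping filesystem name to bytes.
    A filesystem is purged only if its free space is below its threshold,
    and then only until its free space would reach its target.
    """
    need = {fs: target[fs] - free[fs] for fs in free if free[fs] < threshold[fs]}
    victims = []
    for entry in sorted(entries, key=lambda e: e.atime):
        deficit = need.get(entry.filesystem, 0)
        if deficit <= 0:
            continue  # this filesystem needs no (more) space
        if not entry.migrated:
            continue  # the only copy is in the cache
        if now - entry.atime < min_idle:
            continue  # used too recently
        if entry.pinned_until is not None and entry.pinned_until > now:
            continue  # pinned, not removable
        victims.append(entry.path)
        need[entry.filesystem] = deficit - entry.size
    return victims


# Usage: fs1 is below its threshold, fs2 is not.
entries = [
    CacheEntry("a", "fs1", 30, atime=100, migrated=True),
    CacheEntry("b", "fs1", 30, atime=50, migrated=True, pinned_until=float("inf")),
    CacheEntry("c", "fs1", 30, atime=200, migrated=True),
    CacheEntry("d", "fs1", 30, atime=10, migrated=False),
    CacheEntry("e", "fs2", 30, atime=1, migrated=True),
    CacheEntry("f", "fs1", 30, atime=990, migrated=True),
]
free = {"fs1": 10, "fs2": 100}
threshold = {"fs1": 20, "fs2": 20}
target = {"fs1": 50, "fs2": 50}
```

Keeping the threshold below the target gives hysteresis: once a purge starts it frees enough space that the next one is not triggered immediately.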
Cache Management Utilities
ooss_Xeq provides a common management interface
  Audit cache disks (data files must be pointed to from the name space)
    Optional fix-up allowed
  Audit name space (name space must point to actual data files)
    Optional fix-up allowed
  Copy a file into the cache
    Arbitrary source
  Create an empty file in the cache
  Rename a file in the index
  Relocate a file to another filesystem
  Remove a file from the index and cache
    Optional removal from the Mass Storage System as well
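
The two audits check the same relationship from opposite sides. The sketch below shows the idea only and says nothing about how ooss_Xeq is invoked or implemented: the `audit` function, its arguments, and the `demo` helper are hypothetical, and the index is assumed to hold symlinks into flat data-area directories. It reports problems and performs no fix-up.

```python
import os
import tempfile


def audit(index_root, data_areas):
    """Return (dangling_links, orphan_files).

    dangling_links: index symlinks whose data file is missing (name space audit)
    orphan_files:   data files that no index symlink points to (cache disk audit)
    """
    linked = set()
    dangling = []
    for dirpath, _dirs, filenames in os.walk(index_root):
        for name in filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue  # a regular file stored in the index area itself
            target = os.path.realpath(link)
            if os.path.isfile(target):
                linked.add(target)
            else:
                dangling.append(link)
    orphans = []
    for area in data_areas:
        for name in os.listdir(area):
            path = os.path.realpath(os.path.join(area, name))
            if os.path.isfile(path) and path not in linked:
                orphans.append(path)
    return sorted(dangling), sorted(orphans)


def demo():
    """Build a tiny cache with one good link, one dangling link, one orphan."""
    with tempfile.TemporaryDirectory() as root:
        index = os.path.join(root, "databases")
        cache = os.path.join(root, "cache1")
        os.makedirs(index)
        os.makedirs(cache)
        for name in ("databases:good", "databases:orphan"):
            open(os.path.join(cache, name), "w").close()
        os.symlink(os.path.join(cache, "databases:good"),
                   os.path.join(index, "good"))
        os.symlink(os.path.join(cache, "databases:gone"),
                   os.path.join(index, "gone"))
        dangling, orphans = audit(index, [cache])
        return ([os.path.basename(p) for p in dangling],
                [os.path.basename(p) for p in orphans])
```

A fix-up pass could then remove each dangling link and either re-link or delete each orphan; which of those is right depends on site policy, so it is left out here.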
Components For Effective Disk Cache Management
Conclusion
Effectively managing a large disk cache is complex
Performance
  Multiple small (100 GB) caches
  Allocation strategy
  Relocation strategy
  External resource management (e.g., MSS tape drives)
Fault tolerance
  Multiple loosely connected components
  Cache auditing and recovery
Usability
  End-user interfaces for staging, migration, and purge
Administration
  Extensive tools to safely manipulate cache contents