Disk Cache Management In Large-Scale Object Oriented Databases
http://www.slac.stanford.edu/~abh/CHEP2000/Cache/

Andrew Hanushevsky
Stanford Linear Accelerator Center
7-Feb-2000

Produced under contract DE-AC03-76SF00515 between Stanford University and the Department of Energy


Motivation

Problem: More data (>2 PB) than affordable disk space (<300 TB)
Realization: Only about 10% of the data is used at any one time
Solution: Hierarchical Mass Storage System
    Most data on tape (cheap)
    In-use data on disk (expensive)
Problem (it’s all circular):
    Effectively manage the disk cache to keep the most useful data
    Disk cache performance


Basic Disk Caching Architecture

[Figure: architecture diagram with Control and Data paths; components labeled Database Management and Cache Management]


The Direct Solution: One Big Filesystem

Volume Manager + Journaled File System (e.g., Veritas)
    Catenates disk devices to form very large capacity logical devices
    High performance (60+ MB/sec) journaled file system for fast recovery
    Allows for fast streaming I/O and efficient small block transfers
Problems
    Low random access performance
    Limited to 1 TB of cache per filesystem in most implementations
    Unpredictable load balancing


The Indirect Solution: Multiple Smaller Filesystems

Still need a Volume Manager + Journaled File System
    But can spread the load across multiple heads and I/O adapters
    Virtually unlimited cache size
Problems
    Need to manage multiple filesystems
    Need tools to balance the load
        If not done automatically


Supporting Multiple Filesystems

[Diagram: /databases/mydbfile is a symlink to /cache1/databases:mydbfile; data areas /cache1, /cache2, /cache3]

Index Area
    Optional data cache
    Default data area
Data Area
    Multiple independent filesystems
    Any number, any size
    Chosen based on free space in LRU order
Naming convention allows for audit and index recovery
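The index-to-data mapping on this slide can be sketched in a few lines. The only example given is /databases/mydbfile linking to /cache1/databases:mydbfile, so the rule used here (drop the leading slash and replace the remaining slashes with colons) is an inference from that single example, and the function names are hypothetical. The sketch assumes a POSIX system where symlinks can be created freely.

```python
import os

def data_name(logical_path):
    # Assumed convention: "/databases/mydbfile" -> "databases:mydbfile"
    return logical_path.strip("/").replace("/", ":")

def logical_path(data_file_name):
    # Inverse of data_name(); only valid if path components contain no ":"
    return "/" + data_file_name.replace(":", "/")

def link_into_index(index_root, cache_root, logical):
    """Create (or replace) the index symlink for one cached data file."""
    target = os.path.join(cache_root, data_name(logical))
    link = os.path.join(index_root, logical.lstrip("/"))
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if os.path.islink(link):
        os.remove(link)
    os.symlink(target, link)
    return link

def rebuild_index(index_root, cache_roots):
    """Recreate missing index links purely from the data file names."""
    restored = []
    for root in cache_roots:
        for name in sorted(os.listdir(root)):
            logical = logical_path(name)
            link = os.path.join(index_root, logical.lstrip("/"))
            if not os.path.islink(link):
                link_into_index(index_root, root, logical)
                restored.append(logical)
    return restored
```

Because each data file name carries its full logical path, a lost or damaged index can be rebuilt by scanning the data areas alone, which is presumably what the slide means by "index recovery".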


Staging Manager

Copies files into the cache
    Uses index space to link wanted name to actual file location
    Uses allocation manager to select target filesystem
    Uses lock manager to serialize access to target files & directories
    Uses resource manager to control tape drive usage
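As a rough illustration of how these four responsibilities might fit together, here is a minimal in-process sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, a per-file `threading.Lock` stands in for the lock manager, a `BoundedSemaphore` stands in for the resource manager that limits tape drive usage, "most free space" stands in for the allocation manager, and `fetch` is a caller-supplied function that copies a file out of mass storage.

```python
import os
import shutil
import threading
from collections import defaultdict

class StagingManager:
    def __init__(self, index_root, cache_roots, tape_drives=2):
        self.index_root = index_root
        self.cache_roots = list(cache_roots)
        self._drives = threading.BoundedSemaphore(tape_drives)  # resource manager stand-in
        self._locks = defaultdict(threading.Lock)               # lock manager stand-in
        self._guard = threading.Lock()                          # protects the lock table

    def _lock_for(self, logical):
        with self._guard:
            return self._locks[logical]

    def _pick_filesystem(self):
        # Allocation manager stand-in: filesystem with the most free space.
        return max(self.cache_roots, key=lambda r: shutil.disk_usage(r).free)

    def stage(self, logical, fetch):
        """Ensure `logical` is in the cache; return its index path."""
        link = os.path.join(self.index_root, logical.lstrip("/"))
        with self._lock_for(logical):          # serialize access per file
            if os.path.exists(link):           # follows the symlink: already staged
                return link
            root = self._pick_filesystem()
            target = os.path.join(root, logical.strip("/").replace("/", ":"))
            with self._drives:                 # wait for a free tape drive
                fetch(logical, target)
            os.makedirs(os.path.dirname(link), exist_ok=True)
            if os.path.islink(link):           # stale link to a purged file
                os.remove(link)
            os.symlink(target, link)
            return link
```

Two concurrent requests for the same file block on the same lock, so the second one finds the link already in place and does not occupy a tape drive.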


File Placement (i.e., filesystem selection)

Round-robin allocation
    Good for spreading the load
Maximum fit (fuzz == 0)
    Filesystem with largest amount of free space
    Good when size not known
Maximal fit (0 < fuzz < 1)
    Filesystem with largest amount of free space within a delta
    Good when size unknown but want to keep round-robin allocation
First fit (fuzz == 1)
    First filesystem that can accommodate the file
    Good when size known and want to spread the load
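One way to read the fuzz parameter is as a tolerance around the largest free space, applied while walking the filesystems in round-robin order. The sketch below follows that reading. The formula `(1 - fuzz) * max_free` is an assumption, not something stated on the slide, and `Filesystem` and `select_filesystem` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Filesystem:
    path: str
    free: int  # free space in bytes

def select_filesystem(filesystems, size, fuzz, start=0):
    """Return the index of the chosen filesystem, or None if nothing fits.

    fuzz == 0 -> maximum fit (only the filesystem with the most free space)
    fuzz == 1 -> first fit (first filesystem in rotation that holds the file)
    between   -> first in rotation whose free space is within the delta
    """
    if not 0.0 <= fuzz <= 1.0:
        raise ValueError("fuzz must be between 0 and 1")
    fitting = [fs.free for fs in filesystems if fs.free >= size]
    if not fitting:
        return None
    floor = (1.0 - fuzz) * max(fitting)    # assumed meaning of "within a delta"
    n = len(filesystems)
    for step in range(n):                  # round-robin scan beginning at `start`
        i = (start + step) % n
        if filesystems[i].free >= size and filesystems[i].free >= floor:
            return i
    return None
```

A caller would typically pass the index after the previous choice as `start`, so successive allocations rotate across the filesystems whenever the fuzz leaves more than one candidate.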


Pre-Staging Manager

Asks the staging manager to pre-fetch files
    Allows user to transparently map objects to files
    Avoids resource wait time (i.e., files available when job runs)
    Notifies user synchronously or asynchronously when request completes
    Uses client/server model of implementation for isolation
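The synchronous and asynchronous notification styles can be illustrated with Python's standard `concurrent.futures` module. This is only a sketch of the idea: the slide describes a client/server implementation, whereas this version runs in one process, and `PreStager`, `request`, and the `stage` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

class PreStager:
    def __init__(self, stage, workers=4):
        self._stage = stage                      # callable that stages one file
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def request(self, files, on_done=None):
        """Queue a pre-fetch request and return a Future for it.

        Synchronous use:  call .result() on the returned Future to wait.
        Asynchronous use: pass on_done, which receives the finished Future.
        """
        files = list(files)
        future = self._pool.submit(lambda: [self._stage(f) for f in files])
        if on_done is not None:
            future.add_done_callback(on_done)
        return future

    def close(self):
        self._pool.shutdown(wait=True)
```

A batch job would submit its file list well before it starts and then either block on the result or carry on until the callback fires.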


Migration Manager

Copies modified files from cache to Mass Storage System
    File must not have been changed for x seconds
        Reduces chance of multiple migrations of same file prior to purge
    Specific files can be migrated on a priority basis by request
        Uses client/server model of implementation for isolation
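The selection rule can be sketched as a small function: explicitly requested files go first, followed by files that have been unmodified for at least x seconds, oldest change first. Treating a priority request as exempt from the quiet period is an assumption, as are the function name and the dictionary-based input.

```python
def migration_order(files, now, min_idle, requested=()):
    """Return cache paths in the order they should be migrated.

    files:     mapping of path -> last modification time (seconds)
    min_idle:  the "x seconds" a file must have been unchanged
    requested: paths explicitly queued for priority migration
    """
    # Assumption: priority requests bypass the quiet-period check.
    priority = [p for p in requested if p in files]
    settled = sorted(
        (p for p, mtime in files.items()
         if p not in priority and now - mtime >= min_idle),
        key=lambda p: files[p],           # least recently modified first
    )
    return priority + settled
```

Waiting for a file to settle means a database that is still being written is copied to tape once, after activity stops, rather than once per burst of updates.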


Purge Manager

Removes unused migrated files from the cache
    Files purged in LRU order across all filesystems
        File must not have been used for at least x seconds
    Tries to maintain free space in each filesystem at a target amount
        Purging starts when free space falls below a specified filesystem threshold
        Targets are specific to a filesystem but may be the same for all
            Either a space percentage or absolute value, and a global file count
    Specific files can be purged on a priority basis by request
        Uses client/server model of implementation for isolation
        Implementation identical to migration priority queue
    Files can also be pinned in the cache (i.e., not removable)
        For a specific period of time
        Until a certain date plus optional non-use time
        Indefinitely
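The purge rules combine several conditions, so a small sketch may help show how they interact. It models only absolute byte thresholds and targets. The global file count, percentage targets, and priority requests are left out, and `CachedFile` and `select_for_purge` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class CachedFile:
    path: str
    filesystem: str
    size: int
    last_used: float
    migrated: bool = True        # only files already on tape may be purged
    pinned_until: float = 0.0    # float("inf") pins the file indefinitely

def select_for_purge(files, free, threshold, target, now, min_idle):
    """Return paths to purge; free/threshold/target are dicts keyed by filesystem."""
    free = dict(free)  # work on a copy so the caller's numbers are untouched
    # Purging starts only on filesystems whose free space is below threshold.
    purging = {fs for fs in free if free[fs] < threshold[fs]}
    victims = []
    for f in sorted(files, key=lambda f: f.last_used):   # LRU across all filesystems
        if f.filesystem not in purging:
            continue
        if free[f.filesystem] >= target[f.filesystem]:   # target reached
            continue
        if not f.migrated or now < f.pinned_until or now - f.last_used < min_idle:
            continue
        victims.append(f.path)
        free[f.filesystem] += f.size
    return victims
```

Keeping the threshold below the target gives the usual hysteresis: purging begins late and then frees enough space that it does not have to run again immediately.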


Cache Management Utilities

ooss_Xeq provides a common management interface
    Audit cache disks (data files must be pointed to from the name space)
        Optional fix-up allowed
    Audit name space (name space must point to actual data files)
        Optional fix-up allowed
    Copy a file into the cache
        Arbitrary source
    Create an empty file in the cache
    Rename a file in the index
    Relocate a file to another filesystem
    Remove a file from the index and cache
        Optional removal from the Mass Storage System as well
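The two audits check the same relationship from opposite directions, which a short sketch can make concrete. This is not the ooss_Xeq implementation or its interface. It is a hypothetical `audit` function that assumes the index is a tree of symlinks with absolute targets and that each data area is a flat directory.

```python
import os

def audit(index_root, cache_roots):
    """Return (dangling_links, orphan_files).

    dangling_links: index entries whose target data file is missing
                    (name space audit)
    orphan_files:   data files that no index entry points to
                    (cache disk audit)
    """
    links = {}
    for dirpath, _dirs, names in os.walk(index_root):
        for name in names:
            link = os.path.join(dirpath, name)
            if os.path.islink(link):
                links[link] = os.readlink(link)
    dangling = sorted(l for l, t in links.items() if not os.path.isfile(t))
    referenced = set(links.values())
    orphans = sorted(
        os.path.join(root, name)
        for root in cache_roots
        for name in os.listdir(root)
        if os.path.join(root, name) not in referenced
    )
    return dangling, orphans
```

An optional fix-up step could then remove the dangling links and either delete the orphans or re-link them from their names, depending on site policy.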


Components For Effective Disk Cache Management


Conclusion

Effectively managing a large disk cache is complex
    Performance
        Multiple small (100 GB) caches
        Allocation strategy
        Relocation strategy
        External resource management (e.g., MSS tape drives)
    Fault Tolerance
        Multiple loosely connected components
        Cache auditing and recovery
    Usability
        End-user interfaces for staging, migration, and purge
    Administration
        Extensive tools to safely manipulate cache contents