CS 704D Advanced Operating System
(Distributed File System)
Debasis Das
IT 703D Debasis Das
Distributed File System
●Service
  □Software providing specific functionality
●Server
  □An instance of a service running on a specific machine
●Client
  □Process that requests a service
●Client Interface
  □A set of operations that can be done by the client
●Inter-machine Interface
  □Interface for cross-machine interaction
●A file system where the individual files may actually be distributed over several nodes
●Yet the access by the user is the same as if the whole file system was locally available (transparency)
●A component unit is the smallest set of files that can be stored on a node
●Storage service
  □Management of storage (disk/block service)
●True file service
  □Management of files: creation, deletion, sharing, accessing, etc.
●Naming service
  □Mapping between text names and actual files (directory service)
●Transparency: structure, access, naming and replication
●User mobility
●Performance
●Simplicity and ease of use
●Scalability
●High availability
●High reliability
●Data integrity
●Security
●Heterogeneity
●Unstructured & structured files
  □An unstructured file is just a collection of data; a structured file has its data organized in some form, typically as a collection of records
  □Structured files can be non-indexed or indexed
  □Most modern file systems use the unstructured model
●Mutable & immutable files
  □In the mutable model, the file is changed in place when any new write happens; in the immutable model, a new version is created
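The difference between the mutable and immutable models can be illustrated with a minimal sketch. The class names `MutableFile` and `ImmutableFileHistory` are hypothetical and an in-memory list stands in for real storage; this is an illustration of the idea, not a description of any particular file system.

```python
class MutableFile:
    """Mutable model: a write replaces the contents in place."""

    def __init__(self, data=b""):
        self.data = data

    def write(self, data):
        self.data = data  # the old contents are lost


class ImmutableFileHistory:
    """Immutable model: a write never changes an existing version,
    it appends a new one (hypothetical in-memory version list)."""

    def __init__(self, data=b""):
        self.versions = [data]

    def write(self, data):
        self.versions.append(data)
        return len(self.versions) - 1  # number of the new version

    def read(self, version=-1):
        # By default, read the most recent version.
        return self.versions[version]
```

With the immutable model, readers holding an older version are never disturbed by a writer, at the cost of storing several versions.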
●Accessing remote files
  □Remote service model: data packing and communication overheads could be significant
  □Data caching model: bring data over in blocks and keep it cached; cache coherency and other cache problems exist
●Unit of data transfer
  □File level transfer model: move the whole file
  □Block level transfer model: move block(s) of data
  □Byte level transfer model: move bytes of data
  □Record level transfer model: move records in case of a structured file with records
●Usual cache issues, such as granularity, size, replacement policy, coherence, etc.
●Cache location
●Modification propagation
●Cache validation
●Server’s main memory
  □Easy to implement, transparent to clients, easy to share
●Client’s disk
  □Does not work with diskless workstations; reliable against crashes; scalable
●Client’s main memory
  □Maximum performance gain; workstations can be diskless; scalable
●Write through
  □All writes are written through to server disk
  □Works best when there are more reads than writes
●Delayed write
  □Write on ejection from cache
  □Periodic write
  □Write on close
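The trade-off between the two propagation schemes can be sketched as follows. All names here (`Server`, `WriteThroughCache`, `DelayedWriteCache`, `demo`) are hypothetical, the server disk is a plain dictionary, and only the write-on-close variant of delayed write is shown.

```python
class Server:
    """Stand-in for the file server's disk: block id -> bytes."""

    def __init__(self):
        self.disk = {}
        self.writes = 0  # number of write requests that reached the server

    def write(self, block_id, data):
        self.disk[block_id] = data
        self.writes += 1


class WriteThroughCache:
    def __init__(self, server):
        self.server = server
        self.blocks = {}

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.server.write(block_id, data)  # propagate immediately

    def close(self):
        pass  # nothing is pending


class DelayedWriteCache:
    """Delayed write, write-on-close variant."""

    def __init__(self, server):
        self.server = server
        self.blocks = {}
        self.dirty = set()  # blocks modified since the last flush

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.dirty.add(block_id)

    def close(self):
        # Only the final contents of each dirty block are sent.
        for block_id in sorted(self.dirty):
            self.server.write(block_id, self.blocks[block_id])
        self.dirty.clear()


def demo():
    """Write the same block three times under each policy."""
    results = {}
    for name, cache_cls in (("write-through", WriteThroughCache),
                            ("write-on-close", DelayedWriteCache)):
        server = Server()
        cache = cache_cls(server)
        for version in (b"v1", b"v2", b"v3"):
            cache.write(0, version)
        cache.close()
        results[name] = (server.writes, server.disk[0])
    return results
```

In this sketch both policies leave the same data on the server, but write through sends three requests where write on close sends one. The price of the delayed scheme is that unflushed data is lost if the client crashes before the close.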
●Client initiated approach
  □Checking before every access
  □Periodic checking
  □Check on file open
●Server initiated approach
  □Server keeps track of when a file is opened and in what mode
  □Whenever there is a potential for conflict, the server must act
●The server cannot let a file that is open for read be opened for write
●The server initiated approach violates the client-server model and requires complex code
●File servers have to be stateful, which causes problems if a server fails
●Client initiated check-on-open validation is still required: a client can cache a file, open it, and then close it after use; on re-opening, the cache must be validated again
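Check-on-open validation can be sketched as below, assuming the server keeps a version number per file (a modification timestamp would serve the same purpose). `FileServer`, `Client` and their methods are hypothetical names used only for illustration.

```python
class FileServer:
    def __init__(self):
        self.files = {}  # name -> (version, data)

    def write(self, name, data):
        version = self.files.get(name, (0, b""))[0] + 1
        self.files[name] = (version, data)

    def get_version(self, name):
        return self.files[name][0]

    def fetch(self, name):
        return self.files[name]


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (version, data)
        self.fetches = 0  # how many times the whole file was transferred

    def open(self, name):
        cached = self.cache.get(name)
        # Validate on every open: compare the cached version with the server's.
        if cached is None or cached[0] != self.server.get_version(name):
            self.cache[name] = self.server.fetch(name)
            self.fetches += 1
        return self.cache[name][1]
```

The version check is a small message compared with fetching the file, so re-opening an unchanged file stays cheap while a stale copy is never used after an open.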
●File replication is required for high availability
●Differences with caching
  □A replica is associated with a server
  □Replication is decided by availability and performance requirements
  □A replica is more persistent, widely known, secure, available, complete and accurate
  □A cache needs to be validated against a replica
●Increased availability
●Increased reliability
●Improved response time
●Reduced network traffic
●Improved system throughput
●Better scalability
●Autonomous operation
  □Replicate all required files on the node that requires them
●Naming of replicas
  □All replicas of an immutable file can have the same name
  □The mapping system must identify the latest copy to the kernel. Some distance information for replicas is also required
●Replication control
  □Explicit replication
    ●Users control the replicas
  □Implicit/lazy replication
    ●The system decides how many replicas to keep and where to place them. Lazy replication is done when the system is free to do it
●Read only replication
  □Immutable files only
●Read any write all protocol
  □Read from any copy; write to all copies, locking them first
●Available copies protocol
  □Write all is difficult to implement if one or more servers holding copies are down, so update the available copies only
●Primary copy protocol
  □Write to the designated primary copy, then update the others
●Quorum based protocols
  □n copies of a file exist, with read quorum r and write quorum w such that r + w > n, so every read quorum contains at least one up to date copy
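The quorum condition can be made concrete with a small sketch. It assumes each replica carries a version number and that all replicas are reachable; `Replica`, `QuorumFile` and the way quorum members are passed in as index lists are hypothetical choices for illustration. The new version number is obtained by first consulting a read quorum, which is an added assumption not stated on the slide.

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.data = b""


class QuorumFile:
    def __init__(self, n, r, w):
        if r + w <= n:
            raise ValueError("need r + w > n so read and write quorums overlap")
        self.replicas = [Replica() for _ in range(n)]
        self.r, self.w = r, w

    def _newest(self, members):
        # Replica with the highest version among the given indices.
        return max((self.replicas[i] for i in members),
                   key=lambda rep: rep.version)

    def write(self, data, members):
        """Write to the replicas whose indices are in `members`."""
        if len(set(members)) < self.w:
            raise ValueError("write quorum too small")
        # Assumption: learn the current version from a read quorum first
        # (here simply the first r replicas).
        new_version = self._newest(range(self.r)).version + 1
        for i in members:
            self.replicas[i].version = new_version
            self.replicas[i].data = data

    def read(self, members):
        """Read from the replicas in `members`; the newest version wins."""
        if len(set(members)) < self.r:
            raise ValueError("read quorum too small")
        return self._newest(members).data
```

For example, with n = 5, r = 2 and w = 4, any two replicas chosen for a read must include at least one of the four that took part in the latest write, which is why picking the highest version number returns current data.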
●Availability
  □A file may not be available to some nodes on failure & network partitioning
  □Replication increases availability
●Robustness
  □Power to survive crashes and decay of storage media
●Recoverability
  □Roll back to an earlier consistent state
●Volatile storage
●Non-volatile storage
●Stable storage
  □Use of redundant units to ensure robustness
  □Uses the same ordinary storage units
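One common way to build stable storage from ordinary units is to keep two copies of every block and write them one after the other, so that a crash can damage at most one copy. The sketch below assumes this two-disk scheme with a CRC per block; `Disk`, `StableStore` and `recover` are hypothetical names, and dictionaries stand in for disks.

```python
import zlib


class Disk:
    """An ordinary, unreliable unit: blocks may be missing or corrupted."""

    def __init__(self):
        self.blocks = {}  # block number -> (data, checksum)

    def write(self, n, data):
        self.blocks[n] = (data, zlib.crc32(data))

    def read(self, n):
        entry = self.blocks.get(n)
        if entry is None:
            return None
        data, crc = entry
        # Return None if the stored checksum does not match the data.
        return data if zlib.crc32(data) == crc else None


class StableStore:
    def __init__(self):
        self.a, self.b = Disk(), Disk()

    def write(self, n, data):
        self.a.write(n, data)  # first copy
        self.b.write(n, data)  # second copy, only after the first completes

    def read(self, n):
        data = self.a.read(n)
        return data if data is not None else self.b.read(n)

    def recover(self, n):
        """After a crash, make both copies of block n good and identical."""
        good_a, good_b = self.a.read(n), self.b.read(n)
        if good_a is None and good_b is not None:
            self.a.write(n, good_b)
        elif good_a is not None and good_a != good_b:
            self.b.write(n, good_a)
```

A read falls back to the second unit when the first copy fails its checksum, and `recover` repairs whichever copy is bad, which is the sense in which redundancy gives robustness without special hardware.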
●Stateful server
  □Server maintains the state of the operations
●Stateless server
  □Does not maintain state
●Effect of stateless server on fault tolerance
  □Unlike stateful servers, no complex crash recovery is required
  □The client simply sends the request again; no recovery process is required
  □Identifier translation imposes a time penalty
  □To keep the operations idempotent, arguments need to be sent with every request
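A stateless interface can be sketched as follows: every request names the file and the offset explicitly, so the server keeps no open-file table or file pointer between calls. `StatelessFileServer` and its method signatures are hypothetical and the files live in memory.

```python
class StatelessFileServer:
    def __init__(self):
        self.files = {}  # name -> bytearray

    def read(self, name, offset, length):
        # Everything needed is in the request itself.
        return bytes(self.files[name][offset:offset + length])

    def write(self, name, offset, data):
        buf = self.files.setdefault(name, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad a gap with zeros
        buf[offset:offset + len(data)] = data
        return len(data)
```

Because the offset travels with the request, repeating a write after a lost reply leaves the file exactly as it was after the first attempt. A stateful "write at the current position" call would instead advance a server-side pointer twice.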
●A set of operations that create the desired output
●All or nothing
●Properties
  □Atomicity
    ●Failure atomicity: if the transaction fails, roll back to the earlier state
    ●Concurrency atomicity: on concurrent access, other processes cannot see intermediate states, only the final result (consistency property)
  □Serializability
    ●Done in some serial order
  □Permanence
    ●Once done, it is permanent
Need for Transactions in File Services
●Improving recoverability
  □A failure can leave a file in an inconsistent state
  □With the transaction feature, the state can be rolled back
●Allowing concurrent sharing of mutable files
  □Multiple client operations, if not executed in proper order, can leave files inconsistent
  □A transaction ensures that one set of operations from a specific client is done in an atomic manner
Transaction Based File System Operations
●Should use the primitives
  □Begin transaction
  □End transaction
  □Abort transaction
●Some lower level primitives
  □T-read
  □T-write
Recovery Techniques
●Data is updated only when committed
●Complete the transaction, else abort the transaction
●Commit if the transaction completed successfully
●File versions approach
●Write ahead log approach
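The transaction primitives and the write ahead log approach can be combined in one minimal sketch. It assumes the log survives a crash while the in-memory state does not; `TransactionalFile`, the method names and the key/value representation of file contents are hypothetical, and concurrency control is left out.

```python
class TransactionalFile:
    def __init__(self):
        self.data = {}     # committed state: key (e.g. record id) -> value
        self.log = []      # write ahead log, assumed to survive crashes
        self.pending = {}  # transaction id -> tentative writes
        self.next_tid = 1

    def begin_transaction(self):
        tid = self.next_tid
        self.next_tid += 1
        self.pending[tid] = {}
        return tid

    def t_write(self, tid, key, value):
        # Tentative: committed data is not touched until the commit.
        self.pending[tid][key] = value

    def t_read(self, tid, key):
        # A transaction sees its own tentative writes first.
        if key in self.pending[tid]:
            return self.pending[tid][key]
        return self.data.get(key)

    def end_transaction(self, tid):
        writes = self.pending.pop(tid)
        self.log.append(("commit", tid, dict(writes)))  # log first
        self.data.update(writes)                        # then apply

    def abort_transaction(self, tid):
        self.pending.pop(tid)  # discard tentative writes

    def recover(self):
        """Rebuild the committed state from the log after a crash."""
        self.data = {}
        self.pending = {}
        for kind, _tid, writes in self.log:
            if kind == "commit":
                self.data.update(writes)
```

Since a commit record reaches the log before the data is changed, replaying the log restores every committed transaction, and anything that never reached `end_transaction` simply disappears, which gives the all or nothing behavior.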
Concurrency Control-1
●Locking
●Optimized locking
  □Type specific locking: read and write locks. When a read lock is set, other reads may be allowed but not writes; when a write lock is set, no other operations are permitted
  □Intention-to-write locking; along with a commit lock, it permits better concurrency
●Two phase locking
●Granularity of locking
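The read/write lock compatibility rule can be sketched with a small lock table. `LockManager` and its methods are hypothetical; requests that cannot be granted return `False` instead of blocking, and releasing every lock at once at the end of a transaction corresponds to the shrinking phase of (strict) two phase locking.

```python
class LockManager:
    def __init__(self):
        self.readers = {}  # item -> set of transaction ids holding read locks
        self.writer = {}   # item -> transaction id holding the write lock

    def acquire_read(self, tid, item):
        holder = self.writer.get(item)
        if holder is not None and holder != tid:
            return False  # someone else is writing
        self.readers.setdefault(item, set()).add(tid)
        return True

    def acquire_write(self, tid, item):
        holder = self.writer.get(item)
        if holder is not None and holder != tid:
            return False  # someone else is writing
        if self.readers.get(item, set()) - {tid}:
            return False  # someone else is reading
        self.writer[item] = tid
        return True

    def release_all(self, tid):
        """Called once, at commit or abort: no lock is acquired afterwards."""
        for holders in self.readers.values():
            holders.discard(tid)
        for item in [i for i, t in self.writer.items() if t == tid]:
            del self.writer[item]
```

Several transactions can hold a read lock on the same item, while a write lock is granted only when no other transaction holds any lock on it.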
Concurrency Control-2
●Granularity of locking
  □File, page or a record
●Handling of locking deadlocks
  □Avoidance, detection, timeouts
●Optimistic concurrency control
  □Let the first phase proceed without restrictions; check for validity before commitment
  □Advantages: free from deadlocks, allows maximum parallelism
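Optimistic concurrency control can be sketched with per-item version numbers: a transaction records the version of everything it reads, and at commit time the validation step checks that none of those versions has changed. `OptimisticStore` and the dictionary used to represent a transaction are hypothetical, and version-based validation is only one of several possible validation schemes.

```python
class OptimisticStore:
    def __init__(self):
        self.values = {}    # key -> committed value
        self.versions = {}  # key -> number of committed writes so far

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        if key in txn["writes"]:
            return txn["writes"][key]
        # Remember which version was seen, for validation at commit.
        txn["reads"].setdefault(key, self.versions.get(key, 0))
        return self.values.get(key)

    def write(self, txn, key, value):
        txn["writes"][key] = value  # tentative, no locks taken

    def commit(self, txn):
        # Validation phase: abort if anything that was read has changed.
        for key, seen in txn["reads"].items():
            if self.versions.get(key, 0) != seen:
                return False
        # Update phase: install the tentative writes.
        for key, value in txn["writes"].items():
            self.values[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1
        return True
```

No transaction ever waits for another, so there can be no deadlock; the cost is that a transaction which fails validation has to be rerun.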
Design Principles of DFS
●Clients have cycles to burn: do as much as possible locally
●Cache whenever possible: you never know when you are going to need the item again, so keep a local copy
●Exploit usage properties: group files
●Minimize system wide knowledge and change
●Trust the fewest possible entities
●Batch if possible