CS 704D Advanced Operating System (Distributed File System), Debasis Das


Post on 22-Apr-2015


Page 1: Cs 704 D Aos Distr File System

CS 704D Advanced Operating System

(Distributed File System)

Debasis Das

Page 2: Cs 704 D Aos Distr File System

IT 703D Debasis Das

Distributed File System

Page 3: Cs 704 D Aos Distr File System

●Service
  □Software providing specific functionality
●Server
  □An instance of a service running on a specific machine
●Client
  □Process that requests a service
●Client interface
  □A set of operations that can be done by the client
●Inter-machine interface
  □Interface for cross-machine interaction


Page 4: Cs 704 D Aos Distr File System

●A file system where the individual files may actually be distributed over several nodes

●Yet the access by the user is the same as if the whole file system was locally available (transparency)

●A component unit is the smallest set of files that can be stored on a node


Page 5: Cs 704 D Aos Distr File System

●Storage service
  □Management of storage space (disk/block service)
●True file service
  □Management of files: creation, deletion, sharing, accessing etc.
●Naming service
  □Mapping between text names and actual files (directory service)


Page 6: Cs 704 D Aos Distr File System

●Transparency: structure, access, naming and replication
●User mobility
●Performance
●Simplicity and ease of use
●Scalability
●High availability
●High reliability
●Data integrity
●Security
●Heterogeneity


Page 7: Cs 704 D Aos Distr File System

●Unstructured & structured files
  □An unstructured file is just a collection of data; a structured file has its data organized in some form, typically as a collection of records
  □Structured files can be non-indexed or indexed
  □Most modern file systems use the unstructured model
●Mutable & immutable files
  □In the mutable model, the file itself is changed when any new write happens. In the immutable model, a new version is created instead


Page 8: Cs 704 D Aos Distr File System

●Accessing remote files
  □Remote service model: data packing and communication overheads could be significant
  □Data caching model: bring data over in blocks and keep it cached; cache coherency and other cache problems exist
●Unit of data transfer
  □File level transfer model: move the whole file
  □Block level transfer model: move block(s) of data
  □Byte level transfer model: move bytes of data
  □Record level transfer model: move records, in the case of a structured file with records
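As a rough illustration of the data caching model combined with block level transfer, here is a minimal sketch. The names `BlockServer`, `BlockCachingClient` and the tiny `BLOCK_SIZE` are assumptions made for this example only; the "server" is an in-memory stand-in, not a real network service.

```python
BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for illustration


class BlockServer:
    """In-memory stand-in for a remote file server; counts blocks shipped."""

    def __init__(self, data: bytes):
        self.data = data
        self.blocks_sent = 0

    def read_block(self, block_no: int) -> bytes:
        self.blocks_sent += 1
        start = block_no * BLOCK_SIZE
        return self.data[start:start + BLOCK_SIZE]


class BlockCachingClient:
    """Fetches whole blocks on a miss and serves later reads from the cache."""

    def __init__(self, server: BlockServer):
        self.server = server
        self.cache = {}  # block number -> bytes

    def read(self, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        chunks = []
        for block_no in range(first, last + 1):
            if block_no not in self.cache:  # miss: one remote transfer
                self.cache[block_no] = self.server.read_block(block_no)
            chunks.append(self.cache[block_no])
        joined = b"".join(chunks)
        skip = offset - first * BLOCK_SIZE
        return joined[skip:skip + length]
```

A second read that falls inside already cached blocks causes no further transfer, which is the gain the caching model is after. The sketch ignores cache size limits and coherence.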

Page 9: Cs 704 D Aos Distr File System

●Usual cache issues, such as granularity, size, replacement policy, coherence etc.
●Cache location
●Modification propagation
●Cache validation


Page 10: Cs 704 D Aos Distr File System

●Server’s main memory
  □Easy to implement, transparent to clients, easy to share
●Client’s disk
  □Does not work with diskless stations, reliable against crashes, scalable
●Client’s main memory
  □Maximum performance gain, workstations can be diskless, scalable


Page 11: Cs 704 D Aos Distr File System

●Write through
  □All writes are written through to the server disk
  □Works best when there are more reads than writes
●Delayed write
  □Write on ejection from cache
  □Periodic write
  □Write on close
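The difference between the two propagation schemes can be sketched as follows. `WriteThroughCache` and `WriteOnCloseCache` are hypothetical names, and the server disk is modelled as a plain dictionary; this is an illustration under those assumptions, not a real file system interface.

```python
class WriteThroughCache:
    """Every write updates the cache and the server disk immediately."""

    def __init__(self, server_disk: dict):
        self.server_disk = server_disk
        self.cache = {}

    def write(self, block, data):
        self.cache[block] = data
        self.server_disk[block] = data  # propagated right away


class WriteOnCloseCache:
    """Delayed write: modified blocks reach the server only on close()."""

    def __init__(self, server_disk: dict):
        self.server_disk = server_disk
        self.cache = {}
        self.dirty = set()  # blocks modified since the last flush

    def write(self, block, data):
        self.cache[block] = data
        self.dirty.add(block)

    def close(self):
        for block in sorted(self.dirty):
            self.server_disk[block] = self.cache[block]
        self.dirty.clear()
```

With delayed write, repeated writes to the same block cost one transfer instead of many, at the price of a window in which the server copy is stale and a client crash can lose data.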


Page 12: Cs 704 D Aos Distr File System

●Client initiated approach
  □Checking before every access
  □Periodic checking
  □Check on file open
●Server initiated approach
  □Server keeps track of when a file is opened and in what mode
  □Whenever there is a potential for conflict, the server must act
    ●A file that is open for reading cannot be allowed to be opened for writing at the same time
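A minimal sketch of the client initiated "check on file open" scheme follows. It assumes the server can report a per-file version number; `get_version`, `fetch` and the class names are hypothetical and exist only for this illustration.

```python
class VersionedServer:
    """Keeps a version counter per file so clients can validate caches."""

    def __init__(self):
        self.files = {}  # name -> (version, data)

    def write(self, name, data):
        version = self.files.get(name, (0, b""))[0] + 1
        self.files[name] = (version, data)

    def get_version(self, name):
        return self.files[name][0]

    def fetch(self, name):
        return self.files[name]


class ValidatingClient:
    """Validates its cached copy against the server each time a file is opened."""

    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (version, data)
        self.fetches = 0  # number of full transfers from the server

    def open(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.get_version(name):
            self.cache[name] = self.server.fetch(name)  # stale or missing
            self.fetches += 1
        return self.cache[name][1]
```

The version check is a small message compared with refetching the file, but it is still one round trip per open.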


Page 13: Cs 704 D Aos Distr File System

●Violates the client-server model, complex coding required
●File servers have to be stateful, which causes problems if the server fails
●Check on open (client initiated validation) is still required. A client can cache a file, open it, and then close it after use. On re-opening, the cache must be validated again


Page 14: Cs 704 D Aos Distr File System

●File replication required for high availability
●Differences with caching
  □Replica is associated with a server
  □Replication is decided by availability and performance requirements
  □Replica is more persistent, widely known, secure, available, complete and accurate
  □Cache needs to be validated against a replica


Page 15: Cs 704 D Aos Distr File System

●Increased availability
●Increased reliability
●Improved response time
●Reduced network traffic
●Improved system throughput
●Better scalability
●Autonomous operation
  □Replicate all required files on the node that requires them


Page 16: Cs 704 D Aos Distr File System

●Naming of replicas
  □All replicas of an immutable file can have the same name
  □The mapping system must identify the latest copy for the kernel. Some distance information for the replicas is also required
●Replication control
  □Explicit replication
    ●Users control the replicas
  □Implicit/lazy replication
    ●System takes care of how many replicas are made and where they are placed. Lazy replication is done when the system is free to do it


Page 17: Cs 704 D Aos Distr File System

●Read only replication
  □Immutable files only
●Read any write all protocol
  □Read from any copy, write to all copies, lock them first
●Available copies protocol
  □Write all is difficult to implement if one or more servers holding copies are down. Update the available copies only
●Primary copy protocol
  □Write to the designated primary copy, then update the others
●Quorum based protocols
  □n copies of a file exist, with read quorum r and write quorum w such that r + w > n. Every read quorum then overlaps every write quorum, so each read reaches at least one up-to-date copy
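The quorum condition can be checked by brute force on a small example. The sketch below assumes each replica stores a (version, value) pair and that a reader takes the value with the highest version in its quorum; the function names are hypothetical. It models a single write only, so it does not address concurrent writers.

```python
from itertools import combinations


def quorum_write(replicas, write_set, value):
    """Install value on every replica in write_set under a new, higher version."""
    new_version = max(replicas[i][0] for i in write_set) + 1
    for i in write_set:
        replicas[i] = (new_version, value)


def quorum_read(replicas, read_set):
    """Return the value carrying the highest version among the replicas read."""
    return max((replicas[i] for i in read_set), key=lambda rep: rep[0])[1]


def stale_read_possible(n, r, w):
    """True if some read quorum of size r misses a write made to w replicas."""
    replicas = [(0, "old")] * n
    quorum_write(replicas, range(w), "new")  # by symmetry, any w replicas will do
    return any(
        quorum_read(replicas, read_set) != "new"
        for read_set in combinations(range(n), r)
    )
```

For example, with n = 5 the choice r = 2, w = 4 never returns stale data, while r = 1, w = 4 can, because a reader may contact only the one replica the write skipped.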


Page 18: Cs 704 D Aos Distr File System

●Availability
  □A file may not be available to some nodes on failure & network partitioning
  □Replication increases availability
●Robustness
  □Power to survive crashes and decay of storage media
●Recoverability
  □Roll back to an earlier consistent state


Page 19: Cs 704 D Aos Distr File System

●Volatile storage
●Non-volatile storage
●Stable storage
  □Use of redundant units to ensure robustness
  □Built from the same ordinary storage units


Page 20: Cs 704 D Aos Distr File System

●Stateful server
  □Server maintains the state of the operations
●Stateless server
  □Does not maintain state
●Effect of stateless server on fault tolerance
  □Unlike stateful servers, no complex crash recovery is required
  □Client simply sends the request again, no recovery process required
  □Identifier translation imposes a time penalty
  □To keep the operations idempotent, arguments need to be sent with every request
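The idempotency point can be made concrete with two toy read interfaces. Both classes are hypothetical in-memory stand-ins; the only difference is whether the file position lives on the server or travels with each request.

```python
class StatefulReadServer:
    """Keeps the current file position on the server between calls."""

    def __init__(self, files):
        self.files = files
        self.position = {}  # name -> next offset to read

    def open(self, name):
        self.position[name] = 0

    def read(self, name, length):
        start = self.position[name]
        self.position[name] = start + length  # server state changes on every call
        return self.files[name][start:start + length]


class StatelessReadServer:
    """Each request carries everything needed, so repeating it is harmless."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, length):
        return self.files[name][offset:offset + length]
```

If a reply is lost and the client retries, the stateful server has already advanced its position and returns the next chunk, while the stateless server returns the same bytes again. The same property means a restarted stateless server has nothing to rebuild.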


Page 21: Cs 704 D Aos Distr File System

●A set of operations that create the desired output
●All or nothing
●Properties
  □Atomicity
    ●Failure atomicity
      □If the transaction fails, roll back to the earlier state
    ●Concurrency atomicity
      □On concurrent access, other processes cannot see intermediate states, only the final result (consistency property)
  □Serializability
    ●Concurrent transactions give the same result as if done in some serial order
  □Permanence
    ●Once done, it is permanent


Page 22: Cs 704 D Aos Distr File System

Need for Transactions in File Services

●Improving recoverability
  □Failure can leave a file in an inconsistent state
  □With the transaction feature, the state can be rolled back
●Allowing concurrent sharing of mutable files
  □Multiple client operations, if not executed in proper order, can leave files inconsistent
  □A transaction ensures that one set of operations from a specific client is done in an atomic manner

Page 23: Cs 704 D Aos Distr File System

Transaction Based File System Operations

●Should use the primitives
  □Begin transaction
  □End transaction
  □Abort transaction
●Some lower level primitives
  □T-read
  □T-write

Page 24: Cs 704 D Aos Distr File System

Recovery Techniques

●Data is updated only when committed
●Complete the transaction, else abort the transaction
●Commit if the transaction completed successfully
●File versions approach
●Write ahead log approach
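The rule that data is updated only when committed can be sketched with tentative writes, loosely in the spirit of the file versions approach. The class and method names below mirror the transaction primitives but are otherwise hypothetical; the sketch assumes one open transaction at a time and keeps everything in memory, whereas a real service would keep the tentative data on stable storage.

```python
class TransactionalFile:
    """Writes go to a tentative copy and reach the file only on commit."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # committed state: block number -> data
        self.tentative = None       # pending writes of the open transaction

    def begin_transaction(self):
        self.tentative = {}

    def t_read(self, block):
        # A transaction sees its own uncommitted writes first.
        if self.tentative is not None and block in self.tentative:
            return self.tentative[block]
        return self.blocks.get(block)

    def t_write(self, block, data):
        self.tentative[block] = data

    def end_transaction(self):
        self.blocks.update(self.tentative)  # commit: install all changes together
        self.tentative = None

    def abort_transaction(self):
        self.tentative = None  # discard: the committed state was never touched
```

Because the committed blocks are untouched until `end_transaction`, an abort needs no undo work. A write ahead log would instead record intended changes before applying them.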

Page 25: Cs 704 D Aos Distr File System

Concurrency Control-1

●Locking
●Optimized locking
  □Type specific locking
  □Read and write locks. When a read lock is set, other reads may be allowed but not writes; when a write lock is on, no other operations are permitted
  □Intention-to-write locking; along with a commit lock, it permits better concurrency
●Two phase locking
●Granularity of locking
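The read/write lock compatibility rule can be sketched as a small lock table. `LockManager` and its methods are hypothetical names; requests return False instead of blocking, to keep the example short.

```python
class LockManager:
    """Shared read locks, exclusive write locks, per data item."""

    def __init__(self):
        self.readers = {}  # item -> set of transaction ids holding a read lock
        self.writer = {}   # item -> transaction id holding the write lock

    def acquire_read(self, item, txn):
        if item in self.writer and self.writer[item] != txn:
            return False  # another transaction is writing
        self.readers.setdefault(item, set()).add(txn)
        return True

    def acquire_write(self, item, txn):
        other_readers = self.readers.get(item, set()) - {txn}
        if other_readers or self.writer.get(item, txn) != txn:
            return False  # someone else is reading or writing
        self.writer[item] = txn
        return True

    def release_all(self, txn):
        """Drop every lock held by txn in one step."""
        for holders in self.readers.values():
            holders.discard(txn)
        for item in [i for i, t in self.writer.items() if t == txn]:
            del self.writer[item]
```

Releasing locks only through `release_all` at the end of a transaction keeps the acquiring phase and the releasing phase separate, which is the discipline two phase locking asks for.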

Page 26: Cs 704 D Aos Distr File System

Concurrency Control-2

●Granularity of locking
  □File, page or a record
●Handling of locking deadlocks
  □Avoidance, detection, timeouts
●Optimistic concurrency control
  □Let the first phase proceed without restrictions, check for validity before commitment
  □Advantages: free from deadlocks, allows maximum parallelism
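Optimistic concurrency control can be sketched with version numbers: a transaction reads and buffers writes freely, then validates at commit that nothing it read has changed. `VersionedStore` and `OptimisticTxn` are hypothetical names. The sketch runs in one thread, so it assumes validation and installation happen as one indivisible step.

```python
class VersionedStore:
    """Committed data plus a per-key count of committed writes."""

    def __init__(self):
        self.data = {}
        self.version = {}


class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version seen at first read
        self.writes = {}         # buffered writes, invisible to others

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation: fail if any key we read was committed to since.
        for key, seen in self.read_versions.items():
            if self.store.version.get(key, 0) != seen:
                return False
        # Write phase: install buffered writes and bump versions.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True
```

No transaction ever waits for another, so deadlock cannot occur; the cost is that a transaction failing validation has to be rerun.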

Page 27: Cs 704 D Aos Distr File System

Design Principles of DFS

●Clients have cycles to burn: do as much as possible locally
●Cache whenever possible: you never know when you are going to need the item again, so keep a local copy
●Exploit usage properties: group files
●Minimize system wide knowledge and change
●Trust the fewest possible entities
●Batch if possible