Posted on 31-Mar-2015
DISTRIBUTED FILE SYSTEM SUMMARY
RANJANI SANKARAN
Outline
• Characteristics of DFS
• DFS Design and Implementation
• Transaction and Concurrency Control
• Data and File Replication
• Current Work
• Future Work
DFS Characteristics
Dispersion
• Dispersed files – location transparency, location independence
• Dispersed clients – login transparency, access transparency
Multiplicity
• Multiple files – replication transparency
• Multiple clients – concurrency transparency
Others (general)
• Fault tolerance – crash of server or client, loss of messages
• Scalability – incremental file system growth
DFS STRUCTURE[3]
• DFS Root – top level; holds links to shared folders in a domain.
• DFS Link – a share under the root; the link redirects to a shared folder.
• DFS Replicas or Targets – identical shares on two servers can be grouped together as targets under one link.
MAPPING OF LOGICAL AND PHYSICAL FOLDERS[2]
DFS Design and Implementation
• Problems – file sharing and file replication
Files and File Systems
• File name – mapping a symbolic name to a unique file id (ufid or file handle), which is the function of the directory service.
• File attributes – ownership, type, size, timestamp, access-authorization information.
• Data units – flat / hierarchical structure
• File access – sequential, direct, indexed-sequential
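The name-to-ufid mapping performed by the directory service can be sketched minimally as follows. The hash-derived ufid is an illustrative assumption for the sketch; a real server would issue its own unique identifiers.

```python
import hashlib

class DirectoryService:
    """Minimal sketch: maps symbolic path names to unique file ids (ufids)."""

    def __init__(self):
        self._table = {}  # symbolic name -> ufid

    def create(self, name):
        # Derive a stable ufid from the name (illustrative only;
        # real servers generate identifiers independently of the name).
        ufid = hashlib.sha1(name.encode()).hexdigest()[:16]
        self._table[name] = ufid
        return ufid

    def lookup(self, name):
        # Name resolution: symbolic name -> ufid (file handle)
        return self._table[name]

    def delete(self, name):
        del self._table[name]
```

Clients would then use the returned ufid, not the symbolic name, in subsequent file-service requests.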
COMPONENTS IN A FILE SYSTEM
• Directory service – name resolution, addition and deletion of files
• Authorization service – capability and/or access control list
• File service
  – Transaction – concurrency and replication management
  – Basic – read/write files and get/set attributes
• System service – device, cache, and block management
Overview of FS Services
• DIRECTORY SERVICE – search, create, delete, and rename files; mapping and locating; list a directory; traverse the file system.
• AUTHORIZATION SERVICE – authorized access for security; read, write, append, execute, delete, and list operations.
• FILE SERVICE – transaction service; basic service: read, write, open, close, delete, truncate, seek.
• SYSTEM SERVICE – replication, caching, mapping of addresses, etc.
SERVICES AND SERVERS
• Servers (possibly multiple) implement services.
• The client <-> server relationship is relative.
File Mounting
• Attach a remote named file system to the client’s file system hierarchy at the position pointed to by a path name
• Once files are mounted, they are accessed by using the concatenated logical path names without referencing either the remote hosts or local devices.
• Location transparent.
• Link information (the mount table) is kept until the file system is unmounted.
• Different clients may perceive a different FS view.
  – To achieve a global FS view, the system administrator enforces mounting rules.
  – Mounting is restricted/allowed via the server's export file.
Types of Mounting
– Explicit mounting: clients make explicit mounting system calls whenever one is desired.
– Boot mounting: a set of file servers is prescribed, and all mounts are performed at the client's boot time.
– Auto-mounting: mounting of the servers is done implicitly, on demand, when a file is first opened by a client.
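The mount-table bookkeeping described above can be sketched as follows. The class and its longest-prefix resolution are illustrative assumptions, not a real client implementation; in particular, the prefix match is simplified and ignores path-component boundaries.

```python
class MountTable:
    """Sketch of client-side mount state: a logical path is resolved
    through the mount table until the file system is unmounted."""

    def __init__(self):
        self._mounts = {}  # mount point -> (server, remote root)

    def mount(self, mount_point, server, remote_root):
        self._mounts[mount_point] = (server, remote_root)

    def unmount(self, mount_point):
        del self._mounts[mount_point]

    def resolve(self, path):
        # Longest-prefix match: find the deepest mount point covering the
        # path, then rewrite it into (server, remote path). The client
        # never names the remote host directly.
        best = max((m for m in self._mounts if path.startswith(m)),
                   key=len, default=None)
        if best is None:
            return None  # purely local path, no remote host involved
        server, root = self._mounts[best]
        return server, root + path[len(best):]
```

Explicit, boot, and auto-mounting differ only in *when* `mount` is called: by a user system call, at boot time, or lazily on first open.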
Server Registration
• The mounting protocol is not transparent – the initial mounting requires knowledge of the location of the file servers.
• Server registration:
  – File servers register their services, and clients consult the registration server before mounting.
  – Alternatively, clients broadcast mounting requests, and file servers respond to the clients' requests.
Stateful and Stateless File Servers
• Stateful file server: the server maintains state information about clients between requests.
• Stateless file server: when a client sends a request, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request.
  – Between requests, no client-specific information is kept on the server.
  – Each request must be self-contained: full file name, offset, etc.
• State information could be:
  • Opened files and their clients
  • File descriptors and file handles
  • Current file-position pointers, mounting information
  • Cache or buffer contents
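The self-contained nature of stateless requests can be sketched like this (the class and its in-memory file store are illustrative assumptions): every call carries the full file name, offset, and length, so the server needs no open-file table or per-client position pointer.

```python
class StatelessFileServer:
    """Sketch: the server keeps no per-client tables; every request
    carries everything needed to serve it in isolation."""

    def __init__(self, files):
        self._files = files  # name -> bytes (stand-in for disk storage)

    def read(self, name, offset, length):
        # No session state is consulted or updated: the full file name
        # and offset arrive with the request itself, so a server crash
        # between requests loses nothing the client depends on.
        return self._files[name][offset:offset + length]
```

A stateful server would instead return a handle from `open` and track the current position itself, which is faster per request but harder to recover after a crash.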
File Access and Semantics of Sharing
• File sharing
  – Overlapping access: multiple copies of the same file
    • Cache or replication; space multiplexing
    • Coherency control: coherent view of shared files, managing access to replicas, atomic updates
  – Interleaved access: multiple granularities of data-access operations
    • Time multiplexing
    • Simple read/write, transaction, session
    • Concurrency control: prevent erroneous or inconsistent results during concurrent access
Semantics of Sharing/Replication
• Unix semantics – currentness: writes are propagated immediately, so reads return the latest value.
• Transaction semantics – consistency: writes are stored and propagated when consistency constraints are met.
• Session semantics – efficiency: writes are done on a working copy; results are made permanent when the session closes.
REPLICATION
• Write policies
• Cache coherence control
• Version control
Transaction and Concurrency Control
• A concurrency control protocol is required to maintain ACID semantics for concurrent transactions.
• Distributed transaction processing system:
  – Transaction manager: correct execution of local and remote transactions.
  – Scheduler: schedules operations to avoid conflicts, using locks, timestamps, and validation managers.
  – Object manager: coherency of replicas/caches; interface to the file system.
Transaction and Concurrency Control
Serializability
• A schedule is serializable if the result of its execution is equivalent to that of a serial schedule (no cyclic hold-and-wait deadlock situations, no holding of conflicting locks, etc.).
• Across transactions, the transaction states must remain consistent.
• Conflicts – write-write, write-read, and read-write operations on a shared object.
Interleaving Schedules
• Schedules (1,3) and (2,4) try to perform conflicting operations on data objects C and D.
• Only the (1,2) and (3,4) orderings are valid.
Concurrency Control Protocols
• Two-phase locking:
  – Growing phase, shrinking phase.
  – Sacrifices concurrency and sharing for serializability.
  – Circular wait (deadlock) example:
    t0: write A = 100; write B = 20
    t1: read A, read B; 1. write sum in C; 2. write diff in D
    t2: read A, read B; 3. write sum in D; 4. write diff in C
  – Solution: release locks as soon as possible.
  – Problem: rolling aborts, commit dependence.
  – Solution: strict two-phase locking.
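The strict two-phase locking discipline can be sketched minimally as follows. The single shared lock table and the immediate conflict error are simplifying assumptions; a real scheduler would block the requester or run deadlock detection instead of raising.

```python
class Transaction:
    """Strict 2PL sketch: locks only accumulate during the growing phase
    and are all released together at commit (the shrinking phase), which
    avoids rolling aborts at the cost of concurrency."""

    def __init__(self, lock_table):
        self._lock_table = lock_table  # shared: object -> owning transaction
        self._held = []

    def lock(self, obj):
        owner = self._lock_table.get(obj)
        if owner is self:
            return  # already held; nothing to do
        if owner is not None:
            # In a real scheduler this would block (risking circular wait)
            # or trigger deadlock handling; the sketch just reports it.
            raise RuntimeError("conflict: object locked by another Tx")
        self._lock_table[obj] = self
        self._held.append(obj)

    def commit(self):
        # Strict 2PL: release every lock at once, only at commit time,
        # so no other transaction ever reads uncommitted values.
        for obj in self._held:
            del self._lock_table[obj]
        self._held.clear()
```

Releasing locks earlier (plain 2PL) improves concurrency but reintroduces the rolling-abort and commit-dependence problems noted above.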
Time Stamp Ordering
• Logical timestamps or counters provide unique timestamps for transactions.
• Larger-TS transactions wait for smaller-TS transactions; smaller-TS transactions die and restart when confronting larger-TS transactions.
• Example: t0 (50 ms) < t1 (100 ms) < t2 (200 ms)
  t0: write A = 100; write B = 20 → completed
  t1: read A, read B; 1. write sum in C; 2. write diff in D
  t2: read A, read B; 3. read sum in C; 4. write diff in C
Time Stamp Ordering Concurrency Control
• RD and WR – logical timestamps of the last read and write of an object.
• Tmin is the minimum tentative time among pending writes.
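The per-object RD/WR check can be sketched as a basic timestamp-ordering rule (a simplified sketch; real schedulers also track tentative writes via Tmin, which is omitted here):

```python
class TimestampedObject:
    """Sketch: each object remembers the logical timestamp of its last
    read (RD) and last write (WR); an operation arriving 'too late'
    forces its transaction to restart with a new timestamp."""

    def __init__(self, value=None):
        self.rd = 0       # timestamp of the last read
        self.wr = 0       # timestamp of the last write
        self.value = value

    def read(self, ts):
        if ts < self.wr:
            # A newer transaction already wrote this object.
            raise RuntimeError("restart: read arrived after a newer write")
        self.rd = max(self.rd, ts)
        return self.value

    def write(self, ts, value):
        if ts < self.rd or ts < self.wr:
            # A newer transaction already read or wrote this object.
            raise RuntimeError("restart: write conflicts with newer access")
        self.wr = ts
        self.value = value
```

In the slide's example, t2 (TS 200) reading C after t1 (TS 100) wrote it succeeds, while any later attempt by t1 to overwrite C would restart t1.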
Optimistic Concurrency Control
• Allows the entire transaction to complete, then validates the transaction before making its effects permanent.
• Phases: execution phase, validation phase, update phase.
• Validation uses a two-phase commit protocol, sending a validation request to all TMs.
• Validated updates are committed in the update phase.
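The local validation step can be sketched as a read-set/write-set intersection test (an illustrative backward-validation sketch; the distributed two-phase commit across TMs is not modeled here):

```python
def validate(read_set, committed_write_sets):
    """Validation-phase sketch of optimistic concurrency control:
    the transaction may commit only if no transaction that committed
    concurrently wrote an object this transaction read."""
    for write_set in committed_write_sets:
        if read_set & write_set:
            return False  # conflict: abort and restart the transaction
    return True
```

Because checking happens only at the end, no locks are held during the execution phase; the price is wasted work whenever validation fails.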
Data and File Replication
• For concurrent access and availability.
• GOAL – one-copy serializability: the execution of transactions on replicated objects is equivalent to the execution of the same transactions on non-replicated objects.
• Read operations: read-one-primary, read-one, read-quorum.
• Write operations: write-one-primary, write-all, write-all-available, write-quorum, write-gossip.
• Quorum voting
• Gossip update propagation
• Causal-order gossip protocol
ARCHITECTURE
• The client chooses one or more FSAs to access data objects.
• The FSA acts as a front end to the replica managers (RMs) to provide replication transparency.
• The FSA contacts one or more RMs for the actual updating and reading of data objects.
Quorum Voting/Gossip Update Propagation
• Quorum voting uses a read quorum and a write quorum:
  – Write-write conflict: 2 × write quorum > all object copies.
  – Read-write conflict: write quorum + read quorum > all object copies.
• Gossip update propagation:
  – Read: if TSfsa <= TSrm, the RM has recent data and returns it; otherwise wait for gossip, or try another RM.
  – Update: if TSfsa > TSrm, perform the update, advance TSrm, and send gossip; otherwise, process based on the application – perform the update or reject it.
  – Gossip: update the RM if the gossip carries new updates.
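The two quorum constraints above can be checked directly (the function name is illustrative):

```python
def valid_quorum(n_copies, read_quorum, write_quorum):
    """Checks the two quorum-voting constraints over n_copies replicas:
    any two write quorums must overlap, and any read quorum must
    overlap any write quorum, so stale reads and conflicting writes
    are always detected."""
    write_write_ok = 2 * write_quorum > n_copies
    read_write_ok = read_quorum + write_quorum > n_copies
    return write_write_ok and read_write_ok
```

For example, with 5 copies, read-one requires write-all (r=1, w=5), while a balanced r=3, w=3 also satisfies both constraints.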
Gossip Update Protocol
• Used in a fixed RM configuration.
• Uses vector timestamps, and uses a buffer to keep updates in order.
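The vector-timestamp comparisons behind the TSfsa/TSrm tests can be sketched as follows (a minimal sketch with one vector entry per RM; the update buffer and ordering logic are omitted):

```python
def dominates(ts_a, ts_b):
    """True if ts_a has seen at least everything ts_b has seen.
    An RM can answer a read when its timestamp dominates the FSA's
    (the TSfsa <= TSrm condition from the slide)."""
    return all(a >= b for a, b in zip(ts_a, ts_b))

def merge(ts_a, ts_b):
    """Element-wise maximum: applied when a gossip message arrives,
    so the RM's timestamp reflects every update either side has seen."""
    return [max(a, b) for a, b in zip(ts_a, ts_b)]
```

If neither timestamp dominates the other, the two replicas hold concurrent updates and the RM must buffer or reconcile them before replying.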
Current Work
Here are some links to current distributed file system and related projects:
• Ceph: http://ceph.newdream.net/ (petabyte-scale DFS, POSIX-compatible and fault-tolerant)
• GlusterFS: http://www.gluster.org/
• HDFS: http://hadoop.apache.org/hdfs/
• HekaFS: http://www.hekafs.org/
• OrangeFS: http://www.orangefs.org/ and http://www.pvfs.org/
• KosmosFS: http://code.google.com/p/kosmosfs/
• MogileFS: http://danga.com/mogilefs/
• Swift (OpenStack Storage): http://www.openstack.org/projects/storage/
• FAST '11 proceedings: http://www.usenix.org/events/fast11/tech/
Future Work
• Usability/scalability issues relate to the cost of traversal in distributed file systems, as the traditional model of file traversal might not be suitable for searching/indexing [3].
• File systems are adding support for their own indexing (continuous/incremental updates of indexes).
• The NFS family might become increasingly irrelevant for more geographically distributed enterprises.
• Innovations in multi-tenancy and security for distributed/cloud computing.
References
1. R. Chow and T. Johnson, Distributed Operating Systems & Algorithms, 1997.
2. http://www.windowsnetworking.com/articles_tutorials/Implementing-DFS-Namespaces.html – DFS Namespaces reference.
3. http://www.quora.com/Distributed-Systems/What-is-the-future-of-file-systems – Future of file systems.
4. http://www.cs.iit.edu/~iraicu/research/publications/2011_LSAP2011_exascale-storage.pdf – Issues with DFS at exascale.
5. http://www.usenix.org/publications/login/2010-08/openpdfs/maltzahn.pdf – Ceph as a scalable alternative to Hadoop's HDFS.
6. http://www-sop.inria.fr/members/Patrick.Valduriez/pmwiki/Patrick/uploads//Conferences/dexa2011.pdf – Distributed data management in 2020?
7. http://www.greenplum.com/media-center/big-data-use-cases/agile-analytics – Hadoop as a possible future solution.
THANK YOU