Wide Area Data Sharing with Logistical Networking Micah Beck, Assoc. Prof. & Director Logistical Computing & Internetworking (LoCI) Lab Computer Science Department [email protected] End-to-End Workshop, Miami Feb 5, 2003


Page 1: Wide Area Data Sharing with Logistical Networking Micah Beck, Assoc. Prof. & Director Logistical Computing & Internetworking (LoCI) Lab Computer Science

Wide Area Data Sharing with Logistical Networking

Micah Beck, Assoc. Prof. & Director
Logistical Computing & Internetworking (LoCI) Lab
Computer Science Department

[email protected]

End-to-End Workshop, Miami Feb 5, 2003

Page 2

Logistical Networking Research

» University of Tennessee
  • Micah Beck
  • James S. Plank
  • Jack Dongarra
» University of California, Santa Barbara
  • Rich Wolski
» LoCI Lab developers
» Funding
  • Dept. of Energy SciDAC
  • National Science Foundation ANIR
  • UT Center for Info Technology Research

Page 3

The Data Sharing Problem

» Large data objects created as byproducts of common operations

» A large community of potential collaborators that might need access to the data

» Asynchrony between collaborators (especially when in different time zones)

» No single administrative domain
» No centrally managed resource pool (DB or FS)
» Control of access to data is necessary

Page 4

The Internet in Collaboration

» Providing bandwidth resources on demand
» Network community is fluid, loosely organized
» Any two endpoints can communicate
» User authentication, security are managed by endpoints
» But: there is no persistence, hence no direct support for asynchronous collaboration!

Page 5

Logistical Networking in Collaboration

» Adding persistence of data to the network while maintaining other important properties:
  • Resources available to any community member
    » 10TB now, 50TB by 2003, 100TB-1PB goal
  • No centralized administrative domain
    » Each allocation is individually managed
  • No central management of resource pool
    » Allocations are time limited
    » Storage reclaimed when they expire!
  • Access control, security managed by endpoints.

Page 6

The Network Storage Stack

Stack layers, top to bottom:

Applications
Logistical File System
Logistical Tools
L-Bone | exNode
IBP
Local Access
Physical

• Our adaptation of the network stack architecture for storage
• Like the IP Stack
• Each level encapsulates details from the lower levels, while still exposing details to higher levels

Page 7

IBP: The Internet Backplane Protocol

» Storage provisioned on community “depots”
» Very primitive service (similar to block service, but more sharable)
  • Goal is to be a common platform (exposed)
  • Also part of end-to-end design
» Best effort service – no heroic measures
  • Availability, reliability, security, performance
» Allocations are time-limited!
  • Leases are respected, can be renewed
  • Permanent storage is too strong to share!
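The lease behaviour described on this slide can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyDepot`, its method names, and the use of a random hex string as a capability are hypothetical stand-ins, not the IBP client API. Time is passed in explicitly as `now` so the behaviour is easy to follow.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Allocation:
    size: int          # maximum number of bytes the allocation may hold
    expires_at: float  # absolute expiry time, in seconds
    data: bytearray = field(default_factory=bytearray)


class ToyDepot:
    """In-memory stand-in for a storage depot (hypothetical, not the IBP API)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.allocations = {}  # capability string -> Allocation

    def reclaim(self, now: float) -> None:
        """Drop every allocation whose lease has expired."""
        expired = [c for c, a in self.allocations.items() if a.expires_at <= now]
        for cap in expired:
            del self.allocations[cap]

    def allocate(self, size: int, duration: float, now: float) -> Optional[str]:
        """Best effort: return a capability, or None if the depot is too full."""
        self.reclaim(now)
        in_use = sum(a.size for a in self.allocations.values())
        if in_use + size > self.capacity:
            return None
        cap = secrets.token_hex(16)  # an unguessable name acts as the capability
        self.allocations[cap] = Allocation(size, now + duration)
        return cap

    def renew(self, cap: str, duration: float, now: float) -> bool:
        """Extend a live lease; an expired lease cannot be revived."""
        self.reclaim(now)
        alloc = self.allocations.get(cap)
        if alloc is None:
            return False
        alloc.expires_at = now + duration
        return True

    def store(self, cap: str, payload: bytes, now: float) -> bool:
        """Append bytes to a live allocation if they fit."""
        self.reclaim(now)
        alloc = self.allocations.get(cap)
        if alloc is None or len(alloc.data) + len(payload) > alloc.size:
            return False
        alloc.data.extend(payload)
        return True

    def load(self, cap: str, now: float) -> Optional[bytes]:
        """Return the stored bytes, or None once the lease has expired."""
        self.reclaim(now)
        alloc = self.allocations.get(cap)
        return None if alloc is None else bytes(alloc.data)
```

Because expired allocations are reclaimed before every operation, the depot never needs a central administrator to free space: a client that stops renewing simply loses its storage.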

Page 8

Models of Sharing: Logistical Networking

» Moderately valuable resources
  • Storage, server cycles
» Sharing enabled by relative plenty
» Internet-like policies
  • Loose access control
  • No per-use accounting
» Primary design goal: scalability
  • Application autonomy
  • Resource transparency
» Burdens of scalability
  • The End-to-End Principles
  • Weak operation semantics
  • Vulnerability to Denial of Service

Page 9

The Network Storage Stack

The L-Bone: Resource Discovery & Proximity queries
IBP: Allocating and managing network storage (like a network malloc)
The exNode: A data structure for aggregation
LoRS: The Logistical Runtime System: Aggregation tools and methodologies

Page 10

The Logistical Backbone (L-Bone)

» LDAP-based storage resource discovery.

» Query by capacity, network proximity, geographical proximity, stability, etc.

» Periodic monitoring of depots.

» Currently 10 Terabytes of shared storage.
  • 50 TB awarded, 100 TB proposed
  • Our goal is 1 PB global total
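A query by capacity and geographical proximity might look roughly like the following. This is a sketch only: the real L-Bone is LDAP-based, whereas this example filters a plain Python list. The `DepotRecord` fields, the hostnames, and the approximate coordinates are illustrative assumptions, not L-Bone data.

```python
import math
from dataclasses import dataclass


@dataclass
class DepotRecord:
    host: str     # hypothetical hostname
    free_mb: int  # advertised free capacity in megabytes
    lat: float    # latitude in degrees
    lon: float    # longitude in degrees


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def find_depots(records, min_free_mb, near, limit=3):
    """Return up to `limit` depots with enough free space, nearest first."""
    lat, lon = near
    eligible = [d for d in records if d.free_mb >= min_free_mb]
    eligible.sort(key=lambda d: distance_km(lat, lon, d.lat, d.lon))
    return eligible[:limit]


# Illustrative records; hosts and free-space figures are made up.
DEPOTS = [
    DepotRecord("depot-knoxville.example.org", 2000, 35.96, -83.92),
    DepotRecord("depot-santa-barbara.example.org", 800, 34.42, -119.70),
    DepotRecord("depot-miami.example.org", 100, 25.76, -80.19),
]

# A client in Miami that needs 500 MB skips the nearby depot that is too full.
nearest = find_depots(DEPOTS, min_free_mb=500, near=(25.76, -80.19))
```

A production query would also weigh network proximity and depot stability, which the slide lists as further criteria.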

Page 11

L-Bone: January 2003

Page 12

The Network Storage Stack

The L-Bone: Resource Discovery & Proximity queries
IBP: Allocating and managing network storage (like a network malloc)
The exNode: A data structure for aggregation
LoRS: The Logistical Runtime System: Aggregation tools and methodologies

Page 13

The exNode

» The Network “File Descriptor”
» XML-based data structure/serialization
» Maps byte-extents to IBP buffers (or other allocations).
» Allows for replication, flexible decomposition of data.
» Also allows for error-correction/checksums
» Arbitrary metadata.
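The mapping of byte-extents to allocations can be sketched as a small data structure. The class and field names below are assumptions made for illustration and do not reproduce the actual exNode library. Capabilities are treated as opaque strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Mapping:
    offset: int      # first byte of the file covered by this mapping
    length: int      # number of bytes covered
    capability: str  # opaque reference to a storage allocation


@dataclass
class ExNode:
    length: int                                             # logical file size
    mappings: List[Mapping] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)  # arbitrary metadata

    def covering(self, offset: int, length: int) -> List[Mapping]:
        """Mappings that overlap the byte range [offset, offset + length)."""
        end = offset + length
        return [m for m in self.mappings
                if m.offset < end and offset < m.offset + m.length]

    def is_complete(self) -> bool:
        """True if every byte of the file is covered by at least one mapping."""
        covered = 0
        for m in sorted(self.mappings, key=lambda m: m.offset):
            if m.offset > covered:
                return False  # a gap precedes this mapping
            covered = max(covered, m.offset + m.length)
        return covered >= self.length
```

Overlapping mappings express replication: any byte covered by two mappings can be fetched from either allocation, and the extents need not be uniform in size.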

Page 14

ExNode vs inode

[Figure: exNode compared with inode]
exNode (user space): capabilities refer to IBP allocations in the network
inode (kernel): block addresses refer to disk blocks on the local system

Page 15

ExNode Mobility

XML Serialization

The exNode serialization is a portable soft link
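To make the "portable soft link" idea concrete, here is a minimal round-trip sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`exnode`, `mapping`, `metadata`) are invented for this example and are not the real exNode schema.

```python
import xml.etree.ElementTree as ET


def exnode_to_xml(length, mappings, metadata):
    """Serialize (offset, length, capability) mappings plus metadata to XML text."""
    root = ET.Element("exnode", {"length": str(length)})
    for name, value in metadata.items():
        ET.SubElement(root, "metadata", {"name": name}).text = value
    for offset, size, capability in mappings:
        attrs = {"offset": str(offset), "length": str(size)}
        ET.SubElement(root, "mapping", attrs).text = capability
    return ET.tostring(root, encoding="unicode")


def exnode_from_xml(text):
    """Parse the XML text back into (length, mappings, metadata)."""
    root = ET.fromstring(text)
    metadata = {e.get("name"): e.text for e in root.findall("metadata")}
    mappings = [(int(e.get("offset")), int(e.get("length")), e.text)
                for e in root.findall("mapping")]
    return int(root.get("length")), mappings, metadata
```

The resulting string holds only references to the stored data, so it can be mailed or posted on a web page, and anyone holding it can locate the data.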

Page 16

The Network Storage Stack

The L-Bone: Resource Discovery & Proximity queries
IBP: Allocating and managing network storage (like a network malloc)
The exNode: A data structure for aggregation
LoRS: The Logistical Runtime System: Aggregation tools and methodologies

Page 17

Logistical Runtime System

» Basic Primitives:
  • Upload, Download, Augment, Refresh
» End-to-end Services
  • Checksums, Encryption, Compression
» Other Things We Can Do
  • Routing through an intermediate depot to reduce IP RTT, speeding up TCP transfers
  • Overlay multicast using either multiple TCP streams or IP multicast at tree nodes
» What’s missing?
  • Management by Applications!
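An end-to-end checksum means the uploader computes the digest and the downloader verifies it, so a best-effort depot never has to be trusted. The sketch below illustrates that idea with plain dictionaries standing in for depots. The function names, the block naming scheme, and the choice of SHA-256 are assumptions for illustration, not the LoRS tools themselves.

```python
import hashlib


def upload(data: bytes, block_size: int, depots: list):
    """Split data into blocks, replicate each to every depot, record digests.

    Each depot is modelled as a dict mapping a block name to bytes.
    Returns a manifest of (block name, SHA-256 hex digest) pairs.
    """
    manifest = []
    for i, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        name = f"block-{i}"
        for depot in depots:
            depot[name] = block
        manifest.append((name, hashlib.sha256(block).hexdigest()))
    return manifest


def download(manifest, depots: list) -> bytes:
    """Reassemble the data, accepting only replicas whose digest matches."""
    out = bytearray()
    for name, digest in manifest:
        for depot in depots:
            block = depot.get(name)
            if block is not None and hashlib.sha256(block).hexdigest() == digest:
                out.extend(block)
                break
        else:
            # No depot returned an intact copy of this block.
            raise IOError(f"no intact replica of {name}")
    return bytes(out)
```

With two replicas, a corrupted or missing block on one depot is masked by the other; the download fails only when every replica of some block is bad.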

Page 18

Upload

Page 19

Augment

Page 20

Download

Page 21

Routing through Intermediate Depots
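One way to see why a shorter RTT per hop can help is the widely used Mathis approximation for steady-state TCP throughput, roughly (MSS / RTT) × 1.22 / sqrt(loss rate). The sketch below applies it to a direct path and to a path split at a depot. The model, the RTT and loss figures, and the assumption that a pipelined relay runs at the speed of its slowest leg are all simplifications introduced here, not measurements from the talk.

```python
import math


def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput in bytes per second."""
    return (mss_bytes / rtt_s) * (1.22 / math.sqrt(loss_rate))


def relayed_throughput(mss_bytes, legs):
    """Throughput of a pipelined relay: limited by its slowest leg.

    `legs` is a list of (rtt_s, loss_rate) pairs, one per TCP connection.
    """
    return min(mathis_throughput(mss_bytes, rtt, p) for rtt, p in legs)


MSS = 1460  # bytes, a typical Ethernet-sized segment

# Hypothetical path: 100 ms RTT and 0.01% loss end to end.
direct = mathis_throughput(MSS, rtt_s=0.100, loss_rate=1e-4)

# Same path split at a midpoint depot: each leg sees half the RTT and,
# by assumption, half the loss.
relayed = relayed_throughput(MSS, [(0.050, 5e-5), (0.050, 5e-5)])

speedup = relayed / direct
```

Under these assumptions the relay comes out ahead because each TCP connection recovers from loss over a shorter round trip; real gains depend on depot load and on where the losses actually occur.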

Page 22

IBP Enables Data Intensive Collaboration

» Large files can be uploaded to nearby depots, then managed by movement between depots
  • End systems are not involved in long distance transfers
» Data can be moved near a distant collaborator without being downloaded into their end system
  • Direct access to a collaborator's private storage is not required
» Depot-to-depot transfers can take advantage of multithreading, UDP transfer, Net/Web 100, other high-performance optimizations

Page 23

The Next Step: Computation!

» Depots can store data, but cannot compute, e.g.
  • Recomputing checksums for stored data would help maintain redundancy
  • Operations such as XOR required to recover redundantly stored data in case of loss
» The Network Functional Unit is an extension of the depot that operates on stored data
  • NFU operations are limited, cannot access data outside of depot
  • Management of “process state” must be performed at end systems.
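The XOR recovery mentioned above is the single-parity scheme: the parity block is the byte-wise XOR of the data blocks, and any one lost block equals the XOR of the parity with the survivors. The sketch below shows the arithmetic an operation of this kind would perform. The function names are illustrative and are not an NFU interface.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def recover(blocks, parity, lost_index):
    """Rebuild the block at `lost_index` from the other blocks and the parity."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors + [parity])


# Three data blocks stored on different depots, plus one parity block.
blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(blocks)

# Suppose the depot holding block 1 becomes unavailable.
damaged = [blocks[0], None, blocks[2]]
rebuilt = recover(damaged, parity, lost_index=1)
```

Run at an end system, this requires downloading every surviving block; letting a depot perform the XOR on data it already holds is the kind of saving the NFU is meant to provide.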

Page 24

LoCI Lab Online http://loci.cs.utk.edu

» IBP server and clients for Unix/Linux/OS X
  • Additional clients for Java, Win32
» Logistical Runtime System libraries and tools
  • Run under Unix/Linux/OS X natively
  • Ported to Windows under Cygwin
  • Includes visualization (Tcl/Tk)
  • Web interface
» Logistical Backbone resource discovery server
  • Unix/Linux/OS X only

» Publications, documentation, L-Bone status