INS/Twine: A Scalable Peer-to-Peer Architecture for
Intentional Resource Discovery
Magdalena Balazinska, Hari Balakrishnan, and David Karger
MIT – Laboratory for Computer Science
http://nms.lcs.mit.edu/
Problem Description
• Abundant ubiquitous computation and communication
• Increasingly large computing environments
• Dynamic environments
• Many possible “cool” applications
Locate resources using intentional descriptions
INS Overview
INR: Intentional Name Resolver
[Figure: several INRs forming a self-configuring resolver network.]
• Resources advertise their capabilities
• Client describes attributes of required resources
Resource Discovery Goals
• Allow client applications to locate services and devices
• Handle sophisticated resource descriptions
• Handle dynamism in the operating environment
• Scale to large numbers of resources
Existing Solutions for Scalability
• Partitioning
[Figure: sensors and proxies attached to resolvers. Figure labels: Bldg 1, Bldg 2, Bldg 3; Floors 1-3, Floors 4-6; Cameras; “?”.]
• Hierarchies
[Figure: sensors and proxies attached to resolvers arranged in a hierarchy.]
INS/Twine Contributions
• Collaborating peer resolvers: no content or location constraints on queries
• Scalability and load distribution via hash-based partitioning of resource descriptions among resolvers
• Semi-structured resource descriptions with arbitrary attribute sets
• Network dynamism
• Designed for an environment where all resources are equally important to users
INS/Twine Approach Overview
[Figure: a network of seven resolvers, with a sensor attached through a proxy. Two descriptions are advertised: “resource = camera, subject = traffic” and “resource = motion sensor, subject = traffic”. Individual resolvers hold the pieces “subject = traffic”, “resource = motion sensor”, and “resource = camera”.]
1. A resource advertises its descriptions and network location to any resolver
2. The description is converted into a canonical form: attribute-value tree (AVTree)
3. Using the content of the advertised description, the resolver determines which resolvers should know about the resource
4. The resolver forwards the description to these resolvers for storage
5. Similarly for queries
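Architecture of One Resolver
[Figure: a resource advertisement (Res adv.) is split into strands of attributes and values (a1, v1, a2, v2, …). The StrandMapper computes a key h = hash(a1v1-a2v2), for example 0110 1001 0000. The KeyRouter passes the key to the Distributed Hash Table, where Best(key) selects the K nodes that are chosen.]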
Splitting Descriptions into Strands
Resource description: AVTree
root
  subject
    traffic
  resource
    camera
      manufacturer
        ACompany
      model
        AModel
Six strands:
  subject, traffic
  resource, camera
  resource, camera, manufacturer
  resource, camera, manufacturer, ACompany
  resource, camera, model
  resource, camera, model, AModel
• Each strand is then hashed into a 128-bit value, which determines the nodes that will store the resource information.
• All queries, even short-stranded queries, require asking only one resolver!
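The strand splitting above can be sketched in a few lines. This is only an illustration under stated assumptions: the nested-dict encoding of the AVTree, the function names `strands` and `strand_key`, the "-" separator, and the choice of MD5 as the 128-bit hash are all hypothetical; the slides say only that each strand is hashed into a 128-bit value.

```python
import hashlib

# Hypothetical encoding of the AVTree from the slide:
# attribute -> {value -> {child attribute -> ...}}
AVTREE = {
    "subject": {"traffic": {}},
    "resource": {"camera": {
        "manufacturer": {"ACompany": {}},
        "model": {"AModel": {}},
    }},
}

def strands(tree, prefix=()):
    """Yield every root path prefix ending in an attribute or a value,
    skipping prefixes made of a single top-level attribute."""
    for attr, values in tree.items():
        attr_path = prefix + (attr,)
        if prefix:  # attribute-only strands are kept only below the top level
            yield attr_path
        for value, children in values.items():
            value_path = attr_path + (value,)
            yield value_path
            yield from strands(children, value_path)

def strand_key(strand):
    """Hash a strand to a 128-bit integer (MD5 is an assumption)."""
    digest = hashlib.md5("-".join(strand).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

ALL_STRANDS = list(strands(AVTREE))
for s in ALL_STRANDS:
    print(s, format(strand_key(s), "032x"))
```

For the example tree this enumeration produces six strands, matching the count on the slide.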
Distributed Hash Table: Chord
• Nodes and keys have 160-bit IDs
• Chord maps IDs to “successor”
• Successor: node with next highest ID
[Figure: circular ID space with nodes N5, N10, N20, N32, N60, N80, N99, N110. N32 stores key-values for keys 21..32; N60 stores keys 33..60.]
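A minimal sketch of the successor rule, using the node IDs from the figure. The function name `successor` and the sorted-list representation are assumptions for illustration; a real Chord node knows only part of the ring.

```python
from bisect import bisect_left

# Node IDs taken from the ring in the figure.
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def successor(key, nodes=NODES):
    """Return the first node ID >= key, wrapping around to the smallest node."""
    i = bisect_left(nodes, key)
    return nodes[i % len(nodes)]

print(successor(21), successor(32))  # both fall in N32's range 21..32
print(successor(33), successor(60))  # both fall in N60's range 33..60
print(successor(111))                # wraps around the ring to N5
```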
Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120 linked by successor pointers. The query “Where is key 80?” is answered with “N90 has K80”.]
“Finger table” allows log(N)-time lookups
[Figure: the fingers of node N80 span 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring.]
• finger[i] points to successor(n + 2^i)
• log(N) fingers in all
• K = log(N) immediate successors for robustness
• Stabilization methods for concurrency
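The finger rule can be sketched as follows. The ring below reuses the node IDs N5 through N110 from the earlier Chord figure, and the 7-bit ID space (IDs 0..127) is an assumption chosen so those IDs fit; Chord itself uses 160-bit IDs. The routine `lookup` is a simplified, centralized imitation of finger-based routing, not the actual distributed protocol.

```python
from bisect import bisect_left

M = 7  # assumed number of ID bits, so IDs range over 0..127
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def successor(key):
    """First node ID >= key on the ring, wrapping around."""
    i = bisect_left(NODES, key % 2**M)
    return NODES[i % len(NODES)]

def finger_table(n):
    """finger[i] = successor(n + 2^i) for i = 0..M-1."""
    return [successor(n + 2**i) for i in range(M)]

def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def lookup(start, key):
    """Return (responsible node, list of nodes visited)."""
    n, hops = start, [start]
    while True:
        succ = successor(n + 1)
        if in_half_open(key, n, succ):
            return succ, hops
        # Jump to the farthest finger that still precedes the key.
        nxt = next((f for f in reversed(finger_table(n))
                    if in_open(f, n, key)), succ)
        n = nxt
        hops.append(n)

print(finger_table(80))
print(lookup(80, 25))
print(lookup(5, 100))
```

Each hop moves to the farthest known node that still precedes the key, which is what keeps the number of hops logarithmic in the number of nodes.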
Back to Example
[Figure: the same network of seven resolvers as in the approach overview, with a sensor attached through a proxy. The descriptions “resource = camera, subject = traffic” and “resource = motion sensor, subject = traffic” are advertised; individual resolvers hold “subject = traffic”, “resource = motion sensor”, and “resource = camera”.]
Properties of INS/Twine
• For a resource description with a attributes, t of them at the top level:
  – Number of strands: s = 2a - t
• For R resources, S strands, K replication level, and N resolvers:
  – Storage requirement at each resolver: (RSK)/N
• Advertisement:
  – SK resolvers contacted (+ O(log N) for key routing)
• Query:
  – K resolvers contacted (+ O(log N) for key routing)
  – 100% success rate for fewer than K failures
  – Failure rate decreases exponentially with K
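The formulas above can be checked with a short calculation. The function names and sample numbers are hypothetical. The failure model (each of the K resolvers failing independently with probability p) is an assumption used to illustrate why the failure rate drops exponentially with K; the slides do not state a model.

```python
def num_strands(a, t):
    """Strands for a description with a attributes, t at the top level."""
    return 2 * a - t

def storage_per_resolver(R, S, K, N):
    """Average number of stored entries per resolver: (R * S * K) / N."""
    return R * S * K / N

def advertisement_contacts(S, K):
    """Resolvers contacted per advertisement, excluding key routing."""
    return S * K

def query_failure_probability(p, K):
    """Assumed model: a query fails only if all K replicas fail,
    each independently with probability p."""
    return p ** K

# The camera example has a = 4 attributes, t = 2 at the top level.
print(num_strands(4, 2))
# Hypothetical deployment: 1000 resources, 6 strands each, K = 2, 100 resolvers.
print(storage_per_resolver(1000, 6, 2, 100))
print(advertisement_contacts(6, 2))
print(query_failure_probability(0.1, 3))
```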
State Management
• Resources join, move, leave and fail
• Resolvers join and fail
• How to maintain consistency while achieving fault tolerance?
  – Hard state
  – Soft state
  – Hybrid solution implemented in INS/Twine
[Figure sequence: a resource attached to a network of INRs (Intentional Name Resolvers). In one frame, “Remove” messages reach three resolvers; in a later frame, the entries “Expire” at three resolvers.]
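One way to picture combining the two behaviors shown in the figures, explicit removal and expiry, is the sketch below. It is not the INS/Twine protocol, whose details these slides do not give; the class `SoftStateStore`, its methods, and the TTL value are all hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class SoftStateStore:
    """Per-resolver table in which entries can be removed explicitly
    (hard-state style) or lapse when not refreshed (soft-state style)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._expiry = {}  # resource id -> time at which the entry lapses

    def advertise(self, resource_id, now):
        """Insert a new entry or refresh an existing one."""
        self._expiry[resource_id] = now + self.ttl

    def remove(self, resource_id):
        """Explicit removal, e.g. when a resource leaves cleanly."""
        self._expiry.pop(resource_id, None)

    def expire(self, now):
        """Drop entries whose lifetime has elapsed; return their ids."""
        stale = [r for r, t in self._expiry.items() if t <= now]
        for r in stale:
            del self._expiry[r]
        return stale

    def knows(self, resource_id, now):
        """True if the entry exists and has not lapsed."""
        t = self._expiry.get(resource_id)
        return t is not None and t > now

store = SoftStateStore(ttl=30)
store.advertise("camera-1", now=0)
store.advertise("sensor-7", now=0)
store.remove("camera-1")              # clean departure
store.advertise("sensor-7", now=20)   # refresh extends the lifetime
print(store.expire(now=40))           # nothing stale yet
print(store.expire(now=60))           # sensor-7 lapses without a refresh
```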
Evaluation: Data Distribution
[Figure: cumulative distribution plot. x-axis: fraction of resources (0 to 0.45); y-axis: cumulative fraction of resolvers (0 to 1). Curves: mp3 descriptions; mp3 descriptions, threshold 50; bibliographical entries; bibliographical entries, threshold 100.]
Data distribution is rather even. Each resolver holds a small fraction of the resource descriptions.
Evaluation: Query Resolution
[Figure: cumulative distribution plot. x-axis: fraction of queries (0 to 0.06); y-axis: cumulative fraction of resolvers (0 to 1). Curves: mp3 descriptions; bibliographical entries.]
Even distribution of queries among resolvers
Conclusion
• Intentional resource discovery
• Scalable peer-to-peer network of resolvers
• Hash-based mapping of resource descriptions to resolvers
• Dynamic and even distribution of resource information and queries
• Handles dynamism of resources and resolvers
http://nms.lcs.mit.edu/projects/twine/
Appendix
Describing Resources
• XML
<service>printer <type>color</type> <speed>slow</speed></service><cost>high</cost>
• INS name-specifier
[service=printer [type=color] [speed=slow]][cost=high]
• AVTrees
root
  service
    printer
      type
        color
      speed
        slow
  cost
    high
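To show how the bracketed name-specifier relates to the AVTree, here is a small parser sketch. The nested-dict encoding (attribute -> value -> child attributes) and the function name `parse_name_specifier` are assumptions for illustration. The grammar is inferred from the single example above, not from an INS specification.

```python
def parse_name_specifier(text):
    """Parse '[a=v [b=w]][c=x]' into {attr: {value: {child attrs...}}}."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_groups():
        nonlocal pos
        tree = {}
        skip_ws()
        while pos < len(text) and text[pos] == "[":
            pos += 1                      # consume '['
            eq = text.index("=", pos)     # raises ValueError if '=' is missing
            attr = text[pos:eq].strip()
            pos = eq + 1
            start = pos
            while pos < len(text) and text[pos] not in "[]":
                pos += 1
            value = text[start:pos].strip()
            children = parse_groups()     # nested [attr=value] groups, if any
            if pos >= len(text) or text[pos] != "]":
                raise ValueError("expected ']' at position %d" % pos)
            pos += 1                      # consume ']'
            tree.setdefault(attr, {})[value] = children
            skip_ws()
        return tree

    tree = parse_groups()
    if pos != len(text):
        raise ValueError("unexpected text at position %d" % pos)
    return tree

SPEC = "[service=printer [type=color] [speed=slow]][cost=high]"
print(parse_name_specifier(SPEC))
```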
Problems using concatenation
1. If numerous resources share the same prefix, some nodes may receive significantly more load than others
2. Fully solving short-stranded queries requires the collaboration of a number of resolvers that grows linearly with network size
3. Addressing problems 1 and 2 imposes contradictory requirements!
Distributed Hash Table: Chord
A distributed hash-table is used to map keys onto resolvers efficiently:
From: “Chord: A Peer-to-Peer Lookup Service for Internet Applications,” Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, Hari Balakrishnan. Proc. ACM SIGCOMM Conf., San Diego, CA, September 2001.
Problems using prefixes
1. More insertions for each resource. This is a small factor, since we expect descriptions to be rather short
2. Very popular prefixes may overload certain nodes with many advertisements and queries. Such a prefix should then become unusable:
   a. Nodes stop storing resources for that prefix
   b. Nodes answer queries for the prefix by specifying that they provide a partial answer due to the vague nature of the query