Working Group Outbrief
Resource discovery & description and federation
Workshop on the Development of a Next-Generation, Interoperable, Federated Network Cyberinfrastructure
October 1-2, 2014
Working Group Charge & Members
• Identify areas of commonality, differences, and holes in participant communities’ approaches to and capabilities for:
– Resource description & discovery
– Federation & interoperation
• Members: Ilya Baldin, Mark Berman (scribe), Dave Hart, Kate Keahey, Orran Krieger, Larry Landweber, Inder Monga, Warren Smith, Sejun Song
Group Composition and High-Level Findings
• Our working group broadly represented the overall set of workshop communities
• We have significant differences in basic goals for resource discovery, description and federation
– We agree that cyberinfrastructure federation is generally desirable at a basic level, at a granularity not requiring detailed, machine-readable descriptions.
– Members are wary of seeking fully (or even moderately) general approaches to resource description and discovery or to CI interoperation.
– We agree that there are specific interoperation examples that merit additional effort, but these should be pursued individually to avoid over-engineered solutions.
Resource Description and Discovery
• Questions considered:
– What is the range of resources that must be described?
– What resource description frameworks are in place?
– What are the holes?
– What level of detail in resource specification is useful?
– How should unimportant ("don't care") details be abstracted?
– How can researcher-support software tools work to match resource descriptions to researcher needs?
Resource Description and Discovery Challenges
• Different projects represented within our group have very different needs for resource description
– At one extreme, XSEDE users often tailor software to a specific resource – they are intimately familiar with capabilities by resource name and would not benefit from detailed description.
– Different GENI, CloudLab, Chameleon users are expected to desire different levels of description, from “N compute nodes on a network” down to detailed hardware specs, ports, cabling, firmware versions and instrumentation.
– Resource descriptions may be intended for human consumption or automated processing by software tools.
– Resources may include compute, network, storage, cloud-like entities, scientific instruments, people, complex service configs, …
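One way to picture the range of request detail described above, from "N compute nodes" to specific firmware versions, is a request in which any unstated field means "don't care". The following is only a minimal sketch under that assumption. The class names, fields, and catalog entries are hypothetical and do not come from GENI, CloudLab, Chameleon, or XSEDE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeDescription:
    """What a provider advertises for one compute node (hypothetical fields)."""
    name: str
    cores: int
    memory_gb: int
    firmware: str


@dataclass(frozen=True)
class NodeRequest:
    """What a researcher asks for; None means "don't care"."""
    min_cores: Optional[int] = None
    min_memory_gb: Optional[int] = None
    firmware: Optional[str] = None


def matches(node: NodeDescription, req: NodeRequest) -> bool:
    """Return True if the node satisfies every constraint the request states."""
    if req.min_cores is not None and node.cores < req.min_cores:
        return False
    if req.min_memory_gb is not None and node.memory_gb < req.min_memory_gb:
        return False
    if req.firmware is not None and node.firmware != req.firmware:
        return False
    return True


def find_nodes(catalog, req, count):
    """Return the names of the first `count` matching nodes, or [] if too few match."""
    hits = [n.name for n in catalog if matches(n, req)]
    return hits[:count] if len(hits) >= count else []


# Illustrative catalog; the entries are made up for this sketch.
CATALOG = [
    NodeDescription("node-a", cores=8, memory_gb=32, firmware="1.2"),
    NodeDescription("node-b", cores=16, memory_gb=64, firmware="1.2"),
    NodeDescription("node-c", cores=16, memory_gb=64, firmware="2.0"),
]

# Coarse request: any two nodes, no other constraints.
coarse = find_nodes(CATALOG, NodeRequest(), count=2)
# Detailed request: two 16-core nodes running one specific firmware version.
detailed = find_nodes(CATALOG, NodeRequest(min_cores=16, firmware="2.0"), count=2)
```

The same matcher serves both ends of the spectrum: the coarse request is satisfied by any two nodes, while the detailed one fails here because only one node in this made-up catalog runs the requested firmware.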
Resource Description and Discovery Findings 1 – Desired Capabilities
• Basic resource catalogs with general human-readable descriptions are often helpful
• When additional description is useful
– Resource descriptions should be coupled with relevant policy (e.g., “Can I use this resource?”)
– A wide spectrum of visibility is desirable, ranging from full disclosure to views tailored to a particular user’s access permissions
– Human- and machine-readable info is needed
– Unique items (e.g. scientific instruments) should be described to the level required for the desired level of interoperation, such as a data model
– Resource discovery should be a process, supporting multiple levels of detail, drill-down, negotiation, and exploration of alternatives
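The visibility spectrum above could be approximated by tagging each field of a description with the minimum access tier allowed to see it, then rendering the filtered view for people or for tools. This is a rough sketch only. The tier names, field names, and values are assumptions made for illustration, not part of any framework discussed at the workshop.

```python
import json

# Hypothetical visibility tiers, ordered from least to most privileged.
PUBLIC, MEMBER, OPERATOR = 0, 1, 2

# Each field maps to (value, minimum tier allowed to see it).
RESOURCE = {
    "name": ("storage-1", PUBLIC),
    "summary": ("100 TB shared file system", PUBLIC),
    "allocation_policy": ("project members only", MEMBER),
    "switch_port": ("rack 4, port 17", OPERATOR),
}


def view(resource, tier):
    """Return only the fields that the given tier is permitted to see."""
    return {key: value for key, (value, min_tier) in resource.items() if tier >= min_tier}


def as_text(resource_view):
    """Human-readable rendering of a view, one field per line."""
    return "\n".join(f"{key}: {value}" for key, value in resource_view.items())


def as_json(resource_view):
    """Machine-readable rendering of the same view."""
    return json.dumps(resource_view, sort_keys=True)


public_view = view(RESOURCE, PUBLIC)      # catalog-level description
operator_view = view(RESOURCE, OPERATOR)  # full disclosure
```

Drill-down then amounts to requesting the same resource at a higher tier. Negotiation and exploration of alternatives are outside what this sketch covers.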
Resource Description and Discovery Findings 2 – Missing Capabilities
• Existing resource description frameworks lack adequate information models to support sophisticated tool-supported decision making
– Desirable to (optionally) abstract resources into services offered. We currently bridge this gap with expert knowledge.
• Different users and tools need varying levels of resource and service abstraction – this is true for both resource description and discovery / reservation
– Example: provisioning a database – needs coordinated storage, network, compute, software (may include a license)
• Resource availability is a time-varying phenomenon, opening potential for race conditions and scheduling failures
– How to reconcile very different scheduling paradigms?
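The database example and the race-condition concern can be illustrated together. If availability is checked in one step and reserved in another, a competing request may claim a resource in between. A minimal sketch, assuming a single in-memory inventory (a deliberate simplification, since real federations span independent providers), makes the check and the reservation one all-or-nothing step. The class, method, and resource names are hypothetical.

```python
import threading


class Inventory:
    """Toy in-memory inventory; a real system would span several providers."""

    def __init__(self, free):
        self._free = dict(free)
        self._lock = threading.Lock()

    def reserve_all(self, request):
        """Reserve every requested quantity, or nothing at all.

        The availability check and the update happen under one lock, so a
        competing request cannot take resources between the two steps.
        """
        with self._lock:
            if any(self._free.get(kind, 0) < amount for kind, amount in request.items()):
                return False
            for kind, amount in request.items():
                self._free[kind] -= amount
            return True

    def free(self, kind):
        """Return the currently unreserved quantity of one resource kind."""
        with self._lock:
            return self._free.get(kind, 0)


inventory = Inventory({"storage_tb": 10, "compute_nodes": 4, "db_licenses": 1})
database = {"storage_tb": 5, "compute_nodes": 2, "db_licenses": 1}

first = inventory.reserve_all(database)   # succeeds and takes the only license
second = inventory.reserve_all(database)  # fails on the license; nothing else is taken
```

This sketch does not address the harder open question of reconciling different scheduling paradigms. It only shows why coordinated, atomic reservation matters once several resource types must be acquired together.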
Federation and Interoperation
• Questions considered:
– What capabilities from one community's cyberinfrastructure offer the most promising benefit to others?
– If you (or your user community) wanted to do your work using another community's cyberinfrastructure, what would you have to learn? What difficulties would you expect? What would you hope to gain?
– What underlying assumptions in one community's cyberinfrastructure may require rethinking to realize the advantages of federation? Areas to consider include:
  • Usage models (batch v. interactive; application process; etc.)
  • How much do users know or need to know about the infrastructure?
  • Who owns / manages the infrastructure v. who gets to use it?
  • What types of experiment are supported / accepted / desired?
– What concrete steps can be taken by research cyberinfrastructure developers and owners to advance federation quickly and bring mutual benefit?
Federation and Interoperation Challenges
• Group members have very different senses of what levels of integration and interoperation are useful for their cyberinfrastructure efforts.
– Federation at the level of basic information services and user access is widely useful
  • Human-readable descriptions
  • Single access for identity and user attributes
  • Common currency management for allocations in systems that have this concept
– Interoperation among systems with highly divergent execution models may be more effort than merited
  • Example: XSEDE’s queue-based scheduling may be difficult to reconcile with GENI’s immediate access model
Federation and Interoperation Findings 1 – Desired Capabilities
• Federation at a basic level is possible and may be useful without significant interoperation
– Multiple levels of integration: resource information only; user access; “client-side” interoperation (manual); common gateway; policy-based interoperation
• We identified several desirable interoperation examples
– Common portal and request embedding logic to enable researchers to identify Chameleon, GENI, Grid’5000 as a suitable execution environment, with or without data connectivity or multi-testbed configurations
– Extended DANCES example from yesterday, including coordinated compute and storage resource management based on measured network behavior and understanding of scarcity – resources on campus and in testbeds
Federation and Interoperation Findings 2 – Missing Capabilities
• Current representations of workflow and resource specification have embedded assumptions about usage models, degree of resource interchangeability, availability
– Explicit representation of these factors is needed to support intelligent, automation-assisted resource management, at configuration time and dynamically during execution.
– Dynamic resource management needs reliable instrumentation.
• Common scheduling and/or workflow models are a prerequisite for automated interoperation
• These capabilities should be pursued on a selective basis to minimize wasted effort
– Cloud and grid computation models may be better suited to deep interoperation than often hand-optimized HPC environments
• The anticipated approach to realizing these new capabilities is a combination of underlying infrastructure functions with clever researcher support tools.