GENI Engineering Conference -- Ian Foster
DESCRIPTION
I was invited to talk at the 18th GENI Engineering Conference (http://groups.geni.net/geni/wiki/GEC18Agenda) on experiences in the Grid community with creating and operating large shared infrastructures. I chose to focus on our experiences using Software as a Service (SaaS, aka Cloud) to reduce barriers to the use of the capabilities required to create and operate virtual organizations.
TRANSCRIPT
www.ci.anl.gov www.ci.uchicago.edu
Hosted services for managing shared cyberinfrastructure
Ian Foster
Argonne National Laboratory & The University of Chicago
Joint work with Rachana Ananthakrishnan, Josh Bryan, Kyle Chard, Mattias Lidman, Steven Tuecke, and others
GENI Engineering Conference, NYC, October 28, 2013
Using cloud services to accelerate discovery
Cyberinfrastructure
• “a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge” [Wikipedia]
• AKA eScience, eResearch, Computer Supported Collaborative Work, Grid, …
“The Anatomy of the Grid,” 2001
The … problem that underlies the Grid concept is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations. The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).
Large Hadron Collider
Grid technology accelerates discovery
Higgs discovery “only possible because of the extraordinary achievements of … grid computing” -- Rolf Heuer, CERN DG
http://gstat2.grid.sinica.edu.tw/gstat/vo/atlas/
LHC Computing Grid “virtual organizations”
Complexity in research is large and growing
Steps over time:
• Run experiment
• Collect data
• Move data
• Check data
• Annotate data
• Share data
• Find similar data
• Link to literature
• Analyze data
• Publish data
Process automation for discovery
Steps over time:
• Run experiment
• Collect data
• Move data
• Check data
• Annotate data
• Share data
• Find similar data
• Link to literature
• Analyze data
• Publish data
Discovery IT as a service
First: File transfer as a service
Data Source -> Data Destination
1. User initiates transfer request
2. Globus Online moves and syncs files
3. Globus Online notifies user
Easy, Fast, Reliable, Available, Secure
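The three steps above (request, move and sync, notify) can be illustrated with a small in-memory sketch. This is not the Globus Online API: the `Endpoint` class, the `transfer` function, the checksum-based sync rule, and the file names are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def checksum(data: bytes) -> str:
    """Content fingerprint used to decide whether a file needs copying."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Endpoint:
    """Hypothetical stand-in for a storage endpoint: path -> file bytes."""
    name: str
    files: dict = field(default_factory=dict)


@dataclass
class TransferResult:
    copied: list
    skipped: list
    message: str


def transfer(source: Endpoint, destination: Endpoint, paths: list) -> TransferResult:
    """Steps 2 and 3: copy the requested files, skip those already in sync, report back."""
    copied, skipped = [], []
    for path in paths:
        data = source.files[path]
        existing = destination.files.get(path)
        if existing is not None and checksum(existing) == checksum(data):
            skipped.append(path)  # destination already holds identical content
            continue
        destination.files[path] = data
        copied.append(path)
    # The "notification" here is just a summary string returned to the caller.
    message = (
        f"Transfer {source.name} -> {destination.name}: "
        f"{len(copied)} copied, {len(skipped)} already in sync"
    )
    return TransferResult(copied, skipped, message)


# Step 1: the user initiates a transfer request (hypothetical endpoints and files).
lab = Endpoint("lab", {"run1.dat": b"abc", "run2.dat": b"def"})
campus = Endpoint("campus", {"run1.dat": b"abc"})
result = transfer(lab, campus, ["run1.dat", "run2.dat"])
print(result.message)
```

In this sketch only `run2.dat` is copied, because `run1.dat` is already identical at the destination; a hosted service would add retries, authentication, and asynchronous notification on top of this basic loop.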
Early adoption is encouraging
12,000 registered users; >150 daily
>25 PB moved; >1B files
10x (or better) performance vs. scp
99.9% availability
Entirely hosted on Amazon
Next: Share big data from existing storage
Data Source
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus Online tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus Online and accesses shared file
Example permissions:
File X: Users A, B: RW
Directory Y: Group G: R
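The example permissions on this slide (File X: Users A, B: RW; Directory Y: Group G: R) can be modeled as a small access-control check. The sketch below is an assumption about how such rules might be evaluated, not the sharing service's implementation; the `Share` class, the `/data/...` paths, and the membership of group G are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Share:
    """One sharing rule: a path plus per-user and per-group permission strings."""
    path: str
    is_directory: bool = False
    user_perms: dict = field(default_factory=dict)   # user -> "R" or "RW"
    group_perms: dict = field(default_factory=dict)  # group -> "R" or "RW"


def covers(share: Share, path: str) -> bool:
    """A file share covers only itself; a directory share covers everything beneath it."""
    if share.is_directory:
        return path == share.path or path.startswith(share.path.rstrip("/") + "/")
    return path == share.path


def allowed(shares, groups, user: str, path: str, mode: str) -> bool:
    """Return True if any rule grants `mode` ("R" or "W") on `path` to `user`."""
    for share in shares:
        if not covers(share, path):
            continue
        if mode in share.user_perms.get(user, ""):
            return True
        for group, perms in share.group_perms.items():
            if user in groups.get(group, set()) and mode in perms:
                return True
    return False


# The slide's example rules, with hypothetical paths and group membership.
shares = [
    Share("/data/X", user_perms={"A": "RW", "B": "RW"}),
    Share("/data/Y", is_directory=True, group_perms={"G": "R"}),
]
groups = {"G": {"B", "C"}}

print(allowed(shares, groups, "B", "/data/X", "W"))
print(allowed(shares, groups, "C", "/data/Y/part1", "W"))
```

Note that the rules reference files where they already live; nothing in the check requires copying data to separate cloud storage, which is the point the slide makes.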
Globus Online is SaaS for science
Globus Nexus (Identity, Group, Profile)
Sharing Service
Transfer Service
Globus Toolkit
Globus Connect
SaaS
We are now expanding to a platform
Globus Nexus (Identity, Group, Profile)
Sharing Service
Transfer Service
Globus Toolkit
Globus Online APIs
Globus Connect
SaaS, PaaS
Globus Toolkit
Sharing Service
Transfer Service
Globus Nexus (Identity, Group, Profile)
Globus Online APIs
Globus Connect
Globus Online: Platform-as-a-Service
The identity challenge in science
• Research communities often need to
– Assign identities to their users
– Manage user profiles
– Organize users into groups for authorization
• Obstacles to high-quality implementations
– Complexity of associated security protocols
– Creation of identity silos
– Multiple credentials for users
– Reliability, availability, scalability, security
Sharing Service
Transfer Service
Globus Toolkit
Globus Online APIs
Globus Connect
Streamline collaborative tool development
Globus Nexus (Identity, Group, Profile)
Globus Nexus (Identity, group, & profile management)
Custom Web Application
• Allows developers to focus on core application logic
• Simplifies integration with campus infrastructure
Nexus provides four key capabilities
• Identity provisioning
– Create, manage Globus identities
• Identity hub
– Link with other identities; use to authenticate to services
• Group hub
– User-managed groups; groups can be used for authorization
• Profile management
– User-managed attributes; can use in group admission
Key points:
1) Outsource identity, group, profile management
2) REST API for flexible integration
3) Intuitive, customizable Web interfaces
Identity provisioning
• Globus Nexus can act as an identity provider (IDP) for a project
– User management, email validation…
• DOE Systems Biology Knowledge Base (kBase) is an example of such a project. ~400 identities to date
Identity hub
• Link identities from other federated IDP(s) with a Nexus identity
– E.g., InCommon/Campus (SAML), Google (OpenID), XSEDE (OAuth MyProxy), IGTF-certified X.509 CA, SSH
• Use linked identity to authenticate to Nexus
– E.g., use campus identity, XSEDE identity (via OAuth)
• Leverage Nexus federated IDP to 3rd-party services
– Via OAuth or LDAP
– E.g., to Jira, Zendesk, Drupal, Confluence
• Have Nexus cache delegated credentials
– X.509, via CILogon and MyProxy
Identity management
Identity hub: Biomedical science
(BIRN = Biomedical Informatics Research Network)
• Dr. Smith creates a Nexus id, via BIRN project interface
• Dr. Smith links campus id and XSEDE id
• Dr. Smith can then:
– Authenticate to BIRN with campus id
– Query catalog (Nexus/BIRN id)
– Request data transfer from BIRN to campus (Nexus and campus ids)
– Request transfer from BIRN to XSEDE (Nexus and XSEDE ids)
– Repeat these tasks: use cached credentials
Diagram labels:
BIRN Gateway; BIRN; Campus
Campus (SAML): campus identity (Name: Dr. Smith, Email: [email protected])
XSEDE (OAuth): XSEDE identity
Nexus identity (Name: Dr. Smith, Email: [email protected], Linked id: Campus, Linked id: XSEDE)
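The linking scenario in this slide can be sketched as a lookup table from external identities to a single hub identity. This is a minimal illustration under stated assumptions, not the Nexus REST API: the `IdentityHub` class, its method names, and the external id strings are hypothetical, and credential caching is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class NexusIdentity:
    """Hypothetical hub identity that external identities are linked to."""
    name: str
    linked: set = field(default_factory=set)  # (provider, external id) pairs


class IdentityHub:
    """Maps (provider, external id) pairs to the hub identity that owns them."""

    def __init__(self):
        self._by_external = {}

    def link(self, identity: NexusIdentity, provider: str, external_id: str) -> None:
        key = (provider, external_id)
        owner = self._by_external.get(key)
        if owner is not None and owner is not identity:
            raise ValueError(f"{provider} id is already linked to another identity")
        self._by_external[key] = identity
        identity.linked.add(key)

    def authenticate(self, provider: str, external_id: str):
        """Return the hub identity for an external login, or None if it is not linked."""
        return self._by_external.get((provider, external_id))


hub = IdentityHub()
smith = NexusIdentity("Dr. Smith")
hub.link(smith, "campus", "smith@campus.example")  # hypothetical campus (SAML) id
hub.link(smith, "xsede", "smith_x")                # hypothetical XSEDE (OAuth) id

# Logging in with either linked identity resolves to the same hub identity.
print(hub.authenticate("campus", "smith@campus.example").name)
print(hub.authenticate("xsede", "smith_x").name)
```

Because both external identities resolve to one record, a gateway such as BIRN could accept a campus login and still act on resources that expect the XSEDE identity.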
Use linked identity
Group hub
• User-managed group creation, management
• Flexible control over admission policies and visibility
• Groups can be used in authorization decisions
Example: kBase
• Every kBase user added to kbase_users
• Subgroups also created
• Groups used for access control
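A group hub of this kind can be sketched as groups that track a membership status per user, with an admission policy deciding whether a join request is approved immediately. The sketch below is an assumption for illustration only: the `Group` class, the `open_admission` flag, the subgroup name, and the user name are hypothetical; only `kbase_users` comes from the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    PENDING = "pending"
    REJECTED = "rejected"


@dataclass
class Group:
    name: str
    open_admission: bool = False                 # hypothetical admission policy flag
    members: dict = field(default_factory=dict)  # user -> Status

    def request_join(self, user: str) -> Status:
        """Open groups admit immediately; others queue the request for an admin."""
        status = Status.ACTIVE if self.open_admission else Status.PENDING
        self.members[user] = status
        return status

    def decide(self, user: str, approve: bool) -> None:
        """An administrator approves or rejects a pending request."""
        if self.members.get(user) is not Status.PENDING:
            raise ValueError(f"{user} has no pending request in {self.name}")
        self.members[user] = Status.ACTIVE if approve else Status.REJECTED

    def is_active(self, user: str) -> bool:
        return self.members.get(user) is Status.ACTIVE


def authorize(user: str, required_groups: list) -> bool:
    """Access control: the user must be an active member of every required group."""
    return all(group.is_active(user) for group in required_groups)


kbase_users = Group("kbase_users", open_admission=True)
curators = Group("kbase_users.curators")  # hypothetical subgroup name

kbase_users.request_join("alice")
curators.request_join("alice")
before = authorize("alice", [kbase_users, curators])  # request still pending
curators.decide("alice", approve=True)
after = authorize("alice", [kbase_users, curators])
print(before, after)
```

Keeping the status on the membership record, rather than deleting rejected requests, mirrors the way the usage figures later in the talk count pending, invited, rejected, and suspended members separately.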
Group membership interface
Branded sites
• Open Science Grid
• University of Chicago
• XSEDE
• DOE kBase
• Indiana University
• University of Exeter
• Globus Online
• NERSC
• NIH BIRN
Implementation and deployment
Diagram labels:
Elastic Load Balancer
Nexus (REST API, Web), shown three times
Monitoring
Logging
OSSEC
Globus Nexus usage as of 9/13
• >12,000 users and 4977 linked identities
• 557 groups totaling:
– 1638 active members
– 229 pending or invited members
– 162 rejected or suspended members
• Largest group (kbase) has 402 members
[Chart: Total users, Nov-10 through Aug-13; vertical axis 0 to 14,000]
[Chart: Users in group, one point per group; logarithmic vertical axis]
Identities and groups in XSEDE
• Proposal: Replace current ad-hoc systems with Globus Nexus identity and group service
– Reduce complexity, reduce cost, increase capability
• Careful process of documentation and review
– “Architecture and development requirements: User and identity management”
– “User management proposal: Affected use cases”
– “User management proposal: Motivating stories”
– “Proposal: Refactoring XSEDE identity and group capabilities”
• Hope to reach closure by end of 2013
Cloud services to accelerate discovery
Accelerate discovery and innovation worldwide by providing research IT as a service
Leverage software-as-a-service to
• provide millions of researchers with unprecedented access to powerful tools;
• enable a massive shortening of cycle times in time-consuming research processes; and
• reduce research IT costs dramatically via economies of scale
Thanks to ... U.S. Department of Energy
Thank you! Questions?
[email protected]@uchicago.edu
www.globusonline.org