The Big Questions
Nature of the universe Consciousness
Life & deathFuture of the planet
eSS 2007 Service-Oriented Science: Globus Software in Action 2
How Do We Answer Them?
<0 1700 1950 1990
Empirical
Data
Theory
Simulation
eSS 2007 Service-Oriented Science: Globus Software in Action 3
The Same is True of Smaller Questions
Designing new chemical catalysts
Selling advertising
Creating entertainment
Finding parking
eSS 2007 Service-Oriented Science: Globus Software in Action 4
Information Technology and Science
“Paul Erdös claimed that a mathematician is a machine for turning coffee into theorems. The scientist is arguably a machine for turning data into insight.”
— Service-Oriented Science, I. Foster, Science, 308, p. 814.
eSS 2007 Service-Oriented Science: Globus Software in Action 5
What are the Products of Science? Papers
“We learned this, and here’s how.” Data and Datasets
“We collected this data. Download it, write an analysis program, and see what you can learn from it.”
Web Portals “We constructed this scientific model. Use our data or bring your own, supply
some parameters, and see how it behaves.” Requires manual operation.
Web Services “Here’s our climate model. Integrate it with your models for [ocean
currents/weather/crop forecasts] and see what happens.” “Here’s our indexed data from the latest experiment run. Run your filters
against it and see if you can find anything interesting.” “Here’s our genome analysis engine. Upload your proteins and see what they
will do in a cell.”
Increasing degrees of
collaboration
eSS 2007 Service-Oriented Science: Globus Software in Action 6
Grid:An Enabler of eScience
Must we buy (or travel to) a power source?
Or can we ship power to where we want to work?
Enable on-demand access to, and integration of,diverse resources & services, regardless of location
The dubious electrical power grid analogy
eSS 2007 Service-Oriented Science: Globus Software in Action 7
1st Generation Grids Focus on aggregation of many resources for
massively (data-)parallel applications
EGEE
eSS 2007 Service-Oriented Science: Globus Software in Action 8
Second-Generation Grids
Empower many more users by enabling on-demand access to services
Science gateways (TeraGrid)
Service oriented science
Or, “Science 2.0”
“Service-Oriented Science”, Science, 2005
eSS 2007 Service-Oriented Science: Globus Software in Action 9
“Web 2.0” Software as services
Data- & computation-richnetwork services
Services as platforms Easy composition of services to create new
capabilities (“mashups”)—that themselves may be made accessible as new services
Enabled by massive infrastructure buildout Google projected to spend $1.5B on computers,
networks, and real estate in 2006 Many others are spending substantially
Paid for by advertising
Declan Butler,Nature
eSS 2007 Service-Oriented Science: Globus Software in Action 10
Automating Science
Human access to data is nice Automated access by software tools is
revolutionary “In the time that a human user takes to locate one
useful piece of information within a Web site, a program may access and integrate data from many sources and identify relationships that a human might never discover unaided.” - Foster
eSS 2007 Service-Oriented Science: Globus Software in Action 11
Science 2.0:E.g., Virtual Observatories
Data ArchivesData Archives
User
Analysis toolsAnalysis tools
Gateway
Figure: S. G. Djorgovski
Discovery toolsDiscovery tools
eSS 2007 Service-Oriented Science: Globus Software in Action 12
Service-Oriented Science
People create services (data or functions) …
which I discover (& decide whether to use) …
& compose to create a new function ...
& then publish as a new service.
I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!
I hope that this “someone else” can manage security, reliability, scalability, …!!
“Service-Oriented Science”, Science, 2005
eSS 2007 Service-Oriented Science: Globus Software in Action 13
Are Scientists Really Developing Web Services?
eSS 2007 Service-Oriented Science: Globus Software in Action 14
Cancer Bioinformatics Grid
Common system architecture (caGrid) provides the service interface “plumbing” and the service hosting capability Web services
Community participants supply useful services Data access, analysis, modeling, filtering,
authoring, etc. https://cabig.nci.nih.gov/
QuickTime™ and aTIFF (Uncompressed) decompressor
are needed to see this picture.
eSS 2007 Service-Oriented Science: Globus Software in Action 15
The Introduce Authoring Tool Define service Create skeleton Discover types Add operations Configure security Modify service
Introduce: Hastings, Saltz, et al., Ohio State University
Generates GT4-compatible WebServices
eSS 2007 Service-Oriented Science: Globus Software in Action 16
The Importance of “Hosting”and “Management”
Tell me aboutthis star Tell me about
these 20K stars
Support 1000sof users
E.g., Sloan DigitalSky Survey, ~10 TB;others much bigger
eSS 2007 Service-Oriented Science: Globus Software in Action 17
Skyserver Sessions(Thanks to Alex Szalay)
eSS 2007 Service-Oriented Science: Globus Software in Action 18
Hosting & Management:Application Hosting Services
ResourceProvider
ApplnCode
ApplnCode
Application client
AHS
management Hosting Service
Author ization
ResourceProvider
Provisioning
Persistence
Users
Admins
PDP
Policymanagement
Application deployment
ApplicationPrep Tool(s)
ApplnCode
Applicationproviders
eSS 2007 Service-Oriented Science: Globus Software in Action 19
Who Will Host Your Services?
Your institution (campus resources) (Inter)national systems
TeraGrid, Open Science Grid, UK Nat’l Grid Service, ChinaGrid, NaukaGrid, etc.
Science domain systems caBIG, NEES, Earth System Grid, Orion*,
LEAD, NEON*, LHC Computing Grid, etc. Commercial systems
Amazon, Google, etc.
eSS 2007 Service-Oriented Science: Globus Software in Action 20
Examples ofProduction Scientific Grids
APAC (Australia) China Grid China National Grid DGrid (Germany) EGEE NAREGI (Japan) Open Science Grid Taiwan Grid TeraGrid ThaiGrid UK Nat’l Grid Service
eSS 2007 Service-Oriented Science: Globus Software in Action 21
Application-Infrastructure Gap
Dynamicand/or
DistributedApplications
A
1
B
1
99
Shared Distributed Infrastructure
eSS 2007 Service-Oriented Science: Globus Software in Action 22
Bridging the Application-Resource Gap
IBM
IBM
Uniform interfaces,security mechanisms,Web service transport,
monitoring
Computers StorageSpecialized resource
UserApplication
IBM
IBM
GRAM GridFTPHost EnvUser Svc
DAIS
Database
ToolTool
Workflow
Credent.
Host EnvUser Svc
Registry
eSS 2007 Service-Oriented Science: Globus Software in Action 23
Grid Infrastructure
Distributed management Of physical resources Of software services Of communities and their policies
Unified treatment Build on Web services framework Use WS-RF, WS-Notification (or WS-
Transfer/Man) to represent/access state Common management
abstractions & interfaces