Life Science Grid Middleware in a More Dynamic Environment
GADA Workshop 1-2 November 2005
Life Science Grid Middleware in a More Dynamic Environment
Milena Radenkovic & Bartosz Wietrzyk
The University of Nottingham, UK
http://www.mygrid.org.uk
Talk Plan
1. From Grid middleware to WSRF and WSN
2. myGrid overview
3. Integrating myGrid with WSRF/WSN
4. Future: self-organizing Grids
Web Services
• Web services - the application-centric Web
  – Standards for message exchanges and interfaces
  – XML based
  – Programming language and platform independent
• Convergence of Grid and Web Services
• Web Services and the State
• Failure of the Open Grid Services Infrastructure (OGSI)
  – No modularity
  – Limited compatibility with existing Web Services
  – Too object oriented
WSRF and WSN
• Web Services Resource Framework (WSRF)
  – Generic and open framework for modelling and accessing stateful resources using Web Services
  – Standardizes the design patterns and message exchanges for expressing state
  – Instruction set for the Grid [Priol, 2005]
• Web Services Notification (WSN)
  – WSRF-based publish/subscribe notification
[Figure: specification stack]
Base: WSDL, SOAP, WS-Addressing
WSRF
  Obligatory: WS-ResourceProperties
  Optional: WS-BaseFaults, WS-RenewableReferences, WS-ResourceLifetime, WS-ServiceGroup
WS-Notification
  WS-BaseNotification, WS-Topics, WS-BrokeredNotification (split into obligatory and optional in the figure)
Resource modelling in WSRF
• Stateful resource + stateless Web Service = WS-Resource
• WS address + resource identifier = WS-Resource qualified endpoint reference
• Dynamic creation/destruction of resources
• The resource state is defined by the resource properties document
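The bullets above can be illustrated with a minimal Python sketch. It is not the WSRF wire protocol or the Apache WSRF API: the names `EndpointReference` and `ResourceHome`, the `res-N` identifier scheme and the example URL are all assumptions made for illustration. It only shows the pattern: the service keeps no state of its own, each stateful resource is addressed by a service address plus a resource identifier, and resources are created and destroyed dynamically.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class EndpointReference:
    """WS address plus resource identifier (hypothetical, simplified EPR)."""
    address: str       # URL of the stateless Web Service
    resource_id: str   # identifies one stateful resource behind that service


class ResourceHome:
    """Keeps the state of all resources reachable through one service address."""

    def __init__(self, address):
        self.address = address
        self._properties = {}   # resource_id -> resource properties "document"
        self._ids = count(1)    # assumed identifier scheme: res-1, res-2, ...

    def create(self, **properties):
        """Dynamically create a resource and return its endpoint reference."""
        resource_id = f"res-{next(self._ids)}"
        self._properties[resource_id] = dict(properties)
        return EndpointReference(self.address, resource_id)

    def get_property(self, epr, name):
        return self._properties[epr.resource_id][name]

    def set_property(self, epr, name, value):
        self._properties[epr.resource_id][name] = value

    def destroy(self, epr):
        """Destroy the resource; later access raises KeyError."""
        del self._properties[epr.resource_id]
```

A client would hold only the `EndpointReference` and pass it back on every call, which is what lets the service itself stay stateless.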
myGrid
• One of the leading EPSRC eScience pilot projects
• Open Source Semantic Grid middleware for Bioinformatics
• High-level services for data and application integration
  – resource discovery
  – distributed query processing
  – workflow enactment
• Additional services supporting scientific method
  – provenance management
  – change notification
  – personalization
myGrid architecture
[Figure: myGrid architecture]
Groups named in the figure: External Services, Core Services, Service and workflow discovery, Metadata management, Workflow Management, Data Management, E-Science coordination, Third-party tools, Web service communication fabric
Components named in the figure:
• Legacy applications, Web sites, Web services, OGSA-DAI databases
• Soaplab, Gowlab
• OGSA-DAI DQP service
• AMBIT text extraction service
• myGrid information model, myGrid ontology
• Feta semantic discovery, Pedro semantic publication
• mIR metadata store, provenance capture
• Freefluo workflow engine
• mIR (myGrid Information Repository)
• Notification service, e-Science mediator, LSID support, e-Science events
• Taverna e-Science workbench, Web portals, e-Science process patterns
• LSID Launchpad, Haystack, Utopia
In silico experiments in myGrid
• Scufl: Simple Conceptual Unified Flow Language
• Taverna: writing and running workflows and examining results
• SOAPLAB: makes applications available
• Freefluo: workflow engine to run workflows
[Figure: Freefluo invoking a SOAPLAB Web Service (wrapping any application), a Web Service (e.g. DDBJ BLAST) and the SeqHound service]
[Figure: service types shown: Soaplab service, WSDL Web Service, BioMOBY service, local Java service]
myGrid’s stateful components
• myGrid Information Repository (MIR)
  – Data entities
• Workflow Enactment
  – Enactment services
  – Workflow enactments
• myGrid Notification Service
myGrid Information Repository (MIR) – before WSRF
• MIR data model comprises entity types associated with XML schemas
• Entities are:
  – described by attributes
  – stored in a relational database
  – accessed through the Web Service interface
  – identified by Life Science IDs (LSIDs)
Our model
MIR entity        →  WS-Resource
Entity type       →  WS-Resource type
Entity attribute  →  WS-Resource property
LSID              →  WS-Resource Qualified Endpoint Reference
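The last row of the mapping can be sketched in Python as follows. This is a hedged illustration, not the myGrid implementation: `parse_lsid`, `lsid_to_epr`, the dictionary layout and the example URL are hypothetical, and the choice of one service address per LSID namespace (standing in for the entity type) is an assumption. Only the general LSID shape `urn:lsid:authority:namespace:object[:revision]` is taken as given.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> Lsid:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if any(not part for part in parts[2:]):
        raise ValueError(f"empty LSID component in {text!r}")
    return Lsid(*parts[2:])


def lsid_to_epr(text: str, service_base: str) -> dict:
    """Build a simplified endpoint reference for the entity named by an LSID.

    Assumption: the LSID namespace plays the role of the entity type, so it
    selects the service address; the full LSID is kept as resource identifier.
    """
    lsid = parse_lsid(text)
    return {
        "address": f"{service_base}/{lsid.namespace}",
        "resource_id": text,
    }
```

Keeping the full LSID as the resource identifier means existing identifiers stay valid after the migration, at the cost of tying the address layout to the namespace.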
Our new data architecture
[Figure: resource creation and discovery. Clients contact the RFAD service on a public resource factory and discovery server; the WS-Resources themselves live on data resource hosting servers, where clients then access them.]
Our new enactment architecture
[Figure: enactment creation. Clients contact the enactment creation and discovery server, which holds the EnactmentGroup resource; each enactment server runs an EnactmentFactory and hosts the Enactment resources it creates.]
myGrid Notification Service
• Every WS-Resource can be a notification producer and manage its subscriptions
• Notification Brokers are optional – not necessary for simple deployments
• Notification Brokers can:
  – Aggregate topics from different notification producers to support their discovery
  – Distribute the task of message delivery to increase its speed and decrease the network congestion
[Figure: topic aggregation and distributed message delivery. Notification Producers 1-5 each publish their own topic set (Topic Sets 1-5); Notification Brokers aggregate Topic Sets 1-5 and deliver messages to the Notification Consumers.]
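The producer, consumer and broker roles above can be sketched in a few lines of Python. This is an in-process illustration under stated assumptions, not WS-Notification messaging or the Apache Pubscribe API: the class and method names (`NotificationProducer`, `NotificationBroker`, `attach`, `subscribe`, `notify`) are hypothetical, and consumers are modelled as plain callables.

```python
from collections import defaultdict


class NotificationProducer:
    """A resource that publishes on its own topic set and manages subscriptions."""

    def __init__(self, name, topics):
        self.name = name
        self.topics = set(topics)
        self._subscribers = defaultdict(list)   # topic -> consumer callables

    def subscribe(self, topic, consumer):
        if topic not in self.topics:
            raise KeyError(f"{self.name} does not publish {topic!r}")
        self._subscribers[topic].append(consumer)

    def notify(self, topic, message):
        for consumer in list(self._subscribers[topic]):
            consumer(topic, message)


class NotificationBroker:
    """Acts as a consumer towards producers and as a producer towards consumers."""

    def __init__(self):
        self.topics = set()                      # aggregated topic sets
        self._subscribers = defaultdict(list)

    def attach(self, producer):
        """Aggregate the producer's topics and relay its notifications."""
        for topic in producer.topics:
            self.topics.add(topic)
            producer.subscribe(topic, self._relay)

    def subscribe(self, topic, consumer):
        if topic not in self.topics:
            raise KeyError(f"no attached producer publishes {topic!r}")
        self._subscribers[topic].append(consumer)

    def _relay(self, topic, message):
        for consumer in list(self._subscribers[topic]):
            consumer(topic, message)
```

A simple deployment would subscribe consumers directly to a producer; adding a broker gives consumers one place to discover all topics, and the producer then delivers each message once to the broker instead of once per consumer.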
Why Apache WSRF/Pubscribe?
• Increased compatibility with the implemented myGrid components (Java API)
• Dynamic creation of WS-Resources
• Call-backs for modification of WS-Resources
• High portability (compatible with any Java servlet container)
• Free and Open Source
Advantages of the integration
• More flexible, distributed and scalable architecture
• More scalable, distributed and lightweight notification infrastructure
• One coherent interface to all components
• Decreased design efforts in the future
• Compatibility with any servlet container
• Easier integration with third party software and UK’s National Grid Service
Future: self-organizing Grids
• Current limitations of myGrid:
  – Naming scheme depends on the DNS servers
  – State is only available when the hosting machine is online
  – Deployment and maintenance require high administration effort
• Our current work:
  – Using Distributed Hash Tables (DHTs) to provide self-organization
  – Using self-organized, distributed caching of the state to increase its availability
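The DHT idea above can be illustrated with a small consistent-hashing sketch in Python. The slides do not say which DHT or caching scheme is used, so everything here is an assumption made for illustration: the `HashRing` class, the use of SHA-1 for placement, and caching state on the successor nodes of the owner. It shows how a resource identifier can be located without a DNS lookup and how a cached copy keeps state reachable when its owner goes offline.

```python
import bisect
import hashlib


def key_hash(key: str) -> int:
    """Place a node name or resource identifier on the hash ring."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Consistent-hashing ring; each key is held by an owner and its successors."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas   # owner plus cached copies (assumed policy)
        self._ring = sorted((key_hash(node), node) for node in nodes)

    def nodes_for(self, resource_id):
        """Return the owner first, then the successors that cache the state."""
        if not self._ring:
            return []
        hashes = [h for h, _ in self._ring]
        start = bisect.bisect_right(hashes, key_hash(resource_id)) % len(self._ring)
        wanted = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(wanted)]

    def remove(self, node):
        """Drop a node that went offline; its keys fall to its successors."""
        self._ring = [(h, n) for h, n in self._ring if n != node]
```

When the owner of a key is removed, the first node that cached the state becomes the new owner, so no administrator has to reassign anything by hand.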
Conclusions
• Our work is generic and applicable to other existing higher level middleware projects
• WSRF/WSN standards are well suited to complex higher level middleware
• However, migration may require a significant coding effort
EPSRC funded UK eScience Program Pilot Project
Some slides taken from Carole Goble
Core
• Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizaukaus, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Jan Humble, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Ian Roberts, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson, Jimi Worthington and Chris Wroe.
Users
• Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK
• Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK
• Steve Kemp, Liverpool, UK
Postgraduates
• Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire, Bartosz Wietrzyk
Industrial
• Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)
• Robin McEntire (GSK)
Collaborators
• Keith Decker
References
• Publications on
  – Home page: www.mrl.nott.ac.uk/~bzw/
  – myGrid site: www.mygrid.org.uk