
Page 1: The Clarens Web Service Framework Frank van Lingen (Caltech) fvlingen@caltech.edu on behalf of the Clarens developers

The Clarens Web Service Framework

Frank van Lingen (Caltech) fvlingen@caltech.edu

on behalf of the Clarens developers


Outline

• Introduction
• Core Services
  – VO Management
  – ACL Management
  – Shell Service
  – Discovery
  – File Access
• Test framework
• Portals
• Performance
• Project Web Service Description (if time permits)


Core development and testing

• Conrad Steenberg (Python)

• Michael Thomas (Java)

• Tahir Azim (Java)

• Frank van Lingen (Python)


Web Services (Framework)

• A Web Service is a component performing a task, most likely over a network. A Web Service can be identified by a URI, and its public interfaces and bindings are described using WSDL. A Web Service call (invocation) is based on a protocol (frequently, but not exclusively, XML-RPC or SOAP).

• A Web Service Framework is an application that provides support for developing and deploying Web Services.
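
To make the notion of a Web Service call concrete, the sketch below uses Python's standard xmlrpc.client module to build and parse an XML-RPC request without any network access. The method name echo.echo is borrowed from the client example on a later slide; the argument is an arbitrary test string.

```python
import xmlrpc.client

# Serialize a call to the method "echo.echo" with one string argument.
payload = xmlrpc.client.dumps(("alive?",), methodname="echo.echo")
print(payload)  # an XML <methodCall> document, as sent in the HTTP POST body

# A server parses the same document back into parameters and a method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

SOAP plays the same role with a different, more verbose XML envelope.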


Motivation for Clarens

• Scientific collaborations are becoming more and more geographically dispersed.
  – Example: the CMS experiment, with 2000 physicists, 150 institutes, 30 countries

• Web Services identified as an important component to create a scalable globally distributed system

• Initially Clarens was driven by the CMS community: Provide a framework to support distributed physics analysis


Web Service in Clarens

• Web Service + :
  – Access Control
  – Authorization
  – Discoverable in Distributed System
  – State management


Some of the Functionality needed to Support (Scalable) Distributed Analysis (within CMS)

• Authentication
• Access control on Web Services.
• Remote file access (and access control on files).
• Discovery of Web Services and Software.
• Shell service. Shell-like access to remote machines (managed by access control lists).
• Proxy certificate functionality
• Virtual Organization management and role management.
• Good performance of the Web Service Framework

This is not an exhaustive list.

The list is not restricted to CMS; it applies to scientific collaborations in general.


dbsvr = Clarens.clarens_client('http://….:80/clarens/')
dbsvr.echo.echo('alive?')
dbsvr.file.size('index.html')
dbsvr.file.ls('/web/system','*.html')
dbsvr.file.find(['//web'],'*','all')

[Diagram labels: Client; http/https; Web server; Clarens; Service; 3rd party application. Annotation: secure cert-based access to services through browser.]

Protocols: XML-RPC, SOAP, Java RMI, JSON RPC
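
The client calls above use dotted method names such as echo.echo. As a rough illustration of how such a call travels over XML-RPC, the sketch below starts a local stand-in server with Python's standard library and calls it through a proxy object. It does not use the Clarens client or server; the echo function and the local address are assumptions made for the example.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def echo(text):
    """Return the argument unchanged, like the echo.echo call on the slide."""
    return text


# Port 0 lets the operating system pick a free port for this local stand-in.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(echo, "echo.echo")  # dotted name, as in dbsvr.echo.echo
threading.Thread(target=server.serve_forever, daemon=True).start()

host, port = server.server_address
dbsvr = xmlrpc.client.ServerProxy("http://%s:%d/" % (host, port))
reply = dbsvr.echo.echo("alive?")  # attribute access builds the name "echo.echo"
print(reply)

server.shutdown()
server.server_close()
```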


Clarens (Python and Java)

[Architecture diagram: two server stacks connected to HTTP clients over the (WAN) network]

JClarens (Java): HTTP Client; Tomcat Web Server; XML-RPC engine and AXIS (SOAP); /xmlrpc servlet; Core Services and Utilities (Service Management, Remote File Access, VO Management, Discovery, Databases, PKI Security); Configuration

(P)Clarens (Python): HTTP Client; Apache Web Server; MOD_PYTHON; XML-RPC, SOAP, GET; Core Services and Utilities (Service Management, Remote File Access, VO Management, Discovery, Databases, PKI Security, Process Management); Configuration


Virtual Organization and Role Management

[Diagram: tree of groups. Groups shown: Admins (top level), Group A, Group B, Group C, and below Group A the groups A.1, A.2, A.3. Each group holds two lists of DNs: Members (DN1, DN2, …) and Admins (DN1, DN2, …).]

• A Clarens server instance manages a tree-like group structure
• Group administrators are authorized to add/delete group members, as well as groups at lower levels
• Groups can define VOs
• Subgroups of VOs can define roles
• A user with multiple roles has a DN in multiple groups

Entry examples:

/O=doesciencegrid.org/OU=People/CN=John Smith 12345

/O=doesciencegrid.org/OU=People
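
The group structure can be sketched in a few lines of Python. This is only an illustration, not Clarens code: the GroupTree class, its method names, and the user Jane Doe are hypothetical, and it assumes that an entry matches a DN when it is a prefix of that DN (which is how the shorter entry example above is read here).

```python
class GroupTree:
    """Toy model of a tree of groups with member and admin DN lists."""

    def __init__(self):
        self.groups = {}  # group name -> {"members": [...], "admins": [...]}

    def add_group(self, name, members=(), admins=()):
        self.groups[name] = {"members": list(members), "admins": list(admins)}

    @staticmethod
    def _matches(dn, entries):
        # Assumption: an entry matches a DN when it is a prefix of that DN.
        return any(dn.startswith(entry) for entry in entries)

    def is_member(self, dn, group):
        return self._matches(dn, self.groups[group]["members"])

    def is_admin(self, dn, group):
        # Admins of a group also administer the groups below it, so walk
        # from the group up through its dotted ancestors (A.1 -> A).
        parts = group.split(".")
        while parts:
            name = ".".join(parts)
            if name in self.groups and self._matches(dn, self.groups[name]["admins"]):
                return True
            parts.pop()
        return False


john = "/O=doesciencegrid.org/OU=People/CN=John Smith 12345"
jane = "/O=doesciencegrid.org/OU=People/CN=Jane Doe 67890"  # hypothetical user

tree = GroupTree()
tree.add_group("A", members=["/O=doesciencegrid.org/OU=People"], admins=[john])
tree.add_group("A.1", members=[john])

print(tree.is_member(jane, "A"))   # matched through the OU=People prefix entry
print(tree.is_admin(john, "A.1"))  # inherited from group A
```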


Access Control Management

• Enables administrators to allow or deny groups (VOs) the use of resources.

• Access to the server system is controlled by a set of hierarchical ACLs modelled after the access control (.htaccess) files used by Apache
  – method.service
  – method.submodule.method

• In Python Clarens, ACLs are set in .clarens_access files
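
One way to read "hierarchical" is that the most specific ACL covering a method name applies, with less specific levels as fallback. The sketch below illustrates that reading only; the find_acl function, the dictionary layout, and the use of "" as a server-wide default key are assumptions, not the Clarens implementation.

```python
def find_acl(method_name, acls):
    """Return (key, entry) for the most specific ACL covering method_name.

    Assumption: ACL keys are dotted prefixes of the method name, and the
    empty key "" holds the server-wide default.
    """
    parts = method_name.split(".")
    while parts:
        key = ".".join(parts)
        if key in acls:
            return key, acls[key]
        parts.pop()  # drop the last component and try the parent level
    return "", acls.get("")


acls = {
    "": "server default",
    "file": "rules for the file module",
    "file.read": "rules for file.read only",
}

print(find_acl("file.read", acls))
print(find_acl("file.ls", acls))
print(find_acl("echo.echo", acls))
```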


Access Control Management (example)

Object: mod
Fields: order, allow DNs, deny DNs, deny groups
Values (in slide order):
  deny, allow
  /O=doesg.org/OU=People/CN=John J
  /O=doesg.org/OU=People/CN=Kate K
  Physics.LHC.CMS
  Physics.CDF
  /O=oldumi/OU=physics/CN=old account
  Crackers

Object: mod.meth
Fields: order, allow DNs, allow groups, deny DNs, deny groups
Values (in slide order):
  deny, allow
  Physics.USA.Caltech
  Physics.USA.UFL
  /O=Caltech/OU=CACR/CN=Ed Peng


ORDER_ALLOW_DENY = 0
ORDER_DENY_ALLOW = 1

access = [
    ("", [ORDER_DENY_ALLOW,                # Order
          ["/"],                           # Allow everybody who can log in
          [],                              # Allow group
          ["/guest"],                      # Deny indiv default=all
          [],                              # Deny default=all
          [None, None, None]]),
    ("delete_admin", [ORDER_ALLOW_DENY,    # Order
          [],
          ["admin"],                       # Allow group
          [],                              # Deny indiv default=all
          [],                              # Deny default=all
          [None, None, None]]),
    ("list_admin", [ORDER_ALLOW_DENY,      # Order
          [],
          ["admin"],                       # Allow group
          [],                              # Deny indiv default=all
          [],                              # Deny default=all
          [None, None, None]]),
    ("auth", [ORDER_ALLOW_DENY,            # Order
          ["/"],
          [],                              # Allow group
          [],                              # Deny indiv default=all
          [],                              # Deny default=all
          [None, None, None]]),            # modtime, start_time, end_time
]
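
The slides say the ACLs are modelled after Apache, so the sketch below evaluates one entry using the semantics of Apache's Order directive. Whether Clarens evaluates its entries exactly this way is an assumption. The entry layout (order, allow DNs, allow groups, deny DNs, deny groups, times) follows the comments in the listing above; the function names and the example DNs are hypothetical, and empty lists here simply match nothing, which may differ from the "default=all" remarks in the original comments.

```python
ORDER_ALLOW_DENY = 0
ORDER_DENY_ALLOW = 1


def matches(dn, groups, dn_entries, group_entries):
    """True if the DN starts with a listed DN entry or is in a listed group."""
    return (any(dn.startswith(entry) for entry in dn_entries)
            or any(group in group_entries for group in groups))


def is_allowed(entry, dn, groups):
    order, allow_dns, allow_groups, deny_dns, deny_groups, _times = entry
    allowed = matches(dn, groups, allow_dns, allow_groups)
    denied = matches(dn, groups, deny_dns, deny_groups)
    if order == ORDER_ALLOW_DENY:
        # Apache "Allow,Deny": default is deny, and a deny match wins.
        return allowed and not denied
    # Apache "Deny,Allow": default is allow, and an allow match wins.
    return allowed or not denied


mallory = "/O=example.org/CN=Mallory"  # hypothetical DNs
alice = "/O=example.org/CN=Alice"

admin_only = [ORDER_ALLOW_DENY, [], ["admin"], [mallory], [], [None, None, None]]
open_except = [ORDER_DENY_ALLOW, [], [], [mallory], [], [None, None, None]]

print(is_allowed(admin_only, alice, ["admin"]))
print(is_allowed(open_except, mallory, []))
```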


Remote File Access

• Give scientists access to remote data using well-known (file) interfaces

• Allow or deny read or write access to these remote files for groups of collaborators.

Several File Service Methods:

file.read(<file name>)
file.ls(<dir>)
file.md5(<file name>)
file.stat(<file name>)
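
To show what such file-like interfaces might do on the server side, here is a local stand-in for the four methods. It is a sketch only: the LocalFileService class, the return types, the optional pattern argument of ls, and the confinement of all paths to one root directory are assumptions, not the Clarens file service.

```python
import fnmatch
import hashlib
import os
import tempfile


class LocalFileService:
    """Serve read, ls, md5 and stat for files below a single root directory."""

    def __init__(self, root):
        self.root = os.path.realpath(root)

    def _resolve(self, name):
        # Treat names as relative to the root and refuse to leave it.
        path = os.path.realpath(os.path.join(self.root, name.lstrip("/")))
        if os.path.commonpath([self.root, path]) != self.root:
            raise PermissionError("path escapes the served directory: %r" % name)
        return path

    def read(self, name):
        with open(self._resolve(name), "rb") as f:
            return f.read()

    def ls(self, directory, pattern="*"):
        names = os.listdir(self._resolve(directory))
        return sorted(n for n in names if fnmatch.fnmatch(n, pattern))

    def md5(self, name):
        return hashlib.md5(self.read(name)).hexdigest()

    def stat(self, name):
        st = os.stat(self._resolve(name))
        return {"size": st.st_size, "mtime": st.st_mtime}


# Usage against a throwaway directory.
root = tempfile.mkdtemp()
with open(os.path.join(root, "index.html"), "w") as f:
    f.write("<html></html>")

svc = LocalFileService(root)
listing = svc.ls("/", "*.html")
print(listing, svc.stat("index.html")["size"], svc.md5("index.html"))
```

In a real deployment the access control lists would decide which groups may call each method; that check is left out here.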


Discovery

• Servers/Services/Software:
  – Crash
  – Disappear
  – Move
  – Are upgraded
  – Locally Controlled

Dynamic Distributed Environment!

Discovery Service Interface:

register(<service/software description>)
find_server(<query>)
find(<service/software query>)
deregister(<service/software description>)
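
The four interface methods can be mimicked with an in-memory registry, as sketched below. This is an illustration under stated assumptions, not the Clarens discovery service: descriptions and queries are taken to be plain dictionaries, a query matches when all of its fields are equal, and the field names and server URLs are invented for the example.

```python
class DiscoveryRegistry:
    """In-memory stand-in for register / find / find_server / deregister."""

    def __init__(self):
        self._entries = []

    def register(self, description):
        if description not in self._entries:
            self._entries.append(dict(description))

    def deregister(self, description):
        self._entries = [e for e in self._entries if e != description]

    def find(self, query):
        """Return descriptions whose fields equal every field in the query."""
        return [e for e in self._entries
                if all(e.get(key) == value for key, value in query.items())]

    def find_server(self, query):
        """Return the distinct servers that offer a matching entry."""
        return sorted({e["server"] for e in self.find(query)})


registry = DiscoveryRegistry()
file_a = {"server": "http://host-a.example/clarens/", "service": "file"}
file_b = {"server": "http://host-b.example/clarens/", "service": "file"}
shell_a = {"server": "http://host-a.example/clarens/", "service": "shell"}
for description in (file_a, file_b, shell_a):
    registry.register(description)

servers = registry.find_server({"service": "file"})
registry.deregister(file_b)  # for example, host-b is being moved or upgraded
remaining = registry.find({"service": "file"})
print(servers, remaining)
```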


Discovery

[Diagram: discovery through the MonALISA JINI Network. Legend: CS = Clarens Servers; DS = Clarens Discovery Servers (JINI Clients); SS = Station Servers; CL = Clients. Annotation: "Can be a Clarens Server".]


Shell Access

• Controlled Access to sites using a shell environment

• User DNs are mapped to a local user name

• Execution of commands/applications in a sandbox

• File service can be used to navigate sandbox hierarchy
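
A minimal sketch of the two steps above, mapping a DN to a local user and running a command inside that user's directory, could look as follows. The mapping table, the local user name, and the function names are hypothetical. Setting the working directory is not real isolation; an actual sandbox would also need to switch to the mapped account and restrict what the command can reach.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical mapping from certificate DN to local user name.
USER_MAP = {
    "/O=doesciencegrid.org/OU=People/CN=John Smith 12345": "jsmith",
}


def sandbox_for(dn, base_dir):
    """Return (local user, sandbox directory); unmapped DNs raise KeyError."""
    user = USER_MAP[dn]
    path = os.path.join(base_dir, user)
    os.makedirs(path, exist_ok=True)
    return user, path


def run_in_sandbox(dn, argv, base_dir):
    """Run a command with the user's sandbox as its working directory."""
    _user, path = sandbox_for(dn, base_dir)
    done = subprocess.run(argv, cwd=path, capture_output=True, text=True, timeout=30)
    return done.returncode, done.stdout


base = tempfile.mkdtemp()
dn = "/O=doesciencegrid.org/OU=People/CN=John Smith 12345"
command = [sys.executable, "-c", "import os; print(os.path.basename(os.getcwd()))"]
returncode, output = run_in_sandbox(dn, command, base)
print(returncode, output.strip())
```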


Testing

• Clarens servers and services are deployed, but are they working as intended?

• Who should be notified when something fails?
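
Both questions can be captured by a very small check runner, sketched below. It is not the Clarens test framework: the run_checks function, the (name, contact, function) triples, and the contact addresses are assumptions, and the two checks are stand-ins for real calls to a deployed service.

```python
def run_checks(checks):
    """Run each check and return a list of (name, contact, error) failures.

    Each check is a (name, contact, function) triple; a check passes when
    its function returns without raising an exception.
    """
    failures = []
    for name, contact, func in checks:
        try:
            func()
        except Exception as exc:
            failures.append((name, contact, str(exc)))
    return failures


def echo_check():
    # Stand-in for a call such as dbsvr.echo.echo('alive?') that succeeds.
    return "alive?"


def file_check():
    # Stand-in for a service call that fails.
    raise RuntimeError("connection refused")


failures = run_checks([
    ("echo", "admin-a@example.org", echo_check),
    ("file", "admin-b@example.org", file_check),
])
for name, contact, error in failures:
    print("notify %s: %s failed (%s)" % (contact, name, error))
```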


Portal Functionality

• User's point of access to a Grid system.
• Provides an environment where the user can:
  – Access Grid resources and services.
  – Execute and monitor Grid applications.
  – Collaborate with other users.
  – One stop shop for Grid needs

Portals can lower the barrier for users to access Web Services and use Grid-enabled applications.


Clarens Portals

• Clarens does not have a framework to build portals

• Portals are dynamic JavaScript based web pages (with authorization and access control)

[Diagram labels: JavaScript/HTML GUI; JSON; http/https; Web server; Clarens; Service; 3rd party application]


Performance

• Server: dual 2.8 GHz Xeon with 1 GB of memory, accessed over a 100 Mb/s local area network.
• A configurable number of unencrypted client connections were opened and set to access the system.list_methods method as rapidly as possible.
• Client: a 2.6 GHz Pentium 4 workstation running a single process that opens connections to the server and completes requests asynchronously.


Performance


Projects Using Clarens

• Ultralight. The network as resource.
• PEAC. PROOF in a distributed service environment.
• Physh. Client using web services to access and merge catalog information.
• SPHINX. Grid scheduler.
• Lambda Station. Programmatically based access to routers and switches.
• IGUANA. Graphical display that can access data through Clarens Web Services.
• MCPS. Providing services to submit batch analysis jobs.
• HotGrid. Gradual access to Grid resources.
• OSG (Open Science Grid). Uses the Clarens discovery service


Third Party Service Example

MCPS


Design Goals:

• Provide a simple front end that allows users to execute (potentially complex) workflows on the Grid by accessing a Web Service on a tier2:
  – Specify (user exposed parameters)
  – Upload user code and proxy
  – Verify
  – Execute
• Enable users to use the output of one workflow as input to another (creating their own workflow)
• Workflows have been specified within Python based on RunJob (tool developed at FNAL)
• Users should be able to close their laptop and resume later
• Provide a simple client (first Python, in the future a browser interface) to minimize user exposure to Grids and Web Services


[Workflow diagram. Stages: Generate, Simulate, Analyze. Filters: Filter 1 (merge), Filter 2 (differ), Filter 3 (skim). Data: Dataset1 to Dataset6.]

Many different generators and simulations; plug in user code.

Specified in RunJob and expose several (but not all) parameters to users.


What does a workflow specification look like?

class Workflow_Example1:

    def __init__(self):
        pass

    def execute(self, parameter_values, user_sandbox):
        # //
        # // The factory file needs to be pointed at by the SHAHKAR_FACTORY_EXT variable
        # //
        with open(user_sandbox + '/MCPS_test.txt', 'w') as f:
            f.write('test succeeded')

    # For now we do not have any verification. If we had, it would need
    # to write its output to a file that can be read by the server. If everything
    # is ok, this file will be empty.
    def verify(self, user_parameters, user_sandbox):
        pass

Specified in a class:
• Two methods: verify and execute
• Execute contains the "meat": the MCPS code, which ingests the user parameter choices.
• Verify can potentially be used to do some MCPS-specific verification (optional)
• Both methods have a user_sandbox parameter, which might be needed by MCPS when executing a job
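
How a server might drive such a class can be sketched as follows: call verify, look at the verification file, and call execute only if that file is empty. The run_workflow function, the file name verify.txt, and the NumEvents check are assumptions made for this illustration; the slides only state that verification output goes to a file that is empty when everything is ok.

```python
import os
import tempfile


class WorkflowExample:
    """Stand-in workflow with the two methods described above."""

    def verify(self, user_parameters, user_sandbox):
        # Hypothetical convention: problems are written to verify.txt;
        # an empty file means the parameters were accepted.
        problems = []
        if "NumEvents" not in user_parameters:
            problems.append("NumEvents is required")
        with open(os.path.join(user_sandbox, "verify.txt"), "w") as f:
            f.write("\n".join(problems))

    def execute(self, parameter_values, user_sandbox):
        with open(os.path.join(user_sandbox, "MCPS_test.txt"), "w") as f:
            f.write("test succeeded")


def run_workflow(workflow, parameters, user_sandbox):
    """Call verify, then execute only if the verification file is empty."""
    workflow.verify(parameters, user_sandbox)
    with open(os.path.join(user_sandbox, "verify.txt")) as f:
        problems = f.read()
    if problems:
        return False, problems
    workflow.execute(parameters, user_sandbox)
    return True, ""


sandbox = tempfile.mkdtemp()
ok, message = run_workflow(WorkflowExample(), {"NumEvents": 10}, sandbox)
print(ok, message)
```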


What does a workflow specification look like? Some real MCPS code in the execute method.

import os
import sys

def execute(self, parameter_values, user_sandbox):
    factoryPath = "/home/users/evansde/MCPSInstall/MCPS-dist/Xml/MCPSRunjobFactory.xml"
    os.environ['SHAHKAR_FACTORY_EXT'] = factoryPath

    installPath = "/home/users/evansde/MCPSInstall/MCPS-dist/Python"
    sys.path.append(installPath)

    from MCPSPython.MCPSUserInterface import createUserInterface

    mcps = createUserInterface(
        JobCacheArea = "./jobs",
        JobName = "Example4-%s" % os.environ['USER'],
        UniqueID = 1,
        RuntimePythonBinary = "/usr/bin/python2",
        ShREEKExecutorOptions = "-v --exitonerror"
        )

    cfgfilesDir = "/home/users/evansde/cfgfiles/"

    cmkin = mcps.attachExecutable("CMKIN")
    cmkin['ProjectVersion'] = parameter_values['ProjectVersion']  # user parameter
    cmkin['Executable'] = "kine_make_ntpl_pyt6227.exe"
    cmkin['BeamEnergy'] = 14000.
    cmkin['NumEvents'] = parameter_values['NumEvents']  # user parameter
    cmkin['RunNumber'] = parameter_values['RunNumber']  # user parameter
    cmkin['InputCardfile'] = cfgfilesDir + "cmkin-cardfile.txt"
    cmkin['ExistingInstallation'] = parameter_values['ExistingInstallation']  # user parameter

User parameters: it is up to the workflow designer to decide which parameters to expose.


Future Work

• Web Start based GUI

• IM based functionality

• Integration with dCache (mass storage) and SRM (Storage Resource Manager)

• Hierarchical access controlled metadata


Conclusions

• Provides a Java/Python based high performance Web Service Framework.

• Contains a set of core services (file, service management, access control, discovery) needed to support scientific analysis.

• Adopted by several projects for Web Service development.

• More information @: http://clarens.sourceforge.net