RESTful Web Services for Scientific Computing
Joshua Boverhof, LBNL
Shreyas Cholia, NERSC/LBNL
OSCON 2011
July 28 2011, Portland OR
NERSC
•National Energy Research Scientific Computing Center
•DOE Office of Science HPC User Facility at Lawrence Berkeley Lab
•Provides high performance compute, data, network and information services to scientists across the world
NERSC HPC Clusters
Web Gateways
•Old way - SSH + command line + batch system
•People now expect web interfaces for everything
•Usability - scientific computing should be as easy as online banking
•Users don’t want generic options/tools not applicable to their science
•Users don’t want to deal with backend, middleware, UNIX CLI etc.
NERSC Scientific Gateways
•DeepSky - Astronomical Image Database, 11 million images (70TB)
•The Gauge Connection - QCD Lattice Gauge Data
•CXIDB - X-Ray Image Data Bank
•20th Century Reanalysis - Reanalysis of 20th Century Climate Data
•Dayabay - Dayabay Neutrino Detector Gateway
•ESG - Earth System Grid Climate Gateway and Data-node
Motives for developing NERSC Web Toolkit (NEWT)
•Make it very easy for science teams to build web gateways to their data and computation
•We have already built several science-specific gateways and want to encapsulate their common patterns
•Provide Web APIs for access to backend resources for portal and web front-end developers.
NEWT Web Stack
•Web Service
•Built with Django Web Framework
•Exposes NERSC Resources as HTTP URLs
•Generally uses REST conventions
•Access HPC Resources over the web using HTTP + JSON
•Frontend Development
•JavaScript library “newt.js”
•AJAX
Things you can do ...
•Authenticate using NERSC credentials
•Check machine status
•Upload and download files
•Submit a compute job
•Monitor a job
•Get user account information
•Store app data
•Issue UNIX commands
Architecture
[Diagram: a client web application (HTML 5/AJAX) exchanges HTTP requests and JSON data with NEWT (Django). NEWT connects to system resources via Globus (files, batch jobs, shell commands, status), a persistent store (NoSQL DB: CouchDB), accounting information (NIM), and authentication (MyProxy CA). An internal DB holds session, credential, and user information.]
RESTful Conventions
•Resources represented as a set of URLs
• HTTP verbs
• GET: Idempotent operation, retrieve resource representation
• PUT: Idempotent operation, set resource representation
• DELETE: Idempotent operation, delete resource
• POST: Not idempotent. Avoid overloading it as an RPC mechanism; typically used on a factory resource to create new resources.
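The verb semantics above can be illustrated with a small in-memory sketch. The `ResourceStore` class and its method names are hypothetical and are not part of NEWT; the point is only that repeating GET, PUT, or DELETE leaves the same state, while each POST to a factory creates a new resource.

```python
import itertools


class ResourceStore:
    """Hypothetical in-memory store used only to illustrate verb semantics."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def get(self, key):
        # GET: read-only, so repeating it never changes state.
        return self._items.get(key)

    def put(self, key, value):
        # PUT: sets the full representation; repeating it gives the same state.
        self._items[key] = value

    def delete(self, key):
        # DELETE: removing an already-removed resource is a no-op.
        self._items.pop(key, None)

    def post(self, value):
        # POST to a factory: every call creates a new resource with a new id.
        key = next(self._ids)
        self._items[key] = value
        return key

    def count(self):
        return len(self._items)


store = ResourceStore()
store.put("a", {"state": "up"})
store.put("a", {"state": "up"})   # repeated PUT: still one resource
first = store.post({"executable": "/bin/date"})
second = store.post({"executable": "/bin/date"})  # repeated POST: two resources
store.delete("a")
store.delete("a")                 # repeated DELETE: no error, same state
```

This is why a retried PUT or DELETE is safe after a dropped connection, whereas a retried POST may submit a second job.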
NEWT Resources
Resource                     Description
/login/                      Login information
/file/[machine]/[path]/      List, upload, download file
/status/[machine]            Machine status, uptime, queue stats
/job/jobs/[id]               The user’s jobs across all resources
/job/[machine]/fork/         Fork factory resource
/job/[machine]/batch/        Batch factory resource
/queue/[machine]/            Batch queue factory resource
/account/[NIM resource]      Account information, CPU hours, accounts
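As a rough illustration of how a client might assemble these resource URLs, here is a minimal sketch. The helper `newt_url` is hypothetical; the base URL is the one used in the curl examples in this talk, and the file path in the usage line is made up.

```python
from urllib.parse import quote

# Base URL as it appears in the curl examples of this talk.
BASE_URL = "https://portal-auth.nersc.gov/newt"


def newt_url(*segments, trailing_slash=True):
    """Join path segments under BASE_URL, percent-encoding each one."""
    path = "/".join(quote(str(s).strip("/"), safe="/") for s in segments)
    url = f"{BASE_URL}/{path}"
    return url + "/" if trailing_slash else url


status_url = newt_url("status", "franklin", trailing_slash=False)
fork_url = newt_url("job", "franklin", "fork")
file_url = newt_url("file", "hopper", "/scratch/my data")  # illustrative path only
```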
Login Resource: Authenticate
$.newt_ajax({
    url: "/auth/",
    type: "POST",
    data: {'username': username, 'password': password},
    success: function(res, textStatus, jqXHR) {}
});
Login Resource
$.newt_ajax({
    url: "/login/",
    type: "GET",
    success: function(data){}
});
•200 OK
•{"username": "joe", "session_lifetime": 14384, "auth": true}
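A client other than the browser could interpret the same response with a few lines of Python. This is a minimal sketch: the function name `session_info` is hypothetical, and the unit of `session_lifetime` is not stated in the talk, so the value is returned as-is.

```python
import json


def session_info(body):
    """Parse a /login/ response body and return (authenticated, lifetime)."""
    data = json.loads(body)
    return bool(data.get("auth")), int(data.get("session_lifetime", 0))


# Response body copied from the example above.
authenticated, lifetime = session_info(
    '{"username": "joe", "session_lifetime": 14384, "auth": true}'
)
```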
Queue Resource: PBS job submission
$.newt_ajax({
    url: "/queue/hopper/",
    type: "POST",
    data: {"jobfile": filename},
    success: function(data){
        $("#output").append(data.jobid);
    }
});
This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like
{"status": "OK", "error": "", "jobid" : "hop1234.id" }
Queue Resource: PBS job submission
$.newt_ajax({
    url: "/queue/franklin/",
    type: "POST",
    data: {"jobscript": "#PBS -l mppwidth=8\n mpirun -n 8 /bin/hostname"},
    success: function(data){
        $("#output").append(data.jobid);
    }
});
This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like
{"status": "OK", "error": "", "jobid": "7259874"}
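The same submission could be prepared from Python. The sketch below only builds the request and does not send it; it assumes the endpoint accepts form-encoded data (as jQuery sends by default) and omits the session cookie that a real call would need. The helper name `build_submit_request` is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the curl examples of this talk.
BASE_URL = "https://portal-auth.nersc.gov/newt"


def build_submit_request(machine, jobscript):
    """Build (but do not send) a form-encoded POST to /queue/<machine>/."""
    body = urlencode({"jobscript": jobscript}).encode("ascii")
    return Request(f"{BASE_URL}/queue/{machine}/", data=body, method="POST")


req = build_submit_request(
    "franklin", "#PBS -l mppwidth=8\n mpirun -n 8 /bin/hostname"
)
```

Passing `req` to `urllib.request.urlopen` with a cookie-aware opener would perform the actual call.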
Command Resource: Fork job submission
$.newt_ajax({
    url: "/command/franklin",
    type: "POST",
    data: {"executable": "/bin/date"},
    success: function(data){
        $("#output").append(data.output);
    }
});
This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like
{"output": "Wed Jul 20 22:51:58 PDT 2011", "error": ""}
Simple Usage: curl
$ curl -k -c cookies.txt -X POST -d "username=boverhof&password=$PASS" https://portal-auth.nersc.gov/newt/auth;
{"username": "boverhof", "session_lifetime": 14397, "auth": true}
$ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/status/franklin;
{"status": "up", "system": "franklin"}
$ curl -k -b cookies.txt -d "executable=/bin/date" https://portal-auth.nersc.gov/newt/job/franklin/fork/;
{"status": null, "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": null, "output": null, "id": 47789}
$ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/job/jobs/47789;
{"status": "DONE", "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": "2011-07-20T06:15:36", "output": "Wed Jul 20 23:15:36 PDT 2011\n", "id": 47789}
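The submit-then-check pattern in the last two curl calls can be wrapped in a polling loop. This is a minimal sketch under stated assumptions: `wait_for_job` is a hypothetical helper, the HTTP call is injected as `fetch` so the loop can be tried without a network, and "DONE" is the only finished status shown in this talk, so other terminal states are not handled.

```python
import json
import time


def wait_for_job(fetch, job_id, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll until the job reports status "DONE"; return the record or None.

    `fetch(job_id)` is a caller-supplied function returning the JSON text
    of GET /job/jobs/<id>.
    """
    for _ in range(max_polls):
        record = json.loads(fetch(job_id))
        if record.get("status") == "DONE":
            return record
        sleep(interval)
    return None


# Stand-in for the HTTP call: the first reply is pending, the second finished.
replies = iter([
    '{"status": null, "id": 47789, "output": null}',
    '{"status": "DONE", "id": 47789, "output": "Wed Jul 20 23:15:36 PDT 2011\\n"}',
])
result = wait_for_job(lambda job_id: next(replies), 47789, sleep=lambda s: None)
```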
Django settings: Pluggable Authentication
•Authenticate using NERSC credentials to a myproxy-server
•Add AuthenticationMiddleware
django.contrib.auth.middleware.AuthenticationMiddleware
• Configure authentication backend
AUTHENTICATION_BACKENDS = ('newt.authnz.myproxy_backend.MyProxyBackend',)
• Implement authentication backend
class MyProxyBackend:
    def authenticate(self, username=None, password=None):
        # MyProxy logon; return a user object on success, None on failure
        ...
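The backend contract can be outlined without Django installed. This is a framework-free sketch, not the NEWT implementation: `MyProxyBackendSketch`, the injected `logon` callable, and the dictionary user records are all hypothetical stand-ins for the real MyProxy logon and Django `User` objects. Newer Django versions also pass a `request` as the first argument to `authenticate`.

```python
class MyProxyBackendSketch:
    """Outline of the two methods Django expects from an auth backend."""

    def __init__(self, logon, users=None):
        # Hypothetical callable: (username, password) -> bool
        self._logon = logon
        self._users = users if users is not None else {}

    def authenticate(self, username=None, password=None):
        # Delegate the credential check; return None on failure.
        if not username or not self._logon(username, password):
            return None
        return self._users.setdefault(username, {"username": username})

    def get_user(self, user_id):
        # Called on later requests to reload the user from the session.
        return self._users.get(user_id)


# Stand-in credential check; a real backend would contact the MyProxy server.
backend = MyProxyBackendSketch(logon=lambda u, p: p == "secret")
user = backend.authenticate(username="joe", password="secret")
rejected = backend.authenticate(username="joe", password="wrong")
```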
Django settings: File Upload
•File Upload: Upload to portal, store in temporary file, then transfer to remote file system.
•Configure file upload handler ( settings.py )
FILE_UPLOAD_HANDLERS = ('newt.file.uploadhandler.RemoteCopyTemporaryFileUploadHandler',)
• Implement file upload handler
from django.core.files.uploadhandler import TemporaryFileUploadHandler as _TemporaryFileUploadHandler
class RemoteCopyTemporaryFileUploadHandler(_TemporaryFileUploadHandler):
    def upload_complete(self):
        # Transfer to remote filesystem
        ...
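The stage-then-transfer flow described above can be sketched without Django. Everything here is an assumption for illustration: `stage_and_transfer` and `copy_to_remote` are hypothetical names, and a local directory stands in for the remote file system that the real handler would copy to.

```python
import os
import shutil
import tempfile


def stage_and_transfer(chunks, filename, transfer):
    """Write chunks to a temporary file, hand it to `transfer`, then clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for chunk in chunks:
            tmp.write(chunk)
        tmp_path = tmp.name
    try:
        return transfer(tmp_path, filename)
    finally:
        # The staged copy on the portal is removed whether or not transfer succeeds.
        os.remove(tmp_path)


# Stand-in "remote" transfer: copy into a local directory.
remote_dir = tempfile.mkdtemp()


def copy_to_remote(local_path, filename):
    dest = os.path.join(remote_dir, filename)
    shutil.copyfile(local_path, dest)
    return dest


dest = stage_and_transfer([b"hello ", b"world"], "input.dat", copy_to_remote)
```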
Implementation Details ( Hacks )
•Django v1.[1,2,3?] lacks full support for HTTP verbs
•PUT: request data is not loaded; we used the “coerce_put_post” code from django-piston
•Looking at using Tastypie, a web service API framework for Django that provides a customizable abstraction for creating REST-style interfaces.
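To make the PUT workaround concrete, here is a rough sketch of the general idea: parse a form-encoded PUT body the same way a POST body would be parsed. This is not the django-piston code; `parse_put_body` is a hypothetical helper and the field names in the usage line are made up.

```python
from urllib.parse import parse_qs


def parse_put_body(method, content_type, body):
    """Return form fields for a form-encoded PUT, as POST parsing would."""
    if method.upper() != "PUT":
        raise ValueError("expected a PUT request")
    if not content_type.startswith("application/x-www-form-urlencoded"):
        return {}
    fields = parse_qs(body.decode("utf-8"), keep_blank_values=True)
    # Collapse single-valued fields to plain strings for convenience.
    return {k: v[0] if len(v) == 1 else v for k, v in fields.items()}


fields = parse_put_body(
    "PUT",
    "application/x-www-form-urlencoded",
    b"jobfile=%2Ftmp%2Frun.pbs&queue=debug",
)
```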
NOVA: VASP portal
https://newt.nersc.gov
https://portal-auth.nersc.gov/nova/