OPeNDAP Present and Future An Overview Encompassing Current Projects & Potential New Directions Dave Fulker and James Gallagher

OPeNDAP Present and Future

An Overview Encompassing Current Projects & Potential New Directions

Dave Fulker and James Gallagher

OPeNDAP, Inc.

Rough Outline

• Background
• OPULS (an OPeNDAP-Unidata collaboration)
  – DAP4 (to supersede DAP2)
  – Experimental extensions (Async access, UGRID subsets)
• Hyrax over Amazon/S3
• Elaboration on server functions
  – Perhaps binning, masking, a functional language?
  – Relationship to WPS & other Web services
• Hyrax (& WCS) in OWS-9

Origins

• Scientists (ocean fluxes & temps) envisaged use of http for remote data access (1993)

• Collaboration with the designer of the JGOFS data system…

• Led to Distributed Ocean Data System (DODS)
• DODS later was renamed OPeNDAP (to be explained momentarily…)

OPeNDAP Now Is:

• An acronym
  – “Open-source Project for a Network Data Access Protocol”
  – Often a synonym for “DAP”
• A not-for-profit corp. developing/supporting
  – “DAPx” - a web-services protocol for data access
    • Deployed by hundreds of data providers internationally
    • Employed in many analysis packages (MATLAB, e.g.)
    • Designated a “Community Standard” by NASA
  – Server & client implementations* of DAP
    *Note: there are other implementations

Available Software

• Free end-user applications that include DAP support: Panoply, IDV, NCO, …
• Commercial: IDL, MATLAB, ArcGIS
• SDKs: the netCDF C and Java libraries; OC; libdap; Java OPeNDAP; PyDAP
  – Each of these provides its own API, and together they span C, C++, Java and Python
• Data servers: PyDAP, Hyrax, TDS, …

Concept:

Clients Get Just the Data They Need, as They Need Them

• Accessing data via URLs (i.e., URL = dataset)
  – Appending query strings to subset or run server functions
• Getting responses of two (general) types:
  – Metadata - dataset descriptions & catalogs (textual)
  – Content - values and metadata (binary or textual)
• Using responses in diverse ways, e.g.
  – MATLAB maps responses to its internal math types
  – netCDF library allows apps to work as though reading a local file
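
A minimal sketch of the URL-based access described on this slide, assuming DAP2-style conventions: a response suffix (`.dds` for structure metadata, `.dods` for binary content) appended to the dataset URL, and a constraint expression with `[start:stride:stop]` index ranges after the `?`. The host `example.org`, the dataset `sst.nc`, the variable `sst`, and both helper functions are hypothetical; the code only builds URL strings and sends no request.

```python
from urllib.parse import quote


def hyperslab(start, stop, stride=1):
    """Format one dimension of an array subset as [start:stride:stop]."""
    return f"[{start}:{stride}:{stop}]"


def dap2_url(dataset_url, response, variable=None, slabs=()):
    """Build a request URL: dataset URL + response suffix + optional constraint.

    `slabs` holds one (start, stop) or (start, stop, stride) tuple per dimension.
    """
    url = f"{dataset_url}.{response}"
    if variable is not None:
        constraint = variable + "".join(hyperslab(*s) for s in slabs)
        # Keep the bracket/colon syntax readable; escape anything else.
        url += "?" + quote(constraint, safe="[]:,")
    return url


# Hypothetical dataset URL (an assumption, not a real server).
DATASET = "http://example.org/opendap/sst.nc"

# Metadata request: a textual description of the dataset's structure.
metadata_url = dap2_url(DATASET, "dds")

# Content request: one time step and a small index window of `sst`.
content_url = dap2_url(DATASET, "dods", "sst", [(0, 0), (10, 20), (30, 40)])

print(metadata_url)
print(content_url)
```

Because the subset is expressed entirely in the URL, any HTTP client can fetch it; libraries such as the netCDF client hide this step so that the application sees what looks like a local file.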

NOAA grant for

OPeNDAP-Unidata Linked Servers (OPULS)

• Goal 1: conformance & linkage between OPeNDAP & Unidata DAP-servers, with short-term outcomes:
  – New data-model & protocol specs: DAP4
    • Consistent behaviors of OPeNDAP & Unidata servers
    • Data-type richness (NetCDF4, HDF5, RDBs)
  – Extensions (i.e., new server behaviors):
    • Irregular-mesh subsetting
    • Asynchronous access
• Goal 2: common framework for OPeNDAP & Unidata servers, aiming for an architecture that
  – Underpins the unique strengths of both
  – Reduces likelihood of redundant effort

OPULS Progress So Far

• Draft of DAP4 data model & protocol specs
  – Sufficient for the full richness of NetCDF-4 and HDF-5 files (including “Groups,” e.g.)
• Progress on rigorous conformance-testing
• Successful extensibility experiments
  – Irregular-mesh (i.e., UGRID) subsetting
  – Asynchronous access (as may be useful for near-line data storage)
  – Amazon cloud deployment (more later…)

Other technologies OPULS considered

• JSON responses as an alternative to XML
  – Decided they added too much bulk to the specification and too many requirements for implementers
  – Could be added in a future version
  – Can be built using XSLT from DAP4 XML
• OpenSearch
  – Not incorporated into DAP4 for many of the same reasons
• The DAP4 metadata response specifically includes support for these
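
To illustrate why JSON can stay outside the specification, here is a hedged sketch of deriving a JSON view from an XML metadata response on the client side. It uses Python's ElementTree in place of XSLT (a deliberate substitution for brevity), and the sample document is a simplified, DMR-like stand-in: its element and attribute names are illustrative assumptions, not quoted from the DAP4 draft.

```python
import json
import xml.etree.ElementTree as ET

# Simplified, hypothetical metadata document (not an actual DAP4 response).
SAMPLE = """<Dataset name="sst.nc">
  <Dimension name="time" size="12"/>
  <Float32 name="sst"><Dim name="/time"/></Float32>
</Dataset>"""


def element_to_dict(elem):
    """Recursively turn an XML element into a plain dict.

    The tag goes under "element", attributes are copied as-is, and child
    elements (if any) are listed in document order under "children".
    """
    node = {"element": elem.tag, **elem.attrib}
    children = [element_to_dict(child) for child in elem]
    if children:
        node["children"] = children
    return node


def xml_to_json(text):
    """Parse an XML string and serialize the resulting dict as JSON."""
    return json.dumps(element_to_dict(ET.fromstring(text)), indent=2)


print(xml_to_json(SAMPLE))
```

The transformation is purely mechanical, which is the point of the bullet above: a JSON rendering can be layered on top of the XML response without adding requirements for server implementers.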

OPULS and Feedback

• OPULS is ready for community feedback
• Design documents are online
  – Web site: http://docs.opendap.org/
  – The current draft specification is there as well
• Many features are already available in C++ and C implementations

Hyrax over Amazon/S3

• Exploits a natural fit between DAP-based services and cloud services

• Initial progress already achieved under the OPULS grant

• Bears interesting similarities to the challenge of asynchronous data access

• May yield a new community of OPeNDAP users

More about clouds…

• Hyrax is trivial to run on the Amazon cloud
• We are looking at ways to work with data held in S3
• S3 characteristics:
  – Flat
  – Modest response times
  – Simple GET/PUT type API

Using S3

• Tried S3 file systems, and found them wanting
  – Not interoperable (hardly surprising, but limiting)
  – Extra layer to software stack
• Now working with XML ‘catalogs’
  – XML documents create a faux hierarchy
  – XML + XSLT → HTML (i.e., a ‘free’ web interface)
  – XML + Hyrax + caching → DAP access
  – The XML is very similar to THREDDS catalogs
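
The "faux hierarchy" idea can be sketched as follows: a flat list of object keys is folded into a nested XML catalog by splitting each key on `/`. This is only an illustration under stated assumptions. The element names `catalog`, `folder`, and `dataset`, the `key` attribute, and the sample keys are all hypothetical; they are not the schema used by Hyrax or by THREDDS catalogs.

```python
import xml.etree.ElementTree as ET


def find_folder(parent, name):
    """Return the child <folder> with this name, or None if absent."""
    for child in parent:
        if child.tag == "folder" and child.get("name") == name:
            return child
    return None


def build_catalog(keys):
    """Fold flat object keys into a nested <catalog> element tree."""
    root = ET.Element("catalog")
    for key in sorted(keys):
        *folders, leaf = key.split("/")
        node = root
        for name in folders:
            child = find_folder(node, name)
            if child is None:
                child = ET.SubElement(node, "folder", name=name)
            node = child
        # The full key is kept so a server could fetch the object with a GET.
        ET.SubElement(node, "dataset", name=leaf, key=key)
    return root


# Hypothetical keys as they might appear in a flat object store.
keys = ["sst/2012/jan.nc", "sst/2012/feb.nc", "wind/2012/jan.nc"]
catalog = build_catalog(keys)
print(ET.tostring(catalog, encoding="unicode"))
```

Keeping the hierarchy in a document rather than in a file-system layer avoids the extra software layer noted above, and the same XML can feed both a browsable HTML view and the data server.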

Elaboration on Server Functions

• Proposition: the future of OPeNDAP may lie in provision of data-proximate (i.e., server-side) functions that:
  – Deliver precisely defined subsets
  – Reduce the number of off-target retrievals
    • I.e., enable querying of complex dataset properties
  – Remap/transform data to simplify data use, especially multi-source data integration

• Effective caching will be required

Server Functions, DAP4

• DAP2 supports functions and functional composition

• Currently, DAP4 treats ‘functions’ and a ‘functional language’ as an extension

• DAP4 provides more complete support for functions, including metadata responses (DAP2 does not provide this; a gap in the DAP2 specification)

• Support for POST

Server Functions, experimentation

• UGRID: Unstructured Grid (irregular mesh) subsetting

• We have implemented a clone of the GDS server’s syntax for functions

• Enables current netCDF-based DAP clients (e.g., ECMF) to use the UGRID function

• Other projects: Multi-instrument inter-calibration

Some Server-Function Ideas

• Binning: returns a distribution (as a raster of boolean values on a user-specified grid) of data values satisfying some criteria

• Masking: accepts a raster of zero/nonzero values as a query argument, perhaps as a geospatial selection criterion, e.g.

• Perhaps some (limited?) form of functional language for very rich capabilities

• WPS, et al.
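
A rough sketch of what the binning and masking ideas above might compute, assuming point observations given as `(lat, lon, value)` tuples and a regular grid. The grid description, the function names, and the sample data are all hypothetical; nothing here reflects an existing Hyrax server function.

```python
def cell_index(lat, lon, grid):
    """Map a point to (row, col) on a regular grid, or None if outside it."""
    row = int((lat - grid["lat0"]) // grid["dlat"])
    col = int((lon - grid["lon0"]) // grid["dlon"])
    if 0 <= row < grid["nrows"] and 0 <= col < grid["ncols"]:
        return row, col
    return None


def bin_points(points, grid, criterion):
    """Binning: boolean raster, True where some value in the cell meets the criterion."""
    raster = [[False] * grid["ncols"] for _ in range(grid["nrows"])]
    for lat, lon, value in points:
        idx = cell_index(lat, lon, grid)
        if idx is not None and criterion(value):
            raster[idx[0]][idx[1]] = True
    return raster


def mask_points(points, grid, mask):
    """Masking: keep only the points that fall in cells where the mask is nonzero."""
    selected = []
    for lat, lon, value in points:
        idx = cell_index(lat, lon, grid)
        if idx is not None and mask[idx[0]][idx[1]] != 0:
            selected.append((lat, lon, value))
    return selected


# Hypothetical 2 x 2 grid of 10-degree cells with its origin at (0, 0).
grid = {"lat0": 0, "lon0": 0, "dlat": 10, "dlon": 10, "nrows": 2, "ncols": 2}
points = [(5, 5, 30.0), (5, 15, 10.0), (15, 15, 28.0), (25, 5, 99.0)]

warm_cells = bin_points(points, grid, lambda v: v > 25)
selected = mask_points(points, grid, [[0, 1], [0, 0]])
print(warm_cells)
print(selected)
```

The two operations compose naturally: the boolean raster returned by a binning request could be sent back as the mask argument of a later request, so that only data in the cells of interest ever leaves the server.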

Summary

• DAP is based on a domain-neutral data model and an expression-based constraint language

• While not ‘RESTful’ in the strictest sense, it is a REST design in spirit (DAP predates the term by several years)

• OPULS is a collaborative project between OPeNDAP and Unidata that intends to update DAP

• We are also running several experimental mini-projects within its context:
  – Asynchronous access, Unstructured Grid access, Cloud computing and an expanded, function-based, server-side processing system

• DAP servers provide a good platform on which to build OGC web services, as described in the following presentation.