Page 1
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Development of a Geospatial Grid by Integrating OGC Services with Globus-
based Grid TechnologyLiping Di, Aijun Chen, Yaxing Wei, Yang Liu, and Wenli Yang
Laboratory for Advanced Information Technologies and Standards (LAITS)
George Mason University
Piyush Mehrotra and Chaumin Hu
NASA Ames Research Center
Dean Williams
DOE Lawrence Livermore National Laboratory
Page 2
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Introduction
Most data (>80%) are geospatial in nature. Geospatial data are
Heterogeneous (e.g., format, content, discipline, etc); Voluminous (e.g., NASA EOS collects > 2TB data every day); and
Geographically distributed (e.g., different data centers).
The information extraction and knowledge discovery capability lags far behind the data collection capability.
Some of the challenges the data user community faces include Difficult to find and access potentially useful data; Lack of resources (technical, computational, etc) to process the data; Incompatibility among data, services, and service interfaces.
It is of significant value to provide the user community the technologies that can make fully, effective, wise, and easy use of geospatial data.
Page 3
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Grid Technology
The Grid technology is developed for securely sharing computational resources within an virtual organization.– Computer CPU cycles– Storage– Networks– Data, Information, algorithms, software, services.
It was originally motivated and supported from sciences and engineering requiring high-end computing, for sharing geographically distributed high-end computing resources.
The core of the technology is the the open source middleware, Globus Toolkit.– The latest version (4.0.1) of Globus implements the Open Grid
Service Architecture (OGSA) and converged with Web Services technology.
Page 4
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Benefits of Grid to the EO Community
Earth Observation (EO) community is one of the key communities that collect, manage, process, archive and distribute geospatial data and information.
EO data and associated computational resources are highly distributed due to geographically distributed archiving and processing facilities.
The multi-disciplinary nature of global change research and applications requires the integrated analysis of huge volumes of multi-source data from multiple data centers, which requires sharing of both data and computing powers among data centers.
Grid technology can provide valuable help in this area.
Page 5
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Need for Geospatial Extensions to Grid
Geospatial data and information are significantly different from those in other disciplines.– Highly complex and diverse
• Formats, spatial reference systems, resolutions.• Hyper-dimensions: spatial, temporal, spectral, thematic.• Raster and vector types
– Multidisciplinary– Tremendous data volume
• more than 80% of data human beings has collected is spatial data.
The geospatial community has developed a set of standards specifically for geospatial data and information that users have been familiar with. (e.g., OGC, ISO, FGDC).
Grid technology is developed for general sharing of computational resources and not fine tuned to meet the unique geospatial requirements.
Page 6
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The OGC Web Service Specifications
The Web Coverage Services (WCS) specification: defines the standard interfaces between web-based clients and servers for accessing coverage data.– All imagery type of remote sensing data is coverage data.
The Web Feature Services (WFS) specification: defines the standard interfaces between web-based clients and servers for accessing feature-based geospatial data.– vector and point data are feature data.
The Web Map Services (WMS) specification: define the standard interfaces for accessing and assembling maps from multiple servers. – visualization of geospatial data
The Catalog Services for Web (CSW) specification: defines the interfaces between web-based clients and servers for finding the required data or services from registries.
Page 7
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The OGC Web Service Specifications (cont.)
The Web Coordinate Transformation Services (WCTS) specification: defines the standard interfaces between web-based clients and servers for performing spatial coordinate reference system transformations.
– Transformation among different Spatial Reference Systems.
The Web Image Classification Services (WICS) specification: defines the standard interfaces between web-based clients and servers for classifying an imagery into categorical classes.
– Especially for land use/cover classification of remote sensing imagery, both supervised and unsupervised.
WCS, WFS, CSW, WCTS, and WMS form the foundation for the interoperable geospatial data access and service environment.
WICS is a relatively higher level information extraction service, widely used in EO community.
OGC Web Service Architecture specifies a common architectural framework for OGC Web Services.
Page 8
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Objectives
Making NASA EOSDIS data easily accessible to Earth science modeling and applications communities by combining the advantages of both OGC and Grid technology – Develop the geospatial extensions of Grid technology to make it
geospatially enabled (Geospatial Grid).– Enable OGC geospatial clients access Grid-managed distributed
geospatial resources.– Provide virtual/intelligent geospatial products in the Grid environment.– Test methods for automating the process from geospatial data to
knowledge in the Grid environment. Demonstrate the geospatial Grid technology in realistic
NASA EOS data environment. Contribute technology, software, and the data pool
application to the CEOS Grid testbed
Page 9
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Geospatial Extensions to GRID
Incorporating geospatial-specific characteristics in Grid– Extend Globus toolkit to handle the spatial, spectral,
temporal, thematic based geo-data and geo-information management.
– Develop Grid-enabled tools for geospatial data processing, information extraction, and knowledge building.
Making use of established standards and standard protocols for data/information access and services in the geospatial community.– The Open GIS Consortium’s Web Data Access/Service
interfaces (e.g., OGC WCS, WMS, WFS, and CSW).– ISO/FGDC/ECS/GCMD metadata standards and
service/data type schema.
Page 10
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Grid Security (GSI) and VO Setup
GMU (Solaris) (laits.gmu.edu)GT 3.2 with CEOS Certs.
GMU (Mac)(geobrain.laits.gmu.edu)
Globus 4.0 with Laits Certs.
LAITS CA center
Ames ipg05 (Linux)(ipg05.ipg.nasa.gov)Globus 3.2 with IPG
Certs.GMU LAITS VO NASA IPG VO
GMU (Linux)(data.laits.gmu.edu)
Globus 4.0 with Laits Certs.
IPG CA center
NASA SGT (Linux)(arao2.sgt-inc.com)Globus 3.2 with CEOS
Certs.
NASA (Linux)(former.intl-interfaces.net)Globus 3.0 with CEOS Certs.
CEOS VO
Authentication among different VO
LLNL esg2 (Linux)(esg2.llnl.gov)
Globus 3.2 with ESG Certs.
LLNL ESG VO
ESG CA center
GMU (Linux)(salmon.laits.gmu.edu)Globus 4.0 with Laits
Certs.
Page 11
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The VO Hardware Environment
The testbed has been created with seven machines in three organizations.
The flagship machine in the testbed is GMU’s Apple cluster server:– 6 Apple G5 server nodes- 3 with dual 2.5GHz CPU and 3 with
dual 2.0 GHz CPU with total of 12 GB RAM.– 22.6TB RAID storage.– 1GB network to Internet II and 100 MB to Internet I.– Hosted at ESDIS network lab of NASA GSFC.
Page 12
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Data in the Virtual Organization
Populated the G5 server with– Landsat data covering Globe for year 1975, 1990 and 2000 (7TB data
ingested to date).
– Shuttle DEM data covering Globe for year 2000 (1TB).
– Other sample EOSDIS data (e.g., MODIS, Aster, etc. 2TB to date).
Converted part of DOE LLNL 4-D netCDF modeling data to HDF-EOS format and loaded into the LLNL node.
Replicated some typical EOSDIS data at NASA Ames node.
The total size of data in the testbed is over 10TB and is growing.
Page 13
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The VO Grid Software
Globus 4.0 installed in two GMU nodes.
Globus 3.2 installed in all other nodes.
The geospatial Grid software developed by GMU installed at all nodes.
Setup and issue CAs– Set up LAITS CA, issued LAITS certificates to Mac machine and
Linux machine of GMU LAITS.
– Set up IPG CA, issued IPG certificates to Linux machine at NASA Ames.
– From CEOS CA, requested CEOS certificates for Solaris machine at GMU LAITS.
– Tested and debugged the authentication between any two different CAs’ certificates among all of the above boxes.
Page 14
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
An OGC Catalog Service for Web (CSW) server is developed.
The CSW server is Grid enabled as a Grid service, GCSW.
The GCSW is deployed on three nodes -- GeoBrain (Mac), LAITS (Solaris) and Data (Linux).
ISO 19115 Part one
ISO 19115 Part two(FGDC extension)
NASA ECSISO 19119ebRIM IM
GCMD Service Type IM
Extended Data Type IM
CSW Information Model
GCMD Service Type IM
Extended Data Type IM
Grid-enabled Catalog Service for Web
Page 15
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
targetObjectsourceObject
RegistryObject
Association
ServiceBinding
Orgnization
User
ExternalIdentifierClassification
RegistryEntry
ClassificationScheme
Service
ExtrinsicObject
Slot
SpecificationLink
ClassificationNode
ExternalLink
SV_OperationMetadata
SV_Parameter
Slot
CSWExtrinsicObject
0..*
0..*
0..*
11
0..*
0..* 0..*
WCSCoverage
WMSLayer
DataGranule
The CSW Architecture
Page 16
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Catalog Service Federation
The CSF developed in another project is being Grid-enabled so that the geospatial Grid can connect other data/data catalog.
Page 17
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Grid-enabled WCS and WCS Portal
WCS enable to process 4-D HDF-EOS data converted from LLNL netCDF modeling data.
Enhanced the WCS to be Grid enabled (GWCS).
Developed OGC standard compatible WCS portal supported by Grid Services to access to GWCS.
Deployed on three nodes: Geobrain (Mac), Laits (Solaris) and Data (Linux).
Page 18
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Grid-enabled WMS and WMS Portal
Enhanced the WMS to be Grid enabled (GWMS).
Developed OGC standard compatible WMS portal supported by Grid Services to access to GWMS.
Deployed on three nodes: Geobrain (Mac), Laits (Solaris) and Data (Linux).
Page 19
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Intelligent Grid Service Mediator
WCS PortalWMS Portal
GCSWGWCS
GWMS
iGSM
ROS MDS
DTS
The iGSM is developed to dispatch user requests from WCS/WMS portal to the most appropriate GWCS/GWMS in the VO.
Page 20
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Functional Overview of iGSM
Managing geospatial-data access requests from OGC WCS portal and WMS portal and transfer those requests to GWCS (Grid-enabled Web Coverage Service) or GWMS (Grid-enabled Web Map Service)
– Accepts geodata requests from default WCS portal and WMS portal.
– Queries a ROS (Replica Optimization Service) for an optimized PFNInfo (Physical File Name Information) object
• Each PFNInfo contains a physical file name, a GridWCS service ID, and the host where the data file located
– When the received PFNInfo contains a valid service ID
• Requests a GridCSW (Grid-enabled Catalog Service for Web) for corresponding GridWCS/WMS URL to the service ID.
– When the received PFNInfo contains a null service ID
• Requests a GridCSW for available GridWCS(s) /WMS(s) among the Grid resources.
• Requests a ROS (Replica and Optimized Service) for selecting the best GridWCS/WMS among the resources returned from the GridCSW
• Requests a DTS (Data Transfer Service) for transferring the data to the selected system
– Querying the GridCSW deployed in the selected system for the geodata URI
Page 21
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Globus RLS as a Grid Service
The Replica Optimization Service
Globus Index service
Globus MDS scripts modification
LRC (Laits) LRC (Laits-data) LRC (Ames/LLNL)
RLI (Laits) RLI (Laits-data)
ROS
MDS Index Service (MDS)
Page 22
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
GridFTP as a Grid Service
Data Transfer Service
Machine AGlobus Security
Machine BGlobus Security
Machine BGlobus Security
Machine CGlobus Security
Machine AGlobus Security
Data
Secure Request
Secure Request
Data
Secure Request
Page 23
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Geospatial Grid with GCSW/GWCS/GWMS/iGSM/ROS/DTS
WCS PortalWMS Portal
GCSW GWMS
GWCS
iGSM
ROS
MDS
DTS
CSW Portal
User/Client Interface (Web Download & MPGC)
GeospatialCatalog
DB Replica DB
HDF-EOS Data
12 2
Laits (3)
Ames
LLNL
Page 24
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
A Data Request Scenario for Access to Real Datasets
RLS
LAITS WCS/WMS Portal
CSW Portal
ESG Catalog
Client
1
2
3
Retrieval Manager
4
LAITSGridWCS
AmesGridWCS
LLNLGridWCS
+ default WCS portal IP
ROS
5
2
Logical data name
Physical data/service ID
MDS
6
Best server ID7
iGSM
Other WCS
LAITS GridCSW
HDF-EOS Data
Other Data
Ames DTS
9
9
9
8
Page 25
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Virtual Geospatial Datasets
A virtual dataset is a dataset that has the following characteristics:– It does not physically exist in a data and information system.– The system knows how to create it on-demand.– Once created, it can be kept to meet the same type of
request from multiple users without having to be regenerated. The client/data user is not aware of the difference
between a real dataset and a virtual dataset. A virtual dataset can be materialized by
– invoking a single module dedicated to the production of the virtual dataset (dedicated module approach).
– chaining and executing a series of services, each fulfilling a component in the process of the virtual dataset materialization (service oriented approach).
Page 26
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
The Service Approach to Virtual Datasets
A service is self-contained, self-describing, modular application that can be published, located, and dynamically invoked across a network.
– It performs functions, which can be anything from simple requests to complicated business processes.
– Once a service is deployed, other applications (and other services) can discover and invoke the deployed service.
A service can be implemented in the Web environment, called a web service, or in the Grid environment, called a Grid service.
Standards on service discovery, declaration, binding, and invocation allow dynamically chaining individual services across a network together to fulfill a complex task.
A virtual dataset, in the service environment, is essentially a service chain that describes steps to be taken to produce the virtual dataset.
With enough elementary service models, it is possible to provide unlimited numbers of virtual datasets by just creating the service chains.
Page 27
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Archived data (geo-object)
User requested geo-object
Intermediate geo-object
Without service With service Modeling and virtual data services
User request
User received
Data transformation services (format transformation, reprojection, regriding, etc)Data access services (spatial/temporal/parameter subsetting, mapping, etc)
High level services (geophysical parameters, modeling, etc)
Geo-object, Geo-tree, Virtual Dataset, Geospatial Models
Page 28
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
User Creation of Geospatial Models
A user-requested product may not exist, either virtually or physically, in a data/information system, which is often the case if the product is at high level or is of specific purpose.
If the user possess the knowledge of deriving the data product from available lower-level data, he/she can create a logical geospatial model for the product.– With help of a friendly user interface and the availability of service
modules and models/submodels, the user can construct a geospatial model/virtual data product interactively.
– The system will materialize the virtual data product (the user-created logical model) through an instantiation process.
– The user-created model can, when proved to be useful, be incorporated into the system as a part of the virtual datasets with which the system can provide in the future.
Page 29
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
User Creation of Geospatial Models (cont.)
The capability of the system will grow with time, possibly at a tremendous speed, if there are enough user requesting/modeling.
Advantages of virtual product modeling include: – Users can get ready-to-use scientific information without having to
obtain lower level data and to go through all the data processing process locally, thus significantly reducing the data traffic between the users and the geospatial Grid.
– It allows users to explore huge resources available at a data Grid and to conduct tasks that they never be able to conduct before.
– It can create a very powerful system.
Page 30
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
User
CSW
WCS
VWCS
Register AM
Query
GeospatialData
InstantiateService
WorkflowEngineService
WCS
WICS
WCTS
1
2
LAITS
ESG
ECHO
GSI (gt4.0)
CSWQueryM
Architecture of Virtual Data Implementation
Page 31
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
GCSW and iGSM cooperate as Ganglia
GWCS, GWMS, GWICS, GWCTS ROS, GridDTS as Nerve Cell
Grid and Web Services related technologies as basic infrastructure
More Cells and more powerful Ganglia will be developed for more easier and more complete Earth Science Data accesses.
More specific Ganglia and Cells will be provided for special domain user requirements.
Future Direction
Page 32
LAITS
Laboratory for Advanced Information Technology and Standards
GGF15 Community Activity: Building Geographic Information Grids 10/04/2005, Boston
Acknowledgement
The Project is funded by NASA Advanced Information System Technology Program (AIST)