CWIC Client Partner’s Guide Version: 1.1

CEOS CWIC Project

CWIC Client Partner’s Guide

Mar. 2012

Document version 1.1


Revision History

Date / Version / Brief Description / Author
3/7/2012 / 1.1 / CWIC client partner’s guide / Yuanzheng Shao, Lingjun Kang, Archie Warnock


Table of Contents

Revision History

Table of Contents

List of Tables

List of Figures

CWIC Client Partner’s Guide

1.Before You Begin

1.1.CWIC Background

1.2.CWIC Concept and Design

1.3.Skills You Will Need As a Client Partner

1.4.CWIC Terms and Definitions

1.5.CWIC Systems

2.Metadata Model in CWIC

2.1.Dublin Core Metadata

2.2.ISO 19115-2 Metadata

2.3.WGISS Search Criteria

3.CSW Query Interface

3.1.Introduction

3.2.GetCapabilities Operation

3.3.DescribeRecord Operation

3.4.GetRecords Operation

3.4.1. Request Attribute

3.4.2. Identifier of Dataset

3.4.3. Spatial Searching Clause

3.4.4. Temporal Searching Clause

3.5.GetRecordById Operation

3.5.1. Request Attribute

3.5.2. Identifier of Granule

4.CWIC Client Partner Guide

4.1.Clients Start From GCMD

4.2.Clients Start From CWIC Capabilities

5.Use Case

5.1.Retrieve Entry ID From GCMD

5.1.1. Retrieve Entry ID From GCMD Web Portal

5.1.2. Retrieve Entry ID From GCMD CSW

5.2.Interact With CWIC Server

6.Appendix

6.1.NOAA CLASS wrapper

6.2.NASA ECHO wrapper

6.3.USGS LandSAT wrapper

6.4.INPE wrapper

6.5.China AOE CSW

7.Reference

List of Tables

Table 1 The Dublin Core Metadata Element in CWIC Request

Table 2 The Dublin Core Metadata Elements in CWIC Response

Table 3 The ISO 19115-2 Elements in CWIC Response

Table 4 OGC CSW Operations implemented in CWIC

Table 5 Details of query support on data inventory level

Table 6 Table of Request Attributes

Table 7 Table of Request Attribute

Table 8 Use Case: Retrieve Entry ID from GCMD web portal

Table 9 Use Case: Retrieve Entry ID from GCMD CSW

Table 10 Use Case: Interact with CWIC Server

Table 11 Dataset in NOAA CLASS

Table 12 Dataset in USGS LandSAT

Table 13 Dataset in INPE

List of Figures

Fig. 1 The Mediator-Wrapper Architecture

Fig. 2 The System Architecture of CWIC

Fig. 3 Example of ows:BoundingBox

Fig. 4 Element of dct:references

Fig. 5 Element of ows:WGS84BoundingBox

Fig. 6 Element of ows:BoundingBox

Fig. 7 The UML of CWIC implemented 19115-2 Elements

Fig. 8 Extended FederationMetadata Element in CWIC capabilities

Fig. 9 GetRecords POST Request Template in CWIC

Fig. 10 Dataset Identifier Clause

Fig. 11 Spatial Searching Clause

Fig. 12 Temporal Searching Clause

Fig. 13 GetRecordById POST Request Template in CWIC

Fig. 14 Granule Identifier Clause

Fig. 15 Example of FederationMetadata Element

Fig. 16 GCMD Dataset Query Interface

Fig. 17 Refine GCMD results by Full Text

Fig. 18 Dataset information in GCMD

Fig. 19 GCMD GetRecords request XML payload – Filter predicate

Fig. 20 GCMD GetRecords request XML payload – CqlText predicate

Fig. 21 GCMD GetRecords response

Fig. 22 GCMD GetRecordById request XML payload

Fig. 23 CWIC GetRecords request XML payload

Fig. 24 CWIC GetRecords response

Fig. 25 CWIC GetRecordById request XML payload

Fig. 26 CLASS Data Ordering Response


CWIC Client Partner’s Guide

1.Before You Begin

1.1.CWIC Background

For scientists who conduct multi-disciplinary research, there may be a need to search multiple catalogs in order to find the data they need. Such work is very time-consuming and tedious, especially when the catalogs may use different metadata models and catalog interface protocols. It would be desirable, therefore, for those catalogs to be integrated into a catalog federation, which will present a well-known and documented metadata model and interface protocol to users and hide the complexity and diversity of the affiliated catalogs behind the interface. With such a federation, users only need to work with the federated catalog through the public interface or API to find the data they need instead of working with various catalogs individually.

The Committee on Earth Observation Satellites (CEOS) coordinates the satellite Earth Observation (EO) programs of the world's government agencies, along with agencies that receive and process data acquired remotely from space. The Working Group on Information Systems and Services (WGISS) is a subgroup of CEOS that aims to promote collaboration in the development of systems and services that manage and supply EO data to users worldwide. The CEOS WGISS Integrated Catalog (CWIC) was implemented to realize a federated catalog for data discovery across multiple EO data centers. CWIC is intended to provide inventory search against WGISS agency catalog systems for EO data.

1.2.CWIC Concept and Design

The mediator-wrapper architecture has been widely adopted to realize integrated access to heterogeneous, autonomous data sources. As depicted in Fig. 1, each data source archives data and disseminates it through the Internet. The wrapper on top of the data source provides a universal query interface by encapsulating heterogeneous data models, query protocols, and access methods. The mediator interacts with the wrappers and provides the user with integrated access through the global information schema.

Wrappers offer query interfaces hiding the particular data model, access path, and interface technology of the partner catalog systems. Wrappers are accessed by a mediator, which offers users a front-end integrated access through its global schema. The user poses queries against the global schema of the mediator; the mediator then distributes the query to the individual systems using the appropriate wrappers. The wrappers transform the queries so they are understandable and executable by the partner catalog systems they wrap, collect the results, and return them to the mediator. Finally, the mediator integrates the results as a user response.

Fig. 1 The Mediator-Wrapper Architecture
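The query flow described above can be illustrated with a minimal Python sketch. It is not CWIC's implementation: the class names, the `keyword` query field, the `native_id` result field, and the two fake catalogs are all hypothetical and exist only to show how a mediator fans a global query out to wrappers and merges the translated results.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class Wrapper:
    """Hides one partner catalog behind a common query interface."""
    catalog_id: str
    # Hypothetical stand-in for the partner catalog's native search call.
    native_search: Callable[[Dict[str, str]], List[Dict[str, str]]]

    def query(self, global_query: Dict[str, str]) -> List[Record]:
        # Translate the global query into the catalog's native form.
        native_query = {"q": global_query["keyword"]}
        native_results = self.native_search(native_query)
        # Convert native records back into the global schema.
        return [
            {"identifier": r["native_id"], "catalog": self.catalog_id}
            for r in native_results
        ]


class Mediator:
    """Distributes a query to every wrapper and integrates the results."""

    def __init__(self, wrappers: List[Wrapper]) -> None:
        self.wrappers = wrappers

    def search(self, global_query: Dict[str, str]) -> List[Record]:
        merged: List[Record] = []
        for wrapper in self.wrappers:
            merged.extend(wrapper.query(global_query))
        return merged


# Two fake partner catalogs, used only for illustration.
def fake_catalog_a(native_query: Dict[str, str]) -> List[Dict[str, str]]:
    return [{"native_id": "A-" + native_query["q"]}]


def fake_catalog_b(native_query: Dict[str, str]) -> List[Dict[str, str]]:
    return [{"native_id": "B-" + native_query["q"]}]


mediator = Mediator([
    Wrapper("CatalogA", fake_catalog_a),
    Wrapper("CatalogB", fake_catalog_b),
])
results = mediator.search({"keyword": "sst"})
print(results)
```

The client only ever sees the mediator's `search` call and the global record shape; each catalog's native query and result formats stay inside its wrapper.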

Based on the mediator-wrapper architecture, version 1 of CWIC has been developed and is operational with five member catalog systems: the NASA Earth Observing System Clearinghouse (ECHO), the NOAA Comprehensive Large Array-data Stewardship System (CLASS), the USGS Landsat Catalog System, the INPE Catalog System (Brazil), and the data catalog system of the Academy of Opto-Electronics (AOE) of the Chinese Academy of Sciences.

The five member catalog systems use different query interfaces and heterogeneous metadata models. ECHO exposes its query interface through its native Application Programming Interface and adopts the ECS metadata model in its response. The CLASS catalog uses its native metadata model and is accessible through the interface provided by the NOAA Enterprise Archive Access Tool (NEAAT). Both USGS and INPE provide HTTP GET query interfaces for data search and use their own self-defined metadata models in the response. To implement a one-stop federated catalog system, wrappers have been developed for those member catalogs that are not compliant with the CWIC standard protocols.

Fig. 2 The System Architecture of CWIC

Fig. 2 illustrates the system architecture of CWIC. Four wrappers were implemented for four data inventories (i.e., NOAA CLASS, USGS Landsat System, NASA ECHO, and INPE). Each wrapper is responsible for translating requests and dispatching them to its data inventory.

Since the AOE data center already provides a CSW query interface and has adopted the ISO-19115 metadata model (2), no wrapper for AOE was implemented within CWIC. For each of the other four data centers, a wrapper was created that translates the CSW query language into the center's native query interface and converts its native metadata to the ISO-19115 (2) metadata schema. The mediator is in charge of dispatching query requests and returning the responses to the data user.

1.3.Skills You Will Need As a Client Partner

As a CWIC Client Partner, you need to be familiar with basic software development and Service Oriented Architecture (SOA) concepts such as:

  • XML and XML Schema (XSD)
  • Service-based Application Programmer’s Interface (API)
  • Web development programming language

1.4.CWIC Terms and Definitions

For the purposes of this document, the following terms and definitions apply:

(1) client

A software component that can invoke an operation from a server

(2) data clearinghouse

The collection of institutions providing digital data, which can be searched through a single interface using a common metadata standard

(3) identifier

A character string that may be composed of numbers and characters that is exchanged between the client and the server with respect to a specific identity of a resource

(4) operation

The specification of a transformation or query that an object may be called to execute

(5) profile

A set of one or more base standards and - where applicable - the identification of chosen clauses, classes, subsets, options and parameters of those base standards that are necessary for accomplishing a particular function

(6) request

The invocation of an operation by a CWIC client

(7) response

The result of an operation, returned from CWIC server to CWIC client

(8) collection

A grouping of granules that all come from the same source, such as a modeling group or institution. Collections have information that is common across all the granules they "own" and a template for describing additional attributes not already part of the metadata model.

(9) dataset

Has the same meaning as collection, see (8)

(10) granule

The smallest aggregation of data that can be independently managed (described, inventoried, and retrieved). Granules have their own metadata model and support values associated with the additional attributes defined by the owning collection.

(11) GCMD

NASA's Global Change Master Directory, a comprehensive directory of information about Earth science data, accessible at

(12) IDN

The CEOS International Directory Network, a Gateway to the world of Earth Science data and services accessible at

(13) DIF Entry ID

Entry ID is the unique document identifier of the metadata record, which is determined by the metadata author or data center contact personnel and may be identical to identifiers used by the data provider’s data center or organization.

(14) Catalog ID

An identifier that stands for an EO data center, such as ‘NASA’ or ‘NOAA’.

1.5.CWIC Systems

There are two CWIC systems that you, as a Client Partner, have access to:

  • CWIC Operations. This is the current operational system for CWIC and is available to all users.
    Location:
  • CWIC Partner Test. This is a test system area used by partners and CWIC developers to test before changes to the CWIC system go operational.
    Location:

2.Metadata Model in CWIC

The metadata model defines the concepts behind the query parameters. Different data inventories design and maintain different metadata models. CWIC adopts universal metadata models to integrate these heterogeneous models, which gives CWIC clients a uniform way to discover catalog content. Two metadata models are adopted in CWIC: Dublin Core Metadata and ISO 19115-2 Metadata. Dublin Core Metadata is used to describe the parameters in both the catalog request and the catalog response. ISO 19115-2 Metadata is used to describe the parameters in the catalog response. Not all elements of Dublin Core Metadata or ISO 19115-2 Metadata are implemented in CWIC. A synopsis of the implemented elements is presented in the following sections.

2.1.Dublin Core Metadata

The Dublin Core Metadata elements used in CWIC’s catalog REQUEST are listed in Table 1.

Element / Definition (d) / Expression
dc:subject (a) / Dataset identifier / CatalogID + Colon + GCMD Entry ID
dct:coverage.dateStart (b) / Start of temporal searching criterion / yyyy-MM-ddTHH:mm:ssZ (e) or yyyy-MM-dd HH:mm:ss
dct:coverage.dateEnd / End of temporal searching criterion / yyyy-MM-ddTHH:mm:ssZ (e) or yyyy-MM-dd HH:mm:ss
ows:BoundingBox (c) / Rectangle of spatial searching criterion / See Fig. 3
a: xmlns:dc="
b: xmlns:dct="
c: xmlns:ows="
d: “Definition” represents the semantic meaning of the element in CWIC. It is slightly different from the generic meaning in Dublin Core Metadata.
e: ISO 8601 – see

Table 1 The Dublin Core Metadata Element in CWIC Request
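As a hedged illustration of the expressions in Table 1, the Python sketch below composes a dataset identifier and formats a timestamp in the `yyyy-MM-ddTHH:mm:ssZ` form. The function names and the sample Entry ID are hypothetical, and treating naive datetimes as UTC is an assumption of this sketch rather than a documented CWIC rule.

```python
from datetime import datetime, timezone


def dataset_identifier(catalog_id: str, entry_id: str) -> str:
    """Build a dc:subject value: CatalogID + Colon + GCMD Entry ID."""
    return f"{catalog_id}:{entry_id}"


def format_search_time(moment: datetime) -> str:
    """Format a datetime as yyyy-MM-ddTHH:mm:ssZ (ISO 8601, UTC)."""
    if moment.tzinfo is None:
        # Assumption of this sketch: naive datetimes are already in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# "EXAMPLE_ENTRY_ID" is a placeholder, not a real GCMD Entry ID.
subject = dataset_identifier("NOAA", "EXAMPLE_ENTRY_ID")
date_start = format_search_time(datetime(2012, 3, 7, 12, 30, 0, tzinfo=timezone.utc))
print(subject, date_start)
```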

<ogc:BBOX>
  <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
  <gml:Envelope srsName="EPSG:4326">  (a)
    <gml:lowerCorner>SouthBoundLatitude WestBoundLongitude</gml:lowerCorner>  (b)
    <gml:upperCorner>NorthBoundLatitude EastBoundLongitude</gml:upperCorner>  (b)
  </gml:Envelope>
</ogc:BBOX>

a: The search area must be defined with coordinates in EPSG:4326.

b: A coordinate in EPSG:4326 takes the form: latitude + blank + longitude

Fig. 3 Example of ows:BoundingBox
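The latitude-first ordering in Fig. 3 is easy to get wrong, so a small helper can make it explicit. The following Python sketch is illustrative only: `bbox_clause` is a hypothetical name, the range checks are an assumption of this sketch, and the returned fragment relies on the enclosing request to declare the `ogc` and `gml` namespace prefixes.

```python
def bbox_clause(south: float, west: float, north: float, east: float) -> str:
    """Return an ogc:BBOX fragment with corners written as 'latitude longitude'."""
    if not -90 <= south <= north <= 90:
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie within [-180, 180]")
    return (
        "<ogc:BBOX>\n"
        "  <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>\n"
        '  <gml:Envelope srsName="EPSG:4326">\n'
        f"    <gml:lowerCorner>{south} {west}</gml:lowerCorner>\n"
        f"    <gml:upperCorner>{north} {east}</gml:upperCorner>\n"
        "  </gml:Envelope>\n"
        "</ogc:BBOX>"
    )


clause = bbox_clause(south=30.0, west=-100.0, north=40.0, east=-90.0)
print(clause)
```

Keyword arguments at the call site keep the south/west/north/east roles visible, which helps avoid swapping latitude and longitude.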

The Dublin Core Metadata elements used in CWIC’s catalog RESPONSE are listed in Table 2. Because OGC CSW[1] defines different response types (i.e., “brief”, “summary”, “full”), not every metadata element is present in the responses of all types.

Element / Definition (d) / Present in “brief” response / Present in “summary” response / Present in “full” response
dc:identifier (a) / Granule identifier / Yes / Yes / Yes
dc:title / Description of granule / Yes / Yes / Yes
dc:type / Indicator of granule retrieval (i.e., downloadable) / Yes / Yes / Yes
dc:subject / Subject of granule / No / Yes / Yes
dct:modified (b) / Date on which the record was created or updated within the catalogue / No / Yes / Yes
dct:abstract / Abstract of granule / No / Yes / Yes
dct:coverage.dateStart / Start of temporal coverage / No / No / Yes
dct:coverage.dateEnd / End of temporal coverage / No / No / Yes
dct:references / See Fig. 4 / No / No / Yes
ows:WGS84BoundingBox (c) / See Fig. 5 / Yes / Yes / Yes
ows:BoundingBox / See Fig. 6 / Yes / Yes / Yes
a: xmlns:dc="
b: xmlns:dct="
c: xmlns:ows="
d: “Definition” represents the semantic meaning of the element in CWIC. It is slightly different from the generic meaning in Dublin Core Metadata.

Table 2 The Dublin Core Metadata Elements in CWIC Response
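A client that switches between response types may want to know in advance which elements to expect. The Python sketch below encodes Table 2 as a lookup; the names `MINIMUM_LEVEL` and `expected_elements` are hypothetical, and the encoding relies on the observation that in Table 2 every element present at one level is also present at the more detailed levels.

```python
from typing import Dict, List

LEVELS = ["brief", "summary", "full"]

# Least detailed response type in which each element appears (from Table 2).
MINIMUM_LEVEL: Dict[str, str] = {
    "dc:identifier": "brief",
    "dc:title": "brief",
    "dc:type": "brief",
    "ows:WGS84BoundingBox": "brief",
    "ows:BoundingBox": "brief",
    "dc:subject": "summary",
    "dct:modified": "summary",
    "dct:abstract": "summary",
    "dct:coverage.dateStart": "full",
    "dct:coverage.dateEnd": "full",
    "dct:references": "full",
}


def expected_elements(response_type: str) -> List[str]:
    """List the elements Table 2 marks as present for a response type."""
    rank = LEVELS.index(response_type)  # raises ValueError for unknown types
    return sorted(
        name for name, level in MINIMUM_LEVEL.items()
        if LEVELS.index(level) <= rank
    )


print(expected_elements("brief"))
```

For example, a client that needs `dct:references` to locate a download link would have to request the “full” response type.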

<dct:references scheme="retrieval schema">Granule retrieval URL</dct:references>  (a)

a: There are three kinds of retrieval schema:

“urn:x-cwic:Onlink”: The granule retrieval URL with this schema is a metadata retrieval URL.

“urn:x-cwic:Browse”: The granule retrieval URL with this schema is a data browsing URL.

“urn:x-cwic:Order”: The granule retrieval URL with this schema is a data ordering URL.

Fig. 4 Element of dct:references
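A client typically needs to sort the `dct:references` links of a granule by purpose. The following Python sketch maps the three retrieval schemas listed above to short role names; the role names, the function names, and the example URLs are hypothetical, and stripping whitespace from the scheme value is a defensive assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Roles for the three retrieval schemas described for dct:references.
SCHEME_ROLES: Dict[str, str] = {
    "urn:x-cwic:Onlink": "metadata",
    "urn:x-cwic:Browse": "browse",
    "urn:x-cwic:Order": "order",
}


def reference_role(scheme: str) -> str:
    """Map a scheme attribute value to a role, tolerating stray whitespace."""
    normalized = "".join(scheme.split())
    return SCHEME_ROLES.get(normalized, "unknown")


def group_references(references: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (scheme, url) pairs by role."""
    grouped: Dict[str, List[str]] = {}
    for scheme, url in references:
        grouped.setdefault(reference_role(scheme), []).append(url)
    return grouped


# Placeholder URLs for illustration only.
links = [
    ("urn:x-cwic:Browse", "https://example.org/browse/1"),
    ("urn:x-cwic:Order", "https://example.org/order/1"),
]
print(group_references(links))
```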

<ows:WGS84BoundingBox>

<gml:lowerCorner>SouthBoundLatitude WestBoundLongitude</gml:lowerCorner>

<gml:upperCorner>NorthBoundLatitude EastBoundLongitude</gml:upperCorner>

</ows:WGS84BoundingBox>

Fig. 5 Element of ows:WGS84BoundingBox

<ows:BoundingBox>

<gml:lowerCorner>SouthBoundLatitude WestBoundLongitude</gml:lowerCorner>

<gml:upperCorner>NorthBoundLatitude EastBoundLongitude</gml:upperCorner>

</ows:BoundingBox>

Fig. 6 Element of ows:BoundingBox
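On the response side, the corner values in Fig. 5 and Fig. 6 follow the same latitude-then-longitude order. A minimal Python sketch for reading them, assuming the corner text has already been extracted from the XML (the function names are hypothetical):

```python
from typing import Dict, Tuple


def parse_corner(text: str) -> Tuple[float, float]:
    """Parse 'latitude longitude' corner text into (lat, lon)."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError(f"expected two coordinates, got {text!r}")
    return float(parts[0]), float(parts[1])


def bbox_from_corners(lower_corner: str, upper_corner: str) -> Dict[str, float]:
    """Combine lowerCorner and upperCorner text into named bounds."""
    south, west = parse_corner(lower_corner)
    north, east = parse_corner(upper_corner)
    return {"south": south, "west": west, "north": north, "east": east}


print(bbox_from_corners("30.0 -100.0", "40.0 -90.0"))
```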

2.2.ISO 19115-2 Metadata

The ISO 19115-2 Metadata elements used in CWIC’s catalog RESPONSE are listed in Fig. 7 (UML) and Table 3.

Fig. 7 The UML of CWIC implemented 19115-2 Elements

Element / Definition (c) / Present in “brief” response / Present in “summary” response / Present in “full” response
gmi:MI_Metadata (a) / Root entity representing granule level imagery or gridded data. It is extended from gmd:MD_Metadata / Yes / Yes / Yes
gmd:fileIdentifier (b) / Identifier of granule / Yes / Yes / Yes
gmd:language / Language used for documenting metadata / Yes / Yes / Yes
gmd:contact / Contact information of party responsible for metadata maintenance / Yes / Yes / Yes
gmd:dateStamp / Date that the metadata was created / Yes / Yes / Yes
gmd:dataSetURI / Uniform Resource Identifier (URI) of the dataset to which the metadata applies / Yes / Yes / Yes
gmd:identificationInfo / Includes data and service identification (i.e., keyword, category, geo-spatial extent) / Yes / Yes / Yes
gmd:characterSet / Character coding standard used for the metadata set / No / Yes / Yes
gmd:metadataStandardName / Name of metadata standard (including profile name) used / No / Yes / Yes
gmd:metadataStandardVersion / Version (profile) of the metadata standard used / No / Yes / Yes
gmd:hierarchyLevel / Name of the hierarchy levels for which the metadata is provided / No / No / Yes
gmd:distributionInfo / Information about the distributor of and options for obtaining the resource(s) / No / No / Yes
gmd:dataQualityInfo / Overall assessment of quality of a resource(s) / No / No / Yes
gmi:acquisitionInformation (a) / Information about the conceptual schema of a dataset / No / No / Yes
a: xmlns:gmi="
b: xmlns:gmd="
c: “Definition” represents the semantic meaning of the element in CWIC. It is slightly different from the generic meaning in ISO 19115-2.

Table 3 The ISO 19115-2 Elements in CWIC Response

2.3.WGISS Search Criteria

The WGISS search criteria are a set of searchable and returnable elements identified by WGISS as being of potential use in guiding users to relevant data. The full set is described in a spreadsheet from Liping Di of George Mason University and available from the CEOS web site at

CWIC does not yet support search on the WGISS search criteria, but this is in the development plan for a future release. Many inventory systems do not support search on some or all of the possible elements, but CWIC will utilize those that are supported.

3.CSW Query Interface

The query interface stipulates the protocol between the client and the catalog server. CWIC adopts the OGC CSW specification[1] as its query interface.

3.1.Introduction

The OGC CSW specification stipulates the interfaces of a catalog service. The interfaces fall into three categories: the OGC service interface, the CSW discovery interface, and the CSW manager interface. The GetCapabilities operation belongs to the OGC service interface and provides summary information about the CWIC catalog. The operations under the CSW discovery interface are GetRecords, GetRecordById, DescribeRecord, and GetDomain. The operations under the CSW manager interface are Transaction and Harvest. Detailed operation information is listed in Table 4. The “CWIC Support” field indicates whether the operation is implemented in CWIC, and the “Supported HTTP Protocol” field indicates which HTTP method (GET or POST) is implemented for that operation.

Operation / Operation Description / CWIC Support / Supported HTTP Protocol
GetCapabilities / Retrieve catalog summary information / Yes / GET
GetRecords (a) / Retrieve dataset information. A list of the granules within the dataset is returned. / Yes / POST
GetRecordById (a) / Retrieve granule information. / Yes / GET/POST
DescribeRecord / Retrieve the information models supported in CWIC / Yes / GET
GetDomain / Retrieve the runtime information about information models. / No / N/A
Transaction / Interface for creating, modifying and deleting catalog records. / No / N/A
Harvest / Interface for pulling data references from inventories to the catalog. / No / N/A
a: The GetRecords and GetRecordById operations differ in data inventory support. See Table 5

Table 4 OGC CSW Operations implemented in CWIC
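To show how a client might act on Table 4, the Python sketch below records which HTTP methods each operation supports and builds a GetCapabilities GET URL. Several details are assumptions: the base URL is a placeholder (this guide does not state the endpoint here), the `version` value of `2.0.2` is assumed rather than taken from this section, and the function names are hypothetical. The `service`, `version`, and `request` keys are the usual OGC key-value parameters.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode

# HTTP methods per operation, transcribed from Table 4 (empty = not supported).
CWIC_OPERATIONS: Dict[str, Tuple[str, ...]] = {
    "GetCapabilities": ("GET",),
    "GetRecords": ("POST",),
    "GetRecordById": ("GET", "POST"),
    "DescribeRecord": ("GET",),
    "GetDomain": (),
    "Transaction": (),
    "Harvest": (),
}


def is_supported(operation: str, method: str) -> bool:
    """Check whether Table 4 lists the given HTTP method for an operation."""
    return method.upper() in CWIC_OPERATIONS.get(operation, ())


def get_capabilities_url(base_url: str, version: str = "2.0.2") -> str:
    """Build a key-value GetCapabilities URL (version is an assumed default)."""
    query = urlencode({"service": "CSW", "version": version, "request": "GetCapabilities"})
    return f"{base_url}?{query}"


# Placeholder endpoint, not the real CWIC location.
print(get_capabilities_url("https://cwic.example.org/csw"))
print(is_supported("GetRecords", "GET"))
```

Checking `is_supported` before sending a request lets a client fail early, for example when it is about to issue GetRecords over GET.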