GWD-R17 September 2003

Grid Data Service Specification: The Relational Realisation

Status of This Memo

This memo provides information to the Grid community regarding the specification of Grid Database Services. The specification is presently a draft for discussion. It does not define any standards or technical recommendations. Distribution is unlimited.

Copyright Notice

GGF Copyright to be inserted here.

Abstract

The Relational Model is widely used as a way of managing and storing data, Relational Database Management Systems are a prime widespread example of the use of relational data models. This specification extends the Grid Data Service Specification [GDSS] into a relational world of data. As such it includes details of extensions to the Grid Data Service Specification to allow description of relational data, creation of services to represent relational data and access to relational data.

The proposal is presented for discussion within the Global Grid Forum (GGF) Database Access and Integration Services (DAIS) Working Group, with a view to the document evolving to become a proposed recommendation. There are several respects in which the current proposal is incomplete, but it is hoped that the material included is sufficient to allow an informed discussion to take place concerning both its form and substance.

Contents

Abstract

1.Introduction

2.Notational Conventions

3.Specification Overview

3.1Scope of specification

3.2Mapping to Data Service model (portType hierarchy diagrams)

3.3Relationships with other specifications

4.Description PortTypes

4.1RelationalDescription (extends Data Description)

4.2RowsetDescription (extends Data Description)

5.Factory PortTypes

5.1sqlFactory (extends Data Factory)

5.2RowsetSelectionFactory (extends Data Factory)

5.3RemoteDBOperationFactory (extends Data Factory)

6.Access PortTypes

6.1SQLAccess (extends Data Access)

6.2RowsetAccess (extends Data Access)

6.3RemoteDBOperationAccess (extends Data Access)

7.Stored Procedures

8.Use Cases

8.1Synchronous sqlQuery

8.2“Asynchronous” sqlQuery

8.3Creating a new RDBMS (no virtualisation)

8.4Creating a virtualisation of a new RDBMS

8.5Creating an empty resource (no virtualisation)

8.6Removing a resource

8.7Creating and virtualising a new resource

8.8Calling a stored Procedure

9.Security

10.Conclusion

11.REMOVED

12.MISSING

13.UNRESOLVED

Author Information

Trademarks

Intellectual Property Statement

Full Copyright Notice

References

1.Introduction

This document describes how the base interfaces proposed in the Grid Data Services Specifcation [GDSS] are extended to describe and allow access to relational based data sources. How these interfaces are composed into services is not described in this document. The compositional aspects of how interfaces are combined to form services will largely be in the remit of WS-Agreement [WS Agreement]. Agreements will form the basis that specify what interfaces are to be supported by services instantiations, amongst other things. This document presumes an understanding of the material covered in the OGSA Data Services [Data Services] and Grid Data Services Specification [GDSS] documents.

Norman is writing introductions to all the documents.

2.Notational Conventions

The key words “MUST,” “MUST NOT,” “REQUIRED,” “SHALL,” “SHALL NOT,” “SHOULD,” “SHOULD NOT,” “RECOMMENDED,” “MAY,” and “OPTIONAL” are to be interpreted as described in RFC-2119 [RFC2199]

There is (WILL BE) an accompanying file, which contains full GWSDL descriptions for what is described in this specification. This GWSDL relies upon additional descriptions found elsewhere, the namespaces used are listed in the table below. Note that the choice of any namespace prefix is arbitrary and not semantically significant.

Prefix / Namespace
dais /
ogsi /
gwsdl /
sd /
wsdl /
http /
xsd /
xsi /
webr / (contents subject to change)
sqlxml / (temporary location)

Mario: namespaces not used in this document should be removed from the final version.

3.Specification Overview

3.1Scope of specification

This document describes how to extend the base interface specification for Data Services to allow access to relational forms of data. This does not imply that the data must be physically stored in a relational form, but merely that some implementation is providing the Data Services with a relational view.[m1]

Currently the creation and removal of databases is not considered within this document, there are potential overlaps with management models, which need to be resolved[m2].[JM3]

3.2Mapping to Data Service model

3.3Relationships with other specifications

  • SQL – permitted query language, need to be able to specify version etc of SQL allowed [REF to SQL standards].
  • WebRowSet – one possible dataFormat for response from SQLAccess operations. (

4.DataDescription PortTypes

The DataDescription portTypes allow metadata for data virtualizations to be made available. In the relational data model this has a hierarchical nature whereby you have metadata describing virtualizations of the RDBMS, a database and a result set that comes out of a database. Although the metadata can be made to apply across these boundaries it falls most naturally into one of the above views. In this document metadata corresponding to the virtualization of RDBMs is not considered (why?).

[JM4]

4.1RelationalDescription

4.1.1Database virtualization SDEs

  • schema[m5]: Describes the schema of the relational view of the data, such as tables, columns, column types, keys, etc[m6].

<sd:serviceData name=”schema”

type=”schemaType”

minOccurs=”1” maxOccurs=”1”

mutability=”mutable”

modifiable=”false”

nillable=”true”/>

  • indexes: Describes the names and definitions of any indexes available.

<sd:serviceData name=”indexes”

type=”IndexType”

minOccurs=”1” maxOccurs=”unbounded”

mutability=”mutable”

modifiable=”false”

nillable=”true”/>

  • storedProcedures: Describes the name, input and output types of stored procedures available.

<sd:serviceData name=”storedProcedures”

type=”StoredProcedureType”

minOccurs=”1” maxOccurs=”unbounded”

mutability=”mutable”

modifiable=”false”

nillable=”true”/>

  • userDefinedTypes: Describes the names and definitions of the user defined types deployed.

<sd:serviceData name=”userDefinedTypes”

type=”UserDefinedTypesType”

minOccurs=”1” maxOccurs=”unbounded”

mutability=”mutable”

modifiable=”false”

nillable=”true”/>

  • userDefinedFunctions: Describes the names and definitions of the user defined functions deployed.

<sd:serviceData name=”userDefinedFunctions”

type=”UserDefinedFunctionsType”

minOccurs=”1” maxOccurs=”unbounded”

mutability=”mutable”

modifiable=”false”

nillable=”true”/>

4.2RowsetDescription (extends Data Description)

4.2.1SDE’s

  • RowSchema[m7]: The schema for representing the RowSet.

<sd:serviceData name=”RowSchema”

type=”RowSchemaType”

minOccurs=”1” maxOccurs=”1”

mutability=”mutable”

modifiable=”false”

nillable=”false”/>

  • noOfRows: The number of rows within the Rowset.

<sd:serviceData name=”noOfRows”

type=”xsd:int”

minOccurs=”1” maxOccurs=”1”

mutability=”mutable”

modifiable=”false”

nillable=”false”/>

5.Factory PortTypes

Factories within this specification should be named giving indication as to the means by which the data behind the new service is selected. For example SQLFactory indicates a Factory, which uses SQL to select the data to be placed behind the new service. RowSetSelectionFactory indicates a Factory, which allows you to select between RowSets as the data to be placed behind a new service.

A Factory will create a service satisfying the createService operation, however the creation of a Grid Data service can be considered a two stage process,

  1. Creation of a service which provides the operations/port types required.
  2. Qualification of the data that should be behind this service. (i.e. For SQLFactory the results of the SQL Query)

The handle for this new service should be returned at the end of step 1. This releases the factory from service creation as soon as possible. Step 2 will occur asynchronously with respect to the Factory createService request and may or may not be prompted by an interaction with the new service handle.

Regarding Step 2 there are several further scenarios:

a)Qualification of the data occurs only once

b)Qualification of the data occurs every time an interaction occurs with the service.

Here b should imply that any updates which occurred to this resulting service will flow to the original data as updates there.

[JM8]

The creation of a relational view on the underlying data corresponds to the creation of a new data service based upon the original factory.

As specified in the Grid Data Service Specification, WS-Agreement [WS Agreement] must be used and will be the mechanism by which a request can specify the port types it wishes the resulting service to implement.

5.1sqlFactory (extends Data Factory)

An agreement document holding an SQL Query, the interfaces to be created and behaviour specification is passed to sqlFactory::createService. The factory will create a service fulfilling the desired behaviour, exposing the desired port types and representing the results of the SQL Query.

5.1.1SDE’s

All those defined for a Factory within Grid Data Services Specification. No additional SDE’s required.

5.1.2Operations

5.1.2.1sqlFactory::createService

Create a new Data Service which corresponds to the results of an SQL Query, see Grid Data Service Specification.

<xsd:complexType name=”sqlFactoryServiceParameterType”>

<xsd:complexContent>

<xsd:extension base=”gsa:AgreementType”>

<xsd:element name=”createInterface” type=”xsd:QName”

minOccurs=”1” maxOccurs=”unbounded”/>

<xsd:element name=”sqlExpression” type=”sqlExpressionType”/>

</xsd:extension>

</xsd:complexContent>

</xsd:complexType>

5.2RowsetSelectionFactory (extends Data Factory)

An agreement document holding the start position for first row and number of rows, the interfaces to be created and behaviour specification is passed to RowsetSelectionFactory::createService. The factory will create a service fulfilling the desired behaviour, exposing the desired port types and representing the requested rows.

5.2.1SDE’s

All those defined for a Factory within Grid Data Services Specification. No additional SDE’s required.

5.2.2Operations

5.2.2.1RowsetSelectionFactory::createService

Create a new Data Service, which corresponds to the rows selected, see Grid Data Service Specification.

<xsd:complexType name=”rowsetSelectionFactoryServiceParameterType”>

<xsd:complexContent>

<xsd:extension base=”gsa:AgreementType”>

<xsd:element name=”createInterface” type=”xsd:QName”

minOccurs=”1” maxOccurs=”unbounded”/>

<xsd:element name=”startPosition” type=”xsd:int”/>

<xsd:element name=”count” type=”xsd:int”/>

</xsd:extension>

</xsd:complexContent>

</xsd:complexType>

5.3RemoteDBOperationFactory (extends Data Factory)

An agreement document holding the remote Database Operation desired, the interfaces to be created and behavior specification is passed to remoteDBOperationFactory::createService. The factory will create a service fulfilling the desired behavior, exposing the desired port types and representing the requested rows.

5.3.1SDE’s

All those defined for a Factory within Grid Data Services Specification. No additional SDE’s required.

5.3.2Operations

5.3.2.1RemoteDBOperationFactory::createService

Create a new Data Service which corresponds to a Remote Database Operation such as Stored Procedure or User Defined Function, see Grid Data Service Specification.

<xsd:complexType name=”remoteDBOperationFactoryServiceParameterType”>

<xsd:complexContent>

<xsd:extension base=”gsa:AgreementType”>

<xsd:element name=”createInterface” type=”xsd:QName”

minOccurs=”1” maxOccurs=”unbounded”/>

<xsd:element name=”remoteDBOperation” type=”remoteDBOperationType”/>

<xsd:element name=”operationParameters” type=”operationParametersType”/>

</xsd:extension>

</xsd:complexContent>

</xsd:complexType>

6.Access PortTypes

The dataFormats SDE lists all possible dataFormats of results from Access porttype operations. A GSH is a valid dataFormat in which the case the creation of a new service occurs, this new services should inherit the agreement from the existing service implementing the Access PortType. This new service should therefore exhibit the same behaviour characteristics as the existing service (see behaviour SDE’s).

6.1SQLAccess (extends Data Access)

This allows access to the underlying data by means of SQL expressions.

6.1.1SDE’s

All those defined for a Factory within Grid Data Services Specification. No additional SDE’s required.

6.1.2Operations

In all operations below the expression input is a single SQL expression, i.e. only one result arises, new PortTypes would need to be specified to allow block execution.

6.1.2.1sqlAccess::sqlQuery

Direct an SQL Query to relational view of the data where the result of the query is returned as conforming to the DataFormat requested.

Input

  • expression: the SQL query string that is to be run on the data resource.

<xsd:element name=”sqlExpression” type=”sqlExpressionType”/>

  • dataFormat: the format selected from SDE dataFormats, which the return type will conform to.[JM9]

<xsd:element name=”dataFormat” type=”dataFormatType”/>

Output

  • dataSet: output the results in the dataFormat.

Fault(s)

  • InvalidQuery: the supplied SQL is syntactically incorrect or fails during evaluation.
  • Fault: any other fault.
6.1.2.2sqlAccess::sqlUpdate

Direct an SQL update to the data

Input

  • expression: the SQL update string that is to be run on the data resource.

<xsd:element name=”sqlExpression” type=”sqlExpressionType”/>

  • dataFormat: the format selected from SDE dataFormats, which the return type will conform to.[JM10]

<xsd:element name=”dataFormat” type=”dataFormatType”/>

Output

  • updateCount: the updateCount returned from the sqlUpdate.[JM11]

Fault(s)

  • InvalidQuery: the SQL supplied to do the update is incorrect or an SQL query statement is being supplied to this operation.
  • Fault: any other fault.
6.1.2.3sqlAccess::sqlExecute

Direct an SQL statement to the data, can be an update or a query or anything!

Input

  • expression: the SQL string that is to be run on the data resource.

<xsd:element name=”sqlExpression” type=”sqlExpressionType”/>

  • dataFormat: the format selected from SDE dataFormats, which the return type will conform to.[JM12]

<xsd:element name=”dataFormat” type=”dataFormatType”/>

Output

  • dataSet:output the results in the dataFormat

Fault(s)

  • InvalidQuery: the SQL supplied to do the update is incorrect or an SQL query statement is being supplied to this operation.
  • Fault: any other fault.

6.2RowsetAccess (extends Data Access)

This allows access to the underlying data by means of rows.

6.2.1SDE’s[JM13]

  • nTupleFormat: The format and data types of the tuples

<sd:serviceData name=”nTupleFormat”

type=”nTupleFormatType”

minOccurs=”1” maxOccurs=”unbounded”

mutability=”mutable”

modifiable=”false”

nillable=”false”/>

6.2.2Operations

6.2.2.1rowsetAccess::getNextNTuples

Return a specified number of tuples from a service that virtualizes a result set

Input

  • startPosition the position of the first tuple to be returned (1st tuple is position 1)

<xsd:element name=”startPosition” type=”xsd:int”/>

  • count: the number of tuples.

<xsd:element name=”count” type=”xsd:int”/>

Output

  • tuples: the result of the update operation.

Fault(s)

  • InvalidCount: Cannot return that number of tuples.
  • Fault: any other fault.

6.3RemoteDBOperationAccess (extends Data Access)

This allows access to the underlying data by means of executing a Remote Database Operation.

6.3.1SDE’s[JM14]

All those defined for a Factory within Grid Data Services Specification. No additional SDE’s required. There are SDE’s within the RelationalDescription PortType that allow the discovery of available Remote DB Operations.

6.3.2Operations

6.3.2.1remoteDBOperationAccess::getReturnValue

Get the return value of the remote DB.

Input

  • remoteDBOperation: the remote Database operation to be performed

<xsd:element name=”remoteDBOperation” type=”remoteDBOperationType”/>

  • operationParameters: the parameters required to complete the Remote Database Operation.

<xsd:element name=”operationParameters” type=”operationParametersType”/>

Output

  • dataSet: output the results.[JM15]

Fault(s)

  • InvalidParameters: the supplied parameters are invalid for this Operation.
  • Fault: any other fault.
6.3.2.2remoteDBOperationAccess::getResultSets

Direct an SQL update to the data

Input

  • remoteDBOperation: the remote Database operation to be performed

<xsd:element name=”remoteDBOperation” type=”remoteDBOperationType”/>

  • operationParameters: the parameters required to complete the Remote Database Operation.

<xsd:element name=”operationParameters” type=”operationParametersType”/>

Output

  • resultSets: the resultSets returned by operation.

Fault(s)

  • InvalidParameters: the supplied parameters are invalid for this Operation.
  • Fault: any other fault.
6.3.2.3remoteDBOperationAccess::getOutputParameter

Direct an SQL statement to the data, can be an update or a query or anything!

Input

  • remoteDBOperation: the remote Database operation to be performed

<xsd:element name=”remoteDBOperation” type=”remoteDBOperationType”/>

  • operationParameters: the parameters required to complete the Remote Database Operation.

<xsd:element name=”operationParameters” type=”operationParametersType”/>

Output

  • output: The output parameter from the operation

Fault(s)

  • InvalidParameters: the supplied parameters are invalid for this Operation.
  • Fault: any other fault.

7.Stored Procedures

RemoteDBOperationFactory and RemoteDBOperationAccess PortTypes have been defined above to allow access to data by the use of Remote Database Operations, these include Stored Procedures.

The RelationalDescription PortType facilitates the discovery of which stored procedures, user defined functions etc exist.

8.Use Cases

The following use case show how the specification should be used to solve particular scenarios.

8.1Synchronous sqlQuery

Service implements SQLAccess portType.

  1. Client sends sqlQuery request (as XML document) to SQLAccess::sqlQuery
  2. Service performs query and returns results in the response document

8.2“Asynchronous” sqlQuery

Service1 implements sqlFactory portType.

  1. Client sends sqlQuery request to the Factory::createService
  2. Service1 creates new service Service2 implementing RowsetAccess portType.
  3. Service1 returns GSH of Service2 to Client.
  4. Service2 performs query and caches results.
  5. Client may query the status SDE of Service2 to find out whether data is ready.
  6. When data is ready Client uses DataAccess::RowsetAccess to retrieve the results.

8.3Creating a new RDBMS (no virtualisation)

Service implements RDBMSAccess portType

  1. Client requests creation of new RDBMS.
  2. Service creates new database in the relational system
  3. Service returns success code.

8.4Creating a virtualisation of a new RDBMS

Service1 implements RDBMSAccess portType

  1. Client sends name of RDBMS Factory::createService and requests an RDBMSAccess interface.
  2. Service1 creates new service Service2 implementing RDBMSAccess portType. Service2 is associated with Collection2.
  3. Service1 returns GSH of Service2 to Client.

8.5Creating an empty resource (no virtualisation)

Service implements RDMSAccess portType.

  1. Client requests creation of new (empty) resource Resource1 from XMLCollectionAccess::createResource.
  2. Service creates new resource Resource1.
  3. Service returns success code.

8.6Removing a resource

Service implements RDBMSAccess portType.

  1. Client requests deletion of resource Resource1 from RDBMS::removeResource.
  2. Service removes resource
  3. All services associated with this resource are terminated.
  4. Service returns success code.

8.7Creating and virtualising a new resource

Service1 implements RDBMSFactory portType.

  1. Client1 requests creation of new resource Resource1 from RDBMSFactory::createService and creation of interface SQLAccess
  2. Service1 creates new resource Resource1.
  3. Service1 creates Service2 implementing SQLAccess portType.
  4. Service returns GSH of Service2.
  5. Client1 loads data into Service2 using SQLAccess::sqlUpdate.
  6. Client2 queries Service2 using SQLAccess::sqlQuery.

8.8Calling a stored Procedure

9.Security

The Relational Realisation of a Grid Data Service will use standard Grid Security mechanisms as specified by OGSa Security working group combined with standard ways of relation Grid credentials and authorities to resource access rights. The assumption is that these standards will also indicate how to make information related to authentication, authorization security etc available.

10.Conclusion

11.REMOVED

Anything related to Management has been removed.

This includes

  • transactionCapabilities
  • securityCapabilities
  • physicalProperties
  • softwareCapabilities
  • Driver
  • Location

Additionally

  • bufferSizes
  • ANSILevel

RDBMSAccess has been removed, we need to resolve overlap with management models work.