European Commission – DG Eurostat
ESS.VIP Programme
Cross-cutting project on sharing statistical SERVices
Statistical Service Implementation
STRUVAL Structural Validation Service
Phase 1 – based on SDMX Converter
Version 0.95
19/10/2015
Table of Contents
1General information
1.1Service name
1.2Service version
1.3Relation to Service Definition and Specification
2Invocation Protocols
3Data-by-reference Protocols
4Canonical data models
5Non-canonical data models
6Distribution
7Service Contract
7.1Operation <validate>
7.1.1Function
7.1.2Statistical methods
7.1.3Invocation protocols
7.1.4Inputs
7.1.5Outputs
7.1.6Pre-conditions
7.1.7Post-conditions
7.1.8Metrics
7.1.9Business Exceptions
7.1.10Compensation
7.1.11Specific requirements
8Parameterization
9Requirements for security
9.1Security mechanisms
9.1.1Non-repudiation
9.1.2Integrity
9.1.3Authentication and trust domains
9.1.4Self-registration
9.1.5Authorization
9.1.6Encryption
9.1.7Data at rest
9.1.8Data in transfer (end-to-end)
9.2Data protection
10Policies
10.1Security assertions
10.2Quality of service assertions
10.3Message format assertions (compliance)
10.4Other Policies
10.5Terms of use
11Non-functional characteristics (QoS)
11.1Reliability
11.2Availability
11.3Performance
11.4Multilingual support
11.5Error handling
11.6Process metrics
12Technical Dependencies
13SOA Layering
1General information
1.1Service name
STRUVAL Structural Validation
1.2Service version
1.0
1.3Relation to Service Definition and Specification
In this document we describe the implementation of the Phase 1 SDMX Converter-based STRUVAL Service, along the lines set out in the service definition and specification.
The STRUVAL Service is based on an extended version 5.1 of the SDMX Converter Web Service, which is part of the SDMX Converter toolchain (alongside the API, GUI, and command-line versions).
The structural validations covered by this initial release of the STRUVAL Service include:
- Verifying that the SDMX-ML message (the dataset) is a well-formed XML document.
- Verifying that the structural elements in the SDMX-ML message (header, dataset, groups, series, observations, etc.) are correctly ordered and nested.
- Detecting misplaced, undefined, and missing dimensions and attributes at the dataset, group, series, and observation levels.
- Detecting invalid data format and invalid values for time-period concepts.
- Detecting invalid codes, based on the code lists and the dataflow constraints.
- Detecting duplicated observations.
The result of the STRUVAL Service is a machine-readable validation report containing the overall success indication and, in case of validation errors, a list of detected errors (up to a user-configurable limit). Each detected error is characterized by a standard error code, a descriptive text, and either the line/column in the input file of the incorrect XML syntactic element, or the dimension values identifying the data unit (series, observation) where the error has been detected.
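As an illustration of how duplicate detection and the error limit could interact, here is a minimal Python sketch. It is not the SDMX Converter code: the `Report` class, the `find_duplicates` function, and the dictionary-based observation format are all hypothetical, and the sketch assumes that errors beyond the limit are still counted but not listed.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Hypothetical in-memory shape of a validation report."""
    errors: list = field(default_factory=list)
    errors_found: int = 0
    more_errors: bool = False


def find_duplicates(observations, max_errors):
    """Report observations whose dimension values repeat an earlier one.

    observations: iterable of dicts mapping dimension id -> value (assumed format).
    max_errors:   cap on the number of errors listed in the report.
    """
    seen = set()
    report = Report()
    for obs in observations:
        key = tuple(sorted(obs.items()))  # order-independent observation key
        if key not in seen:
            seen.add(key)
            continue
        report.errors_found += 1  # assumption: count every duplicate
        if len(report.errors) < max_errors:
            report.errors.append({"text": "Duplicated observation",
                                  "dimensions": dict(obs)})
        else:
            report.more_errors = True  # limit reached; error not listed
    return report


observations = [
    {"FREQ": "A", "GEO": "BE", "TIME_PERIOD": "2014"},
    {"FREQ": "A", "GEO": "FR", "TIME_PERIOD": "2014"},
    {"FREQ": "A", "GEO": "BE", "TIME_PERIOD": "2014"},  # repeats the first
]
report = find_duplicates(observations, max_errors=10)
```

Each reported error carries the dimension values of the offending observation, which mirrors the "coordinates" style of error location described above.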
The initial STRUVAL Service release has the following limitations:
- The DSD has to be sent (i.e., embedded) with the service call, and needs to be self-contained, i.e., needs to include all code lists and other artifacts referenced from the DSD. In future releases, it will be possible to refer to the DSD, code lists, and other artifacts, via SDMX registry.
- Currently only a subset of SDMX-ML formats is supported. The supported formats are SDMX v2.0 Compact and SDMX v2.1 Structure-specific messages. Support for other SDMX-ML formats will be added in future releases.
- XML syntax errors in the SDMX-ML input messages (datasets), which cause a message not to be a well-formed XML document, are currently non-recoverable, in the sense that the validation process stops upon the first encounter of such an error.
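The non-recoverable nature of XML syntax errors can be sketched with a standard XML parser, which stops at the first point where the document is not well-formed. The Python example below only illustrates that behaviour and is not the service's implementation; the function name `check_well_formed` and the keys of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_bytes):
    """Return None if xml_bytes parses, else a dict describing the first error."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        line, column = exc.position  # location of the first syntax error
        # Parsing cannot continue past this point, so the error is fatal.
        return {"fatal": True, "line": line, "column": column, "text": str(exc)}
    return None


ok = check_well_formed(b"<DataSet><Series/></DataSet>")
broken = check_well_formed(b"<DataSet>\n<Series></DataSet>")  # unclosed Series
```

In this sketch `ok` is `None`, while `broken` describes only the first syntax error; any later problems in the same file stay undetected until that error is fixed.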
2Invocation Protocols
STRUVAL is being developed by Eurostat to assist the Member States and Eurostat in the structural validation of statistical data files. The structure and dictionaries are defined in a DSD (Data Structure Definition) stored in the Euro SDMX Registry.
The STRUVAL Service is implemented as a SOAP/HTTP Web service that extends the existing SDMX Converter Web service interface by introducing a new service operation named "validate" which accepts:
- The input SDMX-ML data set, embedded in the service request.
- The data structure file, embedded in the service request; this is normally a dataflow with the embedded DSD and, optionally, dataflow constraints.
- The user-defined maximal number of validation errors to be detected and reported by STRUVAL.
The STRUVAL Service returns the validation report.
3Data-by-reference Protocols
In the initial release of the STRUVAL Service, all data are passed by value. In future releases, it will be possible to refer to DSDs, dataflows, code lists, etc. stored in SDMX Registries.
4Canonical data models
To pass structural validation, files must be in SDMX-ML format and comply with the SDMX-ML information model.
5Non-canonical data models
N/A
6Distribution
The Phase 1 release will be distributed within Eurostat. Later versions of STRUVAL are planned to be made available for Member States.
7Service Contract
7.1Operation <validate>
7.1.1Function
To validate that the given input is a valid SDMX-ML message (dataset) that conforms to the structural and coding rules defined by the SDMX standard and the given DSD/dataflow.
7.1.2Statistical methods
No statistical method.
7.1.3Invocation protocols
SOAP/HTTP
7.1.4Inputs
Parameter name / Type / Description
inputData / base64Binary / The embedded input SDMX-ML document (the dataset).
dsdStructure / base64Binary / The embedded data structure file (DSD/dataflow with constraints).
maxErrorNumber / int / The maximal number of validation errors to report.
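A client has to embed both files as base64 text in the request. The following Python sketch shows one way such a request body could be assembled. The operation and parameter names come from the table above, but the service namespace `urn:example:struval`, the use of the SOAP 1.1 envelope namespace, and the unqualified parameter elements are assumptions; the actual values must be taken from the service WSDL.

```python
import base64
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 (assumed version)
SERVICE_NS = "urn:example:struval"  # hypothetical; use the namespace from the WSDL


def build_validate_request(input_data, dsd_structure, max_error_number):
    """Build a SOAP envelope for the validate operation, returned as UTF-8 bytes."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}validate")
    # base64Binary parameters carry the raw file bytes as base64 text.
    ET.SubElement(operation, "inputData").text = (
        base64.b64encode(input_data).decode("ascii"))
    ET.SubElement(operation, "dsdStructure").text = (
        base64.b64encode(dsd_structure).decode("ascii"))
    ET.SubElement(operation, "maxErrorNumber").text = str(max_error_number)
    return ET.tostring(envelope, encoding="utf-8")


request = build_validate_request(b"<data/>", b"<structure/>", 50)
```

The resulting bytes would be sent as the body of an HTTP POST to the service endpoint; in practice a SOAP client library generated from the WSDL would normally take care of this step.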
7.1.5Outputs
Parameter name / Type / Description
returnCode / int / The overall return code:
- <0 for structural errors in the DSD and/or a malformed input data XML document
- 0 if no structural validation errors have been found
- >0 if one or more structural validation errors have been found
errorsFound / int / The number of structural validation errors that were found, if returnCode>0.
moreErrors / Boolean / Set to true if there were more errors than reported (i.e., over the user-defined error limit).
Errors / XML [0..*] / Description of each encountered error, if returnCode>0, up to the user-configured error limit.
Each error has:
- A descriptive errorClass.
- A numeric code and a textual description.
- A Boolean fatal flag indicating that this error has forced further validation to stop.
- An attachment level (dataset, series, etc.) described by the attachedTo string field.
- Either:
  - a numeric line and column indicating the location in inputData, or
  - a set of dimensions giving the coordinates of the offending data.
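The three ranges of returnCode can be handled on the client side as sketched below. This is a minimal Python illustration; the `Outcome` enumeration and the `classify` and `summarize` functions are hypothetical names, and only the meaning of the return code ranges and of the errorsFound and moreErrors fields is taken from the table above.

```python
from enum import Enum


class Outcome(Enum):
    UNUSABLE_INPUT = "structural errors in the DSD and/or malformed input XML"
    VALID = "no structural validation errors"
    INVALID = "one or more structural validation errors"


def classify(return_code):
    """Map the numeric returnCode to one of the three documented outcomes."""
    if return_code < 0:
        return Outcome.UNUSABLE_INPUT
    if return_code == 0:
        return Outcome.VALID
    return Outcome.INVALID


def summarize(return_code, errors_found=0, more_errors=False):
    """Produce a one-line, human-readable summary of a validation response."""
    outcome = classify(return_code)
    if outcome is not Outcome.INVALID:
        return outcome.value
    suffix = ", more errors exist beyond the limit" if more_errors else ""
    return f"{errors_found} error(s) found{suffix}"


summary = summarize(1, errors_found=50, more_errors=True)
```

Checking moreErrors matters when the limit is low: a report with a full error list is not necessarily a complete one.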
7.1.6Pre-conditions
Pre-condition / Description
Self-contained data structure / The dsdStructure data structure file has to be self-contained, i.e. needs to contain all necessary structural elements: DSD, code lists, constraints.
7.1.7Post-conditions
Post-condition / Description
Validation performed / returnCode >= -1
7.1.8Metrics
Metric / KPI / Description
Duration of processing / KPI-1 / Duration of the structural validation process.
Time before processing / KPI-2 / Maximum delay between service launch and the start of the data file validation.
Concurrent access / KPI-3 / Maximum number of concurrent accesses.
Maximum processing capacity / KPI-4 / Maximum processing capacity of the service (number of files × size of files processed at the same time or during a defined period).
7.1.9Business Exceptions
None. The validation result is always returned.
7.1.10Compensation
None.
7.1.11Specific requirements
The input encoding for the embedded files in inputData and dsdStructure must be UTF-8.
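A client can check this requirement before submitting a file. The Python sketch below is one possible pre-flight check, not part of the service; the function name `ensure_utf8` is hypothetical.

```python
def ensure_utf8(payload):
    """Return payload unchanged if it is valid UTF-8, else raise ValueError."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"payload is not valid UTF-8 at byte offset {exc.start}") from exc
    return payload


checked = ensure_utf8("Données".encode("utf-8"))  # passes through unchanged
```

A file saved in a legacy encoding such as ISO-8859-1 would typically fail this check as soon as it contains an accented character.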
8Parameterization
Currently the only parameter is the maximum number of validation errors to report.
9Requirements for security
9.1Security mechanisms
9.1.1Non-repudiation
N/A
9.1.2Integrity
N/A
9.1.3Authentication and trust domains
Authentication is done using ECAS.
9.1.4Self-registration
Done using ECAS.
9.1.5Authorization
The initial release of the STRUVAL service does not access any external resources, and therefore does not need user authorization of that kind. However, in future releases, the service may require additional authorizations to access data stored in the registry etc.
9.1.6Encryption
The STRUVAL service receives and returns plain-text data.
9.1.7Data at rest
N/A
9.1.8Data in transfer (end-to-end)
For confidential data files:
- Transmission has to be encrypted (TLS).
9.2Data protection
The STRUVAL Service does not store any data in the file system or
10Policies
10.1Security assertions
HTTPS is to be used, as confidential data files may be sent for structural validation.
10.2Quality of service assertions
To be elaborated based on the exploitation data:
- Measurement of the request processing duration.
- Defining the delay after which the service is stopped.
10.3Message format assertions (compliance)
The message format is SDMX-ML, compliant with SDMX 2.0 / 2.1.
10.4Other Policies
None.
10.5Terms of use
- General terms of use defined for services at Eurostat.
- Service security policy of Eurostat.
- The first version will be a Proof of Concept and not yet ready for production use.
11Non-functional characteristics (QoS)
11.1Reliability
The return of messages to the user is reliable in the sense of guaranteed delivery.
11.2Availability
The service should be available at least 95% of the time during working hours, 8:00 – 18:00 (working days only). This especially applies to the peak times between 10:30 and 16:00.
11.3Performance
(To be defined based on the exploitation data.)
11.4Multilingual support
No multilingual requirement here.
11.5Error handling
The STRUVAL Service should never fail by returning a SOAP Fault message, except under abnormal conditions of the execution environment (a network, servlet container, Web application server, Java Virtual Machine, or operating system failure).
The STRUVAL Service should always return a response described in Section 7.1.5 within a finite amount of time.
11.6Process metrics
- Number of concurrent validation processes.
- Maximum size of the file to validate.
- Maximum duration of a validation process.
- Maximum delay between two processes.
12Technical Dependencies
The first version of the STRUVAL Service is self-contained, and has no external technical dependencies.
13SOA Layering
STRUVAL - SERV_Statistical_Service_Implementation v0.95 (05/09/2014)