EUROPEAN COMMISSION
EUROSTAT
Directorate B: Methodology, corporate statistical and IT services
Unit B-1: Methodology and corporate architecture

ESTAT/B1/WGM(16)1.2
Available in EN only

1st meeting of the
Working Group on Methodology (WGM)

Luxembourg, 7 April 2016

Eurostat building BECH, Room Quetelet

Start: 7 April 2016, 9:30

End: 7 April 2016, 17:00

Item 2.4
ESS Vision 2020 Validation project: Follow-up actions


1. Purpose

The ESS.VIP Validation ended in November 2015. This document presents the main deliverables of the project and the follow-up actions foreseen. The document also advances proposals on how the Working Group Methodology could contribute to the next steps to be taken in the area of validation in the ESS. In particular, the document proposes the creation of a Task Force on Validation reporting to the Working Group Methodology.

2. Expected outcome

The Working Group Methodology is expected to:

  • Take note of the deliverables of the ESS Vision 2020 Validation project and of the proposed follow-up actions.
  • Discuss and endorse the proposal for the creation of a Task Force on Validation reporting to the Working Group Methodology.

3. Problem statement

ESS.VIP Validation: objectives and outcomes

The ESS Vision 2020 Validation project was launched in November 2012. Its objective was to provide the methodological and architectural frameworks necessary to ultimately achieve the following two medium-term goals:

  • Ensure the transparency of the validation procedures applied to the data sent to Eurostat by the ESS Member States through a common validation policy focusing on the attribution of validation responsibilities among the different actors in the production process of European statistics.
  • Improve the interoperability between Eurostat and Member States through the sharing and re-use of validation services across the ESS on a voluntary basis.

The Validation project ended in November 2015. A summary report on the main activities of the project can be found in Annex I.

Among the deliverables of the ESS Vision 2020 Validation project, the ESS methodological handbook for validation is of particular interest to the Working Group Methodology. The ESS methodological handbook for validation is expected to provide a common business language (i.e. definitions, classifications, metrics) to underpin all future discussions about validation in the ESS. A brief summary of the contents of the handbook can be found in Annex II.

Follow-up actions: the role of the Working Group Methodology

The ESS Vision 2020 Validation project provided several deliverables that can improve the validation process for the data sent by Member States to Eurostat. However, in order to reap the benefits of the investment made in the Validation project, the ESS must put these deliverables into production.

To this end, Eurostat and the ESS are in the process of defining a comprehensive set of follow-up actions to the ESS Vision 2020 Validation project. A first high-level proposal for follow-up actions was discussed and endorsed at the ESSC meeting in February 2016. A more detailed proposal for follow-up actions was subsequently discussed and received broad support at the DIME/ITDG plenary meeting at the end of February 2016. The set of follow-up actions discussed at the DIME/ITDG can be found in Annex III. Eurostat is currently working with a subgroup of DIME/ITDG members with the goal of finalising the set of follow-up actions to be proposed at the ESSC in May 2016.

The success of the follow-up actions listed in Annex III relies on the appropriate involvement of Member States in their implementation. To this end, at its plenary meeting in February, the DIME/ITDG endorsed Eurostat's proposal to create a Task Force on Validation under the responsibility of the Working Group Methodology.

4. Proposed action

In accordance with the conclusions of the DIME/ITDG plenary meeting in February, the Working Group Methodology is asked to endorse the creation of a Task Force on Validation under its responsibility. The Task Force on Validation would accompany and support some of the follow-up actions described in Annex III until the end of 2017. In particular, as discussed at the DIME/ITDG, the Task Force would contribute to the following activities:

  • Pilot implementation of the methodological handbook in statistical domains: Over the course of 2016 and 2017, pilot implementations of the ESS methodological handbook will be carried out in several statistical domains. The implementation will consist in applying the common definitions and classifications contained in the handbook to document validation rules according to a common template. The Task Force on Validation will contribute to monitoring these pilot implementations and to identifying potential improvements to the ESS methodological handbook.
  • Pilot implementation of VTL in statistical domains: Over the course of 2016 and 2017, pilot "pencil-and-paper" exercises to encode validation rules using the improved version of VTL will be carried out in several statistical domains. The Task Force on Validation will contribute to monitoring these pilot implementations and to assessing the maturity of VTL as a standard ESS validation syntax. The Task Force will also serve to widen awareness of VTL in the ESS and to ensure that Member State needs are taken into account in its further development.
  • Finalisation of the Business and IT architecture for validation in the ESS: As detailed in Annex I, the ESS Vision 2020 Validation project produced a proposal for a Business and IT architecture for validation in the ESS. The proposal outlines the to-be state for validation in the ESS. At this stage, the document is only a proposal and further consultation with Member States is needed before it is presented for approval to ESS decision-making bodies. The Task Force on Validation will contribute to its finalisation.

The proposed mandate of the Task Force can be found in Annex IV. While the Task Force will be under the responsibility of the Working Group Methodology, some of its activities will require occasional reporting to other Working Groups under the DIME/ITDG governance (e.g. the Working Group Standards as regards the monitoring of the pilot implementations of VTL).

5. Timeline

The table below outlines the steps that have been and will be taken towards the final approval by the ESSC of the follow-up actions to the ESS Vision 2020 Validation project.

Time / Actor / Action
February 2016 / ESSC / Presentation of the end-of-project report for the Validation project and high-level discussion on follow-up actions.
February 2016 / DIME/ITDG / Discussion of detailed proposal for follow-up actions. Decision to create a Task Force under the WG Methodology to support the follow-up actions.
April 2016 / WG Methodology / Approval of the Task Force on Validation.
April 2016 / VIG / Consultation of the VIG on draft ESSC proposal for follow-up actions.
May 2016 / ESSC / Presentation of the final proposal for follow-up actions for approval.

Annex I

Summary of the activities of the ESS Vision 2020 Validation project

Data validation in the ESS: Drivers for change

The validation of data sent by Member States to Eurostat is a joint effort involving both national data providers (i.e. NSIs or other national administrations) and Eurostat. Together, these organisations must ensure that the coherence and consistency of the data they exchange are in line with expected quality standards. The overall quality of the ESS data validation process is therefore heavily dependent on the quality and depth of the collaboration between Eurostat and national data providers.

While they vary considerably between different statistical domains, current validation practices exhibit shortcomings which could be corrected through strengthened collaboration. The main shortcomings are listed below:

  • In several domains, validation responsibilities are not clearly allocated among the partners involved in the production process. This leads, on the one hand, to duplicated work in the ESS and, on the other hand, to the risk of "validation gaps", i.e. cases where essential validation procedures are not carried out by any of the actors.
  • The lack of shared and easily accessible documentation on validation procedures can lead to time-consuming misunderstandings between Eurostat and ESS data providers when data validation problems arise (this phenomenon has been dubbed "validation ping-pong"). It can also lead to difficulties in assessing whether the quality assurance mechanisms applied to data sent to Eurostat are "fit-for-purpose".
  • The lack of common standards for validation solutions leads to a duplication of IT development and integration costs in the ESS. In particular, the ESS is currently incurring high opportunity costs in the area of validation by not exploiting the general trend in the IT world towards Service-Oriented Architecture (SOA) and its potential benefits in terms of reuse and sharing of software components. The work done by the ESDEN and SERV projects would constitute a key enabler in this respect.
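
As an illustration of the first shortcoming listed above, the sketch below shows how a documented attribution of validation checks to actors could be reviewed for "validation gaps" and for duplicated work. It is a minimal example only: the check descriptions, the actor labels and the function name are hypothetical and are not taken from any ESS deliverable.

```python
from collections import defaultdict

# Hypothetical assignment of validation checks to actors (illustrative names only).
ASSIGNMENTS = [
    ("codes belong to the agreed code lists", "national data provider"),
    ("codes belong to the agreed code lists", "Eurostat"),
    ("totals equal the sum of their components", "Eurostat"),
]

# Hypothetical list of checks that the partners consider essential.
ESSENTIAL_CHECKS = [
    "codes belong to the agreed code lists",
    "totals equal the sum of their components",
    "values are plausible compared with the previous period",
]


def review_responsibilities(assignments, essential_checks):
    """Return (gaps, duplicates) for a set of check-to-actor assignments."""
    actors_by_check = defaultdict(set)
    for check, actor in assignments:
        actors_by_check[check].add(actor)

    # A gap is an essential check that no actor performs.
    gaps = [c for c in essential_checks if not actors_by_check.get(c)]
    # Duplicated work is a check performed by more than one actor.
    duplicates = {c: sorted(a) for c, a in actors_by_check.items() if len(a) > 1}
    return gaps, duplicates


gaps, duplicates = review_responsibilities(ASSIGNMENTS, ESSENTIAL_CHECKS)
print("Validation gaps:", gaps)
print("Duplicated work:", duplicates)
```

In practice, a check performed by two actors may be intentional; the sketch only makes such overlaps visible so that they can be discussed.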

The response of the ESS: the ESS Vision 2020 Validation project

In order to respond to these issues, in May 2012 Eurostat presented the ESSC with a strategic paper on "the general principles and orientation for a review of validation policy in the ESS". The paper, which was endorsed by the ESSC, proposed to modernise the way the data sent by Member States to Eurostat is validated by setting two ambitious medium-term goals:

  • Ensuring the transparency of the validation procedures applied to the data sent to Eurostat by the ESS Member States through a common validation policy focusing on the attribution of validation responsibilities among the different actors in the production process of European statistics.
  • Improving the interoperability between Eurostat and Member States through the sharing and re-use of validation services across the ESS on a voluntary basis.

On the basis of this paper, Eurostat presented the ESS Vision 2020 Validation project to the ESSC in November 2012. The objective of the ESS Vision 2020 Validation project was to provide the methodological and architectural frameworks necessary to ultimately achieve the medium-term goals mentioned above.

The ESSC approved the Validation project in November 2012 and gave the green light for the beginning of the execution phase in November 2014. The project ended in November 2015. Collaboration with the Member States was ensured through a Task Force during the course of 2014 and through an ESSnet (the ValiDat Foundation ESSnet) during the course of 2015. The main elements of the project's life span are summarised in the picture below.

[Figure: timeline of the project's life span on a 2013 to 2020 axis, showing the Initiation, Planning, Execution and Closing phases]

Main outcomes of the ESS.VIP Validation

The main deliverables provided by the ESS Vision 2020 Validation project are listed below. All deliverables are available on the project's CIRCABC space[1].

  • An ESS methodological handbook for validation. The handbook is expected to provide a common business language (i.e. definitions, classifications, metrics) to underpin all future discussions about validation in the ESS.
  • The contribution to the creation and piloting of version 1.0 of the VTL (Validation and Transformation Language) validation syntax. VTL 1.0 was developed by the SDMX community and represents the first step towards an unambiguous standard to express and share validation rules in the ESS. The ESS Vision 2020 Validation project piloted the standard in two domains (Animal Production Statistics and SIMSTAT). Moreover, the ESS Vision 2020 Validation project, through the ValiDat Foundation ESSnet, conducted a thorough assessment of VTL 1.0 and identified avenues for improvement.
  • A proposal for a Business and IT architecture for validation in the ESS. This document outlines a comprehensive roadmap to move towards the medium-term goals for validation in the ESS.
  • The creation of two early prototypes for possible ESS validation services, whose purpose was mainly to start collecting know-how on the construction of ESS validation services. The project delivered prototypes for a validation rule Graphical User Interface (GUI) and an SDMX-based structural validation service (i.e. a service which validates whether data conform to a predefined data structure).
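
To make the notion of structural validation mentioned in the last item more concrete, the sketch below checks whether a record conforms to a predefined data structure. It is an illustrative example under simplifying assumptions: the structure definition, the field names and the function `validate_structure` are hypothetical, and the code does not use SDMX artefacts or reproduce the project's prototype.

```python
# Hypothetical data structure definition: for each field, the expected type
# and, optionally, the set of allowed codes. This is not an SDMX artefact.
STRUCTURE = {
    "country": {"type": str, "codes": {"LU", "BE", "FR"}},
    "year": {"type": int},
    "value": {"type": float},
}


def validate_structure(record, structure):
    """Return a list of human-readable structural errors for one record."""
    errors = []
    # Every field required by the structure must be present.
    for name in structure:
        if name not in record:
            errors.append(f"missing field: {name}")
    # Every field present must be known, correctly typed and correctly coded.
    for name, value in record.items():
        spec = structure.get(name)
        if spec is None:
            errors.append(f"unexpected field: {name}")
            continue
        if not isinstance(value, spec["type"]):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
        elif "codes" in spec and value not in spec["codes"]:
            errors.append(f"unknown code for {name}: {value}")
    return errors


good = {"country": "LU", "year": 2015, "value": 12.5}
bad = {"country": "XX", "year": "2015", "flag": "p"}

print(validate_structure(good, STRUCTURE))
print(validate_structure(bad, STRUCTURE))
```

A structural check of this kind says nothing about the plausibility of the values themselves; it only establishes that the data can be read against the agreed structure.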

The table below describes how each of these main deliverables will be used and deployed by the European Statistical System to improve the validation process for the data sent by Member States to Eurostat. It should be noted that all the deliverables of the project can also be used on a voluntary basis by Member States to modernise their internal validation processes.

ESS Vision 2020 Validation project deliverable / Deployment scenarios
Methodological handbook / The classifications in the handbook will be used by domain-specific ESS Working Groups as the basis for the systematic documentation of the validation rules applied to the data sent to Eurostat.
VTL version 1.0 / The development of VTL 1.0 represents a first step towards a standard syntax for validation in the ESS. Prior to deployment, it will need to be improved in the future by taking into account the assessment of the ValiDat Foundation ESSnet, as well as the results from the pilots conducted in concrete statistical domains by the ESS Vision 2020 Validation project. Increased involvement of Member States will be a cornerstone of the further development of VTL. In the medium-term, it is expected that domain-specific ESS Working Groups will use the improved validation syntax to unambiguously document the validation rules applied to the data sent to Eurostat.
Proposal for a Business and IT architecture for validation in the ESS / The deliverable proposes a to-be state and a roadmap for how validation of the data sent to Eurostat should be performed in the future. The deliverable will form the basis for further discussions on the topic with Member States.
Prototypes for ESS Validation services / Since they are only prototypes, these two deliverables are not meant to be immediately deployed. The prototypes will be tested and reviewed in order to gather input for further developments. It should be noted that, as specified in the proposal for a Business and IT architecture discussed above, the use of shared validation services by Member States will be optional.

Annex II

Presentation of the ESS Methodological handbook for validation

Data validation is a task that is performed in all National Statistical Institutes and in all statistical domains. It is not a new practice, yet although it has been performed for many years, the procedures and approaches used have rarely been systematised. This is a cause of inefficiency both in terms of methodologies and of organisation of the production system. There is a need for a generic framework for data validation in order to have a reference context and to provide tools for setting up an efficient and effective data validation procedure.

The ESS methodological handbook for validation, developed by the ValiDat Foundation ESSnet between January and December 2015, provides such a framework.

The first part of the handbook is devoted to establishing a generic reference framework for data validation. Firstly, the main elements needed to understand clearly what data validation is, why it is performed and how to carry it out are discussed. To this end, a definition of data validation is provided, the main purpose of data validation is discussed taking into account the European quality framework, and finally, for the ‘how’ perspective, the key elements necessary for performing data validation, e.g. validation rules, are illustrated. Afterwards, data validation is analysed within a statistical production system by using the main current references in this context, i.e. GSBPM and GSIM. Finally, the data validation process life cycle is described to allow a clear management of such an important task.

The second part of the handbook is concerned with the measurement of important characteristics of a data validation procedure (metrics for data validation). The introduction of data validation metrics is meant to provide guidance in the design, maintenance and monitoring of data validation procedures. In this second part, the reader can find a discussion of concepts concerning the properties of validation rules such as complexity, redundancy, completeness, and suggestions about how to analyse them.
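
As an illustration of what analysing a property such as redundancy might involve, the sketch below flags a validation rule as redundant on a given sample when every record it rejects is also rejected by at least one other rule. This is an assumption made for the purpose of the example, not the definition used in the handbook; the rules, the sample data and the function names are hypothetical.

```python
# Hypothetical validation rules: each returns True when a record passes.
RULES = {
    "age_non_negative": lambda r: r["age"] >= 0,
    "age_at_most_120": lambda r: r["age"] <= 120,
    "age_at_most_150": lambda r: r["age"] <= 150,
}

# Hypothetical sample of records used to compare the rules.
SAMPLE = [{"age": 34}, {"age": -1}, {"age": 130}, {"age": 200}]


def failures(rule, records):
    """Indices of the records that fail the rule."""
    return {i for i, r in enumerate(records) if not rule(r)}


def redundant_rules(rules, records):
    """Rules whose failures are all caught by the other rules on this sample."""
    failed = {name: failures(rule, records) for name, rule in rules.items()}
    result = []
    for name, own in failed.items():
        # Union of the failures detected by all the other rules.
        others = set().union(*(f for n, f in failed.items() if n != name))
        if own <= others:
            result.append(name)
    return result


print(redundant_rules(RULES, SAMPLE))
```

The result depends on the sample: a rule that rejects no record in the sample is also reported, and two rules that reject exactly the same records are both reported, so the output is a starting point for review rather than a verdict.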

The document is intended for a broad category of readers: survey managers, methodologists, statistical production designers and, more generally, everyone involved in the data validation process. Indeed, the first important objective of the document is to provide a common language about data validation that can be used both in the design and production phases of official statistics.

Annex III

Proposed follow-up actions to the ESS Vision 2020 Validation project

The ESS Vision 2020 Validation project was launched in November 2012. Its objective was to provide the methodological and architectural frameworks necessary to ultimately achieve the following two medium-term goals:

  • Goal 1: Ensuring the transparency of the validation procedures applied to the data sent to Eurostat by the ESS Member States through a common validation policy focusing on the attribution of validation responsibilities among the different actors in the production process of European statistics.
  • Goal 2: Improving the interoperability between Eurostat and Member States through the sharing and re-use of validation services across the ESS on a voluntary basis.

The follow-up actions proposed attempt to capitalise on the results of the Validation project by focusing on the following aspects:

  • Implementing the already available actionable deliverables in ESS statistical production processes.
  • Developing the missing actionable deliverables needed to support the realisation of the medium-term goals for validation.

Follow-up actions for medium-term goal 1

Timeline / Follow-up action / Action type (Implementation/Development)
Ongoing actions (Until Mid-2016) / Improvements to VTL 1.0: Eurostat has been working with the VTL community and with ESS members to take into account the criticisms of VTL raised by the ValiDat Foundation ESSnet. A new version of VTL will come out in Q2 2016. / Development
High priority (Mid-2016 to Mid-2017) / Implementation of the methodological handbook in statistical domains: The classifications in the handbook will be used by domain-specific ESS Working Groups as the basis for the systematic description of the validation rules applied to the data sent to Eurostat. / Implementation
High priority (Mid-2016 to Mid-2017) / Piloting of VTL in statistical domains: Piloting of the use of VTL in domain-specific ESS Working Groups to unambiguously document the validation rules applied to the data sent to Eurostat. / Implementation
High priority (Mid-2016 to Mid-2017) / Finalisation of the Business and IT architecture: The ESS Vision 2020 Validation project produced a proposal for a Business and IT architecture for validation in the ESS. The Business and IT architecture will need to be finalised taking into account input from Member States. / Development
Lower priority (Mid-2017 onwards) / Implementation of the Business and IT architecture: The common Business and IT architecture will be implemented in domain-specific ESS Working Groups. / Implementation

Follow-up actions for medium-term goal 2

Timeline / Follow-up action / Action type (Implementation/Development)
High priority (Mid-2016 to Mid-2017) / Testing and piloting of available prototypes: The prototypes produced by the Validation project will be tested and reviewed in order to gather input for further developments. / Implementation
High priority (Mid-2016 to Mid-2017) / Definition of shareable ESS validation services: Based on the results of the testing of the prototypes and on an analysis of Eurostat and Member State needs, a portfolio of shareable ESS validation services will be defined. / Development
Lower priority (Mid-2017 onwards) / Development of shareable ESS validation services: Development of the shareable ESS validation services previously defined. / Development

Annex IV