Title: Use of information reported for Article 8 – concept paper on compliance checking
Version no.: 1 Date: 13 March 2008
Author(s): Steve Nixon, WRc
Contacts:
Violeta Vinceviciene (DG ENV), Jorge Rodriguez Romero (DG ENV), Stefan Jensen (EEA), Steve Nixon (WRc)
1. Introduction
Member States (MS) have reported electronically on Article 8 through WISE. The basis of the reporting was the reporting sheets endorsed by the Water Directors in November 2005. The reported information will need to be checked and validated before it can be used by those requiring it and before it is disseminated more widely to the public via the WISE viewer.
The aim of this paper is primarily to consider how the reported information can be used to check and assess MS compliance with the requirements of Article 8.
A number of steps are foreseen in the process:
· Preliminary QA/QC of the submitted data (XML files);
· Initial compliance screening and quality checking;
· Creating an inventory of everything that has been reported;
· Some interim analysis of the reporting of some key elements;
· Full in-depth analysis and assessment demonstrating the use of defined compliance indicators for a few (e.g. 4 to 6) MS that have supplied high-quality reports/data;
· Assessment of all MS reports using agreed indicators and a process that can automatically take account of new and updated reports from MS.
At some stage in the process, as necessary, an EU-wide database of quality assured and validated data and information will be created. The information will also be used for other purposes such as in the WISE viewer and for State of the Environment (SOE) reporting and assessment: the former use is the subject of another concept paper on IT and visualisation aspects of the Article 8 information; and the SOE aspects are being led by the EEA.
2. WISE data flows
The WISE dataflow is illustrated in Figure 1.
Work so far on Article 8 has dealt with steps 1, 2, 3 and 4 of the WISE dataflow. The XML schema and Access tool have been successfully used by most MS, resulting in the delivery of Article 8 data into the CDR (step 3).
Acceptance/validation checking (step 4) was included in the schema design and delivery process. Although MS had the opportunity to check their XML files automatically before submission to the CDR, acceptance of the files into the CDR was not dependent on their passing all validation rules and checks. The optional checks included, for example, verifying the presence of four mandatory fields in the monitoring programmes and surface water stations files. If the XML files were checked during the uploading process, a validation report was produced automatically. No further cross-checking, for example with linked elements in Articles 3 or 5, was undertaken at that stage.
Figure 1 WISE data flow (all steps involve QA/QC)
A feedback mechanism for the correction of errors and improving the completeness of submissions has been led by the EEA in direct contact with the data providers. This QA/QC work has proved very successful and has considerably improved the quality of the data. An extract of the log of the QA/QC work done can be found in Annex 1.
In the future, some of the QA/QC checks that are currently done manually could be done automatically by the Reportnet system. In any case, as much QA/QC as possible should be done at XML level before this information is loaded into any database.
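As an illustration only (a sketch, not the Reportnet implementation; the file and schema names are hypothetical), the kind of XML-level check involved could be reproduced locally with a short script:

    # Sketch: validate a submission file against the reporting XML schema
    # before upload. File and schema names are hypothetical examples.
    from lxml import etree

    schema = etree.XMLSchema(etree.parse("Monitoring_schema.xsd"))
    doc = etree.parse("MS_monitoring_report.xml")

    if schema.validate(doc):
        print("file passes schema validation")
    else:
        # Emit a simple validation report, as the CDR upload step does
        for error in schema.error_log:
            print(f"line {error.line}: {error.message}")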
The production database will contain EU-wide high-quality validated data for use in the WISE viewer etc. and will be built by merging national datasets. Its construction will involve a more thorough QA/QC check, both from an IT perspective and from a substance/content perspective.
After all possible QA/QC checking has been completed at XML level, an EU-wide intermediate database will be created. The intermediate database may be updated with new data as more Member States report or update their information, and this will in turn be used to update the production, working and analytical databases, and the WISE viewer at regular intervals.
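A minimal sketch of the kind of repeatable update the intermediate database would need is given below; the table, column and file names are hypothetical assumptions, not the agreed design:

    # Sketch: refresh the intermediate database from a new or updated
    # national submission. Table, column and file names are hypothetical.
    import sqlite3

    con = sqlite3.connect("intermediate.db")
    con.execute("""CREATE TABLE IF NOT EXISTS stations (
                       ms TEXT, station_code TEXT, programme_code TEXT,
                       PRIMARY KEY (ms, station_code))""")

    def load_submission(ms, rows):
        # A resubmission simply replaces the MS's earlier rows, so the
        # load can be re-run each time a Member State reports or updates.
        with con:
            con.execute("DELETE FROM stations WHERE ms = ?", (ms,))
            con.executemany("INSERT INTO stations VALUES (?, ?, ?)",
                            [(ms, code, prog) for code, prog in rows])

    load_submission("AT", [("AT001", "AT_SURV_1"), ("AT002", "AT_OPER_1")])

The production, working and analytical databases and the WISE viewer could then be rebuilt from this database at the regular intervals mentioned above.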
The WISE viewer will provide access to a subset of the whole information, and the information may need to be pre-processed so that it can be visualised at different scales, etc. (see section 4 of this paper for more details).
A working database may be produced from the production database by the Commission (or its contractor) in the compliance checking process. It may generate new information (usually new tables with summaries, statistics, cross-checks with other information, etc.) from a snapshot of the EU-wide production database. The working database would contain the queries etc. required to complete the compliance assessment. Not all of this information and analysis would be suitable for widespread and general use by others, though some information may later be displayed in the WISE viewer. The working database would also annotate the data so that some of it can be kept confidential.
3. Compliance checking
The Commission's current thinking with regard to the steps in compliance checking is shown in the figure below.
Figure 2
Three main questions relate to the reported Article 8 data and information:
· Are the reports complete (provision of mandatory fields) and clear (values match the given code lists, and numeric/character values fall within the specified minimum/maximum ranges)?
· Are the reports understandable (sense check)?
· Are the reports compliant
o with regard to key issues (conformity checking)?
o after in-depth assessment?
There are two parts to conformity checking: checking methodologies and checking data or results.
3.1 Completeness and clarity
As described in section 2, a full checking/validation of the Article 8 information will be done by direct analysis of the XML files and, if it proves necessary, by creating a suitable intermediate database. Queries would then be created to determine what data/information has or hasn't been provided, the values of codes, the ranges of numeric values, etc. The queries would be run automatically using an appropriate electronic tool. The tool could ultimately be applied by MS to their own files/reports in cases where new reports or resubmissions are required.
Checking the clarity and validity will be in terms of whether or not (see the sketch after this list):
· mandatory elements have been reported;
· relevant values match entries in given code lists;
· character strings are within specified sizes;
· number values are within specified ranges, including exception values;
· cross-linkages between files are correctly established, mainly from surface water and groundwater monitoring stations to monitoring programmes files.
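A minimal sketch of these record-level checks is given below, assuming the records have already been parsed from the XML into dictionaries; the field names, code list and size/range limits are hypothetical examples rather than the agreed schema rules:

    # Sketch of the listed checks on one parsed record. Field names,
    # code list and limits are hypothetical examples.
    MANDATORY = ["stationCode", "waterCategory", "longitude", "latitude"]
    WATER_CATEGORIES = {"RW", "LW", "TW", "CW"}    # hypothetical code list

    def check_record(rec):
        errors = []
        for field in MANDATORY:                    # mandatory elements present
            if not rec.get(field):
                errors.append(f"missing mandatory field {field}")
        if rec.get("waterCategory") not in WATER_CATEGORIES:
            errors.append("waterCategory not in code list")
        if len(rec.get("stationCode", "")) > 42:   # specified string size
            errors.append("stationCode exceeds maximum length")
        lon = rec.get("longitude")
        if lon is not None and not -180.0 <= float(lon) <= 180.0:
            errors.append("longitude out of range")
        return errors

    print(check_record({"stationCode": "AT001", "waterCategory": "RW",
                        "longitude": "16.37", "latitude": "48.21"}))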
The identification of incorrect entries would lead to further specific checking of entries and elements, and possibly to follow-up validation questions to MS and data resubmissions.
The basis of the queries and tool could be a ‘screening’ check list containing the fields/elements to be included. Details of the screening check list are given in Annex 1.
The information arising from the checking for completeness and clarity could be presented in a number of ways including:
· Assessment of whether there are major gaps in the reporting: whether all water categories are reported, all river basin districts are covered, etc.;
· Inventory of all the fields that have been reported under Article 8 by each MS;
· Reported fields as percentage of mandatory and total fields that could have been reported;
· Listing of gaps in fields/information reported, e.g.:
o Methods have not been reported by (…..)
o Frequency of monitoring elements is missing in (….)
o QE’s that have not been/will not be monitored in (….)
o ……(‘n’ out of ‘x’ member states, and for ‘m’ out of ‘y’ River Basin Districts).
As an example of this type of analysis, a simple tick list has been produced of the biological quality elements included in monitoring programmes in rivers, lakes, transitional and coastal waters (see Task sheet COMP 07/6); a sketch of how such a tabulation could be produced is given below. Other elements in an initial check may include the reporting of information on methods, the use of standards, and levels of confidence and precision.
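As an illustration only, such a tick list and the associated percentages could be tabulated as follows; the quality element names and 'reported' sets are hypothetical, not actual MS returns:

    # Sketch: tick list of biological quality elements reported per MS,
    # with reported elements as a percentage of those that could have
    # been reported. All data below are hypothetical.
    BQES = ["phytoplankton", "macrophytes", "benthic invertebrates", "fish"]
    reported = {
        "AT": {"phytoplankton", "benthic invertebrates", "fish"},
        "BE": {"benthic invertebrates"},
    }

    for ms, elements in sorted(reported.items()):
        ticks = ["x" if bqe in elements else " " for bqe in BQES]
        pct = 100 * len(elements) / len(BQES)
        print(ms, ticks, f"{pct:.0f}% of BQEs reported")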
3.2 Sense checking
Sense checking has a number of aspects:
· Whether the entries in reported elements are valid (see text on clarity above);
· Whether the entries are consistent with values reported for Articles 3 and 5 – cross validation;
· Whether the information is consistent within and between countries, with any differences and inconsistencies examined and explained – benchmarking.
3.2.1 Cross validation
Sense checking would include cross validation, for example, with Article 8 information/data being successfully linked to the Articles 3 and 5 reports. This would include the use in Article 8 reports of valid River Basin District and Water Body codes as reported under Articles 3 and 5. In addition, there should be cross validation between the three Article 8 schemas: for example, the monitoring programme codes defined in the Monitoring XML should link to those given in the Surfacewaterstations XML and Groundwaterstations XML.
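A minimal sketch of such a linkage check is given below; the code sets stand in for programme codes parsed from the three files and are hypothetical:

    # Sketch: check that every monitoring programme code cited in the
    # stations files is defined in the Monitoring file, and vice versa.
    # The sets below are hypothetical stand-ins for parsed codes.
    programmes_defined = {"AT_SURV_1", "AT_OPER_1", "AT_OPER_2"}
    programmes_cited = {"AT_SURV_1", "AT_OPER_1", "AT_GW_9"}  # from stations files

    print("cited but not defined:", sorted(programmes_cited - programmes_defined))
    print("defined but never cited:", sorted(programmes_defined - programmes_cited))

The same pattern would apply to checking Water Body and RBD codes in Article 8 reports against those reported under Articles 3 and 5.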
There have already been some initial analyses of the Article 8 reports undertaken by the JRC and the EEA, including some aspects of cross validation. The JRC analysis covered two aspects of the electronic reporting tools and the linkage between WB codes provided for Article 5 and Article 8. The EEA mapped Article 8 surface water sites to Article 5 Water Bodies (see Annex 1, Table 1) and carried out a short analysis of Article 8 groundwater data.
Cross validation will also be required for the spatial aspects of the Article 8 reports. In particular, the geographic coordinates of the surface water and groundwater monitoring sites are required. A site is linked (by a unique code reported in Articles 5 and 8) to the particular water body in which the monitoring station is physically located. A check will have to be made to see whether the site is located in the correct water body. This will be possible by mapping where a shapefile of the water body has been provided. In cases where only the centroids of water bodies have been provided, a vicinity check will be possible to see whether the site is in the correct RBD or country, and how far it is from the centroid of its identified water body.
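A minimal sketch of the two spatial checks is given below, using the shapely library and assuming WGS84 coordinates; the site position and water body geometry are hypothetical:

    # Sketch of the two spatial checks. Geometries are hypothetical;
    # real checks would read the reported shapefiles and site coordinates.
    from math import asin, cos, radians, sin, sqrt
    from shapely.geometry import Point, Polygon

    def haversine_km(lon1, lat1, lon2, lat2):
        # Great-circle distance, used for the vicinity check on centroids
        lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
        a = sin((lat2 - lat1) / 2) ** 2 + \
            cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))

    site = Point(16.37, 48.21)
    water_body = Polygon([(16.0, 48.0), (17.0, 48.0), (17.0, 48.5), (16.0, 48.5)])

    # Where a shapefile was provided: point-in-polygon check
    print("site inside reported water body:", water_body.contains(site))

    # Where only a centroid was provided: distance from site to centroid
    print("km to centroid:", round(haversine_km(16.37, 48.21, 16.5, 48.25), 1))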
3.2.2 Benchmarking and checking of outliers
Sense checking may also include initial analysis and benchmarking of some aspects of the information. For example, there may be large differences between countries in the numbers of sites used for surveillance and operational monitoring. A way of 'normalising' numbers is to relate them to the relevant land surface area (e.g. country, RBD, or sub-unit) and/or the lengths/areas of water bodies being monitored or represented. A comparison of these monitoring 'densities' may reveal large differences within and between countries.
Of course, differences in monitoring site density may relate to differences in the natural characteristics of river basins and water bodies, and in the numbers and extent of pressures upon them, rather than to differences in levels of ambition or interpretation of the Directive. Indicators could be derived for each MS, RBD and sub-unit, and their values compared to average and outlying values from all reports. Outlying values may be further investigated to determine and explain the reason(s) for the differences.
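A minimal sketch of such a density indicator, with a simple trigger for outlying values (a factor of two above or below the average, analogous to the pHMWB benchmark described below), is given here; all RBD codes and figures are hypothetical:

    # Sketch: normalise site counts by RBD area and flag outlying
    # monitoring densities. All RBD codes and figures are hypothetical.
    from statistics import mean

    rbds = {"AT1000": (1300, 80600), "DE2000": (2100, 86200),
            "FR3000": (310, 96500), "ES4000": (4800, 78900)}  # (sites, area km2)

    density = {rbd: 1000 * sites / area for rbd, (sites, area) in rbds.items()}
    avg = mean(density.values())

    for rbd, d in sorted(density.items()):
        # Densities more than a factor of two from the average are flagged
        flag = "INVESTIGATE" if d > 2 * avg or d < avg / 2 else ""
        print(f"{rbd}: {d:.1f} sites per 1000 km2 {flag}")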
Such an approach has been used in analysing the Article 5 reports, where there were obvious differences in the reported percentages of provisionally designated Heavily Modified Water Bodies (pHMWB) in RBDs. On average, 20% of Water Bodies were designated as pHMWB. A benchmark value of 40% for an RBD was taken, above which more detailed analysis was required. The benchmark value has not been, and would not be, taken as a judgemental value; rather, it is a value that might trigger further sense checking and investigation.
Benchmarking may also serve to highlight issues and MS/RBDs that may require more detailed analysis and assessment of how the Directive has been interpreted and implemented. This is discussed in the next section.
3.3 Compliance indicators and key issues
Benchmarking may highlight cases where there will be a need to check the consistency of the design and implementation of monitoring networks with the requirements of the Directive, and also to assess in more depth the comparability of monitoring between countries. This might be most efficiently achieved through the careful definition of indicators for the assessment of 'key issues'. The selected indicators would be produced from the data and information reported for Article 8, supported, where required, by information reported under Articles 3 and 5.
Compliance indicators are a tool for benchmarking, but cannot be used directly to draw conclusions about compliance with the WFD. They are used to flag deviations that have to be looked at more closely.
3.3.1 Identification of key issues
Key issues with regard to Article 8 and the design and implementation of monitoring networks might include those identified in the workshop on surface waters monitoring networks and classification systems held in Brussels on 27-28 April 2006:
· *Risk category of water bodies included in surveillance monitoring.
· *Grouping of water bodies, i.e. percentage of water bodies covered by monitoring.
· Numbers of monitoring stations/water bodies* included in surveillance and operational monitoring.
· Numbers of monitoring stations in water bodies, and in particular very large* (coastal) water bodies.
· Quality elements and parameters monitored.
· Frequency of monitoring.
· Confidence and precision of monitoring results.
Assessment of some issues would require data/information previously reported, and a proper link between the information provided in the various reporting exercises. For example, the issues marked above with a '*' would require information reported for Article 5.
Note that data integrity and quality assurance checking is required for the Article 5 data submitted electronically through WISE (Task sheet WISE 07/17), and that the aim is to harmonise the Article 3 and Article 5 databases, which will be hosted by the EEA (Task sheet WISE 07/9). In the meantime, for preliminary checking, it may be possible to extract and use the relevant Article 3 and 5 data from existing working databases, particularly from those countries that have provided the most complete and clear datasets.