EUROPEAN COMMISSION
EUROSTAT
Directorate E: Agriculture and environment statistics; Statistical cooperation
Unit E3: Environment statistics
Doc. ENV/ACC/WG/08 (2007)
Original in EN
Point 13.1 of the agenda
NAMEA survey 2006 – Technical Documentation of
data validation and estimation of EU-aggregates
Stephan Moll, Wuppertal Institute
for Eurostat – Unit E3
Working Group "Environmental Accounts"
Joint Eurostat/EFTA group
Meeting of 7 and 8 May 2007
BECH Building – Room Quetelet

Meeting documents will soon be available for download from the Environment statistics meetings CIRCA site at

Please note that, for environmental reasons, paper copies of meeting documents will not be available in the meeting room. The only exceptions will be documents that are not posted on CIRCA at least one week before the meeting.

1 Overview of the NAMEA-air survey 2006

The survey started on 12 June 2006, and the deadline for responses was set for the end of August 2006.

The electronic questionnaire (a simplified and streamlined version of the NAMEA-air standard tables) was launched through a letter to all countries. For each country, the electronic questionnaire was pre-filled based on data available at Eurostat from former surveys. The pre-filled questionnaires were placed in a protected internet library (CIRCA). The countries were asked to confirm or revise the pre-filled data and to update the time series. Further, countries were asked to provide certain meta information related to NAMEA activities in their respective countries.

During the survey period, ad hoc technical support was provided to Denmark, Poland, and Austria. On 13 September, a reminder was sent to those countries which had not replied by that date.

Altogether, 24 countries responded to the survey. 20 countries provided data in one form or another (Czech Republic, Lithuania, Luxembourg, and Malta sent notifications that no data can be expected in the short term).

16 countries provided data in the form of filled 2006-questionnaires. 4 countries provided data in a different format (BE, CH, HU, and IE). The consultant transferred those data to the 2006-questionnaire in close consultation with the respective countries.

None of the countries made use of the pre-filled data. Instead, countries preferred to send new (revised) time series.

The following table provides an overview of the country replies:

No. / Country / Code / Sent by / Date / Response type
1 / Austria / AT / Sacha Baud / 07. Sep 06 / CIRCA
2 / Belgium / BE / Data were not sent via 2006-questionnaire. / In February 2006, BE had sent data in the format of ‘old’ standard tables. / SMO transferred data from standard tables to ‘new’ 2006-questionnaire format
3 / Bulgaria / BG / Rumiana Ivanova on behalf of Stefan Tsonev, Head, Environment Statistics Division, NSI of Bulgaria / 31. Oct 06 / e-mail sent to Julio Cabeca
4 / Switzerland / CH / CH has not sent in the 2006-questionnaire / CH has published NAMEAs for GHG-emissions and the year 2002 / SMO transferred data from publication to new 2006-questionnaire format
- / Czech Republic / CZ / Eva Krumpová / - / Notified by e-mail (31.8.06) that deadline cannot be met
5 / Germany / DE / Uwe SCHERHAG / 10. Oct 06 / by e-mail
6 / Denmark / DK / Thomas OLSEN / 11. Jul 06 / CIRCA
7 / Estonia / EE / Eda Grüner / 06. Sep 06 / CIRCA
8 / Spain / ES / Marisa EGIDO MARTIN / 01. Sep 06 / by e-mail
9 / France / FR / Regis MORVAN / 30. Aug 06 / by e-mail
10 / Hungary / HU / Data were not sent via 2006-questionnaire. / Instead, publication with tables was sent by Orsolya BALINT / SMO transferred data from publication to new 2006-questionnaire format
11 / Ireland / IE / Data were not sent via 2006-questionnaire. / Instead, an EXCEL workbook was sent, containing data in Irish format / SMO transferred data from Irish format to 2006-questionnaire
12 / Italy / IT / Angelica TUDINI / 31. Aug 06 / CIRCA
- / Lithuania / LT / Danguole KREPSTULIENE / - / Notified by e-mail (15.9.06) that no data can be expected
- / Luxembourg / LU / notification received (via 2006-questionnaire) / no data can be expected (Eric de BARBANTER, 19.10.06) / CIRCA
- / Malta / MT / Decelis Rachel / - / Notified by e-mail (14.7.06) that Malta only compiles air emission inventories
13 / Netherlands / NL / Cor GRAVELAND / 11. Sep 06 / by e-mail
14 / Norway / NO / Kristine KOLSHUS / 01. Sep 06 / by e-mail
15 / Poland / PL / Danuta DZIEL / 01. Sep 06 / CIRCA
16 / Portugal / PT / Isabel QUINTELA / 31. Aug 06 / by e-mail
17 / Romania / RO / Daniela STEFANESCU / 31. Aug 06 / by e-mail
18 / Sweden / SE / Anders WADESKOG / 15. Sep 06 / CIRCA
19 / Slovenia / SI / BUTINA Vida / 29. Aug 06 / by e-mail
20 / United Kingdom / UK / Ian GAZLEY / 31. Aug 06 / CIRCA

2 Data validation

The data validation of the 2006-survey is divided into several sub-tasks:

  • Pre-checks of incoming filled questionnaires
  • Development of check-procedures to facilitate data validation
  • Comparing survey data with auxiliary data (UNFCCC, CLRTAP, IEA)

2.1 Pre-checks of incoming filled questionnaires

As a first step, procedures needed to be developed in order to pre-check incoming data. These procedures were discussed and developed jointly with Eurostat’s IT-staff during a meeting on 12 October 2006. The pre-checking includes:

  • Processing the original 2006-questionnaire for automatic reading (i.e. checking whether questionnaires have been filled correctly; removing misplaced footnotes etc.)
  • Preparing overviews (i.e. mapping of which parts of the questionnaire have been filled etc.)
  • Reading EXCEL-files into ENVSTAT (for further validation within ENVSTAT)

2.2 Development of check-procedures to facilitate data validation

In order to facilitate the data validation an automatic check-procedure was developed by an IT-consultant.

Ideally, the NAMEA questionnaire aims at air emission, energy and economic data broken down by 60 industries (NACE 2-digit level) – see rows 5 to 65 (in the 2006-questionnaire). The A60-level (2-digit divisions) allows full compatibility with Eurostat’s monetary Input-Output Tables.

In anticipation of the fact that some countries may not be able to report data by A60-level, the questionnaire offers the alternative option of filling so-called interim aggregates (see rows 66 to 84 in 2006-questionnaire). Those interim aggregates are certain groupings of 2-digit divisions. E.g. row 66 (A_B) is a grouping of the 2-digit-divisions 01, 02 and 05. The list of interim aggregates has been proposed by the former task force (standard table). One problem is that the interim aggregates are partly nested within each other.

Table 1 provides an overview on how the interim aggregates are related to each other and how they are related to the 2-digit divisions (A60-level).

As can be seen from Table 1, the interim aggregates can be allocated to two clusters:

  1. Interim aggregates II (A10): they refer to 8 classes which are comparatively broad. If, in addition, one considers two of the 2-digit-divisions (namely, 45 ‘Construction’ and 55 ‘Hotels and restaurants’), one obtains the full range of all industries making up ‘industry totals’.
  2. Interim aggregates I (A36): they refer to 12 classes of medium breadth. They do not cover the full range, i.e. they do not add up to industry totals. Therefore, several 2-digit classes need to be added in order to make up the industry totals.

The check-procedure is applied to each country and each single variable (13 air pollutants, 2 energy use variables, 4 economic variables) and comprises the following elements:

  1. Calculating, adding and flagging interim aggregates if the sub-components are available whilst the interim aggregate is not given (e.g. calculating interim aggregate A01-02 from the two 2-digits A01 and A02);
  2. Checking whether interim aggregates equal the sum of their sub-components if the latter are available (e.g. whether interim aggregate A01-02 is equal to the sum of the single 2-digits A01 plus A02);
  3. Checking industry totals (e.g. whether the sum of all sub-components equals the industry total as given in the questionnaire);
  4. Checking household totals (e.g. whether the three sub-components of household emissions add up to the total of household emissions);
  5. Providing an overview of data availability for the three different levels of resolution (A10, A36, and A60) for each country and each variable.
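The first two elements of the check-procedure can be illustrated with a minimal Python sketch. It is not the procedure implemented by the IT-consultant: the function name `check_aggregate`, the flag labels and the relative tolerance are assumptions made for illustration, and the figures in the example are invented.

```python
def check_aggregate(values, aggregate, components, rel_tol=0.001):
    """Check one interim aggregate of one variable for one country.

    values:     dict mapping row codes (e.g. "A01") to reported values
    aggregate:  code of the interim aggregate (e.g. "A01-02")
    components: codes of the 2-digit divisions making up the aggregate
    Returns (value, flag); the flag labels are hypothetical.
    """
    parts = [values.get(code) for code in components]
    complete = all(part is not None for part in parts)
    reported = values.get(aggregate)

    if reported is None:
        if complete:
            # Element 1: calculate and flag the missing interim aggregate.
            return sum(parts), "calculated"
        return None, "missing"
    if not complete:
        # Nothing to compare the reported aggregate against.
        return reported, "reported"
    # Element 2: compare the reported aggregate with the sum of components.
    total = sum(parts)
    if abs(reported - total) <= rel_tol * max(abs(total), 1e-12):
        return reported, "ok"
    return reported, "mismatch"


# Invented example: A01-02 is not reported but both components are.
co2 = {"A01": 120.0, "A02": 30.0}
print(check_aggregate(co2, "A01-02", ["A01", "A02"]))
```

The same comparison could be reused for industry totals and household totals (elements 3 and 4) by passing the total as `aggregate` and its sub-components as `components`.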

Table 1: Relation between interim aggregates and NACE-2-digits

The following Table 2 provides an overview of the data availability and quality for the 20 countries for which data have been validated.

Table 2: Overview on the data availability and quality of 20 countries for which data were received during 2006-survey

2.3 Comparing survey data with auxiliary data (UNFCCC, CLRTAP, energy balances)

In the next validation step, survey data were compared with auxiliary data. National population statistics (from Eurostat) were used as a generic auxiliary variable in order to calculate and cross-check per capita figures of the single NAMEA-variables. For the actual air emission and energy use data, several sources were consulted to obtain auxiliary data (see Table 3).

Table 3: Sources for auxiliary variables used to validate 2006-NAMEA-survey data

NAMEA variable: Emissions of greenhouse gases (CO2, N2O, CH4, HFC, PFC, SF6)
Auxiliary data and sources: United Nations Framework Convention on Climate Change (UNFCCC): Greenhouse Gas Emission Inventories
Source: National Inventories submitted to UNFCCC (submission year 2006)
File for the latest year reported: 2004
TABLE 10 s1 to s4: EMISSIONS TRENDS 1995-2004
URL: accessed, 7 Feb 2007
Notes: Cyprus and Malta do not submit to UNFCCC. For Luxembourg, Table 10 is not available. For Poland, only year 2004 is available. For those missing countries, EEA datasets were used.
Filename: AuxiliaryUNFCCC_GreenhouseGases.xls

NAMEA variable: Emissions of SOx, NOx, NH3, NMVOC, CO, PM
Auxiliary data and sources: WebDab 2006 – EMEP activity data and emission database
URL: accessed, 7 Feb 2007
This online emission database is provided by EMEP (co-operative programme for monitoring and evaluation of long range transmission of air pollutants in Europe) and contains all emission data (except Large Point Source data) officially submitted to the secretariat of the Convention on Long-range Transboundary Air Pollution (CLRTAP) by Parties to the Convention.
Auxiliary data were downloaded from the website-section: "Expert Emissions used in EMEP models" (These emission data are based on officially reported emissions to the extent possible, but some of the officially reported data have been corrected and gaps filled).
Filename: AuxiliaryEMEP_CLRTAP_ExpertEstimations.xls

NAMEA variable: Energy use
Auxiliary data and sources: Eurostat’s NewCronos online database: Energy Statistics (balances)
The category downloaded is termed: “Gross Inland Consumption – all products”
Table: nrg_100a Supply, transformation, consumption - all products - annual data
Unit: 1000toe Thousands tons of oil equivalent (TOE)
Indic: en100900 Gross inland consumption
URL: accessed, 7 Feb 2007
29 countries
Years downloaded: 1995-2004
Notes: Switzerland is missing
Filename: AuxiliaryEnergyGIC.xls

With the help of those auxiliary variables, several unit errors were detected and corrected in consultation with the respective national statistical institutes. In one case, new data for one air emission parameter were sent.
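How a comparison with an auxiliary total can point to a unit error may be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used: the function name `unit_error_hint`, the tolerance band of factor 3 and the example figures are invented. A wide band is chosen because survey and auxiliary totals are not expected to coincide exactly.

```python
import math


def unit_error_hint(survey_total, auxiliary_total, band=3.0):
    """Compare a survey total with an auxiliary total given in the same unit.

    Returns None if the ratio of the two lies within [1/band, band].
    Otherwise returns the power of ten nearest to the ratio, as a hint
    that the survey value may have been reported in a different unit.
    """
    if survey_total <= 0 or auxiliary_total <= 0:
        raise ValueError("both totals must be positive")
    ratio = survey_total / auxiliary_total
    if 1.0 / band <= ratio <= band:
        return None
    return 10 ** round(math.log10(ratio))


# Invented example: survey value apparently reported in tonnes
# where thousand tonnes were expected.
print(unit_error_hint(52_000_000, 55_000))
```

A hint of this kind only marks a value for follow-up; whether it really is a unit error has to be clarified with the reporting institute.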

3 Estimation of EU-aggregates

The 2006-NAMEA-survey reveals considerable data gaps regarding coverage of the 29 countries, 19 parameters (13 air pollutants, 2 energy uses, 4 economic variables) and time (9 years: 1995-2003). Further, there are considerable data gaps regarding the breakdown of industries (the 2006-NAMEA-questionnaire offers three disaggregation levels: A10, A36, and A60).

The overall objective is to estimate:

  • time series 1995 to 2004,
  • for several EU-aggregates: EU15, EU25 and EU27
  • at A36-level of disaggregation
  • for 8 air pollutants[1] (CO2, N2O, CH4, SOx, NOx, NH3, CO, NMVOC), and
  • 2 energy uses[2]

(decided at meeting on 21 March 2006)

In general, a bottom-up approach is applied, i.e. in a first step missing values are estimated for single countries in order to arrive at aggregate estimations for EU15, EU25 and EU27.

Two cases of “data-gaps” can be distinguished (see Figure 1):

  • In ‘case A’ certain values are given/available for a certain industry-item. However, some years are missing; in other words, the time series for a certain item is incomplete. The available values are also called “neighbouring values”.
  • In ‘case B’ no values of a certain item are given; i.e. the item is completely missing. The entire time series is missing. There are no “neighbouring values” which might be employed for estimation.

Figure 1: Overview on estimation procedures

case / items / estimation procedure
case A: neighbouring values exist, but time series is incomplete / NAMEA total - ECON_TOT; household totals - EP_HOUS; industry total - SUB_TOT; industry breakdown - A10, A36, A60 / use annual-%-change of auxiliary variable (e.g. from UNFCCC or CLRTAP inventories, or energy balance) to complete time series
case B: no neighbouring values exist, i.e. time series is completely missing / NAMEA total - ECON_TOT / use auxiliary variable (e.g. UNFCCC/CLRTAP total)
case B (continued) / household totals - EP_HOUS; industry total - SUB_TOT; industry breakdown - A10, A36 / use average split between subcategories derived from auxiliary-aggregate of countries for which data are available
case B (continued) / industry breakdown - A60 / no estimation

In a first step, time series of items are completed. Existing (neighbouring) values are extrapolated to the missing years with the help of annual percentage changes of auxiliary variables (e.g. for CO2, the UNFCCC inventory data are used as auxiliary). This extrapolation is applied to all items (i.e. all levels of industry-breakdown), which implies the assumption that sub-items (e.g. emissions of the electricity industry) develop proportionally to the total (e.g. emissions of the total economy).
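The completion of a time series with the annual change of an auxiliary variable can be sketched in Python as follows. The function name `complete_series` and the figures are invented for illustration. The treatment of gaps between two known values (extrapolating forward from the preceding year) is an assumption, since the text does not specify it.

```python
def complete_series(item, auxiliary):
    """Fill missing years of an item using the annual change of an auxiliary.

    item:      dict {year: value or None}, with at least one known value
    auxiliary: dict {year: value}, complete for all years of `item`
    """
    years = sorted(item)
    filled = dict(item)
    # Forward pass: carry the last known value with the auxiliary's change.
    for prev, year in zip(years, years[1:]):
        if filled[year] is None and filled[prev] is not None:
            filled[year] = filled[prev] * auxiliary[year] / auxiliary[prev]
    # Backward pass: fill gaps before the first known value.
    for year, nxt in zip(reversed(years[:-1]), reversed(years[1:])):
        if filled[year] is None and filled[nxt] is not None:
            filled[year] = filled[nxt] * auxiliary[year] / auxiliary[nxt]
    return filled


# Invented example: only 1996 is reported for the item.
item = {1995: None, 1996: 50.0, 1997: None, 1998: None}
aux = {1995: 100.0, 1996: 110.0, 1997: 121.0, 1998: 99.0}
print(complete_series(item, aux))
```

A series without any known value (case B) is returned unchanged by this sketch, because there is no neighbouring value to start from.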

After this first step, EU-wide auxiliary-aggregates are generated from those countries for which data are available (now in complete time series). Evidently, the number of countries for which data are available varies by pollutant and industry-breakdown level. The following Figure 2 provides the example of CO2. It gives the number of countries used to form the auxiliary-aggregate for the several industry-breakdown levels and EU groupings.

Figure 2: Number of countries to form EU auxiliary-aggregates for the different industry-breakdown levels (example CO2)

Level / EU15 / EU25 / EU27
A2 / 12 / 16 / 17
A10 / 9 / 11 / 12
A36 / 6 / 6 / 6

As Figure 2 shows, the number of countries available to form auxiliary-aggregates decreases with increasing resolution of the industry-breakdown. For instance, at A2-level (i.e. the distinction between industry emissions and household emissions), 12 countries out of EU15 are available. At A36-level, data for only 6 countries are available to form auxiliary-aggregates for the EU.

In the next estimation step, the auxiliary-aggregates are used to estimate missing countries. This is done in a hierarchical order. First, the A2-level is estimated. For instance, in the case of CO2, the three EU15 countries for which the A2-level is missing are estimated. The CO2-total as provided by the UNFCCC inventory is split into industry and household emissions using the respective shares from the auxiliary-aggregate derived from the 12 existing countries. Then the A10-level is estimated for the 6 missing EU15 countries by applying the respective auxiliary-aggregate of the 9 available countries. The estimate obtained from the superior level (i.e. total industry emissions as derived from the A2-level) is then split into 10 industries using the split of the 9 available countries as a reference.

Finally, the A36-level is estimated starting from the 10 industries’ estimates derived in the previous step. For the countries where A36-level data are missing, the A10 estimates are sub-allocated to 36 industries using the average distribution in the auxiliary-aggregate, which comprises 6 countries in this case, as a reference.
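The hierarchical splitting described above can be sketched in Python as follows. The function names `aggregate_shares` and `split`, the item labels and all figures are invented for illustration. The shares are taken from the summed auxiliary-aggregate of the reference countries, which is one reading of "average split"; an unweighted average of country shares would give different results.

```python
def aggregate_shares(reference_countries):
    """Shares of each sub-item in the auxiliary-aggregate.

    reference_countries: list of dicts {item: value}, one per country
    for which data at this breakdown level are available.
    """
    totals = {}
    for country in reference_countries:
        for item, value in country.items():
            totals[item] = totals.get(item, 0.0) + value
    grand_total = sum(totals.values())
    return {item: value / grand_total for item, value in totals.items()}


def split(total, shares):
    """Allocate a superior-level total to sub-items according to shares."""
    return {item: total * share for item, share in shares.items()}


# Step 1 (A2-level): split an inventory total into industry and households.
ref_a2 = [{"industry": 80.0, "households": 20.0},
          {"industry": 160.0, "households": 40.0}]
a2 = split(50.0, aggregate_shares(ref_a2))

# Step 2: split the industry estimate further (only two hypothetical
# industry groups shown; the A10- and A36-levels work the same way).
ref_next = [{"group_1": 30.0, "group_2": 50.0},
            {"group_1": 30.0, "group_2": 130.0}]
a_next = split(a2["industry"], aggregate_shares(ref_next))
print(a2, a_next)
```

Because each level is split from the estimate of the level above, the sub-items of an estimated country always add up to its superior total.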

As a result of the estimation procedures described above, one obtains complete data sets for all 27 countries comprising original as well as estimated data. This full data set is then aggregated to three EU country groupings: EU15, EU25, and EU27. Single-country estimates are not supposed to be shown. They only serve as an interim step to obtain EU estimates.

4 Aggregating air emissions to three major impact categories

In order to facilitate analyses, it is common to aggregate several air emission parameters to so-called impact categories. Three impact categories can be derived from the 8 air emissions for which NAMEA estimates are conducted:

  • Global Warming Potentials (CO2, N2O, CH4)
  • Acidification (SOx, NOx, NH3)
  • Tropospheric Ozone Formation Potential (NOx, NMVOC, CO, CH4)

The impact categories are derived by aggregating several air emissions to one number, applying certain weighting factors. Table 4 presents the weighting factors applied.

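Table 4: Weighting factors applied for environmental impact categories related to air emissions

impact category / unit / air emission / weighting factor
Global Warming Potential (GWP) / CO2-equivalents / CO2 / 1.0
Global Warming Potential (GWP) / CO2-equivalents / N2O / 310
Global Warming Potential (GWP) / CO2-equivalents / CH4 / 21
Acidification (ACID) / SO2-equivalents / SOx / 1.0
Acidification (ACID) / SO2-equivalents / NOx / 0.7
Acidification (ACID) / SO2-equivalents / NH3 / 1.9
Tropospheric Ozone Forming Potential (TOFP) / NMVOC-equivalents / NOx / 1.22
Tropospheric Ozone Forming Potential (TOFP) / NMVOC-equivalents / NMVOC / 1.0
Tropospheric Ozone Forming Potential (TOFP) / NMVOC-equivalents / CO / 0.11
Tropospheric Ozone Forming Potential (TOFP) / NMVOC-equivalents / CH4 / 0.014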
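The aggregation with the factors of Table 4 is a weighted sum per impact category. A minimal Python sketch follows; the function name `impact_categories` and the emission figures are invented for illustration, and all emissions are assumed to be given in the same mass unit.

```python
# Factors as listed in Table 4.
WEIGHTS = {
    "GWP": {"CO2": 1.0, "N2O": 310.0, "CH4": 21.0},
    "ACID": {"SOx": 1.0, "NOx": 0.7, "NH3": 1.9},
    "TOFP": {"NOx": 1.22, "NMVOC": 1.0, "CO": 0.11, "CH4": 0.014},
}


def impact_categories(emissions):
    """Aggregate emissions {pollutant: mass} into the three impact categories.

    The results are in CO2-, SO2- and NMVOC-equivalents respectively,
    in the mass unit of the input.
    """
    return {
        category: sum(factor * emissions[pollutant]
                      for pollutant, factor in factors.items())
        for category, factors in WEIGHTS.items()
    }


# Invented emissions of one industry, all in 1000 tonnes.
example = {"CO2": 1000.0, "N2O": 1.0, "CH4": 10.0, "SOx": 5.0,
           "NOx": 10.0, "NH3": 2.0, "NMVOC": 8.0, "CO": 20.0}
print(impact_categories(example))
```

Note that NOx and CH4 each enter two categories, so the three results must not be added up.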

5 Availability of data

The data will be presented to the Working Group on 7-8 May 2007. Discussions on the estimation methods for the aggregated EU15, EU25 and EU27 figures are expected. If agreed, the data can be published; publication is expected by the end of May 2007.


[1] The response rate for remaining air pollutants (HFC, PFC, SF6, PM, CO2-from-biomass) was too poor to conduct reasonable estimates.

[2] Missing economic variables (gross value added, output) are not estimated from the 2006-NAMEA-survey base since Eurostat already provides estimates for those (see ), which are to be used for the further integrated environmental-economic analyses.