
Reference period / 2006
Observation period / 2004 - 2006
Person who filled the report / J. Malek Mansour


TABLE OF CONTENTS

1 Overview

2 Short Description of the national CIS 2006 methodology used

3 Accuracy

4 Comparability

5 Coherence


1 OVERVIEW

The purpose of this report is to give an overview of the quality of the Fourth Community Innovation Survey (CIS 2006) carried out in each Member State. A quality report is to be established for the CIS 2006, and the same is envisaged for subsequent Community Innovation Surveys.

This quality assessment is based on quality dimensions and indicators that follow the Eurostat standard statistical quality framework. Each quality criterion corresponds to a specific chapter in the report. The criteria are: Relevance, Accuracy, Timeliness and Punctuality, Accessibility and Clarity, Comparability, Coherence, and Cost and Burden. In addition, each report should contain a short description of the national methodology used for the CIS 2006.


2 SHORT DESCRIPTION OF THE NATIONAL CIS 2006 METHODOLOGY USED

2.1 Target population

NACE

In accordance with section 2 of the annex of the Commission Regulation No 1450/2004 on innovation statistics, the following industries are included in the core target population of the CIS 2006:

-  mining and quarrying (NACE 10-14)

-  manufacturing (NACE 15-37)

-  electricity, gas and water supply (NACE 40-41)

-  wholesale trade (NACE 51)

-  transport, storage and communication (NACE 60-64)

-  financial intermediation (NACE 65-67)

-  computer and related activities (NACE 72)

-  architectural and engineering activities (NACE 74.2)

-  technical testing and analysis (NACE 74.3)

Please list all “non-core” industries that were covered in addition:

We also included Research and Development (NACE 73)

We believe it is useful to include NACE 73, as a number of enterprises transfer their R&D activities to legally separate subsidiaries, which are, by definition, classified in the NACE 73 sector. By not surveying those firms, we would be likely to miss part of the picture.

Size-classes

The target population follows the minimum coverage requirement, which is all enterprises with 10 or more employees.

Please indicate if there were some deviations.

Comments: There were no deviations.

Statistical units

The main statistical unit for CIS 2006 is the enterprise, as defined in the Council Regulation 696/1993 on statistical units or as defined in the national statistical business register. EU Regulation 2186/1993 requires that Member States set up and maintain a register of enterprises, as well as associated legal units and local units.

Please indicate if there were some deviations.

Comments: There were no deviations (firms were sampled at the level of the VAT number).

The observation and reference periods

The observation period to be covered by the survey is 2004 - 2006 inclusive i.e. the three-year period from the beginning of 2004 to the end of 2006. The reference period of the CIS 2006 is the year 2006.

Please indicate if there were some deviations.

Comments: There were no deviations.

2.2 Sampling design

The target population of the CIS 2006 is broken down into similarly structured subgroups or strata (which should be as homogeneous as possible and form mutually exclusive groups).

The stratification variables to be used for the CIS 2006, i.e. the characteristics used to break down the sample into similarly structured groups, are:

- The economic activities (in accordance with NACE)[1].

In accordance with the requirements of section 5, paragraph 2 of the annex of the Commission Regulation 1450/2004 on innovation statistics, stratification by NACE should be done at least at two-digit (division) level, except for NACE 74. Here the three digit sections NACE 74.2 and 74.3 should be treated as separate NACE categories while NACE 74.1 and 74.4 to 74.8 should be treated as a single NACE category.

- Enterprise size according to the number of employees[2].

The size-classes used should at least be the following:

·  0-9 employees

·  10-49 employees

·  50-249 employees

·  250+ employees.

- Regional aspects at NUTS 2 level:

In accordance with section 7, paragraph 2 of the annex of the Commission Regulation 1450/2004 on innovation statistics, the regional allocation of the sample is taken into consideration when sampling.

The selection of the sample should be based on random sampling techniques, with known selection probabilities, applied to strata.

Please describe the sampling and allocation scheme used (number of strata, number of samples…).

Comments:

Belgium is composed of 3 Regions (at NUTS 1 level): Flanders, Brussels and Wallonia. Each Region is endowed with its own statistical office. Therefore, three separate samples were drawn.

For the Brussels Region:

The problem of NUTS 2 allocation is irrelevant since Brussels is a City-Region and has no NUTS 2 sub-components.

As to size classes, we did not include firms with fewer than 10 employees, so we had 3 size classes: [10-49 employees], [50-249 employees], [250 employees and more].

We used all “core NACE” sectors, plus NACE 73 (“Research and Development”). We stratified according to Regulation 1450/2004. That is, we stratified our sample over sectors: NACE 10, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 40, 41, 51, 60, 61, 62, 63, 64, 65, 66, 67, 72, 73, 74.2, 74.3. That makes 42 sector classes.

Therefore, in total, we have 3 x 42 = 126 strata.

For the Walloon Region:

A separate sampling was made in each NUTS 2 region within the Walloon Region. There are 5 NUTS 2 Regions within Wallonia (BE31 to BE35), which makes 5 different samples.

In each NUTS 2 region within Wallonia, the same stratification was used as in Brussels.

For the Flanders Region:

Besides firm size and sector, a third stratification variable was taken into account for sampling in the Flanders region: whether or not a firm was known to have continuous R&D spending. The inventory of firms with continuous R&D spending obtained from the 2006 R&D survey was used as the basis for this variable.

A census was taken of all large firms (250 or more employees), all medium-sized firms (50-249 employees), and small firms (10-49 employees) in NACE 24, 29 through 35, 72, 73, 74.2 and 74.3. A census was also taken of the small firms known to have continuous R&D spending in the other core NACE sectors.

For the remaining small firms, sampling rates were first set to meet the Eurostat precision criteria for NACE sectors grouped by technology level: low-tech industry (NACE 10-23, 25-28, 36 and 37) versus low-tech services (NACE 51, 60-67). These rates were then applied proportionally to each NACE sector in those technology groupings, and to each NUTS 2 (province) grouping within each NACE sector. NACE sectors were considered at the 2-digit (division) level. Cells of 12 or fewer firms were fully included in the sample (exhaustive sampling).
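The allocation rule for the remaining small firms can be sketched as follows. This is only an illustration: the function name, the cell sizes, the 25% rate and the choice to round up are assumptions, not values or rules taken from the Flemish sampling plan.

```python
import math

def cell_sample_size(population: int, rate: float, take_all_threshold: int = 12) -> int:
    """Number of firms to draw from one NACE x NUTS 2 cell.

    Cells at or below the threshold are sampled exhaustively; larger cells
    receive the technology-group rate (rounded up, which is an assumption).
    """
    if population <= take_all_threshold:
        return population
    return min(population, math.ceil(rate * population))

# Hypothetical cell sizes (NACE division, NUTS 2 code), for illustration only.
cells = {("NACE 15", "BE21"): 8, ("NACE 15", "BE23"): 40, ("NACE 51", "BE24"): 130}
rate = 0.25  # assumed sampling rate for the technology group

allocation = {cell: cell_sample_size(size, rate) for cell, size in cells.items()}
for cell, n in allocation.items():
    print(cell, n)
```

With these made-up figures, the cell of 8 firms is taken in full, while the two larger cells are sampled at the group rate.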

2.3 Sampling frame

The official, up-to-date, statistical business register[3] of the country was used.

Please indicate if there were some deviations.

For reasons of confidentiality on NACE and size, the sampling was performed on the basis of the population of the National Office for Social Security (NOSS) as composed at the end of August 2006 for the Flanders region and at the end of November 2006 for the other two regions. This official register is at the enterprise level and includes all active enterprises in Belgium. The National Institute for Statistics agreed that this file is statistically equivalent to the representative official business register (NIS population). For both versions of the NOSS register, recent updates (bankruptcies, liquidations, mergers etc.) were taken into account by consulting other data sources.

2.4 Sample size and overall sample rate

There is no minimum sample size. However, if a particular stratum has fewer than 6 enterprises, then all the enterprises in this stratum were selected for the survey.

Please indicate national deviations from this rule as well as the overall sample rate if available.

Comments:

There were no deviations from that rule.

For Flanders, strata with 12 or fewer enterprises were exhaustively sampled, to take into account the expected response rate, which typically hovers around 40%.

The overall ex-ante sampling rates are:

Brussels region: 95%

Walloon Region: 64%

Flanders Region: 50%

2.5 Data collection method

Data are collected through a census, sample survey or a combination of both.

Please indicate the data collection method used.

Data were collected both through a mail survey and an electronic questionnaire (firms were sent a paper booklet by regular mail and could opt to answer either on paper or online). A mixture of census and sample survey was used according to the stratum under consideration.

For the Brussels Region and the Walloon Region, a census was taken of all medium-sized and large enterprises. For small firms, the sample size was computed according to the methodology and requirements described in the Eurostat Methodological Recommendations for the CIS Survey, section 4.6 and Annex 2, using prior information from CIS4 where needed. In addition, we required a minimum of 30 usable answers (taking into account the ex-post response rate of the CIS4). Wherever these two requirements could not be met, a census was performed.
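A minimal sketch of how the two requirements for small firms might be combined is given below. It assumes the rule is applied stratum by stratum, and it takes the precision-based sample size as an input, since the Eurostat formula is not reproduced in this report. The function and parameter names are hypothetical.

```python
import math

def required_sample(population: int, precision_n: int,
                    response_rate: float, min_answers: int = 30) -> int:
    """Sample size for one stratum of small firms.

    precision_n   -- size needed for the precision criterion (given as input)
    response_rate -- expected share of usable answers, e.g. from CIS4
    Returns the full population (a census) when the requirements exceed it.
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    needed_for_answers = math.ceil(min_answers / response_rate)
    return min(population, max(precision_n, needed_for_answers))

# Illustrative figures only.
print(required_sample(population=500, precision_n=40, response_rate=0.5))
print(required_sample(population=50, precision_n=40, response_rate=0.5))
```

In the second call the stratum is smaller than the number of firms needed to expect 30 answers, so the function returns the whole stratum, which corresponds to a census.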

For the Flemish Region: See the description under section 2.2 above.

2.6 Weights calculation method (short description)

The survey results are weighted in order to adjust for the sampling design and for unit non-response to produce valid results for the target population.

The basic method for adjusting for different probabilities of selection used in the sampling process is to use the inverse of the sampling fraction i.e. using the number of enterprises or employees. This would be based on the figure Nh/nh where Nh is the total number of enterprises/employees in stratum h of the population and nh is the number of enterprises/employees in the realised sample in stratum h of the population, assuming that each unit in the stratum had the same inclusion probability. This will automatically adjust the sample weights of the respondents to compensate for unit non-response.
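As an illustration, the basic weight Nh/nh can be computed per stratum as in the sketch below. The stratum labels and counts are invented, and the function name is hypothetical.

```python
def design_weights(population_counts: dict, realised_counts: dict) -> dict:
    """Return the weight N_h / n_h for every stratum h.

    population_counts -- N_h, enterprises in the population per stratum
    realised_counts   -- n_h, enterprises in the realised sample per stratum
    """
    weights = {}
    for stratum, N_h in population_counts.items():
        n_h = realised_counts.get(stratum, 0)
        if n_h == 0:
            # The weight is undefined without respondents; in practice such
            # a stratum would have to be merged with a neighbouring one.
            raise ValueError(f"no realised sample in stratum {stratum!r}")
        weights[stratum] = N_h / n_h
    return weights

# Invented example: two strata labelled by NACE division and size class.
population = {("NACE 15", "10-49"): 120, ("NACE 15", "250+"): 30}
realised = {("NACE 15", "10-49"): 40, ("NACE 15", "250+"): 30}
weights = design_weights(population, realised)
print(weights)
```

Because nh counts the realised sample, the weighted respondents in each stratum add up to Nh, which is how this weight absorbs unit non-response.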

However, if a non-response analysis is carried out (and the results indicate that there is a difference between respondents and non-respondents), then the results of the non-response analysis should also be used when calculating the final weighting factors. One approach is to divide each stratum into a number of response homogeneity groups with (assumed) equal response probabilities within groups. A second approach could be to use auxiliary information at the estimation stage for reducing the non-response bias.
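The first approach can be sketched as follows, assuming that response homogeneity groups are formed inside each stratum and that the design weight uses the number of enterprises originally sampled. Names and figures are illustrative, and this is not the procedure of any specific calibration package.

```python
def rhg_weight(N_h: int, sampled_h: int,
               sampled_in_group: int, responded_in_group: int) -> float:
    """Weight for a respondent in one response homogeneity group.

    The design weight N_h / sampled_h is divided by the group's observed
    response rate responded_in_group / sampled_in_group.
    """
    if sampled_h <= 0 or responded_in_group <= 0:
        raise ValueError("sampled and responding counts must be positive")
    design_weight = N_h / sampled_h
    return design_weight * sampled_in_group / responded_in_group

# Invented stratum: 200 enterprises, 50 sampled, split into two groups.
w_low = rhg_weight(200, 50, sampled_in_group=20, responded_in_group=10)
w_high = rhg_weight(200, 50, sampled_in_group=30, responded_in_group=24)
print(w_low, w_high)
```

Respondents in the group with the lower response rate receive the larger weight, and the weighted respondents still add up to the stratum total.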

Various software packages are available to do the calculations needed to derive calibrated weights. These include:

·  CLAN. This was developed by Statistics Sweden and it is a suite of SAS-macro commands.

·  CALMAR (Calibration on Margins). This is another SAS macro developed by INSEE in France.

·  CALJACK. This is also a SAS macro developed by Statistics Canada.

Please describe the calibration method and the software used:

The weights were computed separately in each region.

For the Brussels Region as well as for the Walloon Region, they are simply the inverses of the realized (ex-post) sampling fractions.

For the Flemish Region: A non-response adjustment was done to the basic weights Nh/nh using calibration. The program g-CALIB 2.0 available from Statistics Belgium was used.

3 ACCURACY

3.1 Introduction

Accuracy in the statistical sense denotes the closeness of computations or estimates to the exact or true values. Statistics differ from the true values because of variability (the statistics change from one implementation of the survey to the next due to random effects) and bias (the average of the possible values of the statistics across implementations differs from the true value due to systematic effects).
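These two components combine in the standard mean squared error decomposition. It is given here as a general statistical identity, not as a formula taken from the CIS guidelines; the symbol $\hat{\theta}$ for the statistic and $\theta$ for the true value are notation introduced for this illustration.

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathrm{E}\!\left[(\hat{\theta} - \theta)^2\right]
  = \mathrm{Var}(\hat{\theta}) + \left(\mathrm{E}[\hat{\theta}] - \theta\right)^2
```

The first term is the variability and the second is the squared bias.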

Several types of error arise during the survey process and together make up the error of the statistics (their bias and variability). The following typology of errors has been adopted:

1. Sampling errors. These affect only sample surveys; they arise because only a subset of the population, usually randomly selected, is enumerated.

2. Non-sampling errors. Non-sampling errors affect sample surveys and complete enumerations alike and comprise:

a) Coverage errors,

b) Measurement errors,

c) Processing errors,

d) Non response errors and

e) Model assumption errors.

3.2 Sampling errors

The aim of this sub-chapter is to measure the sampling errors for CIS 2006 data. The main indicator used is the coefficient of variation (CV).

Table 3.1: Coefficient of variation for key variables by NACE and size (cf. Annex 10.1)

NACE / Breakdown / [1] / [2] / [3] / [4] / [5]
Total NACE / Total / 2.27% / 3.59% / 17.9% / 4.19% / 13.0%
Total NACE / Small [10-49] / 3.20% / 4.96% / 18.1% / 6.21% / 11.4%
Total NACE / Medium-sized [50-249] / 2.52% / 4.50% / 15.5% / 4.78% / 9.6%
Total NACE / Large [> 249] / 2.69% / 4.45% / 18.5% / 4.05% / 23.1%
Industry (NACE 10-14, 15-37, 40-41) / Total / 2.48% / 4.32% / 18.2% / 4.65% / 8.87%
Services (NACE 51, 60-64, 65-67, 72, 74.2, 74.3) / Total / 3.86% / 5.90% / 34.9% / 7.81% / 23.4%

[1] = Coefficient of variation for the percentage of innovating enterprises.

[2] = Coefficient of variation for the percentage of innovators that introduced new or improved products to the market.

[3] = Coefficient of variation for the turnover of new or improved products, as a percentage of total turnover.

[4] = Coefficient of variation for percentage of innovation active enterprises involved in innovation cooperation.

[5] = Coefficient of variation for total turnover per employee.
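For orientation, a coefficient of variation for a percentage such as indicator [1] could be obtained as sketched below. The report does not state which variance estimator was used, so the textbook stratified estimator with a finite population correction is an assumption here, and the input figures are invented.

```python
import math

def stratified_proportion_cv(strata):
    """CV of a stratified estimate of a proportion.

    strata -- iterable of (N_h, n_h, p_h): population size, realised sample
              size and observed proportion in stratum h.
    """
    strata = list(strata)
    N = sum(N_h for N_h, _, _ in strata)
    estimate = sum(N_h * p_h for N_h, _, p_h in strata) / N
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    variance = 0.0
    for N_h, n_h, p_h in strata:
        if n_h < 2:
            raise ValueError("each stratum needs at least 2 respondents")
        fpc = 1 - n_h / N_h  # equals 0 for a census stratum
        variance += (N_h / N) ** 2 * fpc * p_h * (1 - p_h) / (n_h - 1)
    return math.sqrt(variance) / estimate

# Invented strata: one sampled, one fully enumerated.
example = [(1000, 101, 0.5), (100, 100, 0.6)]
print(f"{stratified_proportion_cv(example):.2%}")
```

A fully enumerated stratum contributes no sampling variance, which is why strata covered by a census lower the overall CV.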

3.3 Non-sampling errors

Non-sampling errors occur in all phases of a survey. They add to the sampling errors (if present) and contribute to decreasing overall accuracy. It is important to assess their relative weight in the total error and devote appropriate resources for their control and assessment.

3.3.1 Coverage errors