LUCAS 2012 sampling design

(Internal Note – draft Alessandra Palmieri)

1. Introduction

1.1 Main features of 2012 sampling design

The survey design methodology of LUCAS 2012 is similar to that of LUCAS 2009[1]: an area frame sampling scheme, a stratified two-phase sample, multivariate allocation, use of auxiliary information to improve efficiency, and a panel approach.

Most of the points visited in the field in the 2008/2009 survey are also surveyed in 2012 in order to maintain the panel approach[2].

At national level, a significant redistribution of points has taken place on the basis of the 2009 survey results; in particular, the total number of points per country was adjusted according to the diversity of the landscape, measured with the Shannon Index of the 2009 transects (see section 2.1.1).

Within each estimation domain (the combination of NUTS2 region and stratum), points not reached in 2009 were replaced with more “accessible” ones, according to auxiliary information (small scale elevation models, detailed road network data, etc.).

2. SAMPLING DESIGN

2.1 ALLOCATION

2.1.1 Sample size per country

The total number of points is almost the same in the two surveys: 263,464 points in 2008/2009 and 270,389 points in 2012. The final figures are in Annex 1.

The more varied the land cover is, the more points are needed for precise estimates. The variety was measured using the Shannon Index of the 2009 transect data, expressed as the Shannon Evenness Index (SEI). All countries with an SEI above the EU average (SEI = 0.64) were allocated more points.

In 2008/2009 the total sample size of each country was, on average, about 6% of its total area (that is, about 6 points per 100 km²); in 2012 the following conditions were applied (Table 1):

Table 1: LUCAS 2012 Shannon Index and number of points

Value of SEI in 2009 / Number of points
If SEI >= 0.7 / 8% of total area
If SEI >= 0.65 / 7% of total area
If SEI ≈ 0.64 / 6% of total area
If SEI < 0.64 / 5% of total area
In FI, due to accessibility / 4% of total area
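The rule in Table 1 can be read as a simple threshold lookup. The sketch below is illustrative only: the function name is hypothetical, and the "about 0.64" row is assumed to mean 0.64 <= SEI < 0.65, which the table does not state explicitly. Because published SEI values are rounded, a country close to a threshold may fall in a different band than the rounded value suggests.

```python
def sampling_rate_percent(sei, country=None):
    """Return the 2012 sample size as a percentage of total area (Table 1)."""
    # Finland is treated separately because of accessibility.
    if country == "FI":
        return 4
    if sei >= 0.70:
        return 8
    if sei >= 0.65:
        return 7
    # Assumption: the "about 0.64" row covers 0.64 <= SEI < 0.65.
    if sei >= 0.64:
        return 6
    return 5


# Example: a country with SEI 0.67 falls in the 7% band.
rate = sampling_rate_percent(0.67)
```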

For Malta and Cyprus all points in the master sample are surveyed, as no stratification results are available for those countries. The theoretical sample consisted of 270,389 points.

2.1.2 Sample size within the estimate-domains (NUTS2_STRATA)

The adopted design is a two-phase sample for stratification. The LUCAS first phase sample is a systematic sample with points spaced 2 km apart in the four cardinal directions, covering the whole territory of the EU[3]. The first phase sample is used only to stratify the population. It included a total of 1,077,247 points[4]. Each point of the first phase sample was photo-interpreted and assigned to one of 7 pre-defined land cover strata. The results of the stratification, conducted in 2005, are reported in Table 2[5].

Table 2: Stratification results

From the stratified first phase sample, a sub-sample of points (the field sample) is extracted in 2012 to be classified during field visits according to the full land classification.

For the second phase sample, the sample size is defined, as in 2009, for each combination of stratum (land cover class) and domain (NUTS2 region) using the Bethel multivariate optimal allocation, which is a generalisation of the Neyman univariate optimal allocation[6].
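For orientation, the univariate Neyman allocation that Bethel's method generalises distributes a fixed total sample across strata in proportion to N_h * S_h (stratum size times within-stratum standard deviation). The sketch below shows only that univariate case, not the Bethel procedure used in LUCAS; the function name and the figures are hypothetical.

```python
def neyman_allocation(n_total, stratum_sizes, stratum_sds):
    """Split n_total across strata in proportion to N_h * S_h."""
    weights = [N * S for N, S in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    # Fractional sizes are returned; rounding is left to the caller.
    return [n_total * w / total for w in weights]


# Hypothetical domain with three strata (sizes and standard deviations invented).
sizes = [1000, 500, 200]
sds = [0.5, 0.3, 0.1]
allocation = neyman_allocation(100, sizes, sds)
```

Larger and more variable strata receive more points; a multivariate method has to balance this across several target variables at once.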

The sampling scheme based on multivariate optimal allocation (Bethel, 1989) takes into account a set of land cover classes; upper bounds for the coefficient of variation (% values) were fixed based on the experience gained in previous LUCAS surveys (Table 3).

Table 3: Upper-bound of expected error by Land Cover classes

Land Cover classes / Upper-bound of the expected error
Cereals / 15%
Root crops, Vegetables, floriculture, ornamental plants and strawberries / 25%
Fibre and oleaginous crops, non permanent industrial crops / 25%
Fodder and temporary grassland / 25%
Permanent crops and nursery / 25%
Grassland / 7.5%
Broadleaved woodland / 20%
Coniferous woodland / 20%
Mixed woodland / 20%
Shrubland / 20%
Bare land / 20%
Artificial areas / 15%
Water / 20%
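The bounds in Table 3 act as constraints: an allocation is acceptable only if the expected coefficient of variation of every land cover class stays below its bound. The sketch below checks one given allocation and does not perform the Bethel optimisation itself. It assumes a stratified estimate of a class proportion with binomial within-stratum variance and no finite population correction; the function names are hypothetical.

```python
import math


def stratified_cv_percent(stratum_sizes, sample_sizes, proportions):
    """CV (%) of the stratified estimate of a class proportion in one domain."""
    population = sum(stratum_sizes)
    estimate = 0.0
    variance = 0.0
    for N_h, n_h, p_h in zip(stratum_sizes, sample_sizes, proportions):
        W_h = N_h / population
        estimate += W_h * p_h
        # Assumption: binomial variance, finite population correction ignored.
        variance += W_h ** 2 * p_h * (1.0 - p_h) / n_h
    return 100.0 * math.sqrt(variance) / estimate


def meets_bounds(cvs, upper_bounds):
    """True if every class CV is at or below its upper bound."""
    return all(cvs[name] <= upper_bounds[name] for name in cvs)


# Two of the bounds from Table 3.
bounds = {"Cereals": 15.0, "Grassland": 7.5}
```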

The minimum number of units in each domain is fixed at 4. In addition, the new size of the domains that were not covered[7] in 2009 is computed in proportion (around 25%) to the new total country size (first phase sample).

2.2 POINTS SELECTION

Most of the points visited in the field in the 2008/2009 survey are also surveyed in 2012 in order to maintain the panel approach: 203,277 points from 2009 were retained in the final 2012 sample.

A preliminary analysis of the points photo-interpreted (PI) ex ante and in situ was carried out, together with an analysis of the road distance to each point and of the height difference between the point and the closest road.

In the 2009 survey it was agreed to photo-interpret 29,986 points before the field survey took place because of accessibility problems (25% of the points in FI and SE, owing to the large and not easily accessible northern forests and lakes, and 10% of the points in other countries, namely points in the mountains and in large forest areas). The reason was that Eurostat did not have adequate data sources at its disposal to assess the accessibility of the points beforehand (small scale elevation models, detailed road network data, etc.). The photo-interpretation of these points was done using the most recent orthophotos available. Those points were excluded from the selection of the 2012 sample.

The points that were photo-interpreted in situ in the 2008/2009 survey, because they were found to be inaccessible in the field, were replaced by points in the same stratum and NUTS2 region. Accessibility (detailed TeleAtlas road network and slope) is taken into account in the replacement process to make sure that the points are as accessible as possible in terms of road distance and height difference.
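The replacement step can be pictured as choosing, among candidate points of the same stratum and NUTS2 region, the one that is easiest to reach. The sketch below is one possible reading, not the procedure actually applied: the field names, the region codes and the ranking rule (road distance first, then height difference) are assumptions made for illustration.

```python
def pick_replacement(candidates, nuts2, stratum):
    """Return the most accessible candidate in the same NUTS2 region and stratum."""
    same_domain = [
        p for p in candidates
        if p["nuts2"] == nuts2 and p["stratum"] == stratum
    ]
    if not same_domain:
        return None
    # Assumption: rank by road distance, then by height difference.
    return min(same_domain, key=lambda p: (p["road_dist_m"], p["height_diff_m"]))


# Hypothetical candidate points.
candidates = [
    {"id": "a", "nuts2": "X1", "stratum": 1, "road_dist_m": 800, "height_diff_m": 20},
    {"id": "b", "nuts2": "X1", "stratum": 1, "road_dist_m": 150, "height_diff_m": 60},
    {"id": "c", "nuts2": "X2", "stratum": 1, "road_dist_m": 10, "height_diff_m": 0},
]
chosen = pick_replacement(candidates, "X1", 1)
```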

In general, the elevation threshold above which points were excluded was raised from 1000 m to 1500 m.

3. SAMPLE FOR PILOT TRANSECT LENGTH

In the LUCAS survey the transect is a straight line of 250 m that the surveyor walks eastwards, recording all the different Land Covers (LC) and Linear Features (LF) encountered. The transect is particularly useful for computing diversity indexes; to give a reasonable weight to the different LC and LF it is important to associate a measure of length with each of them. In 2012 a pilot exercise measuring the transect length of each Land Cover and Linear Feature was carried out on around 1300 points.
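To illustrate why lengths matter, the sketch below computes a Shannon Evenness Index in its standard form, H / ln(S) with H = -sum(p_i ln p_i), first from plain counts and then from lengths. The exact categories and normalisation used in LUCAS are not given in this note, so the function and the example transect are assumptions for illustration; returning 0 for a single category is a convention chosen here.

```python
import math


def shannon_evenness(weights):
    """Shannon Evenness Index: H / ln(S), with H = -sum(p_i * ln(p_i))."""
    positive = [w for w in weights if w > 0]
    if len(positive) < 2:
        return 0.0  # convention chosen here for a single category
    total = sum(positive)
    shares = [w / total for w in positive]
    h = -sum(p * math.log(p) for p in shares)
    return h / math.log(len(positive))


# Hypothetical 250 m transect crossing three land covers.
unweighted = shannon_evenness([1, 1, 1])           # each cover counted once
length_weighted = shannon_evenness([200, 25, 25])  # covers weighted by metres
```

With equal counts the index is at its maximum, whereas weighting by length lowers it when one cover dominates the transect.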

The aim of this pilot is to perform a cost/benefit analysis for the possible inclusion of transect measurement in future LUCAS surveys, evaluating the burden (e.g. the extra time needed compared with the 2009 round) and the impact on transect indicators of using weights (driven by lengths).

The criteria for sampling the transects to be walked and measured are the following: the total number of points by country is computed as 0.4% of the sample (Table 4), and a minimum of 4 transects was forced, where possible, in each NUTS2 region. Then, once the total number of points by country was fixed, points were allocated within each NUTS2 region according to the Shannon Evenness Index (SEI): an index comparing the NUTS2 SEI with the national average SEI (NUTS0) was computed.
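The criteria above can be sketched as follows. The base count is simply 0.4% of the country sample, rounded. The regional step is an assumption: the note does not give the exact formula, so the sketch shares the country total in proportion to the ratio of NUTS2 SEI to national SEI and then enforces the minimum of 4. Function names and region labels are hypothetical.

```python
def base_transects(sample_size, rate=0.004):
    """Base number of transects for a country: 0.4% of its sample, rounded."""
    # Python rounds exact halves to the nearest even integer.
    return round(sample_size * rate)


def allocate_to_regions(n_country, region_sei, sei_nuts0, minimum=4):
    """Share n_country across NUTS2 regions by relative SEI, then apply the minimum."""
    index = {region: sei / sei_nuts0 for region, sei in region_sei.items()}
    total = sum(index.values())
    return {
        region: max(minimum, round(n_country * value / total))
        for region, value in index.items()
    }


# Hypothetical country with three NUTS2 regions and a national SEI of 0.7.
alloc = allocate_to_regions(30, {"R1": 0.8, "R2": 0.7, "R3": 0.6}, 0.7)
```

Enforcing the minimum can push the final number of transects above the base figure in countries with many NUTS2 regions.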

Transects eligible for this pilot exercise belong to points that were part of the 2009 survey and were observed in the field. The transect selection was carried out by drawing with variable probabilities: transects with a higher SEI had a higher probability of being selected.

As a result, the transects to be measured are among the most diversified ones.
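A draw with variable probabilities can be implemented in several ways; the note does not say which one was used. The sketch below uses the Efraimidis-Spirakis weighted sampling scheme (each unit gets the key u^(1/w) and the k largest keys are kept) purely as an illustration, with the SEI taken as the weight. Point identifiers and the function name are hypothetical.

```python
import random


def draw_transects(sei_by_point, k, seed=None):
    """Draw k points without replacement, favouring points with higher SEI."""
    rng = random.Random(seed)
    keyed = []
    for point_id, sei in sei_by_point.items():
        if sei <= 0:
            continue  # a zero weight means the point can never be drawn
        u = 1.0 - rng.random()  # uniform in (0, 1]
        keyed.append((u ** (1.0 / sei), point_id))
    keyed.sort(reverse=True)
    return [point_id for _, point_id in keyed[:k]]


# Hypothetical eligible points with their 2009 SEI.
eligible = {"p1": 0.9, "p2": 0.4, "p3": 0.7, "p4": 0.0, "p5": 0.6}
selected = draw_transects(eligible, k=2, seed=2012)
```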

Table 4 below shows the distribution of transects to be measured by country.

Table 4: Transects to be measured by country

Country[8] / Sample 2012 / Evenness Index in 2009 / Base transect 2012 (0.4%) / % / Final transect sample / % of total sample
AT / 6474 / 0.76 / 26 / 0.40 / 36 / 0.56
BE / 2448 / 0.73 / 10 / 0.41 / 44 / 1.80
CZ / 5515 / 0.69 / 22 / 0.40 / 32 / 0.58
DE / 24941 / 0.69 / 100 / 0.40 / 156 / 0.63
DK / 3445 / 0.74 / 14 / 0.41 / 20 / 0.58
EE / 2202 / 0.54 / 9 / 0.41 / 9 / 0.41
ES / 35378 / 0.67 / 142 / 0.40 / 143 / 0.40
FI / 13483 / 0.60 / 54 / 0.40 / 55 / 0.41
FR / 38343 / 0.67 / 153 / 0.40 / 152 / 0.40
GR / 7891 / 0.63 / 32 / 0.41 / 40 / 0.51
HU / 4640 / 0.59 / 19 / 0.41 / 28 / 0.60
IE / 3489 / 0.44 / 14 / 0.40 / 14 / 0.40
IT / 21019 / 0.70 / 84 / 0.40 / 84 / 0.40
LT / 3889 / 0.65 / 16 / 0.41 / 16 / 0.41
LU / 215 / 0.76 / 1 / 0.47 / 4 / 1.86
LV / 4421 / 0.68 / 18 / 0.41 / 18 / 0.41
NL / 2241 / 0.64 / 9 / 0.40 / 48 / 2.14
PL / 21806 / 0.66 / 87 / 0.40 / 87 / 0.40
PT / 7338 / 0.75 / 29 / 0.40 / 30 / 0.41
SE / 22431 / 0.65 / 90 / 0.40 / 96 / 0.43
SI / 1618 / 0.72 / 6 / 0.37 / 8 / 0.49
SK / 2452 / 0.60 / 10 / 0.41 / 16 / 0.65
UK / 12265 / 0.55 / 49 / 0.40 / 148 / 1.21
EU23 / 247944 / 0.64 / 994 / 0.40 / 1284 / 0.48

References

Palmieri A., Martino L., Dominici P. & Kasanko M. (2011) 'Land Cover and Land Use Diversity Indicators in LUCAS 2009 data', Conference on Land quality and land use information in the European Union, Keszthely (HU), 2011.

Palmieri A., Dominici P., Martino L. & Kasanko M. (2011) 'Diversified landscape structure in EU Member States. Landscape indicators from LUCAS 2009 Survey', Statistics in Focus, 21, Eurostat.

Martino L., Palmieri A. & Gallego J. (2009) 'Use of auxiliary information in the sampling strategy of a European area frame agro-environmental survey', in: Proceedings of the First Italian Conference on Survey Methodology (ITACOSM09), Specialized Session 5: Agricultural Surveys in European countries, June 10-12 2009, Siena, Italy.

Martino L. & Fritz M. (2008) 'New insight into land cover and land use in Europe', Statistics in Focus, 33, Eurostat, Luxembourg.

Bethel J. (1989) 'Sample Allocation in Multivariate Surveys', Survey Methodology, 15, pp. 47-57.

Annex 1 – History of the LUCAS sample

Country / Number of points in the base sample (1x1 km grid) / Number of points in the master sample (2x2 km grid) FIRST PHASE SAMPLE / Number of points in the second phase sample (2009) / Number of points in the second phase sample (2012)
Austria (AT) / 83 912 / 20962 / 4969 / 6474
Belgium (BE) / 30 683 / 7655 / 1808 / 2448
Bulgaria (BG) / 110 968 / 27510 / 8100 / 6643
Cyprus (CY) / 9 238 / 1442 / - / 1442
Czech Republic (CZ) / 78 861 / 19716 / 4674 / 5515
Denmark (DK) / 43 399 / 10822 / 2554 / 3445
Estonia (EE) / 45 326 / 11340 / 2680 / 2202
Finland (FI) / 390 899 / 84519 / 19946 / 13483
France (FR) / 549 223 / 137296 / 32417 / 38343
Germany (DE) / 357 820 / 89406 / 21157 / 24944
Greece (EL) / 133 719 / 33432 / 7819 / 7891
Hungary (HU) / 93 020 / 23268 / 5513 / 4638
Ireland (IE) / 70 258 / 17537 / 4165 / 3489
Italy (IT) / 301 406 / 75346 / 17851 / 21018
Latvia (LV) / 64 596 / 16142 / 3827 / 4420
Lithuania (LT) / 65 311 / 16327 / 3864 / 3890
Luxembourg (LU) / 2 590 / 646 / 153 / 215
Malta (MT) / 314 / 79 / - / 79
Netherlands (NL) / 37 347 / 9297 / 2461 / 2241
Poland (PL) / 312 620 / 78124 / 18530 / 21806
Portugal (PT) / 89 071 / 22257 / 5426 / 7338
Romania (RO) / 241 360 / 59250 / 20364 / 14281
Slovak Republic (SK) / 49 033 / 12262 / 2895 / 2455
Slovenia (SI) / 20 258 / 5058 / 1201 / 1621
Spain (ES) / 498 519 / 124610 / 29917 / 35378
Sweden (SE) / 449 399 / 112332 / 26665 / 22431
United Kingdom (UK) / 248 761 / 62133 / 14508 / 12265
EU23 / 4 016 031 / 990487 / 235000 / -
EU25 / 4 368 359 / 1077247 / 263464 / -
EU27 / 4 377 911 / 1078768 / - / 270389
EU23: excludes BG, CY, MT and RO
EU25: excludes CY and MT

[1] Martino L., Palmieri A. & Gallego J. (2009)

[2] For the 11 MS that took part in the Pilot LUCAS 2006, a 3-year series is available, which is valuable for time series analysis.

[3] No stratification was available for Cyprus and Malta

[4] Excluding 1,521 points in Cyprus and Malta

[5] Martino L. & Fritz M. (2008)

[6] Bethel J. (1989)

[7] 34 domains not covered by the 2009 estimates, belonging to the following NUTS2 regions: ES53, EL22, EL41, EL42 and FI20

[8] In 2009 the LUCAS survey covered 23 MS