Document: E/GIS/27/EN - Extended Version

EUROSTAT GISCO Progress Report August 2000

Original

Meeting of the Working Party

"Geographical Information Systems for Statistics"

Joint meeting with National Statistical Offices

and National Mapping Agencies

Luxembourg, October 17-18, 2000

Bech Building (Room "Ampère")

Beginning of the meeting: 10 a.m.

______

Management summary of activity report for 1999-2000

Working document concerning item 4 of the agenda for Day 1 of the meeting

Eurostat

GISCO

Geographic Information System

of the Commission

Progress Report 99/2000

October 2000

Management Summary

1. Introduction

2. GISCO Project structure

3. Project activities

3.1 Technical and contents aspects of the GISCO reference database

3.1.1 Contents of the database

3.1.2 Quality control

3.2 Reference database architecture

3.3 Spatial analysis

3.4 Cartography

4. Relations with the GISCO User Community

4.1 COGI

4.2 GISCO User Committee

4.3 GISCO Technical Committee

4.4 Annual joint NSI-NMA meeting

4.5 Relations with other Commission Directorates-General

5. Dissemination of GISCO information

Management Summary

In 1999 the GISCO project started a transition process which is not yet finished. Technical developments as well as new user needs continuously demand new solutions and efforts.

Major progress was achieved in revising the basic project architecture. After one year of intensive work and co-operation with the JRC in Ispra, clear technical decisions were taken. The final aim is to integrate spatial data into a relational database system. However, this choice will not be implemented yet because of software and data limitations. The pilot site for the new architecture will be maintained in parallel with the existing GISCO reference data until the end of LOT2 in November 2001. A replacement is envisaged only when the new software developments and the quality of the spatial data are mature enough for production processes.

The new architecture based on Oracle Spatial is currently implemented as a pilot site using the most detailed reference data available in GISCO. These data sets are converted and tests are ongoing for Intranet/Internet applications.

In parallel, the reference data in the GISCO database are being improved through several initiatives. Eurostat, together with DG INFSO, created an inter-service group on geographical information inside the Commission (COGI) in order to improve the co-ordination of actions in the field of geographical information.

The main updates of the contents of the GISCO reference database in the last year were made in order to comply with the latest available version of the Nomenclature of Territorial Units for Statistics (NUTS), published in 1999. Coverages with the regional breakdowns for the EFTA and CEC countries were also updated. The commune boundaries for 1997, based on SABE 1997, were a further data set updated in the past year. This data set is, however, still not linked to the NUTS. The current coverage includes the 15 Member States, the 4 EFTA countries and HU, LV, HR, SI, SK, CZ, PL, LT.

In the field of spatial analysis, another activity area of the GISCO project team, progress was achieved in four different projects. A study focussing on interregional migration in correlation with other socio-economic variables was finalised. A second study analysed GISCO and REGIO data with respect to Trans-European Networks and derived transportation richness indices for the European regions.

A third study, on the use of CORINE Land Cover to map population density and rural areas, was launched and is still ongoing. Because the communal territories in the EU countries differ widely in size, GISCO identified the need to present population density at a common resolution. The purpose is therefore to derive disaggregated population figures and portray population density on a grid. The objective of the spatial analysis exercise is to disaggregate population data by imputing different densities to different CORINE land cover categories.

Finally, lot 20 of SUP.COM 98 "Development of advanced corroboration-verification techniques for models embedded in Geographic Information Systems" investigated to what extent Uncertainty Analysis (UA) and Sensitivity Analysis (SA) can be used in an innovative way to address the robustness of GIS-based models, the optimisation of data collection procedures and the reliability and quality of GIS-based models.

The GISCO project also progressed in the field of mapping. New poster designs were developed and implemented. Posters were produced for a number of publications issued in the series "Statistics in Focus". Considerable progress was also achieved in studying how to portray the changes that occurred in passing from NUTS version 1995 to NUTS version 1999.

In October 1999 GISCO organised the annual joint meeting of the National Statistical Institutes and the National Mapping Agencies. The main points of discussion were the future plans of GISCO, in terms of both technology and geographic data, and alternative ways of data acquisition. The automatic management of the NUTS nomenclature, as a future solution to the currently manual updating process, was also on the agenda.

Furthermore, GISCO participated actively in the preparation of the new version of the REGIONS 2000 Statistical yearbook. The traditional presentation of statistical data in the form of endless tables was replaced by a completely new layout, providing text, tables, graphs and maps for each chapter. The new structure aims to make the publication more attractive to a broader readership and to filter the information for a non-specialist target group.

Special training sessions for the GISCO staff were organised in the field of spatial analysis and ARC/INFO.

Contracts were signed with three pilot NSIs in the UK, the Netherlands and Finland in order to establish a system for reporting changes in local units to Eurostat, including, if possible, the digital boundaries of the transferred territories.

A new call for tender was launched and awarded for 2 of the 5 GISCO lots, namely those dealing with technical support and advisory work in connection with GIS and with the revision and update of the contents of the GISCO database.

GISCO also co-operates with the JRC to build up a common web site for GI and GIS activities inside the Commission.

Co-operation with the different DGs is one of the main activities of the GISCO project and continued to play a very important role, as it is the driving force for all the actions mentioned above.

Other important tasks were completed in the period considered:

  • Elaboration of a number of thematic maps for Eurostat and the Commission;
  • Participation in spatial analysis projects in other Directorate Generals;
  • Review of the Eurostat dissemination policy of GISCO data;
  • Finalisation of the dissemination agreement with MEGRIN concerning SABE-based data sets;
  • Finalisation of the decentralisation in a pilot directorate of Eurostat (3 units involved);
  • Progress meeting of all lots subcontracted by GISCO.

1. Introduction

This report covers the period 1999/2000 and summarises the main activities and achievements of the GISCO project. After a short description of the project structure, the main activities of the team are presented. The fourth chapter deals with relations with users and partners. The last chapter deals with dissemination-related aspects.

2. GISCO Project structure

The GISCO project structure as defined in 1999 proved to be very efficient and remains unchanged.

The GISCO team consists of 8 full-time internal members. An additional 5 man-years equivalent is subcontracted to three different companies/institutions. The software used for database management is ARC/Info 8.1. Arc/View 3.2 is used as the desktop mapping tool.[1]

3. Project activities

The centre of all activities is the GISCO reference database. All such activities are managed by an internal project manager who defines the priorities and work programme for the respective areas. GISCO identified 5 working areas (lots), which are subcontracted to companies and organisations[2] specialised in the corresponding fields.

3.1 Technical and contents aspects of the GISCO reference database

Two lots[3] deal with technical support and with the revision and updating of the contents of the GISCO reference database. The contents of the GISCO reference database are defined by the needs of GI and GIS users inside the Commission. These are mainly the policy DGs that use GIS as a policy support tool. The technical lots ensure that the data requirements (if data can be acquired or are available) are integrated in an appropriate way and that quality controls of the integrated data are performed.

The achievements of this year are the following:

3.1.1 Contents of the database

- The communes 1M 1997 data set has been created from SABE97. The coastline and lakes have been integrated for the various countries, and all available country data sets have been integrated into one seamless data set. The applications (ATOOLS) have been adapted and tested, and the documentation in the database manual has been updated. The 1997 communes have been integrated in the reference database without a codification based on NUTS level 5, which is not yet available for this reference year. The current coverage includes the 15 Member States, the 4 EFTA countries and HU, LV, HR, SI, SK, CZ, CY, AN.

- A new version of the road (RD) and railway (RW) database has been supplied by DG TREN to GISCO. A data quality assessment was carried out, after which the data were integrated into the reference database. Data on ports and airports have also been integrated in the reference database. Traffic and infrastructure data for RD and RW at segment level have been included for some Member States.

- An update of the biotope data set (BP) and the designated area (DA) database was received from the EEA and integrated.

- A layer on national support areas (NT) in the CS theme has been created (based on information from DG COMP). The data set describes, at commune level, the areas eligible for national support by Member States.

- A new water pattern data set was integrated.

- A gazetteer (GZ) from the U.S. National Imagery and Mapping Agency (NIMA) has been purchased by GISCO. Due to the size of the database, the data will be integrated successively in phases.

- New layers for the Structural Funds and the Interreg III regions valid for the new budget period (2000-2006) have been included.

- Parts of the MARS meteo database were integrated.

The "ArcEurope Base Map" was evaluated by GISCO for possible inclusion in the GISCO reference database. ESRI Data & Maps - ArcEurope Base Map - provides an extensive set of data for Europe. The suggested display scales are 1:100,000 as the largest scale and 1:1,000,000 as the most appropriate scale. A thorough analysis was performed, and selected coverages were compared with the corresponding GISCO reference database coverages.

The analysis showed that the ArcEurope coverages are of very good quality in terms of resolution, corresponding to a scale of 1:250,000 or better. However, the effort required to include certain coverages in GISCO would be high. In particular, the correction of certain coverages and the codification would be laborious and not worth the effort. ArcEurope can be considered a database that follows its own internal logic but is not compliant with GISCO. The road/rail network, though very complete, is not at all compatible with GISCO.

The next GISCO Technical Committee will consider whether ArcEurope should be stored under the GISCO reference (theme id: integrated database) and be used as background coverage(s). Analysis could be carried out without introducing codification and specific attributes in order to produce seamless coverages.

3.1.2 Quality control

A preliminary approach to the quality of the GISCO reference database was elaborated in the past year. The objective of this exercise is to develop a quality measurement for precision - defined as the degree of detail recorded for measurements and expressed in terms of resolution - that will enable us to identify errors and segments with extreme lengths and eventually to compare different coverages and/or updates. It is also intended to prompt discussion with the data providers for GISCO on issues relevant to the quality of GIS data.

The GISCO reference database comprises geographical information at different levels of abstraction and thus at different scales and resolutions. The data analysed throughout this exercise are from the CoMmunes layer of the ADministrative boundaries theme of the GISCO reference database. The data refer to NUTS level 5 boundaries for 1981, 1991 and 1995, at scales of 1:100,000 and 1:1,000,000 with nominal resolutions of 30 m and 200 m respectively. Additionally, we analysed data from SABE 1997 at the same scales and resolutions, before their integration into the GISCO reference database. NUTS level 5 boundaries were selected because this information is the finest in terms of scale and resolution available in the GISCO database from official sources. The data providers to SABE are the national mapping agencies under the umbrella of MEGRIN, which hold the copyright for the administrative boundaries.

The segments of a geographical coverage do not represent a statistical population on which statistics can be carelessly applied. However, findings based on descriptive statistics allow us to have a general idea of the characteristics of GISCO coverages and propose interventions.

Descriptive statistics on the length of segments for each coverage having polygon/line topology were produced in three steps. First, the segments of interest (i.e. country, coastline, land boundaries, international boundaries, etc.) were extracted to a text file with an indication of the arc, the end-vertex identification number and the Euclidean distance between the vertices. Second, the text file was imported into an ACCESS database. Due to the huge number of records, it was necessary to produce frequency tables for a given interval, arbitrarily set at 50 m. Third, the frequency tables were imported into EXCEL, where the final statistics were calculated.
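The three steps can be illustrated with a minimal Python sketch. It is not the ARC/INFO, ACCESS and EXCEL procedure actually used; the function names, the sample arc and the choice of statistics (count, mean from class midpoints, median class) are assumptions made for illustration only. The 50 m interval is the one quoted in the text.

```python
import math
from collections import Counter

BIN_WIDTH_M = 50.0  # class interval quoted in the text


def segment_lengths(vertices):
    """Euclidean distance between each pair of consecutive vertices of an arc."""
    return [math.dist(a, b) for a, b in zip(vertices, vertices[1:])]


def frequency_table(lengths, bin_width=BIN_WIDTH_M):
    """Count segments per length class; key k covers [k*w, (k+1)*w)."""
    return Counter(int(length // bin_width) for length in lengths)


def grouped_statistics(table, bin_width=BIN_WIDTH_M):
    """Approximate statistics from the binned counts, using class midpoints."""
    n = sum(table.values())
    mean = sum((k + 0.5) * bin_width * c for k, c in table.items()) / n
    # Median class: first class at which the cumulative count reaches n/2.
    cumulative = 0
    median_class = None
    for k in sorted(table):
        cumulative += table[k]
        if cumulative >= n / 2:
            median_class = (k * bin_width, (k + 1) * bin_width)
            break
    return {"count": n, "mean": mean, "median_class": median_class}


# Hypothetical arc in projected metre coordinates (not GISCO data).
arc = [(0.0, 0.0), (36.0, 48.0), (36.0, 168.0), (351.0, 588.0), (351.0, 608.0)]
lengths = segment_lengths(arc)      # step 1: segment lengths
table = frequency_table(lengths)    # step 2: 50 m frequency table
stats = grouped_statistics(table)   # step 3: statistics from the table
```

Because the statistics are computed from the frequency table rather than from the individual records, the mean is only an approximation based on class midpoints; this is the price of reducing a very large number of records to a manageable table.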

One of the important factors influencing any analysis of the measurements of the arcs is the type of segment, i.e. natural features such as rivers, man-made features, etc. In our case, we did not categorise the arcs beyond the fact that they represent administrative boundaries, though our initial intention was to separate the coastline from boundaries on land.

The results were presented by country, by scale version and by temporal version.

3.2 Reference database architecture

The third activity area refers to the revision of the GISCO reference database. The GISCO architecture is currently built around Arc/Info, a file-based proprietary system. Numerous technical developments in the field of geographical information systems have taken place since the beginning of the project in 1992 (geographical information in relational databases accessible online, integration of geographical data and alphanumeric data in end-user applications, etc.). A considerable review of this architecture has now, in the third year of lot 2, led to a description of the migration of the GISCO spatial data to a database environment. The idea is to move the GI system from a file-based environment to an open database system, thus combining non-spatial tabular data with spatial information. The JRC in Ispra is in charge of this lot.

Major aims of the project are:

- to combine spatial and attribute data in one single database;

- to include the management of spatial data in the database management;

- to introduce an object-oriented approach for modelling spatial data, including the establishment of dependencies between spatial objects;

- to open the use of geographic information systems to less skilled users (no longer an exclusive expert domain);

- to exploit more common user interfaces (e.g. web browser);

- to introduce life cycles for geometric objects by assigning start and end dates (thus allowing a map to be produced for a certain point in time by selecting the spatial objects valid at that time).
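The last aim, selecting the spatial objects valid at a given point in time, can be sketched in Python as follows. This is an illustration only and does not reflect the actual GISCO data model or its Oracle Spatial implementation; the class, the field names, the region codes and the convention that the end date is exclusive are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SpatialObject:
    code: str                  # hypothetical identifier, e.g. a region code
    valid_from: date           # start of the object's life cycle
    valid_to: Optional[date]   # end of the life cycle (exclusive); None = still valid


def valid_on(objects, day):
    """Return the objects whose life cycle covers the given day."""
    return [
        o for o in objects
        if o.valid_from <= day and (o.valid_to is None or day < o.valid_to)
    ]


# Hypothetical example: region "A" is split into "A1" and "A2" on 1 January 1999.
objects = [
    SpatialObject("A", date(1995, 1, 1), date(1999, 1, 1)),
    SpatialObject("A1", date(1999, 1, 1), None),
    SpatialObject("A2", date(1999, 1, 1), None),
    SpatialObject("B", date(1995, 1, 1), None),
]

codes_1997 = [o.code for o in valid_on(objects, date(1997, 6, 30))]
codes_2000 = [o.code for o in valid_on(objects, date(2000, 6, 30))]
```

With an exclusive end date, an object that is replaced on a given day and its successors never appear on the same map, which is the behaviour one would want when passing from one version of a nomenclature to the next.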

For testing purposes, a licence of ORACLE Spatial 8.1.6 was installed in the Computing Centre in Luxembourg. Based on the new GISCO data model, some of the themes were converted to ORACLE Spatial (administrative boundaries, land cover and the geographic names database). With this implementation, data could only be processed and queried via SQL.

The JRC in Ispra therefore developed, in co-operation with GISCO, a web application following Open GIS specifications. The corresponding map server/web server software has been compiled and installed on Eurostat premises. This application allows users to query and display on the web features from the gazetteer together with the land cover and administrative boundaries stored in ORACLE Spatial. The product is, however, not yet mature enough to be put into production.

3.3 Spatial analysis

Spatial analysis projects are often subcontracted, as they generally require specific technical know-how and infrastructure as well as multidisciplinary teams.

Currently there is only one spatial analysis project ongoing. It arises from the divergence in size of the communal territories in the EU countries. We identified the need to present population density at a common resolution, eliminating the difference in size of the NUTS 5 units. The purpose was therefore to derive disaggregated population figures and portray population density on a grid. The objective of the spatial analysis exercise, undertaken by the Space Applications Institute of the Joint Research Centre (Ispra), was to disaggregate population data by imputing different densities to different CORINE land cover categories.

Furthermore, our intention was to develop an interactive presentation and mapping tool for a better understanding of different habitat types - higher or lower population density depending on the countries and linked to longitude and latitude - for the specific CORINE land cover classes.
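The principle of disaggregation can be sketched in a few lines of Python. This is not the method developed by the Joint Research Centre; the function, the land cover labels and the relative density weights are hypothetical and serve only to show how a commune total can be spread over grid cells in proportion to weights attached to land cover classes.

```python
def disaggregate(population, cells, density_weight):
    """Distribute one commune's population over its grid cells.

    cells: list of (cell_id, land_cover_class) pairs
    density_weight: relative population density per land cover class
    """
    weights = [density_weight.get(lc, 0.0) for _, lc in cells]
    total = sum(weights)
    if total == 0:
        # No populated class in the commune: fall back to a uniform split.
        weights = [1.0] * len(cells)
        total = float(len(cells))
    return {cid: population * w / total for (cid, _), w in zip(cells, weights)}


# Hypothetical relative weights; these are not CORINE-derived values.
weights = {"urban": 10.0, "agricultural": 1.0, "forest": 0.0}

# Hypothetical commune with 1200 inhabitants covering four grid cells.
cells = [("c1", "urban"), ("c2", "agricultural"),
         ("c3", "agricultural"), ("c4", "forest")]
grid_population = disaggregate(1200, cells, weights)
```

The commune total is preserved by construction, so the gridded figures remain consistent with the published statistics while no longer depending on the size of the commune.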

The exercise had three phases:

- attributing a population density to each land cover grid cell;

- studying the behaviour of the population density per land cover class across the EU;