13th International Roundtable on Business Survey Frames

Paris: 27 September-1 October 1999

Summary: Session 4 - Technology

1. Introduction

Technological advances in developing, maintaining, and using business survey frames are critically important to the countries participating in the Roundtable as we continue to improve the efficiency of our survey frames and processes. The goal of this session is to share technological developments, practices, and approaches that may be useful and applicable to other countries. The following is a brief summary of the papers presented in this session and the technological issues discussed.

2. Coordination of Samples: the Microstrata Methodology (Pascal Riviere, INSEE, France)

INSEE has developed the microstrata methodology, within the framework of a Eurostat project, to control the amount of overlap between survey samples, and has incorporated it into its Salomon sampling system. In this methodology, population units within the intersection of the sampling strata from specified surveys are sorted using several cumulative burden measures. The system can be used to provide positive or negative coordination with the specified surveys, depending on the needs of the survey program. The paper provides detailed documentation and discusses conditions for application.
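The sorting idea described above can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not INSEE's Salomon implementation: the unit fields (`id`, `strata`, `burden`), the per-microstratum allocation `take`, and the function name are all hypothetical. A microstratum is taken here to be the set of units sharing the same stratum in every specified survey, and the burden measures are compared in order, with a random tie-break.

```python
import random
from collections import defaultdict


def select_by_microstrata(units, take, negative=True, seed=0):
    """Select units within each microstratum by cumulative burden.

    units    : list of dicts with 'id', 'strata' (tuple, one stratum per
               survey) and 'burden' (tuple of cumulative burden measures).
    take     : dict mapping a strata tuple to the number of units to select
               (hypothetical allocation, supplied by the caller).
    negative : True selects the least-burdened units (negative coordination);
               False selects the most-burdened units (positive coordination).
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for unit in units:
        groups[unit["strata"]].append(unit)

    selected = []
    for key, members in groups.items():
        rng.shuffle(members)  # random order among units with equal burden
        # list.sort is stable, so the shuffled order breaks ties.
        members.sort(key=lambda u: u["burden"], reverse=not negative)
        selected.extend(u["id"] for u in members[: take.get(key, 0)])
    return selected


# Invented example data: two surveys, so each strata tuple has two entries.
units = [
    {"id": "a", "strata": ("S1", "T1"), "burden": (0, 0)},
    {"id": "b", "strata": ("S1", "T1"), "burden": (2, 1)},
    {"id": "c", "strata": ("S1", "T1"), "burden": (1, 0)},
    {"id": "d", "strata": ("S1", "T2"), "burden": (3, 2)},
]
take = {("S1", "T1"): 2, ("S1", "T2"): 1}

negative_sample = select_by_microstrata(units, take, negative=True)
positive_sample = select_by_microstrata(units, take, negative=False)
```

With negative coordination the sketch favours units that have carried the least burden so far; with positive coordination it favours units already selected elsewhere.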

3. From the Business Register to National Accounts (Julia Cravo, INE, Portugal)

This paper describes the Annual Business Survey program at the National Statistics Institute in Portugal, which is used to establish the National Accounts. The Business Register, initially created in 1988, is the core of the survey program. The paper describes the structure, content, data sources, and updating of the register. A stratified sample for the Annual Business Survey (IEH) is selected from the Business Register. The IEH is the key input to the National Accounts and is one of the main sources for updating the Business Register. The structure of the National Accounts is described, and the use of the Business Register and the IEH for estimating National Accounts data values is documented.

4. Developing the Statistical Register as a Component of the One Window Registration System (Jozsef Kophazi, Hungarian Central Statistical Office)

This paper describes the on-line registration system used by Hungary's Central Statistical Office to add new units to its business register, which was recently migrated to a centralized UNIX/ORACLE database. This system was first implemented in 1998. New legal units are initially entered into the system at the Registration Court, the local Chamber of Commerce and Industry, or the local Chamber of Agriculture. There are more than one hundred of these offices in Hungary. They are connected physically by an open network and the Internet. Data transmission is secured by using appropriate e-mail and EDI protocols, as well as an encryption system. The initial registration data are sent to the Central Statistical Office. The data are edited, each unit is assigned an identification number, and the records are forwarded to the Tax Office and back to the registration office for further processing.

5. Linking Administrative and Statistical Units (Harrie van der Ven, Statistics Netherlands)

Administrative data need to be easily linked to the statistical units on the business register. At Statistics Netherlands, tax data files are an important administrative data source. Currently, different statistical departments use different algorithms to link the tax data to their statistical units. The paper proposes data models for the relationships between legal units, enterprise groups, and enterprises on the business register and between legal units and tax units on the administrative tax files. Based on these models, the paper proposes a general model for linking statistical units on the business register to the tax data from the tax files, and discusses many of the complexities involved.

6. Record Linkage at NASS Using Automatch (Kara Broadbent and Bill Iwig, National Agricultural Statistics Service, USDA, USA)

To maintain high coverage, new entities must continually be added to the business register. Each year, the National Agricultural Statistics Service (NASS) has access to many different agricultural list sources that are used to identify agricultural operations not on its current farm register. NASS has recently developed a new record linkage system for merging list sources, which uses the commercial software AutoStan and AutoMatch as the record standardization and record linkage engines, respectively. Based on the Fellegi-Sunter record linkage theory, the system creates linkage groups containing matched records, possible matched records, and non-matched records. The number of linkage groups classified as possible matches depends on the record linkage parameters used and the acceptable risk of false matches and false non-matches. The linkage groups are loaded into a resolution database and are reviewed through a new clerical review system developed using PowerBuilder software. The paper discusses the application of record linkage methodology at NASS and describes the different resolution screens used in the system.
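The three-way classification described above follows from the Fellegi-Sunter decision rule, which can be sketched as follows. This is a generic illustration of the theory, not the AutoMatch interface or the NASS configuration: the field names, the m- and u-probabilities, and the two thresholds are assumed values chosen only for the example.

```python
import math

# Hypothetical parameters: field -> (m, u), where
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELDS = {
    "name": (0.95, 0.01),
    "zip": (0.90, 0.05),
    "phone": (0.80, 0.001),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum the log2 agreement or disagreement weight over all fields."""
    total = 0.0
    for field, (m, u) in fields.items():
        if rec_a.get(field) == rec_b.get(field):  # exact comparison only
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=10.0):
    """Apply the two thresholds (assumed values) to a composite weight."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"  # left for clerical review


# Invented records for illustration.
farm = {"name": "GREEN ACRES", "zip": "50010", "phone": "5550100"}
same = {"name": "GREEN ACRES", "zip": "50010", "phone": "5550100"}
moved = {"name": "GREEN ACRES", "zip": "50010", "phone": "5550199"}
other = {"name": "BLUE RIDGE", "zip": "61820", "phone": "5550142"}

results = [classify(match_weight(farm, r)) for r in (same, moved, other)]
```

Raising the upper threshold or lowering the lower one reduces the risk of false matches and false non-matches, at the cost of sending more pairs to clerical review.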

7. Data Dissemination through a Data Warehousing System (Patrizia Altieri and Liana Veronico, ISTAT, Italy)

Data from ISTAT's Intermediate Census for Industry and Services from 1971, 1981, 1991, and 1996 are being disseminated over the Internet using an on-line SAS query system to access a SAS Data Warehousing system. A major goal in developing the system was to provide easy and quick access for the data users, eliminating their need to contact ISTAT for most "ad hoc" inquiries. The Data Warehouse currently occupies about 25 GB and is divided into five Data Marts in a multidimensional database. The different Data Marts provide data for different sets of variables over the different censuses and for specific sectoral and economic-statistical indicators. Aggregated data values are stored in the Warehouse for selected intersections of the variables in a particular Data Mart. Data users can then create their own custom summary tables from this initially aggregated data. The Internet site has been operational since 1998 and has proved to be very popular with data users.
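The way custom summary tables can be derived from pre-aggregated cells may be sketched as below. This is only an illustration of the general idea in Python, not the SAS system used by ISTAT: the dimension names, the cell values, and the function name are invented for the example.

```python
from collections import defaultdict

# Hypothetical Data Mart: each key is one stored intersection of the
# dimensions, and each value is a pre-aggregated count (made-up figures).
DIMENSIONS = ("census_year", "region", "sector")
CELLS = {
    (1991, "North", "Industry"): 120,
    (1991, "North", "Services"): 200,
    (1991, "South", "Industry"): 80,
    (1996, "North", "Industry"): 110,
    (1996, "South", "Services"): 150,
}


def summary_table(cells, keep):
    """Roll the stored cells up to the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    table = defaultdict(int)
    for key, value in cells.items():
        table[tuple(key[i] for i in positions)] += value
    return dict(table)


by_year = summary_table(CELLS, ("census_year",))
by_region_sector = summary_table(CELLS, ("region", "sector"))
```

Because only aggregated cells are stored, a user request is answered by summing a small number of cells rather than by scanning unit-level census records.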