Doc. Eurostat/ITDG/October 2006/1.6
IT Directors Group
23 and 24 October 2006
BECH Building, 5, rue Alphonse Weicker, Luxembourg-Kirchberg
Room AMPERE
9.30 a.m. - 5.30 p.m.
Remote access to confidential data
Item 1.6 of the agenda
1. Introduction
Policy makers need information in order to evaluate existing programmes and develop new ones. That information often comes from research based on data collected by statistical agencies or others under a pledge of confidentiality. The most critical data are microdata: data about individuals, households, businesses and other organisations.
There is therefore a growing appreciation of the benefits of providing access to microdata for research and analysis. At the same time it is vital to protect data confidentiality. These two objectives create conflicting pressures, and it is essential that new approaches to reconciling them are developed at international level. The increased need for restricted access to more detailed microdata means that the conditions for obtaining such access need constant improvement.
The challenge in the future is to safeguard the confidentiality of data while making them available to researchers in a wide variety of settings. To meet society’s need for high quality research and statistics, EU national statistical agencies should provide easy access to anonymised files and guarantee restricted access to detailed, individually identifiable confidential data for researchers under carefully specified conditions. Especially important is easier access to research data centres, through the creation of a safe centre network and the development of remote access systems.
Last year a short survey of ongoing work in the area of statistical disclosure control was presented at this meeting. Two main domains linked with IT were identified for further development:
- CENEX (Centres and Networks of Excellence) on statistical disclosure control. Development of software tools. Possibility of sharing these tools by means of OSS.
- Remote access for researchers. Development of common tools. Exchange of practices on organisational set up.
2. CENEX Statistical Disclosure Control
The CENEX on statistical disclosure control (SDC) was considered by Eurostat in 2005 as one of the two pilot projects to test the CENEX approach. In the end it was the only one launched in 2005. The CENEX SDC grant was awarded to a consortium of 8 NSIs led by Statistics Netherlands. The activities started in January 2006. The project will have a duration of 12 months.
The CENEX SDC addresses the following objectives:
- Set standards for the protection of microdata sets, based on disclosure risk assessment methods and criteria.
- Improve tabular data protection techniques and develop harmonised criteria.
- Extend and develop SDC software tools, both for micro and tabular data, so as to fit the specific production and dissemination environments of the ESS.
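As an illustration of what a disclosure risk assessment criterion for microdata can look like, the following minimal sketch flags records whose combination of identifying variables occurs fewer than k times in a file. This is a commonly used frequency-count criterion and is offered only as an example: this document does not state which methods the CENEX SDC project uses. The function name, the variable names, the sample records and the threshold k=3 are all assumptions made for the illustration.

```python
from collections import Counter


def risky_records(records, key_vars, k=3):
    """Return the positions of records whose combination of key
    variables occurs fewer than k times in the file.

    Hypothetical helper for illustration only; it is not taken from
    any SDC software tool.
    """
    # Build the combination of identifying values for every record.
    combos = [tuple(record[var] for var in key_vars) for record in records]
    # Count how often each combination occurs in the file.
    counts = Counter(combos)
    # A record is flagged when its combination is rarer than the threshold.
    return [i for i, combo in enumerate(combos) if counts[combo] < k]


# Invented sample records (assumption: three identifying variables).
records = [
    {"region": "North", "age_band": "30-39", "sector": "Retail"},
    {"region": "North", "age_band": "30-39", "sector": "Retail"},
    {"region": "North", "age_band": "30-39", "sector": "Retail"},
    {"region": "South", "age_band": "60-69", "sector": "Mining"},
]

flagged = risky_records(records, ["region", "age_band", "sector"], k=3)
print(flagged)
```

In this invented file only the last record is flagged, since its combination of values appears once. In practice the choice of identifying variables and of the threshold would be part of the standards the project is meant to set.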
The pilot project is progressing satisfactorily and will be evaluated at the end of the year. If this evaluation is positive, it might lead to a wider CENEX project. Among other subjects, that project would develop methodologies to handle the disclosure control problems of remote access, remote execution and on-site facilities, and would make SDC tools available to all Member States (MS).
3. European statistical system and European wide access to microdata sets of official statistics
A sensible approach for facilitating high quality research is to maintain the data in a secure, restricted remote access environment. Developing remote access procedures in this way has the advantage of reducing researcher burden, but involves substantial investment in hardware and software. It was therefore proposed at the last meeting of the Statistical Confidentiality Committee (CSC) to study the possibility, already initiated in the 7th Research Framework Programme in the field of research infrastructures, of further developing such an approach at European level.
Eurostat has accordingly approached DG RTD to discuss the possibility of including such an action in their work programme. Once the legislative process is over, that could lead to a call for proposals in the first half of 2007.
The project would aim at achieving the following objectives:
- facilitating integrated Europe-wide access to and use of microdata sets from official statistics for scientific purposes
- developing a European integrated approach to remote access systems to microdata and determining a remote access standard
The following activities could be foreseen in such a project:
· Determining a remote access standard.
Three phases should be considered when promoting a Europe-wide remote access approach to microdata and the development and implementation of a standard remote access system for Europe.
First, a remote access scheme applicable in European countries should be established. Research in this domain should take existing schemes into account.
The standard remote access system should consist of:
– an IT remote access solution,
– statistical disclosure control methods for different types of microdata,
– standard legal and administrative procedures.
In a second phase this standard should be tested in several countries. In the last phase, training on procedures and ethics should be provided to all users of remote access systems.
This expanded access requires expanded procedural and legal protections. However, laws, enforcement and penalties are not enough to safeguard the confidentiality of research records. What is needed in addition to legal sanctions is a system of norms and values concerning the ethical use of such data. Everyone working with confidential records requires education and training in these ethical principles and practices.
· Creating a safe centre network.
An important component of developing a new confidentiality protection system is the development of a safe centre network. Remote access and the safe centre network are complementary: one does not exclude the other, since they usually address different data needs. Highly sensitive data that will not be available to researchers via remote access systems could be accessed via a safe centre network. European datasets could be available at these centres.
Three phases should be considered when promoting a European safe centre network.
The safe centre network should consist of:
– accredited safe centres: a minimum set of requirements for becoming an accredited safe centre in the network should be determined and agreed by MS,
– a remote connection system between the safe centres in the network,
– standard legal and administrative procedures to be agreed bilaterally with Member States.
This network would enable researchers from one country to access data from other countries via the remote connection between safe centres, facilitating research at European level. In parallel, certain European datasets could be deposited in the accredited safe centres to be accessed by researchers coming to these centres.
The modalities of such a system need to be agreed bilaterally with Member States. Similarly, the legal implications of an international network connecting all the remote access systems should be examined.
4. Conclusions
A progress report was presented on remote access activities. The two main domains linked with IT need to continue to be developed in the future:
- CENEX on statistical disclosure control. Development of software tools. Possibility of sharing these tools by means of OSS: this might be the subject of a second CENEX SDC project. Member States (MS) should follow the CENEX activity and take a proactive attitude towards adopting its results. The WG ESS Coordination and Programming will be a vehicle for transmitting the CENEX results to MS.
- Remote access for researchers. Development of common tools. Exchange of practices on organisational set up: Eurostat will work together with MS in order to prepare a project to present under FP7.
Member States are asked:
- To comment on this document.