QUESTIONNAIRE
ESSnet
On micro data linking and data warehousing
in statistical production
------
Inventory of current best practices
in integrated business data systems
Coordinator: Netherlands
Partners: Estonia, Italy, Lithuania, Portugal, Sweden, UK
1. Context / Background
In the context of the MEETS programme (action 3.1), the ESSnet on micro data linking and data warehousing in statistical production was established by seven partners in 2010. The main goal of action 3.1 is to make better use of data that already exist in the statistical system, with the ultimate aim:
‘To create fully integrated data sets for enterprise and trade statistics at micro level:
- a 'data warehouse' approach to statistics.’
The overall objective of this ESSnet is to provide assistance in the development of more integrated databases and data production systems for business statistics in ESS Member States. The ESSnet has to work on issues that are common to the majority of the ESS NSIs when applying a data warehousing approach to statistics.
Its general objectives are:
- Review of current best practices in integrated business data systems
- Identification of problems and solutions in current practices and the opportunities that a data warehouse might provide
- Examination of ways in which data can be combined to support new outputs
- Provision of recommendations on how the ESS can improve data warehousing
- Dissemination of the ESSnet results to all ESS countries.
The intended results in daily statistical practice are:
- Increase the efficiency of data processing in statistical production systems
- Maximize the reuse of already collected data in the statistical system.
As the field and scope of this ESSnet are very broad, the ESSnet first needs a strict definition of the specific subjects that it should explore and study in depth. Therefore the activities in the first year will mainly concentrate on an inventory of the current (best) practices in Member States, with the following deliverables:
- a detailed overview of current best practices in integrated business data systems;
- a prioritised list of problems and desired solutions as indicated by the Member States;
- an overview of opportunities that data warehousing can provide for Member States;
- a conceptual model of the statistical data warehouse.
Based upon these deliverables, partners will define and prioritize the subjects that need to be explored in depth. The main criterion for this will be the subjects indicated as most urgent by most Member States. The final result will be a detailed work plan for future work to be run under the FPA, with a clear division of the subjects into specific work packages.
2. Definitions
This ESSnet seeks to define a functional model of the statistical data warehouse, so that the issues raised by the ESSnet can be assessed in a generic and standardized way. The ESSnet will help Member States to develop and implement a maximally efficient statistical process for business and trade statistics, independent of any specific (technological) architecture. Specific implementations at an NSI can be used to illustrate good or bad practice but will not be part of the recommendation unless they are universally applicable.
The broad definition of a data warehouse to be used in this ESSnet is therefore:
‘A common conceptual model for managing all available data of interest, enabling the NSI to (re)use this data to create new data/new outputs, to produce the necessary information and perform reporting and analysis, regardless of the data’s source.’
More specific definitions will be developed as the need arises but as far as possible the definition will be kept at a functional level.
For the purposes of this questionnaire, we will define a ‘data warehouse’ as
‘A system or set of integrated systems, designed to handle the processing of statistical data in the production of (business) statistics’.
We have developed two example models which we have called the “Data Model” and “Process Model”.
1. the ‘data model’ perspective
In this perspective, the data warehouse philosophy for storing and processing data is independent of where the data has come from or where it is going.
2. the ‘process model’ perspective
In this perspective, the data warehouse philosophy for storing and processing data is considered as the set of production processes needed to manage the inputs and generate the outputs.
A complete description of these perspectives is provided in the annexes.
These models are two idealised representations of perspectives on statistical processes and are intended to help put the following questions into context.
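As a rough illustration of the ‘process model’ perspective, the sketch below treats the system as an ordered set of production processes that take inputs and generate an output. The step names (collect, edit, aggregate), the record layout and the editing rule are hypothetical and chosen only for this example; they are not taken from any NSI implementation.

```python
def collect(raw_rows):
    """Input management: turn raw rows into typed records."""
    return [{"id": r["id"], "turnover": float(r["turnover"])} for r in raw_rows]


def edit(records):
    """Illustrative editing rule (an assumption): drop negative turnover."""
    return [r for r in records if r["turnover"] >= 0]


def aggregate(records):
    """Output generation: total turnover over the remaining records."""
    return sum(r["turnover"] for r in records)


def run_pipeline(raw_rows, steps):
    """Apply the production processes in order, each feeding the next."""
    data = raw_rows
    for step in steps:
        data = step(data)
    return data


raw = [
    {"id": "E1", "turnover": "100"},
    {"id": "E2", "turnover": "-5"},
    {"id": "E3", "turnover": "40"},
]
result = run_pipeline(raw, [collect, edit, aggregate])
```

Here the design centres on the chain of processes: changing an output means changing or adding a step, whereas in the ‘data model’ perspective it would mean querying the store differently.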
3. Questionnaire
A. Philosophy
1. Does your organisation have a single conceptual approach to processing data in the production of business statistics ?
Yes / [ ] / -> goto 1.1
No / [ ] / -> goto 1.2
1.1 How would you describe your single conceptual approach ?
More like a ‘Data Model’ / [ ] / -> goto 2.
More like a ‘Process Model’ / [ ] / -> goto 2.
A mix of approaches / [ ] / -> goto 2.
A completely different concept of data processing / [ ] / -> goto 2.
1.2 In case of no single conceptual approach, can the current situation be best characterized by:
The ‘Data Model’ / [ ]
The ‘Process Model’ / [ ]
Neither of these / [ ]
Remarks:
2. Do you plan to change the current single conceptual approach or implement a new single conceptual approach in the next five years ?
No / [ ] / -> goto 3
2.1 How will the approach change ?
3. Do you mainly see one-to-one correspondences between input data and outputs ?
Yes / [ ] / One input source is related to one specific output.
No / [ ] / Input sources are used for several outputs.
3.1 Do you see that changing during the next five years ?
No / [ ]
4. Do you see a data warehouse as active or passive ?
Active / [ ] / The data warehouse incorporates weighting and coherency work etc.
Passive / [ ] / The data warehouse (DWH) is primarily there to hold data; weighting, consistency etc. are done outside.
5. How important is it that metadata is an integral part of the system ?
Extremely important / [ ]
Important / [ ]
Somewhat important / [ ]
Not important at all / [ ]
B. Practical implementation
6. Do you have a single coherent system which covers most of your data in the production of business statistics ?
No / [ ]
7. Do you plan to change in the next five years?
No / [ ]
8. Is your metadata currently integrated into your data systems ?
Note: ie is it possible to interrogate the data system directly about characteristics of the data,
rather than looking up the information in a separate repository/metadata system ?
In some systems / [ ] / -> goto 9
Not in any system / [ ] / -> goto 9
9. Do you plan to change the current role of metadata in the next five years ?
No / [ ]
10. Is your business register currently (or will it be) an integral part of the current (or planned) DWH system or does the register sit outside ?
Outside the system / [ ]
11. Is your data input for current needs integrated into your data systems ?
Note: ie do input data come straight into the system rather than manually/explicitly
uploading files, possibly needing to convert?
Partially integrated / [ ] / -> goto 11.1
Not integrated / [ ] / -> goto 11.2
11.1 Which part(s) of your data are integrated ?
Web collection / [ ]
Admin data / [ ]
Others:………. / [ ]
11.2 Is this likely to meet your needs during the next five years ?
No / [ ]
If not, please explain / give details:
12. Are your current output requirements integrated into your data systems ?
Note: ie can all outputs be generated (automatically) from the system, and are they?
No / [ ]
12.1 Is this likely to meet your needs during the next five years ?
No / [ ]
If not, please explain / give details:
13. Are your conceptual models generally realised in practice ?
Note: eg could you map directly from your functional model/description of the data system
to the actual implementation ?
Not completely / [ ] / -> goto 13.1
Only for some systems / [ ] / -> goto 13.1
No / [ ] / -> goto 13.1
We don’t conceptualise / [ ] / -> goto 13.1
13.1 What are the factors causing this ?
Note: eg downgrading of importance/expectations/funds, technical limitations,
statistical surprises, etc.
C. Motivation / Barriers
14. What do/did you see as the main motivation to start DWH in your business statistics systems ?
Please tick at most 5 items
More variation in inputs and/or outputs / [ ]
Data linking / integration / [ ]
Reducing risks in data management / [ ]
To implement changes in business process architecture / [ ]
To improve process integration in business statistics production / [ ]
To improve efficiency in business statistics production / [ ]
Reducing risks in production processes / [ ]
Cost savings / [ ]
Consolidation of legacy systems / [ ]
Technical efficiency / [ ]
Others: [Please specify]
15. What do you see as the main methodological barriers to implementing an integrated system ?
Please tick at most 5 items
Limited financial benefit / [ ]
Requires significant investment / [ ]
Too disruptive to regular work processes / [ ]
Not enough expertise / [ ]
Too difficult, complex / [ ]
Statistical requirements are too different / [ ]
Data characteristics are too different / [ ]
Methodological barriers
Data characteristics are too different / [ ]
Statistical requirements are too different / [ ]
Metadata model / [ ]
Data linking / [ ]
Data confidentiality / [ ]
Technical barriers
Technology (data-storage; performance) / [ ]
Complexity / [ ]
Insufficient development time / [ ]
Too high costs / [ ]
Problem of transformation, dealing with legacy, consistency of data/outputs / [ ]
Data confidentiality / [ ]
Others: [Please specify]
16. Which problems encountered in data integration do you wish to solve with an integrated DWH system ?
16.1 If you have already implemented an integrated system, how did you solve them ?
17. Which problems encountered in process integration do you wish to solve with an integrated DWH system ?
17.1 If you have already implemented an integrated system, how did you solve them ?
18. Do you think that the results of this ESSnet are useful for your work ?
Yes / [ ]
No / [ ]
Please explain / give details:
D. Follow-up
19. Which points of view were taken into account when completing this questionnaire ?
IT system specialist (ie technical implementer) / [ ]
Methodological (conceptual designer) / [ ]
Survey production (process users) / [ ]
Statistical production (processing input) / [ ]
Analyst (output/end point user) / [ ]
Other: [Please specify]
20. Would you be available to take part in a follow-up telephone interview or a visit from ESSnet project staff ?
Yes / [ ]
No / [ ]
21. Is there something else you want to mention ?
Please explain / give details:
Thank you!
4. Annex
Data warehousing: the conceptual perspective
Two perspectives were discussed: the ‘data model’ and the ‘process model’.
In the ‘data model’ perspective:
- The core is a unit for storing and processing data, irrespective of where it has come from or where it is going to.
- The process/store is not designed around either the type of input or output, but around the data item.
- Data acquisition is driven by availability of sources; output production is driven by availability of data in the store.
- Metadata, security restrictions, weights etc. are attached to the data item.
- Getting the data from a source and converting it to a usable form is a separate part of the system.
- Similarly, producing a type of output is a discrete process.
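The points above can be sketched in code. The following is a minimal illustration only, assuming a simple in-memory store. All class, field and function names (DataItem, Store, load_survey, weighted_total) are hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """One stored value with metadata, security flag and weight attached."""
    unit_id: str              # statistical unit, e.g. an enterprise (hypothetical id)
    variable: str             # e.g. "turnover"
    value: float
    source: str               # origin is recorded as metadata only
    weight: float = 1.0
    confidential: bool = False
    metadata: dict = field(default_factory=dict)


class Store:
    """Core store: organised around data items, not inputs or outputs."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def select(self, variable):
        return [i for i in self._items if i.variable == variable]


def load_survey(rows):
    """Separate acquisition step: convert raw source rows into data items."""
    return [
        DataItem(r["id"], r["var"], float(r["val"]),
                 source="survey", weight=float(r.get("w", 1.0)))
        for r in rows
    ]


def weighted_total(store, variable):
    """Discrete output process: uses whatever items are in the store."""
    return sum(i.value * i.weight for i in store.select(variable))


store = Store()
for item in load_survey([
    {"id": "E1", "var": "turnover", "val": "100", "w": "2"},
    {"id": "E2", "var": "turnover", "val": "50"},
]):
    store.add(item)
# An item from a second (administrative) source goes into the same store.
store.add(DataItem("E3", "turnover", 25.0, source="admin", confidential=True))
total = weighted_total(store, "turnover")
```

The output function never asks which source an item came from; it works on whatever data items are available in the store, which is the property this perspective emphasises.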