SERV - Adopting Common Statistical Production Architecture (CSPA) in Europe
Keywords: ESS Vision 2020, SERV, CSPA, Modernisation of Statistical Production
1.Introduction
Service orientation is a key issue for the UN ECE and Eurostat in their efforts to modernise statistical production. The major benefit to be achieved is the sharing not only of ideas but of “real” technology, which makes the provisioning of tools and services cheaper. The Common Statistical Production Architecture (CSPA) [1], triggered by the UN ECE, and its European counterpart, the Vision 2020 Implementing Project “SERV” (ESS.VIP SERV) [2], are conceptual frameworks that contribute to a harmonised and integrated approach to service orientation within official statistics.
To promote the idea of sharing services, Eurostat initiated a so-called ESSnet to prove the concept and help national statistical offices join the programme. INSEE/France, ONS/United Kingdom, SURS/Slovenia, Statistics Lithuania, INE/Portugal, DESTATIS/Germany and GENES/France contribute to the six work packages originally defined [3].
The paper presents some of the current issues discussed in different organisations and connects those to results achieved in the first part of the ESSnet project. It indicates common problems faced in the “real world” and motivates statistical organisations to participate in the endeavour.
2.Current issues
The High Level Group for the Modernisation of Official Statistics (HLG-MOS) created an expert group, the CSPA Implementation Group, to foster the use of CSPA worldwide. In the European Statistical System (ESS), complementary expert groups have started working in recent years. These groups discuss strategic and operational questions arising in the implementation process. The following points are a sample of the topics currently debated.
2.1.Business functions – Activities – GSBPM Level 3
While GSBPM [4] (and, to some extent, GAMSO) is well established in the statistical community and used by statistical organisations around the world, different National Statistical Institutes (NSIs) have advanced to new levels of detail. Level 1 (“phases”) and level 2 (“sub-processes”) seem to be universally applicable to all institutions in all domains. Those abstraction layers are useful to some extent but lose their power when trying to identify smaller building blocks of the statistical production chain below level 2. GSBPM already names level 3 “activities” and mentions several of these in the original documentation. They have not been systematised yet, and different approaches to doing so have been started independently. The Enterprise Architecture Framework of the ESS terms these building blocks “Business functions”.
A common systematisation of these national approaches would be a major advance towards a standardised statistical production process. Whether such a systematisation is possible, i.e. whether generic activities can be identified across different statistical organisations and domains, is not yet clear. Comparing first attempts in different NSIs shows only limited overlap. While some (like Spain) try to keep to a hierarchical model (by functional decomposition) [5], others have identified auxiliary activities that apply to more than one sub-process [6].
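The difference between a strictly hierarchical model and one with auxiliary activities can be illustrated with a minimal sketch. The activity names and their assignment to sub-process identifiers are hypothetical and chosen only for illustration; they are not taken from any of the national approaches cited above.

```python
# Hypothetical mapping of level-3 activities to GSBPM sub-process identifiers.
# Identifiers are treated as opaque labels; the assignments are illustrative only.
ACTIVITY_TO_SUBPROCESSES = {
    "detect outliers": ["5.3"],
    "impute missing values": ["5.4"],
    "validate against rules": ["4.3", "5.3", "6.2"],
    "log process metadata": ["4.3", "5.1", "6.5"],
}


def split_activities(mapping):
    """Separate activities that fit a strict hierarchy from auxiliary ones.

    An activity is 'hierarchical' if it belongs to exactly one sub-process,
    and 'auxiliary' if it applies to more than one.
    """
    hierarchical, auxiliary = {}, {}
    for activity, subprocesses in mapping.items():
        unique = sorted(set(subprocesses))
        if len(unique) == 1:
            hierarchical[activity] = unique[0]
        else:
            auxiliary[activity] = unique
    return hierarchical, auxiliary


hierarchical, auxiliary = split_activities(ACTIVITY_TO_SUBPROCESSES)
print("Fit a functional decomposition:", hierarchical)
print("Apply to several sub-processes:", auxiliary)
```

In a functional decomposition every activity has exactly one parent, so the second group cannot be represented without duplication; this is one way to see why the two approaches overlap only partly.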
2.2.Service Granularity and Integration
Another issue that will reappear repeatedly is the question of service granularity. The first proofs of concept differ strongly in this respect. While encapsulating Blaise is one extreme, minor and specialised statistical functions are good examples of the other. Several guidelines have been proposed. On the one hand, services should have a clear business value; on the other hand, they should be easily exchangeable and encapsulate autonomous functions. These aspects are not always compatible with each other. A rather misleading term in this discussion is “Microservices” as used by Martin Fowler [7]: such services do not have to be small at all.
The granularity issue is connected to questions of data and control flow (“do services have to be stateless?”) and to the integration of user interfaces and persistence layers within a service. This becomes evident when discussing different patterns of service integration. While point-to-point integration still seems to be the most prominent pattern, some organisations have established a “true” SOA infrastructure using an integration platform such as a full-scale Enterprise Service Bus. Some NSIs (Sweden and Norway) are experimenting with a third approach called “smart services, dumb pipes”, which enriches the services with their own persistence layers and user interfaces while using only a very basic integration layer.
We (the authors) believe that the questions of GSBPM activities, service granularity and integration are strongly coupled. Some kind of agreement on these issues, or at least a clear conceptual understanding of them, is probably necessary to advance the building and sharing of services.
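The third pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of the Swedish or Norwegian implementations: the class names `DumbPipe` and `SmartService`, the message layout and the doubling "business logic" are all hypothetical.

```python
from collections import deque


class DumbPipe:
    """A very basic integration layer: it only forwards messages.

    It contains no routing, transformation or orchestration logic.
    """

    def __init__(self):
        self._queue = deque()

    def send(self, message):
        self._queue.append(message)

    def receive(self):
        # Return the oldest message, or None if the pipe is empty.
        return self._queue.popleft() if self._queue else None


class SmartService:
    """A hypothetical service that owns its persistence layer."""

    def __init__(self, name):
        self.name = name
        self._store = {}  # stands in for a service-owned database

    def handle(self, message):
        # All business logic and state live in the service, not in the pipe.
        self._store[message["id"]] = message["value"]
        return {"id": message["id"], "value": message["value"] * 2, "by": self.name}

    def stored(self, key):
        return self._store[key]


pipe = DumbPipe()
service = SmartService("imputation")
pipe.send({"id": "rec-1", "value": 21})
result = service.handle(pipe.receive())
print(result)
```

The design choice shown here is that the service is stateful and self-contained, which makes it larger but easier to exchange as a unit; an Enterprise Service Bus would instead move routing and transformation into the integration layer.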
2.3.Information models
The Generic Statistical Information Model (GSIM) has been created to provide a common framework for data and metadata flows in statistical processes. Such an agreement is paramount for building service-oriented architectures and exchanging services across organisations. GSIM has not yet been adopted as widely as GSBPM by statistical organisations. In more practical terms, experience with GSIM indicates that it is not detailed enough for defining in-, through- and output parameters at the level needed. Therefore, a new and more practical information model was created, the Logical Information Model (LIM) [8]. LIM has been described as the “glue” between the Generic Statistical Information Model and physical models like SDMX or DDI.
The structure of LIM is conceptually linked to GSIM, but differs in detail. It is not complete yet and a process has been defined to enhance it in an ordered way. Only by sharing experiences with LIM (as well as GSIM) among statistical organisations can its development and usage support the statistical community in an optimized way.
2.4.Governance Issues
Questions such as licensing, certification of services, support and maintenance models, the integration of new requirements, and financing are still open to discussion and significantly hamper the use of foreign services for production purposes. This applies to services worldwide, but even in the European Statistical System, with its existing organisational regulations, it is a hindrance.
3.ESSnet “Sharing Common Functionalities in Europe”
The work packages of the ESSnet can be grouped into four main topics. These topics reflect some of the issues in section 2 and provide solutions, or at least some experiences, that should be included in the discussion.
3.1.Identifying Business Needs
The ESSnet follows different strategies to identify which services are probably the most useful for statistical organisations in general. One approach follows the “bazaar” pattern. Statistical organisations offer services (sometimes encapsulating applications that have already existed for a long time) developed for their own purposes. Other organisations use those services on their own premises or as remote services. In this pattern the Service catalogue hosted by Eurostat should play a key role [9]. However, without some marketing and adaptation of the services, this pattern might not be the ultimate choice.
In a second approach, launched by the UN ECE, organisations make their investment plans public and cooperate in building new services or adapting old ones [10]. This strategy helps to identify common requirements and might be a good solution for the financing issue. It still requires a good understanding of the gaps in the production landscape of each individual institution.
A third approach tries to identify those gaps, and overlaps between services, systematically by mapping services to the business functions of a statistical organisation. This approach requires some initial investment in some kind of enterprise architecture and probably the adoption of the concepts of GSBPM (at activity level) and GSIM/LIM.
The ESSnet developed an initial cost/benefit model for evaluating the usefulness and prioritisation of services within a statistical organisation or a statistical system.
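A gap and overlap analysis of this kind can be sketched as a simple mapping exercise. The following is a minimal illustration, not the method used in the ESSnet: the service names and the business function “outlier detection” are hypothetical, and the assignment of services to functions is invented for the example.

```python
# Business functions an organisation needs (the last one is hypothetical).
BUSINESS_FUNCTIONS = [
    "seasonal adjustment",
    "questionnaire design",
    "metadata dissemination",
    "outlier detection",
]

# Hypothetical services and the business functions each one covers.
SERVICES = {
    "service_a": ["seasonal adjustment"],
    "service_b": ["seasonal adjustment"],
    "service_c": ["questionnaire design"],
}


def analyse_coverage(functions, services):
    """Return functions without any service (gaps) and functions
    covered by more than one service (overlaps)."""
    providers = {function: [] for function in functions}
    for service in sorted(services):
        for function in services[service]:
            providers.setdefault(function, []).append(service)
    gaps = [f for f in functions if not providers[f]]
    overlaps = {f: providers[f] for f in functions if len(providers[f]) > 1}
    return gaps, overlaps


gaps, overlaps = analyse_coverage(BUSINESS_FUNCTIONS, SERVICES)
print("Gaps:", gaps)
print("Overlaps:", overlaps)
```

The sketch presupposes exactly what the text names as the initial investment: an agreed list of business functions and a description of each service in terms of that list.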
3.2.Sharing existing Services between National Statistical Institutes
A main task of the ESSnet is the reuse of existing CSPA-compliant IT services by another organisation. Three services (Seasonal Adjustment, Questionnaire Design, Metadata Dissemination) are being integrated into foreign infrastructures.
These trial implementations were carefully documented and allow a better understanding of the costs involved and the benefits achieved. Some specific difficulties in adapting the services in different environments, the usefulness of the service catalogue and the original CSPA-templates as well as LIM for describing the interfaces of services have been explored in detail and give some hints for further improvement.
3.3.Creating Guidelines and establishing support
The efficiency of SERV grows rapidly with the number of services being shared and the number of uses in national statistical organisations. The maturity of statistical organisations in this respect is very heterogeneous. An investment in technology, methodology and people is necessary. The ESSnet provides information material (“Guidelines”) and dissemination actions (workshops), and specifies the setting up of a Centre of Excellence (CoE) as a central hub for information exchange and support.
A specific topic of the different supporting activities will be the migration path from a traditional application-based landscape to a service-oriented one. SOA does not have to be implemented all at once.
3.4.Open Source Software as a Governance Model
A specific work package of the ESSnet is devoted to the use of Open Source Software (OSS) in official statistics. The term “use” has several layers of meaning. On the lowest layer, OSS can be used as the platform for running services (operating system, application server, database server, runtime environment, programming languages such as R) and for integrating services (Enterprise Service Bus). A second layer is providing IT services as OSS so that licensing issues can be resolved more easily. A third layer, and perhaps the most interesting, is using the governance of OSS as an organisational blueprint for maintaining, supporting and perhaps even financing the development of new IT services within the statistical community.
Analysing and promoting OSS as a model for addressing some of the governance issues listed in 2.4 seems promising.
4.Conclusions
Service-oriented architecture in official statistics will help to make IT production more efficient and agile than more traditional approaches to software architecture. Its implementation requires a certain amount of organisational and technological maturity. Models of stepwise migration and different levels of supporting material and guidelines could ease the introduction of the concept in national environments.
The ESSnet SCFE is working on work packages that will help the National Statistical Institutes in Europe and beyond to take part in the move towards a Common Statistical Production Architecture.
References
[1] UN ECE CSPA:
[2] ESS.VIP SERV:
[3] ESSnet SCFE:
[4] GSBPM:
[5] Spain: p. 3-5
[6] UK: ONS forthcoming
[7] Fowler:
[8] LIM:
[9] Service catalogue:
[10] Investment Planning: