TABLE OF CONTENTS

Introduction

Coordination of NSS and data dissemination

Resolving discrepancies between national and international data

The new MDG monitoring framework

National MDG monitoring

ECE database on MDG indicators

Statistical capacity – recommended initiatives

Annex 1. List of participants

Introduction

  1. The Workshop on the use of SDMX (Statistical Data and Metadata Exchange) for MDG (Millennium Development Goals) monitoring was held in Amman from 10 to 13 July 2011. The workshop was organised by the United Nations Statistics Division (UNSD), the United Nations Economic and Social Commission for Western Asia (ESCWA) and the African Development Bank (AfDB). The Jordanian National Statistical Office hosted the Workshop.
  2. Participants included representatives from 12 National Statistics Offices (Comoros, Egypt, Iraq, Jordan, Morocco, Palestine, Saudi Arabia, Somalia, Sudan, Syria, Tunisia and Yemen), as well as representatives from UNSD and ESCWA. In addition, some of the sessions were facilitated by SDMX experts from Metadata Technology and DevInfo. The list of participants is included in Annex 1.

Opening addresses

  1. The opening address was given by Mr. Kamal Saleh, the Assistant Director General of the Department of Statistics in Jordan, who welcomed the participants to the Workshop and highlighted the importance of data documentation and reporting. The opening address was followed by introductory words from UNSD representatives, ESCWA representatives and Metadata Technology experts.

Objectives of the workshop

  1. This Workshop was the first in a series of Workshops to implement recommendations made by the IAEG/Expert Group on SDMX at its meeting in Geneva in 2009, to promote the use of SDMX for MDG indicator data and metadata exchanges. The objectives of the Workshop were:
  • To inform countries on strategies for improving MDG data dissemination and reporting from NSOs and line ministries to international agencies and other users;
  • To highlight the importance of timely and comprehensive MDG data and metadata reporting and to propose the use of SDMX for the exchanges;
  • To provide hands-on training to representatives of NSOs on SDMX for MDG indicator data and metadata exchanges;
  • To recommend strategies for a better coordination within national statistical systems (NSS) and between the national and international systems;
  • To highlight the flexibility of SDMX in supporting all data storage platforms, and to advocate for its use within NSS as well;
  • To inform countries and receive feedback on the existing tools to use SDMX for MDG data and metadata exchanges.
  1. The recommendations of the Workshop will be reported to the Inter-agency and Expert Group on MDG Indicators (IAEG) at its next meeting in October 2011.

Overview of SDMX

  1. Mr. Christopher Nelson from Metadata Technology conducted the initial session, providing a general overview of SDMX. He explained that it was designed through an inter-agency agreement in 2001 with the aim of creating a data sharing environment based on XML and reflecting business practices in the field of statistical information. He briefly described the SDMX standard, highlighting that all its components are required in order to identify each data point uniquely, and that SDMX is ideally suited for aggregated statistics. A similar existing standard, DDI (Data Documentation Initiative), supports the documentation of microdata. Efforts are currently under way to make SDMX and DDI work together.
  2. Mr. Nelson enumerated the benefits of SDMX and highlighted the importance of adopting the standard for data and metadata exchanges. Among the benefits presented, SDMX makes it possible to locate data and metadata easily, to open databases to direct public queries, and to relate data and metadata for every data point with little effort. The main advantage of using SDMX for MDG data and metadata exchanges would be that all agencies and data providers would use a single standard: the reporting process would be harmonized, which would also benefit end users through data dissemination.
  3. The SDMX registry was also presented in detail. This tool registers whenever new data is added to a particular database and notifies the consumers so that they can access the SDMX files either through a URL or through direct queries to the database itself.
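
The registration and notification behaviour described above can be pictured as a simple publish/subscribe pattern. The following Python sketch is illustrative only: the class ToyRegistry, its methods and the example.org addresses are hypothetical and do not reproduce the interface of any SDMX registry product.

```python
from collections import defaultdict


class ToyRegistry:
    """Hypothetical stand-in for a registry: it records where data can be
    found and tells subscribers when a new registration arrives."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # dataflow id -> callbacks
        self.registrations = []                 # (dataflow id, data URL)

    def subscribe(self, dataflow, callback):
        """Ask to be notified about new data for one dataflow."""
        self._subscribers[dataflow].append(callback)

    def register(self, dataflow, url):
        """Record that new data is available and notify the consumers."""
        self.registrations.append((dataflow, url))
        for callback in self._subscribers[dataflow]:
            callback(dataflow, url)


received = []
registry = ToyRegistry()
registry.subscribe("MDG", lambda flow, url: received.append((flow, url)))

# A data provider registers a new SDMX file (placeholder URL).
registry.register("MDG", "https://example.org/sdmx/mdg_2011.xml")
# A registration for another dataflow does not reach this subscriber.
registry.register("TRADE", "https://example.org/sdmx/trade_2011.xml")
```

In this sketch the consumer receives only the location of the new file; fetching the file itself, or querying the database directly, would be a separate step.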

SDMX and Data Structures

  1. Mr. Nelson explained that in order to exchange any data or metadata using SDMX it is necessary to define a Data Structure. Ideally, there should be different Data Structure Definitions (DSDs) for different types of data (e.g. MDG indicators, economic indicators, industry data, etc.). The MDG dataset is described by an MDG DSD that was agreed by the members of the IAEG on MDGs. MDG data providers supply time series data (dataflows) under information agreements, which are given concrete form in the DSD.
  2. The facilitators, Mr. Matthew Nelson and Mr. Christopher Nelson, provided hands-on training on how to create a DSD, and in particular on how to replicate the creation of the MDG DSD. To create or edit DSDs they suggested using the SDMX Fusion Registry. This software can be downloaded free of charge. It requires a web server and the installation of Tomcat and Java software.
  3. The participants were taught that a DSD must specify different concepts so that a computer knows how to read the data. A concept was defined as a unit of knowledge created by a unique combination of characteristics. Concepts can be country, unit, indicator, etc. They can usually be coded using code lists.
  4. A concept can play one of three roles in a DSD:

a) Identify the observation value, that is, be a classificatory variable. These concepts are called dimensions.

b) Add additional metadata about the value. These concepts are called attributes.

c) Be the observation value itself. These concepts are called measures.
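
The three roles can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the class ToyDSD, the concept names and the values are made up for illustration and are not the agreed MDG DSD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDSD:
    """Simplified Data Structure Definition: each concept has one role."""
    dimensions: tuple   # concepts that identify the observation
    attributes: tuple   # concepts that add metadata about the value
    measure: str        # the concept that holds the observation value


# Concept names below are illustrative, not the agreed MDG DSD.
dsd = ToyDSD(
    dimensions=("REF_AREA", "SERIES", "TIME_PERIOD"),
    attributes=("UNIT_MULT", "FOOTNOTE"),
    measure="OBS_VALUE",
)


def observation_key(dsd, record):
    """Return the tuple of dimension values that identifies a data point."""
    return tuple(record[d] for d in dsd.dimensions)


# Made-up record for illustration only.
record = {
    "REF_AREA": "JO", "SERIES": "EXAMPLE_SERIES", "TIME_PERIOD": "2009",
    "UNIT_MULT": "0", "FOOTNOTE": "Provisional", "OBS_VALUE": 12.5,
}

key = observation_key(dsd, record)
# Changing an attribute leaves the key unchanged; changing a dimension does not.
revised = dict(record, FOOTNOTE="Final")
other_year = dict(record, TIME_PERIOD="2010")
```

Here the revised record still refers to the same data point, because only an attribute changed, whereas the record for another year has a different key and is therefore a different data point.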

  1. It was emphasized that although there are many ways to represent the concepts (text, coded, number, etc.), it is necessary to use the same representation throughout the file. Concepts should be maintained in a concept scheme, separately from code lists. Code lists constrain the values of concepts, defining a short, language-independent representation of the values and giving meaning to them. It was also emphasized that, for each item in SDMX, the “agency” field must be filled out, indicating which agency is responsible for maintaining the item. The “agency” therefore helps to identify each item uniquely.
  2. After explaining the theory, the consultants conducted a session providing hands-on exercises for the participants to learn to replicate the creation of the MDG DSD, and to identify different data flows and category schemes. Participants completed a series of exercises and learnt to create and load a database with SDMX data.
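
The role of code lists described in this section, constraining the values of a concept while keeping the codes independent of language, can be sketched as follows. The code list contents and the function names are hypothetical and serve only as an illustration.

```python
# Hypothetical code list for a "REF_AREA" concept: code -> labels by language.
REF_AREA_CODES = {
    "JO": {"en": "Jordan", "fr": "Jordanie"},
    "TN": {"en": "Tunisia", "fr": "Tunisie"},
}


def check_code(codelist, code):
    """Return the code if it is in the code list, otherwise raise ValueError."""
    if code not in codelist:
        raise ValueError(f"{code!r} is not a valid code")
    return code


def label(codelist, code, lang="en"):
    """Give meaning to a code by looking up its label in one language."""
    return codelist[check_code(codelist, code)][lang]
```

The code "JO" stays the same whichever language is used for display, while a free-text value such as "Jordan" is rejected because it is not a code in the list.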

SDMX Schemas and Data Formats

  1. Mr. Matthew Nelson introduced the concept of Extensible Mark-up Language (XML), providing the participants with a general idea of the advantages of the language, and explaining that SDMX is based on XML. He highlighted that the main advantage of XML, and therefore of SDMX, is being platform independent.
  2. When using XML, an agreement is needed between the senders and recipients of information on how to interpret it. He referred to this agreement as a “schema”. SDMX provides schemas that define what is valid SDMX-ML. These schemas are available for download.
  3. Mr. Christopher Nelson presented a set of slides on SDMX Data Formats. He explained that SDMX data can be exchanged in two formats, both based on XML: the generic and the compact format. Both formats carry the same information, but each one supports different applications.
  4. Mr. Nelson explained that the schema for the MDGs can easily be viewed using a software tool called Fusion Weaver, which helps with data validation, transformation and schema generation. Using this software, the consultant taught the participants how to create schemas and valid XML datasets. Schemas can be used to determine whether a data file is valid with regard to the DSD that defines the dimensions. An additional application called XML Spy can also be used to check for errors in the XML.
  5. The consultants mentioned that two different schemas can be matched (by codes, concepts, categories and organizations), which is a very useful feature if the ID codes of the sender are different from the codes in the database being fed. Similarly, DSDs can also be matched (by dimensions, attributes and measures). By loading the SDMX Message schema into a product like XML Spy, it is possible to see all the schemas that compose SDMX at a glance.
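
The difference between the generic and the compact data formats discussed in this section can be sketched in Python. The example is deliberately simplified: the element names only imitate the two styles, namespaces and message headers are omitted, the codes and values are made up, and the output is not schema-valid SDMX-ML.

```python
import xml.etree.ElementTree as ET

series_key = {"REF_AREA": "JO", "SERIES": "EXAMPLE_SERIES"}  # made-up codes
observations = [("2008", "11.0"), ("2009", "12.5")]          # made-up values


def to_generic(key, obs):
    """Generic style: concept names travel as data inside the message."""
    series = ET.Element("Series")
    key_el = ET.SubElement(series, "SeriesKey")
    for concept, value in key.items():
        ET.SubElement(key_el, "Value", concept=concept, value=value)
    for period, value in obs:
        obs_el = ET.SubElement(series, "Obs")
        ET.SubElement(obs_el, "Time").text = period
        ET.SubElement(obs_el, "ObsValue", value=value)
    return series


def to_compact(key, obs):
    """Compact style: concept names become XML attribute names."""
    series = ET.Element("Series", dict(key))
    for period, value in obs:
        ET.SubElement(series, "Obs", TIME_PERIOD=period, OBS_VALUE=value)
    return series


def read_generic(series):
    """Recover the series key and observations from the generic style."""
    key = {v.get("concept"): v.get("value") for v in series.find("SeriesKey")}
    obs = [(o.find("Time").text, o.find("ObsValue").get("value"))
           for o in series.findall("Obs")]
    return key, obs


def read_compact(series):
    """Recover the series key and observations from the compact style."""
    obs = [(o.get("TIME_PERIOD"), o.get("OBS_VALUE"))
           for o in series.findall("Obs")]
    return dict(series.attrib), obs


generic_xml = ET.tostring(to_generic(series_key, observations), encoding="unicode")
compact_xml = ET.tostring(to_compact(series_key, observations), encoding="unicode")
```

Both renderings can be read back into the same key and observations. The compact rendering is shorter, but its attribute names depend on the concepts of one particular DSD, whereas the generic rendering has the same shape for any DSD.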

Web Services

  1. Web services are Internet technologies that allow computer applications to exchange data over the Internet. For web services to function, standards are needed for requesting and supplying data, for packaging exchanged data, for describing web services to one another, and for integrating different web-service applications. SDMX, with its focus on the exchange of data using Internet technologies, provides some of these standards for statistical data and metadata.
  2. Some of the web services that the consultants presented and explained in detail are:

a) The Fusion Registry: Allows the user to look up concept schemes in a registry (this was demonstrated using the MDG DSD) and to pull up any lines of XML of interest. This is done by modifying the URL. If a database is defined in SDMX, anybody can query any structure in this way. This is an initial step towards retrieving data: it is first necessary to define the data flow in a data query, including the selections for each dimension and the data provider. It was noted that the registry web service returns structures only, and that it needs to be connected to Fusion Access in order to return data.

b) Fusion Access: This tool pulls up data in much the same way as the Fusion Registry pulls up concept schemes. With Fusion Access, if a database is defined in SDMX, anybody can query any data without any intervention by the data provider. Fusion Access makes it possible to run a query from any web page that uses SDMX and to retrieve data.

c) Fusion Cube: This is a more complex tool that integrates several functions of the data retrieval process. It allows users to select a data flow and refine their search by series and country. Based on the information entered, it queries the web service for data and displays the results in plots and other types of charts. Fusion Cube enables the user to filter the data based on any code and shows only the available series.
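
The idea of defining a data query through a URL, with a data flow, a selection for each dimension and a data provider, can be sketched as follows. The URL layout is an assumption, loosely modelled on later SDMX REST conventions; it is not the actual syntax of the Fusion tools presented at the Workshop. The endpoint, dimension names and provider identifier are placeholders.

```python
from urllib.parse import quote, urlencode

# Placeholder endpoint; a real service would publish its own address.
BASE_URL = "https://example.org/ws/data"


def build_data_query(dataflow, dimension_order, selections, provider="all", **params):
    """Build a query URL from a data flow, per-dimension selections and a provider.

    Dimensions with no selection are left empty, meaning "any value";
    several codes for one dimension are joined with "+".
    """
    key = ".".join("+".join(selections.get(dim, [])) for dim in dimension_order)
    url = f"{BASE_URL}/{quote(dataflow)}/{quote(key, safe='.+')}/{quote(provider)}"
    return f"{url}?{urlencode(params)}" if params else url


url = build_data_query(
    "MDG",
    dimension_order=["SERIES", "REF_AREA"],   # illustrative dimension order
    selections={"REF_AREA": ["JO", "TN"]},    # any series, two countries
    provider="EXAMPLE_NSO",                   # hypothetical provider id
    startPeriod="2005",
)
```

Because the whole selection is carried in the URL, a consumer can change the query simply by editing the address, which is the behaviour described for the web services above.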

SDMX and Metadata

  1. Mr. Nelson explained that in SDMX, a DSD allows the user to send metadata in the form of footnotes, that is, values related to observations, series, groups, etc. In the MDG DSD, metadata can be reported only for the unit multiplier, nature, source details and footnotes. However, if the data producer wants to link the metadata with other aspects of the data, a Metadata Structure Definition (MSD) is necessary.
  2. The MSD defines what type of metadata is being sent and what the metadata is for. As with the DSD, the MSD starts with concepts, which are maintained in concept schemes. Similarly, there are metadata attributes that reference a concept, and there is a key that specifies the object being described and what its values are in the dataset. Necessary attributes of any metadata are:

a) Who the metadata is from (data provider)

b) Category (within a category scheme of the receiving agency)

c) Statistical presentation (or name of the section in which you insert the text)
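
A check for these necessary attributes could look like the following Python sketch. The attribute identifiers, the report layout and the function name are hypothetical and are chosen only to mirror the three items listed above.

```python
# Hypothetical identifiers for the three necessary metadata attributes.
REQUIRED = ("DATA_PROVIDER", "CATEGORY", "STAT_PRESENTATION")


def missing_attributes(report):
    """List the necessary attributes that are absent or empty in a report."""
    attributes = report.get("attributes", {})
    return [name for name in REQUIRED if not attributes.get(name)]


# Made-up metadata report: the "target" is the key of the object described.
report = {
    "target": {"SERIES": "EXAMPLE_SERIES", "REF_AREA": "JO"},
    "attributes": {
        "DATA_PROVIDER": "EXAMPLE_NSO",
        "CATEGORY": "EXAMPLE_CATEGORY",
    },
}

missing = missing_attributes(report)
```

In this example the report identifies the provider and the category but lacks the statistical presentation, so it would be flagged as incomplete before being sent.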

Resources

  1. The consultants presented a series of on-line resources where free tools are available for download, including web services and training manuals. Some of the resources presented were:

a) Tutorials, Youtube guides, installation tools, web services and manuals.

b) User forum, tools and guidelines.

c) Circa.europa.eu: Student tutorials with examples.

MDG Data coordination issues and SDMX

  1. Ms. Neda Jafar (UN-ESCWA) presented, on behalf of the director of ESCWA–SD, Mr. Juraj Riecan, the way forward to enhance statistical coordination in the ESCWA region. The three regional priorities in official statistics are: statistical capacity building; the collection, processing and dissemination of statistical data; and input to global standard-setting initiatives. Together with the League of Arab States (LAS), ESCWA has set up a detailed coordination plan of action in the three regional priority areas for 2011-2015.
  2. Ms. Jafar then presented the work of ESCWA on MDGs. There are five focus areas in ESCWA’s MDG work: improving the quality and availability of national data, building national capacity to enhance MDG reporting, organising training workshops on MDG monitoring, enhancing coordination between countries and agencies, and monitoring the dissemination of MDG data. Significant progress has been made in improving data quality and availability at the country level. Most countries in the region use DevInfo to store and disseminate MDG data. Coordination between countries and international agencies has also improved. However, much work remains to be done to address discrepancies between national and international MDG data. ESCWA has developed a virtual library on MDGs as a knowledge-sharing tool to give users from different disciplines access to selected national, regional and international resources.
  3. Ms. Sara Duerto Valero (UNSD) gave a presentation on MDG data reporting and strategies for upcoming reporting using SDMX. She explained that SDMX is well suited to addressing some of the existing coordination issues in MDG data and metadata compilation, both between NSOs and international agencies and within NSS. Progress on the use of SDMX for MDG data and metadata exchanges has taken place in three phases. First, an MDG DSD was developed and agreed by the member agencies of the IAEG on MDG Indicators. Afterwards, some agencies started to use SDMX to feed the MDG database. Currently, capacity building activities are being implemented so that countries can start using SDMX for national and international data and metadata exchange. A number of strategies for upcoming MDG reporting using SDMX were also suggested.
  4. Ms. Yongyi Min (UNSD) presented some international and country experiences in resolving coordination issues by using SDMX. She gave concrete examples of MDG data exchange and coordination issues at the international level and helped countries to identify the dimensions and attributes of a data point using the MDG DSD. The presentation also briefly introduced experiences from UNESCO and Mexico in using SDMX for data and metadata exchanges.
  5. After the presentations, a roundtable discussion was held on the use of SDMX for MDG data and metadata exchanges within countries and with international agencies. The aim was to explore the feasibility of using SDMX for MDG data at the country level and to understand the concerns and capacity building needs of the countries in the region. Countries first discussed the benefits of using SDMX for MDG data and metadata reporting to international agencies and for data and metadata exchanges within the NSS. At the international level, using SDMX can help reduce the response burden on NSOs and line ministries when they reply to different requests from international agencies, and can also help reduce discrepancies between national and international MDG data. At the national level, using SDMX can shorten the data exchange time between line ministries and the NSO and reduce data exchange errors.
  6. As part of the exercise, countries also identified key steps that NSOs would need to take in order to use SDMX systematically for MDG exchanges, and evaluated the related resource needs. Countries requested that UNSD and ESCWA organize additional workshops in the region to strengthen SDMX skills. In particular, it was requested that training be provided to NSOs and line ministries, and that countries currently implementing SDMX in other regions be invited to future workshops in order to share their experience.

Conclusions and recommendations

  1. The workshop helped the participants realize the benefits of SDMX, including:

a) Improvement of data quality and availability and reduction of data discrepancies;

b) Budgetary savings attributed to the minimum maintenance required, the number of available free tools, and the ease of data and metadata dissemination;

c) Reduction of burden to countries and of errors as international agencies can directly query data from national databases;