Measuring Statistical Capacity Development: a review of current practices and ideas for the future – moving towards Statistical Capacity 4.0

Draft, December 2017

Co-authored by François Fonteneau, Barbara Yael Baredes, Charlotte Mayard (PARIS21)


Contents

1. Introduction

2. Analysis of existing assessments

3. Analysis of selected dimensions

Conclusion

Annex 1: List of Questionnaires

Annex 2: Proposing a framework for Statistical Capacity Development 4.0

Annex 3: Glossary and References


1. Introduction

The 2030 Sustainable Development Agenda, with its 17 Sustainable Development Goals (SDGs) and framework of 244 indicators for monitoring and evaluation, entails an unprecedented demand for data, which poses new challenges and opportunities for National Statistical Systems (NSS). Despite recent improvements, the NSS of many countries, in particular the poorest, struggle to produce basic quality data for policy making and programme implementation. The emerging data demands of the SDGs add to this challenge in a financially constrained environment. In this context, capacity development in data and statistics has become a widely recognised priority, as stated in the Cape Town Global Action Plan for Sustainable Development Data, first unveiled at the UN World Data Forum in January 2017. Following its 2017 Annual Meetings, PARIS21 formed a Task Team to revisit statistical capacity development in this changing context. This “Capacity Development 4.0” (CD4.0) Task Team has three objectives: first, to propose a conceptual framework for CD4.0; second, to look into the ways CD4.0 needs to be measured; and third, to design operationalisation principles for CD4.0. Different experts worked under the auspices of the Task Team.

This paper contributes to the second objective of the Task Team, measurement. It presents the results of a review undertaken by the PARIS21 Secretariat. These results will be discussed during the Workshop on New Approaches to Statistical Capacity Development, organised jointly by PARIS21 and UNDP in Paris, 11-12 December 2017. The review focused on the quantifiable aspects of measuring capacity, with a twin perspective on (i) what is being measured currently, and (ii) how it is being measured. Fourteen assessment tools, designed to measure different dimensions of statistical capacity, were reviewed. The common reference used to benchmark these different assessments is the new CD4.0 conceptual framework, which has itself been improved through this exercise. This paper presents the conceptual framework used as reference, with corresponding definitions. The framework lists all of the elements and categories that need to be covered to provide capacity measurement compatible with CD4.0. An online platform, the Open Assessment Repository (OAR)[1], has been developed to compile all the assessments and map their content to this conceptual framework.

The paper provides evidence on what current assessments aim to measure, and how they do so. It shows which elements and categories of the CD4.0 framework are well covered, and which are not. It discusses when and why this could create problems for understanding statistical capacity, and for the associated actions in developing this capacity. It also provides an in-depth analysis of one dimension from the most covered category, organisational skills, and proposes questions to address a category that is not well covered, individual resources. Future lines of work may include further analysis of the assessments’ questions, in terms of similarities, differences and duplications. Ultimately, this analytical question bank could serve as a reference for designing harmonised and comprehensive assessments, with the double objective of generating meaningful, quality data while reducing the burden on respondents.


2. Analysis of existing assessments

  a. Purpose

The starting point of our analysis was the prevailing confusion around the concepts of ‘capacity’, understood as “the means to plan and achieve” (UNDP 2009), and ‘capacity development’. Two main trends result from this confusion: first, the proliferation of assessments created and implemented by international organisations, resulting in a significant response burden on countries, paradoxically constraining low-capacity countries even more; second, the repetition of topics, areas, and even indicators/questions, covered in such assessments. The majority of them focus solely on the most tangible aspects of capacity (mainly methodology, resources and statistical laws). These trends can be traced back to the absence of a systematic approach to capacity that would set the grounds for designing assessments and capacity development programmes.

The Capacity Development 4.0 conceptual framework emerges from the collective effort of the Task Team. It was conceived following a participatory approach to provide a conceptual lens through which to analyse capacity in the National Statistical System. The framework is a work in progress that has gone through several iterations and will continue to be improved with feedback from the statistical community.

According to UNDP, capacity development is “the process through which individuals, organizations and societies obtain, strengthen and maintain the capabilities to set and achieve their own development objectives over time”. Our work sought to decompose the elements of this definition. After an extensive literature review, we decided to base our framework on Denney and Mallet (2017). These authors state that there are three levels of capacity (in line with UNDP): the individual, the organisation and the system. Five targets exist at each of these levels: resources, skills and knowledge, management, politics and power, and incentives.

We proceeded to decompose the intersections between levels and targets (from now on, categories), and found fifty-one unique dimensions, detailed in Table 1 (please refer to Annex 2 for further reference).

Table 1: Capacity Development 4.0 framework
Target / Individual / Organisational / System
Resources / Education; Work experience / Human Resources; Budget; Infrastructure (physical assets, IT, etc.) / Laws, regulations and reference frameworks; Funds; Plans (NSDS, sectoral…); Existing data; Institutional infrastructure
Skills & Knowledge / Technical skills; Work ‘Know-how’; Autonomy & problem solving; Creative thinking / Methods, practices and quality control; Standards and regulations; Innovation / Data literacy
Management / Talent management; Time management and prioritisation; Leadership; Strategic thinking / Strategic planning; Organisational design; HR Management / NSS co-ordination; Data ecosystem co-ordination; Advocacy strategy; Transparency
Politics & Power / Teamwork & collaboration; Communication & negotiation skills; Strategic networking / Change management; Workplace politics; Fundraising strategies / Relationship between producers; Relationship between producers and users; Institutional autonomy; Accountability; Policy preferences
Incentives / Career expectations; Income; Work ethic & self-motivation; Status / Career development; Compensation and benefits; Organizational culture / Stakeholders’ interests and strategies; Public support/endorsement; Legitimacy; Reputation/Visibility
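
For readers who wish to work with the framework programmatically, it can be represented as a simple mapping from (level, target) categories to their dimensions. The Python sketch below is purely illustrative: it is not part of the OAR tooling, and it shows only a small excerpt of the fifty-one dimensions.

```python
# Illustrative excerpt of the CD4.0 framework as a mapping from
# (level, target) categories to their dimensions (not the full 51).
CD40_FRAMEWORK = {
    ("Individual", "Resources"): ["Education", "Work experience"],
    ("Organisational", "Skills & Knowledge"): [
        "Methods, practices and quality control",
        "Standards and regulations",
        "Innovation",
    ],
    ("System", "Management"): [
        "NSS co-ordination",
        "Data ecosystem co-ordination",
        "Advocacy strategy",
        "Transparency",
    ],
}

# Reverse lookup: from a dimension back to its (level, target) category,
# which is what is needed when aggregating coded questions by category.
DIMENSION_TO_CATEGORY = {
    dim: category
    for category, dims in CD40_FRAMEWORK.items()
    for dim in dims
}

print(DIMENSION_TO_CATEGORY["Standards and regulations"])
# ('Organisational', 'Skills & Knowledge')
```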

We used the Capacity Development 4.0 framework as a basis to categorise the questions/indicators of a selection of assessment tools. We then proceeded to identify which categories, levels and targets of capacity development are most tackled by existing assessments of statistical capacity, and which are least represented in our selection. This also allowed us to observe overlaps and similarities between questions/indicators from different assessments.

  b. Scope

We have included fourteen assessments of statistical capacity in our Open Assessment Repository (OAR): the Self-Assessment Guidance Questionnaire from UNECA (SAGQ), the Snapshot (Eurostat), the Country Assessment of Agricultural Statistical Systems in Africa from AfDB (ASSA), the Tool for Assessing Statistical Capacity from the US Census Bureau (TASC), the Light Self-Assessment Questionnaire on the implementation of the European Statistics Code of Practice from the European Commission and OECD (Light SAQ), the Statistical Capacity Indicators from the World Bank (SCI), the Generic National Quality Assurance Framework from UNSD (Generic NQAF), the Global Assessment of the National Statistical System from Eurostat and EFTA (GANSS), the Data Quality Assessment Framework for National Accounts Statistics from IMF (DQAF for National Accounts), the Assessing the National Health Information System Assessment Tool from HMN (HIS), the African Statistical Development Indicators from UNECA (StatDI), the Environment Statistics Self-Assessment Tool from UNSD (ESSAT), the Pan-African Statistics Programme: Peer Reviews of NSIS/NSSS in African countries from Eurostat and AUSTAT (PAS), and the extra modules added by the IDB to the Tool for Assessing Statistical Capacity (TASC v.IDB).

This amounts to a total of 1,974 questions and indicators. We decided to include both in an attempt to capture the variety of instruments that exist in the universe of capacity assessments. Our sample includes both tools developed for voluntary use by countries and those sent by international organisations with the aim of producing a comparison of countries’ statistical development or compliance with international standards. Some assessments are general and others refer to specific statistical areas (such as health).

  c. Classification of assessments

Assessments can be initially classified according to the type of instrument. We found three groups: the most frequent are structured questionnaires, such as the TASC from the US Census Bureau. A second type is open guidelines, including the Light SAQ. Finally, others are compiled from secondary sources, such as the SCI. Overall, there are ten structured questionnaires, three open guidelines and one secondary data collection exercise.

Another classification concerns assessment administration, for which we have identified two types. The first is self-assessments (administered by the authorities of the country under assessment); an example of this category is the Generic NQAF. The second is peer reviews (that is, an external evaluator with experience in the same field conducts the assessment), such as the PAS. Some questionnaires fall into both categories (Snapshot, TASC and DQAF for National Accounts), and one (SCI) cannot be classified into either. There are ten in the first category and five in the second.

The last criterion for classifying assessments concerns their purpose. We split them according to whether they seek to analyse the capacity of a country to collect statistics (whether in general or specific to a sector), for example the Snapshot, or to measure compliance with a specific international framework, for example the NQAF. Here we found eleven in the first category and three in the second. The detailed classification is provided in Annex 1.

  d. Methodology

The first step consisted of identifying and selecting a non-exhaustive list of assessment tools on statistical capacity that represented the variety of instrument types, administration types and purposes covered by the universe of capacity assessments. We also took into consideration whether they had been implemented in several countries. These assessments were entered into our open-access platform, the OAR.

We used our Capacity Development 4.0 framework to analyse their content, coding each question/indicator to one dimension (in some cases two). Some questions/indicators were thus double counted, bringing the total to 2,353 coded entries; all calculations were based on this number. Three people worked independently on each assessment and held meetings to align their coding criteria. After the initial coding and data entry, a final quality control was performed for each dimension.
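
To make the tallying logic concrete, a minimal sketch is shown below. The actual coding was done manually by three reviewers in the OAR; the example questions and their dimension assignments here are hypothetical placeholders, and only the counting rule (a double-coded question contributes once to each of its dimensions) reflects our procedure.

```python
from collections import Counter

# Hypothetical excerpt of coded questions: each question/indicator is mapped
# to one dimension, or in some cases two.
coded_questions = [
    {"assessment": "TASC", "dimensions": ["Strategic planning"]},
    {"assessment": "TASC", "dimensions": ["HR Management", "Career development"]},  # double-coded
    {"assessment": "SCI", "dimensions": ["Methods, practices and quality control"]},
]

# A double-coded question counts once per dimension, so the denominator is the
# number of (question, dimension) pairs (2,353 in our data), not the number of
# questions (1,974).
tallies = Counter(dim for q in coded_questions for dim in q["dimensions"])
total_pairs = sum(tallies.values())

for dim, count in tallies.most_common():
    print(f"{dim}: {count} ({100 * count / total_pairs:.1f}%)")
```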

Our main purpose was to identify the underlying definition of ‘capacity’ that dictates such assessments. A second objective is to measure the response burden on National Statistical Offices and other official statistics producers (in the case of sectoral assessments), which we have not yet completed.

As an example, the TASC from the US Census Bureau comprises a total of 254 questions, of which 39% belong to the category Organizational Skills and Knowledge and almost 24% to the category Organizational Management (Graph 1). Remarkably, there are no questions for the categories Individual Management and Individual Politics and Power.

From an overall perspective, 80% of the questionnaire focuses on the organisational level, with 48% of the questions assessing Skills and Knowledge.

Noticeably, the top five dimensions represent 63% of the TASC, all belonging to the organisational level. These dimensions are: Methods, practices and quality control (30%), Strategic planning (11%), Standards and regulations (9%), Human Resources (7%) and HR Management (6%).

Twenty dimensions are not covered by the TASC:

  • at the individual level: Education, Creative thinking, Talent management, Time management and prioritisation, Leadership, Strategic thinking, Teamwork & collaboration, Communication & negotiation skills, Strategic networking, Career expectations, Income and Status
  • at the organisational level: Innovation, Change management, Workplace politics, Compensation and benefits
  • at the systemic level: Institutional infrastructure, Policy preferences, Stakeholders’ interests and strategies, and Legitimacy

Table 2 exemplifies the coding of questions/indicators from the TASC following our framework, for six selected dimensions.

Table 2: Examples of questions for selected dimensions
Dimensions / Question
Human Resources / NSO has a core staff specialised in each of the following: census/survey planning, questionnaire development, field operations, data processing, sampling, data analysis, evaluation, and dissemination
Infrastructure / Physical facilities are adequate to perform required tasks (such as power sources, space for work, places for meetings, space for training, etc.)
Methods, practices and quality control / Periodicity for collection of major surveys is defined and followed
Standards and regulations / NSO uses a national standard for place names and place codes for the geographic hierarchy of the country that encompasses all administrative and statistical areas and is implemented across all geospatial products
Strategic planning / NSO has strategic multi-year plan updated annually which identifies organisational challenges, activities, and goals
HR Management / NSO has an on-the-job training programme that is supported financially and employees are given time to attend training

  e. Results

As Table 3 highlights, most of the questions focus on organisational skills and knowledge: 36% of the questions fall into this category (three standard deviations above the mean). The organisational level absorbs 63% of all questions. Systemic resources also add significant respondent burden, with 16% of the questions falling into this category. Within these two levels, incentives are almost neglected. Other overlooked categories are politics and power in the organisation and skills and knowledge in the system. There are only a few questions about the individuals working for official statistics agencies.

Table 3: Relative frequency distribution of categories, targets and levels
Target / Individual / Organisational / System / Total
1 - Resources / 0% / 8% / 16% / 24%
2 - Skills & Knowledge / 1% / 36% / 1% / 38%
3 - Management / 0% / 17% / 9% / 26%
4 - Politics & Power / 0% / 1% / 8% / 9%
5 - Incentives / 1% / 1% / 1% / 3%
Total / 2% / 63% / 35% / 100%
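
For illustration, the shares in Table 3 can be reproduced from the coded data with a simple cross-tabulation. The sketch below assumes a table with one row per coded (question, dimension) pair, carrying the CD4.0 level and target of the assigned dimension; the few rows shown are placeholders, not our actual data.

```python
import pandas as pd

# Placeholder data: one row per coded (question, dimension) pair,
# with the CD4.0 level and target of the assigned dimension.
pairs = pd.DataFrame({
    "level": ["Organisational", "Organisational", "System", "Individual", "System"],
    "target": ["Skills & Knowledge", "Management", "Resources", "Incentives", "Politics & Power"],
})

# Share of all coded pairs per (target, level) cell, with row and column
# totals, mirroring the layout of Table 3.
shares = pd.crosstab(pairs["target"], pairs["level"], normalize="all", margins=True)
print((100 * shares).round(0))
```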

Altogether, the top five dimensions represent more than half of the questions/indicators (Graph 2). Methods, practices and quality control, together with standards and regulations, make up 35% of the sample (both belong to organisational skills and knowledge). The following three (transparency; laws, regulations and reference frameworks; and strategic planning) make up another 19%. The remaining 46 dimensions account for 46% of the sample.

A further analysis was conducted following the classifications described in Section c. The first analysis concerned the type of instrument. The distribution of assessments is uneven: ten are structured questionnaires, three are open guidelines and only one is a secondary data collection exercise. Because of its nature, this last type can only capture outputs and reporting. This focus on realised capacity obscures the more difficult-to-measure aspects of politics and power and incentives; thus the bulk of questions fall into skills and knowledge (Graph 3).

Questionnaires and open guidelines are instruments that could potentially capture the least visible aspects of capacity. Nonetheless, they reproduce the notion of capacity portrayed by the secondary data collection exercise: incentives and politics and power together represent less than fifteen percent of the questions/indicators across types of instruments. We tested the difference between questionnaires and open guidelines for statistical significance using a chi-squared test and obtained p = 0.05; we were thus not able to reject the null hypothesis that both have the same distribution.
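
For reference, this kind of comparison can be run as a standard chi-squared test of independence on a contingency table of question counts. The sketch below assumes a targets-by-instrument-type table; the counts are made up for illustration, and only the procedure corresponds to the tests reported in this section.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: question counts per CD4.0 target (rows)
# for structured questionnaires vs. open guidelines (columns).
counts = np.array([
    [380, 55],   # Resources
    [640, 90],   # Skills & Knowledge
    [430, 70],   # Management
    [150, 15],   # Politics & Power
    [ 50,  5],   # Incentives
])

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi-squared = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
```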

We did not find any statistically significant difference when testing across assessment administration types (p = 0.13), but we did find variation between purposes (p = 0.00). When inquiring into these differences, we found that they are driven by the distribution of questions between skills and knowledge and resources: capacity assessments allotted more questions/indicators than expected to the first target, while those designed to measure compliance had a higher share inquiring about resources.

When analysing the distribution of questions between levels of capacity (Graph 4), we found that open guidelines devoted almost the entirety of their indicators/questions to the organisation. Questionnaires and guidelines focused more than sixty percent of their questions/indicators on the organisation. Again, we tested for statistically significant differences between these last two and found none (p = 0.15).

Further controls were conducted by assessment administration and purpose. We found that the differences in the distribution of questions/indicators are statistically significant between types (p = 0.00 for both). Peer reviews have a higher than expected share of questions about the system, which is also the case for capacity assessments. On the contrary, those that seek to measure compliance focus more on the organisation.