A ScoringSystem to Measure the Use of Statistics in the Policy Making Process
Summary Report
Paris, 30 January 2015
Table of Contents
Introduction
Methodology
Part 1: Upstream Use of Statistics
Section 1: Basic Facts /50
Section 2: Disaggregated Data /15
Section 3: Further Analysis /5
Part 2: Downstream Use of Statistics
Section 4: Monitoring and Evaluation /15
Section 5: Institutional Arrangements /5
Part 3: Statistical Capacity Development
Section 6: Data Production and Use /5
Section 7: Statistical Capacity Development Programs /5
Conclusion
Findings as of 2012
ANNEX
Scoring Sheet
Sample Scoring Sheet
Keyword List
APPENDICES
Introduction
This scoring system has been developed to assess the use of statistics in the policy making of developing countries. It is based on a review of the most recent poverty reduction strategy paper, medium-term strategy, or national development plan for Lower Middle Income countries (as defined by the OECD Development Assistance Committee List of Recipients of Official Development Assistance[1]). The current results cover 45 countries: 22 countries from Africa, 19 from Asia and the Pacific, three from Latin America and one from Europe:
Afghanistan / East Timor / Moldova
Armenia / Gambia, The / Mongolia
Azerbaijan / Georgia / Nepal
Bangladesh / Guinea / Nicaragua
Benin / Guinea-Bissau / Niger
Bhutan / Kiribati / Pakistan
Bolivia / Kyrgyz Republic / Papua New Guinea
Burundi / Lao PDR / Rwanda
Cambodia / Lesotho / Samoa
Central African Republic / Liberia / Senegal
Comoros / Madagascar / Sudan
Congo, Dem. Rep. / Maldives / Tajikistan
Côte d’Ivoire / Mauritania / Tanzania
Djibouti / Malawi / Togo
Dominica / Mali / Vanuatu
The scoring system provides an analytical framework for the assessment of national policy documents, and allows for a quantitative ranking based on each country’s use of statistics, which falls into three categories: upstream policy use, downstream policy use, and statistical capacity development. Upstream refers to the extent to which statistics and statistical analysis have contributed materially to policy and decision-making (measured in sections 1-3). Downstream refers to the responsiveness of policy and decision-making to monitoring and evaluation (sections 4-5). Statistical capacity development involves taking steps to ensure that statistics are sufficiently available and of adequate quality to underpin policy processes (sections 6-7). In general, a “statistic” is considered to be any measurable value given in the report including absolute values, percentages, or fractions, used to assess a country's past, current, or future levels of development in a given field. This assessment limits the definition of a statistic to internal development indicators, and therefore does not include exogenously-determined variables (world interest rates, etc.), given geographical features (land area, coastline, etc.), or government financing figures.
It is important to note that the index is just one way to measure statistical use in the policy-making process. This is not an exact science. While it can be useful to quantify ‘statistical use’, there is no single correct way to measure the use of statistics, and the design of this index and its scoring system is only one of many possible approaches. While we have tried to make this assessment as objective and transparent as possible, there remains an irreducible degree of subjectivity in the report.
First, the basic design of the scoring system – with three subcategories – is just one of many scoring systems that could have been constructed. Second, the selection of the weighting attached to the different component scores in the index is also, essentially, arbitrary, and many other schemes are possible. Indeed, those interested in using this index are encouraged to consider using other weights that might better reflect their own priorities. Third, the guidelines governing the calculation of each component score are open to debate. And fourth, although the interpretation of those guidelines is set out in the Methodology section, there are inevitably a number of ‘judgement calls’ inherent in interpreting the guidelines to set each country’s score.
The report, and scores, should be used with all this in mind. Because calculating a score is not an exact science, small differences between countries should not be taken as significant. However, large differences can be taken to suggest genuine differences in the PRSP process and its use of evidence. The index, therefore, should be used to provoke debate and pave the way to a more detailed investigation. It seeks to generate debate and raise questions, rather than rank each country in a strict hierarchy.
Challenges in creating such a scoring system exist due to the difficulty of comparing numerous policy documents published in a variety of formats and with differing development objectives. Furthermore, a proper analysis must contain a consistent mechanism to produce quantifiable results and must not rely solely on subjective assessment. Thus, a large portion of the scoring system is based on the frequency of statistical references throughout the text, while another measurement addresses the scope of its coverage.
The next section provides a detailed breakdown of the methodology behind the scoring system, with calculations and reasoning described. This is followed by an analysis of the findings obtained as of 2013 (including baseline and milestone indicators for indicator G2 of the PARIS21 logical framework). The master scoring sheet and one country example (Benin) are included in the Annex. Finally, an appendix illustrates the scoring process using annotated pages from Burundi’s PRSP to help explain how the scoring system was applied.
Methodology
The scoring system is broken down into seven sections: three relating to upstream use of statistics, two to downstream use of statistics, and two to statistical capacity development. Each section is given a score according to its weight; these scores are then added together to obtain a final score with a maximum of 100.
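To make this aggregation concrete, the minimal sketch below (in Python) sums the seven weighted section scores to a final result out of 100. The dictionary keys, function name, and example scores are illustrative only and do not come from the scoring sheet.

    # Minimal sketch of the overall aggregation. Assumes the seven section
    # scores have already been calculated out of their respective maxima.
    SECTION_MAXIMA = {
        "basic_facts": 50,                    # Section 1
        "disaggregated_data": 15,             # Section 2
        "further_analysis": 5,                # Section 3
        "monitoring_and_evaluation": 15,      # Section 4
        "institutional_arrangements": 5,      # Section 5
        "data_production_and_use": 5,         # Section 6
        "capacity_development_programs": 5,   # Section 7
    }

    def total_score(section_scores):
        """Sum the seven section scores; the result is out of 100."""
        for name, score in section_scores.items():
            if not 0 <= score <= SECTION_MAXIMA[name]:
                raise ValueError(f"{name}: {score} is outside 0-{SECTION_MAXIMA[name]}")
        return sum(section_scores.values())

    # Hypothetical example (not a real country assessment):
    example = {"basic_facts": 32.5, "disaggregated_data": 9, "further_analysis": 3,
               "monitoring_and_evaluation": 11.25, "institutional_arrangements": 3.33,
               "data_production_and_use": 2, "capacity_development_programs": 2.17}
    print(round(total_score(example), 2))  # 63.25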
Part 1: Upstream Use of Statistics
Section 1: Basic Facts/50
Current Statistics/25
Historical Trends/12.5
Forecasts/12.5
Rationale:
Section 1 measures the frequency of statistical use throughout the policy document, with the aim of determining how often statistics are quoted to present current situations, past trends, or forecasts for future development (“forecasts”, in this sense, refer to any statistic given for future years, whether it is simply a projection based on current conditions or a specific target that the government aims to achieve). This section is the most highly weighted in the scoring system, as it provides a quantifiable way to measure the intensity of statistics usage in the development plan. Countries obtaining a high score in this regard must show that they have the means to measure current and past development indicators or to develop reasonable targets for future progress, and must use these statistics appropriately. On the other hand, if a country states its indicators in less quantifiable terms, the measurement of its development and progress becomes less effective.
This section focuses on textual references to statistics, with an effort to avoid tallying repeated statistics. Most tabular data, unless used to sum up a section of text, is not included in the frequency count, but is considered later in the Further Analysis section (3).
Calculations:
The total frequency (F) of each type of statistic (current statistics, historical trends, forecasts) is tallied separately in the scoring sheet, and given a score out of five, with the scoring intervals based on the average distribution of statistical use among the countries analysed. The breakdown by sector is not relevant to this part, but will be of importance to Section 2.
The scoring is as follows:
Current Statistics
Frequency (F) / Score
F = 0 / 0
1 ≤ F ≤ 14 / 0.5
15 ≤ F ≤ 29 / 1
30 ≤ F ≤ 44 / 1.5
45 ≤ F ≤ 59 / 2
60 ≤ F ≤ 74 / 2.5
75 ≤ F ≤ 89 / 3
90 ≤ F ≤ 104 / 3.5
105 ≤ F ≤ 119 / 4
120 ≤ F ≤ 134 / 4.5
F ≥ 135 / 5
Historical Trends
Frequency (F) / Score
F = 0 / 0
1 ≤ F ≤ 7 / 0.5
8 ≤ F ≤ 15 / 1
16 ≤ F ≤ 23 / 1.5
24 ≤ F ≤ 31 / 2
32 ≤ F ≤ 39 / 2.5
40 ≤ F ≤ 47 / 3
48 ≤ F ≤ 55 / 3.5
56 ≤ F ≤ 63 / 4
64 ≤ F ≤ 71 / 4.5
F ≥ 72 / 5
Forecasts
Frequency (F) / Score
F = 0 / 0
1 ≤ F ≤ 11 / 0.5
12 ≤ F ≤ 23 / 1
24 ≤ F ≤ 35 / 1.5
36 ≤ F ≤ 47 / 2
48 ≤ F ≤ 59 / 2.5
60 ≤ F ≤ 71 / 3
72 ≤ F ≤ 83 / 3.5
84 ≤ F ≤ 95 / 4
96 ≤ F ≤ 107 / 4.5
F ≥ 108 / 5
Once a score out of 5 is obtained for each of the three categories, they are weighted together: the score for Current Statistics is multiplied by 5, while those for Historical Trends and Forecasts are each multiplied by 2.5. These three weighted scores are then added together to obtain a Section 1 score out of a maximum of 50.
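As an illustration of this calculation, the Python sketch below converts raw frequency tallies into the weighted Section 1 score. The tallies used are hypothetical, and the band-scoring helper is simply our shorthand for the tables above.

    # Sketch of the Section 1 (Basic Facts) calculation.

    def band_score(freq, first_upper, width):
        """Score a frequency tally out of 5.

        first_upper : upper bound of the 0.5-score band (14, 7 or 11)
        width       : size of every subsequent band (15, 8 or 12)
        """
        if freq == 0:
            return 0.0
        if freq <= first_upper:
            return 0.5
        # Each further band of `width` tallies adds 0.5, capped at 5.
        band = (freq - first_upper - 1) // width + 2
        return min(band * 0.5, 5.0)

    def section1_score(current, historical, forecasts):
        """Weight the three sub-scores (x5, x2.5, x2.5) to a total out of 50."""
        return (band_score(current, 14, 15) * 5
                + band_score(historical, 7, 8) * 2.5
                + band_score(forecasts, 11, 12) * 2.5)

    # Hypothetical tallies: 92 current statistics, 25 historical trends,
    # 40 forecasts -> 3.5*5 + 2*2.5 + 2*2.5 = 27.5
    print(section1_score(92, 25, 40))  # 27.5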
Section 2: Disaggregated Data/15
Rationale:
Having assessed the sheer frequency of statistical use in the previous part, Section 2 takes into account the scope of data used throughout the document. Therefore, it assigns higher scores to policy documents that use data to measure a wide range of indicators. This means that a country with a very high frequency of data use will not necessarily obtain a high score if it relates to only one or a few indicators or topic areas. To analyse this, the data from Section 1 is broken down into 25 topics and 5 divisions, which were based on those areas that most frequently arose in PRSP reports.
Topics:
- Poverty
- Economic Growth
- Other Macroeconomic Data (inflation, unemployment, etc.)
- Demographics
- Trade
- Health Statistics
- Nutrition
- HIV
- Water & Sanitation
- Energy
- Education
- Literacy
- Infrastructure (roads, transportation, bridges, etc.)
- Farming & Agriculture
- Fisheries
- Forestry
- Mining
- Tourism
- Culture
- Social Security
- Banking & Credit
- Telecommunications
- Housing & Land Ownership
- Environment & Conservation
- Governance (corruption, security, etc.)
Divisions:
- Geography
- Rural/Urban
- Income inequality
- Gender
- Age
The disaggregated data measurement assigns only one score to a topic that is broken down by a given sector. To take an example, assume current poverty rates are given for five regions (that is, they are broken down geographically). While this would count as five separate entries under the Basic Facts section (Poverty, Current Statistics), only one score would be added to the Disaggregated Data section under “Geography” (Current). In effect, one score is given in this section for each measurement that is disaggregated on the basis of a single sector, regardless of how many separate statistics are given within that sector.
Calculations:
Based on the total number of sectors covered (topics + divisions), each type of statistic (current statistics, historical trends, forecasts) is first given a score out of five. These three scores are then averaged and the result multiplied by 3 to obtain the final score for Section 2. The scoring is as follows:
Number of Sectors Covered / Score
0 / 0
1 to 5 / 1
6 to 11 / 2
12 to 17 / 3
18 to 23 / 4
24 to 30 / 5
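The sketch below illustrates the Section 2 arithmetic in Python, assuming the number of sectors covered by each type of statistic has already been counted from the scoring sheet; the counts used are hypothetical.

    # Sketch of the Section 2 (Disaggregated Data) calculation.

    def coverage_score(n_sectors):
        """Score the number of sectors covered (topics + divisions) out of 5."""
        if n_sectors == 0:
            return 0
        if n_sectors <= 5:
            return 1
        if n_sectors <= 11:
            return 2
        if n_sectors <= 17:
            return 3
        if n_sectors <= 23:
            return 4
        return 5  # 24 to 30 sectors

    def section2_score(current, historical, forecasts):
        """Average the three coverage scores, then multiply by 3 (max 15)."""
        scores = [coverage_score(current), coverage_score(historical),
                  coverage_score(forecasts)]
        return sum(scores) / 3 * 3

    # Hypothetical sector counts: 19 for current statistics, 10 for
    # historical trends, 7 for forecasts -> (4 + 2 + 2) / 3 * 3 = 8.0
    print(round(section2_score(19, 10, 7), 2))  # 8.0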
Section 3: Further Analysis/5
Rationale:
This section gives credit for additional analysis that goes beyond stating simple facts, trends or goals. It rewards policy documents that draw links between statistical trends, that use tables and regressions to provide more complex predictions, or develop alternate scenarios for development based on future indicators. This section is scored subjectively out of five, but roughly adheres to the following guidelines:
Guidelines / Score
No analysis present. / 0
One or two correlations between indicator data. / 1
Data linkages; some predictions; models not present. / 2
Modelling graphs and tables present; correlations between data trends outlined; future predictions for a variety of indicators. / 3
Many graphs, tables, and predictions; data provided for several possible future scenarios for a variety of indicators; correlations and linkages between data trends outlined. / 4
Data correlations; growth models developed; advanced graphs, tables, regressions and/or categorical analyses included; detailed growth predictions for a variety of future scenarios. / 5
Part 2: Downstream Use of Statistics
Section 4: Monitoring and Evaluation/15
Rationale:
A monitoring and evaluation framework is vital to measuring the success of the development plan and ensuring that policy-making is based on indicators that are measurable and usable. This section contains four assessments that measure the responsiveness of decision-making to monitoring and evaluation activities. The final score for this section (a maximum of 15), is obtained by averaging four individual scores (out of 5) and multiplying the result by 3.
Calculations:
The first of the four scores is obtained simply by verifying that a monitoring and evaluation framework is in place. This is often found near the end of a policy document in a section entitled “Mechanisms for Implementation, Monitoring-Evaluation and Risks” or something similar. A score of 5 is awarded if this framework is in place – otherwise, a score of 0 is given.
The second score depends on the indicator table(s) present in the document, which often provide baselines and targets for development in line with the Millennium Development Goals and national development strategies. This score takes into account 20 subject areas that are commonly addressed in these tables, and assigns a score of 0-5 based on the scope of their coverage:
Scope of Coverage (Topics) / Score
0 / 0
1 to 4 / 1
5 to 8 / 2
9 to 12 / 3
13 to 16 / 4
17 to 20 / 5
The third and fourth scores look at the baselines and targets, respectively, provided in the indicator table. A score of 0-5 is determined for each depending on the percentage of indicators missing baselines or targets. This provides an idea as to the mechanisms in place for measuring data.
Percentage of Indicators Missing Baseline/Target / Score
100% / 0
30% ≤ X < 100% / 1
20% ≤ X < 30% / 2
10% ≤ X < 20% / 3
0% < X < 10% / 4
0% / 5
Once these four scores are obtained, they are averaged, providing a maximum result of five. This result is then multiplied by three to obtain the sectional score out of 15.
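The following Python sketch pulls the four sub-scores together. The inputs (presence of a framework, number of topics covered, percentages of indicators missing baselines or targets) are hypothetical, and the helper names are ours.

    # Sketch of the Section 4 (Monitoring and Evaluation) calculation.

    def me_framework_score(framework_present):
        """5 if a monitoring and evaluation framework is in place, else 0."""
        return 5 if framework_present else 0

    def indicator_coverage_score(n_topics):
        """Score the number of subject areas (out of 20) covered by the
        indicator table(s): each band of four topics adds one point."""
        if n_topics == 0:
            return 0
        return min((n_topics - 1) // 4 + 1, 5)

    def completeness_score(pct_missing):
        """Score the percentage of indicators missing a baseline or target."""
        if pct_missing == 100:
            return 0
        if pct_missing >= 30:
            return 1
        if pct_missing >= 20:
            return 2
        if pct_missing >= 10:
            return 3
        if pct_missing > 0:
            return 4
        return 5  # no missing baselines/targets

    def section4_score(framework_present, n_topics, pct_missing_baselines,
                       pct_missing_targets):
        """Average the four sub-scores and multiply by 3 (max 15)."""
        scores = [me_framework_score(framework_present),
                  indicator_coverage_score(n_topics),
                  completeness_score(pct_missing_baselines),
                  completeness_score(pct_missing_targets)]
        return sum(scores) / 4 * 3

    # Hypothetical example: framework present, 14 topics covered, 15% of
    # indicators missing baselines, 5% missing targets.
    print(section4_score(True, 14, 15, 5))  # (5 + 4 + 3 + 4) / 4 * 3 = 12.0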
Section 5: Institutional Arrangements/5
Rationale:
This section determines a simple score out of five based on whether or not institutional arrangements are in place for reporting as part of the monitoring and evaluation framework. It considers three elements of the institutional arrangement: the responsibilities delegated to the reporting parties; the details of the reporting processes; and the timing and periodicity of reports.
Calculations:
Each of these three elements is given a score based on whether it is outlined in the document (5 for yes, 0 for no). These three scores are averaged to obtain the score for Section 5.
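A minimal sketch of this calculation in Python, with hypothetical yes/no inputs for the three elements:

    # Sketch of the Section 5 (Institutional Arrangements) calculation.
    def section5_score(responsibilities, processes, timing):
        """Average three yes/no (5/0) scores for the institutional elements."""
        return sum(5 if present else 0
                   for present in (responsibilities, processes, timing)) / 3

    # Hypothetical example: responsibilities and processes outlined,
    # timing/periodicity not specified.
    print(round(section5_score(True, True, False), 2))  # 3.33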
Part 3: Statistical Capacity Development
These final sections on statistical capacity development do not measure the actual usage of statistics in each policy document, but reflect the underlying attitude towards statistics within the document. They reward the identification of statistical problems, proposals for data improvement, and overall acknowledgement of statistics as key to the development process.
Section 6: Data Production and Use/5
Rationale:
Section 6 assigns a score out of five based on the extent to which the policy document identifies weaknesses in current data collection, as well as its discussion of specific initiatives designed to improve data in a particular sector or for a particular purpose (for example, improvement of the health information system or funding for an environmental database and management information system). Examples from the policy documents include:
“…information systems, particularly for data on the economy, still have significant gaps that make conventional decision-making tools unworkable.”
-Mauritania, p. 55
“The NIGERINFO databank will be used to stock and present the indicators that are required for monitoring the various sector strategies and the DPRS. The databank will receive sector data and data from the various surveys. Sector data bases will therefore have to be upgraded.”
-Niger, p. 115
“Develop reliable agricultural statistics and an effective early warning system.”
-Malawi, p. 80
Calculations:
To obtain a score for this section, individual references to the aforementioned criteria are noted throughout the document. They are then tallied and scored according to the following system:
Number of References / Score
0 / 0
1 to 5 / 0.5
6 to 10 / 1
11 to 15 / 1.5
16 to 20 / 2
21 to 25 / 2.5
26 to 30 / 3
31 to 35 / 3.5
36 to 40 / 4
41 to 45 / 4.5
46 or greater / 5
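As a sketch, the tally-to-score conversion above can be expressed as follows in Python (the reference count used is hypothetical):

    # Sketch of the Section 6 (Data Production and Use) calculation.
    def section6_score(n_references):
        """Each band of five references adds 0.5 to the score, capped at 5."""
        if n_references == 0:
            return 0.0
        return min(((n_references - 1) // 5 + 1) * 0.5, 5.0)

    # Hypothetical example: 17 references to data weaknesses or data
    # improvement initiatives -> 2.0
    print(section6_score(17))  # 2.0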
Section 7: Statistical Capacity Development Programs/5
Rationale:
This section is similar to Section 6, but deals with statistical capacity development in a broader sense, that is, with the institutions and programs that entrench statistical analysis in the policy-making process and promote access to and dissemination of statistics as vital to the achievement of a developed and democratic society. Like the previous section, it assigns a score out of five based on a tally of individual references to statistical capacity development programs throughout the document. In addition, it rewards the policy document for containing a section devoted to discussion of the national statistical office or agency, either proposed or already in place, and its achievements, successes, and challenges.
Examples from the policy documents:
“DGSCN will see to broad dissemination of the quantitative data necessary for monitoring and evaluation of the poverty reduction strategy by making use of appropriate channels, particularly the PRSP website and Togo Info. It will publish analyses of poverty in Togo on a regular basis.”
-Togo, p. 99
“Design national statistical Classifiers aligned to European Union standards.”
-Moldova, p. 64
“As regards gender, it is important that both men and women have equal access to accurate, timely and relevant information. This will allow them to participate fully in democratic decision-making, such as voting and contributing to planning processes, and provide them with an evidence base for evaluating government performance at local and national level.”
-Rwanda, p. 89
Calculations:
The final Section 7 score is marked out of a maximum of 5 and is based on two parts, A and B.
Part A: If the document contains a section on the national statistical office or agency, it receives a mark of 1.67; otherwise it receives 0 for this part.
Part B: A score out of five is assigned according to the following table and then multiplied by two-thirds:
Number of References / Score
0 / 0
1 to 2 / 0.5
3 to 4 / 1
5 to 6 / 1.5
7 to 8 / 2
9 to 10 / 2.5
11 to 12 / 3
13 to 14 / 3.5
15 to 16 / 4
17 to 18 / 4.5
19 or greater / 5
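The Python sketch below combines Parts A and B; the inputs are hypothetical and the helper names are ours.

    # Sketch of the Section 7 calculation.

    def part_a_score(has_nso_section):
        """1.67 if the document contains a section on the national
        statistical office or agency, else 0."""
        return 1.67 if has_nso_section else 0.0

    def part_b_score(n_references):
        """Score the tally of references to statistical capacity development
        programs out of 5, then scale by two-thirds (max 3.33)."""
        if n_references == 0:
            raw = 0.0
        else:
            # Each band of two references adds 0.5, capped at 5.
            raw = min(((n_references - 1) // 2 + 1) * 0.5, 5.0)
        return raw * 2 / 3

    def section7_score(has_nso_section, n_references):
        """Combine Part A and Part B; the maximum is 1.67 + 3.33 = 5."""
        return part_a_score(has_nso_section) + part_b_score(n_references)

    # Hypothetical example: a section on the national statistical office
    # plus 9 references -> 1.67 + 2.5 * 2/3
    print(round(section7_score(True, 9), 2))  # 3.34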
Conclusion
The index we have calculated is just one way to measure statistical use in the policy-making process. This is not an exact science. While it can be useful to quantify ‘statistical use’, there is no single correct way to measure the use of statistics, and the design of this index and its scoring system is only one of many possible approaches. While we have tried to make this assessment as objective and transparent as possible, there remains an irreducible degree of subjectivity in the report. The report, and scores, should be used with all this in mind. Because calculating a score is not an exact science, small differences between countries should not be taken as significant. However, large differences can be taken to suggest genuine differences in the PRSP process and its use of evidence. The index, therefore, should be used to provoke debate and pave the way to a more detailed investigation. It seeks to generate debate and raise questions, rather than rank each country in a strict hierarchy.