Reinforcing evidence-based policymaking in education. Methodological developments at CRELL, Centre for Research on Lifelong Learning based on Indicators and Benchmarks.

Andrea Saltelli, Joint Research Centre, Unit of Applied Statistics and Econometrics

Aim

This contribution focuses on the advantages, challenges and limitations of using statistical data for educational research and policy purposes. More specifically, the topic addresses the quality assessment of aggregated indicators (indices, composite indicators) which are routinely developed in order to describe and monitor educational phenomena, policies and progress. A method of assessing the robustness of such indicators, sensitivity analysis, will be presented and illustrated using examples relevant to the field of educational research and policy.

Background

Benchmarking is commonly used throughout the world as a tool for evaluating education.

In the European context, such indicators are, for instance, developed within the strategic framework for European cooperation in education and training for the period up to 2020 (“ET 2020”) [i]. This framework makes use of a set of benchmarks that serve to monitor Member States’ progress towards common EU educational targets. For example, two of the most popular ET 2020-related benchmarks describe countries’ progress towards reducing the share of early school leavers to less than 10% and increasing the proportion of 30-34 year olds who have completed tertiary or equivalent education to at least 40%.

Developing such indicators, and others like them, is a cumbersome process, especially considering the multidimensional nature of the educational phenomena to be described as well as the multitude of theoretical perspectives and data constraints to be taken into account. Such an endeavour often involves several steps: arriving at a common definition of the concept measured (which is often complex and partially dependent on country context); identifying, collecting or building appropriate statistical data to capture that concept; and finally making sure that the models produced are robust and parsimonious enough to be defensible in the face of scientific and technical controversy and, ultimately, to serve their scientific or policy-making purposes well.

However, although statistical methods to assess the robustness of such indicators are available, they seem to be applied by only a limited number of educational researchers. Their use in practice, and their potential to improve the quality of statistical models and research findings, therefore remains largely unexplored.

Methodology, Practical and Scientific Implications

As previously mentioned, the method introduced is sensitivity analysis. In general terms, sensitivity analysis can be defined as “The study of how uncertainty in the output of a model (numerical and otherwise) can be apportioned to different sources of uncertainty in the model input” ([ii]). Applications of sensitivity analysis are meant to determine to what extent a model's output depends upon the information fed to it, its structure and its underlying assumptions. The advantages of the method will be illustrated by presenting two strategies one can employ to test the robustness of aggregated data such as indices and composite indicators. In particular, we discuss one invasive and one non-invasive approach, and illustrate their application to some popular league tables describing university performance as well as to the Human Development Index ([iii]). These examples are only illustrations of the work on composite indicators and methodological developments conducted by the Centre for Research on Lifelong Learning ([iv]), a European Commission research laboratory on education statistics which focuses on developing valid ways of informing EU countries' policies for meeting the common targets, thereby supporting evidence-based policy making in the field of education.

The method can nevertheless serve to assess the robustness of any kind of indicator developed for research and/or educational policy purposes at national and international levels.
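To make the idea of testing a composite indicator's robustness more concrete, the following is a minimal sketch, not the procedure used in the cited studies. It assumes a composite built as a weighted arithmetic mean of already-normalised indicators and treats the weights as the only uncertain assumption. The units, scores and function names are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical, already-normalised scores (0-1) of four units on three indicators.
SCORES = {
    "A": [0.90, 0.85, 0.95],
    "B": [0.80, 0.40, 0.60],
    "C": [0.30, 0.75, 0.65],
    "D": [0.20, 0.10, 0.15],
}


def random_weights(k, rng):
    """Draw k non-negative weights summing to 1 (uniform on the simplex)."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def ranks(scores, weights):
    """Rank units (1 = best) by the weighted arithmetic mean of their scores."""
    composite = {
        unit: sum(w * x for w, x in zip(weights, values))
        for unit, values in scores.items()
    }
    ordered = sorted(composite, key=composite.get, reverse=True)
    return {unit: position for position, unit in enumerate(ordered, start=1)}


def rank_ranges(scores, n_draws=2000, seed=0):
    """Return the (best, worst) rank each unit attains across random weightings."""
    rng = random.Random(seed)
    k = len(next(iter(scores.values())))
    seen = {unit: [] for unit in scores}
    for _ in range(n_draws):
        for unit, r in ranks(scores, random_weights(k, rng)).items():
            seen[unit].append(r)
    return {unit: (min(r), max(r)) for unit, r in seen.items()}


if __name__ == "__main__":
    for unit, (best, worst) in rank_ranges(SCORES).items():
        print(f"{unit}: rank between {best} and {worst}")
```

In this toy data set, units A and D keep the same rank under every weighting because they respectively dominate and are dominated on all three indicators, whereas B and C swap places depending on the weights. A wide rank interval of this kind signals that a unit's position reflects the weighting choice at least as much as the underlying data.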

[i] Council conclusions of 12 May 2009 on a strategic framework for European cooperation in education and training (‘ET 2020’), Council (2009/C119/02).

[ii] See:

Saltelli, A., Tarantola, S., Campolongo, F. (2000). Sensitivity analysis as an ingredient of modelling. Statistical Science, 15(4), 377-395.

Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., Tarantola, S. (2008). Global Sensitivity Analysis. The Primer. John Wiley & Sons, Ltd.

[iii] See:

Saisana, M., d’Hombres, B., Saltelli, A. (2011). Rickety numbers: Volatility of university rankings and policy implications. Research Policy, 40, 165-177.

and

Paruolo, P., Saisana, M., Saltelli, A. (2012). Ratings and rankings: Voodoo or science? Journal of the Royal Statistical Society, 176(2), 1-26.

[iv] One important aspect of CRELL’s activity is the development of benchmarks in the field of education, for example the recently developed employability benchmark approved by the Council of Education in May 2012.