Concepts and Ideas

Monitoring and Evaluation in the practice of European Cohesion Policy 2014+

- European Regional Development Fund and Cohesion Fund –

This draft is a paper for discussion between DG Regional Policy and the Member States' evaluation network in April 2011. The paper will be revised taking into account the discussions and – later in 2011 – the regulatory proposals of the European Commission for a future Cohesion Policy.

The common indicators in annex 1 are set to become part of a regulation (most likely the ERDF regulation or an implementing regulation).

Note that the definitions for "result" and "impact" differ from the current practice.

Table of contents

1. Key concepts

1.1. Intervention logic of a programme as starting point

1.2. Monitoring and evaluation

1.2.1. Monitoring

1.2.2. Evaluation

1.2.2.1. Impact evaluation

1.2.2.2. Implementation evaluation

1.2.2.3. The evaluation of integrated programmes

2. Standards for evaluations

3. Practical points for the 2014-20 programming period

3.1. Programming

3.2. Ex ante evaluation of Operational Programmes

3.3. Monitoring of Operational Programmes and of the partnership contract

3.4. Evaluation during the programming period

3.5. Evaluation plan

3.6. Ex post evaluation

Glossary

Annexes

1 List of common indicators

2 Examples of result indicators

3 A structure for standards

4 Recommended reading

1. Key concepts

This first section explains a common understanding of concepts and key terminology as a basis for the remainder of the paper.

1.1. Programming as a starting point. Results and result indicators[1]

The starting point for designing any public intervention is to identify a problem or a need to be addressed. In essence, as there will always be a multitude of real or perceived needs, the decision on which needs should be tackled is the result of a deliberative social process (a "political decision"). It is part of this process also to define the desired situation to be arrived at as a change from the current one. A public intervention will often aim at more than one result. For instance, investment in the railway network might aim to improve the accessibility of a region and to reduce the burden on the environment.

The intended result, or simply result, is the specific dimension of well-being and progress for people that motivates policy action, i.e. the dimension that the interventions designed are meant to modify.

Once a result has been chosen it must be represented by appropriate measures. This can be done by selecting one or more result indicators. Examples for the railway case above are a reduction in travel time, a reduction in CO2 emissions and fewer traffic fatalities.

Result indicators are variables that provide information on specific aspects of results that lend themselves to measurement.

A precise definition of result indicators aids understanding of the problem and the policy need, and facilitates a later judgement about whether or not objectives have been met. In this context it can be useful to set targets for result indicators. Concentration of resources on a limited number of interventions is obviously helpful in achieving such clarity.

Having identified needs and a desired result does not yet mean that the public intervention has been fully designed. The reason is that in most cases several different factors can drive a change. Any policymaker must analyse these factors and decide which ones will be the object of public policy. In other words, an intervention logic must be established. For example, if a reduced number of traffic accidents is the result indicator of a programme, safer roads, a modal shift towards rail or better driver behaviour could all be assumed to change the situation. The programme designer must clarify which of those factors the programme is intended to affect. The specific activity of programmes can typically be captured by output indicators.

Outputs are the direct products of programmes; they are intended to contribute to results.

Often it can be useful to illustrate an intervention graphically in a logical framework. As mentioned above, such a stylised representation of a programme should reflect that an intervention can address several results and that several outputs can lead to these changes. Equally, it can be useful to differentiate the result(s) by affected groups and time horizons.

Impact is the contribution of the outputs supported by the policy to the change in the result indicator.

Differences in concepts and terms between 2007-13 and 2014+

In 2007-13, impact meant the ultimate effect of the intervention, in most cases after a significant time lapse. Member States and the Commission dedicated resources to distinguishing results (direct or short-term effects) from impacts, without sufficient attention to the question of how values for these categories could be obtained.

The new approach shifts the accent at all stages of the process to the policy objectives being targeted. This enhances evaluability, as clarity about intended changes and ex ante identification of evaluation methods mean that the results of the policy can be monitored and evaluated.

Real policy decisions are driven by needs of differing natures, be they "impacts" or "results".

The new approach brings evaluation guidance closer to this real decision-making.

In large parts of the literature, "impact" means the effect of the intervention net of other influences on a certain variable, independently of whether that variable belongs to outputs, results or impacts in the traditional sense.

The new approach follows the argumentation in the literature. The aim is again to centre attention on the question of evaluability.

Taken together, Cohesion Policy is set to become more result-oriented, with programmes that are designed to deliver and to be evaluated.

1.2 Monitoring and evaluation: support to management and capturing effects

The public expects managing authorities to fulfil two essential tasks when running a programme:

  • to deliver the programme in an efficient manner and to be accountable for this (the management of a programme) and
  • to verify with credibility whether the programme has delivered the desired effects.

We will argue below that monitoring is a tool serving foremost the management purpose, while evaluation contributes to both tasks.

Policy learning is an overarching objective of all evaluations.

1.2.1 Monitoring

To monitor means to observe. Monitoring of outputs means observing whether the desired products are occurring and whether implementation is on track. In general, the outputs measured are the direct and near-term consequences of project activities.

Cohesion Policy programmes are implemented in a context of multilevel governance with a clear demarcation of roles and responsibilities. The actors in this system – implementing agencies, managing authorities, the national and the EU level – differ in the information they need from monitoring. One of the tasks at the European level is to aggregate certain information across all programmes in order to be accountable to the Council, the Parliament, the Court of Auditors and EU citizens in general for what Cohesion Policy resources are spent on. This is the task of the common indicators, mostly outputs, defined at EU level.

Monitoring also observes changes in the result indicators (policy monitoring). Tracking the values of result indicators allows a judgement on whether or not the indicators move in the desired direction, in other words, whether needs are being met. If they are not, this can prompt reflection on the appropriateness and effectiveness of interventions or, indeed, on the appropriateness of the result indicators chosen.

The values of result indicators, both for baselines and at later points in time, can in some cases be obtained from national or regional statistics[2]. In other cases it might be necessary to carry out surveys or to use other observation techniques.

1.2.2 Evaluation

Changes in the result which actually take place are driven both by the actions co-financed by the public intervention (for example, by the Funds) and by other factors. In other words, the difference between the situation before and after the public intervention in most cases does not equal the effect of the public intervention.

Change in result indicator = contribution of intervention + contribution of other factors

1.2.2.1 Impact evaluation – capturing effects

To disentangle the effects of the intervention from the contribution of other factors and to understand the functioning of a programme is the task of impact evaluation. Two distinct questions are to be answered:

  • Did the public intervention have an effect at all and, if so, how large – positive or negative – was this effect? The question is: does it work? Is there a causal link?
  • Why does an intervention produce intended and unintended effects? The goal is to answer the "why does it work?" question.

Sometimes we can provide quantified evidence that an intervention works. More often, evaluations can provide judgements on whether or not the intervention worked. In both cases it is preferable to design a methodology which uses more than one method (the term "triangulation" suggests using three!).

The importance of theory-based impact evaluations stems from the fact that a great deal of information besides the quantifiable causal effect is useful to policymakers in making decisions and being accountable to citizens. The question of why a set of interventions produces effects, intended as well as unintended, for whom and in which context, is as relevant, important and challenging, if not more so, than the "made a difference" question. This approach does not produce a number; it produces a narrative. Theory-based evaluations can provide a precious and rare commodity: insights into why things work, or don't. The main focus is not a counterfactual ("how things would have been without the intervention") but rather a theory of change ("how things should logically work to produce the desired change"). The centrality of the theory of change justifies calling this approach theory-based impact evaluation.

Typical methods are the use of administrative data, literature reviews, case studies, interviews, surveys and other qualitative methods. Often-mentioned approaches are realist evaluation, general elimination methodology and participatory evaluation. The evidence marshalled during such an evaluation, both quantitative and qualitative, should enable the evaluator to answer the evaluation questions and to provide a judgement on the success of the public intervention. As for all other evaluations, this judgement will be based on imperfect information. What is important is that the evidence base is good enough to support decision-making with the degree of certainty necessary for the intervention under consideration.

Counterfactual impact evaluations have the potential to provide a credible answer to the question "does it work?". The central question of counterfactual evaluations is rather narrow (how much difference does a treatment make?) and produces answers that are typically numbers, or more often differences, to which it is plausible to give a causal interpretation based on empirical evidence and some assumptions. Is the difference observed in the outcome after the implementation of the intervention caused by the intervention itself, or by something else? Evaluations of this type are based on models of cause and effect and require a credible and rigorously defined counterfactual to control for factors other than the intervention that might account for the observed change.

Typical methods are difference-in-differences, regression discontinuity design, propensity score matching, the use of instrumental variables and randomised controlled trials. The existence of baseline data and of information on the situation of supported and non-supported beneficiaries at a certain point in time after the public intervention is a critical precondition for the applicability of counterfactual methods.
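To make the difference-in-differences logic concrete, here is a minimal sketch on hypothetical firm-level data (two periods, a supported and a non-supported group); all column names and figures are illustrative assumptions, not prescribed indicators.

```python
# Minimal difference-in-differences sketch on hypothetical data.
# All names and figures are illustrative assumptions.
import pandas as pd

# Outcome (e.g. employees) observed for supported ("treated") and
# non-supported firms, before (post=0) and after (post=1) support.
df = pd.DataFrame({
    "firm":      ["A", "A", "B", "B", "C", "C", "D", "D"],
    "treated":   [1, 1, 1, 1, 0, 0, 0, 0],
    "post":      [0, 1, 0, 1, 0, 1, 0, 1],
    "employees": [50, 62, 40, 49, 45, 50, 55, 61],
})

means = df.groupby(["treated", "post"])["employees"].mean()

# Change over time in each group ...
change_treated = means.loc[(1, 1)] - means.loc[(1, 0)]
change_control = means.loc[(0, 1)] - means.loc[(0, 0)]

# ... and the difference-in-differences estimate: the extra change in
# the treated group, read causally under the "parallel trends"
# assumption (both groups would have evolved alike without support).
did = change_treated - change_control
print(f"DiD estimate of the effect: {did:+.1f} employees per firm")
```

In practice the same quantity would be estimated in a regression framework with control variables and standard errors, but the arithmetic above is the core of the method.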

Note that counterfactual methods can typically be applied only to some interventions (e.g. training, enterprise support), i.e. relatively homogeneous interventions with a high number of beneficiaries. If a public authority wishes to estimate the effects of interventions for which counterfactual methods are inappropriate, other methods can be used. For the transport example, this could be an ex post cost-benefit analysis or a sectoral transport model.
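As an illustration of the ex post cost-benefit route, the following minimal sketch discounts observed annual benefits against the investment cost; all figures and the 4% social discount rate are hypothetical assumptions for illustration.

```python
# Minimal ex post cost-benefit sketch for a transport project.
# All figures and the discount rate are hypothetical assumptions.

discount_rate = 0.04   # assumed social discount rate
investment = 120.0     # one-off cost in year 0, EUR million

# Observed (ex post) annual benefits, EUR million: time savings,
# reduced CO2 emissions and fewer accidents, over years 1..20.
annual_benefit = 9.5
years = range(1, 21)

npv = -investment + sum(
    annual_benefit / (1 + discount_rate) ** t for t in years
)
print(f"Net present value: {npv:+.1f} EUR million")
# A positive NPV indicates that, under these assumptions, the
# project's measured benefits outweigh its costs.
```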

Ideally, counterfactual impact evaluations and theory-based evaluations should complement each other. While they should be kept separate methodologically, policymakers should use the results of both sets of methods as they see fit. Even if counterfactual methods prove that a certain intervention worked, and even put a number on the effect, this is still a finding about one intervention under certain circumstances[3]. More qualitative, "traditional" evaluation techniques are needed to understand to which interventions these findings can be transferred and what determines the degree of transferability.

Impact evaluations of both types are carried out during and after the programming period (ex post evaluation). A well-defined set of impact evaluations during a programming period also means that the often-cited problem of "late" ex post evaluations loses importance.

The ex ante evaluation of programmes can be understood as a kind of theory-based evaluation, testing the strength of the theory of change and the logical framework before the programme is implemented.

Are counterfactual methods another burden on beneficiaries?

The data requirements for counterfactual impact evaluation do not need to be burdensome. In fact, the counterfactual method is at its best when relatively simple indicators are considered, such as:

  • Patent applications[4]
  • Number of employees[5]
  • Investment and turnover[6]

These data are already collected from firms, whether by the tax and labour authorities, by patent offices or by databases such as AMADEUS. The only remaining data burden falls not on firms but on managing authorities (who should be able to specify which firms were assisted by which instrument, and by how much, in order to construct the treated and control groups).
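As a sketch of how such administrative records could be turned into treated and control groups, the example below estimates propensity scores and matches each assisted firm to its nearest non-assisted neighbour; the column names, the generated data and the one-to-one matching design are illustrative assumptions.

```python
# Minimal propensity-score matching sketch on hypothetical
# administrative data; all column names are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 200
firms = pd.DataFrame({
    "employees": rng.integers(5, 500, n),    # pre-intervention size
    "turnover":  rng.uniform(0.1, 50.0, n),  # pre-intervention, EUR m
    "assisted":  rng.integers(0, 2, n),      # 1 = received support
})

# 1. Model the probability of being assisted from observed
#    characteristics (selection on observables is assumed).
X = firms[["employees", "turnover"]]
ps_model = LogisticRegression().fit(X, firms["assisted"])
firms["pscore"] = ps_model.predict_proba(X)[:, 1]

# 2. For each assisted firm, find the non-assisted firm with the
#    closest propensity score (nearest-neighbour matching).
treated = firms[firms["assisted"] == 1]
control = firms[firms["assisted"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 3. The effect estimate would then be the mean difference in a
#    post-intervention outcome between the two matched groups.
print(len(treated), "treated firms matched to", len(matched_control), "controls")
```

The causal reading of such a comparison rests on the assumption that all factors driving both assistance and the outcome are captured by the observed characteristics.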

As a result, in terms of burden on beneficiaries, counterfactual impact evaluation is far less burdensome than more traditional approaches, such as monitoring data (which requires reporting by firms) and beneficiary surveys (which require firms to respond to interviews and questionnaires).

1.2.2.2 Implementation evaluation – the management side

Implementation evaluations look at how a programme is being implemented and managed. Typical questions are whether potential beneficiaries are aware of the programme and have access to it, whether the application procedure is as simple as possible, whether there are clear project selection criteria, whether there is a documented data management system and whether the results of the programme are communicated.

The methods of implementation evaluation are similar to those of theory-based evaluations. Evaluations of this type typically take place early in the programming period.

Is there an ideal evaluation guaranteeing valid answers?

As illustrated by the example of impact evaluations, all methods and approaches have their strengths and weaknesses. All evaluations need:

- to be adapted to the specific question to be answered and to the subject of the programme and its context;

- wherever possible, to examine evaluation questions from different viewpoints and with different methods – the principle of triangulation;

- to have costs that are justified by the possible knowledge gain. When deciding on an evaluation, consider what is already known about the intervention.

In sum, a mixed-method approach is the best approach to evaluation.

To date, Cohesion Policy evaluations have tended to focus more on implementation issues than on capturing the effects of interventions. For the 2014+ period, the Commission wishes to redress this balance and encourage more evaluations at EU, national and regional level which explore the impact of Cohesion Policy interventions on the well-being of citizens, be it economic, social or environmental, or a combination of all three. This is an essential element of the strengthened results focus of the policy.

1.2.2.3 The evaluation of integrated programmes

The evaluation of integrated programmes covering a range of different but interlinked interventions represents a special challenge. One possible strategy is first to evaluate the constituent components of an integrated programme. If their effectiveness can be demonstrated, it becomes more plausible that the whole programme is delivering on its objective.

As a second element, evaluators could assess whether the intervention logic and objectives of the different components fit with each other and make synergies likely to occur.

Thirdly, it is possible to apply methods that assess the effect of the integrated package as a whole. Traditionally this has been done with macroeconomic models. Other methods are also being tested, for example counterfactual methods comparing the development of supported regions with that of non-supported regions. As noted above, a combination of methods is likely to be most effective.

2. Standards for evaluations

In order to ensure the quality of evaluation activities, the Commission recommends that Member States and regions base their work on clearly identified standards, established either by themselves or drawn from the European Commission, national evaluation societies, the OECD or other organisations. Most of these standards converge on principles such as the necessity of planning, the involvement of stakeholders, transparency, the use of rigorous methods, independence and the dissemination of results. A possible structure with some explanations is provided in annex 3.

We recommend consulting the following sources:

  • Quality of an evaluation report: EVALSED, The Guide.
  • Website of European Evaluation Society: It provides access to the standards of national evaluation societies.
  • OECD, 1992. Principles for evaluation of development assistance.

3. Practical points for the programming period 2014-20

The intention of this section is to provide (future) programme managers with some pragmatic ideas on what is required for the monitoring and evaluation of Cohesion Policy, and on what should be done, taking into account the ideas and principles sketched out in the previous section and what has already been presented in the 5th Cohesion Report.

3.1 Programming

Programmes with a clear identification of the changes sought, concentrated on a limited number of interventions, are a decisive condition for efficient and effective monitoring and evaluation throughout the programming period.

3.1.1 Clear objectives as key condition for effective monitoring and evaluation

Each priority (or sub-priority) should identify the socio-economic phenomenon that it intends to change – the result – and one (or a very few) result indicators that best express this intended change. Each priority should express the direction of the desired change (e.g. a reduction or growth in the value of the result indicator). Setting a quantified target or a range for the addressed result indicator, or for the contribution of the programme, might be possible in selected cases.