Response to Consultation on the “Reform of higher education research assessment and funding”[1]

Susan Cooper, 10.10.06

This response gives some general comments in the preface, then replies to the specific consultation questions, and finally offers an assessment of problems with the RAE together with suggestions for improvement.

Preface

The proposal for radical change comes only three years after the 2003 Roberts review of research assessment[2], which, after consultation with the UK academic community, clearly endorsed the need for peer review rather than reliance solely on metrics. For example, paragraph 15 of the executive summary says:

“Some of us believed, at the outset of the process, that there might be some scope for assessing research on the basis of performance indicators, thereby dispensing with the need for a complex and labour-intensive assessment process. Whilst we recognise that metrics may be useful in helping assessors to reach judgements on the value of research, we are now convinced that the only system which will enjoy both the confidence and the consent of the academic community is one based ultimately upon expert review. We are also convinced that only a system based ultimately upon expert judgement is sufficiently resistant to unintended behavioural consequences to prevent distorting the very nature of research activity.”

The course of events described there is instructive – it is tempting at first to hope that the system can be simplified by relying on metrics, but closer examination shows that it cannot.

The reason for the proposal is not clear. It seems to be the ‘high’ cost of the RAE. Paragraphs 4.7 and 4.3 of the ‘next steps’ paper[3] tell us that the cost of the 2008 RAE will be at least £45 million, whereas the QR funds to be distributed will be £1.45 billion per year; averaged over the 7-year period between the 2001 and 2008 RAEs, this is an ‘overhead’ cost of 0.44%. HEPI has estimated[4] the equivalent figure for research council funding at 10%. Objectively, the RAE is an efficient method of distributing funding.
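For transparency, the overhead figure follows directly from the numbers quoted above (a back-of-envelope check; the £1.45 billion of QR is taken as an annual figure, which is what the arithmetic requires):

\[
\frac{\pounds 45\text{m (cost of 2008 RAE)}}{7 \times \pounds 1450\text{m (QR per year)}} = \frac{45}{10\,150} \approx 0.44\% .
\]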

Nevertheless, the RAE is commonly perceived to be a heavy burden. However, no analysis has been made of which aspects are most burdensome. Listing and assessing 4 papers per researcher every 4–8 years is certainly a small fraction of the total effort that went into all the papers which the community produced during that period. It is also only part of the work of preparing the RAE submissions, which require various additional types of information that add considerably to the burden but have an unclear role in an independent determination of quality. If the rest of the RAE were eliminated and only the 4 papers were kept, we might well be able to keep the desirable peer-review aspect of the RAE while considerably reducing its burden. Removing the peer-review aspect instead takes out the good rather than the bad.

Any proposal for a new method of allocating funds needs to be appropriate to the purpose of those funds. The purpose of the QR funding is undergoing change in a way that is not well defined. Paragraph 2.11 of the consultation paper was apparently referring to the current (old) system in saying that the QR “funds the basic research infrastructure – including the salary costs of permanent academic staff, support staff, equipment and libraries … the flexibility to react quickly to emerging fields of enquiry; and the capacity to undertake ‘blue skies’ research.” Much of that is out of date, with fEC project-specific grants now funding project-specific fractions of academic salaries as well as (in principle) all related infrastructure and support staff. However, a clear new role for the QR funding is not expressed; paragraph 3.8 is comparatively vague in saying it is “to support research capacity and capability; … strategic, long-term research; and … enable speculative research.” This issue is crucial – if we don’t know what QR funding is for, we can’t decide how to allocate it. Funding which is primarily intended to provide the missing 20% of fEC grants[5] clearly requires a different allocation method from funding which pays academic salaries, or funds infrastructure, or supports speculative research. The purpose of QR funding must be clarified before any meaningful discussion of the funding mechanism can take place.

An independent system of allocation is required to “maintain the distinct role that QR plays within the dual support system” (para. 3.8). The proposed metric system would make the QR leg of the dual support system entirely dependent on the research council leg. No one method of judging research is perfect, so it makes sense to use two different methods for the two parts of the dual support system. Research councils and other funders judge research plans; it is therefore advantageous to base the QR funding on research results.

Although it is not explicitly stated, the funding of academic salaries is moving from QR funding to fEC research grants. This important shift has been made without discussion of the consequences. Moving responsibility for funding academics’ research time from the HEFCE block grant to the much more volatile and subject-dependent sources of external funding can have severe consequences which need to be properly understood. In fields where grant funding is more readily available, academics are liable to become in effect self-employed, responsible for bringing in grant income to fund the research fraction of their own salaries, and feeling less loyalty to the university and to other duties such as teaching. In other areas it is totally unclear where funding for academic salaries is to come from. It would be a peculiarly skewed dual support system if academic salaries in some areas were paid from research grants and those in others from QR funds. On the other hand, the relative overhead costs of the RAE (~0.5%) and research council funding (~10%) demonstrate that it would be unwise to ask academics in those areas to move towards a system of proposals for research council funding merely to pay their own salaries. The funding of academic salaries needs to be properly thought out.

The assurances given in the ‘next steps’ paper that current QR funding correlates well with Research Council income are based on a misleading correlation, as HEPI has pointed out.[6] The correlation is mainly due to the varying size of the institutions – large institutions naturally get more of both. The fractional changes in funding that would result from switching from QR to an allocation proportional to Research Council funding are large for many institutions.

1. Which, if any, of the RAE 2008 panels might adopt a greater or wholly metrics-based approach?

None. Even a careful introduction of metrics is likely to have severe adverse consequences, as discussed in answers to the following questions. A hasty implementation in 2008 would be extremely unwise.

2. Have we identified all the important metrics? Bearing in mind the need to avoid increasing the overall burden of data collection on institutions, are there other indicators that we should consider?

No metric can fairly judge quality.

Publication metrics: The consultation paper does not mention using the number of publications as a metric, perhaps having already wisely recognised that such a metric would simply create incentives for researchers to divide their work into a larger number of publications. Not only would this fail to increase either the volume or the quality of research, it would degrade the quality of the literature by dividing reports into more incremental, less comprehensible slices. Citations are closer to being a measure of quality, but are subject to many problems and would not remove the incentive for piecemeal publication.

Academic salaries: The current RAE includes a ‘volume metric’ – the number of research-active academics paid from HEFCE funds (plus a small fraction of externally funded research staff) – which is missing from the proposed new models. This is probably because academic salaries are now included in fEC grants, but not all areas of research need research grants (e.g. the humanities). I think this move of academic salaries is unwise. However, if this is how it is to be done, the cost factors or ‘pot sizes’ need to be adjusted to cover academic salaries in the areas that do not use research grants but not in the others.

Pre-selection of research proposals: Paragraph 4.2 suggests proposal success rates could be used as an additional metric. Institutions would inevitably react by filtering applications before they are submitted, which brings various problems. First, it is extra work, as proposals would need to be evaluated twice. Second, institutions would need to judge not the true value of the research proposed but the probability that it will be accepted by the particular research council. Third, individual researchers would lose their independence and become subject to the filtering panels in their institutions. Given the expected pressure on individual researchers to bring in funding to cover their own salaries via exactly these filtered proposals, the consequences of doing anything against the wishes of management would be compounded to a completely unacceptable level.

Numbers of graduate students and PhDs awarded: This is really a volume measure, not a quality measure, as acknowledged in para. 4.3. Using the numbers of graduate students as a metric is liable to push institutions to accept more students whether they are qualified or not, unless the funding provided is closely aligned with the actual cost. Using the numbers of PhDs awarded could create pressure to award degrees even where the quality of work does not merit it.

TRAC: Using as a metric the amount of time academics report spending on research in TRAC (para. 4.3) would create pressure for them to increase the time they spend (or at least report spending) on research, unless comparably funded streams are available for time spent on other activities. If universities are to maintain their role of combining teaching and research, both activities need to be supported in equally visible ways. This is very difficult to do without undesirable micro-management of academic activities.

3. Which of the alternative models described in this chapter do you consider to be the most suitable for STEM subjects? Are there alternative models or refinements of these models that you would want to propose?

None of the models is appropriate: they measure the cost of research, not its quality.

Volume measures: One needs to separate measures of the volume of various research activities which need HEFCE funding from measures of quality. Volume measures include the following:

  • Since fEC grants[5] cover only 80% of the costs, a source must be found to cover the remaining 20%. Rather than solving this part of the overall funding problem by making the QR funding simply proportional to the funding of fEC grants, fEC should be extended to 100%. If this is impossible, then a simple formulaic allocation (separate from QR) should be used to distribute the missing funding. The amount distributed needs to be equal to the missing 20% – conflating it with the QR funding is not only confusing but brings the danger of under-funding this crucial component.
  • Other aspects, e.g. numbers of post-graduate students, should go into separate formulaic funding streams to support the related specific purposes.
  • Academic salaries need to be funded in a way that provides suitable stability and independence.

Pot size: While all the models are defective in determining quality because they rely solely on quantity of funding, the differences between the models are mainly in determining the ‘pot size’ per UoA, a problem that already exists in the current RAE. The current pot size determination has two problems which carry over into the proposed new models:

  • First, there is no clarity on the purpose of QR funds, making it impossible to determine what aspect of costs should be covered. QR may be expected to cover the research part of academic salaries in some subject areas but not in others; this would have a very large effect on the funding required.
  • Second, the current RAE and several of the new models use a fixed factor for the relative cost of different subjects, and that factor is based on how much HEIs actually spend. However, HEIs increasingly allocate QR income directly to subjects ‘as earned’ via HEFCE’s formula, so the system has become circular: the cost factor follows spending, and spending increasingly follows the cost factor (a sketch of this feedback loop follows this list). An independent judgement is required, for each subject, of the component of total cost which QR is intended to cover (which in turn requires clarity on the purpose of QR funding). While such judgement is not easy to obtain, one would hope that the government would want to take appropriate care in allocating its funds. Universities seem to be in no doubt that science departments are under-funded, and the government’s evident concern over the closure of science departments would normally be expected to result in increased funding to support that area.
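A minimal sketch of the circularity (the notation is mine, not the consultation paper’s): let \(c_s^{(t)}\) be the cost factor for subject \(s\) in year \(t\), let \(V_s\) be its volume measure, and suppose HEIs pass QR income to departments ‘as earned’. Then

\[
QR_s^{(t)} \;\propto\; c_s^{(t)}\, V_s ,
\qquad
c_s^{(t+1)} \;\propto\; \mathrm{spend}_s^{(t)} \;\approx\; QR_s^{(t)} + \mathrm{other}_s^{(t)} ,
\]

so the cost factor increasingly tracks its own previous value rather than any independent judgement of what research in subject \(s\) actually costs.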

In the following discussion of the specific models, I assume that the missing fEC fraction has been taken care of via other funding, since uncertainty on this question makes discussion of an appropriate model impossible. I find the descriptions of the different models difficult to understand and may have misunderstood some of them; my comments are based on my uncertain interpretation.

  • Model A is inappropriate in making the funding simply proportional to external research grant income. This selectively rewards expensive subjects, independent of their importance or quality, and severely penalises subjects which require only an academic’s thinking and writing.
  • Model B introduces the number of researchers in a subject as a volume indicator and uses the existing HESA cost factors. This is an improvement (although subject to the circularity problem described above), but, as stated in the consultation paper, it requires a way of deciding which researchers to count. I believe the best way is one similar to the current RAE, but with much greater flexibility in the number of papers submitted and a proportionally reduced FTE for those submitting fewer than the standard number of papers (see the sketch after this list). This would provide more flexibility and fairer support for academics who have spent less than the average time on research, for whatever reason – including teaching or administrative duties, public outreach, unpublished work with industry, etc.
  • Model C determines the pot size for a subject from the HESA cost factor and the number of researchers in departments with RAE ratings above a threshold. As described in the consultation paper, this threshold is inappropriate, since the later distribution of funds makes strong use of the variation in ratings above the threshold: HEFCE’s May 2007 paper “Funding higher education in England” gives the factors as 1.0, 3.1, and 3.9 for ratings of 4, 5, and 5* respectively. However, this serious defect could be remedied by simply weighting the numbers of researchers by the factors corresponding to the ratings of their departments, as sketched after this list.
  • Model D relies on historical RAE data and locks it in for the future. This is completely inappropriate.
  • Model E modifies Model A to give less weight to charity funding. The motivation is simply to avoid giving too much money to medical research. Surely a more rational process is required – one which, like Model B or C, also gives proper consideration to other subjects on the basis of their costs.
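To make the two suggestions above concrete (the notation is mine and purely illustrative): for Model B, a researcher submitting \(n_i\) papers against a standard number \(N\) (currently 4) could be counted at

\[
\mathrm{FTE}_i^{\mathrm{eff}} = \mathrm{FTE}_i \times \min\!\left(1, \frac{n_i}{N}\right),
\]

and for Model C, the pot for subject \(s\) could be built from rating-weighted researcher numbers rather than a threshold:

\[
\mathrm{pot}_s \;\propto\; c_s \sum_{d \in s} w_{r(d)}\,\mathrm{FTE}_d ,
\qquad
w_4 = 1.0,\; w_5 = 3.1,\; w_{5^*} = 3.9 ,
\]

where \(c_s\) is the HESA cost factor and \(w_{r(d)}\) is the weight corresponding to the RAE rating of department \(d\).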

Assessment criteria create a powerful force: The consultation paper admits (paragraph 5.2) that the proposed models correlate well only in aggregate and do not provide quality assessments at subject level within a university. It is an illusion to think that a defective assessment system can be used to allocate funds to universities in the expectation that the universities will remedy its defects through their internal allocation of funds. These times of tight funding are driving universities to worry about their financial survival; in such conditions, it becomes imperative for them to adjust their internal priorities to maximise their scores against the assessment criteria. The government must take seriously its responsibility for devising a fair assessment and funding system.

Judgement of quality: The QR funding should retain the true meaning of a dual support system – an independent source of funding giving institutions the flexibility to support some research on the basis of their own judgement, independent of the research councils, etc. – and be judged on research outcomes rather than research proposals. The proposed metrics rely entirely on the latter. The current RAE should not be abandoned in favour of a system which is so defective. Instead, one should keep what is good in the RAE and improve what needs improving.

4. What, in your view, would be an appropriate and workable basis for assessing and funding research in non-STEM subjects?

The prospect of different funding methods in different subject areas is fraught with difficulties and the prospect of injustice. In particular, where should the border between STEM and non-STEM be placed? Interdisciplinary subjects that cross the border would be strongly disadvantaged. In addition, non-laboratory subjects such as mathematics, as well as the theoretical side of any science, have much in common with non-STEM subjects in that they require less project-specific funding; they would be disadvantaged by being placed in a pool with laboratory-based STEM subjects whose funding is based on a metric of income from research councils, etc.