Beyond the ‘learning society’: what have we learnt from widening participation research?

Stephen Gorard and Emma Smith

Department of Educational Studies

University of York

YO10 5DD

Paper presented at the British Educational Research Association Annual Conference, University of Glamorgan, 14-17 September 2005

Abstract

This paper emerges from a recent review of evidence, conducted by the authors and others, on the lifelong barriers to widening participation in higher education in England. This has led us to a consideration of the quality and relevance of the research activity in this large field of endeavour, and to the creation of a typology of the kinds of widespread problems we then encountered. These include pseudo-research, poor quality reporting of research, deficiencies in datasets, analytical errors, a lack of suitable comparators, obfuscation, a lack of scepticism in general, and the regular misattribution of causal links in particular. The paper discusses each of these, and illustrates them using generally high-profile research studies and publications. We found a substantial proportion of non-empirical pieces. Of the remainder, we found a substantial proportion that did not report their methods or their findings sufficiently well. Of the remainder that were empirical and did explain their methods and findings sufficiently, we found a substantial proportion in which the findings could not support the conclusions drawn from them. The paper ends with a plea for a great deal more ‘learning’ and openness to new ideas among those engaged, lifelong, in researching lifelong learning.

The review of widening participation

We have recently conducted a review of the available evidence relevant to widening participation (WP) in higher education (HE) in England. Our especial concern was with understanding and ameliorating the barriers to participation experienced by potential learners within a lifelong model. As part of the collation process, we advertised widely for evidence, proactively contacted key lists and organisations, and systematically searched journals and websites. As a result, we created an EndNote database of summaries of around 1200 research reports (available at http://www.york.ac.uk/depts/educ/equity/barriers.htm). The substantive results of the review are being made available elsewhere. The focus of this paper is on the nature and quality of the research into lifelong patterns of participation in HE, as evidenced by the pieces uncovered by the review.

One of our first findings is that a substantial proportion of ostensible research reports do not actually report new research evidence or analysis of any kind. This phenomenon has been noted before in other contexts (e.g. Gorard et al. 2004). There are, of course, literature reviews which are useful for future reviewers as a ready source of references and, if conducted rigorously and sceptically, can provide a useful synthesis of an entire area. There are also research method and methodological pieces which are, on occasion, thought-provoking and helpful (as we hope this one will be). But in addition to these, the research literature contains a high proportion of ‘thought-pieces’ with no clear empirical content, no summary of the research of others, and no assistance to others intending to conduct research. On occasion these thought-pieces provide a genuinely new or radical idea about research, policy or practice (e.g. Walford 2004), but in too many cases they are almost incomprehensible, appearing to play to the gallery of a particular clique rather than laying out an argument for the more general academic reader to follow. We exclude all three of these categories, totalling perhaps 40% of the literature, from further consideration here. Our review focused only on what we can learn from any relevant research evidence as it is reported by those who understand how it was conducted and, therefore, what its particular limitations are.

The quality of research reports

A further finding of the review, also noted previously in other contexts (e.g. Tooley with Darby 1998), is that primary reports of research evidence are often unsatisfactory. It has been traditional to assess the validity of research at least partly in terms of whether it has been evaluated or peer-reviewed (e.g. Kahn and Macdonald 2004). Peer review is used as the major quality control mechanism for academic publications, but it is inconsistent between publishers, journals, and reviewers. It also tends to suppress innovative and imaginative work, can create a ‘file-drawer’ problem wherein only ‘significant’ results are published and, in education at least, takes so long that it leads to the publication of already dated work. Given the problems we illustrate below with some research reports in high-ranked social science journals, from prestigious institutions, or by well-known figures in the field, we decided that it is not possible to rely on any kind of ‘kitemark’ for the quality of evidence. It is not the case that passing peer-review to appear in a prestigious journal is, in itself, a guarantee of quality in research or research reporting. Nor is it possible to rely on work from specific individuals, institutions, or organisations.

Therefore, our judgement of the usefulness of any research evidence depends heavily on the quality of its reporting. We should expect a research report to contain, somewhere, all of the basic information needed for another researcher to replicate the work, including the analysis. We may need to know the number of cases, how they were selected, the research design, instruments, context of data collection, methods of analysis, prevalence of findings and so on. This should be summarised at the start of the report. We should also expect a fuller description of what has been found, including a description of the evidence (and not merely an account of what the researcher believes it to signify). These minimum criteria are not especially demanding, yet many reports do not contain this information.

Reports differ in the amount of information they give about the nature and methods of research. Reports giving more information of this type cannot automatically be favoured over others, given that the information provided may show up flaws in the research. But reports that give less information (and so do not allow any flaws to show up) must not be favoured over fuller reports either. Therefore, in the absence of any research methods information the reviewer must assume the ‘worst’. Where totally insufficient information is given, the report must be excluded from further consideration, as though it were the equivalent of a non-research report (which it is, in effect). Clearly, the succeeding sections of this paper, which focus on generic problems in reported research, rely almost exclusively on the best-reported research. The quality of reporting is high enough in these cases to start a discussion of the fit between research questions, methods, evidence and the conclusions drawn. It is important, for a number of reasons, that readers recall when reading the rest of this paper that the examples are drawn from among the best of reported research.

Research reports, perhaps especially of work traditionally termed ‘qualitative’, regularly present their conclusions as though these were the findings, and present the actual evidence for the conclusions sparsely – often merely as illustrations. In such reports, there is no way of knowing how widespread any finding is, or how likely it is that a different analyst would reach similar conclusions. Therefore, only the illustrations provided can be used as the evidence for any review. Note that this is not always so, and that higher quality reports describe the prevalence of patterns clearly, or present some form of inter-rater validation of the analysis (e.g. Ball et al. 2002). Research reports of complex statistical models can produce some of the same problems for a reviewer, where the reviewer does not have access to the dataset and the actual data is not summarised clearly. The general pattern of research problems encountered during our review is not especially about the quality or desirability of different methods of investigation or analysis. Also, the issue of concern here is not whether the conclusions drawn by researchers can be shown to be true or not by other means. What is common to most of the problems illustrated below is that the research reports do not provide evidence that can support the conclusions drawn by the researchers – i.e. the latter are not warranted by the evidence actually presented for them (Gorard 2002). This complaint applies even more strongly to reports that present conclusions based on poorly reported research.

What is WP research for?

The UK widening participation agenda is predicated on the notion that particular social groups, defined perhaps by social class or ethnic background, are unfairly under-represented in higher education. However, no single large-scale dataset currently exists that can establish whether this unfair under-representation is in fact so (Gorard 2005). Nor do we have clear evidence on whether the situation is getting better or worse over time. There are a number of reasons for this state of affairs, almost all outside the control of researchers. All of the large datasets relevant to establishing the nature of the problem that WP research is intended to solve are deficient in some way. Many of the problems are caused by missing data. In addition, there are widespread deficiencies in the analysis of the data, and a reluctance to highlight the uncertainties caused by initial problems in the data.

Imagine what would be needed to establish the unfair under-representation of any social group in HE – to establish, in other words, that there should be more of a particular social group in HE than there is at any time. We would need to be able to define the group clearly, in such a way that the definition could be used by different people in different places at different times to mean the same thing. Unfortunately, the categorisation of social groups by occupational class or ethnicity is a matter of judgement; the categories themselves are arbitrary; they interact in important ways with each other and with other categories such as sex; they have changed significantly in recent years in the UK; and the significance of the categories themselves (such as the meaning of being in a non-manual occupation) changes with prevalence and with historical and economic development.

We would then need to know the prevalence of that social group in the relevant population. Unfortunately, when researching lifelong learning, it is not clear what the relevant group is. An analyst using all adults is open to the charge that the inclusion of people over the age of 90, for example, is irrelevant since so few of these are currently participating in HE. An analyst using only young adults, however, is open to the charge of presuming that WP is only about traditional-age students. The population census only happens every ten years. Not everyone actually takes part, and not everyone who takes part responds to the class and ethnicity questions (see below). And the categories used for the class and ethnicity questions are not the same in the 1991 and 2001 censuses, nor are they always the same as those used in other large datasets – such as the individualised student records (ISRs) held by the Higher Education Statistics Agency (HESA) for all students, the UCAS database of applicants, or the annual schools census.
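
As a minimal sketch of how much this choice of ‘relevant population’ matters, the following fragment (with entirely invented figures) computes a participation rate for the same hypothetical group against two different bases:

```python
# Illustrative only: all figures below are invented for the sake of the example.

he_entrants_from_group = 12_000      # hypothetical count of HE entrants from the group

group_in_all_adults = 1_500_000      # hypothetical size of the group among all adults
group_in_young_adults = 240_000      # hypothetical size among young adults only

rate_all_adults = he_entrants_from_group / group_in_all_adults
rate_young_adults = he_entrants_from_group / group_in_young_adults

print(f"Rate using all adults as the base:   {rate_all_adults:.1%}")    # 0.8%
print(f"Rate using young adults as the base: {rate_young_adults:.1%}")  # 5.0%
```

The apparent participation rate of the same group changes by a factor of more than six simply because the denominator changes; nothing about the group itself has altered.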

We would then need to know the prevalence of that social group among those who have participated in HE. This apparently simple act of measuring also faces problems. We need to know what proportion of the population have already participated in HE (even if they did not receive a qualification). We need to define ‘HE’, and decide whether to include level 4 courses in FE colleges, level 3 courses in HE institutions, postgraduate students, and professional training. We need to decide whether to distinguish between UK, Commonwealth, and EU home students. If not, then our prior population figures become more problematic. If so, then some datasets make it difficult to distinguish between categories of home students. Any variation in these decisions over time, or between analysts, makes comparisons difficult. As with the general population figures, there will be incompleteness in HE records (see below), and for some years the ‘Individualised’ Student Records are not actually linked to individuals but to courses, so that a part-time student taking two courses in two different institutions does not have a unique identifier, and is in danger of being counted twice.
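
To make the double-counting point concrete, the following sketch (invented records and hypothetical field names throughout) contrasts a naive count of enrolments with a count of unique individuals, and then forms the kind of representation ratio that any claim of under- or over-representation implicitly relies upon:

```python
# Illustrative sketch only: invented figures and hypothetical record fields.

# Course-level records (as in some years of the ISR): one row per enrolment,
# so a part-time student taking two courses appears twice.
records = [
    {"person": "A", "group": "X"},
    {"person": "A", "group": "X"},   # same student, second course
    {"person": "B", "group": "X"},
    {"person": "C", "group": "Y"},
]

# Naive count (by enrolment) versus deduplicated count (by person).
naive_group_x = sum(r["group"] == "X" for r in records)                    # 3
unique_group_x = len({r["person"] for r in records if r["group"] == "X"})  # 2

# Representation ratio: share of group X among HE participants divided by its
# share of whatever 'relevant population' has been chosen (figure invented).
share_in_he = unique_group_x / len({r["person"] for r in records})         # 2/3
share_in_population = 0.50                                                 # hypothetical
representation_ratio = share_in_he / share_in_population                   # 1.33

print(naive_group_x, unique_group_x, round(representation_ratio, 2))
```

A ratio of one would indicate proportionate representation; whether the figure actually computed means anything depends entirely on the definitional decisions listed above.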

Some studies have attempted to overcome some of these limitations by using postcode data with Geographic Information Systems (GIS). However, these are still limited by the completeness, or otherwise, of the census. In addition, there is the added problem of the availability and accuracy of the home postcodes of students. For example, only 47% of Welsh-domiciled students in 2002/03 had valid postcodes (Taylor and Gorard 2005), and even this figure depends on some contestable assumptions about the nature of ‘domicile’. Are students to be counted as domiciled where they reside to study, or where their parents live? Is it even possible to use the same definition of domicile for traditional-age and mature students? If not, does this make aggregation of their figures less valid? And so on. Once these decisions have been made, analysts are still faced with the fact that the reason they are using GIS is so that they can associate individuals with the average background characteristics of the area in which they live. Students are, thus, assumed to have the same occupational background as the modal category for their area. Whether this genuinely improves the quality of the analysis is debatable.
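
As a sketch of the step at issue, using hypothetical postcode sectors and invented area profiles, the following fragment assigns every student the modal occupational class of the area they are matched to, while students without a valid postcode drop out of the analysis altogether:

```python
# Illustrative only: hypothetical postcode sectors and invented class profiles.

# Area-level census profile: proportion of residents in each occupational class.
area_profiles = {
    "AB1 2": {"managerial": 0.45, "intermediate": 0.35, "routine": 0.20},
    "CD3 4": {"managerial": 0.20, "intermediate": 0.30, "routine": 0.50},
}

def modal_class(postcode_sector):
    """Return the most common class in the area (the ecological assumption)."""
    profile = area_profiles[postcode_sector]
    return max(profile, key=profile.get)

students = [
    {"id": 1, "postcode_sector": "AB1 2"},
    {"id": 2, "postcode_sector": "CD3 4"},
    {"id": 3, "postcode_sector": None},   # cf. the many records with no valid postcode
]

for s in students:
    sector = s["postcode_sector"]
    # Each individual inherits the modal category of their area, or is lost entirely.
    s["imputed_class"] = modal_class(sector) if sector in area_profiles else "unknown"

print(students)
```

Even in this toy example, a majority (55%) of residents of the first area are not in its modal class, yet every student matched to it is treated as ‘managerial’.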