Evidence-based teaching - as if values, and all the evidence mattered
Geoff Petty, Oct 2012

I am a teacher turned teacher trainer and writer, not a researcher. I am author of ‘Teaching Today’ and more recently ‘Evidence Based Teaching: a Practical Approach.’ Both are widely used teacher training texts. I am interested in what works best, in understanding learning and teaching, and in how teachers can mine educational research for the benefit of their students’ life chances and enjoyment, and to help teachers work in a more efficient and enjoyable way.

Teachers need to understand which teaching strategies work best, why they work so well, when to use them, and when not. The aim is to improve teachers’ concept of the teaching and learning process, so they can teach better (I will explain why this is the case further on). This turns out to benefit students hugely, but also the teachers, because the best methods get the students rather than the teachers to work harder, and they make teaching and learning more enjoyable.

You could fill an entire library with just the summaries of educational research, in the form of handbooks and other summary texts. How are we to mine this wisdom? The time required to study, sift and prioritise these materials is well beyond the capacity of one lifetime. But there is a way….

Most books for teachers purport to summarise this research, if not explicitly then implicitly, but the writers have usually dipped into the mountain of research to pick out the bits that support their viewpoint and prejudices. This partly explains why experts and books can often disagree. What would education look like if you stood on this mountain of research and used it as objectively as possible to determine what teachers and their masters do?

There are different schools of research

From the teacher’s perspective there are four main types of research in education, and if I can simplify enormously for a moment:

Quantitative research: tells us what has the greatest impact on achievement, for example which teaching strategies work best

Qualitative research: tells us what learning is, and how to bring it about. It also tells us why highly effective methods work, and how to use them to maximum effect.

Field Research: tells us what the best teachers do

Critical Theory and Humanistic Psychology: considers what education is and should be trying to do, what’s wrong with our curriculum, and what values and interests inform different teaching approaches.

We need all four for a complete view; let’s look at each sort in more detail:

Quantitative Research

This only started to yield really useful results when ‘effect sizes’ began to be used, from the 1980s onwards. The best research consists of rigorous control-group studies, including randomised controlled trials (RCTs), with real teachers in real schools and colleges.

There are hundreds of thousands of studies like this worldwide, on many hundreds of teaching strategies (and on other factors that affect achievement, such as gender, or having a single parent).

Systematic reviews of such studies and other evidence can usually:

  • Estimate an average effect size for the strategy or factor using all the best studies in the review (a simple numerical sketch of this pooling follows the list)
  • Look at what was special about the studies with the highest and lowest effect sizes to try to work out what is going on, and, from the teacher’s perspective, how to get a high effect size
  • Group studies to see whether, for example, older students are more affected by a factor or strategy than younger students. For teaching strategies there are usually very few or no ‘mediating factors’ such as this, which is surprising but wonderfully simplifying.
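
To make the idea of ‘pooling’ concrete, here is a minimal Python sketch of a sample-size-weighted average of effect sizes from a handful of hypothetical studies. It is only an illustration of the arithmetic, not the actual procedure of any particular review (real reviews typically use more sophisticated inverse-variance weighting), and all the study figures are invented.

    # Minimal illustrative sketch: pooling effect sizes from several studies
    # into a sample-size-weighted average. All figures below are invented.
    studies = [
        {"name": "Study A", "effect_size": 0.9, "n": 120, "age_group": "secondary"},
        {"name": "Study B", "effect_size": 1.1, "n": 60,  "age_group": "primary"},
        {"name": "Study C", "effect_size": 0.7, "n": 200, "age_group": "secondary"},
    ]

    def pooled_effect_size(subset):
        """Average the effect sizes, weighting each study by its sample size."""
        total_n = sum(s["n"] for s in subset)
        return sum(s["effect_size"] * s["n"] for s in subset) / total_n

    print(f"Pooled effect size: {pooled_effect_size(studies):.2f}")

    # Grouping the studies (e.g. by age group) and pooling each group separately
    # is how a review looks for 'mediating factors'.
    for group in sorted({s["age_group"] for s in studies}):
        subset = [s for s in studies if s["age_group"] == group]
        print(f"{group}: {pooled_effect_size(subset):.2f}")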

Effect size is measured in standard deviations, and is the improvement in average achievement shown by the experimental group over the control group. For example, students writing their own notes instead of being given handouts has a very large average effect size of about 1.0 standard deviations, which is equivalent to:

  • improving the achievement of a student in the experimental group by 2 grades at GCSE or A level, compared to if they had been in the control group
  • more or less doubling the amount a student learns if they are in the experimental rather than the control group
  • an average student at the 50th percentile being ‘promoted’ to roughly the 84th percentile, that is, into the top 16% (the arithmetic is sketched below).
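
As a rough illustration of this arithmetic, here is a short Python sketch that computes an effect size from two invented sets of test scores and converts it into the percentile reached by an average experimental-group student, assuming normally distributed scores. The scores, and the choice of the control group’s standard deviation as the yardstick, are illustrative assumptions only; reviews differ in exactly how they standardise.

    # Minimal sketch of the effect-size arithmetic described above.
    # The score lists below are invented for illustration only.
    from statistics import mean, stdev
    from math import erf, sqrt

    control      = [52, 48, 55, 60, 45, 50, 58, 47]  # hypothetical control-group scores
    experimental = [57, 53, 60, 65, 50, 55, 63, 52]  # hypothetical experimental-group scores

    # Effect size: difference in group means divided by the spread
    # (standard deviation) of the control group's scores.
    effect_size = (mean(experimental) - mean(control)) / stdev(control)

    # Percentile of the control distribution reached by the average experimental
    # student, assuming normally distributed scores (standard normal CDF).
    percentile = 0.5 * (1 + erf(effect_size / sqrt(2))) * 100

    print(f"Effect size: {effect_size:.2f} standard deviations")
    print(f"An average experimental student sits at about the {percentile:.0f}th "
          f"percentile of the control group")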

Such an improvement is vast, and turns a good teacher into an outstanding one. There is a small number of teaching methods that have such power, and we know how to use them to maximum effect, thanks to qualitative understanding of them. However, it takes teachers about two years of practice to gain the maximum effect with most methods. The effect of learners’ age on the effect sizes is almost always found to be zero; an exception is homework, which doesn’t work as well with primary school children as it does for older learners. Nearly all methods work in primary, secondary, college and university, though of course they need to be adapted to these different contexts.

Note that it is possible to compare factors and methods to determine which have the greatest effect on achievement, as these are all measured in terms of average effect size.

Notably this has been done by Professor John Hattie (see his “Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement”, which summarises all the quantitative studies that have been systematically reviewed). He has found 800 such reviews, looking at different factors, strategies and methods, and has drawn up an effect-size table that shows what has the greatest effect on achievement. This table is not easy to read from the perspective of a typical teacher, as some things on Hattie’s table cannot be manipulated, or are not available to most teachers. If these are removed we get the following table of effect sizes. The first term in each row is Hattie’s; any explanation following it is mine. The ‘Rank’ is the rank in Hattie’s original table.

Rank / Influence: factor, method, technique etc being tested / Average effect-size
5 / Acceleration: moving students up a year if they are doing very well / 0.88
6 / Classroom behavioural: methods to improve student behaviour and discipline / 0.88
7 / Comprehensive interventions for learning disabled students / 0.77
8 / Teacher clarity: clarity of goals, explanations, summaries and reviewing etc / 0.75
9 / Reciprocal teaching: a special form of small group teaching to improve students’ comprehension skills / 0.74
10 / Feedback: formative feedback to students; the most effective feedback tells the learner what they have done well, and how to improve. This ‘medal and mission’ feedback has a much higher effect size than the 0.73 average, and can come from self and peer assessment as well as from the teacher. / 0.73
12 / Spaced versus Massed Practice. Instead of teaching something once and leaving it, visiting the topic repeatedly / 0.71
13 / Meta-cognition: thinking about your own learning, how you manage tasks. Reflection and target setting. / 0.69
19 / Professional Development. This is highest when it uses communities of practice. / 0.62
20 / Problem-solving teaching / 0.61
22 / Phonics instruction / 0.60
23 / Teaching strategies: some methods have a much higher effect size than this average / 0.59
24 / Cooperative vs individualistic learning / 0.59
25 / Study Skills / 0.59
26 / Direct Instruction: A specific form of active learning with teacher/student dialogue called ‘Whole Class Interactive Teaching’ in the UK. / 0.59
29 / Mastery Learning. A particular formative assessment approach where students repeatedly take short self-assessed tests until they pass / 0.58

Hattie finds that most effect sizes are positive, and that the average is about 0.4 (see the diagram above). An effect size of 0.6 he calls ‘high’.

It is important to recognise that Hattie’s table shows average effect sizes. For example, although feedback has an effect size of 0.73, if it is done well (for example self-assessment, or teachers providing information about what has been done well and what needs improving) the effect size is near 1.0. All these effect sizes are from research that uses standardised tests. (Prof Robert Marzano, below, uses teacher-devised tests, which makes his effect sizes about 25% higher.)

Prof Robert Marzano in the States has done something very similar but limited to factors that are in the teacher’s control such as teaching methods. See page 5.

Hattie and Marzano find that what has the greatest effect on achievement is the precise teaching strategy used by the teacher. It is not the teacher as a person that makes the difference, but the strategies the teacher uses.

Some teaching strategies such as ‘medal and mission’ feedback raise students’ achievement by the equivalent of 2 grades at GCSE or A level, more or less doubling the rate at which students learn. The methods that work best are often ‘active’ methods.

This research has shown that teaching is very improvable.

Many teachers say active learning would be great ‘if they had the time’. But the control and experimental groups in these studies had the same time, so if you make the time for effective active learning by doing less didactic teaching, your students will do better.

These methods work best at every academic level. Peter Westwood, summarising research on how best to teach students with learning difficulties, argued for highly structured, intensive, well directed, active learning methods: exactly the methods that Hattie and Marzano find work best.

Which teaching methods work best?

Let’s look at some examples of methods that have done particularly well in these rigorous trials; the trials use teacher-devised tests and the methods have average effect sizes of around 1.0 standard deviations.

‘Same and different’ Effect Size 1.3: Tasks that require the learner to identify similarities and differences between two or more topics or concepts, often one they are familiar with, and one they are presently studying: 'Compare and contrast viral and bacterial infections'

‘Graphic organisers’ Effect Size 1.2: The student creates their own diagrammatic representation of what they are learning, for example a mind-map, flow diagram or comparison table. They get out of their place to look at other students’ work, to help them improve their own. Then they self-assess their own diagram against a model diagram provided by the teacher.

‘Decisions-Decisions’ Effect Size 0.89: Students are given a set of cards to match, group, rank, or sequence. For example: ‘rank these advantages of stock taking in order of importance, then sort them by who benefits: customer, business, supplier, or investor.’ Students are asked to reject any ‘spurious’ cards that do not describe an advantage of stock taking.

‘Feedback’ Effect Size 1.1: There are many feedback methods, including self-assessment and peer assessment. Ask students to decide what was done well and what they could improve, or inform them of this. However, ‘empty praise’ such as ‘excellent work’ doesn’t work, and grades or percentages inhibit learning so must be used sparingly.

‘Hypothesis testing’ Effect Size 0.79: You give students a statement that is partly true, but partly false: “The more advertising the better”; “Cromwell was religiously motivated”. Then you ask them to work in groups to evaluate the statement. When the groups have finished you get one reason in favour of the hypothesis from each group in turn, continuing until all their reasons have been given. You nominate a member of each group to give the reason and to justify it: ‘why did your group think that?’ When a reason has been given say ‘thank you’ but don’t agree or disagree with it. Repeat for reasons against. When all the reasons are in, ask the class as a whole to try to agree reasons for and against. Then give your thoughts on their ideas.

Reviews of quantitative research nearly always provide a qualitative description, based on qualitative research, of why some effect sizes are higher or lower. For example, I expect you can guess why these methods work: they force students to think and to make sense of what they are being taught, and they correct misconceptions. (See p11)

Qualitative Research

Most research is of this kind, and from the teacher’s perspective it is most useful for telling us what the learning process is, what brings it about, and what inhibits it.

Research in this area has been summarised by Bransford et al. (2000). Findings from this body of research include:

  • Set challenging tasks/goals and include some that are not entirely achievable (e.g. open, challenging tasks)
  • Strive to get students to understand deeply, rather than simply be able to recall parrot fashion
  • Students create their own meanings; they don’t just remember what teachers say, so we need to set tasks that get students to express their understanding, and then use this expressed understanding as feedback that enables teachers to correct and improve that understanding. See diagram on page 11
  • Give students ownership and responsibility for their own learning, for example by self or peer assessment
  • Learning and meanings are socially constructed – encourage dialogue….

The very methods that qualitative research recommends are often the ones that get the highest effect sizes in quantitative trials. For example, qualitative researchers stress the importance of feedback, and we find from quantitative research reviews that feedback has a very high effect size on average, and that if it is done well it can achieve an effect size of 1.0 or more. What’s more, the reviews of research on feedback, which cover both quantitative and qualitative studies, have shown us how to make feedback as effective as possible.

So qualitative research explains why the high effect size methods work so well, and how to make them work to their maximum. And quantitative research shows which qualitative ideas are most important.

Field Research

The question asked by this research is “What do the very best teachers do?” There is very little good field research, and there needs to be much more.

The researchers identify exceptionally good teachers, for example from the point of view of ‘value-added’ achievement (improvement in achievement). Then these teachers are observed and interviewed, and the research attempts to determine why they are so successful. This might include comparing the exceptionally good teachers with a control group of teachers who are similarly experienced, but whose results are not so good.

I was struck, in studying this research, by how often these exceptional teachers use the methods recommended by qualitative research and the methods with exceptionally high effect sizes, even when these methods are rather old-fashioned or unusual. For example, exceptional teachers almost all use high quality Whole Class Interactive Teaching (direct instruction) with whole class dialogue, and they require students to write their own notes (Ayres et al 2004).

Critical Theory and Humanistic Psychology

Suppose a teaching method worked brilliantly, but it turned students off the subject, and even off education more generally? Or suppose a method worked well for the most able students in the class, but not for the others?

What are we to do? Luckily this tension is rare, but it occurs in some cramming methodologies and in the ‘setting’ (ability grouping) of students. Both these approaches have low or very low effect sizes. Hattie addresses the setting issue in ‘Visible Learning’ and comes down against it as a strategy, despite modest effect sizes for more able students, because of its discouraging effect on weaker learners.

The biggest problem in the UK education system is that the curriculum is almost exclusively academic. Practical and vocational education have been excised almost completely. Arts education is also very poorly represented. The whole focus is an ineffective attempt to make the student a useful contributor to the economy, rather than to treat the learner, and their flourishing, as an end in itself. It is a serious human rights issue in my view, but teachers have little control over these weaknesses, so I pass them by here.

How do we inform teachers of this research?

One could be forgiven for thinking that once research has established the methods, strategies, mindsets, etc. that work best, and why, it would be a simple matter of telling teachers about this, or even of simply requiring them to use these methods. But summaries of evidence show that this strategy does not work. Teachers are extremely practised, skilful and comfortable with the methods they are familiar with, and one-off training events are not enough to change their practice.

There are two major research reviews on how to improve teaching to the point where student achievement is raised. One is a book by Joyce and Showers (2002), the other a rigorous research review by Helen Timperley (2007). They both find that simply informing teachers of best practice does not work. Indeed, the only methodology that does work is to set up what are variously called “Communities of Practice”, “Supported Experiments”, or “Teacher Learning Communities”. These are groups of teachers who support each other in trying out highly effective teaching strategies and adapting them with repeated use, until they work well for teachers and students.