Introductory Learning (HJC)

Animal Learning

The handout outlines what you need to know. There will be two lectures (one on classical and one on instrumental learning). The handout covers both.

  • These lectures are introductory and the objective is in each case to understand what is meant by the key terms and phenomena given (bulleted) below.
  • The handout gives a verbal explanation.
  • The slides will mostly show you data that make the same point. You would not be expected to reproduce these in an exam, and I haven’t put the figures on the handout (this would break copyright!). Sketch any graphs that help your understanding.

Reading

Introductory texts:

Any number of these will help you with the key terms and ideas (just check in the index).

I used: Gray, P. (1999). Psychology (3rd Ed.) Worth.

Main texts for core animal learning theory:

Chance, P. (1999). Learning and Behaviour. (4th Ed). Brooks/Cole.

Pearce, J.M. (1997). Animal Learning and Cognition. (2nd Ed.) Psychology Press.

Hall, G. (1983). Behaviour: An Introduction to Psychology as a Biological Science. Academic Press.

Alternative texts for core animal learning theory:

Dickinson, A. (1980) Contemporary Animal Learning Theory. CUP.

Mackintosh, N. J. (1983) Conditioning and Associative Learning. OUP. Chapter 2. [Difficult but might be some use for reference].

Mackintosh, N. J. (1974) The Psychology of Animal Learning. Academic Press. [Again difficult but might be some use for reference. Much of this still relevant as alternative to Mackintosh, 1983].

Lecture 1: Classical Conditioning

  • Pavlov’s dogs.

Pavlov discovered that dogs could be conditioned to salivate in response to stimuli that reliably preceded the presentation of food. This was the discovery of classical or Pavlovian conditioning.

  • Unconditioned stimulus and response (UCS, UCR).

Without any learning a motivationally significant stimulus like food will normally result in a response (in this case salivation).

  • Conditioned stimulus and response (CS, CR).

A previously neutral stimulus (without any prior motivational significance) like a bell will not elicit a response. After several pairings with a UCS like food it becomes conditioned and does elicit a response (in this case salivation). This is like the law of association by contiguity, but Pavlov could measure learning objectively.

  • Acquisition and extinction.

Learning can be plotted as the increase in conditioned responses over trials. Extinction (‘unlearning’) can be measured as the decrease in responding to the CS over trials when the UCS is no longer presented.

  • Generalisation.

Conditioning to one stimulus will generalise to others. For example, if we condition an eye-blink CR to a 1200 Hertz tone, there will also be some response to other tone frequencies. The level of response depends on how close the tone is to the training frequency.

  • The conditioned emotional response and suppression ratio.

1. Baseline response: some measurable activity in a motivated animal (e.g. licking for water or lever pressing for food)

2. Conditioning: CS (e.g. light stimulus) -> UCS (e.g. mild footshock)

3. Test of strength of learning: (a) reinstate responding; (b) present CS

Measure:

(a) time to make 10 licks prior to the signal (A period)

(b) time to make 10 licks after the signal (B period)

Compute the suppression ratio (SR): SR = A/(A+B).

The suppression ratio ranges from 0.5 when the A and B periods are exactly the same (i.e. no learning) down to zero (i.e. strong learning). Low scores are seen when an animal that did drink before the CS presentation stops afterwards: as the B period gets longer, the SR tends towards zero.

A similar ratio is used when the measure of learning is number of responses in a set time rather than time to make a set number of responses.
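To make the arithmetic concrete, the calculation above can be sketched in a few lines of Python (the function name and example timings are my own, for illustration only):

```python
def suppression_ratio(a: float, b: float) -> float:
    """Suppression ratio SR = A / (A + B).

    a: time (s) to make 10 licks before the CS (A period)
    b: time (s) to make 10 licks after the CS (B period)
    """
    return a / (a + b)

# No learning: A and B periods identical -> SR = 0.5
print(suppression_ratio(20.0, 20.0))   # 0.5

# Strong conditioning: licking is suppressed, so the B period
# is much longer than the A period -> SR approaches zero
print(suppression_ratio(20.0, 380.0))  # 0.05
```

Note that as B grows without bound for a fixed A, the ratio tends towards zero, matching the description above.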

  • Autoshaping.

If a stimulus reliably precedes a UCS like food delivery, CRs will soon follow. Actions like key-pecking for a pigeon or lever-pressing for a rat can be turned into instrumental responses if the contingencies are now changed so that food does not now follow automatically, but only if the designated response is made. From the animal’s point of view, what we think of as a CR may always have some instrumental action (e.g. salivation may make food taste better).

  • Temporal contiguity.

Pavlov’s theory: Simply the idea that events must be presented closely together if they are to become associated.

  • Garcia’s taste aversion.

Garcia showed that temporal contiguity is not in fact necessary for conditioning to proceed. Aversion to a taste stimulus that precedes illness occurs over quite long intervals between the experience of the taste and the subsequent illness.

  • Stimulus substitution.

Pavlov’s idea that the CR for any given UCS will always be part of the set of UCRs seen to that UCS. It turns out that this is often (but not necessarily) true.

  • Extinction and spontaneous recovery.

Responses that seem to have been unlearned through extinction (decreased responding to the CS over trials when the UCS is no longer presented) can sometimes reappear.

  • Behaviour therapy.

Most fears are not learned, but where there are obvious triggers they can be treated by exposure to those triggers (without adverse consequences) until the conditioned response extinguishes.

Lecture 2: Instrumental Learning

  • Thorndike’s kitten.

Thorndike found that a cat placed in a puzzle box could learn to open it by pushing a lever on the floor. With repeated experience of the puzzle box, the cats would get faster and faster at making the required response.

Thorndike developed the idea that of the many possible responses in the puzzle box, only the one resulting in the satisfaction of escape would be strengthened. Thus effective responses would be more likely to be emitted on successive trials, and satisfiers would stamp in S-R habits.

  • Problems with the “Law of Effect”.

In more naturalistic settings, animals may need to learn a whole series of responses to reach the ultimate reward. How do earlier responses, which are still necessary but distant from the animal’s goal, get stamped in?

Tolman found that rats use cognitive maps and that they learn various routes to reward, irrespective of whether they’ve previously been rewarded for taking a particular path. However, this spatial learning only becomes apparent when reward is introduced (Tolman & Honzik, 1930).

  • Animals show sensitivity to the consequences of their actions, and this just doesn’t fit with Thorndike’s S-R theory. Dickinson & Dawson (1987) trained hungry rats to press a lever; some got sugar water and some dry food. When tested later in extinction (without further reinforcement), but now thirsty, the rats previously trained with dry food made fewer responses than those trained with sugar water.
  • The Skinner box.

When the rat presses a lever, food or water is automatically delivered. The equipment is controlled by computer and records responses as the required instrumental response (the lever press) is learned.

  • Autoshaping.

Autoshaping is often used to get instrumental learning going. In the Skinner box example, the levers are normally retracted in the side walls of the chamber. If lever presentation precedes food delivery, through classical conditioning the rat will usually come to contact the lever. We all know pigeons are stupid (they’ll peck keylights even if this response actually prevents an otherwise programmed food delivery!) but there was a big debate over whether rats bite or gnaw levers (I’ve seen them do it).

  • Positive reinforcement.

A positive reinforcer is one that the animal will work to obtain.

  • Negative reinforcement.

A negative reinforcer is one that the animal will work to remove or terminate. Contrast punishment, which always results in a reduction in responding.

  • Response shaping.

Behavioural modification relies on the use of reinforcement. This can also involve the use of secondary reinforcers like money or some other token for human subjects. Secondary reinforcers are associated with motivationally significant primary reinforcement through classical conditioning.

The fact that biofeedback can work shows that even physiological responses can be brought under instrumental control.

  • Partial reinforcement.

In instrumental learning, when it’s the response that determines the outcome, learning is paradoxically stronger when the response is only sometimes reinforced. Partial reinforcement can be set up so that a reinforcer occurs after every nth response (fixed ratio); for the first response after a set period of time (fixed interval); after a variable number of responses around some average (variable ratio); or for the first response after a variable period of time around some average interval (variable interval).

The variable (more unpredictable) schedules produce the greatest resistance to extinction.
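The ratio schedules above can be sketched as simple Python generators that flag which responses earn a reinforcer (the function names and the uniform spread around the mean are my own illustrative choices, not a standard implementation):

```python
import random

def fixed_ratio_schedule(n):
    """Reinforce (yield True) every nth response, else yield False."""
    count = 0
    while True:
        count += 1
        if count == n:
            count = 0
            yield True
        else:
            yield False

def variable_ratio_schedule(mean, rng=None):
    """Reinforce after a variable number of responses averaging `mean`.

    Here the required count is drawn uniformly from 1..(2*mean - 1),
    so each reinforcer arrives unpredictably but averages `mean` responses.
    """
    rng = rng or random.Random()
    count = 0
    target = rng.randint(1, 2 * mean - 1)
    while True:
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean - 1)
            yield True
        else:
            yield False

# 12 responses on a fixed-ratio-4 schedule: reinforced on responses 4, 8, 12
fr4 = fixed_ratio_schedule(4)
print([next(fr4) for _ in range(12)])
```

Interval schedules would be timed on elapsed clock time rather than response count, but the same idea applies: the variable versions make the next reinforcer unpredictable, which is what produces the greater resistance to extinction.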

Partial reinforcement may help to explain why gambling is so persistent.

  • Imprinting & observational learning.

Just to say that there are other forms of learning!
