Chapter 10 – Perceiving Depth and Size

Issues

1. Why is this chapter where it is in the text?

The processes of perception dictate that we must perceive the object before we can perceive its distance from us.

So this chapter naturally follows the chapter on object perception.

It needn’t follow the chapter on color.

Ultimately, to understand how we perceive the distance an object is from us, we must understand how we perceive the object itself.

2. Why is the name of this chapter Perceiving Depth and Size, rather than Perceiving Size and Depth? Why does Depth get to go first?

As we’ll see, there is a lot of evidence that distance trumps size.

How does the visual system create perception in three dimensions?

Let’s first think about how the brain knows where objects are located in two dimensions.

I. Assume all objects are at the same distance from the observer, so that the brain only has to determine horizontal and vertical locations.

If all objects were at the same distance from us, the position of objects in space would correspond closely to location of activity on the retina, and due to retinotopic maps, to location of activity in the visual cortex.

This means that location of activity in the cortex could serve as an indicator of location in space if all objects were the same distance from us.

II. Problem: Now assume that objects can vary in distance from the observer – points x and y in the figure below.

When points vary in 3 dimensions, location of activity on the retina (and in cortex) does not correspond in a direct fashion to location in space.

So position of activity on the retina could signal differences in vertical position, but it could also signal differences in distance. How are the retina and, therefore, the cortex to know?

So the issue is: How does the brain identify the location of objects in 3 dimensions when the location of activity on the retina gives an ambiguous indication of location?

So, how do we know how far things are from us?

Perhaps we know distance by making size judgments.

Theory: We determine the size of an object and then use that to determine distance.

Size of image on the retina would seem to be the stimulus characteristic most easily related to distance.

After all, as objects get farther away, their images on the retina get smaller.

So, a logical rule for judging distance would be: Objects which have smaller retinal images are farther away.

Problems with this theory of distance perception:

1. Requires prior knowledge of the object. You would be unable to judge the distance of unfamiliar objects. How would you know that the retinal image is “small” unless you’d had prior experience with the object?

2. Common objects come in different sizes, so size will be ambiguous.

3. There is a huge amount of evidence that, in fact, distance is perceived before size, and that distance information then determines our perception of size.

So no one believes that the perception of the size of an object has much to do with our perception of its distance from us.

Theories of perception of depth / distance

1. Cue theory. (Structuralist-like)

Depth is synthesized through the combination of many imperfect cues.

People with this perspective believe that the visual world contains many indicators of distance.

However, each single indicator by itself ambiguously represents distance of an object.

So, each of the many cues is an ambiguous indicator, giving a little bit of information about distance, but not the whole picture.

People holding this perspective believe that the many ambiguous or imperfect indicators must be synthesized into a perception of distance by higher order cortical processes.

So the key aspects of this view are

1) Individual cues for distance are ambiguous. No one cue gives a perfect indication of distance.

2) The ambiguous cues are integrated/synthesized by higher order cortical processes.

3) The perception of distance is an inference based on the integration/synthesis of many cues.
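One simple way to make the synthesis idea concrete is a reliability-weighted average of cue-by-cue distance estimates. This is a toy sketch, not a model from these notes – the cue names, weights, and the averaging rule are illustrative assumptions:

```python
def integrate_cues(estimates):
    """Toy cue-integration sketch: each cue contributes a
    (distance_estimate, weight) pair, where the weight stands in for
    the cue's reliability.  The combined percept is the
    reliability-weighted average of the individual estimates."""
    total_weight = sum(w for _, w in estimates)
    return sum(d * w for d, w in estimates) / total_weight

# Hypothetical cues, each ambiguous on its own (values invented):
cues = [(4.0, 0.5),   # occlusion-based guess, low reliability
        (5.0, 2.0),   # binocular disparity, high reliability
        (6.0, 1.0)]   # relative size, moderate reliability
print(integrate_cues(cues))  # pulled toward the most reliable cue
```

The point of the sketch is only that no single cue fixes the answer; the output lies between the individual estimates, dominated by the most reliable one.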

If you’re thinking “Structuralism” here, you’re correct. This approach is very much like the analysis/synthesis approach advocated by the Structuralists. If you’re thinking that a question comparing theories of object perception and depth perception would be a good integrative final exam essay question, you’re also correct.

2. The Ecological approach. (Gestalt-like)

This approach holds that our perception of distance of objects is a direct experience of some complex characteristic of the visual stimulus that is unambiguously related to distance.

The main proponent of this approach was James Gibson.

This approach holds that there is no need for “higher order integration” because there exists an unambiguous stimulus characteristic that directly elicits depth perceptions in some lower cortical center.

The Gibsonian view emphasizes complex stimulus configurations that change from moment to moment.

The view holds that the unambiguous cues for distance exist in the configuration of the stimulus.

If you’re thinking, Gestalt approach, you’re correct.

Resolution

The current thinking probably supports an analysis/synthesis view of distance perception.

But Gibson and colleagues certainly contributed to the search for cues for distance perception, and the jury is still out on the nature of the integration of information from the visual stimulus regarding distance perception.

Cues for distance perception.

Depth Cues
    Monocular
        Static / Pictorial
            Occlusion
            Relative Height
            Relative Size
            Familiar Size
            Atmospheric Perspective
            Linear Perspective
            Texture Gradient
        Dynamic
            Motion Parallax
            Deletion/Accretion
    Binocular
        Binocular Disparity
    Oculomotor
        Accommodation
        Convergence

Oculomotor – These cues may result from our experience of the sensations from the eye muscles, or from our experience of the signals sent to those muscles.

Some videos related to depth perception

*Monocular depth cues:

Student project; on a field; illustrate using their placement on the grassy field

*Pitting cues for depth perception against each other:

Simple, short;

*Augmented Reality:

*Medical augmented reality:

Not really about depth perception; rather, it is about combining two visual images of the same object into an augmented reality view. One of the images is regular, the other is x-ray.

*Test for stereoscopic perception using blue/red glasses:

Gets pretty technical; related to hitting a baseball; 7:26 total

Two Cues that are least ambiguous

1. Motion / movement parallax

The differences in the speeds of movement of the images of objects on the retina when those objects move at the same physical speed (or when the observer moves): the images of nearer objects sweep across the retina faster than the images of farther objects.

A monocular cue. Requires only one eye.

Requires that the observer be able to perceive motion. This means that some people may not be able to perceive depth generated through motion parallax because they can’t perceive motion.
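The inverse relation between distance and retinal image speed can be sketched numerically. A minimal illustration (the function name and the scenario are mine; it uses the small-angle case of an observer translating past stationary objects viewed perpendicular to the motion):

```python
def retinal_angular_speed(observer_speed, distance):
    """Approximate angular speed (radians/sec) of a stationary object's
    retinal image for an observer moving sideways at observer_speed
    (m/s), with the object at `distance` meters, viewed perpendicular
    to the direction of motion.  Small-angle case: omega = v / d."""
    return observer_speed / distance

# A passenger translating at 1 m/s: a nearby fence post vs. a distant tree
near = retinal_angular_speed(1.0, 2.0)    # image sweeps by quickly
far = retinal_angular_speed(1.0, 50.0)    # image barely moves
print(near, far)
```

The nearer object's image moves 25 times faster here, which is the difference in image speeds that motion parallax exploits as a distance cue.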

Demonstration

Mike - These files are in MDBT\P312\Ch08Depth\MotionParallaxDemos

Web demos

Good:

Interesting: parallax/index.html

The upper room / lower room screen clearly shows the effect of motion parallax.

Neural Correlates of Motion Parallax

There is mounting evidence that there are neurons or neuron clusters that respond to motion parallax.

The existence of such neurons would give support to the Gibsonian idea of depth perception as the direct experience of a specific characteristic of the external stimulus – in this case motion parallax.

2. Binocular Disparity

The use of the differences between the projections of the visual scene on the left and right retinas.

Since each eye views the same scene from a slightly different angle, the projection of the visual scene on the left retina will be slightly different from the projection on the right retina. The left eye “sees” a slightly different scene than the right eye. Following is an example of the difference in the two eyes’ views.

The images shown in the two photos below are arranged so that the left image is the view seen by the right eye and the right image is the view seen by the left eye. They’re backward from what we might expect so that those people who can cross their eyes in such a manner as to cause the two images to “fuse” will see them in depth.

Note that the differences between the images are not huge. You must inspect the two figures to discover the small differences.

Yet the visual system integrates the two views into a single perception of the scene, with the differences in distance of the various objects incorporated into it.

(A simple demonstration of the different images seen by each eye can be obtained by holding a finger up in front of your eyes and viewing the scene alternately with the left eye and then the right eye.)

Details of Binocular Disparity

Horopter: An arc surrounding us with the following characteristic: The images of all objects on the horopter are projected to corresponding points on the retina.

All images on the horopter project to corresponding points on the two retinas.

So A and A’, B and B’, and C and C’ are three pairs of corresponding images.

Images of objects beyond the fixation point project to disparate points.

Images of objects closer than the fixation point also project to disparate points.

So A and ~A and C and ~C are pairs of disparate points on the retinas.

Why do we not always see double images of objects that are not on the horopter?

The following figure shows the left and the right retina viewing an object, F.

Object C is the same distance from the eyes as F. It is on the horopter defined by F.

Object B is closer to the eyes than F and C. B’s image projects onto disparate points on the two retinas.

Without special processing of the activation caused by B, it would be seen as a double image. But the visual system integrates the two disparate points to yield a single image of B perceived at a closer distance than F and C. It is believed that the special processing is the activation of special, disparity-detecting neurons.
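The geometry just described can be put into an approximate formula. As a sketch (the 6.5 cm interocular distance and the small-angle approximation are standard textbook assumptions, not values from these notes):

```python
def binocular_disparity(fixation_dist, object_dist, interocular=0.065):
    """Approximate angular disparity (radians) of an object's images
    when the eyes fixate at fixation_dist (meters), using the
    small-angle approximation:
        delta = I * (1/d_object - 1/d_fixation)
    Positive -> object nearer than the horopter (crossed disparity);
    negative -> farther than the horopter (uncrossed disparity);
    zero -> object on the horopter, projecting to corresponding points."""
    return interocular * (1.0 / object_dist - 1.0 / fixation_dist)

# Fixating F at 1 m: object B at 0.5 m gives crossed disparity,
# object C at 1 m lies on the horopter and gives zero disparity.
print(binocular_disparity(1.0, 0.5))   # positive (crossed)
print(binocular_disparity(1.0, 1.0))   # zero (on the horopter)
print(binocular_disparity(1.0, 2.0))   # negative (uncrossed)
```

The sign and magnitude of this disparity are exactly what the disparity-detecting neurons are thought to encode.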


Creating binocular disparity: Presenting different images to each eye. How Avatar could have been presented.

There are several ways of presenting separate, different images to each eye. The goal is to create a situation in which the left eye receives only the left-eye image and the right eye receives only the right-eye image.

1. The natural way. View any scene with the two eyes open. The left eye image is slightly different from the right eye scene by virtue of the fact that the eyes are separated. Each eye sees only its own scene.

2. Use a stereoscope. The stereoscope is a device which holds two images (usually photos) and forces the left eye to see one image and the right eye to see the other.

Demonstrate this.

3. Use polarized lenses. (The IMAX/ Avatar / Alice in Wonderland method.)

The left eye view is projected using vertically polarized light. The right eye view is projected using horizontally polarized light. Both views are projected at the same place in the visual field.

The left eye is covered with a vertically polarized filter. This allows only the vertically polarized light to strike the left eye. Conversely, the right eye is covered with a horizontally polarized filter, allowing only horizontally polarized light to strike the right eye.

This process results in the left eye “seeing” only the left-eye view and the right eye “seeing” only the right eye view.

4. Use colored glasses.

The left-eye view is created using a predominantly blue image. The right eye view is created using a predominantly red image.

The left eye is covered with a red lens. This blocks the predominantly red image but passes the predominantly blue image, allowing the left eye to “see” the blue image. Vice versa for the right eye, covered by a blue lens. Demonstrations later.

5. Use special surfaces that allow one image to be viewed by the left eye and another image to be viewed by the right eye.

6. View a “backwards” stereoscopic slide of two images with crossed eyes so that the left eye and right eye double images converge to one. See image on page 7 of these notes. See also:

7. Create a regular stereoscopic image with diverged eyes, with the left-eye view to the left and the right-eye view to the right. Diverge the eyes so that the left-eye and right-eye double images are experienced as one image. I can’t do this.

8. Use specially created glasses electronically synchronized with the display. Used in 3-D TVs now for sale. The left-eye glass becomes clear and the right-eye glass is made opaque for 1/30 sec while the left-eye image is displayed. Then the left-eye glass becomes opaque and the right-eye glass is made clear for the next 1/30 sec as the right-eye image is displayed.

Examples of stimuli that demonstrate binocular disparity when viewed through red and blue lenses.

You need colored glasses for these demonstrations. Red lens over the left eye. Blue lens over the right.

The red lens blocks the red image and passes the blue – sending the blue image to the left eye.

The blue lens blocks the blue image and passes the red – sending the red image to the right eye.
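The channel assignment described above can be sketched in code. A minimal illustration (pure Python, with grayscale views as 2-D lists; the function name is mine): the left-eye view goes into the blue channel and the right-eye view into the red channel, so the red lens over the left eye blocks the red (right-eye) image and passes the blue (left-eye) image, and vice versa.

```python
def make_anaglyph(left_img, right_img):
    """Combine two grayscale views (2-D lists, values 0-255) into one
    RGB anaglyph.  Per the note above: the left-eye view is drawn in
    blue and the right-eye view in red, so a red lens (left eye)
    blocks the red image and passes the blue one, while a blue lens
    (right eye) blocks the blue image and passes the red one."""
    rows, cols = len(left_img), len(left_img[0])
    return [[(right_img[r][c], 0, left_img[r][c])   # (R, G, B)
             for c in range(cols)] for r in range(rows)]

# Tiny 2x2 example views (values invented for illustration)
left = [[255, 0], [0, 255]]
right = [[0, 255], [255, 0]]
anaglyph = make_anaglyph(left, right)
print(anaglyph[0][0])  # (0, 0, 255): left view in blue, right in red
```

A real anaglyph would apply the same per-pixel rule to two photographs taken from slightly different vantage points, recreating the binocular disparity of natural viewing.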

1. Two squares. Left eye: Red lens. Right eye: Blue lens.

You should see a box in front of the page.

Now reverse the lenses: Left eye: Blue lens. Right eye: Red lens.

You should now see the square behind the page.

2. Text: Left eye: Red lens. Right eye: Blue lens.

The center rectangle of text should appear in front of the page.

Perception of Depth – 11/12/2019

Gorilla at large: (1954 Movie starring Cameron Mitchell, Lee J Cobb, Anne Bancroft, Raymond Burr, and Lee Marvin)

Left eye: Red lens. Right eye: Blue lens.


Do we need monocular views of the object to perceive its distance?

Random Dot Stereograms.

Do we need to perceive the object with each eye before we perceive its depth? Or can the disparity of object elements define the object itself? Most objects we see in depth can be experienced by either eye alone.

But consider the following square . . .

It can be seen by either eye. It can be identified. But now consider the same square embedded in a larger square.

Where is the original square? It’s there, because I cut it out of the larger square and pasted it onto this page. But neither eye alone, nor both together, can identify the object in question – the square.

Now consider the following. The square above is in both sides of the following figure.

When viewing the two larger squares normally, most people (perhaps all people) cannot see the smaller square within each.

But if the left half of the above figure is presented to the left eye and the right half to the right eye, most people with normal binocular vision can identify the smaller square within the larger squares.

This indicates that an object whose left-eye image activates disparity-detecting neurons and whose right-eye image activates the same disparity-detecting neurons can be identified.

This is object definition through binocular disparity – through the correspondence of images between the eyes. The images are defined by the correspondence.
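The logic of such displays (Julesz-style random-dot stereograms) can be sketched in code. This is an illustrative sketch only – the sizes, shift amount, and function name are my own choices:

```python
import random

def random_dot_stereogram(size=50, square=20, shift=2, seed=0):
    """Generate a left/right random-dot pair.  Start with one random
    dot field (the left-eye image); in the right-eye copy, shift a
    central square region `shift` pixels leftward and fill the exposed
    gap with fresh random dots.  Neither image alone contains a visible
    square, but the shifted region carries binocular disparity, so the
    square appears at a different depth when the two images are fused."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # start as an exact copy
    lo, hi = (size - square) // 2, (size + square) // 2
    for r in range(lo, hi):
        for c in range(lo, hi):
            right[r][c - shift] = left[r][c]  # shift region leftward
        for c in range(hi - shift, hi):
            right[r][c] = rng.randint(0, 1)   # fill the exposed gap
    return left, right

left, right = random_dot_stereogram()
```

The square is defined only by the correspondence between the two dot fields – exactly the "object definition through binocular disparity" described above.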

Idea for television show – a killer leaves clues as random dot stereograms such as the two squares above. Only the hero can cross his/her eyes to be able to read the clues.

The above display requires a stereo viewer. The display below, which is based on the same principle can be viewed with colored lenses.

See VL 10.8 for a better random dot stereogram viewed through colored glasses. Be sure to click on “Draw it!”
Perception of Size

Most obvious visual scene cues for object size is visual angleof the object.

A large object has a large visual angle. A small object has a small visual angle.

The problem with visual angle as an indicator of size is that visual angle also depends on distance of the object.

So viewing the large object at a distance makes the visual angle of the large object as small as the visual angle of the small object viewed close up.

This means that visual angle depends on BOTH object size and object distance.
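The confound can be made concrete with the standard visual-angle formula, theta = 2 * arctan(size / (2 * distance)). A minimal sketch (the particular object sizes and distances are invented for illustration):

```python
import math

def visual_angle(object_size, distance):
    """Visual angle in degrees subtended by an object of object_size
    meters viewed at `distance` meters:
        theta = 2 * atan(size / (2 * distance))"""
    return math.degrees(2 * math.atan(object_size / (2 * distance)))

# The confound: a 2 m object at 20 m subtends exactly the same angle
# as a 0.2 m object at 2 m, because size/distance is equal in both.
big_far = visual_angle(2.0, 20.0)
small_near = visual_angle(0.2, 2.0)
print(big_far, small_near)  # identical visual angles
```

From the retinal image alone, the two objects are indistinguishable in size, which is exactly why visual angle by itself cannot determine perceived size.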

So, how do we perceive size in the face of the confound illustrated here?

The Holway and Boring Experiment to determine if distance perception affects perception of size.