Nov 19, 2019 By Team YoungWonks *
How does depth perception work? How do humans and cameras detect distances to objects? And why should we know about it? This blog answers some of these questions and more. For starters, let’s look at what depth perception is.
What is Depth Perception?
Depth perception is the ability to visually perceive the world and its objects in three dimensions (3D) and to judge the distance of such objects. Depth sensation is the corresponding term for animals. Animals can evidently sense the distance of an object, since they move accurately or respond consistently according to that distance, but it is not known whether they “perceive” distance the way humans do.
Among humans, depth perception relies chiefly on binocular vision, which makes stereopsis possible. In other words, having two eyes allows us to have binocular vision. Someone who lacks this has to depend on other visual cues to gauge depth, and their depth perception will be less accurate. And while we do use other cues in our environment for depth perception, the most important one is having binocular vision.
The farther apart the eyes are, the better one's depth perception, provided both eyes still view the same part of the scene. Insects, animals, and fish that have their eyes spaced very far apart thus have a very high level of depth perception.
Why Is Depth Perception Important?
A lack of depth perception can affect some key areas of our lives. It can be caused by conditions such as amblyopia, optic nerve hypoplasia, and strabismus. People with only one working eye do not have stereoscopic depth perception, as stereopsis needs two working eyes, and must rely on monocular cues instead. Lacking the ability to perceive depth accurately can affect one’s life in more ways than one. For instance, it can adversely affect a child’s ability to learn. It can cause problems while driving and navigating roads. It can keep an athlete from reaching his or her full potential. It can also get in the way of landing certain jobs that need good depth perception.
What Are Visual Cues That Aid Depth Perception?
Depth perception is made possible by an array of depth cues. These are typically categorised into binocular cues, which are based on the receipt of sensory information in three dimensions from both eyes, and monocular cues, which can be shown in just two dimensions and seen with just one eye.
Monocular cues allow us to have some sense of depth perception when true binocular stereopsis is not possible. Let us look at these monocular cues:
1. Motion parallax: Motion parallax occurs when we move our head back and forth. Objects at different distances appear to move at different speeds: relative to the point we are looking at, closer objects appear to move in the direction opposite to our head movement, while objects farther away appear to move with our head. The brain senses these differences and uses them as cues to depth. This effect can be seen clearly when riding in a car. Nearby things pass quickly, while far-off objects appear stationary.
2. Depth from motion: When an object moves toward the viewer, the retinal projection of an object expands over a period of time, which leads to the perception of movement in a line toward the observer. Another name for this phenomenon is depth from optical expansion. The dynamic stimulus change allows the viewer to not only see the object as moving, but to detect the distance of the moving object. So in this context, the changing size acts as a cue for distance. A related phenomenon is the visual system’s capacity to calculate time-to-contact (TTC) of an approaching object from the rate of optical expansion; this particularly comes in handy in situations such as driving a car or playing a ball game. However, calculation of TTC is, strictly speaking, perception of velocity rather than depth.
3. Kinetic depth effect: If a stationary rigid figure (say, a wire cube) is placed in front of a point source of light so that its shadow falls on a translucent screen, a person on the other side of the screen will see a two-dimensional pattern of lines. If the cube now rotates, the visual system extracts the information needed to perceive the third dimension from the movements of the lines, and a cube is seen. This is an instance of the kinetic depth effect.
4. Perspective: The property of parallel lines converging in the distance, at infinity, lets us reconstruct the relative distance of two parts of an object, or of landscape features. An example would be standing on a straight road, looking down the road, and observing how the road narrows as it goes off in the distance.
5. Relative size: If two objects are known to be the same size (for example, two trees) but their absolute size is not known, relative size cues can offer information about the relative depth of the two objects. If one subtends a larger visual angle on the retina than the other, the object subtending the larger visual angle will appear closer.
6. Familiar size: Since the visual angle of an object projected onto the retina decreases with distance, one can combine this information with previous knowledge of the object’s size to arrive at the absolute depth of the object. For example, people usually know the size of an average automobile. This knowledge, along with information about the angle it subtends on the retina, helps determine the absolute depth of an automobile in a scene.
7. Absolute size: Even if the actual size of the object is not known and only one object is visible, a smaller object seems farther away than a large object that is presented at the same location.
8. Aerial Perspective: Color and contrast cues tell us how far away an object is. For instance, when light travels from a distance, it is scattered. Scattered light blurs the outlines of things we see, and our brain reads this as the object being farther away. Objects at a great distance have lower luminance contrast and lower color saturation. This is why objects seem hazier the farther they are from the viewer. In computer graphics, this is often called distance fog. The foreground has high contrast, whereas the background has low contrast. Objects varying only in their contrast with a background appear to be at different depths. The colors of distant objects also tend to shift toward the blue end of the spectrum (e.g., distant mountains).
9. Accommodation: This is an oculomotor cue for depth perception. When we focus on faraway objects, the ciliary muscles relax, which lets the eye lens be pulled flatter and thinner and thus changes its focal length. The kinesthetic sensations of the contracting and relaxing ciliary muscles (intraocular muscles) are sent to the visual cortex, where they are used for interpreting distance/depth. Accommodation is only effective for objects at distances of less than 2 meters.
10. Occultation: Occultation (also referred to as interposition) happens when near surfaces overlap far surfaces or when objects overlap each other. If one object partially blocks the view of another, humans perceive the blocking object as closer. Monocular ambient occlusion cues of this kind depend on the object’s texture and geometry.
11. Curvilinear perspective: Parallel lines become curved at the outer extremes of the visual field, as seen in a photo taken through a fisheye lens. This effect greatly improves the observer’s sense of being positioned within a real, three-dimensional space.
12. Texture gradient: One can clearly see fine details on nearby objects but not on faraway ones. A texture gradient is the gradual change in the apparent grain of a surface with distance. For example, on a long gravel road, the shape, size and colour of the gravel near the observer can be clearly seen. In the distance, these details cannot be seen as clearly.
13. Lighting and shading: The way that light falls on an object and reflects off its surfaces, and the shadows that are cast by objects help the brain decide the shape of objects and their position in space.
14. Defocus blur: Selective image blurring is used regularly in photography and video to give the impression of depth. This can act as a monocular cue even when all other cues are absent. It may contribute to depth perception in natural retinal images, because the depth of focus of the human eye is limited.
15. Elevation: When an object is visible relative to the horizon, we tend to detect objects that are closer to the horizon as being farther away from us, and objects that are farther from the horizon as being closer to us. Similarly, when an object moves from a position close to the horizon to a position higher or lower than the horizon, it will appear to move closer to the observer.
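To make two of the monocular cues above concrete, here is a minimal Python sketch of familiar size (cue 6) and of time-to-contact from optical expansion (cue 2). It only illustrates the geometry and is not a claim about how the visual system actually computes these quantities. The function names, the 1.8 m car width and the sample angles are assumptions chosen for illustration, and the time-to-contact estimate additionally assumes small visual angles and a constant approach speed.

```python
import math


def distance_from_familiar_size(known_size_m, visual_angle_deg):
    """Distance to an object of known size that subtends the given visual angle.

    Geometry: an object of size S at distance D subtends an angle
    theta = 2 * atan(S / (2 * D)), so D = S / (2 * tan(theta / 2)).
    """
    half_angle = math.radians(visual_angle_deg) / 2.0
    return known_size_m / (2.0 * math.tan(half_angle))


def time_to_contact(angle_earlier_deg, angle_now_deg, dt_s):
    """Approximate time-to-contact as (visual angle) / (rate of angular expansion).

    Uses a simple finite difference over the interval dt_s. Returns infinity
    when the image is not expanding, i.e. the object is not approaching.
    """
    expansion_rate = (angle_now_deg - angle_earlier_deg) / dt_s
    if expansion_rate <= 0:
        return math.inf
    return angle_now_deg / expansion_rate


# Hypothetical example: a car assumed to be 1.8 m wide subtends 5 degrees.
car_distance = distance_from_familiar_size(1.8, 5.0)
print(f"Estimated distance to the car: {car_distance:.1f} m")

# Hypothetical example: its visual angle grows from 5.0 to 5.1 degrees in 0.1 s.
ttc = time_to_contact(5.0, 5.1, 0.1)
print(f"Estimated time to contact: {ttc:.1f} s")
```

Note that the time-to-contact estimate needs neither the object's size nor its distance, only how fast its image is growing. This fits the remark above that TTC is, strictly speaking, a perception of velocity rather than depth.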
Binocular cues provide depth information when one views a scene with both eyes.
a. Stereopsis, or retinal (binocular) disparity, or binocular parallax: Animals that have their eyes placed in the front can use information gained from the different projection of objects onto each of the two retinas to judge depth. By comparing two images of the same scene obtained from slightly different angles, one can triangulate the distance to an object with a greater degree of accuracy. Each eye, be it left or right, views an object from a slightly different angle, and this occurs because of the horizontal separation of the eyes. If an object is far away, the disparity between the images falling on the two retinas will be small, but if the object is closer, the disparity will be larger. It is this stereopsis that tricks people into thinking they perceive depth when viewing 3D movies and stereoscopic photos.
b. Convergence: This is a binocular oculomotor cue for distance/depth perception. Here the two eyeballs turn inward to fixate the same object, and the resulting convergence stretches the extraocular muscles. As with the monocular accommodation cue, kinesthetic sensations from these extraocular muscles help in depth/distance perception. The angle of convergence is smaller when the eyes are fixating on objects at a distance. Convergence is effective for distances within 10 meters.
c. Shadow Stereopsis: Shadows too have a key role to play in depth perception. Regardless of the position and the number of light sources, shadows from each source can get fused, imparting the correct depth to the object with respect to the shadow and as a result, to the surface upon which the shadows are cast.
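The triangulation behind the stereopsis and convergence cues above can be sketched numerically. The following is a minimal Python illustration under simplifying assumptions: the object lies straight ahead, midway between the two eyes, and the eye separation is taken as 0.065 m, a commonly quoted approximate adult value that does not come from this article. The function names are hypothetical.

```python
import math

EYE_SEPARATION_M = 0.065  # assumed distance between the two eyes


def vergence_angle_deg(distance_m, baseline_m=EYE_SEPARATION_M):
    """Angle between the two lines of sight when both eyes fixate a point straight ahead."""
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))


def distance_from_vergence(angle_deg, baseline_m=EYE_SEPARATION_M):
    """Invert the relation above: triangulate the distance from the vergence angle."""
    return baseline_m / (2.0 * math.tan(math.radians(angle_deg) / 2.0))


def relative_disparity_deg(near_m, far_m, baseline_m=EYE_SEPARATION_M):
    """Binocular disparity between two points on the midline, in degrees.

    For points straight ahead, this equals the difference in their vergence angles.
    """
    return vergence_angle_deg(near_m, baseline_m) - vergence_angle_deg(far_m, baseline_m)


# The vergence angle shrinks rapidly as the fixated object moves away.
for d in (0.5, 2.0, 10.0, 100.0):
    print(f"{d:6.1f} m -> vergence angle {vergence_angle_deg(d):.3f} degrees")

# The same doubling of distance produces far less disparity when it happens farther away.
print(f"1 m vs 2 m:   {relative_disparity_deg(1.0, 2.0):.3f} degrees")
print(f"10 m vs 20 m: {relative_disparity_deg(10.0, 20.0):.3f} degrees")
```

Because the angle becomes very small at larger distances, tiny errors in sensing it translate into large errors in estimated distance, which is consistent with convergence being useful only at close range.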
It is important to note that among these several cues, only convergence, accommodation and familiar size give us absolute distance information. All other cues are relative in that they help us tell which objects are closer in relation to others.
*Contributors: Written by Vidya Prabhu; Lead image by: Leonel Cruz