Reading 10: 3D Perception

April 2, 2010

in Assignments

(due Tuesday, April 6th)

For this lecture, we’ll move on to the third dimension. While it is very tempting to suggest we read something on the various artistic techniques or things used in visualization, we’ll start with the perceptual foundations. If you’re interested in lighting or shading or … it can make a great project.

  • James Todd. The Visual Perception of 3D Shape. Trends in Cognitive Sciences. 2004. A nice, compact article.
  • Chapter 5 of Ware’s Visual Thinking for Design. This actually discusses a lot more than just 3D perception.

As usual, post (at least one) comment about the readings.

{ 12 comments }

lyalex April 4, 2010 at 11:28 pm

I suggest reading Chapter 5 of Ware’s book first, as it covers several basic concepts such as occlusion and gives a brief list of depth cues. The parts actually related to the third dimension are “Depth Perception and Cue Theory”, “Stereoscopic Depth”, “Structure from Motion”, “2.5D Design”, “How Much of the Third Dimension”, “Affordance” and “the Where Pathway”. The most critical point Colin Ware makes is that our perception of the 2-D retinal image is far better than our perception along the depth direction. 2.5D design should therefore be taken into account, not only because the depth direction must be treated quite differently but also because it relieves some of the perceptual constraints of 3D.

Compared to the book chapter, the review paper by James Todd is more technical but still clear. The author first describes the sources of 3-D shape information, then lists the methods for psychophysically investigating 3D shape perception, and then discusses how to represent 3D shapes. It also covers the neural processing of 3D shape. The future research questions in Box 2 cover the questions I can ask. I’m still interested in question 4, how do observers identify different types of image features? Why haven’t researchers made much headway? Are there enough experiments? And are there any results from them?

Jim Hill April 5, 2010 at 9:26 pm

I found Todd’s paper interesting. More specifically, I found the part about perceptions of 3D objects being within a skewing transformation of the actual object a little odd. This could mean that I didn’t understand what he meant by it. It seems like if this is true, then the amount by which the observation is skewed must be fairly small, as I don’t generally think of things as being really skewed.

His discussion of contour and occlusion lines for showing the basic shape of an object without complicated shading was very interesting to me. This is almost the basis for animation and non photo-realistic rendering which are both things that I’m interested in. His comment that there is still a lot of work to be done gives me hope that I may be able to contribute.

Ware’s discussion of 3D perception was a little more readable than Todd’s. I appreciated his enumeration of the different depth cues and how they stacked up against each other. I wasn’t surprised to see that occlusion beat out pretty much everything else. I also wasn’t surprised to see that stereoscopic depth perception really isn’t that important to pulling information from a scene. It’s interesting to see Ware referring to a lot of the 3D TVs and movies as gimmicks. He does leave some room for interaction though.

Both authors mentioned James Gibson as a front runner in perception although Ware suggests that his ideas have been superseded so I would be interested to know whether there’s any value to looking at his work.

Finally, I found it most interesting that the brain has a naive physical reference and that people can become better at interacting with computers than with the real world. I would have to agree that there are probably a lot of gamers out there who can move around a virtual world better than the real one. This definitely opens up questions about how much interaction children should have with computers, especially as computers play a larger and larger part in everyday life.

Nakho Kim April 5, 2010 at 10:35 pm

Reading Ware’s nice overview of depth perception cues, one interesting thing was that pictorial depth cues are often enough to create 3D (in fact 2.5D) visuals. On the other hand, he is less enthusiastic about more cognition-heavy processes such as stereoscopes and structure from motion. Given the additional effort often needed to implement those cues, I agree that they won’t be very efficient design choices.

Thinking of design strategy implications, it can be inferred that 3D/2.5D is helpful in layering additional information, preferably with as little cognitive load as possible. However, I think there may also be some benefits in more cognition-heavy 3D perception if a specific visualization can benefit from an immersive experience, such as photo-real exploration. For example, affordance could be a good concept to combine with interactivity, since it includes action by definition.

Todd’s article states that the “3D perceptual representation is primarily based on qualitative aspects of the structure itself” including occlusion contours or high curvature edges, and argues that ambiguities can be resolved by assigning better cues to the visual. It reminded me of the “spinning dancer illusion” (http://en.wikipedia.org/wiki/The_Spinning_Dancer), where the illusion is removed just by adding one small contour cue. As for design strategy, if we can use contours and edges to visualize the shape of 3D structures wisely, it will give us more freedom to use other encodings such as color and pattern for representing other data classes.

punkish April 5, 2010 at 10:42 pm

Perceiving 3D shape by Todd
—————————

Todd provides a summary of the current understanding of our visual perception of 3D shape. How do we determine the 3D shape of an object by looking at a pattern of optical stimulation, be it image shading, gradients of textures made up of dots or lines, or line drawings showing contours and curvature? It is possible for different 3D shapes to “create” the same pattern of optical stimulation, yet we seem to figure out what we are looking at even if the thing we are looking at has been distorted. We can do this as long as the distortions are within “limits.” We may be employing clues from the natural environment to figure out what we are actually looking at. Perception of 3D shapes remains relatively poorly understood, but survival seems to be the recurring theme in influencing how we see things.

Frankly, I found Todd’s paper to be unnecessarily dense and difficult to read, especially in comparison with Ware’s 5th chapter on Visual Space and Time.

Visual Space and Time by Ware
—————————–
Ware posits that scanning for information is essentially traversing space to get information into our heads. In that sense, it is conceptually the same as traveling physically over distance to see someone or something. From this perspective, he approaches the structure of 3D space.

The perceptual 3D space is made up of the /up/, the /sideways/ and the /towards-away/ dimensions. Depth cues are bits of environmental clues that we use to judge distances in the /towards-away/ dimension. This dimension has inherently less information than the other two dimensions, since the latter are mapped on the retina. Hence the /t-a/ dimension is equated to half a dimension. Depth cues can be pictorial (occlusion, size and texture gradients, linear perspective, shadows, height on the picture plane, etc.) or non-pictorial (stereopsis, motion, focus, etc.).

Ware does a cost-benefit analysis of physically moving through space vs. “visualizing” 3D space, which can be useful in guiding our approach to visualization.

The take-away point is that depth is perceived in fundamentally different ways than other spatial dimensions. The visualization designer can utilize a combination of different cues to convey depth.

Shuang April 5, 2010 at 11:04 pm

Todd’s paper discusses the perceptual representation of three-dimensional data. Recent mathematical analysis shows that the ambiguity of 3D visualization can be constrained, which differs from previous results. The paper was written by a psychologist, so it contains cognitive knowledge about viewing a 3D plot, including the history of 3D visualization in the field of psychology. Figure I in this paper is a good example for comparison. The line drawing describes the shape of the shaded image. When I took topology courses, we used the same technique (drawing the lines where the first derivative is zero) to present 3D structures, yet it may cause confusion between boundaries and contours.

Chapter 5 of Ware’s textbook expands the 2.5D view to 3D. As the title of this chapter shows, the third dimension can be (but need not be) either the spatial or the time dimension. The 2.5D discussion in previous chapters is mostly for static and non-interactive designs. The concept of affordances, introduced by psychologists in the 60s, is useful in studying visualization. Cognitive affordances are perceived possibilities for action. To describe the third dimension, ten methods are listed in this chapter along with their features. It is quite a useful reference for visualizing different types of data.

Jeremy White April 5, 2010 at 11:55 pm

Todd’s article on the visual perceptions associated with 3D shapes touched on a number of interesting areas and ideas. One area that I believe did not receive enough attention is the relational aspect of shape topology. Since the object, according to Todd, is broken into regional property maps by the observer, it stands to reason that differences in topology become more noticeable if the surrounding topology is vastly different. This doesn’t necessarily mean that large curves or clearly defined edges need to be present if the neighboring topology is perceived as more uniform.

The qualitative aspects of 3D object perception seem to leave more room for interpretation and error. Unfamiliar shapes could, therefore, be mistakenly identified as familiar objects by the observer, especially if other attributes besides shape (size, hue, orientation, etc.) are similar to known objects.

I’m beginning to grow tired of Ware’s chapters. His section on artificial interactive spaces came across as complete filler. His assertions are comparable to Tufte in the sense that there is often no reference to empirical research that supports his position. For a book on visualization and design, there seems to be a lot of low quality images and graphics that most people would consider clip art. I wouldn’t mind the examples as much if the written content didn’t come across as so groundlessly basic.

jeeyoung April 6, 2010 at 7:51 am

Ware’s chapter 5 presented basic methods of showing depth cues such as occlusion, size gradients, texture gradients, grid, shadow, focus and so on, which have also been used in drawings. He also explained nonpictorial cues like stereoscopic depth and structure from motion. I guess stereoscopic techniques have improved a lot, as people applaud the movie Avatar, but there is still a lot to be done in this area because some people get nauseated by 3D movies and games.

Todd’s paper was hard to read. fMRI seems to contribute a lot in cognitive science.

ChamanSingh April 6, 2010 at 8:20 am

Human eyes are remarkable and mysterious biological devices which can detect the shape of a 3D object instantaneously. Many theories have been suggested to explain their working principles.

I am extremely surprised (and doubtful) to learn of the author’s finding that humans can detect both convex and concave shapes quite accurately.

Although shading is an important clue for understanding depth, it also produces ambiguities. The concept of “Stable” maps is something new to me, and although I knew that curvature, ridges and valleys, hatching, etc. could provide extremely powerful clues, I wasn’t aware that this is because of “Stable Data Structures”.

Image features are important, and “T Junction” and “Cusp” locations provide important clues because of viewpoint-invariance. But other papers (Malick etc.) gave a better explanation than the present one.

It is interesting to note that local structures are sampled efficiently by human eyes and that the result is consistent in a least-squares sense.

Lastly, I am extremely surprised to learn of the involvement of monkeys in understanding human 3D perception capabilities.

watkins April 6, 2010 at 8:31 am

Ware first discusses all the ways we assume the third spatial dimension, depth, and points out the differences between this and the up-down and left-right dimensions, then uses this to guide 2.5-dimensional design. Designers can pick and choose what depth cues they want to use to make their design as understandable as possible, even if it isn’t physically accurate.

The most interesting part of this chapter is the second half, where he discusses artificial spaces, and how they are beginning to become as natural to navigate as real physical spaces, even though they operate by a completely different set of rules. I wonder if learning a second set of rules for navigating an environment compromises our ability to understand and navigate the physical world?

Todd’s article addressed the fact that 3D shape perception isn’t inherently a correspondence in the mathematical sense, but maybe with the right set of constraints it could be. He points out that shading can be ambiguous under certain circumstances, and analyzing patterns is only effective when making specific assumptions that may not be true. He seems to focus mainly on curvature of an object, since it is an intrinsically defined attribute, and it is invariant to small rotations in most cases.

dalbers April 6, 2010 at 9:28 am

The Todd article brought up a very logical argument that appears largely taken for granted in modern 3D design. The fact that we can construct three-dimensional figures on a two-dimensional surface via what appears to amount to a mathematical illusion is, mechanically, a pretty wild concept. Yet, it feels as if these techniques are so frequently used in real life that I’d seldom sat down and actually thought through how such images could be created. The parallels and dissimilarities between primate and human visual processing, while not directly relevant to visualization design, were nonetheless very interesting given what is known about evolution.

Ware’s chapter tended to stay more true to using depth and perception cues to create the actual illusion of three-dimensional figures discussed in the Todd article. Some of the techniques discussed for using the third dimension as a visualization tool actually appear surprisingly effective. For instance, using subtle highlighting to enhance UI widgets is actually far more common than I’d realized. There are several instances of it to be found simply by looking at my browser window.

Additionally, his discussion of techniques such as motion parallax and atmospheric depth cues offered a chance to understand very commonly used techniques in three-dimensional modelling that the layman may not immediately recognize as illusion. According to Ware, such techniques can be integrated into current three-dimensional visualizations in order to improve the perceptual “friendliness” of these visualizations. However, Ware also notes that motion techniques have the power to make the stationary observer feel as if they are actually moving. This technique does logically seem like a useful tool in virtual reality simulations.

faisal April 6, 2010 at 10:41 am

The chapter from Ware’s book explained further the idea of 2.5 dimensions, briefly mentioned in some earlier chapter. I found his enumeration of properties of different depth cues particularly useful from a design perspective. The discussion on affordances, although useful, seems incomplete to me. I am still wondering about its exact design implications.

The Todd reading took the above idea of properties of depth cues a little further. The main focus in this reading was on perception of depth cues. Fully understanding some of the concepts presented in his paper seems required in order to fully exploit the power of 3D or 2.5D design in visualization, for example the proper use of environmental constraints that can lead to accurate stimulation of a physical 3D shape. This seems especially true if one is to design custom 3D shapes.

dhe April 6, 2010 at 10:59 am

The notion that some visual properties of a 3d object are more invariant to transformations in space than others is interesting. Todd specifies local curvature as one such property. This explains the success of line drawings of 3d objects, especially when the lines depict curvature ridge lines.

I wonder if perceiving 3d shape from a 2d image has anything to do with manipulating 3d objects in visual memory.
