Interaction Paradigms

by Kevin Ponto on April 14, 2011

Moving objects in space: exploiting proprioception in virtual-environment interaction
Mark R. Mine; Frederick P. Brooks, Jr.; Carlo H. Séquin


What do you think of the metaphors for virtual world interactions listed in the paper? Do you think that they would be intuitive or confusing in a virtual environment?

Does this article feel applicable given the recent advances with the Kinect, or does it feel dated?

Finally, as usual, select a topic that you find interesting, dubious, confusing or curious and explain why.  Write at least a paragraph of explanation and add citations if warranted.


Alex April 24, 2011 at 7:24 pm

I thought the metaphors discussed in this paper were really interesting. I would love to get a chance to use a hand-held widget. I think it would be confusing at first and would take some time to get used to, but I do think that over time it has the potential to be very good. This depends on the application it is used for, but for applications requiring more complicated interaction, I think this could be very cool.

I feel that the Kinect is a step in a great direction. I think that the ideas in the paper are good to keep in mind when developing interaction paradigms with the Kinect. I think the biggest problem is that the Kinect is still not enough to accomplish these complex interactions. I think it is good enough to get good approximations and build a useful tool for interaction of modest complexity, but I think it's going to take another generation or two to get to the point where this paper has more relevance.

I found the head-butt zoom an interesting idea. As I mentioned in one of the previous readings, things with engineering applications are really cool to me. I really like the point that often, when using CAD or other design tools, it is nice to have multiple levels of detail clearly abstracted from each other. Using a framing tool to choose your zoom level and then moving back and forth to change levels of detail sounds really cool.

Reid April 24, 2011 at 11:18 pm

I have a hard time believing that scaling the entire world wouldn’t be noticeable or disconcerting to users. The paper suggests people don’t even notice it, but it seems to me that the world’s shape sliding around in the corner of your view would be extremely distracting. The two-handed flying, pull-down menu, and over-the-shoulder deletion seem like they would be easily remembered and intuitive for users.
I think this article could still be very relevant, since the Kinect has a limited range of operation and does not have super accurate tracking abilities. It would be much more preferable to have purely gestural control in a Kinect application rather than forcing the user to repeatedly return to the keyboard to do things. Handheld controllers are of course an option but have a limited set of buttons or controls. Gestural control may be more intuitive and, with the ability of software to generate skeletons, a not-too-difficult task.
I find it interesting that they make no mention of teleportation-based movement. For example, a user could point with one arm to a location, or even select an object, and then use a gesture or button to command a ‘teleport’. In the pointing case, the user’s position would shift to the location pointed at. In the selection case, the user would appear at a position near the object that presents a good view of it. Teleportation has the advantage that you are instantly at the specified place with minimal movement or time spent controlling the system.
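A rough sketch of the pointing case Reid describes might look like this: cast a ray from the pointing arm, intersect it with the ground, and back the destination off slightly so the target stays in view. All of the function names and the standoff distance here are hypothetical, not from the paper.

```python
import numpy as np

def ray_plane_hit(origin, direction, plane_y=0.0):
    """Intersect a pointing ray with a ground plane at height plane_y."""
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    if abs(d[1]) < 1e-9:          # ray parallel to the ground
        return None
    t = (plane_y - o[1]) / d[1]
    if t <= 0:                    # pointing away from the ground
        return None
    return o + t * d

def teleport(user_pos, hit_point, standoff=1.0):
    """Place the user near the hit point, backed off toward the old position."""
    user_pos = np.asarray(user_pos, dtype=float)
    hit = np.asarray(hit_point, dtype=float)
    back = user_pos - hit
    back[1] = 0.0                 # only back off horizontally
    n = np.linalg.norm(back)
    if n > 1e-9:
        hit = hit + (back / n) * standoff
    return hit
```

The object-selection case would work the same way, except the hit point comes from the selected object's position rather than a ground intersection.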

Leo April 24, 2011 at 11:24 pm

I think the metaphors would be intuitive, especially the pull-down menus and widgets. Most people are comfortable with menu systems, whether in a browser or in a game, so I feel it would be equally intuitive to have one inside a VE that can be accessed by gestures. I think the widgets are also very intuitive and are similar to what we see in operating systems, more specifically managing windows. We have visible widgets to close, minimize, move or resize windows, and it doesn’t seem to be any different here. Plus, it’s a lot better than the alternative of using abstract physical buttons to launch menus.

I feel like it’s a little bit dated compared to the Kinect, since most of these things have been accomplished with the Kinect, especially as far as menus and widgets go. It seems like this article basically summarizes what the Kinect does and what it might be able to do in the future as far as haptic feedback goes.

I thought that the automatic scaling mechanism is an interesting idea since it takes away the drawbacks of having limited space to walk around in. I can see some interesting applications for this idea, but I’m a little skeptical as to how effective it is in some scenarios. From how I understand it, if you were in a dungeon and had a lever to open a door, instead of walking up to it, the world would scale so that the lever would be in your “working volume” and you could interact with it. I could definitely see this being useful for applications that require efficiency, such as a modeling application that uses gestures to build, but I would think this approach would take away from the immersion level of an application geared towards entertainment.

Nick April 24, 2011 at 11:26 pm

I thought the metaphors were ok and could work fine, such as virtual walking would make sense to me. However, as talked about before I believe, I don’t think metaphors can be or should be standardized in a paper, but come about organically over time by users.

I think that most of the paper is still applicable even with the Kinect in mind. I know that when they talk about limited input information, the Kinect solves most of this, but things like the lack of haptic feedback are still applicable. If you move an object or pick something up with the Kinect, there is still no feedback to tell if the object is heavy or smooth or anything at all, and therefore precision in manipulation is still difficult. Bringing the object to the body and using the body as a sort of reference is an interesting idea, and even though it takes some of the reality out of the program, it may help with manipulation in some instances.

It is always interesting to talk about menus and a home screen of sorts for virtual reality. These articles talk about needing new metaphors and a new way of doing menus because the PC and desktop/windows metaphors do not work in these cases. However, maybe they need to move even further away and think about something entirely different, like no menus at all. Or, instead of the menu being part of the virtual world, it could be something you carry on your person and manipulate there. Not that I’m trying to come up with great new ideas, but it is interesting how, when articles bring up menus and metaphors, they seem to take old ones and change them to fit new technologies, when maybe that is not the way to go.

Russel April 25, 2011 at 12:08 am

I think the scaled world grab was pretty interesting. It seems like it would be straightforward enough and useful, but at the same time it seems kind of strange. Should the user really be able to grab anything in the world? It just seemed strange to me. I thought the hand held widgets could be kind of confusing. Again, interacting with stuff far away, which was listed as one of the advantages, seems really weird to me. How can you even do something useful with an object really far away?

However, the gesture stuff really jumps out at me as being really useful. Head-butt zoom seems cool, but the real highlight of the show to me was look-at menus. I can’t count how many times, using a dual-monitor system, I look at a different monitor and start typing, forgetting that focus was actually still on the other monitor. If I could look at something and activate it, that would be awesome. However, I can also see it being a problem where, if you look somewhere else, all kinds of crazy stuff could happen.

The article seems sort of like it knew the Kinect was coming. Although its gesture stuff does seem a bit dated, it’s still useful. The two-handed flying idea, again, makes it seem like it knew the Kinect was coming, since a lot of these ideas were incorporated.

I thought the coolest part of this article was actually the deletion gesture. A problem I see with a lot of gestures is that they might accidentally be invoked, or they’re confusing or easy to forget. The idea of just tossing something over your shoulder to get rid of it has a kind of natural feel, like you should always have been able to do that to delete something. It seems similar to the “pinch to zoom” idea on smartphones–it’s something that just makes sense. I’m not so sure how I feel about being able to reach back over your shoulder to “undo,” as that seems a little bit weird, but overall I thought the idea was awesome.

kusko April 25, 2011 at 12:20 am

Although the paper claims that there was difficulty in providing a good metaphor for applications, the gestural control seemed to provide a pretty intuitive style of control. Using the movement of throwing something over your shoulder, like you would in reality, seems like the best type of metaphor to use. If users were provided interactions similar to those in reality, there should be no difficulty in even the most amateur user being able to accomplish a task. If constructed carefully, with natural body movement taken into account, the metaphors could become very easily accepted.

The use of the Kinect seems to make these advancements available to the general public. Not so much dated, the article’s main topics may see much faster development with a larger market for such applications. Although this article focused on virtual environment manipulation, the Kinect’s capabilities could be applied to manipulation of reality: a robot arm for space, underwater, or any dangerous repair could be controlled by a user from a safe distance. Also, if a program like in the video World Creator were possible, then attached to a 3D printer one would be able to create real objects out of a virtual rendering.

Although cameras and tracked devices provide a pretty slick, inexpensive way to interpret gestures, it seems like a suit like Iron Man’s could provide more robust capabilities. A tracked suit with a head-mounted display could provide an augmented reality for the user, with tracking provided by the suit. This would allow very fine gestural control, without the danger of losing a tracked item that is not covered by the camera. The incorporation of haptic and sound augmentation would also be much simpler with this approach.

Liana Zorn April 25, 2011 at 2:57 pm

I think that the metaphors in the paper wouldn’t be confusing, but many would be kind of pointless. They are either metaphors that already exist put into a new medium, or they are kind of out there. The idea of scaling a world to move in it seems like it would be intuitive to actually do, but disconcerting to have happen around you.
The metaphors seem similar to the Kinect’s – they are reworkings of current metaphors so that they remain intuitive to users. In this sense, I guess the article is a little bit dated because these things have been accomplished, but it also mentions many new ideas, even if they aren’t so good.
I think it would be interesting to actually try out some of these ideas with the Kinect. I’m not capable myself, but from what I’ve seen, they wouldn’t be terribly difficult for someone better than me to program for the Kinect. I would like to find out if they actually are as absurd as they sound, or if they make sense in certain applications (or most applications).

Nathan Mitchell April 25, 2011 at 3:04 pm

Some of the metaphors I liked and some I didn’t. The general concept of body-as-coordinate-system I thought was a good one. The idea that, when lacking good haptic reconstruction, you can use your own body makes good sense, especially when it’s the only object that you can guarantee is nearby. Now, some of the gestures I don’t know if I would like. In particular, the gestures involving eye-hand direction seem weird; I would need to try them to fully understand, I think. On the other hand, the idea of a tool belt around the head or tossing garbage over the shoulder does sound intuitive and easy to grasp. With the addition of activation sounds, I think they would make excellent interaction metaphors.

Given the project that I am working on for the Animation class, I gave this question some thought. Several of the gestures are unusable, purely due to the sensing restrictions of the Kinect. Eye tracking is right out, unless the Kinect is paired with another system. However, the purely hand- and limb-based motions are certainly applicable. And while I think it would need to be altered, the concept of pulling objects up close for the purposes of interaction sounds perfect for the Kinect. Given that it is difficult to ‘move’ in front of a Kinect, the best you can do is really alter your body pose; walking towards objects is not easy. In this sense the article is not dated at all. The problem discussed at the beginning about this restriction is still very real.

I think one of the big problems with the “working in arm’s reach” paradigm is that of sensor precision. Big, wide gestures are easy to spot, as the position deltas of the joints are large and clear. However, take this example: you want to map a control panel to your arm, and different distances up the arm correspond to different buttons. The sensor would need to be able to determine hand and arm position even when they are touching (the Kinect sometimes has trouble with this), and unless the contact zones are very large, the sensor needs to be able to do this with high precision.
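A minimal sketch of the arm-as-control-panel example above: project the touching hand onto the elbow-to-wrist segment, map the normalized distance to a button index, and reject touches that stray too far from the forearm axis (which is where sensor noise bites). The function name, button count, and threshold are all illustrative, not from any real Kinect API.

```python
import numpy as np

def arm_button(elbow, wrist, touch, n_buttons=3, max_off_arm=0.06):
    """Return the button index under `touch`, or None if not on the arm.

    elbow, wrist, touch: 3D joint positions in meters.
    max_off_arm: how far the touch may be from the forearm axis before
    it is treated as a non-touch (absorbs tracking jitter).
    """
    elbow, wrist, touch = (np.asarray(p, dtype=float)
                           for p in (elbow, wrist, touch))
    axis = wrist - elbow
    length = np.linalg.norm(axis)
    if length < 1e-9:
        return None
    # Parameter along the forearm: 0 at the elbow, 1 at the wrist.
    t = np.dot(touch - elbow, axis) / (length * length)
    if not 0.0 <= t <= 1.0:
        return None
    # Perpendicular distance from the touch point to the forearm axis.
    off = np.linalg.norm((touch - elbow) - t * axis)
    if off > max_off_arm:
        return None
    return min(int(t * n_buttons), n_buttons - 1)
```

Even this toy version shows the trade-off Nathan points out: shrinking `max_off_arm` or raising `n_buttons` demands proportionally better joint precision from the sensor.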

I think the body makes an excellent prop for gestures, and I agree with the reasons stated in the beginning of the article. I just don’t believe that the sensors are quite ready yet. At least not Kinect-style sensors – perhaps a different method would be equally unobtrusive but more effective.

Aaron Bartholomew April 25, 2011 at 10:37 pm

I find the interaction metaphors in here to probably be as intuitive as it can get in an environment like the CAVE. Given the locomotive constraints of an enclosed space, keeping the controls as close to the user as possible is logical. Although metaphors like the scaled-world grab may not conform to expected reality, it is an action that makes sense relative to the world: you can’t physically walk to an object, so you must instead indicate intent through gestures. For some reason, the over-the-shoulder deletion metaphor just seems so right to me; the intent of the gesture perfectly maps from the real world to the virtual.

This article definitely feels applicable to recent advances with the Kinect; they both address the same issue of how to have the user physically interact with virtual environments in a space-constrained and haptic-limited context. Although I’ve never used a Kinect myself, it seems that the consensus for both methods is to use gesture recognition while keeping controls close to the user.

Interestingly enough, the ideas for “drop-down” and “look-at” menus have been becoming somewhat popular in video games. Dead Space received a lot of positive response for its non-screen-space UI (you can see a good example of it here). People claimed that having this style of menu interaction made the virtual environment seem more real; seeing their character interact with menus attached to him within the context of the world added a level of plausibility. Although this UI doesn’t necessarily follow the same metaphors from this paper, I could foresee them both resulting in the same manner of added immersion, since the metaphor instills plausibility as an integrated “toolbelt” for communication that is realized within the context of the world.

Nathalie April 25, 2011 at 11:38 pm

I believe mode switches in virtual world interaction design are too often overlooked, and I am appreciative that they at least touched upon it on page 4: “Widgets for changing the viewing properties can be stored by the user’s head; widgets for manipulating objects can be stored by the user’s hands.” This is a good idea: higher-level/system-related functions are called upon infrequently, and so can be assigned to a stance or position that is probably more tiring than the gestures used for general commands (which are more likely to be around the torso or upper body).

Head-butt zoom seems to work well as a metaphor for the user wanting to look closer at something (making use of depth recognition of the user stepping forward to zoom in and backward to zoom out).

I’m somewhat concerned about the select/grab sequence of gestures on page 3 – what if the user accidentally selected the wrong object in the virtual system and instinctively pulled back their arm because they made a mistake? Upon pulling back their arm, they would be calling upon the grab function, which would cause some issues.

I feel that this article is very applicable to the Kinect. However, there are definitely notable differences between the gestures used for certain games and those for other applications, which reminds me of the link that Joe sent out to us about Microsoft already making a professional version of our galaxy map:

What I find challenging/dubious about trying to develop a gestural paradigm for VR applications is differentiating between instinctive/automatic/involuntary gestures and defined/intentioned gestures for the virtual reality system. How do we avoid instances where people are physically moving the way they normally move in various situations vs. moving to call a function?

Rachina Ahuja April 26, 2011 at 2:06 am

I think that some of the metaphors, like the scaled-world grab and pull-down menus, would be pretty intuitive because, for example, when you want to look at something closely, the first thing you would think of doing is grabbing it and bringing it closer to examine. Hand-held widgets would also work, partly because we are used to associating hand-held devices with what happens on a screen, and partly because then the object can be seen without being obscured by widgets, as mentioned in the paper.

The Kinect is a large step in the field of gesture recognition, but it doesn’t make this article feel dated to me. Right now the Kinect has proven utility only in the field of gaming, as far as I know. It also can’t recognize grabbing gestures, finger movements, etc., but things like the two-handed flying could probably be attempted using the Kinect. I do think that there is a long way to go with developing intuitive interaction metaphors, though.

The idea of look-at menus doesn’t appeal to me. It doesn’t sound like it would be intuitive, only more of a strain. Also, it feels like it would be easy to select the wrong thing. Moreover, how long would one have to ‘look at’ (which, from my understanding, basically means stare at without moving) something to select it?
I do, however, really like the idea of ‘over the shoulder deletion’. This would be really intuitive (not to mention, somehow, fun). I’m not sure about being able to retrieve a deleted object by reaching over your shoulder and grabbing it, but the idea is very interesting nevertheless.

Rebecca Hudson April 26, 2011 at 6:48 am

Based on the researchers’ claims about naive user reports at demonstrations, the scaled-world grab seems like it would be highly effective for some applications. This would probably not be the case, however, if those attempting to use it had already become accustomed to an “incompatible” grabbing paradigm. I cannot imagine that scaled-world grab would work for locomotion unless a user’s sense of their own walking were not central to the goal of the simulation; because walking is a continuous process, people don’t walk on ‘auto-pilot’. Hiding a menu just outside a user’s field of view is only one more conceptual step than “shoot offscreen to reload”, and would probably not be unreasonable or confusing (unless conflicting paradigms tried to coexist). “Head-butt” zoom could be either very useful or nauseating, depending on how scaling factors were chosen and distances calibrated. Two-handed flying, as described by the authors, sounds like it could work well.

The article feels dated with respect to the state of technology when it was written. However, the issues raised in this article are as relevant as ever, because haptic interfaces are still in their infancy. The outward extension and continued flapping of a person’s arms are fatiguing, and if haptic interfaces are to realize their potential, they must be more ergonomic and effective.

Interesting: Unlearning and relearning.
Though there may be a first (Kinect?), there’s unlikely to remain only one consumer haptic interaction system. At the present stage of haptic input, very few users have a specific idea of what to expect from haptic interfaces. This means that they will probably have an easy time learning the first interface they are taught or work with extensively. Because vendors will want to compete for users, especially early adopters of new technology, they will rush their interfaces and applications out of the development process as quickly as possible. Problems with usability or fatigue may invite competition from still other vendors to come up with a more usable, if radically different, interface. Ultimately, the way that haptic (as any other) interfaces are designed will be guided as much by marketability and user expectations/habits/comfort as by how many commands must be issued to perform a specific task. I expect that over the next 20 to 30 years, much of the research and development work on haptic interfaces will focus on either managing or exploiting the habits that new users carry over from outdated systems.

Andrego Halim April 26, 2011 at 9:20 am

The metaphors sound confusing at first, but after reading more thoroughly they do sound more intuitive. I particularly think the virtual grabbing works best in improving the accuracy of grabbing/manipulating objects. With that said, I think its usefulness would really depend on other factors. I understand from their cube-alignment experiment that they have a significant result, but they also mention that it doesn’t really work with extremely far objects. In such cases, they suggest two ideas for virtual walking: scaled-world grab for locomotion, or map down-scaling. But that would decrease the intuitiveness. Let’s say I’m in a super-large forest and want to traverse it. With the scaled-world grab, I’d imagine that the user needs to grab a tree and then pull himself towards it. But imagine how densely packed the trees are; in one pull, I’d assume I would virtually “crash” into a bunch of trees. Secondly, with the map down-scaling, if we have to scale the forest into miniature trees and pass through it that way, it won’t feel as real to the user since the trees will just look like toys. They also mention two-handed flying for traveling, but I don’t like the idea of having to use two hands just to fly; it seems like too much of a hassle.
It does feel dated, and I’d assume this is logical since this is a 14-year-old publication. But I do feel that some of the ideas are being used in the Kinect, particularly the idea of grabbing to improve accuracy.
For the interesting topic, I have a simple idea about traveling that’s not mentioned in the paper, and it would work well when exploring an extremely large world map in an adventure game. With the use of menus (either the drop-down or the look-at ones) like those in the paper, one could travel by just displaying the world map (maybe with a shortcut at the corner of the view) and then clicking on the places one wants to go. Similarly, if the locations are already known (e.g., X Town, Y Forest, Z Kingdom), they can be displayed as an ordered list so the user can easily search for them. This idea might sound simple, but I find it more useful than the ones mentioned in the paper, especially when exploring large areas.
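A toy version of the travel menu Andrego proposes could be as simple as an ordered list of named locations with a selection jumping the viewpoint there. The location names come from his example; the coordinates and function names are made up for illustration.

```python
# Known destinations, each mapped to a (hypothetical) world-space position.
locations = {
    "X Town": (120.0, 0.0, -45.0),
    "Y Forest": (-300.0, 5.0, 210.0),
    "Z Kingdom": (800.0, 12.0, 800.0),
}

def menu_entries():
    """Alphabetized list shown in the drop-down or look-at menu."""
    return sorted(locations)

def travel(name):
    """Return the new viewpoint position, or None for an unknown place."""
    return locations.get(name)
```

The ordered list is the key usability point: with gestural input, scanning a sorted list is far less error-prone than free-form pointing across a huge map.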

Joe Kohlmann April 26, 2011 at 9:59 am

What do you think of the metaphors for virtual world interactions listed in the paper? Do you think that they would be intuitive or confusing in a virtual environment?
I think I needed to see this paper three months ago, frankly—it’s chock-full of interaction gems! Some of the interactions, such as scaled-world grab, the worlds-in-miniature paradigm, and even the horribly-named head-butt zoom, are brilliant in my mind. None of these gestures are necessarily intuitive, but they make up for intuition with cleverness. Considering I’m working on a galaxy map, I can completely appreciate the utility of reaching any visible destination with a single grab operation. The head-butt zoom idea also sounds invigorating—rather than have any tangible user interface, many systems could change viewpoints based on a simple lean-in or lean-out gesture. This actually seems like the most intuitive interaction apart from the two-handed flying gesture.
Does this article feel applicable with the recent advances with the Kinect, or does it feel dated?
The Kinect could very possibly make all of these elements much more applicable to user interface design now that a mass-market “proprioceptive input device” exists. I was half-surprised at the omission of a hover-hold gesture in this paper (common in the Xbox Kinect UI, from what I understand), since so many of these gestures could fit right in with the Kinect’s rudimentary but important capabilities. None of it feels dated; on the contrary, it feels like the world of I/O may have finally caught up with Mine et al.’s vision (or need for advanced gesture controls).
Finally, as usual, select a topic that you find interesting, dubious, confusing or curious and explain why. Write at least a paragraph of explanation and add citations if warranted.
It seems like hand-held widgets and other gestures might apply equally well to augmented reality. Though we can imagine very practical implementations of these in virtual reality (perhaps demonstrated best by the short film World Builder), the combined possibilities of local widgets and scaled-world grab resonate with me. As a simple example, an AR system showing an overlay of networked devices within view could let the user reach out to a visible node and instantly monitor that system’s status. A lean-in could start a remote terminal session. Hand-held widgets and some kind of haptic feedback system could offer an instant QWERTY keyboard (or other input device) literally at the user’s fingertips. This is just one possible integration of these ideas into an AR context.
Bonus inspiration: the Omni-Tool from Mass Effect. It’s an arm-mounted computing device controlled with one- or two-handed gestures, such as rotating a holographic dial to iterate through an index, or waving the hand in a certain gesture. Its interaction techniques aren’t incredibly well-defined, but such a system would clearly exploit the idea of hand-held widgets that Mine discusses in this paper.
