Reading 9-1 – Comments on Group One

by Mike Gleicher on February 15, 2011

Use this page to comment on papers from Group 1 from Reading 9 and 10.

David February 20, 2011 at 5:34 pm

Paper 3, Synchronized Multi-Character Motion Editing

The basic idea of this paper seems pretty simple: use linear constraints to coordinate motion between multiple actors. The authors start with a scene (that apparently already has characters and motion paths) and add relational constraints. For example, two characters carrying a chair together need to remain a fixed distance apart. The authors also mentioned they could add constraints to end effectors, but sort of skipped the relevant details.

Another idea presented was discrete motion editing, which automatically edits a motion so that it better fits the Laplacian-deformed path. For example, a walk motion that has been transformed to cover a larger distance would show artifacts: foot sliding, a huge stride, or some other unnatural behavior. Discrete motion editing traverses a motion graph to find places where motions can be added, removed, or replaced, and then chooses an edit that minimizes deformation energy. I honestly got a bit lost here; I'm not sure what sort of energy they're actually measuring. One thing that was clear is that they are not doing any clever searching for a motion edit; they simply compute all the edits produced by the motion graph and select the best one. Some clever pruning of motions helps here, but it's still rather brute force.
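
To make the brute-force flavor concrete, here is a toy sketch of that selection loop. The clip table, the transition lists, and the deformation-energy term are all stand-ins of mine, not the paper's actual formulation:

```python
# Toy motion graph: each clip covers some distance and lists the clips
# it can legally transition to. All values here are made up.
CLIPS = {"walk": 1.2, "stride": 1.8, "step": 0.6}
TRANSITIONS = {"walk": ["walk", "stride", "step"],
               "stride": ["walk", "step"],
               "step": ["walk", "stride"]}

def deformation_energy(sequence, target_length):
    """Assumed energy: squared mismatch between the sequence's travel
    distance and the length of the deformed path."""
    return (sum(CLIPS[c] for c in sequence) - target_length) ** 2

def best_edit(start_clip, target_length, max_clips=6):
    """Brute force: try every legal clip sequence up to max_clips long
    and return the one with minimal deformation energy."""
    best, best_e = None, float("inf")
    frontier = [[start_clip]]
    while frontier:
        seq = frontier.pop()
        e = deformation_energy(seq, target_length)
        if e < best_e:
            best, best_e = seq, e
        if len(seq) < max_clips:
            frontier.extend(seq + [nxt] for nxt in TRANSITIONS[seq[-1]])
    return best, best_e

print(best_edit("walk", target_length=4.0))
```

The paper's pruning presumably avoids enumerating most of these sequences; the toy version just walks them all.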

The results of this paper can be used to quickly create scenes with multiple characters that interact in consistent and believable ways. I must admit I was a bit skeptical of this paper before watching the example video, but their program seems very fast and responsive. I don’t think it’s fast enough to be used in a fully-rendered interactive setting (i.e. videogame) but it could certainly be used to author large-scale actions quickly for use in movies or other animation.

Reid February 20, 2011 at 9:06 pm

Splicing Upper-Body Actions with Locomotion

What: A fast and simple algorithm to splice upper and lower body motions while preserving style and posture.

Why: Existing methods for splicing upper and lower body motions are very simplistic, leading to unrealistic motions. Splicing portions of the body is desirable since it reduces the total number of motions that need to be captured.

How: The algorithm uses time alignment to synchronize the body motions (e.g., the swinging of the arms with the movement of the legs); this is done by using dynamic programming to minimize the 'distance' between corresponding frames and then fitting a spline to create the time-alignment curve. Once this is done, the upper-body motion is rotated to align with the upper body of the lower-body motion, by minimizing the distance between the point clouds of the shoulders and spine as the body rotates about the pelvis. Finally, the posture of the upper-body motion is transferred to the spliced motion by generating a point-cloud representation of the overall motion and then aligning the result of the previous two steps with it.
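
A minimal dynamic-programming sketch of that time-alignment step might look like the following. The per-frame distance here is plain Euclidean distance on pose vectors, which is an assumption; the paper uses a point-cloud metric and then fits a spline to the recovered path, both of which I skip:

```python
import numpy as np

def dtw_alignment(upper, lower):
    """upper, lower: (frames, dofs) pose arrays. Returns a list of
    (i, j) frame correspondences minimizing the summed frame distance."""
    n, m = len(upper), len(lower)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(upper[i - 1] - lower[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],
                                 cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the corner to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```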

I'm curious why the spatial alignment is done in relation to the position of the pelvis and shoulders between the two motions; it seems to me this would negate the effect of torso-twisting actions that may be desirable in the spliced motion. This isn't really mentioned in the paper, so maybe it wasn't an issue, or maybe the algorithm just doesn't work on these types of motions?

I’m also curious as to why the posture operation is global. Maybe it just isn’t an issue due to short motion lengths, but wouldn’t the averaging of the point clouds remove high frequency information that could potentially be vital to the upper body motion?

sgallege February 20, 2011 at 9:20 pm

Splicing Upper-Body Actions with Locomotion

The key idea of the paper is to splice upper-body actions with locomotion (lower body) to create new natural-looking motions. A simple naïve splicing of the upper body and lower body would create undesirable artifacts such as unsynchronized arm and foot movement or improper postures.

The paper presents 3 steps to create more naturalistic upper body and lower body splicing.
1. Time warp: Matches phase and speed of the two original clips so that when the upper body and lower body are spliced the motions are synchronized. To match up the motions a time alignment curve is generated using a point cloud representation and minimum squared distances.
2. Spatial Alignment: Matches up the orientation of the upper body and lower body, again using a point cloud representation and minimum squared distances (a rough sketch of this step follows after the list).
3. Posture Transfer: removes any undesired effects added by Spatial Alignment especially in motions where shoulders are leaning to a side relative to the hips.
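
Here is roughly what I imagine the rotation solve in step 2 looks like: an unweighted 2D Procrustes rotation about the pelvis. The paper's exact weighting and marker set are not reproduced here, so treat this as a sketch of the idea only:

```python
import numpy as np

def align_about_pelvis(cloud_a, cloud_b, pelvis):
    """cloud_a, cloud_b: (n, 2) shoulder/spine point clouds in the
    ground plane; pelvis: (2,) rotation centre. Returns the angle (in
    radians) that best rotates cloud_a onto cloud_b about the pelvis."""
    a = cloud_a - pelvis   # centre both clouds on the pelvis so the
    b = cloud_b - pelvis   # recovered rotation is about that joint
    # Closed-form least-squares rotation angle in 2D (Procrustes):
    num = np.sum(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0])
    den = np.sum(a[:, 0] * b[:, 0] + a[:, 1] * b[:, 1])
    return np.arctan2(num, den)

theta = align_about_pelvis(np.random.rand(5, 2), np.random.rand(5, 2),
                           np.array([0.5, 0.5]))
print(theta)
```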

Looking at the sample clips the algorithm does a very good job even in seemingly unrelated motion sequences such as boxing and climbing stairs.

One thing that made me wonder is how well it would work for motions from two different actors; there are two additional challenges I can think of:
1. Size mismatch: The actors' limbs may not be the same size/length, which might cause unnatural movements when spliced together. And I don't think simply scaling up or down would solve the problem.
2. Style: Different actors may have distinct styles of walking and doing things, so would this make the result unnatural?

Also, as mentioned in the paper, I find it interesting that the lack of explicit physics constraints may produce some unrealistic results; for example, an actor would not walk at the same speed when carrying a full cup and would perhaps take smaller steps.

sghosh February 20, 2011 at 9:53 pm

> Bidirectional Search for Interactive Motion Synthesis

The authors in this paper have proposed an efficient algorithm for motion synthesis using motion graphs. The technique used is a modified version of the bidirectional A* search algorithm. Simple bidirectional search expands search trees from the two end nodes (in this case, the end frames of the two motion clips to be merged). In the modified version, the search space is dynamically divided into two parts using a 'cut'; the partial searches are carried out and the two half-trees are then merged at the cut. The authors also use a data structure called a merge hash table to keep track of candidate 'nodes' that can be merged based on certain state-space constraints. The trick in this method is deciding where the cut will be: initially the cut is placed somewhere in the middle, and then it is adjusted depending on the convergence of the two sub-trees. As a result, the search space is reduced to nearly half of its earlier size and there is a significant speedup in searches.
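
For the merge hash table, I picture something like the following grid-hash sketch, where the state encoding, the cell size, and the matching test are all my own guesses rather than details from the paper:

```python
from collections import defaultdict

CELL = 0.25  # grid resolution in state space (assumed)

def cell_key(state):
    """state: tuple of floats (e.g. root x, z, heading). Quantize each
    coordinate so nearby states share a hash bucket."""
    return tuple(int(s // CELL) for s in state)

forward_table = defaultdict(list)   # cell -> nodes from the start tree
backward_table = defaultdict(list)  # cell -> nodes from the goal tree

def try_merge(node_state, own_table, other_table):
    """Insert a node into its own tree's table and return any candidate
    merge partners from the opposite tree in the same cell."""
    key = cell_key(node_state)
    own_table[key].append(node_state)
    return other_table.get(key, [])

# As each search expands a node, it checks the opposite table:
candidates = try_merge((1.3, 0.7, 0.1), forward_table, backward_table)
```

Quantizing the state means two nodes can only become merge candidates if they land in the same cell, which keeps the candidate lists short.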

The authors also present an interactive sketching tool that allows users to intuitively specify and edit complex character motions. The user can draw a curve showing the path of a part of a character in one of the 2D views, and the system produces the optimized 3D path. The overall results seem pretty good. The system is able to generate the motions at an interactive rate; the final rendered version of the character is generated in another pass.

sandrist February 20, 2011 at 10:02 pm

Geostatistical Motion Interpolation

This paper introduces a method that treats motion interpolation as statistical prediction of missing data in an arbitrarily definable parametric space. Universal kriging is used to estimate correlations between the dissimilarity of motions and their distance in parametric space. The term "geostatistics" simply refers to spatial statistics, and "kriging" refers to making spatial predictions based on geostatistics.

This method results in a much smoother motion space constructed from the example motions than previous methods relying on radial basis functions. It can generate plausible motions without using any corrective techniques such as IK. It is accurate and robust, allows for interactive manipulation, and provides reliability estimates of generated interpolations. Spatial constraints and style parameters can be combined into a unified control space.
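
To make the "statistical prediction of missing data" framing concrete, here is a compact ordinary-kriging sketch for a single scalar degree of freedom. The paper actually uses universal kriging (which also estimates a trend) and fits the variogram to motion dissimilarities; the hard-coded Gaussian variogram and the toy parameter below are purely illustrative:

```python
import numpy as np

def variogram(h, sill=1.0, rng=1.0):
    """Assumed Gaussian variogram: dissimilarity grows with distance."""
    return sill * (1.0 - np.exp(-(h / rng) ** 2))

def krige(sample_params, sample_values, query):
    """sample_params: (n, d) control parameters of the example motions;
    sample_values: (n,) one pose DOF per example; query: (d,) new
    parameter. Solves the ordinary-kriging system for blend weights."""
    n = len(sample_params)
    d = np.linalg.norm(sample_params[:, None] - sample_params[None, :],
                       axis=-1)
    # Kriging system with a Lagrange multiplier forcing the weights
    # to sum to one.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[:n, n] = A[n, :n] = 1.0
    b = np.append(variogram(np.linalg.norm(sample_params - query, axis=1)),
                  1.0)
    w = np.linalg.solve(A, b)[:n]
    return w @ sample_values  # blended prediction at the query point

P = np.array([[0.0], [0.5], [1.0]])  # e.g. a hypothetical reach-height parameter
v = np.array([0.1, 0.4, 0.9])        # one joint angle per example motion
print(krige(P, v, np.array([0.7])))
```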

I glossed over much of the in-depth math of this paper, so I would not be able to explain exactly how they pull this method off. I’m a little bit rusty with my knowledge on things like multivariate analysis and radial basis functions, and I wasn’t able to fully wrap my head around what variogram functions are.

adrm February 20, 2011 at 10:05 pm

Synchronized Multi-Character Motion Editing

Key Idea: Use Laplacian editing to manipulate spatial and temporal constraints (motion paths and time warps). Combine this with motion graphs as a way of dealing with long clips. Linear constraints are then incorporated into this system.

Questions: At first glance, I did not fully understand how the linear constraints would be solved using Laplacian editing (they call it "as rigid as possible"); how are these two combined?

I need to understand what they are calling Laplacian editing; I understand what the operator does, but do not yet understand how it is used for these as-rigid-as-possible deformations.

Michael Correll February 20, 2011 at 11:02 pm

I read the Lo & Zwicker paper.

The key question seems simple enough: how to traverse the search space of next frames in a motion graph in a tractable way. The Kovar thesis presented one way (a sort of directionality), but the tradeoffs for optimality may not be worth making. A* would provide an optimal path through the search space, but is expensive. So the natural idea is to start simultaneous searches from the start and end configurations of motion, subdividing the problem. Divide and conquer in search is nothing new, and it’s heartening to know that what works well elsewhere also works in motion.

I liked the sketch interface presented in the paper (in a way, I think that's the more interesting contribution). Doing stuff on the fly (okay, on the order of seconds of computation) is what is interesting to me about the motion graphs technique: being able to generate natural-looking motions without a decade or so of animation experience. Doing the back-end work to make things run smoothly is nice, but the front-end ability to play around with the techniques is better (and I suspect a non-trivial motivation behind the current project).

csv February 20, 2011 at 11:04 pm

Paper: Splicing Upper Body Actions with Locomotion:

Brief information:
Category: Engineering (describes a technique to improve existing methods).
Citations since 2006: 1
Breakthrough idea: Still searching (a mild way of saying: probably nothing).
Unfamiliar terms: None
Kolmogorov minimum description: The paper describes a technique for splicing upper-body motions from one example motion onto lower-body locomotion while preserving the correlations between them.
Reproducibility: Probably hard.

This was a cool paper; it is very simple to read and understand the basic idea. Splicing locomotion and action is an important task for which many existing methods fail to produce realistic frames (this paper is no exception; it fails in many of the situations described in the last section). The biggest challenge (and hence the focus) is to model the high correlation between motion and action.

This paper clearly explains the strengths and weaknesses of the approach, which makes the paper highly readable. It is not very hard to find many more examples where human dexterity (with non-linear correlations) in motion surpasses computer-generated motion.

The method uses a plethora of techniques (time alignment, spatial alignment, pose transfer, point-cloud matching, etc.) and still reports an execution time of less than 2 seconds, which is hard to believe (unless done with extremely simple models).

I am always concerned about words in papers that are subjective. This paper says, "we have developed a fast, simple and reliable algorithm." All of these terms should be peer reviewed.

Nathan Mitchell February 20, 2011 at 11:37 pm

Bidirectional Search for Interactive Motion Synthesis

The authors of this paper demonstrate the use of a bidirectional A* variant as a search method for constructing motion sequences on motion graphs. In order to make efficient use of the bidirectional search, they employ an adaptive cut, the border between the 'front' and 'back' of the search, which changes depending on the rate of convergence of the searches. The goal is to adjust the cut point until each side covers roughly half of the search space. The purpose of using A* is that it generates near-optimal solutions in terms of minimizing the number of graph nodes required for the motion, while the purpose of the bidirectional search is to increase the overall efficiency of A*.

While I am generally familiar with A* from path-finding problems in game AI (though not with any bidirectional approaches), there is something that I am somewhat confused about:

On page 4, near the end of the second column, the authors discuss how they explicitly wish to avoid early termination of one path. However, if one search is able to quickly converge to an optimal solution, would not the act of altering the cut produce a solution that is less than optimal? Perhaps I don't fully grasp the concept, but I would think that arriving at an optimal solution quickly on one side is more valuable than adjusting the cut so that both sides continue to iterate. Unless, by not doing so, the time required for the non-terminated side to converge would be vastly greater than the cost of moving the cut.

sghosh February 22, 2011 at 12:24 pm

I think the authors have confused the use of 'terminate' and 'converge'. The penultimate paragraph in that column says: "Hence, one tree may surpass the cut and terminate much earlier than the other." and "It is also possible that one search converges more quickly to the optimal solution than the other."
What I interpret from that is:
- termination is when one tree hits the cut; then just shift the cut towards the other side
- if both converge before that, then look up the merge tables and find whether there are any merge-able nodes

Leslie February 21, 2011 at 12:32 am

I read “Synchronized Multi-Character Motion Editing” (3).

This paper describes a method that allows animators to interactively edit character motion by specifying various spatial and temporal constraints, such as absolute position, location relative to another character, timing (both absolute and relative), and end-effector locations. The motion of the character subject to all but the end-effector constraints can be posed as a linear least squares problem. In addition to editing continuous motion sequences, discrete motion sequences can be blended when necessary to create a more natural-looking result.

Most of my confusion about this paper was cleared up by watching the video. Initially, I didn't understand the difference between continuous and discrete motion editing, or why this technique was ideal for multiple interacting characters, but the video addressed both questions. My only remaining question is: what is "as-rigid-as-possible curve editing"? This is a term I'm not familiar with.

Aaron Bartholomew February 21, 2011 at 1:43 am

Dear god, this paper is brutal. I thought I was somewhat mathematically literate, but I'm second-guessing myself now. Let me try to summarize what the authors are presenting here. By using universal kriging, a modified form of spatial prediction which compensates for the lack of intrinsic stationarity in motion data, Mukai & Kuriyama are able to parametrize motions using dynamically computed kernel functions to get optimal blending between poses. Radial basis functions accomplish a similar goal, but they apply a uniform kernel across the pose space, which can produce results with high frequencies (jerkiness/imprecision ensues). In order to determine this predictive model, a pre-processing step is required, which computes the trend estimate and the variogram estimate of the data. The trend estimate is actually a hyperplane formed by minimizing the least-squares error, for a given component of a motion, between the trend component (this will need to be elaborated) and the synthesized pose (a.k.a. the residual of this element). The variogram is computed using the residuals of the per-pose distance metric (Kovar's point-cloud metric) between sampled and initial poses. These results parametrize the input motions into a space of time-aligned motions. Then, at runtime, this information is used to compute the blending kernel function as well as to predict residuals, which effectively predicts the appropriate motion.

In order to truly understand this method, I'm going to need help with the following:

-What the hell is kriging?
-What are the variogram function and trend space components of kriging?
-I’m still not clear on the definition of a pseudo-inverse matrix, is that when a sub-matrix of the original is invertible?

Aaron Bartholomew February 21, 2011 at 1:44 am

I should really re-read my responses before posting them…

The paper I read is Tomohiko Mukai and Shigeru Kuriyama’s, “Geostatistical Motion Interpolation”.

Aaron Bartholomew February 22, 2011 at 10:00 pm

After re-reading the geostatistics paper, I think I can sum up the intuition for kriging in relatively few words. The overall goal is to use least-squares regression to minimize the variance of the prediction error. Universal kriging assumes a linear relationship for the spatial dependence of values in the random field (i.e. it linearly interpolates the sampled values). Using this expectance of spatial dependence combined with a variogram (a dissimilarity function of distance, estimated with the samples), we can form a geometrical probability model that will have minimized predictive error variance from kriging. With this statistical model, an optimal blending kernel can be computed that best compensates for the similarity (rather than just fitting a kernel to the nearby samples). This model is intrinsically stationary, or invariant, so it can be maintained as a character moves around.
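
Since the variogram was the sticking point for me, here is the empirical semivariogram computed straight from its definition: half the mean squared difference of sample values, binned by the distance between their parameters. This is generic geostatistics, not the paper's motion-specific distance metric:

```python
import numpy as np

def empirical_variogram(params, values, n_bins=8):
    """params: (n, d) sample locations; values: (n,) sample values.
    Returns (bin centres, mean semivariance per distance bin)."""
    i, j = np.triu_indices(len(params), k=1)
    h = np.linalg.norm(params[i] - params[j], axis=1)  # pair distances
    sq = 0.5 * (values[i] - values[j]) ** 2            # semivariances
    edges = np.linspace(0.0, h.max(), n_bins + 1)
    which = np.clip(np.digitize(h, edges) - 1, 0, n_bins - 1)
    gamma = np.array([sq[which == b].mean() if np.any(which == b)
                      else np.nan for b in range(n_bins)])
    return 0.5 * (edges[:-1] + edges[1:]), gamma
```

A parametric model (spherical, Gaussian, etc.) fitted to this curve is what then goes into the kriging system.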

Although I’m certainly not ready to implement this yet, I feel like I have a much better grasp on the ideas. Kriging threw so many new terms at me that it seemed more difficult than it really is; it really just comes down to a glorified form of least squares.

danieljc February 21, 2011 at 1:54 am

"Splicing Upper-Body Actions with Locomotion" provides a way to take the motion from the upper portion of a body in one data set and combine it with the motion from the lower part of another motion set. It is intended only for motions where the person is moving around. To make this work, the phase of the motions first has to be approximately matched up between the two different motions to be merged. After the phases have been matched, the rotations of the two motion sets also have to be coordinated, which is typically done by matching around the pelvis. Finally, the motion between the shoulders and the hips is also used for matching up the upper and lower body. The effects of physics are not directly calculated when combining the upper and lower motions, but the paper says that by matching up parts of the motion on the upper and lower bodies, nearly accurate results are still often achieved. Also, because of the use of matching shoulder-to-hip motions, large motions in the shoulders of the upper motion can cause problems.

I was curious how well this performed when the walking speeds of two motions were different. The paper talked about matching up the motion phases, but it seems like the overall motion of someone walking at different speeds is also much different even if the timing is fixed.

Jim Hill February 21, 2011 at 4:18 am

I read the paper on Bidirectional Searching for Motion Synthesis

The two main contributions of this paper are, first, a method of searching a motion graph for specified beginning and ending configurations, and second, a sketching interface for intuitive animation.

The search is performed bidirectionally, meaning that two searches are carried out: one from the starting configuration and one from the ending configuration. In order to make things a little faster, the configuration space of the motion graph is roughly divided in half and each search stays on its own side. This divide is called a "cut"; if a search runs out of nodes to look at (either because there aren't any left or there aren't any that satisfy the cost function), the cut may be modified to give that search more stuff to do. The actual searches are done using something called an A* search. This method attempts to predict the best node to expand by summing the current cost to get to that node and a heuristic that attempts to predict the cost through that node to the end configuration. The paper may have described the heuristic used, but I missed it and I'm not sure what it was. Merging of the search trees is done using a "Merge Hash Table", which I don't understand very well. It seems that each configuration is hashed into a grid such that similar configurations show up in the same grid cell. When looking for good matches for merges, the cell that the configuration falls into is searched.
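
For reference, textbook A* really is just a priority queue ordered by g + h, where g is the cost paid so far and h is an (ideally admissible) estimate of the remaining cost. This is the generic algorithm, not Lo & Zwicker's motion-graph variant; the usage line passes a zero heuristic, which degenerates to Dijkstra's algorithm:

```python
import heapq
import itertools

def a_star(start, goal, neighbors, h):
    """neighbors(n) -> iterable of (next_node, edge_cost);
    h(n) -> estimated cost-to-goal. Returns (path, cost)."""
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    frontier = [(h(start), 0.0, next(tie), start, [start])]
    done = set()
    while frontier:
        f, g, _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if node in done:
            continue
        done.add(node)
        for nxt, cost in neighbors(node):
            if nxt not in done:
                heapq.heappush(frontier, (g + cost + h(nxt), g + cost,
                                          next(tie), nxt, path + [nxt]))
    return None, float("inf")

# Toy graph: two routes from 'a' to 'c'; the cheaper one goes via 'b'.
graph = {"a": [("b", 1.0), ("c", 3.0)], "b": [("c", 1.0)], "c": []}
print(a_star("a", "c", lambda n: graph[n], lambda n: 0.0))
```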

The sketching interface allows the user to sketch, in the 2D image plane, a trajectory for the figure to attempt to move along. The algorithm attempts to minimise a cost function consisting of a distance cost, a speed cost, and a complete cost (I’m not sure what the complete cost is, but it has to do with the smoothness of the transitions).

raja February 21, 2011 at 8:31 am

I read the Motion Splicing paper and glanced through the ideas in the Bidirectional Search and Synchronized Multi-Character Editing papers.
The motion splicing paper provides a better solution than the simple DOF-replacement strategy: it uses time warping to sync the motions (upper body and lower body), spatial alignment to fix the orientations, and then posture transfer as a post-processing step. This way, the underlying spatial and temporal relationships of the original motions are carried over to a large extent.

I just LOVED the UI in both the bidirectional search and synchronized multi-character papers. Simple, sweet and intuitive! I think I'll summarize these papers on Wednesday instead. I tried looking at Geostatistical Motion Interpolation since it sounded cool, but terms like Gaussian process regression, intrinsic stationarity, and variogram, and the statistics knowledge one needs to really understand it, made me cry.

points worth discussing:
i) are per-frame IK techniques more "real-time" than techniques that have a more global view (such as spacetime)?
ii) the importance of "fast synthesis" and being able to "edit in real time"; are these one and the same?
iii) can temporal and spatial constraints/requirements be traded for each other? what i'm thinking here is related to the splicing paper. one of the problems was that the "mass" could not be regulated (when an object was being carried). one way to enforce the importance of the mass is either making the steps smaller (spatial editing) or taking slower steps (temporal editing). has this been done before?

Danielle February 21, 2011 at 8:48 am

I read the paper Geostatistical Motion Interpolation. This paper attempts to interpolate between motions by optimizing the kernel functions used for the interpolation. The authors leverage knowledge of spatial computations from geostatistics, most specifically a method called 'universal kriging', to generate real-time motion control over a set of parameters using linear systems.

To really understand this paper, you would need an in-depth understanding of statistical methods. The paper relies heavily on a comparison of kriging to radial basis kernels, both to support its claims of being 'better' than previous methods and to provide a groundwork for understanding the paper.

One of the most interesting things about this work is their integration of previously proven methods from other fields into animation. At first glance, it appears to provide very quick and plausible motions and a real-time constraint solving mechanism that needs little, if any, post-processing correction.

Danielle February 21, 2011 at 8:50 am

Also, as far as questions go, the one overall question I have is why this works. Why do geospatial interpolation methods map to plausible motion?

xlzhang February 21, 2011 at 9:04 pm

Synchronized Multi-Character Motion Editing

This paper presented a relatively simple, intuitive framework for simultaneously editing the motions of multiple characters while enforcing constraints using a Laplacian editing method. Although most of the techniques used in this paper are not entirely novel, I still found it to be a great paper because of the authors' attention to details I'd not previously thought of, not to mention the incredibly simple yet full-featured user interface.

Some of the details they paid attention to in implementing their framework stood out: possible flipped tangents, what to do when a user tries to edit a stationary movement, incremental updates instead of searching well-connected motion graphs, and automatically inserting extra loops of the walking motion to retain naturalness if a sequence is dragged out too long. Also, their method of editing motion addresses something that I thought was strange when reading the motion graph paper last week, specifically the part about not being able to synthesize sharp turns unless the data contains examples of a sharp turn; it seems they solved that problem.

csv February 22, 2011 at 5:44 pm

Hello,

I tried to explore some of the issues that we discussed in class on Monday. One of them was "Laplacian editing", which I now understand is extremely simple and may not take more than 100 lines of C++ code to implement.

Here is the video from one of the authors which will explain what it does:
http://visgraph.cse.ust.hk/projects/dual_laplacian

One issue on which the authors are not forthcoming is volume preservation, but I think there are always workarounds or engineering solutions (for example, distort the model from the invisible side and preserve the shape on the visible side).
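
To check the "100 lines" claim, here is plain Laplacian curve editing in roughly 20 lines of numpy: preserve the curve's differential coordinates in a least-squares sense while softly pinning a few handle points. The uniform Laplacian and the constraint weight are assumptions of mine, and this is the basic version; the as-rigid-as-possible variant used in the editing paper also estimates local rotations, which I omit:

```python
import numpy as np

def laplacian_edit(V, handles, w=10.0):
    """V: (n, 2) curve points; handles: {index: target (x, y) position}.
    Returns the deformed curve."""
    n = len(V)
    L = np.zeros((n, n))
    for i in range(1, n - 1):          # uniform Laplacian, open curve
        L[i, i - 1], L[i, i], L[i, i + 1] = -0.5, 1.0, -0.5
    delta = L @ V                      # differential coords to preserve
    C = np.zeros((len(handles), n))    # soft positional constraints
    targets = np.zeros((len(handles), 2))
    for row, (idx, pos) in enumerate(handles.items()):
        C[row, idx] = w
        targets[row] = w * np.asarray(pos)
    # Stack "keep the shape" and "hit the handles" into one solve.
    A = np.vstack([L, C])
    b = np.vstack([delta, targets])
    return np.linalg.lstsq(A, b, rcond=None)[0]

curve = np.column_stack([np.linspace(0, 1, 20), np.zeros(20)])
edited = laplacian_edit(curve, {0: (0.0, 0.0), 19: (1.0, 0.4)})
```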

csv February 22, 2011 at 10:21 pm

For the third paper, I started reading "Synchronized Multi-Character Motion Editing" with lots of excitement generated by the abstract and introduction, but here are some of the questions I have for this paper that dampened my spirit.

1. The Laplacian operator is a low-pass filter, so any closed curve will soon turn into a circle and any open curve into a straight line (by eigenvalue analysis), and probably too soon. So is it that our actors will soon be forced into moving in circles or straight lines?

2. What exactly are the soft constraints in the solver? Does it mean that some points are allowed to move at half the allowable speed?

3. The whole beauty of the "Laplace solver" is messed up in equation 4; it is no longer positive definite or even diagonally dominant. I don't understand why an elegant, sparse NxN matrix was converted into an overdetermined system of equations so that the pseudo-inverse came into the scene. I am not sure purists will still call it Laplacian editing.

4. The final dampener was that the reported work was done in 2D only. Equation 2 uses only (x, y) coordinates, and those in a local coordinate system. I have never heard of someone doing a global operation with a local system. I am thoroughly confused by their formulation; something is surely missing in the paper.

It is a well-known fact that generalization from a lower dimension to a higher one is non-trivial in most cases. I surely will not buy the authors' arguments in section 7, paragraph 2.

5. Many strong sentences are used in the paper without any references. "Motion path is notorious for its exponential computational complexity" should have been substantiated with citations and a better explanation.

Jim Hill February 23, 2011 at 4:07 am

I looked at Lo and Zwicker again and got about the same understanding. The bidirectional search isn't too complicated, nor is the cut, although I don't understand exactly how the configuration space is divided.

One thing that I don't understand is exactly how the sketch interface is implemented; I understand what it does, just not how it uses the bidirectional search to find the motion.

Michael Correll February 23, 2011 at 8:13 am

My understanding was that the bidirectional search makes the motion synthesis fast enough that it's nearly real-time, so the sketch interface is just drawing spacetime curves as per normal and then running a motion through them. That's what it looks like is going on in the video, in any event.
