Motion models
The key function of PatchMaker is estimating the motion of an image region
selected by the user. The motion is analyzed and described as a parametric
transformation of one of the model types supported by PatchMaker. This
section discusses the merits of using a particular motion model in each
particular case.
Theoretically, a parametric model can be applied to describe
the region motion if:
- The given region corresponds to a planar surface in space, or
- The region "covers" different objects at different depths
relative to the camera, but the camera does not move with respect to
these objects during the episode.
In most other cases, the image of the object will evolve in a way that cannot be
described exactly by a relatively simple parametric model. Even so, certain
models can yield a good enough fit to the real motion.
The use of parametric motion models allows PatchMaker to accurately determine
object motion by taking into account the displacements of all its pixels.
The following motion models are supported:
- Translation: a two-parameter model. The parameters
are the displacements in the horizontal and vertical directions;
- Rigid: a three-parameter model. The displacements
along the two axes and the rotation angle are estimated;
- Rigid & Scale: a four-parameter model.
The displacement vector, the rotation angle, and the scaling factor
are determined;
- Affine: a six-parameter model. The displacement
vector, the rotation angle, the two scaling factors along two perpendicular
axes, and their orientation angle are computed.
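The parameter counts above can be made concrete with a small sketch. The Python below is illustrative only and is not PatchMaker's implementation: the function names, the parameter ordering, and the particular affine decomposition (anisotropic scaling along axes oriented at angle `phi`, followed by a rotation) are assumptions chosen to match the descriptions in the list.

```python
import math

def rot(angle):
    """2x2 rotation matrix (counterclockwise when the y axis points up)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translation(tx, ty):                 # 2 parameters
    return [[1.0, 0.0], [0.0, 1.0]], (tx, ty)

def rigid(tx, ty, angle):                # 3 parameters
    return rot(angle), (tx, ty)

def rigid_scale(tx, ty, angle, r):       # 4 parameters
    return [[r * v for v in row] for row in rot(angle)], (tx, ty)

def affine(tx, ty, angle, s1, s2, phi):  # 6 parameters
    # Scale by s1 and s2 along two perpendicular axes oriented at phi,
    # then rotate by angle (assumed decomposition).
    stretch = matmul(matmul(rot(phi), [[s1, 0.0], [0.0, s2]]), rot(-phi))
    return matmul(rot(angle), stretch), (tx, ty)

def apply_model(model, point):
    """Map a point through a (matrix, displacement) pair."""
    m, (tx, ty) = model
    x, y = point
    return (m[0][0] * x + m[0][1] * y + tx,
            m[1][0] * x + m[1][1] * y + ty)
```

With equal scaling factors (`s1 == s2`), the affine sketch reduces to the Rigid & Scale case, which shows how each model contains the simpler ones as special cases.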
The motion model to be used is a property of the selected segment:

The default is the affine motion model. It is worth remembering, however,
that, when the object motion can be accurately described by a simpler
model, the use of a more complex model might lead to inferior results,
especially if the object is small or has few textured details.
If the object's perspective does not change
The example below shows how the Rigid
& Scale model can be applied when the aspect of the moving
object does not change. Suppose we have two consecutive frames and the
task is to track the motion of the plane in order to place a number on
its body.
 
Let us shift, rotate, and scale the first frame to match the image of
the plane in the second frame. If we simply copy the plane contour from
the first frame to the second frame, we get:

By shifting the first frame by 13 pixels left and 3 pixels down, we get:

The matching can be improved if we now rotate the aircraft image by a
small angle counterclockwise about its "mass center":

Finally, shrinking the yellow contour slightly towards the "mass
center" yields an almost exact match:

This is roughly what PatchMaker does as it tries to find a match for the
key frame mask in another frame. The basic transforms that we applied
combine to yield the four parameter values of the Rigid
& Scale transform. Finally, the overlay available for the key
frame is warped using this transform and pasted onto the next frame.
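The way the basic steps combine into a single four-parameter transform can be sketched in a few lines of Python. This is a hypothetical illustration, not PatchMaker code: the shift (13 pixels left, 3 pixels down) comes from the example, while the rotation angle, the scaling factor, and the "mass center" coordinates are made-up values, and the image y axis is assumed to point down.

```python
import math

def compose_rigid_scale(shift, angle, scale, center):
    """Fold 'shift, then rotate and scale about center' into (r, angle, tx, ty).

    The returned parameters map a point p to r * R(angle) * p + (tx, ty).
    """
    dx, dy = shift
    cx, cy = center
    c, s = math.cos(angle), math.sin(angle)
    # center + r*R*(p + shift - center) = r*R*p + [center + r*R*(shift - center)]
    ux, uy = dx - cx, dy - cy
    tx = cx + scale * (c * ux - s * uy)
    ty = cy + scale * (s * ux + c * uy)
    return scale, angle, tx, ty

def apply_rigid_scale(params, p):
    r, angle, tx, ty = params
    c, s = math.cos(angle), math.sin(angle)
    x, y = p
    return (r * (c * x - s * y) + tx, r * (s * x + c * y) + ty)

def step_by_step(shift, angle, scale, center, p):
    """Apply the three basic steps one after another, for comparison."""
    x, y = p[0] + shift[0], p[1] + shift[1]
    c, s = math.cos(angle), math.sin(angle)
    ux, uy = x - center[0], y - center[1]
    return (center[0] + scale * (c * ux - s * uy),
            center[1] + scale * (s * ux + c * uy))

shift = (-13.0, 3.0)          # 13 px left, 3 px down (y axis pointing down)
angle = math.radians(-2.0)    # assumed; negative looks counterclockwise when y points down
scale = 0.97                  # assumed slight shrink
center = (160.0, 120.0)       # assumed "mass center" of the shifted contour
params = compose_rigid_scale(shift, angle, scale, center)
```

Applying `apply_rigid_scale(params, p)` gives the same result as `step_by_step(...)` for any point `p`, so the three manual steps are fully captured by the four numbers in `params`.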
When an affine model is indispensable
In the following example, the aspect of the object changes during the
episode — the car first moves towards the camera and then turns
right revealing its side:
 
As you can see, matching the car's front panels in these two frames requires
different scaling along the horizontal and vertical axes. This is possible
only with the affine motion model.
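Why unequal scaling rules out the simpler models can be seen from the linear part of the transform. In the sketch below (an illustration with assumed names and made-up scaling values, not PatchMaker code), a 2x2 matrix [[a, b], [c, d]] is a rotation combined with uniform scaling only if a = d and b = -c.

```python
def is_rigid_scale(a, b, c, d, tol=1e-9):
    """True if [[a, b], [c, d]] equals r * rotation for some r > 0."""
    return abs(a - d) <= tol and abs(b + c) <= tol and (a * a + c * c) > tol

# Uniform shrink to 80%: representable by Rigid & Scale.
uniform = (0.8, 0.0, 0.0, 0.8)

# Horizontal shrink to 60%, vertical to 90% (hypothetical values):
# this needs the Affine model.
anisotropic = (0.6, 0.0, 0.0, 0.9)

print(is_rigid_scale(*uniform))      # True
print(is_rigid_scale(*anisotropic))  # False
```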
Instability of tracking with excessive parameters
Suppose we have to follow the motion of a low-contrast or noisy region
as in the example below:

If an overlay fails to move correctly and stably under
the affine model, the cause may be the low "information density"
of the tracked region. To estimate six affine parameters, the
mask region must contain at least three salient gradients in different directions
or three distinguishable spots. When it does not, a model with fewer
parameters (Translation, Rigid, or
Rigid & Scale) can yield better results.
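The need for three distinguishable spots can be illustrated with the smallest possible case: solving for the six affine coefficients from exactly three point correspondences. This is only a sketch under assumed names. PatchMaker's actual estimator is not described here, and it works with all pixels of the region rather than three points. The solve breaks down when the three points are collinear, which mirrors a region whose detail lies along a single direction.

```python
def affine_from_three_points(src, dst, eps=1e-9):
    """Solve x' = a*x + b*y + tx, y' = c*x + d*y + ty from 3 correspondences.

    Returns (a, b, c, d, tx, ty), or None when the source points are
    (nearly) collinear and the six parameters are not determined.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if abs(det) < eps:
        return None

    def solve(v1, v2, v3):
        # Cramer's rule for p*x + q*y + t = v at the three source points.
        p = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
        q = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
        t = v1 - p * x1 - q * y1
        return p, q, t

    a, b, tx = solve(dst[0][0], dst[1][0], dst[2][0])
    c, d, ty = solve(dst[0][1], dst[1][1], dst[2][1])
    return a, b, c, d, tx, ty
```

For example, the source points (0, 0), (1, 0), (0, 1) determine all six coefficients, whereas (0, 0), (1, 1), (2, 2) return `None`. A simpler model needs less: a pure translation is fixed by a single correspondence.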
Motion models in PatchMaker do not cover all types
of transformations of a planar surface
In the video clip below, the task was to "color" the fence.
However, PatchMaker (version 1.0) failed to do this. The most complex
motion that PatchMaker can currently handle is Affine
motion represented by affine transformations in the image plane.
Such transformations map parallel lines to parallel lines. This
condition is not satisfied in the two frames below:
 
The right transform to be used here is an eight-parameter perspective
transform, but PatchMaker version 1.0 does not support it.
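The difference between the two classes of transforms can be checked numerically. The sketch below uses made-up coefficients and hypothetical helper names. It maps two parallel segments (standing in for fence posts) through an affine transform and through a perspective transform, then tests whether the images are still parallel.

```python
def affine_map(p, a, b, c, d, tx, ty):
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)

def perspective_map(p, h):
    """Apply a 3x3 perspective matrix h (nested lists, row-major) to point p."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def map_segment(seg, f):
    return (f(seg[0]), f(seg[1]))

def parallel(seg1, seg2, tol=1e-9):
    """True if the direction vectors of the two segments have zero cross product."""
    (p0, p1), (q0, q1) = seg1, seg2
    cross = ((p1[0] - p0[0]) * (q1[1] - q0[1])
             - (p1[1] - p0[1]) * (q1[0] - q0[0]))
    return abs(cross) <= tol

# Two vertical fence posts: parallel segments.
post_a = ((0.0, 0.0), (0.0, 100.0))
post_b = ((50.0, 0.0), (50.0, 100.0))

AFF = (1.25, 0.25, -0.125, 0.75, 5.0, -3.0)   # made-up affine coefficients
# Made-up perspective matrix: eight free parameters, h[2][2] fixed to 1.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.001, 0.002, 1.0]]

affine_posts = [map_segment(s, lambda p: affine_map(p, *AFF)) for s in (post_a, post_b)]
persp_posts = [map_segment(s, lambda p: perspective_map(p, H)) for s in (post_a, post_b)]

print(parallel(*affine_posts))  # True
print(parallel(*persp_posts))   # False
```

The nonzero entries in the last row of `H` are what make the posts converge. Setting them to zero turns the perspective transform back into an affine one.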
Exact motion model equations
The overlay color at pixel (x_k, y_k) of frame #k retains the color
C(x_0, y_0) of the pixel (x_0, y_0) in the match frame (frame #0), where
(x_k, y_k) and (x_0, y_0) are related by one of the following transformations:
- The Translation model, where t_x, t_y
are the displacement parameters;
- The Rigid model, where the additional parameter
is the rotation angle (in radians);
- The Rigid & Scale model, where r is the scaling factor;
- The Affine model, where a, b, c, d, t_x, t_y are the
coefficients of an affine transform.
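For reference, the conventional forms of these four transformations are written out below. Several details are assumptions and PatchMaker's own notation may differ: the direction of the mapping (from the match frame to frame #k), the sign convention of the rotation, the symbol \varphi for the rotation angle, and the placement of a, b, c, d in the affine matrix.

```latex
% Translation (2 parameters)
\begin{align*}
x_k &= x_0 + t_x, &
y_k &= y_0 + t_y
\end{align*}

% Rigid (3 parameters)
\begin{align*}
x_k &= x_0\cos\varphi - y_0\sin\varphi + t_x, \\
y_k &= x_0\sin\varphi + y_0\cos\varphi + t_y
\end{align*}

% Rigid & Scale (4 parameters)
\begin{align*}
x_k &= r\,(x_0\cos\varphi - y_0\sin\varphi) + t_x, \\
y_k &= r\,(x_0\sin\varphi + y_0\cos\varphi) + t_y
\end{align*}

% Affine (6 parameters)
\begin{align*}
x_k &= a\,x_0 + b\,y_0 + t_x, \\
y_k &= c\,x_0 + d\,y_0 + t_y
\end{align*}
```

Each model is a special case of the next: setting r = 1 in Rigid & Scale gives Rigid, and choosing a = d = r\cos\varphi, b = -c = -r\sin\varphi in Affine gives Rigid & Scale.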