Search For Motion Vectors - MULTIMEDIA

The search for motion vectors MV (u , u) as defined above is a matching problem, also called a correspondence problem. Since MV search is computationally expensive, it is usually limited to a small immediate neighborhood. Horizontal and vertical displacements i and j are in the range [—p, p], where p is a positive integer with a relatively small value. This makes a search window of size (2p + 1) x (2p + 1), as the above figure shows. The center of the macroblock (x0, yo) can be placed at each of the grid positions in the window.

For convenience, we use the upper left comer (x, y) as the origin of the macroblock in the Target frame. Let C(x + k, y + l) be pixels in the macroblock in the Target (current) frame and R(x + i + k, y + j + l) be pixels in the macroblock in the Reference frame, where k and l are indices for pixels in the macroblock and i and j are the horizontal and vertical displacements, respectively. The difference between the two macroblocks can then be measured by their Mean Absolute Difference (MAD), defined as

Mean Absolute Difference (MAD)

where N is the size of the macroblock.

The goal of the search is to find a vector (i, j) as the motion vector MV = (u, v), such that MAD(i, j) is minimum:

(u, v) = [(i, j) | MAD(I, j) is minimum, i Є [-p, p], j Є[—p, p]]

We used the mean absolute difference in the above discussion. However, this measure is by no means the only possible choice. In fact, some encoders (e.g., H.263) will simply use the Sinn of Absolute Difference (SAD). Some other common error measures, such as the Mean Square Error (MSE), would also be appropriate.

Sequential Search

The simplest method for finding motion vectors is to sequentially search the whole (2p + 1) x (2p +1) window in the Reference frame (also referred to asfidl search). A macroblock centered at each of the positions within the Window is compared to the macroblock in the Targetframe, pixel by pixel, and their respective MAD is then derived. The vector (i,j) that offers the least MAD is designated the MV (u, v) forthe macroblock in the Targetframe.

PROCEDURE Motion - vector: sequential search

Motion - vector: sequential search

Clearly, the sequential search method is very costly. Each pixel comparision requires three operations (subtraction, absolute value, addition). Thus the cost for obtaining a motion vector for a single macroblock is (2p + 1). (2p + 1). N2. 3 => 0(p2N2).

As an example, let's assume the video has a resolution of 720 x 480 and a frame rate of 30 fps; also, assume p = 15 and N — 16. The number of operations needed for each motion vector search is thus

(2p+1) 2 - N2.3 = 312 x 162 x 3.

Considering that a single image frame has 720 x 480 / N.N macroblocks, and 30 frames each second, the total operations needed per second is

total operations needed per second

This would certainly make real - time encoding of this video difficult.

2D Logarithmic Search

A cheaper version, suboptimal but still usually effective, is called Logarithmic Search. The procedure for a 2D Logarithmic Search of motion vectors takes several iterations and is akin to a binary search. As the following figure illustrates, only nine locations in the search window, marked "1," are initially used as seeds for a MAD - based search. After the one that yields the minimum MAD is located, the center of the new searchregion is moved to it, and the stepsize (offset) is reduced to half. In the next iteration, the nine new locations are marked "2," and so on. For the macroblock centered at (xo, yo) in the Target frame, the procedure is as follows:

PROCEDURE Motion - vector: 2D - Logarithmic - search

Motion - vector: 2D - Logarithmic - search

Instead of sequentially comparing with (2p + 1)2 macroblocks from the Reference frame, the 2-D Logarithmic Search will compare with only 9. ([log2p] + 1) macroblocks. In fact, it would be 8. ([log2 p] + 1) + 1, since the comparison that yielded the least MAD from the last iteration can be reused. Therefore, the complexity is dropped to 0(log p. N2). Since p is usually of the same order of magnitude as N, the saving is substantial compared to 0(p2N2).

Using the same example as in the previous subsection, the total operations per second drop to

total operations per second drop

2D Logarithmic search for motion vector

2D Logarithmic search for motion vector

A three - level hierarchical search for motion vectors

A three - level hierarchical search for motion vectors

Hierarchical Search

The search for motion vectors can benefit from a hierarchical (multi resolution) approach in which initial estimation of the motion vector can be obtained from images with a significantly reduced resolution. The above figure depicts a three - level hierarchical search in which the original image is at level 0, images at levels 1 and 2 are obtained by down sampling from the previous levels by a factor of 2, and the initial search is conducted at level 2. Since the size of the macroblock is smaller and p can also be proportionally reduced at this leyel, the number of operations required is greatly reduced (by a factor of 16 at this level).

The initial estimation of the motion vector is coarse because of the lack of image detail and resolution. It is then refined level by level toward level 0. Given the estimated motion vector (uk, vk) at level k, a 3 x 3 neighborhood centered at (2 . uk, 2 . vk) at level k - 1 is searched for the refined motion vector. In other words, the refinement is such that at level k — I, the motion vector (uk - l, vk - 1) satisfies

(2uk - 1 u k - l ≤ 2uk + 1, 2vk - 1 v k - l ≤ 2vk + 1),

and yields minimum MAD for the macroblock under examination.

Let (xok, yok) denote the center of the macroblock at level k in the Target frame. The procedure for hierarchical motion vector search for the macroblock centered at (xok, yok) in the Target frame can be outlined as follows:

PROCEDURE Motion - vector: hierarchical - search

Hierarchical Search

We will use the same example as in the previous sections to estimate the total operations needed each second for a three - level hierarchical search. For simplicity, the overhead for initially generating multiresolution target and reference frames will not be included, and it will be assumed that Sequential search is used at each level.

The total number of macroblocks processed each second is still 720 x 480 / N.N x 30. However, the operations needed for each macroblock are reduced to

macroblock

Table Comparison of computational cost of motion vector search methods according to the examples

Table Comparison of computational cost of motion vector search methods according to the examples

The above table summarizes the comparison of the three motion vector search methods for a 720 x 480, 30 fps video when p — 15 and 7, respectively.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

MULTIMEDIA Topics