H.263 is an improved video coding standard for video conferencing and other audiovisual services transmitted on Public Switched Telephone Networks (PSTN). It targets low-bitrate communication at less than 64 kbps. It was adopted by ITU-T Study Group 15 in 1995. Like H.261, it uses predictive coding for inter-frames to reduce temporal redundancy, and transform coding for the remaining signal to reduce spatial redundancy (for both intra-frames and the difference macroblocks from inter-frame prediction).

In addition to CIF and QCIF, H.263 supports sub-QCIF, 4CIF, and 16CIF. The following table summarizes the video formats supported by H.263. Uncompressed and at 30 fps, the bitrate for high-resolution video (e.g., 16CIF) would be very high (> 500 Mbps). For compressed video, the standard defines a maximum number of bits per picture (BPPmaxKb), measured in units of 1,024 bits. In practice, compressed H.263 video can achieve a much lower bitrate.
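The "> 500 Mbps" figure can be checked with a few lines of arithmetic. The sketch below assumes 8-bit samples with 4:2:0 chroma subsampling (12 bits per pixel on average), which the text above does not state explicitly, and uses the usual luminance resolutions of the five formats. The names `FORMATS` and `uncompressed_mbps` are illustrative only.

```python
# Luminance resolution (width, height) of each H.263 picture format.
FORMATS = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

# Assumption: 8 bits per sample, 4:2:0 subsampling -> 12 bits per pixel.
BITS_PER_PIXEL = 12
FPS = 30


def uncompressed_mbps(width, height, fps=FPS, bits_per_pixel=BITS_PER_PIXEL):
    """Raw video bitrate in Mbps (10^6 bits per second)."""
    return width * height * bits_per_pixel * fps / 1e6


for name, (w, h) in FORMATS.items():
    print(f"{name:9s} {w}x{h}: {uncompressed_mbps(w, h):6.1f} Mbps")
```

Under these assumptions 16CIF comes out at roughly 584 Mbps, consistent with the bound quoted above.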

As in H.261, the H.263 standard also supports the notion of a group of blocks (GOB). The difference is that GOBs in H.263 do not have a fixed size, and they always start and end at the left and right borders of the picture. As the following figure shows, each QCIF luminance image consists of 9 GOBs and each GOB has 11 x 1 MBs (176 x 16 pixels), whereas each 4CIF luminance image consists of 18 GOBs and each GOB has 44 x 2 MBs (704 x 32 pixels).
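The GOB counts quoted above follow directly from the picture size and the number of macroblock rows per GOB. A minimal sketch, using only the two formats described in the text (the helper name `gob_layout` is hypothetical):

```python
MB_SIZE = 16  # a macroblock covers 16 x 16 luminance pixels


def gob_layout(width, height, mb_rows_per_gob):
    """Return (number of GOBs, MBs per row, GOB width, GOB height) in pixels."""
    mbs_per_row = width // MB_SIZE
    gob_height = mb_rows_per_gob * MB_SIZE
    num_gobs = height // gob_height
    # Each GOB spans the full picture width, border to border.
    return num_gobs, mbs_per_row, width, gob_height


print(gob_layout(176, 144, 1))  # QCIF: one MB row per GOB
print(gob_layout(704, 576, 2))  # 4CIF: two MB rows per GOB
```

Because every GOB spans the full picture width, only the number of macroblock rows per GOB varies between formats.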

Table: Video formats supported by H.263

Arrangement of GOBs in H.263 luminance images

Motion Compensation in H.263

The process of motion compensation in H.263 is similar to that of H.261. The motion vector (MV) is, however, not simply derived from the current macroblock. The horizontal and vertical components of the MV are predicted from the median values of the horizontal and vertical components, respectively, of MV1, MV2, and MV3 from the "previous", "above", and "above and right" macroblocks. Namely, for the macroblock with MV(u, v), writing MVi = (ui, vi) for i = 1, 2, 3:

up = median(u1, u2, u3)
vp = median(v1, v2, v3)

Instead of coding the MV (u, v) itself, the error vector (δu, δv) is coded, where δu = u - up and δv = v - vp. As shown in the following figure, when the current MB is at the border of the picture or GOB, either (0, 0) or MV1 is used as the motion vector for the out-of-bound MB(s).
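The median prediction and the coded difference can be sketched as follows. This is an illustration only: the function names are made up for this example, motion vectors are plain (u, v) tuples, and the border cases (substituting (0, 0) or MV1 for out-of-bound neighbors) are left to the caller.

```python
def median3(a, b, c):
    """Median of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Component-wise median of the three candidate MVs, giving (up, vp)."""
    up = median3(mv1[0], mv2[0], mv3[0])
    vp = median3(mv1[1], mv2[1], mv3[1])
    return up, vp


def mv_difference(mv, mv1, mv2, mv3):
    """Error vector (du, dv) that is coded instead of the MV itself."""
    up, vp = predict_mv(mv1, mv2, mv3)
    return mv[0] - up, mv[1] - vp


# Candidate MVs from the "previous", "above", and "above and right" MBs.
mv1, mv2, mv3 = (2.0, -1.0), (3.5, 0.0), (-1.0, 0.5)
current = (3.0, 0.5)

print(predict_mv(mv1, mv2, mv3))              # (2.0, 0.0)
print(mv_difference(current, mv1, mv2, mv3))  # (1.0, 0.5)
```

Note that the median is taken per component, so the predicted vector need not equal any one of the three candidates.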

To improve the quality of motion compensation, that is, to reduce the prediction error, H.263 supports half-pixel precision, as opposed to the full-pixel precision of H.261. The default range for both the horizontal and vertical components u and v of MV(u, v) is now [-16, 15.5].

The pixel values needed at half-pixel positions are generated by a simple bilinear interpolation method, where A, B, C, D and a, b, c, d are pixel values at full-pixel positions and half-pixel positions, respectively, and "/" indicates division by truncation (also known as integer division).
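The interpolation formulas themselves appear only in the figure, so the sketch below uses the rounding offsets as they are commonly given for H.263; treat them as an assumption rather than a quotation. It further assumes that A, B, C, D are the top-left, top-right, bottom-left, and bottom-right full-pixel neighbors, and that a coincides with A.

```python
def half_pixel_values(A, B, C, D):
    """Bilinear half-pixel interpolation with truncating integer division.

    Assumed layout: A top-left, B top-right, C bottom-left, D bottom-right.
    Pixel values are non-negative, so // behaves as division by truncation.
    """
    a = A                        # full-pixel position
    b = (A + B + 1) // 2         # halfway between A and B
    c = (A + C + 1) // 2         # halfway between A and C
    d = (A + B + C + D + 2) // 4 # center of the four pixels
    return a, b, c, d


print(half_pixel_values(10, 13, 20, 24))  # (10, 12, 15, 17)
```

The "+1" and "+2" terms make the truncating division round to the nearest integer instead of always rounding down.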

Prediction of motion vector in H.263: (a) predicted MV of the current macroblock is the median of (MV1, MV2, MV3); (b) special treatment of MVs when the current macroblock is at the border of the picture or GOB

Optional H.263 Coding Modes

Besides its core coding algorithm, H.263 specifies many negotiable coding options in its various Annexes. Four of the common options are as follows:

Half-pixel prediction by bilinear interpolation in H.263

  • Unrestricted motion vector mode. The pixels referenced are no longer restricted to within the boundary of the image. When the motion vector points outside the image boundary, the value of the boundary pixel geometrically closest to the referenced pixel is used. This is beneficial when image content moves across the edge of the image, which is often caused by object and/or camera movement. This mode also allows an extension of the range of motion vectors. The maximum range of motion vectors is [-31.5, 31.5], which enables efficient coding of fast-moving objects in videos.

  • Syntax-based arithmetic coding mode. Like H.261, H.263 uses variable-length coding as the default coding method for the DCT coefficients. Variable-length coding implies that each symbol must be coded into a fixed, integral number of bits. Arithmetic coding removes this restriction, so a higher compression ratio can be achieved. Experiments show bitrate savings of 4% for inter-frames and 10% for intra-frames in this mode.

As in H.261, the syntax of H.263 is structured as a hierarchy of four layers, each using a combination of fixed- and variable-length codes. In the syntax-based arithmetic coding (SAC) mode, all variable-length coding operations are replaced with arithmetic coding operations. According to the syntax of each layer, the arithmetic encoder needs to code a different bitstream from various components. Since each of these bitstreams has a different distribution, H.263 specifies a model for each distribution, and the arithmetic coder switches the model on the fly, according to the syntax.

  • Advanced prediction mode. In this mode, the block size for motion compensation is reduced from 16 to 8. Four motion vectors (one for each of the 8 x 8 blocks) are generated for each macroblock in the luminance image. Afterward, each pixel in the 8 x 8 luminance prediction block takes a weighted sum of three predicted values, based on the motion vector of the current luminance block and two of the four motion vectors from the neighboring blocks: one from the block at the left or right side of the current luminance block and one from the block above or below. Although sending four motion vectors incurs some additional overhead, this mode generally yields better prediction and hence a considerable gain in compression.

  • PB-frames mode. As shown by MPEG (discussed in detail in the next chapter), the introduction of a B-frame, which is predicted bidirectionally from both the previous frame and the future frame, can often improve the quality of prediction and hence the compression ratio without sacrificing picture quality. In H.263, a PB-frame consists of two pictures coded as one unit: one P-frame, predicted from the previous decoded I-frame or P-frame (or the P-frame part of a PB-frame), and one B-frame, predicted from both the previous decoded I- or P-frame and the P-frame currently being decoded.

The use of the PB-frames mode is indicated in PTYPE. Since the P- and B-frames are closely coupled in the PB-frame, the bidirectional motion vectors for the B-frame need not be generated independently. Instead, they can be temporally scaled and further enhanced from the forward motion vector of the P-frame, so as to reduce the bitrate overhead for the B-frame. PB-frames mode yields satisfactory results for videos with moderate motion. Under large motions, PB-frames do not compress as well as B-frames. An improved mode has been developed in H.263 version 2.
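Returning to the unrestricted motion vector mode in the list above: using "the boundary pixel geometrically closest to the referenced pixel" amounts to clamping the reference coordinates to the picture area. A minimal sketch, assuming the frame is a list of rows and the coordinates are full-pixel integers (the function name is illustrative):

```python
def reference_pixel(frame, x, y):
    """Return frame[y][x], clamping (x, y) to the picture boundary.

    A reference outside the picture resolves to the nearest boundary pixel.
    """
    height, width = len(frame), len(frame[0])
    cx = min(max(x, 0), width - 1)
    cy = min(max(y, 0), height - 1)
    return frame[cy][cx]


frame = [
    [1, 2, 3],
    [4, 5, 6],
]

print(reference_pixel(frame, -2, 0))  # 1: left of the picture
print(reference_pixel(frame, 5, 4))   # 6: beyond the bottom-right corner
print(reference_pixel(frame, 1, -3))  # 2: above the picture
```

Clamping each coordinate independently is equivalent to extending the picture by replicating its border pixels outward.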

A PB-frame in H.263

H.263+ and H.263++

The second version of H.263, also known as H.263+, was approved in January 1998 by ITU-T Study Group 16. It is fully backward compatible with the design of H.263 version 1.

The aim of H.263+ is to broaden the potential applications and offer additional flexibility in terms of custom source formats, different pixel aspect ratios, and clock frequencies. H.263+ includes numerous recommendations to improve coding efficiency and error resilience. It also provides 12 new negotiable modes, in addition to the four optional modes in H.263.

Since its development came after the standardization of MPEG-1 and MPEG-2, it is not surprising that it also adopts many aspects of the MPEG standards. Below, we mention only briefly some of these features and leave their detailed discussion to the next chapter, where we study the MPEG standards.

  • The unrestricted motion vector mode is redefined under H.263+. It uses Reversible Variable Length Coding (RVLC) to encode the difference motion vectors. RVLC minimizes the impact of transmission errors by allowing the decoder to decode in both the forward and reverse directions. The range of motion vectors is extended again to [-256, 256].
  • A slice structure is used to replace GOB for additional flexibility. A slice can contain a variable number of macroblocks. The transmission order can be either sequential or arbitrary, and the shape of a slice is not required to be rectangular.
  • H.263+ implements Temporal, SNR, and Spatial scalabilities. Scalability refers to the ability to handle various constraints, such as display resolution, bandwidth, and hardware capabilities. The enhancement layer for Temporal scalability increases perceptual quality by inserting B-frames between two P-frames.

SNR scalability is achieved by using various quantizers of smaller and smaller step size to encode additional enhancement layers into the bitstream. Thus, the decoder can decide how many enhancement layers to decode according to computational or network constraints. The concept of Spatial scalability is similar to that of SNR scalability. In this case, the enhancement layers provide increased spatial resolution.

  • H.263+ supports an improved PB-frames mode, in which the two motion vectors of the B-frame do not have to be derived from the forward motion vector of the P-frame, as in version 1. Instead, they can be generated independently, as in MPEG-1 and MPEG-2.
  • Deblocking filters in the coding loop reduce blocking effects. The filter is applied to the edge boundaries of the four luminance and two chrominance blocks. The coefficient weights depend on the quantizer step size for the block. This technique results in better prediction as well as a reduction in blocking artifacts.
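The idea behind SNR scalability in the list above, successive quantizers with smaller and smaller step sizes, can be illustrated on a single coefficient. This is a toy sketch and not the H.263+ bitstream procedure: the step sizes, the function names, and the use of simple rounding are all assumptions made for illustration.

```python
def encode_layers(coeff, steps):
    """Quantize coeff with the coarsest step, then refine the residual.

    steps lists the quantizer step sizes, base layer first.
    Returns one quantization level per layer.
    """
    levels = []
    reconstruction = 0.0
    for step in steps:
        level = round((coeff - reconstruction) / step)
        levels.append(level)
        reconstruction += level * step
    return levels


def decode_layers(levels, steps, layers_used):
    """Reconstruct using only the first layers_used layers."""
    pairs = list(zip(levels, steps))[:layers_used]
    return sum(level * step for level, step in pairs)


steps = [16, 4, 1]  # hypothetical step sizes: base layer plus two enhancements
levels = encode_layers(37.0, steps)

print(levels)  # [2, 1, 1]
for n in range(1, len(steps) + 1):
    print(n, decode_layers(levels, steps, n))  # 32, then 36, then 37
```

A decoder that stops after the base layer gets a coarse value, and each extra enhancement layer it decodes reduces the quantization error.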

The development of H.263 has continued beyond its second version, with the new extension known informally as H.263++. H.263++ includes the baseline coding methods of H.263 and additional recommendations for enhanced reference picture selection (ERPS), data partition slice (DPS), and additional supplemental enhancement information. ERPS mode operates by managing a multiframe buffer for stored frames, enhancing coding efficiency and error resilience. DPS mode further improves error resilience by separating header and motion-vector data from DCT coefficient data in the bitstream and by protecting the motion-vector data with a reversible code. The additional supplemental enhancement information provides the ability to add backward-compatible enhancements to an H.263 bitstream.