The MPEG-1 audio/video digital compression standard was approved by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) MPEG group in November 1991 for Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s. Common digital storage media include compact discs (CDs) and video compact discs (VCDs). Out of the specified 1.5 Mbps, 1.2 Mbps is intended for coded video, and 256 kbps can be used for stereo audio. This yields a picture quality comparable to VHS cassettes and a sound quality equal to CD audio.

In general, MPEG-1 adopts the CCIR601 digital TV format, also known as Source Input Format (SIF). MPEG-1 supports only noninterlaced video. Normally, its picture resolution is 352 x 240 for NTSC video at 30 fps, or 352 x 288 for PAL video at 25 fps. It uses 4:2:0 chroma subsampling.
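To get a feel for how much compression the 1.2 Mbps video budget implies, the arithmetic can be sketched as below. This is an illustrative calculation only: it assumes 8 bits per sample (not stated in the text), and the function name `raw_bitrate_bps` is made up for this example.

```python
def raw_bitrate_bps(width, height, fps, bits_per_sample=8):
    """Uncompressed bitrate of 4:2:0 video.

    4:2:0 keeps one luma sample per pixel plus two chroma planes, each
    subsampled by 2 horizontally and vertically.
    bits_per_sample=8 is an assumption, not a figure from the text.
    """
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample * fps


VIDEO_BUDGET_BPS = 1_200_000  # 1.2 Mbps intended for coded video (from the text)

formats = {"NTSC SIF": (352, 240, 30), "PAL SIF": (352, 288, 25)}
for name, (w, h, fps) in formats.items():
    raw = raw_bitrate_bps(w, h, fps)
    ratio = raw / VIDEO_BUDGET_BPS
    print(f"{name}: raw {raw / 1e6:.1f} Mbps, roughly {ratio:.0f}:1 compression")
```

Under these assumptions both SIF variants come out at the same raw rate, because the NTSC format trades fewer lines for more frames per second.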

The MPEG-1 standard, also referred to as ISO/IEC 11172, has five parts: 11172-1 Systems, 11172-2 Video, 11172-3 Audio, 11172-4 Conformance, and 11172-5 Software. Briefly, Systems takes care of, among many things, dividing output into packets of bitstreams, multiplexing, and synchronization of the video and audio streams. Conformance (or compliance) specifies the design of tests for verifying whether a bitstream or decoder complies with the standard. Software includes a complete software implementation of the MPEG-1 standard decoder and a sample software implementation of an encoder.

Figure: The need for bidirectional search

Motion Compensation in MPEG-1

As discussed in the last chapter, motion-compensation-based video encoding in H.261 works as follows: in motion estimation, each macroblock of the target P-frame is assigned a best-matching macroblock from the previously coded I- or P-frame. This is called a prediction. The difference between the macroblock and its matching macroblock is the prediction error, which is sent to the DCT and its subsequent encoding steps.

Since the prediction is from a previous frame, it is called forward prediction. Due to unexpected movements and occlusions in real scenes, the target macroblock may not have a good matching entity in the previous frame. The above figure illustrates that the macroblock containing part of a ball in the target frame cannot find a good matching macroblock in the previous frame, because half of the ball was occluded by another object. However, a match can readily be obtained from the next frame.

MPEG introduces a third frame type, the B-frame, and its accompanying bidirectional motion compensation. The following figure illustrates the motion-compensation-based B-frame coding idea. In addition to the forward prediction, a backward prediction is also performed, in which the matching macroblock is obtained from a future I- or P-frame in the video sequence. Consequently, each macroblock from a B-frame will specify up to two motion vectors, one from the forward and one from the backward prediction.

If matching in both directions is successful, two motion vectors will be sent, and the two corresponding matching macroblocks are averaged (indicated by "%" in the figure) before comparing to the target macroblock for generating the prediction error. If an acceptable match can be found in only one of the reference frames, only one motion vector and its corresponding macroblock will be used, from either the forward or the backward prediction.
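The averaging rule described above can be sketched in a few lines. This is a toy illustration, not the normative procedure: blocks are small nested lists rather than 16 x 16 macroblocks, the rounding in `average_blocks` is an assumption, and all function names are hypothetical.

```python
def average_blocks(fwd, bwd):
    """Average two equally sized blocks sample by sample.

    Rounding with (a + b + 1) // 2 is an assumption of this sketch.
    """
    return [[(a + b + 1) // 2 for a, b in zip(row_f, row_b)]
            for row_f, row_b in zip(fwd, bwd)]


def predict(forward_match=None, backward_match=None):
    """Build the prediction from whichever reference matches are available."""
    if forward_match is not None and backward_match is not None:
        return average_blocks(forward_match, backward_match)
    if forward_match is not None:
        return forward_match
    if backward_match is not None:
        return backward_match
    raise ValueError("no reference block supplied")


def prediction_error(target, prediction):
    """Sample-wise difference that would be passed on to the DCT."""
    return [[t - p for t, p in zip(row_t, row_p)]
            for row_t, row_p in zip(target, prediction)]


# Toy 2 x 2 blocks: the target lies exactly between the two references.
target = [[100, 102], [98, 101]]
fwd = [[90, 100], [96, 100]]
bwd = [[110, 104], [100, 102]]

print(prediction_error(target, predict(fwd, bwd)))  # both references used
print(prediction_error(target, predict(forward_match=fwd)))  # forward only
```

In this made-up example the bidirectional prediction leaves a zero error block, whereas the forward-only prediction does not, which is the situation the ball example is meant to convey.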

Figure: B-frame coding based on bidirectional motion compensation

Because a B-frame cannot be coded or decoded until its future reference frame is available, the resulting delay and need for buffering become an important issue in real-time network transmission, especially in streaming MPEG video.

The following figure illustrates a possible sequence of video frames. The actual frame pattern is determined at encoding time and is specified in the video's header. MPEG uses M to indicate the interval between a P-frame and its preceding I- or P-frame, and N to indicate the interval between two consecutive I-frames. In the figure, M = 3 and N = 9. A special case is M = 1, when no B-frame is used.
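The roles of M and N can be made concrete with a small generator for the display-order frame pattern. This is a minimal sketch under the assumption that N is a multiple of M; the function name `gop_pattern` is hypothetical.

```python
def gop_pattern(m, n):
    """Display-order frame types for one I-frame period.

    m: interval between a P-frame and its preceding I- or P-frame
    n: interval between two consecutive I-frames
    This sketch assumes n is a multiple of m.
    """
    if m < 1 or n < 1 or n % m != 0:
        raise ValueError("this sketch assumes n is a positive multiple of m")
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)


print(gop_pattern(3, 9))  # IBBPBBPBB
print(gop_pattern(1, 4))  # IPPP (M = 1: no B-frames)
```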

Since the MPEG encoder and decoder cannot work for any macroblock from a B-frame without its succeeding P- or I-frame, the actual coding and transmission order is different from the display order of the video (shown above).
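The reordering can be sketched as follows: each anchor frame (I or P) is emitted first, followed by the B-frames that precede it in display order. This is an illustrative sketch only; the function name `coding_order` and the string representation of frame types are assumptions of this example.

```python
def coding_order(display):
    """Reorder display-order frame types into coding/transmission order.

    Every B-frame is held back until the next anchor frame (I or P),
    which it needs for backward prediction, has been emitted.
    Returns a list of (display_index, frame_type) pairs.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            order.append((index, frame_type))
            order.extend(pending_b)
            pending_b = []
    # Trailing B-frames with no later anchor in this input are appended as-is.
    order.extend(pending_b)
    return order


display = "IBBPBBPBBI"  # M = 3, N = 9, plus the next I-frame
print(" ".join(f"{t}{i}" for i, t in coding_order(display)))
# I0 P3 B1 B2 P6 B4 B5 I9 B7 B8
```

The gap between a B-frame's display index and its position in the output is what forces the buffering and delay at both encoder and decoder.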

Figure: MPEG frame sequence

Table: The MPEG-1 constrained parameter set

Other Major Differences from H.261

Besides introducing bidirectional motion compensation (the B-frames), MPEG-1 also differs from H.261 in the following aspects:

Source formats. H.261 supports only CIF (352 x 288) and QCIF (176 x 144) source formats. MPEG-1 supports SIF (352 x 240 for NTSC, 352 x 288 for PAL). It also allows specification of other formats, as long as the constrained parameter set (CPS), shown in the above table, is satisfied.

Slices. Instead of GOBs, as in H.261, an MPEG-1 picture can be divided into one or more slices, which are more flexible than GOBs. They may contain variable numbers of macroblocks in a single picture and may also start and end anywhere, as long as they fill the whole picture. Each slice is coded independently.

Figure: Slices in an MPEG-1 picture

Table: Default quantization table (Q1) for intra-coding

For example, the slices can have different scale factors in the quantizer. This provides additional flexibility in bitrate control.

Moreover, the slice concept is important for error recovery, because each slice has a unique slice-start-code. A slice in MPEG is similar to the GOB in H.261 (and H.263): it is the lowest level in the MPEG layer hierarchy that can be fully recovered without decoding the entire set of variable-length codes in the bitstream.

Quantization. MPEG-1 quantization uses different quantization tables for its intra- and inter-coding. The quantizer numbers for intra-coding vary within a macroblock. This is different from H.261, where all quantizer numbers for AC coefficients are constant within a macroblock.

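Table: Default quantization table (Q2) for inter-coding

The step_size[i, j] value is now determined by the product of Q[i, j] and scale, where Q is one of the above quantization tables, Q1 or Q2, and scale is an integer in the range [1, 31]. Using DCT and QDCT to denote the DCT coefficients before and after quantization, for DCT coefficients in intra-mode,

$$QDCT[i, j] = \mathrm{round}\left(\frac{8 \times DCT[i, j]}{step\_size[i, j]}\right) = \mathrm{round}\left(\frac{8 \times DCT[i, j]}{Q_1[i, j] \times scale}\right)$$

and for DCT coefficients in inter-mode,

$$QDCT[i, j] = \left\lfloor \frac{8 \times DCT[i, j]}{step\_size[i, j]} \right\rfloor = \left\lfloor \frac{8 \times DCT[i, j]}{Q_2[i, j] \times scale} \right\rfloor$$

where Q1 and Q2 refer to the above tables for intra-coding and inter-coding, respectively.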
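As a rough illustration of the two quantization rules, the sketch below assumes the commonly cited form in which the coefficient is multiplied by 8 and divided by step_size = Q[i, j] x scale, with rounding in intra-mode and flooring in inter-mode. The function names, the tie-breaking choice in the rounding, and the table entry 16 used in the example are assumptions of this sketch, not values taken from the tables above.

```python
import math


def step_size(q_entry, scale):
    """step_size[i, j] = Q[i, j] * scale, with scale an integer in [1, 31]."""
    if not 1 <= scale <= 31:
        raise ValueError("scale must be in the range [1, 31]")
    return q_entry * scale


def quantize_intra(dct, q1_entry, scale):
    """Intra-mode: round(8 * DCT / step_size).

    Ties are rounded up here via floor(x + 0.5); how ties are resolved
    is an assumption of this sketch.
    """
    return math.floor(8 * dct / step_size(q1_entry, scale) + 0.5)


def quantize_inter(dct, q2_entry, scale):
    """Inter-mode: floor(8 * DCT / step_size)."""
    return math.floor(8 * dct / step_size(q2_entry, scale))


# Hypothetical coefficient and table entry: 8 * 55 / (16 * 2) = 13.75
print(quantize_intra(55, 16, 2))  # 14
print(quantize_inter(55, 16, 2))  # 13
```

A larger scale gives a larger step size and therefore coarser quantization, which is how the per-slice scale factor acts as a bitrate control knob.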

  • To increase precision of the motion-compensation-based predictions and hence reduce prediction errors, MPEG-1 allows motion vectors to be of subpixel precision (1/2 pixel). The technique of bilinear interpolation discussed for H.263 can be used to generate the needed values at half-pixel locations.

  • MPEG-1 supports larger gaps between I- and P-frames and consequently a much larger motion-vector search range. Compared to the maximum range of ±15 pixels for motion vectors in H.261, MPEG-1 supports a range of [-512, 511.5] for half-pixel precision and [-1,024, 1,023] for full-pixel precision motion vectors. However, due to the practical limitation in its picture resolution, such a large maximum range might never be used.

  • The MPEG-1 bitstream allows random access. This is accomplished by the Group of Pictures (GOP) layer, in which each GOP is time-coded. In addition, the first frame in any GOP is an I-frame, which eliminates the need to reference other frames. Thus, the GOP layer allows the decoder to seek a particular position within the bitstream and start decoding from there.
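The half-pixel motion vectors mentioned in the first point above require sample values between the pixel grid positions. A minimal sketch of bilinear interpolation at half-pixel locations follows; the coordinate convention (positions given in half-pixel units), the rounding offsets, and the name `sample_half_pel` are assumptions made for illustration.

```python
def sample_half_pel(frame, x2, y2):
    """Return the sample at position (x2 / 2, y2 / 2) in a 2-D list `frame`.

    x2 and y2 are non-negative coordinates in half-pixel units, so even
    values land on the pixel grid and odd values fall halfway between
    two pixels. The four neighbours are averaged with rounding.
    """
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)  # same column when x2 is even
    y1 = y0 + (y2 % 2)  # same row when y2 is even
    a = frame[y0][x0]
    b = frame[y0][x1]
    c = frame[y1][x0]
    d = frame[y1][x1]
    return (a + b + c + d + 2) // 4


frame = [[10, 20],
         [30, 40]]
print(sample_half_pel(frame, 0, 0))  # 10, exactly on the grid
print(sample_half_pel(frame, 1, 0))  # 15, halfway between 10 and 20
print(sample_half_pel(frame, 1, 1))  # 25, centre of the four samples
```

When a coordinate is even, the duplicated neighbours make the formula collapse to a two-sample average or to the original sample, so one expression covers all three cases.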

The following table lists typical sizes (in kilobytes) for all types of MPEG-1 frames. It can be seen that the typical size of compressed P-frames is significantly smaller than that of I-frames, because inter-frame compression exploits temporal redundancy. Notably, B-frames are even smaller than P-frames, due partially to the advantage of bidirectional prediction. It is also because B-frames are often given the lowest priority in terms of preservation of quality; hence, a higher compression ratio can be assigned.

Table: Typical compression performance of MPEG-1 frames

Figure: Layers of MPEG-1 video bitstream

MPEG-1 Video Bitstream

The above figure depicts the six hierarchical layers for the bitstream of an MPEG-1 video.

  • Sequence layer. A video sequence consists of one or more groups of pictures (GOPs). It always starts with a sequence header. The header contains information about the picture, such as horizontal size and vertical size, pixel-aspect-ratio, frame-rate, bit-rate, buffer size, quantization-matrix, and so on. Optional sequence headers between GOPs can indicate parameter changes.

  • Group of Pictures (GOP) layer. A GOP contains one or more pictures, one of which must be an I-picture. The GOP header contains information such as a time-code to indicate hour-minute-second-frame from the start of the sequence.

  • Picture layer. The three common MPEG-1 picture types are I-picture (intra-coding), P-picture (predictive coding), and B-picture (bidirectional predictive coding), as discussed above. There is also an uncommon type, D-picture (DC coded), in which only DC coefficients are retained. MPEG-1 does not allow mixing D-pictures with other types, which makes D-pictures impractical.

Table: Profiles and Levels in MPEG-2

  • Slice layer. As mentioned earlier, MPEG-1 introduced the slice notion for bitrate control and for recovery and synchronization after lost or corrupted bits. Slices may have variable numbers of macroblocks in a single picture. The length and position of each slice are specified in the header.

  • Macroblock layer. Each macroblock consists of four Y blocks, one Cb block, and one Cr block. All blocks are 8 x 8.

  • Block layer. If the blocks are intra-coded, the differential DC coefficient (DPCM of DCs, as in JPEG) is sent first, followed by variable-length codes (VLC) for AC coefficients. Otherwise, DC and AC coefficients are both coded using the variable-length codes.
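The differential coding of DC coefficients mentioned for the block layer can be sketched as a simple DPCM round trip. This is an illustration of the general idea only: the initial predictor value of 0 and the function names are assumptions, and the entropy coding of the differences is left out.

```python
def dpcm_encode(dc_values, predictor=0):
    """Replace each DC value by its difference from the previous one.

    The starting predictor of 0 is an assumption of this sketch.
    """
    diffs = []
    for dc in dc_values:
        diffs.append(dc - predictor)
        predictor = dc
    return diffs


def dpcm_decode(diffs, predictor=0):
    """Rebuild the DC values by accumulating the differences."""
    values = []
    for diff in diffs:
        predictor += diff
        values.append(predictor)
    return values


dc_values = [120, 124, 123, 130]  # made-up DC coefficients of successive blocks
diffs = dpcm_encode(dc_values)
print(diffs)               # [120, 4, -1, 7]
print(dpcm_decode(diffs))  # [120, 124, 123, 130]
```

Because neighbouring blocks tend to have similar average brightness, the differences are usually small numbers, which suit short variable-length codes better than the raw DC values.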

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd