H.261 is an earlier digital video compression standard. Because its principle of motion-compensation-based compression is very much retained in all later video compression standards, we will start with a detailed discussion of H.261.
The International Telegraph and Telephone Consultative Committee (CCITT) initiated development of H.261 in 1988. The final recommendation was adopted by the International Telecommunication Union, Telecommunication Standardization Sector (ITU-T), formerly CCITT, in 1990.
The standard was designed for videophone, video conferencing, and other audiovisual services over ISDN telephone lines. Initially, it was intended to support multiples (from 1 to 5) of 384 kbps channels. In the end, however, the video codec supports bitrates of p × 64 kbps, where p ranges from 1 to 30. Hence the standard was once known as p * 64, pronounced "p star 64". The standard requires the video encoder's delay to be less than 150 msec, so that the video can be used for real-time, bidirectional video conferencing.
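The relation between p and the channel bitrate can be checked with a few lines of Python. This is only an illustrative sketch; the function name `h261_bitrate_kbps` is made up for this example.

```python
def h261_bitrate_kbps(p: int) -> int:
    """Channel bitrate in kbps for a multiplier p in the range 1 to 30."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in the range 1 to 30")
    return p * 64


# The originally planned multiples of 384 kbps correspond to p = 6, 12, ..., 30,
# because 384 = 6 * 64.
multiples_of_384 = [h261_bitrate_kbps(p) for p in range(6, 31, 6)]
print(multiples_of_384)
```

The largest value, p = 30, gives 1,920 kbps, which matches the upper limit quoted for H.221 in the list of related recommendations.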
H.261 belongs to the following set of ITU recommendations for visual telephony systems:
H.221. Frame structure for an audiovisual channel supporting 64 to 1,920 kbps
H.230. Frame control signals for audiovisual systems
Table Video formats supported by H.261
The above table lists the video formats supported by H.261. Chroma subsampling in H.261 is 4:2:0. Considering the relatively low bitrate in network communications at the time, support for CCIR 601 QCIF is specified as required, whereas support for CIF is optional.
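To see why QCIF was the safer choice for low-bitrate channels, it helps to estimate the uncompressed bitrate of each format. The sketch below assumes luminance resolutions of 176 × 144 for QCIF and 352 × 288 for CIF, 8 bits per sample, and a frame rate of 29.97 fps; these figures are assumptions of the example and are not stated in the text above.

```python
def raw_bits_per_frame(width: int, height: int, bits_per_sample: int = 8) -> int:
    """Bits in one uncompressed 4:2:0 frame with the given luminance size."""
    luma_samples = width * height
    # With 4:2:0 subsampling, Cb and Cr each have half the width and half the height.
    chroma_samples = 2 * (width // 2) * (height // 2)
    return (luma_samples + chroma_samples) * bits_per_sample


def raw_bitrate_mbps(width: int, height: int, fps: float = 29.97) -> float:
    """Uncompressed bitrate in Mbps (frame rate is an assumed value)."""
    return raw_bits_per_frame(width, height) * fps / 1e6


print(round(raw_bitrate_mbps(176, 144), 1))  # QCIF
print(round(raw_bitrate_mbps(352, 288), 1))  # CIF
```

Under these assumptions, fitting QCIF video into a single 64 kbps channel requires a compression ratio of roughly 140:1, and CIF needs about four times that.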
The following figure illustrates a typical H.261 frame sequence. Two types of image frames are defined: intra-frames (I-frames) and inter-frames (P-frames).
I-frames are treated as independent images. Basically, a transform coding method similar to JPEG is applied within each I-frame, hence the name "intra".
P-frames are not independent. They are coded by a forward predictive coding method in which current macroblocks are predicted from similar macroblocks in the preceding I- or P-frame, and differences between the macroblocks are coded. Temporal redundancy removal is hence included in P-frame coding, whereas I-frame coding performs only spatial redundancy removal. It is important to remember that prediction from a previous P-frame is allowed (not just from a previous I-frame).
The interval between pairs of I-frames is a variable and is determined by the encoder. Usually, an ordinary digital video has a couple of I-frames per second. Motion vectors in H.261 are always measured in units of full pixels and have a limited range of ±15 pixels, that is, p = 15.
H.261 Frame sequence
I-frame coding
Intra-Frame (I-Frame) Coding
Macroblocks are of size 16 × 16 pixels for the Y frame of the original image. For Cb and Cr frames, they correspond to areas of 8 × 8, since 4:2:0 chroma subsampling is employed. Hence, a macroblock consists of four Y blocks, one Cb block, and one Cr block, each of size 8 × 8.
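The macroblock layout described above can be sketched as follows. The function name `split_macroblock` and the ordering of the four Y blocks (left to right, top to bottom) are assumptions of this example.

```python
def split_macroblock(y16, cb8, cr8):
    """Split one macroblock into six 8x8 blocks: Y0, Y1, Y2, Y3, Cb, Cr.

    y16 is a 16x16 list of lists (luminance); cb8 and cr8 are 8x8 lists of
    lists (chrominance after 4:2:0 subsampling).
    """
    def y_block(row0, col0):
        return [row[col0:col0 + 8] for row in y16[row0:row0 + 8]]

    return [
        y_block(0, 0), y_block(0, 8),   # top-left, top-right
        y_block(8, 0), y_block(8, 8),   # bottom-left, bottom-right
        cb8, cr8,
    ]


# Example: fill the Y area with its raster index so each block is easy to identify.
y16 = [[r * 16 + c for c in range(16)] for r in range(16)]
cb8 = [[128] * 8 for _ in range(8)]
cr8 = [[128] * 8 for _ in range(8)]
blocks = split_macroblock(y16, cb8, cr8)
```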
For each 8 × 8 block, a DCT transform is applied. As in JPEG, the DCT coefficients go through a quantization stage. Afterwards, they are zigzag-scanned and eventually entropy-coded.
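The transform and scan steps can be illustrated with a small, unoptimized sketch. It assumes an orthonormal 2D DCT-II and the JPEG-style zigzag order; the function names are made up for this example, and a real codec would use a fast DCT.

```python
import math

N = 8


def dct2(block):
    """Direct (slow) orthonormal 2D DCT-II of an 8x8 block given as lists of lists."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def zigzag_order(n=N):
    """(row, col) positions in zigzag order, walking the anti-diagonals alternately."""
    positions = [(i, j) for i in range(n) for j in range(n)]
    # Odd diagonals run top-right to bottom-left, even diagonals the other way.
    return sorted(positions,
                  key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]))


def zigzag_scan(block):
    """Flatten an 8x8 block into a list of 64 values in zigzag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


flat = [[100] * N for _ in range(N)]   # a constant block has energy only in DC
coeffs = dct2(flat)
scanned = zigzag_scan(coeffs)
```

For the constant block, only the first scanned value (the DC coefficient) is nonzero, which is the situation the later run-length stage exploits.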
Inter-Frame (P-Frame) Predictive Coding
The following figure shows the H.261 P-frame coding scheme based on motion compensation. For each macroblock in the Target frame, a motion vector is allocated by one of the search methods discussed earlier. After the prediction, a difference macroblock is derived to measure the prediction error. It is also carried in the form of four Y blocks, one Cb, and one Cr block. Each of these 8 × 8 blocks goes through DCT, quantization, zigzag scan, and entropy coding. The motion vector is also coded.
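As an illustration of the prediction step, the sketch below uses an exhaustive (full) search with the sum of absolute differences (SAD) as the matching criterion. Both choices are assumptions of this example: the text only says that some search method is used, and the function names are hypothetical. Frames are plain lists of lists of luminance values, and candidates that fall outside the reference frame are skipped.

```python
def sad(target, ref, tx, ty, rx, ry, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(abs(target[ty + i][tx + j] - ref[ry + i][rx + j])
               for i in range(n) for j in range(n))


def full_search(target, ref, tx, ty, n=16, p=15):
    """Find the displacement (dx, dy) within +/-p full pixels that minimizes SAD.

    (tx, ty) is the top-left corner of the macroblock in the Target frame.
    Returns (motion_vector, cost).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            rx, ry = tx + dx, ty + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block lies partly outside the reference frame
            cost = sad(target, ref, tx, ty, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def difference_macroblock(target, ref, tx, ty, mv, n=16):
    """Prediction error: Target macroblock minus its motion-compensated prediction."""
    dx, dy = mv
    return [[target[ty + i][tx + j] - ref[ty + dy + i][tx + dx + j]
             for j in range(n)] for i in range(n)]
```

When the match is good, most entries of the difference macroblock are close to zero, so its DCT coefficients quantize to few nonzero values.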
Sometimes a good match cannot be found, that is, the prediction error exceeds a certain acceptable level. The macroblock itself is then encoded (treated as an intra macroblock) and in this case is termed a non-motion-compensated macroblock.
P-frame coding encodes the difference macroblock (not the Target macroblock itself). Since the difference macroblock usually has a much smaller entropy than the Target macroblock, a large compression ratio is attainable.
In fact, even the motion vector is not directly coded. Instead, the difference, MVD, between the motion vectors of the preceding macroblock and current macroblock is sent for entropy coding.
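A minimal sketch of this differential coding is given below. Two points are assumptions of the example rather than statements from the text: the sign convention (current vector minus preceding vector) and the use of (0, 0) as the predictor for the first macroblock. The function names are hypothetical.

```python
def mvs_to_mvds(motion_vectors):
    """Turn a sequence of (dx, dy) motion vectors into differences (MVDs).

    Assumed convention: MVD = current vector - preceding vector,
    with (0, 0) taken as the preceding vector of the first macroblock.
    """
    prev = (0, 0)
    mvds = []
    for mv in motion_vectors:
        mvds.append((mv[0] - prev[0], mv[1] - prev[1]))
        prev = mv
    return mvds


def mvds_to_mvs(mvds):
    """Inverse of mvs_to_mvds: accumulate differences to recover the vectors."""
    prev = (0, 0)
    motion_vectors = []
    for d in mvds:
        prev = (prev[0] + d[0], prev[1] + d[1])
        motion_vectors.append(prev)
    return motion_vectors


mvs = [(3, 2), (4, 2), (4, 1), (-1, 0)]
mvds = mvs_to_mvds(mvs)
```

Neighboring macroblocks often move together, so the differences cluster around zero and are cheaper to entropy-code than the vectors themselves.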
Quantization in H.261
The quantization in H.261 does not use 8 × 8 quantization matrices, as in JPEG and MPEG. Instead, it uses a constant, called step size, for all DCT coefficients within a macroblock.
H.261 P-frame coding based on motion compensation
According to the need (e.g., bitrate control of the video), step size can take on any one of the 31 even values from 2 to 62. One exception, however, is made for the DC coefficient in intra mode, where a step size of 8 is always used. If we use DCT and QDCT to denote the DCT coefficients before and after quantization, then for DC coefficients in intra mode,
QDCT = round(DCT / step_size) = round(DCT / 8)
For all other coefficients,
QDCT = ⌊DCT / step_size⌋ = ⌊DCT / (2 × scale)⌋
where scale is an integer in the range of [1, 31].
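The two quantization rules can be sketched as below. The example assumes round-to-nearest for the intra DC coefficient and the floor operation for all other coefficients, with step_size = 2 × scale; the function names are made up for illustration.

```python
import math


def quantize_intra_dc(dct: float) -> int:
    """Intra DC coefficient: fixed step size of 8, rounded to nearest (assumed)."""
    return math.floor(dct / 8 + 0.5)


def quantize_other(dct: float, scale: int) -> int:
    """All other coefficients: step size 2 * scale, floor operation (assumed)."""
    if not 1 <= scale <= 31:
        raise ValueError("scale must be in the range 1 to 31")
    return math.floor(dct / (2 * scale))


def dequantize_other(qdct: int, scale: int) -> int:
    """Simplest possible inverse: multiply by the step size."""
    return qdct * 2 * scale


# The 31 admissible step sizes are the even values from 2 to 62.
step_sizes = [2 * scale for scale in range(1, 32)]
```

A larger scale maps more coefficients to zero, which is what the rate control loop relies on when the output buffer fills up.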
H.261 Encoder and decoder
The following figure shows a relatively complete picture of how the H.261 encoder and decoder work. Here, Q and Q⁻¹ stand for quantization and its inverse, respectively. Switching of the intra- and inter-frame modes can be readily implemented by a multiplexer. To avoid propagation of coding errors, decoded frames (not the original frames) are stored in Frame Memory and used as reference frames for motion estimation.
H.261: (a) encoder; (b) decoder
Table Data flow at the observation points in H.261 encoder
Table Data flow at the observation points in H.261 decoder
To illustrate the operational detail of the encoder and decoder, let's use a scenario where frames I, P1, and P2 are encoded and then decoded. The data that goes through the observation points, indicated by the circled numbers in the above figure, is summarized in the above tables. We will use I, P1, P2 for the original data, Ĩ, P̃1, P̃2 for the decoded data (usually a lossy version of the original), and P'1, P'2 for the predictions in the Inter-frame mode.
For the encoder, when the Current Frame is an Intra-frame, Point 1 receives macroblocks from the I-frame. These go through the DCT, Quantization, and Entropy Coding steps, and the result is sent to the Output Buffer, ready to be transmitted.
Meanwhile, the quantized DCT coefficients for I are also sent to Q⁻¹ and IDCT and hence appear at Point 4 as Ĩ. Combined with a zero input from Point 5, the data at Point 6 remains as Ĩ, and this is stored in Frame Memory, waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for the subsequent frame P1.
Quantization Control serves as feedback — that is, when the Output Buffer is too full, the quantization step size is increased, so as to reduce the size of the coded data. This is known as an encoding rate control process.
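The feedback idea can be sketched as a simple rule that nudges the quantizer scale according to buffer fullness. The thresholds (80% and 20%), the step of one scale unit, and the function name are all hypothetical; the text does not specify a particular control law.

```python
def update_scale(scale: int, buffer_bits: int, buffer_capacity: int,
                 high: float = 0.8, low: float = 0.2) -> int:
    """Return the quantizer scale to use next, based on Output Buffer fullness.

    A fuller buffer leads to a coarser quantizer (larger scale, fewer bits);
    a nearly empty buffer leads to a finer one. The result stays within 1..31.
    """
    fullness = buffer_bits / buffer_capacity
    if fullness > high:
        scale += 1
    elif fullness < low:
        scale -= 1
    return max(1, min(31, scale))
```

For example, `update_scale(10, 900, 1000)` moves the scale from 10 to 11, since the buffer is 90% full.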
When the subsequent Current Frame P1 arrives at Point 1, the Motion Estimation process is invoked to find the motion vector for the best-matching macroblock in frame Ĩ for each of the macroblocks in P1. The estimated motion vector is sent to both Motion-Compensation-based Prediction and Variable-Length Encoding (VLE). The MC-based Prediction yields the best-matching macroblock for each macroblock in P1. This is denoted as P'1, appearing at Point 2.
At Point 3, the "prediction error" is obtained, which is D1 = P1 − P'1. Now D1 undergoes DCT, Quantization, and Entropy Coding, and the result is sent to the Output Buffer. As before, the quantized DCT coefficients for D1 are also sent to Q⁻¹ and IDCT and appear at Point 4 as D̃1.
Added to P'1 at Point 5, this gives P̃1 = P'1 + D̃1 at Point 6. This is stored in Frame Memory, waiting to be used for Motion Estimation and Motion-Compensation-based Prediction for the subsequent frame P2. The steps for encoding P2 are similar to those for P1, except that P2 will be the Current Frame and P̃1 becomes the Reference Frame.
For the decoder, the input code for frames will be decoded first by Entropy Decoding, Q⁻¹, and IDCT. For Intra-frame mode, the first decoded frame appears at Point 1 and then Point 4 as Ĩ. It is sent as the first output and at the same time stored in the Frame Memory.
Subsequently, the input code for Inter-frame P1 is decoded, and prediction error D̃1 is received at Point 1. Since the motion vector for the current macroblock is also entropy-decoded and sent to Motion-Compensation-based Prediction, the corresponding predicted macroblock P'1 can be located in frame Ĩ and will appear at Points 2 and 3.
Combined with D̃1, we have P̃1 = P'1 + D̃1 at Point 4, and it is sent out as the decoded frame and also stored in the Frame Memory. Again, the steps for decoding P2 are similar to those for P1.
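The walkthrough above can be condensed into a toy closed-loop model. It is heavily simplified and only meant to show why the encoder keeps the decoded frame, not the original, in Frame Memory: frames are short lists of sample values, the motion vector is always zero, the DCT and entropy coding are omitted, and a uniform quantizer with a single step stands in for Q and its inverse. All names are hypothetical.

```python
import math


def q(values, step):
    """Toy quantizer: nearest multiple of step, returned as integer levels."""
    return [math.floor(v / step + 0.5) for v in values]


def q_inv(levels, step):
    """Toy inverse quantizer."""
    return [level * step for level in levels]


def encode_sequence(frames, step):
    """Encode the first frame as intra and the rest as inter (zero motion).

    Returns (coded, reconstructions); reconstructions are the frames the
    encoder stores in its Frame Memory.
    """
    coded, reconstructions, memory = [], [], None
    for frame in frames:
        mode = "intra" if memory is None else "inter"
        prediction = [0] * len(frame) if memory is None else memory
        diff = [x - p for x, p in zip(frame, prediction)]      # prediction error
        levels = q(diff, step)                                  # sent to the decoder
        decoded_diff = q_inv(levels, step)                      # after inverse Q
        memory = [p + d for p, d in zip(prediction, decoded_diff)]
        coded.append((mode, levels))
        reconstructions.append(memory)
    return coded, reconstructions


def decode_sequence(coded, step):
    """Mirror of the encoder's reconstruction path."""
    output, memory = [], None
    for mode, levels in coded:
        prediction = [0] * len(levels) if mode == "intra" else memory
        memory = [p + d for p, d in zip(prediction, q_inv(levels, step))]
        output.append(memory)
    return output


frames = [[100, 104, 97], [103, 108, 95], [110, 111, 90]]
coded, encoder_recon = encode_sequence(frames, step=8)
decoded = decode_sequence(coded, step=8)
```

Because both sides predict from the same decoded data, the decoder output equals the encoder's stored reconstruction, and the quantization error does not accumulate from frame to frame.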
A Glance at the H.261 Video Bitstream Syntax
Let's take a brief look at the H.261 video bitstream syntax. This consists of a hierarchy of four layers: Picture, Group of Blocks (GOB), Macroblock, and Block.
Each GOB has its Start Code (GBSC) and Group number (GN). The GBSC is unique and can be identified without decoding the entire variable-length code in the bitstream. In case a network error causes a bit error or the loss of some bits, H.261 video can be recovered and resynchronized at the next identifiable GOB, preventing the possible propagation of errors.
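Resynchronization amounts to scanning the bitstream for the next start code. The sketch below represents the bitstream as a string of '0' and '1' characters for readability. It assumes a 16-bit GBSC of fifteen zeros followed by a one, and a 4-bit GN immediately after it; these field layouts are assumptions of the example and should be checked against the recommendation. The function name is hypothetical.

```python
GBSC = "0" * 15 + "1"   # assumed 16-bit GOB start code
GN_BITS = 4             # assumed width of the group number field


def find_next_gob(bits: str, start: int = 0):
    """Return (position, group_number) of the next GOB header at or after start.

    Returns None if no complete header is found.
    """
    pos = bits.find(GBSC, start)
    if pos < 0 or pos + len(GBSC) + GN_BITS > len(bits):
        return None
    gn_field = bits[pos + len(GBSC): pos + len(GBSC) + GN_BITS]
    return pos, int(gn_field, 2)


# Four bits of damaged data, then a GOB header with group number 3, then payload.
stream = "1011" + GBSC + "0011" + "110101"
header = find_next_gob(stream)
```

A decoder that loses its place in the variable-length codes can discard data up to the returned position and resume decoding from there.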
Syntax of H.261 video bitstream
GQuant indicates the quantizer to be used in the GOB, unless it is overridden by any subsequent Macroblock Quantizer (MQuant). GQuant and MQuant are referred to as scale. Each macroblock (MB) has its own Address, indicating its position within the GOB, a quantizer (MQuant), and six 8 × 8 image blocks (4 Y, 1 Cb, 1 Cr). Type denotes whether it is an intra or inter, motion-compensated or non-motion-compensated macroblock. Motion Vector Data (MVD) is obtained by taking the difference between the motion vectors of the preceding and current macroblocks. Moreover, since some blocks in the macroblocks match well and some match poorly in Motion Estimation, a bitmask Coded Block Pattern (CBP) is used to indicate this information. Only well-matched blocks will have their coefficients transmitted.
Arrangement of GOBs in H.261 luminance images
Block layer. For each 8 × 8 block, the bitstream starts with the DC value, followed by pairs of the length of a zero-run (Run) and the subsequent nonzero value (Level) for the AC coefficients, and finally the End of Block (EOB) code. The range of "Run" is [0, 63]. "Level" reflects quantized values; its range is [−127, 127], and Level ≠ 0.
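The block-layer structure (DC value, then Run and Level pairs, then EOB) can be sketched as follows. The symbols are kept as Python tuples rather than variable-length codes, and the names `run_level_encode`, `run_level_decode`, and the `"EOB"` marker are hypothetical.

```python
EOB = "EOB"


def run_level_encode(scanned):
    """Encode 64 quantized coefficients (zigzag order, DC first) as symbols.

    Output: ("DC", value), then (run, level) pairs for nonzero ACs, then EOB.
    Trailing zeros are not sent; EOB implies them.
    """
    symbols = [("DC", scanned[0])]
    run = 0
    for coeff in scanned[1:]:
        if coeff == 0:
            run += 1
        else:
            symbols.append((run, coeff))
            run = 0
    symbols.append(EOB)
    return symbols


def run_level_decode(symbols, n=64):
    """Rebuild the n coefficients from the symbol list."""
    coeffs = [symbols[0][1]]
    for symbol in symbols[1:]:
        if symbol == EOB:
            break
        run, level = symbol
        coeffs.extend([0] * run)
        coeffs.append(level)
    coeffs.extend([0] * (n - len(coeffs)))
    return coeffs


scanned = [12, 5, 0, 0, -3, 0, 1] + [0] * 57
symbols = run_level_encode(scanned)
```

Here 64 coefficients shrink to five symbols, which shows why the zigzag scan, by grouping the zeros at the end, pays off.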