The JPEG2000 Standard

The JPEG standard is no doubt the most successful and popular image format to date. The main reason for its success is the quality of its output at relatively good compression ratios. However, in anticipation of the needs and requirements of next-generation imagery applications, the JPEG committee has defined a new standard: JPEG2000.

The new JPEG2000 standard aims to provide not only a better rate-distortion tradeoff and improved subjective image quality but also additional functionalities the current JPEG standard lacks. In particular, the JPEG2000 standard addresses the following problems:

  • Low-bitrate compression. The current JPEG standard offers excellent rate-distortion performance at medium and high bitrates. However, at bitrates below 0.25 bpp, subjective distortion becomes unacceptable. This is important if we hope to receive images on our web-enabled ubiquitous devices, such as web-aware wristwatches, and so on.

  • Lossless and lossy compression. Currently, no standard can provide superior lossless compression and lossy compression in a single bitstream.

  • Large images. The new standard will allow image resolutions greater than 64K × 64K without tiling. It can handle image sizes up to $2^{32} - 1$.

  • Single decompression architecture. The current JPEG standard has 44 modes, many of which are application-specific and not used by the majority of JPEG decoders.

  • Transmission in noisy environments. The new standard will provide improved error resilience for transmission in noisy environments such as wireless networks and the Internet.

  • Progressive transmission. The new standard provides seamless quality and resolution scalability from low to high bitrates. The target bitrate and reconstruction resolution need not be known at the time of compression.

  • Region-of-interest coding. The new standard permits specifying Regions of Interest (ROI), which can be coded with better quality than the rest of the image. We might, for example, like to code the face of someone making a presentation with more quality than the surrounding furniture.

  • Computer-generated imagery. The current JPEG standard is optimized for natural imagery and does not perform well on computer-generated imagery.

  • Compound documents. The new standard offers metadata mechanisms for incorporating additional non-image data as part of the file. This might be useful for including text along with imagery, as one important example.

In addition, JPEG2000 is able to handle up to 256 channels of information, whereas the current JPEG standard is able to handle only three color channels. Such huge quantities of data are routinely produced in satellite imagery.

Consequently, JPEG2000 is designed to address a variety of applications, such as the Internet, color facsimile, printing, scanning, digital photography, remote sensing, mobile applications, medical imagery, digital libraries, e-commerce, and so on. The method looks ahead and provides the power to carry out remote browsing of large compressed images.

The JPEG2000 standard operates in two coding modes: DCT-based and wavelet-based. The DCT-based coding mode is offered for backward compatibility with the current JPEG standard and implements baseline JPEG. All the new functionalities and improved performance reside in the wavelet-based mode.

Code block structure of EBCOT

Main Steps of JPEG2000 Image Compression

The main compression method used in JPEG2000 is the Embedded Block Coding with Optimized Truncation (EBCOT) algorithm, designed by Taubman. In addition to providing excellent compression efficiency, EBCOT produces a bitstream with a number of desirable features, including quality and resolution scalability and random access.

The basic idea of EBCOT is the partition of each subband (LL, LH, HL, HH) produced by the wavelet transform into small blocks called code blocks. Each code block is coded independently, in such a way that no information from any other block is used.

A separate, scalable bitstream is generated for each code block. With its block-based coding scheme, the EBCOT algorithm has improved error resilience. The EBCOT algorithm consists of three steps:

  • Block coding and bitstream generation
  • Postcompression rate-distortion (PCRD) optimization
  • Layer formation and representation

Block Coding and Bitstream Generation. Each subband generated by the 2D discrete wavelet transform is first partitioned into small code blocks of size 32 × 32 or 64 × 64. Then the EBCOT algorithm generates a highly scalable bitstream for each code block $B_i$. The bitstream associated with $B_i$ may be independently truncated to any member of a predetermined collection of lengths $R_i^n$, with associated distortions $D_i^n$.
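To make the partitioning step concrete, here is a minimal sketch (our own illustration, not from the standard; the function name and use of NumPy are assumptions) of splitting one subband into code blocks:

```python
import numpy as np

def partition_into_code_blocks(subband, block_size=64):
    """Split a 2-D subband into code blocks of at most block_size x block_size.

    Blocks on the right and bottom edges may be smaller if the subband
    dimensions are not multiples of block_size.
    """
    rows, cols = subband.shape
    blocks = []
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            blocks.append(subband[r:r + block_size, c:c + block_size])
    return blocks

# Example: a 100 x 70 LH subband yields 4 code blocks with block_size=64.
lh = np.random.randn(100, 70)
code_blocks = partition_into_code_blocks(lh, block_size=64)
```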

For each code block $B_i$, let $s_i[\mathbf{k}] = s_i[k_1, k_2]$ be the two-dimensional sequence of subband samples in the code block, with $k_1$ and $k_2$ the row and column indices. (With this definition, the horizontal high-pass subband HL must be transposed so that $k_1$ and $k_2$ have meanings consistent with the other subbands. This transposition means that the HL subband can be treated in the same way as the LH, HH, and LL subbands and can use the same context model.)

Dead zone quantizer. The length of the dead zone is 2δ. Values inside the dead zone are quantized to 0.

The algorithm uses a dead zone quantizer, shown in the above figure, with a double-length zero bin straddling 0. Let $\chi_i[\mathbf{k}] \in \{-1, 1\}$ be the sign of $s_i[\mathbf{k}]$ and let $v_i[\mathbf{k}]$ be the quantized magnitude. Explicitly, we have

$$v_i[\mathbf{k}] = \left\lfloor \frac{|s_i[\mathbf{k}]|}{\delta_{\beta_i}} \right\rfloor$$

where $\delta_{\beta_i}$ is the step size for subband $\beta_i$, which contains code block $B_i$. Let $v_i^p[\mathbf{k}]$ be the $p$th bit in the binary representation of $v_i[\mathbf{k}]$, where $p = 0$ corresponds to the least significant bit, and let $p_i^{\max}$ be the maximum value of $p$ such that $v_i^p[\mathbf{k}] \neq 0$ for at least one sample in the code block.
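A minimal sketch of the dead zone quantizer and of computing $p_i^{\max}$ might look as follows (our own helper names, not part of the standard):

```python
import numpy as np

def dead_zone_quantize(s, delta):
    """Dead zone quantization sketch: chi is the sign, v = floor(|s| / delta).

    All samples with |s| < delta fall into the zero bin, which is 2*delta wide.
    """
    chi = np.where(s < 0, -1, 1)
    v = np.floor(np.abs(s) / delta).astype(np.int64)
    return chi, v

def p_max(v):
    """Index of the most significant nonzero bitplane in a code block (-1 if all magnitudes are zero)."""
    return int(v.max()).bit_length() - 1
```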

The encoding process is similar to that of a bitplane coder: the most significant bit $v_i^{p_i^{\max}}[\mathbf{k}]$ is coded first for all samples in the code block, followed by the next most significant bit $v_i^{p_i^{\max}-1}[\mathbf{k}]$, and so on, until all bitplanes have been coded. In this way, if the bitstream is truncated, some samples in the code block may be missing one or more least significant bits. This is equivalent to having used a coarser dead zone quantizer for these samples.

In addition, it is important to exploit the previously encoded information about a particular sample and its neighboring samples. This is done in EBCOT by defining a binary-valued state variable $\sigma_i[\mathbf{k}]$, which is initially 0 but changes to 1 when the relevant sample's first nonzero bitplane bit $v_i^p[\mathbf{k}] = 1$ is encoded. This binary state variable is referred to as the significance of a sample.
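The following sketch (our own illustration, reusing the quantized magnitudes v from the snippet above) shows how the significance state evolves as the bitplanes are visited from most to least significant:

```python
import numpy as np

def scan_bitplanes(v):
    """Yield, for each bitplane p from most to least significant, the bit values
    and the updated significance state sigma of every sample in the code block."""
    sigma = np.zeros_like(v, dtype=np.uint8)
    for p in range(int(v.max()).bit_length() - 1, -1, -1):
        bits = (v >> p) & 1
        sigma[(sigma == 0) & (bits == 1)] = 1   # first 1 bit makes a sample significant
        yield p, bits, sigma.copy()
```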

The underlying observation behind the zerotree data structure is that significant samples tend to be clustered, so that it is often possible to dispose of a large number of samples by coding a single binary symbol.

EBCOT takes advantage of this observation; however, with efficiency in mind, it exploits the clustering assumption only down to relatively large sub-blocks of size 16 × 16. As a result, each code block is further partitioned into a two-dimensional sequence of sub-blocks $B_i[\mathbf{j}]$. For each bitplane, explicit information is first encoded that identifies sub-blocks containing one or more significant samples. The other sub-blocks are bypassed in the remaining coding phases for that bitplane.

Let $\sigma^p(B_i[\mathbf{j}])$ be the significance of sub-block $B_i[\mathbf{j}]$ in bitplane $p$. The significance map is coded using a quad tree. The tree is constructed by identifying the sub-blocks with the leaf nodes, that is, $B_i^0[\mathbf{j}] = B_i[\mathbf{j}]$. The higher levels are built by recursion: $B_i^t[\mathbf{j}] = \bigcup_{\mathbf{u} \in \{0,1\}^2} B_i^{t-1}[2\mathbf{j} + \mathbf{u}]$, for $0 < t \leq T$. The root of the tree represents the entire code block: $B_i^T[\mathbf{0}] = \bigcup_{\mathbf{j}} B_i[\mathbf{j}]$.

The significance of the code block is identified one quad level at a time, starting from the root at $t = T$ and working toward the leaves at $t = 0$. The significance values are then sent to an arithmetic coder for entropy coding. Redundant significance values are skipped; a value is taken as redundant if any of the following conditions is met (a small sketch of the sub-block significance map and its quad tree follows the list below):

  • The parent is insignificant.
  • The current quad was already significant in the previous bitplane.
  • This is the last quad visited among those that share the same significant parent, and the other siblings are insignificant.
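As a rough illustration (our own sketch, not the standard's bitstream syntax), the sub-block significance map for one bitplane and its quad tree could be built like this:

```python
import numpy as np

def subblock_significance(v, p, sub=16):
    """1 where a sub x sub sub-block contains at least one sample whose
    quantized magnitude has a nonzero bit at or above bitplane p."""
    sig = (v >> p) > 0
    h, w = sig.shape
    leaves = np.zeros((h // sub, w // sub), dtype=np.uint8)
    for r in range(leaves.shape[0]):
        for c in range(leaves.shape[1]):
            leaves[r, c] = sig[r * sub:(r + 1) * sub, c * sub:(c + 1) * sub].any()
    return leaves

def build_quad_tree(leaves):
    """Quad-tree levels from the leaves up to the root: a quad is significant
    if any of its (up to four) children is significant. Returns [root, ..., leaves]."""
    levels = [leaves]
    while levels[-1].size > 1:
        cur = levels[-1]
        h, w = cur.shape
        parent = np.zeros(((h + 1) // 2, (w + 1) // 2), dtype=np.uint8)
        for r in range(parent.shape[0]):
            for c in range(parent.shape[1]):
                parent[r, c] = cur[2 * r:2 * r + 2, 2 * c:2 * c + 2].any()
        levels.append(parent)
    return levels[::-1]
```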

EBCOT uses four different coding primitives to code new information for a single sample in a bitplane p, as follows:

Zero coding. This is used to code $v_i^p[\mathbf{k}]$, given that the quantized sample satisfies $v_i[\mathbf{k}] < 2^{p+1}$. Because the sample statistics are measured to be approximately Markovian, the significance of the current sample depends on the values of its eight immediate neighbors. The significance of these neighbors can be classified into three categories:

  • Horizontal: the two immediate horizontal neighbors
  • Vertical: the two immediate vertical neighbors
  • Diagonal: the four diagonal neighbors

The neighbors outside the code block are considered to be insignificant, but note that sub-blocks are not at all independent. The 256 possible neighborhood configurations are reduced to the nine distinct context assignments listed in the following table.

Run-length coding. The run-length coding primitive is aimed at producing runs of the 1-bit significance values, as a prelude for the arithmetic coding engine. When a horizontal run of insignificant samples having insignificant neighbors is found, the run-length primitive is invoked instead of the zero coding primitive. Each of the following four conditions must be met for the run-length coding primitive to be invoked:

  • Four consecutive samples must be insignificant.
  • The samples must have insignificant neighbors.
  • The samples must be within the same sub - block.
  • The horizontal index of the first sample must be even.

The last two conditions are simply for efficiency. When four samples satisfy these conditions, one special bit is encoded instead, to identify whether any sample in the group is significant in the current bitplane (using a separate context model). If any of the four samples becomes significant, the index of the first such sample is sent as a 2-bit quantity.
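A sketch of the eligibility test for this primitive (our own helper; sigma is the significance state array, and the neighbor test is left abstract as a callable) might be:

```python
def run_length_eligible(sigma, has_significant_neighbor, row, col, sub_block_size=16):
    """Check the four run-length conditions for the horizontal run of samples
    (row, col) .. (row, col + 3) within one code block (illustrative only)."""
    if col % 2 != 0:                                              # first horizontal index must be even
        return False
    if col // sub_block_size != (col + 3) // sub_block_size:      # all four in the same sub-block
        return False
    cols = range(col, col + 4)
    if any(sigma[row, c] for c in cols):                          # all four currently insignificant
        return False
    return not any(has_significant_neighbor(row, c) for c in cols)  # all neighbors insignificant
```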

Sign coding. The sign coding primitive is invoked at most once for each sample, immediately after the sample makes a transition from being insignificant to significant during a zero coding or run-length coding operation. Since each sample has four horizontal and vertical neighbors, each of which may be insignificant, positive, or negative, there are $3^4 = 81$ different context configurations. However, by exploiting both horizontal and vertical symmetry and assuming that the conditional distribution of $\chi_i[\mathbf{k}]$, given any neighborhood configuration, is the same as that of $-\chi_i[\mathbf{k}]$, the number of contexts is reduced to 5.

Table: Context assignment for the zero coding primitive

The coding of each bitplane proceeds in four passes, described below.

  • Forward-significance-propagation pass ($\mathcal{P}_p^1$). The sub-block samples are visited in scanline order. Samples that are already significant, as well as insignificant samples that do not satisfy the neighborhood requirement, are skipped. For the LH, HL, and LL subbands, the neighborhood requirement is that at least one of the horizontal neighbors must be significant. For the HH subband, the neighborhood requirement is that at least one of the four diagonal neighbors must be significant.

For insignificant samples that satisfy the neighborhood requirement, the zero coding and run-length coding primitives are invoked as appropriate, to determine whether the sample first becomes significant in bitplane $p$. If so, the sign coding primitive is invoked to encode the sign. This is called the forward-significance-propagation pass because a sample that has been found to be significant helps in the new significance determination steps that propagate in the direction of the scan.

Appearance of coding passes and quad-tree codes in each block's embedded bitstream

  • Reverse-significance-propagation pass ($\mathcal{P}_p^2$). This pass is identical to $\mathcal{P}_p^1$, except that it proceeds in the reverse order. The neighborhood requirement is relaxed to include samples that have at least one significant neighbor in any direction.

  • Magnitude refinement pass ($\mathcal{P}_p^3$). This pass encodes samples that are already significant but that have not been coded in the previous two passes. Such samples are processed with the magnitude refinement primitive.

  • Normalization pass ($\mathcal{P}_p^4$). The value $v_i^p[\mathbf{k}]$ of all samples not considered in the previous three coding passes is coded using the zero coding and run-length coding primitives, as appropriate. If a sample is found to be significant, its sign is immediately coded using the sign coding primitive.

The above figure shows the layout of coding passes and quad-tree codes in each block's embedded bitstream. $S^p$ denotes the quad-tree code identifying the significant sub-blocks in bitplane $p$. Notice that for any bitplane $p$, $S^p$ appears just before the final coding pass $\mathcal{P}_p^4$, not the initial coding pass $\mathcal{P}_p^1$. This implies that sub-blocks that become significant for the first time in bitplane $p$ are ignored until the final pass.
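As a small sketch of this ordering (inferred only from the description above; the handling of the most significant bitplane is simplified, and details may differ from the actual EBCOT bitstream), the per-bitplane elements could be laid out as:

```python
def block_bitstream_layout(p_max):
    """Per-bitplane layout of coding passes and quad-tree codes, with the quad-tree
    code S_p placed just before the final (normalization) pass P4_p of each bitplane."""
    layout = []
    for p in range(p_max, -1, -1):
        layout += [f"P1_{p}", f"P2_{p}", f"P3_{p}", f"S_{p}", f"P4_{p}"]
    return layout

# block_bitstream_layout(1) -> ['P1_1', 'P2_1', 'P3_1', 'S_1', 'P4_1', 'P1_0', ..., 'P4_0']
```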

Postcompression Rate-Distortion Optimization. After all the subband samples have been compressed, a postcompression rate-distortion (PCRD) step is performed. The goal of PCRD is to produce an optimal truncation of each code block's independent bitstream such that distortion is minimized, subject to the bit-rate constraint. For each truncated embedded bitstream of code block $B_i$ having rate $R_i^{n_i}$, the overall distortion of the reconstructed image is (assuming distortion is additive)

$$D = \sum_i D_i^{n_i}$$

where $D_i^{n_i}$ is the distortion from code block $B_i$ at truncation point $n_i$. For each code block $B_i$, the distortion is computed by

$$D_i^{n_i} = w_{b_i}^2 \sum_{\mathbf{k} \in B_i} \left( \hat{s}_i^{n_i}[\mathbf{k}] - s_i[\mathbf{k}] \right)^2$$

where $s_i[\mathbf{k}]$ is the two-dimensional sequence of subband samples in code block $B_i$ and $\hat{s}_i^{n_i}[\mathbf{k}]$ is the quantized (reconstructed) representation of these samples associated with truncation point $n_i$. The value $w_{b_i}$ is the $L_2$ norm of the wavelet basis functions for the subband $b_i$ that contains code block $B_i$.
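A direct transcription of this distortion computation (our own sketch; the variable names are illustrative):

```python
import numpy as np

def block_distortion(s, s_hat, w_b):
    """Weighted squared error of one code block at a given truncation point:
    D = w_b^2 * sum over the block of (s_hat[k] - s[k])^2."""
    diff = s_hat.astype(np.float64) - s.astype(np.float64)
    return float(w_b ** 2) * float(np.sum(diff ** 2))
```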


The optimal selection of truncation points $n_i$ can be formulated as a minimization problem subject to the following constraint:

$$R = \sum_i R_i^{n_i} \leq R^{\max}$$

where $R^{\max}$ is the available bit rate. For some $\lambda$, any set of truncation points $\{n_i^\lambda\}$ that minimizes

$$\sum_i \left( D_i^{n_i^\lambda} + \lambda R_i^{n_i^\lambda} \right)$$

is optimal in the rate-distortion sense. Thus, finding the set of truncation points that minimizes this expression with total rate $R(\lambda) = R^{\max}$ would yield the solution to the entire optimization problem.

Since the set of truncation points is discrete, it is generally not possible to find a value of $\lambda$ for which $R(\lambda)$ is exactly equal to $R^{\max}$. However, since the EBCOT algorithm uses relatively small code blocks, each of which has many truncation points, it is sufficient to find the smallest value of $\lambda$ such that $R(\lambda) \leq R^{\max}$.

It is easy to see that each code block $B_i$ can be minimized independently. Let $N_i$ be the set of feasible truncation points and let $j_1 < j_2 < \cdots$ be an enumeration of these feasible truncation points, with corresponding distortion-rate slopes given by the ratios

$$S_i^{j_k} = \frac{\Delta D_i^{j_k}}{\Delta R_i^{j_k}}$$

where $\Delta R_i^{j_k} = R_i^{j_k} - R_i^{j_{k-1}}$ and $\Delta D_i^{j_k} = D_i^{j_{k-1}} - D_i^{j_k}$. The slopes are strictly decreasing, since the operational distortion-rate curve is convex and strictly decreasing. The minimization problem for a fixed value of $\lambda$ is then simply the trivial selection

$$n_i(\lambda) = \max \{ j_k \in N_i \mid S_i^{j_k} > \lambda \}$$

The optimal value $\lambda^*$ can be found using a simple bisection method operating on the distortion-rate curve; a detailed description of this method can be found in the EBCOT literature.
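Below is a compact sketch of the PCRD idea (our own simplification: it selects each block's truncation point by minimizing D + lambda*R and bisects on lambda, without the convex-hull pruning an actual encoder would perform):

```python
def pcrd_truncate(blocks, r_max, iters=60):
    """blocks: list of (R, D) pairs, where R[n] and D[n] are the cumulative rate and
    distortion of a code block truncated at point n (n = 0 means nothing included,
    so R[0] is assumed to be 0). Returns (truncation points, total rate) for the
    smallest lambda whose total rate does not exceed r_max."""
    def choose(lam):
        ns = [min(range(len(R)), key=lambda n: D[n] + lam * R[n]) for R, D in blocks]
        rate = sum(R[n] for (R, _), n in zip(blocks, ns))
        return ns, rate

    lo, hi = 0.0, 1.0
    while choose(hi)[1] > r_max:    # grow hi until the rate constraint is satisfied
        hi *= 2.0
    for _ in range(iters):          # bisect toward the smallest feasible lambda
        mid = 0.5 * (lo + hi)
        if choose(mid)[1] > r_max:
            lo = mid
        else:
            hi = mid
    return choose(hi)
```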

Layer Formation and Representation. The EBCOT algorithm offers both resolution and quality scalability, as opposed to other well-known scalable image compression algorithms, such as EZW and SPIHT, which offer only quality scalability. This functionality is achieved using a layered bitstream organization and a two-tiered coding strategy.

Three quality layers with eight blocks each

Each quality layer $Q_q$ contains an incremental contribution from each code block $B_i$, extending its embedded bitstream from truncation point $n_i^{q-1}$ to $n_i^q$. Along with these incremental contributions, auxiliary information such as the length $L_i^q$, the number of new coding passes $N_i^q = n_i^q - n_i^{q-1}$, the value $p_i^{\max}$ when $B_i$ makes its first nonempty contribution to quality layer $Q_q$, and the index $q_i$ of the quality layer to which $B_i$ first makes a nonempty contribution must be explicitly stored. This auxiliary information is compressed in the second-tier coding engine. Hence, in this two-tiered architecture, the first tier produces the embedded block bitstreams, while the second encodes the block contributions to each quality layer.

The focus of this subsection is the second-tier processing of the auxiliary information accompanying each quality layer. The second-tier coding engine carefully handles the two quantities that exhibit substantial interblock redundancy: $p_i^{\max}$ and the index $q_i$ of the quality layer to which $B_i$ first makes a nonempty contribution.

The quantity $q_i$ is coded using a separate embedded quad-tree code within each subband. Let $B_i^0 = B_i$ be the leaves and let the root of the tree represent the entire subband. Let $q_i^t = \min\{ q_j \mid B_j \subset B_i^t \}$ be the index of the first layer in which any code block in quad $B_i^t$ makes a nonempty contribution. A single bit identifies whether $q_i^t > q$ for each quad at each level $t$, with redundant quads omitted. A quad is redundant if either $q_i^t \leq q - 1$ or $q_j^{t+1} > q$ for some parent quad $B_j^{t+1}$.

The other redundant quantity to consider is $p_i^{\max}$. It is clear that $p_i^{\max}$ is irrelevant until the coding of quality layer $Q_{q_i}$, so no information concerning $p_i^{\max}$ need be sent until we are ready to encode $Q_{q_i}$. EBCOT does this using a modified embedded quad tree driven from the leaves rather than from the root.

Let $B_i^t$, for $t = 1, \ldots, T$, be the ancestor quads from which code block $B_i$ descends in the quad-tree structure built on top of the code blocks of a subband, and let $P$ be a value guaranteed to be larger than $p_i^{\max}$ for any code block $B_i$. When code block $B_i$ first contributes to the bitstream in quality layer $Q_{q_i}$, the value of $p_i^{\max}$ is coded using the following algorithm:

Algorithm for coding $p_i^{\max}$ using the leaf-driven embedded quad tree

Adapting EBCOT to JPEG2000

JPEG2000 uses the EBCOT algorithm as its primary coding method. However, the algorithm is slightly modified to enhance compression efficiency and reduce computational complexity.

To further enhance compression efficiency, as opposed to initializing the entropy coder using equiprobable states for all contexts, the JPEG2000 standard makes an assumption of highly skewed distributions for some contexts, to reduce the model adaptation cost for typical images. Several small adjustments are made to the original algorithm to further reduce its execution time.

First, a low-complexity arithmetic coder that avoids multiplications and divisions, known as the MQ coder, replaces the usual arithmetic coder used in the original algorithm. Furthermore, JPEG2000 does not transpose the HL subband's code blocks. Instead, the corresponding entries in the zero coding context assignment map are transposed.

To ensure a consistent scan direction, JPEG2000 combines the forward- and reverse-significance-propagation passes into a single forward-significance-propagation pass with a neighborhood requirement equal to that of the original reverse pass. In addition, reducing the sub-block size from the original 16 × 16 to 4 × 4 eliminates the need to explicitly code sub-block significance. The resulting probability distribution for these small sub-blocks is highly skewed, so the coder behaves as if all sub-blocks are significant.

The cumulative effect of these modifications is an increase of about 40% in software execution speed, with an average loss of about 0.15 dB relative to the original algorithm.

Region-of-Interest Coding

A significant feature of the new JPEG2000 standard is the ability to perform region-of-interest (ROI) coding. Here, particular regions of the image may be coded with better quality than the rest of the image, or the background. The method is called MAXSHIFT, a scaling-based method that scales up the coefficients in the ROI so that they are placed into higher bitplanes. During the embedded coding process, the resulting bits are placed in front of the non-ROI part of the image. Therefore, given a reduced bitrate, the ROI will be decoded and refined before the rest of the image. As a result of these mechanisms, the ROI will have much better quality than the background.
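The scaling idea can be sketched as follows (our own illustration of the general scaling principle, not the exact MAXSHIFT procedure defined in the standard):

```python
import numpy as np

def roi_scale_up(v, roi_mask, shift):
    """Shift the quantized magnitudes inside the ROI up by `shift` bitplanes, so that
    every ROI coefficient is coded before all background bitplanes.
    `shift` should make 2**shift larger than the largest background magnitude."""
    scaled = v.astype(np.int64).copy()
    scaled[roi_mask] = scaled[roi_mask] << shift
    return scaled

# Example with hypothetical values: choose shift from the background maximum.
v = np.array([[3, 12], [7, 1]])
roi = np.array([[True, False], [False, False]])
shift = int(v[~roi].max()).bit_length()   # 2**shift > max background magnitude
v_scaled = roi_scale_up(v, roi, shift)
```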

Region-of-interest (ROI) coding of an image with increasing bitrate, using a circularly shaped ROI: (a) 0.4 bpp; (b) 0.5 bpp; (c) 0.6 bpp; (d) 0.7 bpp

One thing to note is that, regardless of scaling, full decoding of the bitstream will result in reconstruction of the entire image at the highest fidelity available. The above figure demonstrates the effect of region-of-interest coding as the target bitrate of the sample image is increased.

Comparison of JPEG and JPEG2000 Performance

After studying the internals of the JPEG2000 compression algorithm, a natural question that comes to mind is, how well does JPEG2000 perform compared to other well-known standards, in particular JPEG? Many comparisons have been made between JPEG and other well-known standards, so here we compare JPEG2000 only to the popular JPEG.

Various criteria, such as computational complexity, error resilience, and compression efficiency, have been used to evaluate the performance of systems. Since our main focus is on the compression aspect of the JPEG2000 standard, here we simply compare compression efficiency. (Interested readers can refer to the literature for comparisons using other criteria.)

Given a fixed bitrate, let's compare the quality of compressed images quantitatively by PSNR: for color images, the PSNR is calculated based on the average of the mean squared errors of all the RGB components. We also show results visually for both JPEG2000- and JPEG-compressed images, so that you can make your own qualitative assessment. We perform a comparison for three categories of images: natural, computer-generated, and medical, using three images from each category. The test images used are shown on the textbook web site in the Further Exploration section for this chapter.
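For reference, a minimal sketch of this PSNR computation (our own helper, assuming 8-bit RGB images):

```python
import numpy as np

def psnr_rgb(original, reconstructed, peak=255.0):
    """PSNR of a color image: mean squared error averaged over all R, G, B samples."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```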

For each image, we compress using JPEG and JPEG2000 at four bitrates: 0.25 bpp, 0.5 bpp, 0.75 bpp, and 1.0 bpp. The following figure shows plots of the average PSNR of the images in each category against bitrate. We see that JPEG2000 substantially outperforms JPEG in all categories.

For a qualitative comparison of the compression results, let's choose a single image and show decompressed output for the two algorithms using a low bitrate (0.75 bpp) and the lowest bitrate (0.25 bpp).
