Bilevel Image Compression Standards

As more and more documents are handled in electronic form, efficient methods for compressing bilevel images (those with only 1-bit, black-and-white pixels) are much in demand. A familiar example is fax images. Algorithms that take advantage of the binary nature of the image data often perform better than generic image-compression algorithms. Earlier facsimile standards, such as G3 and G4, use simple models of the structure of bilevel images.

Each scanline in the image is treated as a sequence of alternating runs of black and white pixels. However, considering the neighboring pixels and the nature of the data to be coded allows much more efficient algorithms to be constructed. This section examines the JBIG standard and its successor, JBIG2, as well as the underlying motivations and principles for these two standards.
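The run-based view of a scanline used by the earlier facsimile standards can be illustrated with a minimal sketch. The function name `run_lengths` is hypothetical, and the sketch only extracts the runs; how a fax standard then maps run lengths to codewords is not shown here.

```python
from itertools import groupby

def run_lengths(scanline):
    """Return (pixel_value, run_length) pairs for one scanline of 0/1 pixels."""
    return [(value, len(list(group))) for value, group in groupby(scanline)]

# One scanline: three white pixels, two black, four white, one black.
line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
print(run_lengths(line))  # [(0, 3), (1, 2), (0, 4), (1, 1)]
```

Note that each scanline is handled in isolation here, which is exactly the limitation that context-based methods address by also looking at neighboring lines.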

The JBIG Standard

JBIG is the coding standard recommended by the Joint Bi-level Image Experts Group for binary images. This lossless compression standard is used primarily to code scanned images of printed or handwritten text, computer-generated text, and facsimile transmissions. It offers progressive encoding and decoding capability, in the sense that the resulting bitstream contains a set of progressively higher-resolution images. This standard can also be used to code grayscale and color images by coding each bitplane independently, but this is not the main objective.
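To make the bitplane idea concrete, here is a minimal sketch of splitting a grayscale image into bilevel planes and putting them back together. The helper names `bitplanes` and `merge_bitplanes` are illustrative assumptions, and images are plain nested lists of integers; each resulting plane is a bilevel image that could be handed to a bilevel coder on its own.

```python
def bitplanes(gray, bits=8):
    """Split a 2-D list of integers into `bits` bilevel images, most significant plane first."""
    return [[[(p >> b) & 1 for p in row] for row in gray]
            for b in range(bits - 1, -1, -1)]

def merge_bitplanes(planes):
    """Rebuild the grayscale image from planes ordered most significant first."""
    rows, cols = len(planes[0]), len(planes[0][0])
    out = [[0] * cols for _ in range(rows)]
    for plane in planes:
        for r in range(rows):
            for c in range(cols):
                # Shift in one more bit per plane.
                out[r][c] = (out[r][c] << 1) | plane[r][c]
    return out

gray = [[0, 5], [255, 128]]
planes = bitplanes(gray)
assert merge_bitplanes(planes) == gray  # the split is lossless
```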

The JBIG compression standard has three separate modes of operation: progressive, progressive-compatible sequential, and single-progression sequential. The progressive-compatible sequential mode uses a bitstream compatible with the progressive mode. The only difference is that the data is divided into strips in this mode.

The single-progression sequential mode has only a single lowest-resolution layer. Therefore, an entire image can be coded without any reference to other higher-resolution layers. Both these modes can be viewed as special cases of the progressive mode. Therefore, our discussion covers only the progressive mode.

The JBIG encoder can be decomposed into two components:

  • Resolution-reduction and differential-layer encoder
  • Lowest-resolution-layer encoder

The input image goes through a sequence of resolution-reduction and differential-layer encoders. Each is equivalent in functionality, except that their input images have different resolutions. Some implementations of the JBIG standard may choose to recursively use one such physical encoder. The lowest-resolution image is coded using the lowest-resolution-layer encoder. The design of this encoder is somewhat simpler than that of the resolution-reduction and differential-layer encoders, since the resolution-reduction and deterministic-prediction operations are not needed.
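The layered structure can be sketched as a resolution pyramid built by applying the same reduction step repeatedly. This is only an illustration under stated assumptions: the reduction rule below (a low-resolution pixel is black when at least half of the corresponding 2 x 2 block is black) is a simple stand-in, not the resolution-reduction rule defined by the standard, and the names `reduce_once` and `build_layers` are hypothetical. The differential-layer coding itself is not shown.

```python
def reduce_once(img):
    """Halve the resolution in each dimension using a simple majority rule (an assumption)."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block = [img[rr][cc]
                     for rr in range(r, min(r + 2, rows))
                     for cc in range(c, min(c + 2, cols))]
            out_row.append(1 if 2 * sum(block) >= len(block) else 0)
        out.append(out_row)
    return out

def build_layers(img, num_reductions):
    """Return the layers ordered from lowest to highest resolution."""
    layers = [img]
    for _ in range(num_reductions):
        layers.append(reduce_once(layers[-1]))
    return layers[::-1]

img = [[1, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
for layer in build_layers(img, 2):
    print(len(layer), "x", len(layer[0]))
```

In this sketch the lowest-resolution layer would be coded first, and each later layer would be coded relative to the one before it, which is why one physical encoder can be reused recursively.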

The JBIG2 Standard

While the JBIG standard offers both lossless and progressive (lossy to lossless) coding abilities, the lossy image produced by this standard has significantly lower quality than the original, because the lossy image contains at most only one-quarter of the number of pixels in the original image. By contrast, the JBIG2 standard is explicitly designed for lossy, lossless, and lossy to lossless image compression. The design goal for JBIG2 aims not only at providing superior lossless compression performance over existing standards but also at incorporating lossy compression at a much higher compression ratio, with as little visible degradation as possible.

A unique feature of JBIG2 is that it is both quality progressive and content progressive. By quality progressive, we mean that the bitstream behaves similarly to that of the JBIG standard, in which the image quality progresses from lower to higher (or possibly lossless) quality. On the other hand, content progressive allows different types of image data to be added progressively. The JBIG2 encoder decomposes the input bilevel image into regions of different attributes and codes each separately, using different coding methods.

As in other image compression standards, only the JBIG2 bitstream, and thus the decoder, is explicitly defined. As a result, any encoder that produces the correct bitstream is "compliant", regardless of the actions it actually takes. Another feature of JBIG2 that sets it apart from other image compression standards is that it is able to represent multiple pages of a document in a single file, enabling it to exploit interpage similarities.

For example, if a character appears on one page, it is likely to appear on other pages as well. Thus, using a dictionary-based technique, this character is coded only once instead of multiple times for every page on which it appears. This compression technique is somewhat analogous to video coding, which exploits interframe redundancy to increase compression efficiency.
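A small counting sketch shows why a dictionary shared across pages helps. Single letters stand in for character bitmaps, and the function name `stored_bitmaps` is hypothetical; the sketch only counts how many bitmaps would have to be coded, it does not code anything.

```python
def stored_bitmaps(pages, shared=True):
    """Count the bitmaps that must be coded with a shared or a per-page dictionary."""
    total = 0
    seen = set()
    for page in pages:
        if not shared:
            seen = set()  # a per-page dictionary forgets earlier pages
        for glyph in page:
            if glyph not in seen:
                seen.add(glyph)
                total += 1
    return total

pages = [["a", "b", "a"], ["a", "c"], ["b", "c", "c"]]
print(stored_bitmaps(pages, shared=True))   # 3
print(stored_bitmaps(pages, shared=False))  # 6
```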

JBIG2 offers content-progressive coding and superior compression performance through model-based coding, in which different models are constructed for different data types in an image, realizing additional coding gain.

Model-Based Coding. The idea behind model-based coding is essentially the same as that of context-based coding. From the study of the latter, we know we can realize better compression performance by carefully designing a context template and accurately estimating the probability distribution for each context. Similarly, if we can separate the image content into different categories and derive a model specifically for each, we are much more likely to accurately model the behavior of the data and thus achieve a higher compression ratio.
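The benefit of a context template can be illustrated by estimating the ideal code length of an image with and without context. This is a sketch under assumptions: the four-pixel `TEMPLATE`, the initial counts of one per symbol, and the function names are all illustrative choices, and the ideal code length (the sum of -log2 p over all pixels) stands in for the output of an actual arithmetic coder.

```python
import math
from collections import defaultdict

# Hypothetical 4-pixel template: (row, col) offsets of already-coded neighbors.
TEMPLATE = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]

def context_of(img, r, c, template):
    """Pack the template pixels into an integer; pixels outside the image count as 0."""
    ctx = 0
    for dr, dc in template:
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < len(img) and 0 <= cc < len(img[0])
        ctx = (ctx << 1) | (img[rr][cc] if inside else 0)
    return ctx

def ideal_code_length(img, template=TEMPLATE):
    """Sum of -log2(p) in bits, with p estimated adaptively per context."""
    counts = defaultdict(lambda: [1, 1])  # [zeros seen + 1, ones seen + 1]
    bits = 0.0
    for r in range(len(img)):
        for c in range(len(img[0])):
            pair = counts[context_of(img, r, c, template)]
            pixel = img[r][c]
            bits -= math.log2(pair[pixel] / (pair[0] + pair[1]))
            pair[pixel] += 1  # update the model after coding the pixel
    return bits

# Vertical stripes: highly structured, but half black and half white overall.
stripes = [[c % 2 for c in range(16)] for _ in range(16)]
print(ideal_code_length(stripes))               # with the 4-pixel context
print(ideal_code_length(stripes, template=[]))  # no context at all
```

With an empty template every pixel shares one context, so the estimate hovers near one bit per pixel; with the neighborhood template each context quickly becomes almost deterministic and the estimated length drops sharply.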

In the JBIG style of coding, adaptive and model templates capture the structure within the image. This model is general, in the sense that it applies to all kinds of data. However, being general implies that it does not explicitly deal with the structural differences between text and halftone data, which make up nearly all the contents of bilevel images. JBIG2 takes advantage of these differences by designing custom models for these data types.

The JBIG2 specification expects the encoder to first segment the input image into regions of different data types, in particular, text and halftone regions. Each region is then coded independently, according to its characteristics.

Text-Region Coding. Each text region is further segmented into pixel blocks containing connected black pixels. These blocks correspond to characters that make up the content of this region. Then, instead of coding all pixels of each character, the bitmap of one representative instance of this character is coded and placed into a dictionary. For any character to be coded, the algorithm first tries to find a match with the characters in the dictionary.

If one is found, then both a pointer to the corresponding entry in the dictionary and the position of the character on the page are coded. Otherwise, the pixel block is coded directly and added to the dictionary. This technique is referred to as pattern matching and substitution in the JBIG2 specification.
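Pattern matching and substitution can be sketched as follows, assuming exact bitmap equality as the matching criterion. The record layout (`"new"` and `"ref"` tuples) and the function names are hypothetical and only illustrate the flow of information; they are not the bitstream syntax of the standard.

```python
def encode_text_region(symbols):
    """symbols: list of (bitmap, x, y), with bitmap a tuple of row tuples.

    Returns records that are either ("new", bitmap, x, y) for a bitmap coded
    directly, or ("ref", index, x, y) for a pointer into the dictionary.
    """
    index_of = {}  # bitmap -> position in the dictionary
    records = []
    for bitmap, x, y in symbols:
        if bitmap in index_of:
            records.append(("ref", index_of[bitmap], x, y))
        else:
            index_of[bitmap] = len(index_of)
            records.append(("new", bitmap, x, y))
    return records

def decode_text_region(records):
    """Rebuild the placed bitmaps, growing the same dictionary as the encoder."""
    dictionary = []
    placed = []
    for kind, payload, x, y in records:
        if kind == "new":
            dictionary.append(payload)
            bitmap = payload
        else:
            bitmap = dictionary[payload]
        placed.append((bitmap, x, y))
    return placed

dot = ((1,),)
bar = ((1,), (1,))
symbols = [(dot, 0, 0), (bar, 4, 0), (dot, 8, 0)]
records = encode_text_region(symbols)
assert decode_text_region(records) == symbols
```

The third symbol is emitted as a dictionary index plus a position, so its pixels are never coded a second time.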

However, for scanned documents, it is unlikely that two instances of the same character will match pixel by pixel. In this case, JBIG2 allows the option of including refinement data to reproduce the original character on the page. The refinement data codes the current character using the pixels in the matching character in the dictionary. The encoder has the freedom to choose the refinement to be exact or lossy. This method is called soft pattern matching.
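Soft pattern matching can be sketched by accepting a dictionary entry that is close enough and then describing the remaining differences relative to it. Several things here are assumptions made for illustration: the Hamming-distance criterion, the `max_mismatch` threshold, and the representation of refinement data as an explicit list of pixel positions to flip. The standard instead codes the character with the reference bitmap's pixels available as context.

```python
def hamming(a, b):
    """Number of differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def best_match(bitmap, dictionary, max_mismatch):
    """Index of the closest same-size entry within the threshold, or None."""
    best = None
    for i, entry in enumerate(dictionary):
        if len(entry) != len(bitmap) or len(entry[0]) != len(bitmap[0]):
            continue
        d = hamming(bitmap, entry)
        if d <= max_mismatch and (best is None or d < best[1]):
            best = (i, d)
    return None if best is None else best[0]

def refinement(bitmap, reference):
    """Positions where the character differs from its dictionary reference."""
    return [(r, c) for r in range(len(bitmap)) for c in range(len(bitmap[0]))
            if bitmap[r][c] != reference[r][c]]

def apply_refinement(reference, flips):
    """Flip the listed pixels of the reference to recover the character."""
    out = [list(row) for row in reference]
    for r, c in flips:
        out[r][c] ^= 1
    return tuple(tuple(row) for row in out)

clean = ((1, 1, 1), (1, 0, 1), (1, 1, 1))
noisy = ((1, 1, 1), (1, 0, 1), (1, 1, 0))  # one pixel lost in scanning
i = best_match(noisy, [clean], max_mismatch=2)
flips = refinement(noisy, clean)
assert apply_refinement(clean, flips) == noisy  # exact refinement
```

Sending the flips gives an exact reconstruction; omitting them and reusing the dictionary bitmap as is corresponds to the lossy choice left to the encoder.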

The numeric data, such as the index of the matched character in the dictionary and the position of the characters on the page, are either bitwise or Huffman encoded. Each bitmap for the characters in the dictionary is coded using JBIG-based techniques.

Halftone-Region Coding. The JBIG2 standard suggests two methods for halftone image coding. The first is similar to the context-based arithmetic coding used in JBIG. The only difference is that the new standard allows the context template to include as many as 16 template pixels, four of which may be adaptive.

The second method is called descreening. This involves converting back to grayscale and coding the grayscale values. In this method, the bilevel region is divided into blocks of size $m_b \times n_b$. For an $m \times n$ bilevel region, the resulting grayscale image has dimension $m_g = \lfloor (m + (m_b - 1))/m_b \rfloor$ by $n_g = \lfloor (n + (n_b - 1))/n_b \rfloor$. The grayscale value is then computed to be the sum of the binary pixel values in the corresponding $m_b \times n_b$ block. The bitplanes of the grayscale image are coded using context-based arithmetic coding. The grayscale values are used as indices into a dictionary of halftone bitmap patterns. The decoder can use this value to index into this dictionary, to reconstruct the original halftone image.
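The block-sum step of descreening can be sketched directly from the dimension formulas above. The function name `descreen` is hypothetical, and the sketch assumes that m counts rows and n counts columns, with partial blocks at the right and bottom edges simply summing the pixels that exist.

```python
def descreen(region, mb, nb):
    """Sum the binary pixels of each mb x nb block into one grayscale value."""
    m, n = len(region), len(region[0])
    mg = (m + (mb - 1)) // mb  # number of block rows, rounding up
    ng = (n + (nb - 1)) // nb  # number of block columns, rounding up
    gray = [[0] * ng for _ in range(mg)]
    for r in range(m):
        for c in range(n):
            gray[r // mb][c // nb] += region[r][c]
    return gray

region = [[1, 1, 0, 0, 1],
          [1, 0, 0, 0, 1],
          [0, 0, 1, 1, 1]]
print(descreen(region, 2, 2))  # [[3, 0, 2], [0, 2, 1]]
```

Each grayscale value lies between 0 and the block area, so with 2 x 2 blocks there are at most five distinct values to use as indices into the pattern dictionary.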

Preprocessing and Postprocessing. JBIG2 allows the use of lossy compression but does not specify a method for doing so. From the decoder's point of view, the decoded bitstream is lossless with respect to the image encoded by the encoder, although not necessarily with respect to the original image. The encoder may modify the input image in a preprocessing step, to increase coding efficiency. The preprocessor usually tries to change the original image to lower the code length in a way that does not generally affect the image's appearance. Typically, it tries to remove noisy pixels and smooth out pixel blocks.
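Since the specification leaves preprocessing open, any concrete filter is the encoder designer's choice. The following is one possible sketch, not a method taken from the standard: the hypothetical `remove_isolated_pixels` flips any pixel whose neighbors (up to eight) all have the opposite value, which removes single-pixel specks and fills single-pixel holes.

```python
def remove_isolated_pixels(img):
    """Flip pixels that disagree with every one of their neighbors."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(rows):
        for c in range(cols):
            # Decisions use the original image so flips do not cascade.
            neighbors = [img[rr][cc]
                         for rr in range(max(0, r - 1), min(rows, r + 2))
                         for cc in range(max(0, c - 1), min(cols, c + 2))
                         if (rr, cc) != (r, c)]
            if neighbors and all(v != img[r][c] for v in neighbors):
                out[r][c] = 1 - img[r][c]
    return out

speck = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
print(remove_isolated_pixels(speck))  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The decoder never sees the speck, so from its point of view the cleaned image is reproduced losslessly even though the result differs from the original scan.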

Postprocessing, another issue not addressed by the specification, can be especially useful for halftones, potentially producing more visually pleasing images. It is also helpful to tune the decoded image to a particular output device, such as a laser printer.
