Wavelet-Based Coding - MULTIMEDIA

Decomposing the input signal into its constituents allows us to apply coding techniques suitable for each constituent, to improve compression performance. Consider again a time - dependent signal f(t) (it is best to base discussion on continuous functions to start with). The traditional method of signal decomposition is the Fourier transform. Above, in our discussion of the DCT, we considered a special cosine - based transform. If we carry out analysis based on both sine and cosine, then a concise notation assembles the results into a function f(ω), a complex - valued function of real - valued frequency.

Such decomposition results in very fine resolution in the frequency domain. However, since a sinusoid is theoretically infinite in extent in time, such a decomposition gives no temporal resolution.

Another method of decomposition that has gained a great deal of popularity in recent years is the wavelet transform. It seeks to represents a signal with good resolution in both time and frequency, by using a set of basis functions called wavelets.

There are two types of wavelet transforms: the Continuous Wavelet Transform (CWTJ and the Discrete Wavelet Transform (DWT). We assume that the CWT is applied to the large class of functions fix) that are square integrable on the real line — that is, f [f(x)] 2 dx < ∞. In mathematics, this is written as f(x) ε L2(S).

The other kind of wavelet transform, the DWT, operates on discrete samples of the input signal. The DWT resembles other discrete linear transforms, such as the DFT or the DCT, and is very useful for image processing and compression.

Before we begin a discussion of the theory of wavelets, let's develop an intuition about this approach by going through an example using the simplest wavelet transform, the so - called Haar Wavelet Transform, to form averages and differences of a sequence of float values.

If we repeatedly take averages and differences and keep results for every step, we effectively create a multiresolution analysis of the sequence. For images, this would be equivalent to creating smaller and smaller summary images, one - quarter the size for each step, and keeping track of differences from the average as well. Mentally stacking the full - size image, the quarter - size image, the sixteenth size image, and so on, creates a pyramid. The full set, along with difference images, is the multiresolution decomposition.

EXAMPLE A Simple Wavelet Transform

The objective of the wavelet transform is to decompose the input signal, for compression purposes, into components that are easier to deal with, have special interpretations, or have some components that can be thresholded away. Furthermore, we want to be able to at least approximately reconstruct the original signal, given these components. Suppose we are given the following input sequence:

{xnJ} = {10, 13, 25, 26, 29, 21, 7, 15}

Here I € [0.. 7] indexes "pixels", and n stands for the level of & pyramid we are on. At the top, n = 3 for this sequence, and we shall form three more sequences, for n = 2, 1, and 0.

At each level, less information will be retained in the beginning elements of the transformed signal sequence. When we reach pyramid level n = 0, we end up with the sequence average stored in the first element. The remaining elements store detail information.

Consider the transform that replaces the original sequence with its pairwise average xn-I, i and difference dn-1, i, defined as follows:

Notice that the averages and differences are applied only on consecutive pairs of input sequences whose first element has an even index. Therefore, the number of elements

in each set {x n-1,i} and {d n-1,i} is exactly half the number of elements in the original

sequence. "We can form a new sequence having length equal to that of the original sequence by concatenating the two sequences {x n-1,i} and {d n-1,i}. The resulting sequence is thus

{x n-1,i, {d n-1,i} = {11.5,25.5,25,11, -1.5, -0.5,4, -4}

where we are now at level n — 1 = 2. This sequence has exactly the same number of elements as the input sequence — the transform did not increase the amount of data. Since the first half of the above sequence contains averages from the original sequence, we can view it as a coarser approximation to the original signal.

The second half of this sequence can be viewed as the details or approximation errors of the first half. Most of the values in the detail sequence are much smaller than those of the original sequence. Thus, most of the energy is effectively concentrated in the first half. Therefore, we can potentially store {d n-1,i} using fewer bits.

It is easily verified that the original sequence can be reconstructed from the transformed sequence, using the relations.

This transform is the discrete Haar wavelet transform. Averaging and differencing can be carried out by applying a so - called scaling function and wavelet function along the signal. The following figure shows the Haar version of these functions.

Haar Wavelet Transform: (a) scaling function; (b) wavelet function

We can further apply the same transform to {x n-1,i}, to obtain another level of approxi­mation x n-2,i and detail d n-2,i:

{x n-2,i , d n-2,i, d n-1,i } = {18.5, 18, -7, 7, -1.5, -0.5, 4, -4}

This is the essential idea of multiresolution analysis. We can now study the input signal in three different scales, along with the details needed to go from one scale to another. This process can continue n times, until only one element is left in the approximation sequence. In this case, n = 3, and the final sequence is given below:

{x n-3,i, d n-3,i, d n-2,i, d n-1,i} = (18.25, 0.25, -7,7, -1.5, -0.5,4, -4}

Now we realize that n was 3 because only three resolution changes were available until we reached the final form. The value 18.25, corresponding to the coarsest approximation to the original signal, is the average of all the elements in the original sequence. From this example, it is easy to see that the cost of computing this transform is proportional to the number of elements N in the input sequence — that is, O (N).

Extending the one - dimensional Haar wavelet transform into two dimensions is relatively easy: we simply apply the one - dimensional transform to the rows and columns of the two - dimensional input separately. We will demonstrate the two - dimensional Haar transform applied to the 8 x 8 input image shown in the following figure.

EXAMPLE 2D Haar Transform

This example of the 2D Haar transform not only serves to illustrate how the wavelet transform is applied to two - dimensional inputs but also points out useful interpretations of the

Input image for the 2D Haar Wavelet Transform: (a) pixel values; (b)an 8 x 8 image

Intermediate output of the 2D Haar Wavelet Transform

transformed coefficients. However, it is intended only to provide the reader with an intuitive feeling of the kinds of operations involved in performing a general 2D wavelet transform. Subsequent sections provide more detailed description of the forward and inverse 2D wavelet transform algorithms, as well as a more elaborate example using a more complex wavelet.

2D Haar Wavelet Transform. We begin by applying a one - dimensional Haar wavelet transform to each row of the input.

The first and last two rows of the input are trivial. After performing the averaging and differencing operations on the remaining rows, we obtain the intermediate output shown in the above figure. We continue by applying the same ID Haar transform to each column of the intermediate output. This step completes one level of the 2D Haar transform. The following figure gives the resulting coefficients. We can naturally divide the result into four quadrants. The upper left quadrant contains the averaged coefficients from both the horizontal and vertical passes.

Therefore, it can be viewed as a low - pass - filtered version of the original image, in the sense that higher - frequency edge information is lost, while low - spatial - frequency smooth information is retained.

The upper right quadrant contains the vertical averages of the horizontal differences and can be interpreted as information about the vertical edges within the original image.

Similarly, the lower left quadrant contains the vertical differences of the horizontal averages and represents the horizontal edges in the original image. The lower right quadrant contains the differences from both the horizontal and vertical passes. The coefficients in this quadrant represent diagonal edges.

The inverse of the 2D Haar transform can be calculated by first inverting the columns and then inverting the resulting rows.

Output of the first level of the 2D Haar Wavelet Transform

A simple graphical illustration of the Wavelet Transform

Continuous Wavelet Transform

We noted that the motivation for the use of wavelets is to provide a set of basis functions that decompose a signal in time over parameters in the frequency domain and the time domain simultaneously. A Fourier transform aims to pin down only the frequency content of a signal, in terms of spatially varying rather than time varying signals.

What wavelets aim to do is pin down the frequency content at different parts of the image.

For example, one part of the image may be "busy" with texture and thus high - frequency content, while another part may be smooth, with little high - frequency content. Naturally, one can think of obvious ways to consider frequencies for localized areas of an image: divide an image into parts and fire away - with Fourier analysis. The time - sequence version of that idea is called the Short - Term (or Windowed) Fourier Transform. And other ideas have also arisen.

However, it turns out that wavelets, a much newer development, have neater characteristics.

To further motivate the subject, we should consider the Heisenberg uncertainty principle, from physics. In the context of signal processing, this says that there is a tradeoff between accuracy in pinning down a function's frequency, and its extent in time. We cannot do both

A Mexican - hat Wavelet: (a) a = 0.5; (b) its Fourier transform

accurately, in general, and still have a useful basis function. For example, a sine wave is exact in terms of its frequency but infinite in extent.

As an example of a function that dies away quickly and also has limited frequency content, suppose we start with a Gaussian function,

The parameter σ expresses the scale of the Gaussian (bell - shaped) function.

The second derivative of this function, called Ψ(t) looks like a Mexican hat, as in the above figure. Clearly, the function Ψ(t) is limited in time. Its equation is as follows:

We can explore the frequency content of function Ψ(t) by taking its Fourier transform. This turns out to be given by

The above figure displays this function: the candidate wavelet is indeed limited in frequency as well.

In general, a wavelet is a function ΨεL2 (R) with a zero average,

that satisfies some conditions that ensure it can be utilized in a multiresolution decompo­sition. The conditions ensure that we can use the decomposition for zooming in locally in some part of an image, much as we might he interested in closer or farther views of some neighborhood in a map.

The constraint (8.49) is called the admissibility condition for wavelets. A function that sums to zero must oscillate around zero. Also, from (8.26), we see that the DC value, the Fourier transform of Ψ{t) for ω = 0, is zero.

Another way to state this is that the 0th moment M0 of Ψ{t) it is zero. The pth moment is defined as

The function Ψ is normalized with Ψ = 1 and centered in the neighborhood of t = 0. We can obtain a family of wavelet functions by scaling and translating the mother wavelet Ψ as follows:

If Ψ{t) is normalized, so is Ψs,u(t).

The Continuous Wavelet Transform (CWT) of f ε L2(R) at time it and scale s is defined as

The CWT of a ID signal is a 2D function — a function of both scale s and shift u.

A very important issue is that, in contradistinction to (8.26), where the Fourier analysis function is stipulated to be the sinusoid, here (8.52) does not state what Ψ(t)actually is! Instead, we create a set of rules such functions must obey and then invent useful functions that obey these rules — different functions for different uses.

Just as we defined the DCT in terms of products of a function with a set of basis functions, here the transform W is written in terms of inner products with basis functions that are a scaled and shifted version of the mother wavelet Ψ(t).

The mother wavelet Ψ{t) is a wave, since it must be an oscillatory function. Why is it wavelet? The spatial - frequency analyzer parameter in (8.52) is s, the scale. We choose some scale s and see how much content the signal has around that scale. To make the function decay rapidly, away from the chosen s, we have to choose a mother wavelet Ψ{t) that decays as fast as some power of s.

It is actually easy to show, from (8.52), that if all moments Ψ{t) up to the nth are zero (or quite small, practically speaking), then the CWT coefficient VV(f, s, u) has a Taylor expansion around u=0 that is of order sn+2. This is the localization in frequency we desire in a good mother wavelet.

We derive wavelet coefficients by applying wavelets at different scales over many lo­cations of the signal. Excitingly, if we shrink the wavelets down small enough that they cover apart of the function f(t) that is a polynomial of degree n or less, the coefficient for that wavelet and all smaller ones will be zero. The condition that the wavelet should have vanishing moments up to some order is one way of characterizing mathematical regularity conditions on the mother wavelet.

The inverse of the continuous wavelet transform is:

where

and Ψ (ω) is the Fourier transform of Ψ(t).

The resulting infinite set of scaled and shifted functions is not necessary for the analysis of sampled functions, such as the ones arise in image processing. For this reason, we apply the ideas that pertain to the CWT to the discrete domain.

Discrete Wavelet Transform*

Discrete wavelets are again formed from a mother wavelet, but with scale and shift in discrete steps.

Multiresolution Analysis and the Discrete Wavelet Transform

The connection be­tween wavelets in the continuous time domain and filter banks in the discrete time domain is multiresolution analysis; we discuss the DWT within this framework. Mallat showed that it is possible to construct wavelets \$ such that the dilated and translated family

is an orthonormal basis of L2(S), where Z represents the set of integers. This is known as "dyadic" scaling and translation and corresponds to the notion of zooming out in a map by factors of 2. (If we draw a cosine function cos(t) from time 0 to 2tt and then draw cos(t/2), we see that while cos(t) goes over a whole cycle, cos(t/2) has only a half cycle: the function cos(2_1t) is a wider function and thus is at a broader scale.

Note that we change the scale of translations along with the overall scale 2J, so as to keep movement in the lower - resolution image in proportion. Notice also that the notation used says that a larger index j corresponds to a coarser version of the image.

Multiresolution analysis provides the tool to adapt signal resolution to only relevant details for a particular task. The octave decomposition initially decomposes a signal into an approximation component and a detail component. The ap­proximation component is then recursively decomposed into approximation and detail at successively coarser scales. Wavelets are set up such that the approximation at resolu­tion 2 - J contains all the necessary information to compute an approximation at coarser resolution 2 - (J+l).

Wavelets are used to characterize detail information. The averaging information is for­mally determined by a kind of dual to the mother wavelet, called the scaling function Φ(t).

The main idea in the theory of wavelets is that at a particular level of resolution 7, the set of translates indexed by n form a basis at that level. Interestingly, the set of translates forming the basis at the j + 1 next level, a coarser level, can all be written as a sum of weights times the level -j basis.

The scaling function is chosen such that the coefficients of its translates are all necessarily bounded (less than infinite).

The scaling function, along with its translates, forms a basis at the coarser level j + 1 (say 3, or the 1/8 level) but not at level j (say 2, or the 1/4 level). Instead, at level j the set of translates of the scaling function Φ along with the set of translates of the mother wavelet Φ do form a basis.

We are left with the situation that the scaling function describes smooth, or approximation, information and the wavelet describes what is left over — detail information.

Since the set of translates of the scaling function 0 at a coarser level can be written exactly as a weighted sum of the translates at a finer level, the scaling function must satisfy the so - called dilation equation:

The square brackets come from the theory of filters, and their use is carried over here. The dilation equation is a recipe for finding a function that can be built from a sum of copies of itself that are first scaled, translated, and dilated.

Not only is the scaling function expressible as a sum of translates, but as well the wavelet at the coarser level is also expressible as such:

Below, we'll show that the set of coefficients h1 for the wavelet can in fact be derived from the scaling function ones ho, so we also have that the wavelet can be derived from the scaling function, once we have one. The equation reads

So the condition on a wavelet is similar to that on the scaling function and in fact uses the same coefficients, only in the opposite order and with alternating signs.

Clearly, for efficiency, we would like the sums in both the equations to be as few as possible, so we choose wavelets that have as few vector entries ho and hi as possible. The effect of the scaling function is a kind of smoothing, or filtering, operation on a signal. Therefore it acts as a low - pass filter, screening out high - frequency content.

The vector values ho[n] are called the low-pass filter impulse response coefficients, since they describe the effect of the filtering operation on a signal consisting of a single spike with magnitude unity (an impulse) at time t = 0. A complete discrete signal is made of a set of such spikes, shifted in time from 0 and weighted by the magnitudes of the discrete samples.

Hence, to specify a DWT, only the discrete low-pass filter impulse response ho[n] is needed. These specify the approximation filtering, given by the scaling function. The

Table Orthogonal wavelet filters

discrete high - pass impulse response h i [fi] , describing the details using the wavelet function, can be derived from /iq [«] using the following equation:

The number of coefficients in the impulse response is called the number of taps in the filter. If ho[n] has only a finite number of nonzero entries, the resulting wavelet is said to have compact support. Additional constraints, such as orthonormality and regularity, can be imposed on the coefficients ho[n]. The vectors ho[n] and h i [n] are called the low - pass and high - pass analysis filters.

To reconstruct the original input, an inverse operation is needed. The inverse filters are called synthesis filters. For orthonormal wavelets, the forward transform and its inverse are transposes of each other, and the analysis filters are identical to the synthesis filters.

Without orthogonality, the wavelets for analysis and synthesis are called biorthogonal, a weaker condition. In this case, the synthesis filters are not identical to the analysis filters. We denote them as h0[n] and h1[n]. To specify abiorthogonal wavelet transform, we require both h0[n] and h0[n]. As before, we can compute the discrete high - pass filters in terms of sums of the low-pass ones:

Table Biorthogonal wavelet filters

For analysis, at each level we transform the series x[n] into another series of the same length, in which the first half of the elements is approximation information and the second half consists of detail information. For an TV - tap filter, this is simply the series

where for each half, the odd - numbered results are discarded. The summation over shifted coefficients in (8.62) is referred to as a convolution.

2D Discrete Wavelet Transform. The extension of the wavelet transform to two - dimensions is quite straightforward. A two - dimensional scaling function is said to be sepa­rable if it can be factored into a product of two one - dimensional scaling functions. That is,

For simplicity, only separable wavelets are considered in this section. Furthermore, let's assume that the width and height of the input image are powers of 2

Block diagram of the ID dyadic wavelet transform

For an N by N input image, the two - dimensional DWT proceeds as follows:

Convolve each row of the image with h0[n] and h1[n],discard the odd - numbered columns of the resulting arrays, and concatenate them to form a transformed row.

After all rows have been transformed, convolve each column of the result with h0[n] an h1[n]]. Again discard the odd - numbered rows and concatenate the result.

After the above two steps, one stage of the DWT is complete. The transformed image now contains four subbands LL, HL, LH, and HH, standing for low - low, high - low, and so on, as in the below figure (a) shows. As in the one - dimensional transform, the LL subband can be further decomposed to yield yet another level of decomposition. This process can be continued until the desired number of decomposition levels is reached or the LL component only has a single element left. A two level decomposition is shown in fig.

The inverse transform simply reverses the steps of the forward transform.

The two - dimensional discrete wavelet transform: (a) one - level transform; (b) two - level transform

For each stage of the transformed image, starting with the last, separate each column into low - pass and high - pass coefficients. Upsample each of the low - pass and high - pass arrays by inserting a zero after each coefficient.

Convolve the low - pass coefficients with ho[n] and high - pass coefficients with h1[n] and add the two resulting arrays. After all columns have been processed, separate each row into low - pass and high - pass coefficients and upsample each of the two arrays by inserting a zero after each coefficient.

Convolve the low - pass coefficients with ho[n] and high - pass coefficients with h1[n] and add the two resulting arrays.

If biorthogonal filters are used for the forward transform, we must replace the ho[n] and h1[n] above with in the inverse transform.

Lena: (a) original 128 x 128 image; (b) 16 x 16 subsampled image

The input image in numerical form is

I represents the pixel values. The first subscript of I indicates the current stage of the transform, while the second subscript indicates the current step within a stage. We start by convolving the first row with both ho[n] and h1 [n] and discarding the values with odd - numbered index. The results of these two operations are

where the colon in the first index position indicates that we are showing a whole row. If you like, you can verify these operations using MATLAB's conv function.

Next, we form the transformed output row by concatenating the resulting coefficients. The first row of the transformed image is then [245, 156, 171, 183, 184,173, 228, 160; -30, 3,0, 7, -5, -16, -3, 16]

Similar to the simple one - dimensional Haar transform examples, most of the energy is now concentrated on the first half of the transformed image. We continue the same process for the remaining rows and obtain the following result:

We now go on and apply the filters to the columns of the above resulting image. As before, we apply both ho[n] and h1 [n] to each column and discard the odd indexed results:

Concatenating the above results into a single column and applying the same procedure to each of the remaining columns, we arrive at the final transformed image:

This completes one stage of the Discrete Wavelet Transform. We can perform another stage by applying the same transform procedure to the upper left 8 x 8 DC image of I12 (x,y) The resulting two-stage transformed image is

Notice that I12 corresponds to the subband diagram shown in the above figure (a), and In corresponds to the above figure (b). At this point, we may apply different levels of quantization to each subband according to some preferred bit allocation algorithm, given a desired bitrate. This is the basis for a simple wavelet - based compression algorithm. However, since in this example we are illustrating the mechanics of the DWT, here we will simply bypass the quantization step and perform an inverse transform to reconstruct the input image.

We refer to the top left 8x8 block of values as the innermost stage in correspondence with the above figure. Starting with the innermost stage, we extract the first column and separate the low - pass and high - pass coefficients. The low - pass coefficient is simply the first half of the column, and the high - pass coefficients are the second half. Then we sample them by appending a zero after each coefficient. The two resulting arrays are

All columns in the innermost stage are processed in this manner. The resulting image is

We are now ready to process the rows. For each row of the upper left 8 x 8 sub - image, we again separate them into low - pass and high - pass coefficients. Then we up sample both by adding a zero after each coefficient. The results are convolved with the appropriate ho[n] and hi[n] filters. After these steps are completed for all rows, we have

We then repeat the same inverse transform procedure on I12(x,y), to obtain I00(x,y). Notice that I`00(x, y) is not exactly the same a I00(x, y), but the difference is small. These small differences are caused by round - off errors during the forward and inverse transform, and truncation errors when converting from floating point numbers to integer grayscale values

Wavelet - Based Reduction Program. Keeping only the lowest - frequency content amounts to an even simpler wavelet - based image zooming - out reduction algorithm. Pro­gram wavelet reduction. c on the book's web site gives a simple illustration of this principle, limited to just the scaling function and analysis filter to scale down an image some number of times (three, say) using wavelet - based analysis.