Project Final Report

A Study on Fractal Image Compression

By

Darshan S Alagud

ID: 100727906

Fall 2010

Instructor: Dr. K.R.Rao

List of Acronyms:

FIC: Fractal image compression.

IFS: Iterated function system.

JPEG: Joint Photographic Experts Group.

MSE: Mean square error.

PIFS: Partitioned iterated function systems.

PSNR: Peak signal to noise ratio.

QF: Quality factor.

SSIM: Structural similarity matric

Proposal:

In this project, a study on fractal-based image compression using PIFS algorithm and fixed-size partitioning will be made, analyzed for performance and compared with a standard frequency domain based image compression standard, JPEG.

Sample images will be used to perform compression and decompression. And performance metrics such as compression ratio, MSE, PSNR, SSIM, compression time and decompression time will be measured and compared for both PIFS and JPEG cases. Also the phenomenon of resolution/scale independence will be studied and described with examples.

Introduction:

Fractal-basedimage compression(FIC) techniquesexploit redundancy due to self-similarity properties in images to achieve compression.A fractal maybe defined as a geometrical shape that is self-similar i.e. it has parts that are similar to the whole. An example of a fractal is the below figure [8]:

Figure 1: The Sierpinski triangle [8]

In the above figure, the three triangles at each corner of a triangle are similar to the whole triangle. Self-similarity can be found in nature such as in clouds, trees, ferns, mountain ranges etc and in natural images. The amount of information therefore needed to describe such images is a lot lesser than it appears and hence image compression using fractal-based methods is possible. Fractal-based image compression such as PIFSexploits redundancy due to ordinary symmetry/similarity also in addition to self-similarity.

Mathematical Representation:

Mathematically, in FIC, an Iterated function system (IFS) describes an image.An IFS is a set of contractive affine transforms. An affine transform is a linear transform of the form [3]:

In the above equation, wi is the transform applied on a point (x, y, z) in the ith block of input image. Grayscale images can be represented by such points (x, y, z), in which case z would be the intensity or gray-level at pixel co-ordinates (x, y) in an image. The ai1, ai2, ai3, ai4, di1 and di2 are constants for the ith image block to which the transform is applied. An affine transform is said to be contractive if the distance between transformed points is less than that between the original points in the metric space. The Contractive-mapping theorem guarantees that for such a set of affine transforms or IFS, the repeated application of IFS to an initial random image, which converges to a finite fixed-point image. The goal of FIC is to find such an IFS that would describe the input image.

For example, consider the below IFS consisting of 2 affine transforms represented by their effect on a square (each function transforms the outlined square into the shaded square). Both functions are applied to the input image and a union (U) of the resulting images is formed in each iteration. Firstthree iterations are shown and then the final image (fixed point image) after several iterations is shown.

Figure 2: Illustration of construction of an image using an IFS of 2 affine functions. The above image is taken from Creative Commons, author: Scott Draves [11]

Encoding:

In this project the algorithm studied or used for Fractal image compression is known as Partitioned Iterative Function Systems or PIFS approach. In this approach rather than forming the image from copies of its whole self, the image is formed from copies of properly transformed parts of the image. In other words, the input image to the compressor is partitioned into a number of non-overlapping range blocks, Rj and non-overlapping domain blocks, Di. Then, a contractive mapping wi as described in Equation 1.0 and a Domain block Di is found for every Rj, such that the transformed copy of the Domain block Di is very close to Ri. (The distance metric used here is mean square error (MSE)).

The following image shows matching domain and range blocks in images that the PIFS algorithm looks for:

Figure 3: Matching Range and Domain blocks in an image [7]. (One pair with green borders matches due to ordinary symmetry and the other with red borders matches due to self-similarity)

In the above image the bigger red block matches the smaller one due to self-similarity. And the bigger green and smaller green blocks also match due to ordinary similarity between different parts of the same image.

This partitioning and comparison of Range and Domain block is achieved by first calculating a domain image from the range image as shown in the figure below:

Figure 4: Generation of domain image from range image and fixed-size partitioning.

In this project the input image partitioning is done with non-overlapping square range blocks of fixed size 16x16 pixels. The domain block size (before downsampling) is double the range block size, which is 32x32 pixels. In order to create the domain image every 4 neighboring pixels of the range image is averaged and mapped to one pixel of the domain image as shown above. (4 red pixels being mapped to one blue pixel in domain image). This is equivalent to down sampling the range image by half and then low-pass filtering. Now Rj blocks in range image and Di blocks in domain image areof same size, 16x16 pixels and are ready for comparison.

To find the best affine map, we need to determine the co-efficients in equation 1.0for every domain and range block combination and choose the affine map that generates the closest match between a (Rj, Di) pair.The ai1, ai2, ai3 and ai4 are constrained to be 0.5 here since the Domain size is always twice the range size. The translation constants di1 and di2 are determined by the position of the top left corner pixel of the domain.The constants ci and bi, which represent the contrast scaling and brightness offset are calculated for each (Rj, Di) pairby minimizing the sum of squared errors between a range block Rj and domain block Di which is given by [3]:

E(ci,bi) = Nk=1(ci*dk + bi – rk) 2 (2.0)

In the above equation, k is the index of the pixel in the domain or range block and i is the index of the domain block. k varies from 1 to N, the number of pixels in the domain/range block. The rk are the pixel values of each range block and dk are the pixel values of the domain block for which we wish to find an affine map that minimizes the above error. This can be done by taking partial derivatives of the above expression with respect to ci and biand equating it to zero. This gives rise to two linear equations in ci and bi, which when solved givesthe following expressions for ci and bi:

ci = (NNk=1(rk .dk) - (Nk=1 rk) (Nk=1 dk) ) / ( N(Nk=1 dk2) - (Nk=1 dk)2 ) (3.0)

bi = (Nk=1 rk - ciNk=1 dk) / N (4.0)

After finding the best affine map among all (Rj, Di) pairs, if the mean square error between Rj and Di meets the minimum mean square error tolerable as specified by a parameter called quality q, thenci and bivalues of the best affine map are quantized and stored as integers using 4 bits for ci and 6 bits for bi in the compressed file along with the domain number.

If the mean square error between the (Rj, Di) pair exceeds that specified by q, then the current range block is split into 4 more blocks and the whole process of finding the best affine map for each of the sub-blocks repeats recursively. A bit ‘1’ is written to the compressed file in this case (otherwise a ‘0’) to let the decoder know that it needs to split the range block similarly.

The above process of range-block splitting continues recursively until either the minimum quality q is met or if the range-block size reduces to 4X4 pixels. Note that quality q is different from quality factor Q used in JPEG, which determines the quantization matrix, and the number of bits used to encode a pixel. The quality, q parameter here represents the average mean square error per pixel that is tolerable and which the compressor must try to achieve.

Decoding:

At the decoder, an initial random image (in this project an image with maximum gray level for all pixels) is chosen and using the values of ci and bi in the compressed file and the values of other constants (as mentioned earlier), the affine maps are constructed and applied iteratively to this initial image. On doing so within 8-9 iterations the original image is reproduced with good quality. (Note: Compression based on PIFS is lossy since only approximate affine transforms are used).

The size of the initial image used is normally equal to that of the original image, unless the output image needs to be scaled, in which case a larger input image of size equal to that of the output image required is chosen. The algorithm is guaranteed to converge to an image that is similar to the original image by Collage theorem.

Project Execution:

Two sample images ‘balloons.pgm’ and ‘columns.pgm’ of size 640x480 pixels each were used in the project for performing compression and decompression on them. The program that was used for the study and for compressing and decompressing these files can be found in [3]. The source code is in C language and was compiled and linked using GCC (GNU C Compiler). All programs that were run and results generated were done so on a Mac with OSX 10.6.5 operating system and a 2.4 GHz Intel Core 2 Duo processor.

Figure 5: Sample input image ‘columns.pgm’[7] (50% of original size)

Figure 6: Sample input original image ‘balloons.pgm’[7] (50% of original size)

Two executables are created after compilation and linking: ‘frac’, which is used for compression and ‘defrac’ for decompression. The program ‘frac’ doesn’t accept input image in .pgm format but accepts in .GS/.DAT format, which is a binary file format. To do this file format conversion a Matlab program CONV_PGM2DAT.m was implemented. The compressed files were named with an extension of .ff for distinguishing them from other file formats. The sample image ‘balloons.pgm’, after compression to ‘balloons.ff’ was decoded with different number of iterations, n as input for the program ‘defrac’. The decoded images that were generated for n=1, 2, 4, 8 from ‘balloons.pgm’ are as shown below:

Figure 7: Decoded image after n=1 iterations

Figure 8: Decoded image after n=2 iterations

Figure 9: Decoded image after n=4 iterations

Figure 10: Decoded image after n=8 iterations

For n=1 to 10, parameters MSE (Mean Square Error), PSNR (Peak Signal to Noise Ratio) and SSIM (Structural similarity index)[12] were computed and plotted between the output decompressed image and original image using Matlab programs [12]. The results are as shown below:

Figure 11: MSE vs. n

Figure 12: PSNR vs. n

Figure 13: SSIM vs. n

We can observe from the plots above that within 7-8iterations the parameters MSE, PSNR and SSIM become steady and reach sufficiently good values. Beyond n=8 there is not much improvement in the output image quality. Thus n=8-12 is ideal for a tradeoff between decompression time and image quality.

Performance Comparison between PIFS and JPEG:

The two sample images ‘balloons.pgm’ and ‘columns.pgm’ were first converted to their corresponding binary files that were named as ‘balloons.dat’ and ‘columns.dat’. The .dat files were then used as inputs for the program ‘frac’ (mentioned above) and run with different values for quality, q. The q is varied from 1 to 50. After decoding the fractally compressed image (.ff file) with number of iterations, n=12, MSE, PSNR and SSIM were calculated and plotted. Also compression ratios were calculated in each case as:

Compression ratio = input file size (.dat file) / output file size (.ff file) (5.0)

Note that files with .pgm extension list the grey levels in ASCII format and hence these valuesare converted to their binary equivalents and written to a file (with .dat extension) whose file size is in turn used tocalculate compression ratio as mentioned above.

The results are shown below:

Figure 14: Compression ratio versus quality, q for PIFS.

Figure 15: MSE versus quality, q (n=12) for PIFS.

Figure 16: PSNR versus quality, q (n=12) for PIFS.

Figure 17: SSIM versus quality, q (n=12) for PIFS.

The two sample images mentioned above were also used as input for JPEG compression and decompression programs. The JPEG program that was used was taken from the Independent JPEG Group [18]. This uses the baseline profile of JPEG. After compression and decompression using JPEG programs cjpeg and djpeg, for different values of quality factor, QF, metrics MSE, PSNR, SSIM were computed and plotted. Also compression ratios were calculated in each case as:

Compression ratio = input file size (in .dat format) / output file size (in .jpg format) (6.0)

Note that quality factor, QF for JPEG is different from quality, q for PIFS. In JPEG, when QF increases, decoded image quality increases with 0 for worst quality and 100 for best quality. Whereas in PIFS, quality of decoded image decreases with increase in q, since q in PIFS is the average mean square error tolerable per pixel.

The results are as below:

Figure 18: Compression ratio versus QF for JPEG.

Figure 19: MSE versus QF for JPEG.

Figure 20: PSNR versus QF for JPEG.

Figure 21: SSIM versus QF for JPEG.

It is observed that the PIFS algorithm can reach very high compression ratios compared to JPEG without extreme degradation of the image (in terms of PSNR/MSE and SSIM relative to JPEG). However at low compression ratios, JPEG performs much better than PIFS in terms of image quality.

This can also be seen in the images generated below, for six different compression ratios using JPEG and PIFS schemes. In images in figures 22 to 29, it is observed that images compressed using JPEG are better than those obtained by using PIFS. However at high compression ratios such as in figures 30and 31, although degradation in image quality is inevitable in both PIFS and JPEG, it is observed that PIFS compressed images appear much better than JPEG. Tables showing quality metrics for all six compression ratios are also given below for comparison.

Figure 22: Compressed balloons.pgmimage using JPEG at compression ratio = 10:1

Figure 23: Compressed balloons.pgm image using PIFS at compression ratio = 10:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 0.6524 / 49.9856 / 0.9887 / 86 / -
PIFS / 5.7599 / 40.5266 / 0.9653 / - / 1.9

Table 1: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 10:1

Figure 24: Compressed balloons.pgm image using JPEG at compression ratio = 20:1

Figure 25: Compressed balloons.pgm image using PIFS at compression ratio = 20:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 1.9090 / 45.3228 / 0.9728 / 53 / -
PIFS / 6.8777 / 39.7564 / 0.9530 / - / 4.5

Table 2: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 20:1

Figure 26: Compressed balloons.pgm image using JPEG at compression ratio = 30:1

Figure 27: Compressed balloons.pgm image using PIFS at compression ratio = 30:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 3.9671 / 42.1461 / 0.9499 / 26 / -
PIFS / 9.1227 / 38.5296 / 0.9354 / - / 6.8

Table 3: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 30:1

Figure 28: Compressed balloons.pgm image using JPEG at compression ratio = 40:1

Figure 29: Compressed balloons.pgm image using PIFSat compression ratio = 40:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 7.4093 / 39.4330 / 0.9199 / 15 / -
PIFS / 10.8771 / 37.7657 / 0.9213 / - / 8.8

Table 4: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 40:1

Figure 30: Compressed balloons.pgm image using JPEG at compression ratio = 60:1

Figure 31: Compressed balloons.pgm image using PIFSat compression ratio = 60:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 27.4239 / 33.7495 / 0.8239 / 5 / -
PIFS / 14.4388 / 36.5355 / 0.8977 / - / 12.8

Table 5: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 60:1

Figure 32: Compressed balloons.pgm image using JPEG at compression ratio = 68:1

Figure 33: Compressed balloons.pgm image using PIFS at compression ratio = 68:1

Compression Standard / MSE / PSNR (in dB) / SSIM / QF / q
JPEG / 48.6423 / 31.2607 / 0.7883 / 3 / -
PIFS / 15.7656 / 36.1537 / 0.8904 / - / 14.3

Table 6: Quality metrics for JPEG and PIFS compressed balloons.pgm images at compression ratio = 68:1

Compression and decompression times were calculated by running the PIFS and JPEG compression and decompression programs along with the Unix ‘time’ command in the terminal and noting down the user-mode execution times. These values depend upon the processor on which the programs were run and the below values correspond to time taken on a 2.4GHz Intel Core 2 Duo Processor:

Figure 34: Compression time versus q for PIFS.

Figure 35: Decompression time versus q for PIFS.

Figure 36: Compression time versus QF for JPEG.

Figure 37: Decompression time versus QF for JPEG.

It can be observedfrom the above plots that the compression and decompression times for PIFS are highly asymmetric.

Scale/Resolution independence:

Fractal decoding has a unique property called Scale or resolution independence. Normally the decoder starts with an initially random image of the same size as the original. However if the decoder instead starts with an image of say double the size, then the result will be an image that is twice in size of the original. Unlike in JPEG, the decoded image will not have ‘blockiness’ that appears due to simply replicating the nearest pixels of the original image (also known as nearest neighbor interpolation). The FIC decoded image will still contain detail at every scale. This is because the affine transforms used to encode the image do not depend on its resolution.