Image Processing: Image Compression
Outline:
- Introduction
- Compression & Decompression Steps
- Information Theory: Entropy
- Information Coding
- Binary Image Compression
- Lossless Gray-Level Image Compression
- Lossy Compression of Gray-Level Images
Definition of image compression
Image compression is data compression applied to digital images. Its objective is to reduce the quantity of information (number of bytes) used to represent an image:
- it needs less storage
- it allows faster transmission
The principle consists of reducing redundancies:
- due to data correlation (data redundancy)
- due to the use of non-optimal codes (coding redundancy)
Compression: the original image is transformed and encoded into a compressed file.
Decompression: the compressed file is decoded and the original image is reconstructed.
© 2004 Rolf Ingold, University of Fribourg
Two groups of compression methods
Lossless compression methods:
- all information is preserved; the transformation is reversible
- suitable for binary and indexed-color images
Lossy compression methods:
- some information is lost; the transformation is not reversible
- compression artifacts are introduced
- suitable for natural gray-level and color images such as photos
- the main goal is the best image quality at a given compression rate
Compression steps
- Transformation performs data decorrelation to reduce data redundancy
- Quantization performs approximation by restricting the set of possible values; this is the step where information is lost
- Encoding assigns optimal codes to eliminate coding redundancy

Pipeline: original image → transformation → quantization → encoding → compressed image; information is preserved in every step except quantization.
Decompression steps
- Decoding restores the values representing the data in the transformed space
- Reconstruction recomputes the data of the original image space

Pipeline: compressed image → decoding → reconstruction → original image

For some applications, special requirements may be requested, e.g. progressive display: preview a low-quality image while downloading, stepwise refined into a better-quality version.
Encoding
Goal: encode a sequence of symbols (or values) while minimizing the length of the data
- assign a binary code to each symbol
- the encoding must be reversible (decodable unambiguously)
Types of encoding schemes used:
- entropy encoding: Huffman coding, range coding, arithmetic coding (encumbered by patents!)
- dictionary-based encoding
Information entropy
Entropy is a basic concept of information theory. Formal definition (by Shannon): the entropy of a discrete random event x, with possible states 1..n, is defined as

H(x) = − Σ_{i=1}^{n} p(i) log2 p(i)

Entropy is a lower bound on the average number of bits needed to represent one event:
- entropy is maximal and equal to log2 n if all states have the same probability
- entropy decreases as the states' probabilities become more unequal
- entropy is minimal and equal to 0 if only one state can occur
- example: for a Bernoulli trial with two states, entropy ranges from 0 to 1 bit
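Shannon's formula above can be transcribed directly; the sketch below checks the maximal-entropy case (uniform distribution over 8 states gives log2 8 = 3 bits):

```python
from math import log2

# H = -sum p_i log2 p_i ; terms with p_i = 0 contribute nothing.
def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# uniform distribution over 8 states: maximal entropy log2(8) = 3 bits
uniform = [1 / 8] * 8
assert abs(entropy(uniform) - 3.0) < 1e-12

# the gray-level distribution of the entropy example below
pz = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
h = entropy(pz)   # ≈ 2.65 bits
```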
Entropy example
We consider an image with 8 gray levels Z in the range 0..7 with a known distribution PZ; the table below computes its entropy.

Z | PZ   | -log2(PZ) | -PZ log2(PZ)
0 | 0.19 | 2.396     | 0.455
1 | 0.25 | 2.000     | 0.500
2 | 0.21 | 2.252     | 0.473
3 | 0.16 | 2.644     | 0.423
4 | 0.08 | 3.644     | 0.292
5 | 0.06 | 4.059     | 0.244
6 | 0.03 | 5.059     | 0.152
7 | 0.02 | 5.644     | 0.113

Entropy: 2.652
Information coding principles
An encoder translates a sequence of symbols (describing the states of an event sequence) into a string of bits. The decoder interprets this string of bits to reconstruct the sequence of symbols.
Compression is achieved by using codes of variable lengths:
- short codes are used for frequent symbols
- long codes are used for rare symbols
Two principles should be observed:
- decoding must be unambiguous: this is the case for prefix-free codes (no code is a prefix of another one)
- each bit string must be usable: this is achieved if each bit string either has a code as prefix, or is itself the prefix of a code
Horn addresses of binary trees meet both requirements.
Information coding example
Evaluation of two variable-length codes used to represent the gray levels of the previous example:

Z | FreqZ | code 1 | l1 | FreqZ·l1 | code 2 | l2 | FreqZ·l2
0 | 19    | 000    | 3  | 57       | 11     | 2  | 38
1 | 25    | 001    | 3  | 75       | 01     | 2  | 50
2 | 21    | 010    | 3  | 63       | 10     | 2  | 42
3 | 16    | 011    | 3  | 48       | 001    | 3  | 48
4 | 8     | 100    | 3  | 24       | 0001   | 4  | 32
5 | 6     | 101    | 3  | 18       | 00001  | 5  | 30
6 | 3     | 110    | 3  | 9        | 000001 | 6  | 18
7 | 2     | 111    | 3  | 6        | 000000 | 6  | 12

Size with code 1: 300 bits; size with code 2: 270 bits. Over these 100 pixels, code 2 averages 2.70 bits per symbol, close to the entropy of 2.652.
Huffman codes
Huffman codes are optimal prefix-free codes associated to symbols. Huffman codes can be built by the following algorithm, using a forest of binary trees stored in a min-heap ordered by frequency:

heap = new Heap();
for each Symbol s do
    heap.add(new Node(s, freq(s), null, null));
while (heap.size() > 1) {
    node1 = heap.extract();
    node2 = heap.extract();
    f1 = node1.getFreq();
    f2 = node2.getFreq();
    heap.add(new Node(null, f1 + f2, node1, node2));
}
result = heap.extract();

The code of each symbol corresponds to its Horn address in the result tree.
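The algorithm above can be sketched in a few lines of Python with the standard `heapq` module. Tie-breaking between equal frequencies may produce a different (but equally optimal) tree than the slides, so the sketch returns code lengths rather than the codes themselves:

```python
import heapq

def huffman_code_lengths(freqs):
    """Build a Huffman tree with a min-heap and return each symbol's
    code length (= the depth of its leaf, i.e. its Horn-address length)."""
    # Heap entries are (frequency, tiebreaker, tree); a tree is either
    # a bare symbol or a pair of subtrees.
    heap = [(f, i, s) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two least frequent trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    _, _, tree = heap[0]
    lengths = {}
    def walk(t, depth):
        if isinstance(t, tuple):          # internal node: recurse
            walk(t[0], depth + 1)
            walk(t[1], depth + 1)
        else:                             # leaf: record its depth
            lengths[t] = depth
    walk(tree, 0)
    return lengths

freqs = {0: 19, 1: 25, 2: 21, 3: 16, 4: 8, 5: 6, 6: 3, 7: 2}
lengths = huffman_code_lengths(freqs)
total_bits = sum(f * lengths[s] for s, f in freqs.items())   # → 270
```

Whatever the tie-breaking, any Huffman tree for this distribution yields the same optimal total of 270 bits.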
Huffman codes example
The Huffman code for the previous example is built with a binary tree, repeatedly merging the two lowest-probability nodes (p7=0.02 with p6=0.03, then 0.05 with p5=0.06, etc.). Reading the Horn addresses off the tree yields:

z0 → 11, z1 → 01, z2 → 10, z3 → 001, z4 → 0001, z5 → 00001, z6 → 000001, z7 → 000000

[Figure: the Huffman tree, with internal node probabilities 0.05, 0.11, 0.19, 0.35, 0.40, 0.60 and 1.00 at the root.]
Arithmetic coding
Arithmetic coding is an optimal entropy encoding technique in which the entire message is encoded as a single number between 0 and 1 (with arbitrarily high precision).
Example: consider 4 symbols with the following probabilities: A (50%), B (20%), C (20%), D (10%)
- each symbol corresponds to an interval: A [0.0, 0.5), B [0.5, 0.7), C [0.7, 0.9), D [0.9, 1.0)
- the message ACBADA is encoded as 0.0 + 0.5·(0.7 + 0.2·(0.5 + 0.2·(0.0 + 0.5·(0.9 + 0.1·(0.0 + 0.5·0))))) = 0.409
Many companies own patents in the US and other countries on algorithms used for implementing an arithmetic encoder and decoder!
Range coding is an equivalent technique believed not to be covered by any company's patents!
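The nested expression above is just repeated interval narrowing, which a minimal sketch makes explicit (real coders emit bits incrementally instead of relying on floating point):

```python
# Interval bounds from the slide's example.
INTERVALS = {'A': (0.0, 0.5), 'B': (0.5, 0.7), 'C': (0.7, 0.9), 'D': (0.9, 1.0)}

def arith_encode(message):
    """Narrow [low, low + width) once per symbol; any number in the
    final interval identifies the message."""
    low, width = 0.0, 1.0
    for s in message:
        lo, hi = INTERVALS[s]
        low += width * lo
        width *= (hi - lo)
    return low, width

low, width = arith_encode("ACBADA")   # low ≈ 0.409
```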
Dictionary based coders
Dictionary-based coders were initially developed to compress text. Performance is similar to entropy-based coders.
Principles:
- sequences of symbols are encoded according to entries in a dictionary
- dictionaries may be static or dynamic
- to avoid transmitting the dictionary, it is possible to automatically build a dictionary of previously seen strings
The following coders belong to this family:
- LZW: used by the Unix compress utility and GIF files, but protected by several patents
- LZ77, LZ78: ancestors of LZW, free of patents
- DEFLATE: combining LZ77 and Huffman codes, used for the ZIP and gzip file formats
Limitation of entropy encoding
Entropy encoding is suitable for removing coding redundancy. However, images generally don't have much coding redundancy, so entropy encoding alone does not compress image data significantly! The role of preliminary transformations is to convert data redundancy into coding redundancy, which can then be eliminated by entropy coding.
Binary Image Compression
Types of transformations used for binary images:
- run-length encoding
- scan-line differences
- quadtrees
Run-Length Encoding (RLE)
RLE is a very simple form of lossless data compression: the lengths of horizontal runs (sequences of identical pixel values) are each stored as a single number. Example: original image using 120 bits.
RLE representation: 5, 3, 16, 4, 4, 9, 3, 4, 4, 4, 9, 3, 4, 5, 4, 8, 3, 4, 6, 14, 4 using 21x5=105 bits (assuming five bits are used for each number)
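A minimal RLE sketch for a single scan line: since a binary image only alternates between two values, it suffices to store the first pixel value and the run lengths.

```python
def rle_encode(bits):
    """Encode a row of 0/1 pixels as (first value, run lengths)."""
    runs = []
    count = 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1          # run continues
        else:
            runs.append(count)  # run ends, start a new one
            count = 1
    runs.append(count)
    return bits[0], runs

def rle_decode(first, runs):
    """Rebuild the pixel row, flipping the value after each run."""
    out, value = [], first
    for n in runs:
        out.extend([value] * n)
        value = 1 - value
    return out

row = [0] * 5 + [1] * 3 + [0] * 4
first, runs = rle_encode(row)          # → (0, [5, 3, 4])
assert rle_decode(first, runs) == row  # lossless round trip
```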
2D-Run-Length Encoding
Principle: apply RLE on scan-line differences, using absolute runs (ABS), runs coded relative to the previous line (REL), and an end-of-line marker (FL).

Example (one line per scan line):
ABS, 5, 3, ABS, FL
REL, -1, 0, ABS, 9, 3, ABS, FL
REL, 0, 0, REL, 0, 0, ABS, FL
REL, +1, +1, REL, 0, 0, ABS, FL
ABS, 6, 14, ABS, FL

Encoded bit stream:
0 1010 011 0 0000
1 10 0 0 11011 011 0 0000
1 0 0 1 0 0 0 0000
1 10 10 1 0 0 0 0000
0 1011 1111010 0 0000

Code table:
- modes: ABS → 0, REL → 1, FL → 0000
- relative offsets: 0 → 0, -1 → 10, +1 → 11
- run lengths: 0 → 0001, 1 → 001, 2 → 010, 3 → 011, 4 → 100, 5 → 1010, 6 → 1011, 6+n → 11 followed by the code of n

Size of the compressed encoding: 13 + 18 + 11 + 13 + 17 = 72 bits
Quadtree Data Structure
A quadtree is a data structure used to recursively partition an image by subdividing it into four quadrants; the subdivision stops when a node corresponds to a homogeneous square.

[Figure: an image partitioned into homogeneous squares labeled A..S, and the corresponding quadtree whose leaves are these squares and whose internal nodes are the subdivided quadrants.]
Quadtree compression
Quadtrees can be used for lossless data compression: the quadtree is represented by the list of its nodes in pre-order, which is sufficient to rebuild the tree unambiguously. Example: original image using 64 bits.
quadtree representation: XXWWXWWWBBXWBBWBXWWXBBWWB using 25x2=50 bits (assuming two bits are used for each symbol)
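The pre-order representation can be produced by a short recursive sketch; here `W`/`B` mark homogeneous white/black quadrants and `X` a subdivided one. The quadrant order (NW, NE, SW, SE) is an assumption, and the image side is taken to be a power of two:

```python
def quadtree_encode(img, x=0, y=0, size=None):
    """Pre-order quadtree string of a square 0/1 image."""
    if size is None:
        size = len(img)
    vals = {img[y + dy][x + dx] for dy in range(size) for dx in range(size)}
    if vals == {0}:
        return "W"                      # homogeneous white leaf
    if vals == {1}:
        return "B"                      # homogeneous black leaf
    h = size // 2                       # mixed: subdivide into 4 quadrants
    return ("X"
            + quadtree_encode(img, x, y, h)            # NW
            + quadtree_encode(img, x + h, y, h)        # NE
            + quadtree_encode(img, x, y + h, h)        # SW
            + quadtree_encode(img, x + h, y + h, h))   # SE

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 0, 0],
       [0, 0, 1, 0]]
code = quadtree_encode(img)   # → "XWBWXWWBW"
```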
Lossless Gray Level Image Compression
Lossless gray-level image compression can be performed
- by using binary compression on a bit-plane decomposition
- by decorrelating adjacent pixels and using entropy coding
Performance is limited!
Bit-plane Decomposition
An image with 256 gray levels can be represented by 8 binary bit-planes a0, a1, ..., a7; the higher planes (a7, ...) are more compressible than the lower ones.
A better compression is achieved with the planes of the Gray code, defined as gi = ai ⊕ ai+1 (taking a8 = 0, so that g7 = a7).
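The per-plane definition gi = ai ⊕ ai+1 (with a8 = 0) is equivalent to computing v XOR (v >> 1) on the whole 8-bit value, as this sketch checks:

```python
def bitplanes(v):
    """The 8 bit-plane values a0..a7 of an 8-bit gray level."""
    return [(v >> i) & 1 for i in range(8)]

def gray_planes(v):
    """Gray-code planes gi = ai XOR a(i+1), with a8 = 0."""
    a = bitplanes(v) + [0]
    return [a[i] ^ a[i + 1] for i in range(8)]

def gray_value(v):
    """Same transform expressed on the whole byte."""
    return v ^ (v >> 1)

v = 203
assert gray_planes(v) == bitplanes(gray_value(v))
```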
Entropy Encoding for Gray-Level images
Usually, gray-level images don't have much coding redundancy.
Example: the 256 gray levels of the 65'536 pixels of the Lena picture have an entropy of 7.394 bits (corresponding to 60'572 bytes); using Huffman codes, the picture can be compressed into 60'825 bytes (plus 481 bytes for the codebook).

[Figure: gray-level histogram of the Lena picture.]
Gray-Level Image Transformations
Transformations of gray-level images use
- pixel differences rather than absolute values, based on scan lines or based on quadtrees
- global transformations:
  - Karhunen-Loève or Hotelling transform (also known as PCA)
  - Fourier transform, Walsh-Hadamard transform, discrete cosine transform (DCT), etc.
  - Haar and other wavelets
Data Correlation for Adjacent Pixels
In gray-level images, the values of adjacent pixels are highly correlated. Example (from the Lena picture):
- first plot: distribution of (z1, z2) with z1 = f(2x, y), z2 = f(2x+1, y)
- second plot: distribution of (z1, z2 - z1)
The distribution of z2 - z1 is much narrower than the distribution of z2!

[Figure: the two scatter plots.]
Decorrelation by Adjacent Pixel Differences
Replacing pixel values by pixel differences
- narrows the distribution
- reduces the entropy
Example: the pixel differences of the Lena picture have an entropy of 5.277 bits (needing globally 43'321 bytes).

[Figure: histogram of the pixel differences, sharply peaked around 0.]
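The effect is easy to reproduce on a synthetic smooth "scan line" (a stand-in for real image data): differences of adjacent values have far fewer distinct states, hence lower entropy, than the values themselves.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical entropy of a list of discrete values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())

# slowly varying signal: many distinct values, but few distinct steps
signal = [i // 2 for i in range(256)]
diffs = [b - a for a, b in zip(signal, signal[1:])]

assert entropy(diffs) < entropy(signal)
```

Here `signal` takes 128 equiprobable values (7 bits of entropy) while its differences only alternate between 0 and 1 (about 1 bit).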
Decorrelation by Quadtrees
Pixel differences can be generalized to 2 dimensions by considering quadtrees: each node is characterized by the mean value of all its leaves (representing pixels), and stores the gray-level difference to its father. Example: for the Lena picture, the quadtree of differences has an entropy of 4.710 bits.
Lossy Compression of Gray-Level Images
Lossy compression consists of
- transforming the spatial data in order to reduce data correlation
- quantizing the values in the transformed space
- encoding the data by eliminating coding redundancy
Types of transformations used:
- Karhunen-Loève Transform (KLT)
- Fourier Transform, Walsh-Hadamard Transform and Discrete Cosine Transform (DCT)
- wavelets, such as the Haar Transform
- ...
Karhunen-Loève Transform
The Karhunen-Loève transform (KLT) is the linear transform that optimally decorrelates the data. The KLT is calculated as follows:
- estimate the mean vector and the covariance matrix:

  m = E{x} = (1/n) Σ_{k=1}^{n} x_k
  C = E{(x − m)(x − m)^t} = (1/n) Σ_{k=1}^{n} x_k x_k^t − m m^t

- find the eigenvectors of C, i.e. an orthonormal matrix A (eigenvectors as columns) such that

  A^t C A = diag(λ1, λ2, ..., λd)

- the transformation y = A^t (x − m) is the Karhunen-Loève transform; the reverse transform is obtained by x = A y + m
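A small numerical sketch of these steps (assuming, as above, that the eigenvectors are the columns of A, so y = A^t(x − m) and x = A y + m):

```python
import numpy as np

rng = np.random.default_rng(0)
# correlated 2-D data: second coordinate ≈ first plus small noise
x1 = rng.normal(size=1000)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=1000)])

m = X.mean(axis=0)
C = np.cov(X, rowvar=False, bias=True)   # C = E{(x-m)(x-m)^t}
eigvals, A = np.linalg.eigh(C)           # columns of A are eigenvectors
Y = (X - m) @ A                          # y = A^t (x - m), row-wise

# transformed coordinates are decorrelated: off-diagonal covariance ~ 0
CY = np.cov(Y, rowvar=False, bias=True)
assert abs(CY[0, 1]) < 1e-10

# the transform is reversible: x = A y + m
assert np.allclose(Y @ A.T + m, X)
```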
Karhunen-Loève Transform (cont.)
The Karhunen-Loève transform can be applied on blocks of adjacent pixels. It can be understood as a linear transform over a set of data-dependent basis functions, ordered according to decreasing eigenvalues. Example: the 64 8x8 basis functions of the Lena picture, listed with their index.

[Figure: the 64 8x8 KLT basis functions, indexed 1..64.]
Quantization of the KLT data
The KLT is a reversible transform: the original image can be rebuilt from the KLT coefficients. For lossy compression, the KLT coefficients are quantized, using two complementary principles:
- the set of possible values is restricted to multiples of a quantization interval q:  ŷ = q · round(y / q)
- only a subset of the KLT coefficients is kept (the others are set to 0)
Entropy coding is then used for encoding the KLT coefficients.
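The quantization formula ŷ = q · round(y / q) in one line, on hypothetical coefficient values chosen for illustration:

```python
# Uniform quantization: round each value to the nearest multiple of q;
# only the (small-magnitude) multiple index needs to be entropy-coded.
def quantize(y, q):
    return q * round(y / q)

coeffs = [3.2, -27.5, 130.9, 8.9]
quantized = [quantize(y, 16) for y in coeffs]   # → [0, -32, 128, 16]
```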
Example of KLT compression
Original image compared to:
- 1/4 of the KLT coefficients kept, quantized with multiples of 16 (size = 3'834 bytes, compression rate ≈ 1:17)
- 1/8 of the KLT coefficients kept, quantized with multiples of 16 (size = 2'152 bytes, compression rate ≈ 1:30)
Distortion
Two measures are used to characterize distortion:
- the mean square error (MSE), defined as  MSE = E[(x − x̂)²]
- the peak signal-to-noise ratio (PSNR), defined as  PSNR = 10 log10( m² / E[(x − x̂)²] )
where m represents the peak value (255 for an 8-bit gray-level image)
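Both measures in code, on hypothetical pixel values chosen for illustration:

```python
from math import log10

def mse(orig, approx):
    """Mean square error E[(x - x_hat)^2] over paired pixel values."""
    return sum((a - b) ** 2 for a, b in zip(orig, approx)) / len(orig)

def psnr(orig, approx, peak=255):
    """Peak signal-to-noise ratio in dB, peak = 255 for 8-bit images."""
    return 10 * log10(peak ** 2 / mse(orig, approx))

orig = [52, 55, 61, 66]
approx = [52, 57, 61, 64]
err = mse(orig, approx)   # → 2.0
```

Note that PSNR is undefined (infinite) for identical images, since the MSE is then zero.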