Image Processing: Image Compression

Outline
- Introduction
- Compression & Decompression Steps
- Information Theory: Entropy
- Information Coding
- Binary Image Compression
- Lossless Gray-Level Image Compression
- Lossy Compression of Gray-Level Images

Definition of image compression
- Image compression is data compression applied to digital images.
- Its objective is to reduce the quantity of information (number of bytes) used to represent an image:
  - needs less storage
  - allows faster transmission
- The principle consists of reducing redundancies:
  - due to data correlation (data redundancy)
  - due to the use of non-optimal codes (coding redundancy)
- Compression: the original image is transformed and encoded into a compressed file.
- Decompression: the compressed file is decoded and the original image is reconstructed.

Two groups of compression methods
- Lossless compression methods:
  - all information is preserved
  - the transformation is reversible
  - suitable for binary and indexed-color images
- Lossy compression methods:
  - some information is lost
  - the transformation is not reversible
  - compression artifacts are introduced
  - suitable for natural gray-level and color images such as photos
  - the main goal is the best image quality at a given compression rate

Compression steps
- Transformation performs data decorrelation to reduce data redundancy.
- Quantization performs approximation by restricting the set of possible values; this is the step where information is lost.
- Encoding assigns optimal codes to eliminate coding redundancy.

[Diagram: original image → transformation → quantization → encoding → compressed image; transformation and encoding preserve information, quantization loses information]

Decompression steps
- Decoding restores the values representing the data in the transformed space.
- Reconstruction recomputes the data of the original image space.

[Diagram: compressed image → decoding → reconstruction → original image]

- For some applications, special requirements may apply: e.g. progressive display, to preview a low-quality image while a better-quality version is downloaded stepwise.

Encoding
- Goal: encode a sequence of symbols (or values) while minimizing the length of the data:
  - assign a binary code to each symbol
  - the assignment must be reversible (can be decoded unambiguously)
- Types of encoding schemes used:
  - entropy encoding:
    - Huffman coding
    - range coding
    - arithmetic coding (encumbered by patents!)
  - dictionary-based encoding

Information entropy
- Entropy is a basic concept of information theory.
- Formal definition (by Shannon): the entropy of a discrete random event x, with possible states 1..n, is defined as

    H(x) = − Σ_{i=1}^{n} p(i) log₂ p(i)

- Entropy is a lower bound on the average number of bits needed to represent one event:
  - entropy is maximal and equal to log₂ n if all states have the same probability
  - entropy decreases as the states' probabilities become more unequal
  - entropy is minimal and equal to 0 if only one state can occur
  - example: a Bernoulli trial with two states

Entropy example
- Consider an image with 8 gray levels Z in the range 0..7 with known distribution P_Z; the table below computes its entropy (recomputed in the sketch that follows).

    Z    P_Z     -log₂(P_Z)   -P_Z log₂(P_Z)
    0    0.19    2.396        0.455
    1    0.25    2.000        0.500
    2    0.21    2.252        0.473
    3    0.16    2.644        0.423
    4    0.08    3.644        0.292
    5    0.06    4.059        0.244
    6    0.03    5.059        0.152
    7    0.02    5.644        0.113
                   Entropy:   2.652
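A minimal Python sketch (not part of the original slides) that recomputes the table's entropy from the given distribution:

    import math

    # gray-level distribution P_Z from the table above
    p = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]

    H = -sum(pi * math.log2(pi) for pi in p)
    print(f"H = {H:.3f} bits/pixel")
    # -> H = 2.651; the table's per-row rounding yields 2.652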

Information coding principles
- An encoder translates a sequence of symbols (describing the states of an event sequence) into a string of bits.
- The decoder interprets this string of bits to reconstruct the sequence of symbols.
- Compression is achieved by using codes of variable lengths:
  - short codes are used for frequent symbols
  - long codes are used for rare symbols
- Two principles should be observed:
  - decoding must be unambiguous: this is the case for prefix-free codes (no code is a prefix of another one)
  - each bit string must be usable: this is achieved if each bit string either has a code as prefix, or is itself the prefix of a code
- Horn addresses of binary trees meet both requirements.

Information coding example
- Evaluation of two codes used to represent the gray levels of the previous example (frequencies Freq_Z per 100 pixels); code 1 is a fixed-length 3-bit code, code 2 a variable-length prefix-free code (checked in the sketch below).

    Z    Freq_Z   code 1   l1   Freq_Z·l1   code 2   l2   Freq_Z·l2
    0    19       000      3    57          11       2    38
    1    25       001      3    75          01       2    50
    2    21       010      3    63          10       2    42
    3    16       011      3    48          001      3    48
    4     8       100      3    24          0001     4    32
    5     6       101      3    18          00001    5    30
    6     3       110      3     9          000001   6    18
    7     2       111      3     6          000000   6    12
                         size:  300                size:  270

- With code 2, the average length is 2.70 bits per symbol, close to the entropy bound of 2.652 computed earlier.
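The sizes in the table can be verified with a short Python sketch (frequencies and code lengths taken from the table above):

    freq = {0: 19, 1: 25, 2: 21, 3: 16, 4: 8, 5: 6, 6: 3, 7: 2}
    l1 = {z: 3 for z in freq}                               # fixed 3-bit code
    l2 = {0: 2, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 6}  # variable-length code

    size1 = sum(freq[z] * l1[z] for z in freq)   # -> 300 bits
    size2 = sum(freq[z] * l2[z] for z in freq)   # -> 270 bits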

Huffman codes
- Huffman codes are optimal prefix-free codes associated with symbols.
- Huffman codes can be built by the following algorithm, using a forest of binary trees and a heap:

    heap = new Heap();
    for each Symbol s do
        heap.add(new Node(s, freq(s), null, null));
    while (heap.size() > 1) {
        node1 = heap.extract();
        node2 = heap.extract();
        heap.add(new Node(null, node1.getFreq() + node2.getFreq(), node1, node2));
    }
    result = heap.extract();

- The code of each symbol corresponds to its Horn address in the resulting tree.
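As an illustration, here is a runnable Python version of the same algorithm, using the standard heapq module in place of the Heap class above. Since Huffman trees are not unique, the exact bit patterns may differ from the slide's example, but the code lengths are the same:

    import heapq

    def huffman_codes(freq):
        """Build a Huffman code table from a {symbol: frequency} map."""
        # heap entries: (frequency, tie-breaker, subtree); a subtree is
        # either a symbol (leaf) or a (left, right) pair (internal node)
        heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        next_id = len(heap)
        while len(heap) > 1:
            f1, _, t1 = heapq.heappop(heap)     # two least frequent subtrees
            f2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
            next_id += 1
        codes = {}
        def walk(tree, prefix):
            if isinstance(tree, tuple):         # internal node: recurse
                walk(tree[0], prefix + "0")
                walk(tree[1], prefix + "1")
            else:                               # leaf: assign the code
                codes[tree] = prefix or "0"
        walk(heap[0][2], "")
        return codes

    # distribution of the entropy example: code lengths 2, 2, 2, 3, 4, 5, 6, 6
    print(huffman_codes({"z0": .19, "z1": .25, "z2": .21, "z3": .16,
                         "z4": .08, "z5": .06, "z6": .03, "z7": .02}))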

Huffman codes example
- The Huffman code for the previous example is built with the binary tree below: repeatedly merging the two least probable subtrees creates internal nodes with probabilities 0.05, 0.11, 0.19, 0.35, 0.40, 0.60 and 1.00.

[Figure: Huffman tree over the leaves p0 = 0.19, p1 = 0.25, p2 = 0.21, p3 = 0.16, p4 = 0.08, p5 = 0.06, p6 = 0.03, p7 = 0.02, with branches labelled 1 and 0]

- Resulting codes: z0 → 11, z1 → 01, z2 → 10, z3 → 001, z4 → 0001, z5 → 00001, z6 → 000001, z7 → 000000

Arithmetic coding
- Arithmetic coding is an optimal entropy-encoding technique in which the entire message is encoded as a single number between 0 and 1 (with very high precision).
- Example (implemented in the sketch below):
  - consider 4 symbols with the following probabilities: A (50%), B (20%), C (20%), D (10%)
  - each symbol corresponds to an interval: A (0.0-0.5), B (0.5-0.7), C (0.7-0.9), D (0.9-1.0)
  - the message ACBADA is encoded as 0.0 + 0.5·(0.7 + 0.2·(0.5 + 0.2·(0.0 + 0.5·(0.9 + 0.1·(0.0 + 0.5·0))))) = 0.409
- Many companies own patents, in the US and other countries, on algorithms used for implementing an arithmetic encoder and decoder!
- Range coding is an equivalent technique believed not to be covered by any company's patents!
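A floating-point Python sketch of this interval narrowing (illustrative only; practical arithmetic coders use integer arithmetic with renormalization to avoid precision limits):

    def arithmetic_encode(message, intervals):
        """Narrow [low, high) around each symbol's subinterval."""
        low, high = 0.0, 1.0
        for s in message:
            lo, hi = intervals[s]
            low, high = low + (high - low) * lo, low + (high - low) * hi
        return low   # any number in [low, high) identifies the message

    intervals = {"A": (0.0, 0.5), "B": (0.5, 0.7), "C": (0.7, 0.9), "D": (0.9, 1.0)}
    print(arithmetic_encode("ACBADA", intervals))   # -> ~0.409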

Dictionary-based coders
- Dictionary-based coders were initially developed to compress text.
- Performance is similar to entropy-based coders.
- Principles:
  - sequences of symbols are encoded according to entries in a dictionary
  - dictionaries may be static or dynamic
  - to avoid transmitting the dictionary, it is possible to build it automatically from previously seen strings
- The following coders belong to this family (an LZW sketch follows below):
  - LZW: used by the Unix compress utility and GIF files, but protected by several patents
  - LZ77, LZ78: ancestors of LZW, free of patents
  - DEFLATE: combines LZ77 and Huffman coding, used by the ZIP and gzip file formats
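For illustration, a compact Python sketch of LZW compression; the decoder and the variable-width bit packing used by real GIF/compress implementations are omitted:

    def lzw_compress(data: bytes) -> list[int]:
        """Encode a byte string as a list of dictionary indices."""
        dictionary = {bytes([i]): i for i in range(256)}  # start: all single bytes
        w, out = b"", []
        for b in data:
            wc = w + bytes([b])
            if wc in dictionary:
                w = wc                              # extend the current match
            else:
                out.append(dictionary[w])           # emit longest known string
                dictionary[wc] = len(dictionary)    # grow dictionary dynamically
                w = bytes([b])
        if w:
            out.append(dictionary[w])
        return out

    print(lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT"))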

Limitation of entropy encoding
- Entropy encoding is suitable for removing coding redundancy.
- Images generally don't have much coding redundancy:
  - entropy encoding alone does not compress image data significantly!
- The role of preliminary transformations is to convert data redundancy into coding redundancy, which can then be eliminated by entropy coding.

Binary Image Compression
- Types of transformations used for binary images:
  - run-length encoding
  - scan-line differences
  - quadtrees

Run-Length Encoding (RLE)
- RLE is a very simple form of lossless data compression.
- The lengths of horizontal runs (sequences of identical pixel values) are each stored as a single number (see the sketch below).
- Example:
  - original image using 120 bits
  - RLE representation: 5, 3, 16, 4, 4, 9, 3, 4, 4, 4, 9, 3, 4, 5, 4, 8, 3, 4, 6, 14, 4, using 21×5 = 105 bits (assuming five bits are used for each number)
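A one-function Python sketch of the idea, assuming a row-major sequence of 0/1 pixels (the convention for the starting color is an assumption, not stated on the slide):

    from itertools import groupby

    def run_lengths(pixels):
        """Lengths of maximal runs of identical values, in scan order."""
        return [sum(1 for _ in run) for _, run in groupby(pixels)]

    print(run_lengths([0]*5 + [1]*3 + [0]*16 + [1]*4))   # -> [5, 3, 16, 4]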

2D Run-Length Encoding
- Principle: apply RLE to scan-line differences.
- Example: the symbol sequence (one line per scan line)

    ABS, 5, 3, ABS, FL
    REL, -1, 0, ABS, 9, 3, ABS, FL
    REL, 0, 0, REL, 0, 0, ABS, FL
    REL, +1, +1, REL, 0, 0, ABS, FL
    ABS, 6, 14, ABS, FL

  is encoded as

    0 1010 011 0 0000
    1 10 0 0 11011 011 0 0000
    1 0 0 1 0 0 0 0000
    1 10 10 1 0 0 0 0000
    0 1011 1111010 0 0000

  using the code table:

    symbol:  ABS  REL  FL    0  -1  +1   0     1    2    3    4    5     6     6+...
    code:    0    1    0000  0  10  11   0001  001  010  011  100  1010  1011  11...

- Size of the compressed encoding: 13 + 18 + 11 + 13 + 17 = 72 bits.

Quadtree Data Structure
- A quadtree is a data structure used to recursively partition an image by subdividing it into four quadrants:
  - the process stops when a node corresponds to a homogeneous square

[Figure: an image recursively partitioned into homogeneous squares labelled A-S, and the corresponding quadtree whose leaves are the squares A-S]

Quadtree compression
- Quadtrees can be used for lossless data compression:
  - the quadtree is represented by the list of its nodes in pre-order, which is sufficient to rebuild the tree unambiguously (see the sketch below)
- Example:
  - original image using 64 bits
  - quadtree representation: XXWWXWWWBBXWBBWBXWWXBBWWB, using 25×2 = 50 bits (assuming two bits are used for each symbol)
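A Python sketch of the pre-order quadtree encoding, assuming a square binary NumPy image whose side is a power of two; X marks a subdivided node, W/B a homogeneous white/black square:

    import numpy as np

    def quadtree_encode(img: np.ndarray) -> str:
        """Pre-order quadtree string over the alphabet {X, W, B}."""
        if img.min() == img.max():                  # homogeneous square: a leaf
            return "W" if img.min() == 0 else "B"
        h2, w2 = img.shape[0] // 2, img.shape[1] // 2
        return ("X" + quadtree_encode(img[:h2, :w2])   # subdivide into quadrants
                    + quadtree_encode(img[:h2, w2:])
                    + quadtree_encode(img[h2:, :w2])
                    + quadtree_encode(img[h2:, w2:]))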

Lossless Gray-Level Image Compression
- Lossless gray-level image compression can be performed:
  - by using binary compression on a bit-plane decomposition
  - by decorrelating adjacent pixels and using entropy coding
- Performance is limited!

Bit-plane Decomposition
- An image with 256 gray levels can be represented by 8 bit-planes a0, a1, ..., a7:
  - the higher planes (a7, ...) are more compressible than the lower ones
- Better compression is achieved with the planes of the Gray code, defined as gᵢ = aᵢ ⊕ aᵢ₊₁ (with g₇ = a₇); a sketch follows below.
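A NumPy sketch of both decompositions for an 8-bit image (the convention g₇ = a₇ for the top plane is the usual one):

    import numpy as np

    def bit_planes(img: np.ndarray):
        """Natural bit planes a_0 .. a_7 of an 8-bit gray-level image."""
        return [(img >> i) & 1 for i in range(8)]

    def gray_planes(img: np.ndarray):
        """Gray-code planes g_i = a_i XOR a_{i+1}, with g_7 = a_7."""
        a = bit_planes(img)
        return [a[i] ^ a[i + 1] for i in range(7)] + [a[7]]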

Entropy Encoding for Gray-Level Images
- Usually, gray-level images don't have much coding redundancy.
- Example:
  - the 256 gray levels of the 65'536 pixels of the Lena picture have an entropy of 7.394 bits (corresponding to 60'572 bytes)
  - using Huffman codes, the picture can be compressed into 60'825 bytes (plus 481 bytes for the codebook)

[Figure: gray-level histogram of the Lena picture]

Gray-Level Image Transformations
- Transformations of gray-level images use:
  - pixel differences rather than absolute values:
    - based on scan lines
    - based on quadtrees
  - global transformations:
    - Karhunen-Loève or Hotelling transform (also known as PCA)
    - Fourier transform, Walsh-Hadamard transform, discrete cosine transform (DCT), etc.
    - Haar and other wavelets

Data Correlation for Adjacent Pixels
- In gray-level images, the values of adjacent pixels are highly correlated.
- Example (from the Lena picture):
  - first plot: distribution of (z1, z2) with z1 = f(2x, y), z2 = f(2x+1, y)
  - second plot: distribution of (z1, z2 - z1)

[Figure: two scatter plots; the first spreads along the diagonal, the second is concentrated around z2 - z1 = 0]

- The distribution of z2 - z1 is much narrower than the distribution of z2!

Decorrelation by Adjacent Pixel Differences
- Replacing pixel values by pixel differences (see the sketch below):
  - narrows the distribution
  - reduces the entropy
- Example:
  - the pixel differences of the Lena picture have an entropy of 5.277 bits (needing 43'321 bytes globally)

[Figure: histogram of pixel differences, sharply peaked around 0, over the range -100..100]
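A NumPy sketch of this decorrelation step, measuring the entropy of horizontal pixel differences (the function name and the exact differencing scheme are illustrative assumptions, not necessarily the computation behind the slide's figure):

    import numpy as np

    def difference_entropy(img: np.ndarray) -> float:
        """Entropy (bits/value) of horizontal pixel differences."""
        d = np.diff(img.astype(np.int16), axis=1).ravel()   # signed differences
        _, counts = np.unique(d, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log2(p)).sum())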

Decorrelation by Quadtrees
- Pixel differences can be generalized to two dimensions by considering quadtrees:
  - each node is characterized by the mean value of all its leaves (representing pixels)
  - each node stores the gray-level difference to its father
- Example:
  - for the Lena picture, the quadtree of differences has an entropy of 4.710 bits

Lossy Compression of Gray-Level Images
- Lossy compression consists of:
  - transforming spatial data in order to reduce data correlation
  - quantizing the values in the transformed space
  - encoding the data by eliminating coding redundancy
- Types of transformations used:
  - Karhunen-Loève Transform (KLT)
  - Fourier Transform, Walsh-Hadamard Transform and Discrete Cosine Transform (DCT)
  - wavelets, such as the Haar transform
  - ...

Karhunen-Loève Transform
- The Karhunen-Loève transform (KLT) is the linear transform that optimally decorrelates the data.
- The KLT is calculated as follows (a sketch follows below):
  - estimate the mean vector and the covariance matrix

      m = E{x} = (1/n) Σ_{k=1}^{n} x_k
      C = E{(x − m)(x − m)ᵗ} = (1/n) Σ_{k=1}^{n} x_k x_kᵗ − m mᵗ

  - find the eigenvectors of C, i.e. an orthonormal matrix A (eigenvectors as columns) such that

      Aᵗ C A = diag(λ1, λ2, ..., λd)

  - the transformation y = Aᵗ(x − m) is the Karhunen-Loève transform
- The reverse transform is obtained by x = A y + m.
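A NumPy sketch of this computation on a set of pixel-block vectors (one block per row of the data matrix); the eigenvectors are stored as the columns of A, as above:

    import numpy as np

    def klt(blocks: np.ndarray):
        """Forward KLT y = A^t (x - m) for each row x of `blocks`."""
        m = blocks.mean(axis=0)             # mean vector m = E{x}
        # np.cov normalizes by n-1 instead of the slide's 1/n,
        # which only rescales the eigenvalues, not the eigenvectors
        C = np.cov(blocks, rowvar=False)
        eigvals, A = np.linalg.eigh(C)      # columns of A: eigenvectors of C
        order = np.argsort(eigvals)[::-1]   # decreasing eigenvalue order
        A = A[:, order]
        y = (blocks - m) @ A                # row-wise A^t (x - m)
        return y, A, m

    # reverse transform x = A y + m:  blocks_back = y @ A.T + m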

Karhunen-Loève Transform (cont.)
- The Karhunen-Loève transform can be applied to blocks of adjacent pixels.
- The Karhunen-Loève transform can be understood as a linear transform over a set of data-dependent basis functions:
  - these functions are ordered by decreasing eigenvalue
- Example:
  - the 64 8×8 basis functions of the Lena picture, listed with their indices

[Figure: the 64 8×8 KLT basis functions of the Lena picture, indexed 1 to 64]

Quantization of the KLT data
- The KLT is a reversible transform: the original image can be rebuilt from the KLT coefficients.
- For lossy compression, the KLT coefficients are quantized, using two complementary principles (see the sketch below):
  - the set of possible values is restricted to multiples of a quantization interval: ŷ = q · round(y / q)
  - only a subset of the KLT coefficients is kept (the others are set to 0)
- Entropy coding is used for:
  - encoding the KLT basis
  - encoding the KLT coefficients
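The two principles in a short NumPy sketch (the coefficient layout, one row of eigenvalue-ordered KLT coefficients per block, is an assumption carried over from the previous sketch):

    import numpy as np

    def quantize_klt(y: np.ndarray, q: float, keep: int) -> np.ndarray:
        """Round coefficients to multiples of q and keep only the first `keep`."""
        y_hat = q * np.round(y / q)     # y^ = q * round(y / q)
        y_hat[:, keep:] = 0             # discard the remaining coefficients
        return y_hat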

Example of KLT compression
- Original image compared to:
  - 1/4 of the KLT coefficients, quantized with multiples of 16 (size = 3'834 bytes, compression rate ≈ 1:17)
  - 1/8 of the KLT coefficients, quantized with multiples of 16 (size = 2'152 bytes, compression rate ≈ 1:30)

Distortion
- Two measures are used to characterize distortion (see the sketch below):
  - the mean square error (MSE), defined as MSE = E[(x − x̂)²]
  - the peak signal-to-noise ratio (PSNR), defined as PSNR = 10 log₁₀( m² / E[(x − x̂)²] ), where m represents the peak value (255 for an 8-bit gray-level image)
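Both measures in a short NumPy sketch for 8-bit images:

    import numpy as np

    def mse(x: np.ndarray, x_hat: np.ndarray) -> float:
        """Mean square error E[(x - x^)^2]."""
        return float(np.mean((x.astype(np.float64) - x_hat.astype(np.float64)) ** 2))

    def psnr(x: np.ndarray, x_hat: np.ndarray, peak: float = 255.0) -> float:
        """Peak signal-to-noise ratio in dB."""
        return float(10 * np.log10(peak ** 2 / mse(x, x_hat)))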

© 2004 Rolf Ingold, University of Fribourg