Chapter 2

Entropy

2.4 Coding (Binary Codes)

Code words

Alphabet (Collection of symbols)

FLC: Fixed-length code (fixed word length)

VLC: Variable-length code (variable word length)

Uniquely decodable code

FLC (Codes)

a1 / 00
a2 / 01
a3 / 10
a4 / 11

Decoding (simpler for an FLC):

Bit stream: 00 11 01 10 10 11 01 → a1 a4 a2 a3 a3 a4 a2

Transmitted bit stream: 00110110101101 (for a1 a4 a2 a3 a3 a4 a2)

If bit position 6 (a '1') is detected as '0', only the third codeword is affected (a2 is decoded as a1). The error is localized and does not affect the rest of the bit stream.
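A minimal sketch in Python (not from the text) of the FLC decoding above; the code table and the bit flip at position 6 follow the example, while the function name is illustrative:

```python
# Minimal sketch of fixed-length-code (FLC) decoding; the code table and
# bit stream follow the example above, the helper name is illustrative.

FLC = {"00": "a1", "01": "a2", "10": "a3", "11": "a4"}

def decode_flc(bits, word_len=2):
    """Split the stream into fixed-length words and look each one up."""
    return [FLC[bits[i:i + word_len]] for i in range(0, len(bits), word_len)]

stream = "00110110101101"                    # a1 a4 a2 a3 a3 a4 a2
corrupted = stream[:5] + "0" + stream[6:]    # bit position 6 (1-based) flipped 1 -> 0

print(decode_flc(stream))     # ['a1', 'a4', 'a2', 'a3', 'a3', 'a4', 'a2']
print(decode_flc(corrupted))  # ['a1', 'a4', 'a1', 'a3', 'a3', 'a4', 'a2'] (only one symbol wrong)
```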

P.S.: These notes are adapted from K. Sayood, Introduction to Data Compression, 3rd Edition, Morgan Kaufmann, San Francisco, CA, 2006.

2.4.1 Uniquely Decodable Codes

Ex: Source alphabet

Table 3.1

ai / P(ai) / Code 1 / Code 2 / Code 3 / Code 4
a1 / 1/2 / 0 / 0 / 0 / 0
a2 / 1/4 / 0 / 1 / 10 / 01
a3 / 1/8 / 1 / 00 / 110 / 011
a4 / 1/8 / 10 / 11 / 111 / 0111
Avg. length / / 1.125 / 1.25 / 1.75 / 1.875

Avg. length = Σ_{i=1}^{4} P(ai) n(ai) bits/symbol (for a VLC), where Σ_{i=1}^{4} P(ai) = 1.

Serial bit stream (Code 4): 0111 01 0 01 011 01 → Decoder (complex) → a4 a2 a1 a2 a3 a2

A VLC is highly error sensitive.

Transmitted bit stream: 01110100101101 (for a4 a2 a1 a2 a3 a2)

If bit position 10 (a '0') is detected as '1', the error propagates: it can affect the decoding of that symbol and of the symbols that follow at the receiver.
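A companion sketch (again illustrative, not from the text) decoding Code 4; with the same bit flip at position 10, the parse changes from that point on, unlike the FLC case:

```python
# Minimal sketch of a decoder for Code 4 (a1=0, a2=01, a3=011, a4=0111).
# Each codeword is a '0' followed by a run of 1s, so the decoder must look
# ahead to the next '0' before it can emit a symbol (not instantaneous).

CODE4 = {0: "a1", 1: "a2", 2: "a3", 3: "a4"}   # number of trailing 1s -> symbol

def decode_code4(bits):
    symbols, i = [], 0
    while i < len(bits):
        if bits[i] != "0":                 # a codeword must start with '0'
            symbols.append("?")            # decoding failure
            i += 1
            continue
        j = i + 1
        while j < len(bits) and bits[j] == "1":    # count the run of 1s
            j += 1
        symbols.append(CODE4.get(j - i - 1, "?"))  # a run of more than 3 ones is invalid
        i = j
    return symbols

stream = "01110100101101"                    # a4 a2 a1 a2 a3 a2
corrupted = stream[:9] + "1" + stream[10:]   # bit position 10 (1-based) flipped 0 -> 1

print(decode_code4(stream))     # ['a4', 'a2', 'a1', 'a2', 'a3', 'a2']
print(decode_code4(corrupted))  # ['a4', 'a2', 'a1', '?', 'a2']: the single bit error
                                # corrupts the rest of the parse
```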

Unique Decodability

A given sequence of codewords can be decoded in one and only one way. No Ambiguity.

Code 3 / Code 4
0 / 0
10 / 01
110 / 011
111 / 0111
Instantaneous code / Not instantaneous code

Shannon entropy:

Entropy = H = -Σ_{i=1}^{N} P(ai) log2 P(ai)

N = number of symbols
P(ai) = probability of symbol ai
Σ_{i=1}^{N} P(ai) = 1

Entropy is the theoretical minimum average bit rate.
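A small sketch of this formula in Python (illustrative, not from the text):

```python
# Sketch of the entropy formula above: H = -sum(P(ai) * log2(P(ai))),
# the theoretical minimum average bit rate in bits/symbol.
from math import log2

def entropy(probs):
    """First-order entropy in bits/symbol; probs must sum to 1."""
    assert abs(sum(probs) - 1.0) < 1e-9
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits/symbol
print(entropy([0.4, 0.2, 0.2, 0.1, 0.1]))   # ~2.122 bits/symbol
```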

Table 2.2 Consider Bitstream

a1 / 0
a2 / 01
a3 / 11

011111111111111111 (a '0' followed by 17 'ones')

Uniquely decodable but not instantaneous: with 17 ones the stream can only be a2 followed by eight a3's, but the decoder cannot decide between a1 and a2 for the first codeword until it has counted all the ones.

Table 2.3 Consider Bitstream

a1 / 0
a2 / 01
a3 / 10

"01010101010101010"

Not uniquely decodable: the stream can be read as a1 a3 a3 … (0, 10, 10, …) or as a2 a2 … a1 (01, 01, …, 0).

Unique decodability is a must.

Prefix code: No codeword is a prefix of any other codeword. This guarantees unique decodability.

Unique Decodability

Consider two binary codewords ‘a’ and ‘b’

a : k bits long k < n

b : n bits long

If the first ‘k’ bits of ‘b’ are identical to ‘a’ then ‘a’ is called prefix of ‘b’. Last n-k bits are called dangling suffix.

Ex.

a = 010 (k = 3)
b = 01011 (n = 5)

'a' is a prefix of 'b'; the dangling suffix is '11'.

2.4.2 Prefix codes

Unique decodability test: examine the dangling suffixes of codeword pairs in which one codeword is a prefix of the other. If a dangling suffix is itself a codeword, the code is not uniquely decodable; otherwise, add the new dangling suffixes to the list and repeat until either a dangling suffix equals a codeword or no new dangling suffixes appear (in which case the code is uniquely decodable).

Ex: 2.4.1

Codewords {0, 01, 11}: '0' is a prefix of '01'; the dangling suffix is '1', which is not a codeword, and no new dangling suffixes appear, so the code is uniquely decodable.

Codewords {0, 01, 11, 1}: the dangling suffix '1' is itself a codeword, so the code is not uniquely decodable.
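The dangling-suffix test can be run mechanically. Below is a sketch of the procedure (a Sardinas-Patterson style check) in Python; the function names are illustrative:

```python
def dangling_suffixes(A, B):
    """Suffixes left over when a word of A is a proper prefix of a word of B."""
    return {b[len(a):] for a in A for b in B if len(b) > len(a) and b.startswith(a)}

def is_uniquely_decodable(codewords):
    C = set(codewords)
    S = dangling_suffixes(C, C)            # dangling suffixes of codeword pairs
    seen = set()
    while S:
        if S & C:                          # a dangling suffix is itself a codeword
            return False
        seen |= S
        S = (dangling_suffixes(S, C) | dangling_suffixes(C, S)) - seen
    return True

print(is_uniquely_decodable(["0", "01", "11"]))        # True
print(is_uniquely_decodable(["0", "01", "11", "1"]))   # False
print(is_uniquely_decodable(["0", "01", "10"]))        # False (the Table 2.3 code)
```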

2.4.2 Prefix Codes (contd.)

Prefix code: no codeword is a prefix of any other codeword.

Code 2 (not uniquely decodable):

a1 / 0
a2 / 1
a3 / 00
a4 / 11

(Fig. 2.4: binary code tree for Code 2 — root node, internal nodes, external nodes (leaves).)


Codes 3 & 4 are uniquely decodable.

Code 3:
a1 / 0
a2 / 10
a3 / 110
a4 / 111

(Code tree for Code 3: a1, a2, a3, a4 all sit at external nodes (leaves).)

Code 4:
a1 / 0
a2 / 01
a3 / 011
a4 / 0111

(Code tree for Code 4: only a4 sits at an external node; a1, a2, a3 are internal nodes.)

In a prefix code, codewords are associated only with the external nodes.

For any non-prefix uniquely decodable code, there is always a prefix code with the same codeword lengths.

(Prefix code: no codeword is a prefix of any other codeword. This guarantees unique decodability.)

Chapter 3

Huffman Coding

VLC - prefix codes that are optimum for a given model.

A practical code whose rate comes closest to the entropy. If all probabilities are negative integer powers of two, the Huffman code's average length equals the entropy.

Ex: probabilities 2^{-1}, 2^{-2}, 2^{-3}, 2^{-3}

Entropy = -Σ_{i=1}^{N} P(ai) log2 P(ai) = minimum theoretical bit rate to code the N symbols, with Σ_{i=1}^{N} P(ai) = 1.

Huffman code: a practical VLC that comes very close to the entropy.

3.2 Huffman Coding

(Optimum prefix code )

1. Symbols that occur more frequently (higher probability) have shorter codewords than symbols that occur less frequently.

2. The two symbols that occur least frequently have codewords of the same length.

Average bit rate = Σ_{i=1}^{N} ni Pi (bits/symbol)

N = number of symbols
ni = codeword length (bits) for symbol i
Pi = probability of symbol i
Σ_{i=1}^{N} Pi = 1; the entropy is the lower bound.

3. Codewords corresponding to the two lowest-probability symbols differ only in the last bit.

Two least-probability symbols → codewords:
(m * 0) = 11010010
(m * 1) = 11010011
where '*' denotes concatenation and m = 1101001.

Ex. 3.2.1 Design of Huffman Code

Given:
ai / P(ai)
a1 / .2
a2 / .4
a3 / .2
a4 / .1
a5 / .1

Rearranged in descending probability (for the VLC):
ai / P(ai)
a2 / .4
a1 / .2
a3 / .2
a4 / .1
a5 / .1

Σ_{i=1}^{5} P(ai) = 1

H = Entropy = -Σ_{i=1}^{5} P(ai) log2 P(ai) = 2.122 bits/symbol = minimum theoretical average bit rate.

(P(ai) is either given or estimated experimentally.)

Huffman tree (see Fig. 3.2 / p.46): merge a4(.1) + a5(.1) = (.2); then (.2) + a3(.2) = (.4); then (.4) + a1(.2) = (.6); then (.6) + a2(.4) = (1.0) at the root.

Resulting code:
a2 / 1
a1 / 01
a3 / 000
a4 / 0010
a5 / 0011

The Huffman code is a prefix code, hence uniquely decodable.


Average bit length = [2(0.2) + 1(0.4) + 3(0.2) + 4(0.1) + 4(0.1)] = 2.2 bits/symbol

H = Entropy = -Σ_{i=1}^{5} Pi log2 Pi = 2.122 bits/symbol

Ex. 3.2.1

Average bit length = Σ_{i=1}^{5} P(ai) n(ai) = 2.2 bits/symbol

Redundancy = 2.2 - 2.122 = 0.078 bits/symbol
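As a sketch of the construction (the standard min-heap formulation, not a transcription of the book's figures), the following builds a Huffman code for the Ex. 3.2.1 probabilities. Depending on how ties are broken, the codewords and even their lengths can differ from Fig. 3.2 (this particular tie-breaking happens to give the 2, 2, 2, 3, 3 lengths of the minimum variance code discussed next), but the 2.2 bits/symbol average is the same:

```python
# Sketch of Huffman code construction for Ex. 3.2.1 using a min-heap.
import heapq
import itertools
from math import log2

def huffman_code(probs):
    """probs: dict symbol -> probability; returns dict symbol -> codeword."""
    counter = itertools.count()            # tie-breaker so heap tuples always compare
    heap = [(p, next(counter), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        p0, _, grp0 = heapq.heappop(heap)  # two least probable groups
        p1, _, grp1 = heapq.heappop(heap)
        for s in grp0:                     # prepend one more bit to every symbol
            code[s] = "0" + code[s]        # in each merged group
        for s in grp1:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p0 + p1, next(counter), grp0 + grp1))
    return code

P = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
code = huffman_code(P)
avg = sum(P[s] * len(code[s]) for s in P)
H = -sum(p * log2(p) for p in P.values())
print(code)   # with this tie-breaking: {'a1': '00', 'a2': '11', 'a3': '01', 'a4': '100', 'a5': '101'}
print(round(avg, 3), round(H, 3), round(avg - H, 3))   # 2.2  2.122  0.078
```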

3.2.1 Minimum Variance Huffman Codes (See Fig. 3.3)

Always put the combined letter as high in the sorted list as possible when building the Huffman tree.

Fig. 3.4: minimum variance Huffman tree

Average bit length = 2.2 bits/symbol

P(ai) / ai / Code
.2 / a1 / 10
.4 / a2 / 00
.2 / a3 / 11
.1 / a4 / 010
.1 / a5 / 011

Buffer design becomes much simpler. (See pages 44-45.)

ai / P (ai)
a1 / .2
a2 / .4
a3 / .2
a4 / .1
a5 / .1

Min. Variance Huffman Code

Place the combined letter as high as possible

3.2.1 Minimum Variance Huffman Code

(Fig. 3.2 / p.3 - 5) & (Fig. 3.4 / p.3 - 7)

Both give the same average bit length (2.2 bits/symbol).

Their variances are different.

VLC → channel

Assume 10,000 symbols/sec (i.e., an average bit rate of 22,000 bits/sec).

Minimum Variance Huffman Code

a2 / 00
a1 / 10
a3 / 11
a4 / 010
a5 / 011


Buffer : To smooth out the variations in the bit generation rate.

Huffman code (Fig. 3.2):
a1 / 01
a2 / 1
a3 / 000
a4 / 0010
a5 / 0011

Minimum variance code (Fig. 3.4):
a1 / 10
a2 / 00
a3 / 11
a4 / 010
a5 / 011

Assume strings of a4's and a5's are transmitted for several seconds (10,000 symbols/sec):

/ Code from Fig. 3.2 / Code from Fig. 3.4 (min. variance)
Generates / 40,000 bps / 30,000 bps
Buffer must store / 18,000 bps / 8,000 bps

Assume a string of a2's is transmitted for several seconds:

Generates / 10,000 bps / 20,000 bps
Deficit to make up / 12,000 bps / 2,000 bps

Buffer design is simpler with the minimum variance Huffman code.
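A small sketch of the buffer arithmetic above; the symbol rate, channel rate, and code tables are from the example, while the helper name is illustrative:

```python
# Sketch of the buffer arithmetic: bits generated per second when the source
# emits 10,000 symbols/sec of one kind, versus the 22,000 bps channel.

SYMBOLS_PER_SEC = 10_000
CHANNEL_BPS = 22_000

huffman      = {"a1": "01", "a2": "1",  "a3": "000", "a4": "0010", "a5": "0011"}
min_variance = {"a1": "10", "a2": "00", "a3": "11",  "a4": "010",  "a5": "011"}

def generated_bps(code, symbol):
    return SYMBOLS_PER_SEC * len(code[symbol])

for name, code in [("Fig. 3.2", huffman), ("Fig. 3.4 (min var.)", min_variance)]:
    burst = generated_bps(code, "a4")   # long run of a4's (or a5's)
    idle  = generated_bps(code, "a2")   # long run of a2's
    print(name,
          "| a4 run:", burst, "bps, buffer stores", burst - CHANNEL_BPS, "bps",
          "| a2 run:", idle, "bps, deficit", CHANNEL_BPS - idle, "bps")
```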

Variable bit rate Channel

ai / P (ai)
a1 / .2
a2 / .4
a3 / .2
a4 / .1
a5 / .1

Given → rearranged in descending probability (VLC) → Huffman tree (page 3-5) → rearranged code:

ai / P (ai)
a2 / .4
a1 / .2
a3 / .2
a4 / .1
a5 / .1

Ex. 3.2.1 (Fig. 3.2 / p.46)

Average bit length = [2(0.2) + 1(0.4) + 3(0.2) + 4(0.1) + 4(0.1)] = 2.2 bits/symbol

Σ_{i=1}^{5} P(ai) = 1; H = Entropy = -Σ_{i=1}^{5} P(ai) log2 P(ai) = 2.122 bits/symbol

Redundancy = 2.2 - 2.122 = 0.078 bits/symbol

Minimum variance Huffman tree code (MVHC), rearranged:

a2 / 00
a1 / 10
a3 / 11
a4 / 010
a5 / 011
a1 / 10
a2 / 00
a3 / 11
a4 / 010
a5 / 011

p.3-7a

Average bit rate = 2.2 bits/symbol. Assuming 10,000 symbols/sec, the channel rate is 22,000 bps. The minimum variance Huffman code (MVHC) makes buffer design easier.

3.2.2 Optimality of Huffman Codes

(VLC) H(s) = -Σ_i Pi log2 Pi

3.2.3 Length of Huffman Codes

H(s) ≤ l < H(s) + 1    (3.1)

l = average code length of the Huffman code

H(s) = -Σ_i P(ai) log2 P(ai) = entropy (minimum theoretical average bit rate)

(Huffman tree: symbols with high probability get short codewords, and vice versa.)

The Huffman code is a prefix code, which guarantees unique decodability.

(Page 48) Tighter bounds:

H(s) ≤ l_H < H(s) + Pmax,            for Pmax ≥ 0.5
H(s) ≤ l_H < H(s) + Pmax + 0.086,    for Pmax < 0.5

Pmax = largest probability of any symbol. See [80].

When the alphabet size is small and the probabilities P(ai) are skewed, Pmax can be large and Huffman coding becomes inefficient.

Facsimile (binary images): white dot → P = 0.8, black dot → P = 0.2.

Typical resolutions: 200, 400, 600, 1000 dpi (dpi: dots per inch).
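A quick check of the bound for the facsimile probabilities above (an illustrative sketch; with only two symbols, a symbol-by-symbol Huffman code must spend 1 bit per pixel):

```python
# Sketch: compare the entropy of the pixel model (white 0.8, black 0.2) with
# the 1 bit/pixel of a two-symbol Huffman code, and with the bound
# H(s) <= l < H(s) + Pmax (here Pmax = 0.8 >= 0.5).
from math import log2

p_white, p_black = 0.8, 0.2
H = -(p_white * log2(p_white) + p_black * log2(p_black))
l_huffman = 1.0                      # two symbols: codewords '0' and '1'
print(round(H, 4))                   # ~0.7219 bits/pixel
print(H <= l_huffman < H + p_white)  # True, but the ~0.28 bit/pixel redundancy is large
```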

3.2.4 Extended Huffman Codes

p.49

Ex. 3.2.3

H(s) ≤ R ≤ H(s) + 1/n    (3.7)

Alphabet of m symbols: (a1, a2, a3, …, am)

Group and code 'n' symbols at a time.

Extended alphabet size = m^n

One codeword for every block of n symbols.

R = rate = # of bits/symbol.

Ex. 3.2.3 (contd.), p.49. Let n = 2.

Single-symbol Huffman code (m = 3): a1 (.8) → 0, a3 (.18) → 10, a2 (.02) → 11.
Avg. codeword length = 1.2 bits/symbol; entropy = 0.816 bits/symbol; redundancy = 0.384 bits/symbol.

Extended alphabet (pairs), size = m^n = 3^2 = 9. See Table 3.11 (p.31) for the code:

Symbol / Probability / Code
a1a1 / .64 / 0
a1a2 / .016 / 10101
a1a3 / .144 / 11
a2a1 / .016 / 101000
a2a2 / .0004 / 10100101
a2a3 / .0036 / 1010011
a3a1 / .1440 / 100
a3a2 / .0036 / 10100100
a3a3 / .0324 / 1011

Avg. codeword length for the extended code = 1.7228 bits per two-symbol block, i.e. 1.7228/2 = 0.8614 bits/symbol.

m = 3, n = 3: symbols (a1, a2, a3); blocks a1a1a1, a1a1a2, …, a3a3a3; extended alphabet size = 3^3 = 27.

By coding blocks of symbols together, the redundancy of Huffman codes can be reduced (a numerical check for the n = 2 case is sketched after the listing below). However, the extended alphabet size grows exponentially and Huffman coding becomes impractical.

n = 3 blocks: a1a1a1, a1a1a2, a1a1a3, a1a2a1, a1a2a2, …, a3a3a3

For m = 3, n = 4: extended alphabet size = 3^4 = 81.
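A sketch verifying the n = 2 numbers above from the pair probabilities and the codeword lengths in the table (illustrative, not from the text):

```python
# Sketch: verify the extended-code numbers using the pair probabilities and the
# codeword lengths from the table (P(a1)=0.8, P(a2)=0.02, P(a3)=0.18).
from math import log2

P = {"a1": 0.8, "a2": 0.02, "a3": 0.18}
code = {("a1", "a1"): "0",      ("a1", "a2"): "10101",    ("a1", "a3"): "11",
        ("a2", "a1"): "101000", ("a2", "a2"): "10100101", ("a2", "a3"): "1010011",
        ("a3", "a1"): "100",    ("a3", "a2"): "10100100", ("a3", "a3"): "1011"}

H = -sum(p * log2(p) for p in P.values())                      # ~0.816 bits/symbol
avg_pair = sum(P[x] * P[y] * len(cw) for (x, y), cw in code.items())
print(round(H, 4), round(avg_pair, 4), round(avg_pair / 2, 4))
# ~0.8159  1.7228  0.8614 -> much closer to the entropy than the 1.2 bits/symbol
# of the single-symbol Huffman code
```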

Huffman coding variations:

Truncated Huffman coding
Modified Huffman coding
Adaptive Huffman coding (Section 3.4)
Non-binary Huffman codes (e.g., ternary code: 0, 1, 2)

3.8.2 Text Compression (Page 74) II-Edition

Using Huffman coding, the file size dropped from 70,000 bytes to 43,000 bytes. Higher compression can be obtained by exploiting the structure in the text, discussed in Chapters 5 & 6 (LZ77, LZ78, LZW, etc.).

3.8.3 Audio Compression (Page 75)

CD-quality audio: fs = 44.1 kHz, 16-bit PCM (2^16 = 65,536 levels), stereo (two audio channels).

Bit rate = 2 × 16 × 44.1 kbps = 1411.2 kbps.

Estimated compressed file size = (entropy) × (# of samples in the file).
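A small sketch of this estimate; the sample values below are hypothetical (in practice the histogram comes from the actual audio file):

```python
# Sketch of the compressed-size estimate above:
# (first-order entropy of the samples) x (number of samples), converted to bytes.
from collections import Counter
from math import log2

def estimated_compressed_bytes(samples):
    counts = Counter(samples)
    n = len(samples)
    H = -sum(c / n * log2(c / n) for c in counts.values())   # bits/sample
    return H * n / 8                                          # bytes

samples = [0, 0, 1, -1, 0, 2, 1, 0, -2, 0]   # hypothetical 16-bit PCM sample values
print(estimated_compressed_bytes(samples), "vs", 2 * len(samples), "bytes uncompressed")
```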

Huffman coding programs (p.74): huff_enc, huff_dec, adap_huff

Lossless schemes: JPEG Lossless, LOCO, JPEG-LS, GIF, PNG, FELICS, JPEG 2000, H.264 Intra, JPEG-XR (HD Photo), FLAC, Apple's ALAC (ALE), Monkey's Audio, MPEG-4 ALS

Entropy

Group of symbols ai, i = 1, 2, …, N

P(ai) = probability of occurrence of symbol ai

Σ_{i=1}^{N} P(ai) = 1

(Probability Distribution) Given or Developed

Shannon’s fundamental theorem

Entropy:

H = -Σ_{i=1}^{N} P(ai) log2 P(ai)    (p.22)

= minimum (theoretical) bit rate at which the group of symbols can be transmitted, in bits/symbol.

The Huffman code is a variable-word-length (VLC) code: a practical code that comes very close to the entropy and thus contributes to compression.

P (a1) / 1/2
P (a2) / 1/8
P (a3) / 1/8
P (a4) / 1/4

N = 4

ai, i = 1,2,3,4

H = Entropy = -Σ_{i=1}^{4} P(ai) log2 P(ai)

= (1/2)(1) + 2(1/8)(3) + (1/4)(2)

= (1/2) + (3/4) + (1/2) = 1.75 bits/symbol

Huffman Code

P (a1) / 1/2
P (a4) / 1/4
P (a2) / 1/8
P (a3) / 1/8

(Huffman Tree) Uniquely decodable

a1 / 0
a2 / 110
a3 / 111
a4 / 10
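Check (using the code table above): average codeword length = (1/2)(1) + (1/4)(2) + (1/8)(3) + (1/8)(3) = 1.75 bits/symbol = H, i.e., zero redundancy, consistent with all probabilities being negative integer powers of two.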