8/12/2019 Week 03 - Information Sources and Source Coding
Dr. M. Arif Wahla
EE Dept
Military College of Signals, National University of Sciences & Technology (NUST), Pakistan
Class webpage: http://learn.mcs.edu.pk/courses/
Lecture #3
Arithmetic Coding
Lempel Ziv Coding
Huffman Coding: The Retired Champion
For large n, the implementation of Huffman coding can easily become unwieldy or unduly restrictive. The problems include:
The size of the Huffman code table is $q^n$, an exponential increase in memory and computational requirements.
The code table needs to be transmitted to the receiver.
The source statistics are assumed stationary. If they change, an adaptive scheme is required which re-estimates the probabilities and recalculates the Huffman code table.
The solution to these problems is arithmetic coding.
Arithmetic Coding
Consider the N-length source message $s_{i1}, s_{i2}, \ldots, s_{iN}$, where $\{s_i : i = 1, 2, \ldots, q\}$ are the source symbols and $s_{ij}$ indicates that the j-th character in the message is the source symbol $s_i$. Arithmetic coding assumes that the probabilities of the source symbols are available.
The goal of arithmetic coding is to assign to the given source message a unique interval along the unit number line (probability line) [0, 1]. The length of the interval equals the probability of the message, and its position on the number line is given by the cumulative probability of the message.
Arithmetic coding completely bypasses the idea of
replacing an input symbol with a specific code.
Instead, it takes a stream of input symbols and replaces it
with a single floating point output number.
The longer (and more complex) the message, the more bits
are needed in the output number.
It was not until recently that practical methods were found
to implement this on computers with fixed sized registers.
Example
Consider the source alphabet $A = [a_0, a_1, a_2, a_3]$ with probabilities $P_A = \{0.5, 0.3, 0.15, 0.05\}$. The Huffman codewords $C(a_i)$ and the arithmetic coding intervals $I_i$ are:

Symbol   p_i     C(a_i)   Interval I_i
a_0      0.5     0        [0, 0.5]
a_1      0.3     10       [0.5, 0.8]
a_2      0.15    110      [0.8, 0.95]
a_3      0.05    111      [0.95, 1]

Suppose each $a_i \in A$ is encoded into a real number $\beta_i$ lying in the interval $0 \le \beta_i < 1$.
Suppose we encode the sequence $s_0 s_1 s_2 \ldots$ by adding scaled versions of the $\beta_i$ as
$b = \beta_0 + \Delta_1 \beta_1 + \Delta_2 \beta_2 + \ldots$
where $\beta_i$ is the code number corresponding to $s_i$ and $\Delta_i$ is a monotonically decreasing scale, also lying in the interval between 0 and 1.
If we pick the $\Delta_i$ in such a way that it is possible to later decompose $b$ back into the original sequence, the code can be decoded.
Arithmetic Coding Process
Construct the code interval to represent a block of symbols as
$I_b = [L, H], \quad L \le b \le H$
Any convenient $b$ within this range is a suitable codeword representing the entire block of symbols.
Arithmetic Coding Algorithm
Assume each $a_i \in A$ has been assigned an interval $I_i = [S_{l_i}, S_{h_i}]$.
initialize $j = 0$, $L_j = 0$ and $H_j = 1$
REPEAT
    $\Delta = H_j - L_j$
    read next $a_i$
    $L_{j+1} = L_j + \Delta \cdot S_{l_i}$
    $H_{j+1} = L_j + \Delta \cdot S_{h_i}$
    $j = j + 1$
UNTIL all $a_i$ have been encoded
Example 1.6.2:
Encode the sequence $a_1 a_0 a_0 a_3 a_2$ with the intervals

p_0 = 0.5     I_0 = [0, 0.5]
p_1 = 0.3     I_1 = [0.5, 0.8]
p_2 = 0.15    I_2 = [0.8, 0.95]
p_3 = 0.05    I_3 = [0.95, 1]

j   a_i   L_j       H_j     Δ         L_{j+1}   H_{j+1}
______________________________________________________________
0   a_1   0         1       1         0.5       0.8
1   a_0   0.5       0.8     0.3       0.5       0.65
2   a_0   0.5       0.65    0.15      0.5       0.575
3   a_3   0.5       0.575   0.075     0.57125   0.575
4   a_2   0.57125   0.575   0.00375   0.57425   0.5748125

Any $b$ within the final interval will suffice for a codeword; one choice is $b = 0.5748125$.
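The interval-narrowing loop of the encoding algorithm can be sketched in Python as follows. This is only an illustrative sketch: the names `INTERVALS` and `arithmetic_encode` are assumptions made here, and exact rational arithmetic (`fractions.Fraction`) is used instead of floating point so that the end points match the worked table exactly.

```python
from fractions import Fraction

# Interval end points (S_l, S_h) per symbol index, taken from the example.
# The dictionary name and layout are assumptions made for this sketch.
INTERVALS = {
    0: (Fraction(0), Fraction(1, 2)),
    1: (Fraction(1, 2), Fraction(4, 5)),
    2: (Fraction(4, 5), Fraction(19, 20)),
    3: (Fraction(19, 20), Fraction(1)),
}

def arithmetic_encode(symbols, intervals=INTERVALS):
    """Return the final (L, H) interval for a sequence of symbol indices."""
    low, high = Fraction(0), Fraction(1)
    for i in symbols:
        delta = high - low            # current interval width
        s_l, s_h = intervals[i]
        # Both new end points are computed from the old L.
        low, high = low + delta * s_l, low + delta * s_h
    return low, high

# Sequence a1 a0 a0 a3 a2 from Example 1.6.2.
low, high = arithmetic_encode([1, 0, 0, 3, 2])
print(float(low), float(high))
```

Any value between the two returned end points can serve as the codeword; a practical coder would also have to deal with finite register width, which this sketch ignores.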
Decoding Arithmetic Codes - Algorithm
Given the code value $b$, the decoding procedure is as follows:
initialize $L = 0$, $H = 1$ and $\Delta = H - L$
REPEAT
    find $i$ such that $(b - L)/\Delta \in I_i$
    OUTPUT symbol $a_i$
    $H = L + \Delta \cdot S_{h_i}$
    $L = L + \Delta \cdot S_{l_i}$
    $\Delta = H - L$
UNTIL last symbol has been decoded
Tricks: Use a special stop symbol for sequences of variable length. Pay attention to precision in calculations.
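Example 1.6.3
Decode $b = 0.5748125$ with the intervals

p_0 = 0.5     I_0 = [0, 0.5]
p_1 = 0.3     I_1 = [0.5, 0.8]
p_2 = 0.15    I_2 = [0.8, 0.95]
p_3 = 0.05    I_3 = [0.95, 1]

Solution
At each step find $i$ such that $(b - L)/\Delta \in I_i$.

L         H       Δ         (b-L)/Δ     I_i   a_i   next H      next L    next Δ
_________________________________________________________________________________
0         1       1         0.5748125   I_1   a_1   0.8         0.5       0.3
0.5       0.8     0.3       0.249375    I_0   a_0   0.65        0.5       0.15
0.5       0.65    0.15      0.49875     I_0   a_0   0.575       0.5       0.075
0.5       0.575   0.075     0.9975      I_3   a_3   0.575       0.57125   0.00375
0.57125   0.575   0.00375   0.95        I_2   a_2   0.5748125   0.57425   0.0005625

At the last step $(b - L)/\Delta = 0.95$ is the shared end point of $I_2$ and $I_3$, because $b$ was chosen as the upper end of the final interval; it is read as $a_2$ here. A value of $b$ strictly inside the final interval avoids this tie.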
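The decoding loop can be sketched in Python in the same spirit. The sketch makes two assumptions that are not in the slides: the intervals are treated as half-open, $[S_l, S_h)$, so that every value maps to exactly one symbol, and the number of symbols is passed in explicitly instead of using a stop symbol. The names `INTERVALS` and `arithmetic_decode` are hypothetical.

```python
from fractions import Fraction

# Interval end points per symbol index, taken from the example; treated
# here as half-open [S_l, S_h), which is an assumption of this sketch.
INTERVALS = {
    0: (Fraction(0), Fraction(1, 2)),
    1: (Fraction(1, 2), Fraction(4, 5)),
    2: (Fraction(4, 5), Fraction(19, 20)),
    3: (Fraction(19, 20), Fraction(1)),
}

def arithmetic_decode(b, n_symbols, intervals=INTERVALS):
    """Decode n_symbols symbol indices from the code value b."""
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(n_symbols):
        delta = high - low
        x = (b - low) / delta         # position of b inside [L, H)
        for i, (s_l, s_h) in intervals.items():
            if s_l <= x < s_h:
                break
        else:
            raise ValueError("code value lies outside every interval")
        out.append(i)
        # Both new end points are computed from the old L.
        low, high = low + delta * s_l, low + delta * s_h
    return out

# Midpoint of the final interval [0.57425, 0.5748125] of the example.
decoded = arithmetic_decode(Fraction("0.57453125"), 5)
print(decoded)
```

With the half-open convention the upper end point 0.5748125 itself would decode its last symbol as $a_3$ rather than $a_2$, which is why the sketch uses the midpoint of the final interval.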
Dictionary Codes and Lempel-Ziv Coding
Huffman codes require knowledge of the probability distribution of the source symbols, which may not always be available.
Dictionary codes dynamically construct their own coding/decoding table on the fly by looking at the present data stream. The probability distribution does not need to be known.
Strings are coded instead of symbols.
These codes are only efficient for long files.
LZ codes belong to a practical class of dictionary codes.
Lempel-Ziv Coding
Lempel-Ziv codes suffer no significant decoding delay at the receiver.
Prior knowledge of the decoding table is not required. The required information is transmitted within the message.
Huffman codes assign variable-length codes to fixed-size symbols, whereas LZ codes encode strings of variable length with fixed-size codes.
In this sense LZ coding is a mirror image of Huffman coding.
The LZ algorithm which we will consider in our course is a slight modification of the original LZW algorithm.
Initializing LZ Algorithm
To define the structure of the dictionary:
Each entry $(n, a_i)$ in the dictionary is given an address $m$.
$a_i$ is a symbol drawn from the source $A$ and $n$ is a pointer to another location.
$n$ is represented by a fixed-length word of $b$ bits.
The dictionary contains at most $2^b$ entries.
The algorithm is initialized by constructing the first M+1 entries.
The entry at address 0 is a null symbol. It is used to let the decoder know the end of a string.
Pointer $n = 0$ for the first M+1 entries; it points to the null entry at address 0.
$m = M+1$ points to the next blank location in the dictionary.

Address m   n   a_i
0           0   null
1           0   a_0
2           0   a_1
...
m           0   a_{m-1}
...
M           0   a_{M-1}
LZ Algorithm
Initialize pointer $n = 0$ and $m = M+1$
1. Fetch next source symbol $a_i$, where $i = 0, 1, 2, \ldots, M-1$.
2. If the ordered pair $(n, a_i)$ is already in the dictionary then
       next n = dictionary address of entry $(n, a_i)$;
   else
       transmit n
       create new dictionary entry $(n, a_i)$ at dictionary address m
       m = m + 1
       n = dictionary address of entry $(0, a_i)$;
3. Return to step 1.
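The three steps of the LZ algorithm can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name `lz_encode`, the use of a Python dict keyed by the pair (n, a_i), and returning the pointer still pending at the end of the input are all choices made here; the slides do not say how the last pending string is flushed.

```python
def lz_encode(source, alphabet=(0, 1)):
    """Encode a symbol sequence with the pointer/symbol dictionary scheme.

    Returns (transmitted pointers, dictionary, pointer still pending).
    """
    # Addresses 1..M hold the single symbols with pointer 0.
    # Address 0 is the null entry and is not stored explicitly.
    table = {(0, sym): addr for addr, sym in enumerate(alphabet, start=1)}
    m = len(alphabet) + 1      # next blank dictionary address (M + 1)
    n = 0                      # current pointer
    transmitted = []
    for a in source:
        if (n, a) in table:
            # Step 2, "then" branch: extend the current string.
            n = table[(n, a)]
        else:
            # Step 2, "else" branch: emit pointer, add entry, restart at a.
            transmitted.append(n)
            table[(n, a)] = m
            m += 1
            n = table[(0, a)]
    return transmitted, table, n

# Binary sequence used in the worked encoding table of these slides.
codes, table, pending = lz_encode([1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1])
print(codes, pending)
```

The dict is keyed by (n, a_i) so that the membership test of step 2 is a single lookup; the address-to-entry view shown in the slides is simply the inverse mapping.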
Example: encode the binary source sequence 1 1 0 0 0 1 0 1 1 0 0 1 0 1. The alphabet is {0, 1}, so the dictionary is initialized with addresses 0 to 2 and the first blank address is m = 3.

Initial dictionary:
Address m   n   a_i
0           0   null
1           0   0
2           0   1

Encoding steps:
Present n   Source a_i   Present m   Transmit n   Next n   Dic entry (n, a_i)
0           1            3           -            2        -
2           1            3           2            2        2, 1
2           0            4           2            1        2, 0
1           0            5           1            1        1, 0
1           0            6           -            5        -
5           1            6           5            2        5, 1
2           0            7           -            4        -
4           1            7           4            2        4, 1
2           1            8           -            3        -
3           0            8           3            1        3, 0
1           0            9           -            5        -
5           1            9           -            6        -
6           0            9           6            1        6, 0
1           1            10          1            2        1, 1

Resulting dictionary:
Address m   n   a_i
0           0   null
1           0   0
2           0   1
3           2   1
4           2   0
5           1   0
6           5   1
7           4   1
8           3   0
9           6   0
10          1   1
Lempel-Ziv Decoding
The decoder must construct a dictionary identical to the encoder's.
The encoder does not transmit as many codewords as it has source symbols.
The decoding operation goes as follows:
Reception of any codeword means that a new dictionary entry must be constructed.
Pointer $n$ for this new entry is the same as the received codeword.
Source symbol $a_i$ for this entry is not yet known, because it is the root symbol of the next string (not yet transmitted). Such an entry is called a partial entry at address $m$.
The received codeword can fill in the missing symbol $a_i$ of the previous entry at address $m-1$.
The received codeword also lets the decoder output the source string associated with pointer $n$.
The root symbol is the first symbol of the string; it belongs to the entry having pointer 0.
The last symbol of the string is the symbol belonging to the entry at the address pointed to by the pointer of the last updated entry.
The address $m = m+1$ should be updated properly, just after completing the entry at address $m$.
If pointer $n$ points to an entry having pointer $n = 0$, then the decoded string is the single symbol of that entry.
If pointer $n$ points to an entry having a nonzero pointer, then this nonzero pointer connects us to another address. This continues until we reach a zero pointer.
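The decoding rules above can be sketched in Python as follows. The function name `lz_decode` and the dict layout (address mapped to a (pointer, symbol) pair, with `None` marking the unknown symbol of a partial entry) are assumptions made for this sketch. The root symbol is found first, so that a codeword pointing at the partial entry just created can still be decoded.

```python
def lz_decode(codes, alphabet=(0, 1)):
    """Rebuild the dictionary and the source string from received pointers."""
    table = {0: (0, None)}                 # address 0: null entry
    for addr, sym in enumerate(alphabet, start=1):
        table[addr] = (0, sym)             # single symbols, pointer 0
    m = len(table)                         # next blank address (M + 1)
    out = []
    partial = None                         # address of the open partial entry
    for n in codes:
        # Follow pointers from n down to the entry with pointer 0:
        # its symbol is the root (first) symbol of the string.
        addr = n
        while table[addr][0] != 0:
            addr = table[addr][0]
        root = table[addr][1]
        # The root symbol completes the previous partial entry.
        if partial is not None:
            table[partial] = (table[partial][0], root)
        # Walk the pointer chain again, collecting symbols last to first.
        string = []
        addr = n
        while addr != 0:
            pointer, sym = table[addr]
            string.append(sym)
            addr = pointer
        out.extend(reversed(string))
        # Open a new partial entry whose pointer is the received codeword.
        table[m] = (n, None)
        partial = m
        m += 1
    return out, table

# Codewords transmitted in the worked encoding example of these slides.
message, table = lz_decode([2, 2, 1, 5, 4, 3, 6, 1])
print(message)
```

The entry at address 10 stays partial in this run because its symbol would only be supplied by the root of a further codeword.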
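Decoding the received codewords 2, 2, 1, 5, 4, 3, 6, 1 starts from the initial dictionary (0: 0, null; 1: 0, 0; 2: 0, 1) and proceeds as follows:

Received n   Partial entry created   Entry completed   Decoded string
2            3: 2, ?                 -                 1
2            4: 2, ?                 3: 2, 1           1
1            5: 1, ?                 4: 2, 0           0
5            6: 5, ?                 5: 1, 0           00
4            7: 4, ?                 6: 5, 1           10
3            8: 3, ?                 7: 4, 1           11
6            9: 6, ?                 8: 3, 0           001
1            10: 1, ?                9: 6, 0           0

The entry at address 10 is completed when the next codeword arrives. The decoder's dictionary then matches the encoder's:

Address m   n   a_i
0           0   null
1           0   0
2           0   1
3           2   1
4           2   0
5           1   0
6           5   1
7           4   1
8           3   0
9           6   0
10          1   1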
Huffman Coding Efficiency
Huffman coding requires knowledge of the a priori probabilities $p_i$; otherwise we have to determine them through estimation, such as $\hat{p}_i = p_i + e_i$.
For a source with an alphabet of M symbols, the average number of bits per codeword for the two cases will be
2
2
2
2 2
1 1 and
1 1
( )
practically , so
1
Let the mean squared error be and using a Lagrange multiplier
1 ( )
i i i i
i i
i i i i i
i i
i i
i i
i
i i
i i
L p l L p lM M
L L L p l l e lM M
l l
L e lM
L l lM
2
0L