exercise in the previous class
p: the probability that symbols are delivered correctly
C: 00 → 00000, 01 → 01011, 10 → 10101, 11 → 11110
What is the threshold of p with which using C is good/bad?
without coding, correct probability = $p^2$ = A
with coding, correct probability = $p^5 + 5p^4(1 - p)$ = B (all five symbols correct, or exactly one symbol in error, which C corrects)
B > A if
\[ p^5 + 5p^4(1 - p) - p^2 = p^2 (p - 1)(-4p^2 + p + 1) > 0 \]
(roots = –0.39, 0, 0.64, 1)
Using C is better if 0.64 < p < 1.
[figure: plot of B – A against p for 0 ≤ p ≤ 1]
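The threshold can also be checked numerically. The following is a minimal sketch, assuming a plain bisection search on B – A between p = 0.5 and p = 0.9; the function names are illustrative and not part of the slides.

```python
def prob_uncoded(p):
    """Probability that two uncoded symbols both arrive correctly (A)."""
    return p ** 2


def prob_coded(p):
    """Probability that at most one of five coded symbols is wrong (B)."""
    return p ** 5 + 5 * p ** 4 * (1 - p)


def find_threshold(lo=0.5, hi=0.9, iterations=60):
    """Bisection on B - A, which is negative at lo and positive at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if prob_coded(mid) - prob_uncoded(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


threshold = find_threshold()
print(round(threshold, 2))
```

The result should agree with the root (1 + √17)/8 of –4p² + p + 1, the value rounded to 0.64 above.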
what did we learn?
motivation and models of communication channels
simple examples of linear codes
  even parity check codes
    (a1, …, ak) → (a1, …, ak, p), p = a1 + … + ak mod 2
    error detection only
  horizontal and vertical parity check (2D) code
    (a1, a2, a3, a4) → (a1, a2, a3, a4, x1, x2, y1, y2, z)
      a1 a2 | x1
      a3 a4 | x2
      y1 y2 | z
    one-bit error correcting, two-bit error detecting
errata: additional remark (cont'd)
We expect that 2D codes detect all two-bit errors.
If we don't use the parity of parity, then...
[figure as originally shown: the codewords 000/000/000 and 000/011/010, the received words 000/010/000 and 000/010/010 labelled "1-bit err.", and arrows "to the nearest codeword"; the additional panel 000/010/000 with its arrow "to the nearest codeword" is marked ×]
some two-bit errors are not detected; instead, they are decoded to a wrong codeword.
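additional remark (cont'd), corrected
We expect that 2D codes detect all two-bit errors.
If we don't use the parity of parity, then...

  codeword     1-bit err.    1-bit err.    codeword
   0 0 0         0 0 0         0 0 0         0 0 0
   0 0 0         0 1 0         0 1 0         0 1 1
   0 0 0         0 0 0         0 1 0         0 1 0

000/010/000 is one bit away from the first codeword and is decoded to it (to the nearest codeword).
000/010/010 is one bit away from the second codeword and is decoded to it (to the nearest codeword), although it is only two bits away from the first codeword.
some two-bit errors are not detected; instead, they are decoded to a wrong codeword.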
today’s class
linear codes
  definition
  encoding
  decoding (error detection and correction)
(one of) definition(s) of linear codes
a binary code C is linear
⇔ C is a vector space
⇔ for any u, v ∈ C, we have u + v ∈ C
parity check code with length 3:

   +  | 000 011 101 110
  ----+----------------
  000 | 000 011 101 110
  011 | 011 000 110 101
  101 | 101 110 000 011
  110 | 110 101 011 000

how about parity check codes with other lengths?
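For small lengths, the closure condition u + v ∈ C can be checked by brute force. This is a minimal sketch, assuming codewords are represented as tuples of 0/1; `parity_check_code` and `is_linear` are illustrative names, not taken from the slides.

```python
from itertools import product


def parity_check_code(k):
    """All codewords (x1, ..., xk, p) with p = x1 + ... + xk mod 2."""
    return {bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=k)}


def is_linear(code):
    """True if the bitwise sum mod 2 of any two codewords is a codeword."""
    return all(tuple(a ^ b for a, b in zip(u, v)) in code
               for u in code for v in code)


for k in range(1, 6):
    print(k, is_linear(parity_check_code(k)))
```

A check like this only covers the lengths that were tried; the general statement needs the proof on the next slide.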
linearity of parity check codes
C: parity check code for length k data (x1, ..., xk)
... codewords in C are (x1, ..., xk, p) with p = x1 + ... + xk
Theorem: parity check codes are linear
Proof: confirm that u + v ∈ C for any u, v ∈ C.
u = (u1, ..., uk, p), (p = u1 + ... + uk),
v = (v1, ..., vk, q), (q = v1 + ... + vk),
u + v = (u1 + v1, ..., uk + vk, p + q).
because p + q = u1 + ... + uk + v1 + ... + vk = (u1 + v1) + ... + (uk + vk),
p + q is a valid parity bit for (u1 + v1, ..., uk + vk).
⇒ u + v is a codeword
Theorem: 2D codes are linear (proof omitted)
computation of parity bits
parity check codes, 2D codes...
for information bits (x1, x2, ..., xk), parity bits are defined by linear equations
p = a1x1 + a2x2 + ... + akxk (a1, a2, ..., ak ∈ {0, 1})
parity check code: (x1, ..., xk, p) with p = x1 + ... + xk
2D code (information bits arranged in b rows of a bits each): (x1, ..., xab, p1, ..., pb, q1, ..., qa, r) with
\[ p_i = \sum_{j=1}^{a} x_{(i-1)a+j}, \qquad q_j = \sum_{i=1}^{b} x_{(i-1)a+j}, \qquad r = \sum_{i=1}^{ab} x_i \]
In both codes, each parity bit is the sum of some of the information bits.
linear parity bits make the code linear
Lemma: for a linear equation f(x1, x2, ..., xk) = a1x1 + a2x2 +...+ akxk,
f(u1, u2, ..., uk) + f(v1, v2, ..., vk) = f(u1 + v1, u2 + v2, ..., uk + vk)
Theorem: If parity bits are defined by linear equations, then C is linear.
     codeword (u1, u2, ..., uk, ..., f(u1, u2, ..., uk), ...)
  +) codeword (v1, v2, ..., vk, ..., f(v1, v2, ..., vk), ...)
  = (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1, u2, ..., uk) + f(v1, v2, ..., vk), ...)
  = (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1 + v1, u2 + v2, ..., uk + vk), ...)
  = codeword
example of linear code construction
k = 3: information bits are (x1, x2, x3).
determine parity bits as you like...
p1 = x1 + x2
p2 = x2 + x3
⇒ the codeword is (x1, x2, x3, p1, p2)

    +   | 00000 00101 01011 01110 10010 10111 11001 11100
  ------+------------------------------------------------
  00000 | 00000 00101 01011 01110 10010 10111 11001 11100
  00101 | 00101 00000 01110 01011 10111 10010 11100 11001
  01011 | 01011 01110 00000 00101 11001 11100 10010 10111
  01110 | 01110 01011 00101 00000 11100 11001 10111 10010
  10010 | 10010 10111 11001 11100 00000 00101 01011 01110
  10111 | 10111 10010 11100 11001 00101 00000 01110 01011
  11001 | 11001 11100 10010 10111 01011 01110 00000 00101
  11100 | 11100 11001 10111 10010 01110 01011 00101 00000
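The construction can be reproduced in a few lines. This is a minimal sketch, assuming the parity equations are stored as rows of 0/1 coefficients; `COEFFS`, `encode` and `add` are illustrative names. It enumerates the eight codewords and checks that every sum of two codewords is again a codeword.

```python
from itertools import product

# Coefficient rows of the parity equations p1 = x1 + x2 and p2 = x2 + x3.
COEFFS = [(1, 1, 0), (0, 1, 1)]


def encode(x, coeffs=COEFFS):
    """Append the parity bits defined by the linear equations to x."""
    parity = tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in coeffs)
    return tuple(x) + parity


def add(u, v):
    """Bitwise sum mod 2 of two vectors of equal length."""
    return tuple(a ^ b for a, b in zip(u, v))


codewords = [encode(x) for x in product((0, 1), repeat=3)]
code = set(codewords)
closed = all(add(u, v) in code for u in codewords for v in codewords)

print(["".join(map(str, c)) for c in codewords])
print("closed under addition:", closed)
```

Changing `COEFFS` to any other 0/1 coefficient rows gives a different code on which the same closure check can be run.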
codewords for unit vectors
Assume that...
we define m parity bits for k-bit information x1, x2, ..., xk,
with the j-th parity bit defined by
pj = aj,1x1 + aj,2x2 + ... + aj,kxk (aj,i = 0 or 1)
the codeword for the unit vector (0, ..., 0, 1, 0, ..., 0), whose only 1 is at position i, is ...
ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i)
(the first k positions hold the unit vector, with the 1 at position i)

example from "example of linear code construction": k = 3, m = 2
p1 = x1 + x2
p2 = x2 + x3
(a1,1 a1,2 a1,3) = (1 1 0)
(a2,1 a2,2 a2,3) = (0 1 1)
c1 = 1 0 0 1 0
c2 = 0 1 0 1 1
c3 = 0 0 1 0 1
the parity parts of c1, c2, c3 are the transposition of the coefficient rows
“basis” in a linear code
a vector space has a basis b1, b2, ..., bk:
any u in the space is written as
u = u1b1 + u2b2 + ... + ukbk (ui ∈ {0, 1})
Theorem: Codewords c1, c2, ..., ck for unit vectors constitute a basis of C.
any codeword c ∈ C is written as
c = u1c1 + u2c2 + ... + ukck (ui ∈ {0, 1})
basis of the example from "example of linear code construction"
k = 3: information bits are (x1, x2, x3)
p1 = x1 + x2
p2 = x2 + x3
c1 = 1 0 0 1 0
c2 = 0 1 0 1 1
c3 = 0 0 1 0 1
codewords
00000 = 0・10010 + 0・01011 + 0・00101
00101 = 0・10010 + 0・01011 + 1・00101
01011 = 0・10010 + 1・01011 + 0・00101
01110 = 0・10010 + 1・01011 + 1・00101
10010 = 1・10010 + 0・01011 + 0・00101
10111 = 1・10010 + 0・01011 + 1・00101
11001 = 1・10010 + 1・01011 + 0・00101
11100 = 1・10010 + 1・01011 + 1・00101
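The expansion of each codeword over the basis can be generated mechanically. This is a minimal sketch, assuming codewords are written as 5-character bit strings and the sum mod 2 is computed with integer XOR; `BASIS` and `linear_combination` are illustrative names.

```python
from itertools import product

BASIS = ["10010", "01011", "00101"]  # c1, c2, c3 from the slide


def linear_combination(coeffs, basis=BASIS):
    """Return u1*c1 + u2*c2 + u3*c3 over GF(2) as a 5-bit string."""
    total = 0
    for u, c in zip(coeffs, basis):
        if u:
            total ^= int(c, 2)  # adding a codeword mod 2 is a bitwise XOR
    return format(total, "05b")


span = [linear_combination(u) for u in product((0, 1), repeat=3)]
print(span)
```

The eight coefficient choices give eight distinct words, which is what one expects from a basis of a code with 2^3 codewords.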
generator matrix
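if the j-th parity bit is pj = aj,1x1 + aj,2x2 + ... + aj,kxk, then
ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i)
c1, c2, ..., ck is a basis of C
any codeword u is written as u = u1c1 + u2c2 + ... + ukck
\[
(u_1 \; u_2 \cdots u_k \; p_1 \; p_2 \cdots p_m) = (u_1 \; u_2 \cdots u_k)
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
\]
the matrix on the right-hand side is the generator matrix of C
encoding = multiplication of the generator matrix and a vector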
the structure of the generator matrix
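the generator matrix
\[
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
\]
left part: k × k identity matrix
right part: transposition of the coefficients of the linear equations of parity bits
row vectors = codewords for unit vectors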
example
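parity check code p = x1 + x2 + x3
\[
(u_1 \; u_2 \; u_3 \; p) = (u_1 \; u_2 \; u_3)
\begin{pmatrix} 1&0&0&1 \\ 0&1&0&1 \\ 0&0&1&1 \end{pmatrix}
\]
2D code
  x1 x2 | p1
  x3 x4 | p2
  q1 q2 | r
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
generator matrix:
\[
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
\]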
encoding
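encoding of a 2D code
\[
(u_1 \; u_2 \; u_3 \; u_4 \; p_1 \; p_2 \; q_1 \; q_2 \; r) = (u_1 \; u_2 \; u_3 \; u_4)
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
\]
to encode 0111...
\[
(0 \; 1 \; 1 \; 1)
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
= (0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1)
\]
the codeword is 011110101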
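The vector-matrix product above can be carried out as follows. This is a minimal sketch, assuming the generator matrix is stored as a list of rows and all arithmetic is done mod 2; `G` and `encode` are illustrative names.

```python
# Generator matrix of the 2D code: identity part, then the
# transposed parity coefficients (columns p1, p2, q1, q2, r).
G = [
    [1, 0, 0, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 0, 1, 1],
]


def encode(u, G=G):
    """Multiply the row vector u by G over GF(2)."""
    n = len(G[0])
    return [sum(u[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n)]


codeword = encode([0, 1, 1, 1])
print("".join(map(str, codeword)))
```

Equivalently, the codeword is the sum mod 2 of the rows of G selected by the 1s in the data vector.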
hardware encoder
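\[
(u_1 \; u_2 \; u_3 \; u_4 \; p_1 \; p_2 \; q_1 \; q_2 \; r) = (u_1 \; u_2 \; u_3 \; u_4)
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
\]
[figure: encoder circuit; the data bits u1 u2 u3 u4 pass through unchanged and XOR gates compute p1, p2, q1, q2, r, giving the codeword u1 u2 u3 u4 p1 p2 q1 q2 r]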
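In software, the same encoder reduces to a handful of XOR operations. This is a minimal sketch of the idea, not a description of the circuit in the figure; the function name `encode_2d_xor` and the choice to compute r from p1 and p2 are assumptions made for illustration.

```python
def encode_2d_xor(u1, u2, u3, u4):
    """Encode four data bits with the 2D code using only XOR."""
    p1 = u1 ^ u2      # row parities
    p2 = u3 ^ u4
    q1 = u1 ^ u3      # column parities
    q2 = u2 ^ u4
    r = p1 ^ p2       # equals u1 ^ u2 ^ u3 ^ u4
    return (u1, u2, u3, u4, p1, p2, q1, q2, r)


print("".join(map(str, encode_2d_xor(0, 1, 1, 1))))
```

Each parity bit depends only on the current data bits, with no stored state, which is why a combinational circuit suffices.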
summary of the first half
A code is linear if and only if...
  it is a vector space
  parity bits are defined by linear equations
The codewords for the unit vectors form a basis of the code
the generator matrix ...
  contains basis codewords as its row vectors
  gives the mathematical principle for encoding
  an encoder is realizable by a combinational circuit
condition for a vector to be a codeword (1)
consider a 2D code
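p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
a vector (x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
if and only if
\[
(x_1 \; x_2 \; x_3 \; x_4 \; p_1 \; p_2 \; q_1 \; q_2 \; r) = (x_1 \; x_2 \; x_3 \; x_4)
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
\]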
condition for a vector to be a codeword (2)
(x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
⇔
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
⇔ (move the lhs to rhs...)
x1 + x2 – p1 = 0
x3 + x4 – p2 = 0
x1 + x3 – q1 = 0
x2 + x4 – q2 = 0
x1 + x2 + x3 + x4 – r = 0
⇔ (in binary world, x – y = x + y)
x1 + x2 + p1 = 0
x3 + x4 + p2 = 0
x1 + x3 + q1 = 0
x2 + x4 + q2 = 0
x1 + x2 + x3 + x4 + r = 0
these are the parity check equations
condition for a vector to be a codeword (3)
(x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
⇔
x1 + x2 + p1 = 0
x3 + x4 + p2 = 0
x1 + x3 + q1 = 0
x2 + x4 + q2 = 0
x1 + x2 + x3 + x4 + r = 0
⇔
\[
\begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \\ x_8 \\ x_9 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\]
Is (x1 x2 x3 x4 x5 x6 x7 x8 x9) a codeword?
transpose the vector, multiply it to this matrix from the right, and see if the result is 0 or not.
zero ⇒ it's a codeword
nonzero ⇒ it's not a codeword
this matrix is the parity check matrix
parity check matrix for error detection
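consider a 2D code with the parity check matrix
\[
H = \begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\]
Is 110101101 a codeword? ... yes
\[ H (1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1)^T = (0 \; 0 \; 0 \; 0 \; 0)^T \]
Is 011011010 a codeword? ... no
\[ H (0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 0)^T = (0 \; 0 \; 1 \; 0 \; 0)^T \]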
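The two checks can be reproduced with a short routine. This is a minimal sketch, assuming received words are given as bit strings and H is stored as a list of rows; `syndrome` and `is_codeword` are illustrative names.

```python
# Parity check matrix of the 2D code (one row per parity check equation).
H = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]


def syndrome(word, H=H):
    """Compute H v^T over GF(2) for a word given as a bit string."""
    v = [int(b) for b in word]
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


def is_codeword(word, H=H):
    """A word is a codeword exactly when every parity check gives 0."""
    return not any(syndrome(word, H))


print(syndrome("110101101"), is_codeword("110101101"))
print(syndrome("011011010"), is_codeword("011011010"))
```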
generator and check matrices
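definition of parity bits
p1 = a1,1x1 + a1,2x2 + ... + a1,kxk
p2 = a2,1x1 + a2,2x2 + ... + a2,kxk
:
pm = am,1x1 + am,2x2 + ... + am,kxk

generator matrix (k rows): identity, then the coefficients transposed
\[
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
\]

parity check equations
a1,1x1 + a1,2x2 + ... + a1,kxk + p1 = 0
a2,1x1 + a2,2x2 + ... + a2,kxk + p2 = 0
:
am,1x1 + am,2x2 + ... + am,kxk + pm = 0

parity check matrix (n - k rows): coefficients, then identity
\[
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} & 1 & 0 & \cdots & 0 \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \ddots & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,k} & 0 & 0 & \cdots & 1
\end{pmatrix}
\]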
example of matrices
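2D code: n = 9, k = 4, m = n - k = 5.
  x1 x2 | p1
  x3 x4 | p2
  q1 q2 | r
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
generator matrix
\[
\begin{pmatrix}
1&0&0&0&1&0&1&0&1 \\
0&1&0&0&1&0&0&1&1 \\
0&0&1&0&0&1&1&0&1 \\
0&0&0&1&0&1&0&1&1
\end{pmatrix}
\]
parity check matrix
\[
\begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\]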
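Both matrices follow mechanically from the coefficients of the parity equations. This is a minimal sketch, assuming the coefficients are stored as an m × k list of rows called `A`; the helper names are illustrative. It also checks that every row of the generator matrix satisfies every parity check, i.e. that G H^T is the zero matrix mod 2.

```python
def generator_matrix(A):
    """G = (identity | A transposed), with k rows."""
    m, k = len(A), len(A[0])
    return [[int(i == c) for c in range(k)] + [A[j][i] for j in range(m)]
            for i in range(k)]


def parity_check_matrix(A):
    """H = (A | identity), with m = n - k rows."""
    m = len(A)
    return [list(A[j]) + [int(j == c) for c in range(m)] for j in range(m)]


def times_transpose(G, H):
    """Compute G H^T over GF(2)."""
    return [[sum(g * h for g, h in zip(g_row, h_row)) % 2 for h_row in H]
            for g_row in G]


# Coefficients of p1, p2, q1, q2, r for the 2D code.
A = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
]
G = generator_matrix(A)
H = parity_check_matrix(A)
print(G)
print(H)
print(times_transpose(G, H))
```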
syndrome
For a parity check matrix H and a vector v,
  if Hv^T = 0, then v ∈ C
  if Hv^T ≠ 0, then v ∉ C
the vector Hv^T is called the syndrome of v:
  if the syndrome of v is zero, then v ∈ C
  if the syndrome of v is nonzero, then v ∉ C
The syndrome is more useful than a yes/no answer, because it contains the information of errors.
syndrome and error
send a codeword u to a binary symmetric channel
an error vector e is added to the codeword in the channel
the received vector is v = u + e
[figure: codeword u, noise e added in the channel, received vector v = u + e]
if e = 0 (no error), then the syndrome of v is...
Hv^T = Hu^T = 0
if e ≠ 0 (error occurs), then the syndrome of v is...
Hv^T = H(u + e)^T = Hu^T + He^T = He^T
the syndrome is determined solely by e, independently of u
... if you see the syndrome, then you can say what e is.
error patterns determine the syndrome
\[
H = \begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\]
• 000000000 is sent, 000100000 is received...
  H(0 0 0 1 0 0 0 0 0)^T = (0 1 0 1 1)^T.
• 110000110 is sent, 110100110 is received...
  H(1 1 0 1 0 0 1 1 0)^T = (0 1 0 1 1)^T.
⇒ if the syndrome is (0 1 0 1 1)^T, then the fourth bit is in error, independent of the sent codeword
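The independence from the sent codeword can be confirmed over all 16 codewords of the 2D code. This is a minimal sketch, assuming the parity equations of the 2D code given earlier; `encode`, `syndrome` and the error vector `e` are illustrative names.

```python
from itertools import product

H = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]


def encode(x1, x2, x3, x4):
    """Codeword (x1 x2 x3 x4 p1 p2 q1 q2 r) of the 2D code."""
    return (x1, x2, x3, x4,
            x1 ^ x2, x3 ^ x4, x1 ^ x3, x2 ^ x4, x1 ^ x2 ^ x3 ^ x4)


def syndrome(v):
    """Compute H v^T over GF(2)."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)


e = (0, 0, 0, 1, 0, 0, 0, 0, 0)  # error in the fourth bit

# Add the same error to every codeword and collect the syndromes.
syndromes = {
    syndrome(tuple(a ^ b for a, b in zip(encode(*x), e)))
    for x in product((0, 1), repeat=4)
}
print(syndromes)
```

The set contains a single element, the syndrome of e itself.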
error correction
if you know the correspondence between error patterns and syndromes, then you can correct errors.
[figure: received v = u + e → compute the syndrome s = Hv^T → look up s in the table of errors / syndromes → error pattern e → decoding result u = v + e]
one-bit error
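Let n be the code length, and let hi be the i-th column vector of H:
\[ H = (h_1 \; h_2 \; \cdots \; h_n) \]
the syndrome of a one-bit error e = (0 0 ... 0 1 0 ... 0), with the 1 at position i, is...
\[
He^T = (h_1 \; h_2 \; \cdots \; h_n)
\begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}
= h_i
\]
only one-bit error at the i-th symbol position
⇔ syndrome equals the i-th column vector of H
(the table of errors / syndromes is not needed)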
example of error correction
2D code
codewords
000000000, 000101011, 001001101, 001100110,
010010011, 010111000, 011011110, 011110101,
100010101, 100111110, 101011000, 101110011,
110000110, 110101101, 111001011, 111100000.
parity check matrix
\[
H = \begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\]
• if 101001000 is received...
  ⇒ the syndrome is H(1 0 1 0 0 1 0 0 0)^T = (1 0 0 0 0)^T
  ⇒ this is the fifth column of H
  ⇒ the fifth bit is in error, 101011000 must have been sent
• if 101100110 is received...
  ⇒ the syndrome is H(1 0 1 1 0 0 1 1 0)^T = (1 0 1 0 1)^T
  ⇒ this is the first column of H
  ⇒ the first bit is in error, 001100110 must have been sent
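The two corrections can be automated by matching the syndrome against the columns of H. This is a minimal sketch, assuming at most one bit is in error; the function name `correct` and the convention of returning `None` for an unmatched syndrome are illustrative choices.

```python
H = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]


def correct(word, H=H):
    """Correct a one-bit error in a received bit string, if there is one."""
    v = [int(b) for b in word]
    s = [sum(h * x for h, x in zip(row, v)) % 2 for row in H]
    if not any(s):
        return word                       # zero syndrome: already a codeword
    columns = [[row[i] for row in H] for i in range(len(v))]
    if s in columns:
        v[columns.index(s)] ^= 1          # flip the bit whose column matches
        return "".join(map(str, v))
    return None                           # not explained by a one-bit error


print(correct("101001000"))
print(correct("101100110"))
```

If more than one bit is in error, the syndrome may still match a column, in which case this procedure returns a wrong codeword.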
parity check matrix and the ability of codes (1)
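one-bit error at the i-th symbol position
⇔ syndrome equals the i-th column vector of H
if several column vectors in H are the same, then
  different error patterns result in the same syndrome
  the error pattern is not uniquely determined
  ⇒ error correction NOT possible
parity check code p = x1 + x2:
\[ H = \begin{pmatrix} 1&1&1 \end{pmatrix} \]
the example code from "example of linear code construction" (p1 = x1 + x2, p2 = x2 + x3):
\[ H = \begin{pmatrix} 1&1&0&1&0 \\ 0&1&1&0&1 \end{pmatrix} \]
(columns 1 and 4 are the same, and so are columns 3 and 5)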
parity check matrix and the ability of codes (2)
one-bit error at the i-th symbol position
⇔ syndrome equals the i-th column vector of H
if all column vectors in H are different (and nonzero), then
  different one-bit error patterns result in different syndromes
  the error pattern is uniquely determined
  ⇒ error correction possible
2D code:
\[
H = \begin{pmatrix}
1&1&0&0&1&0&0&0&0 \\
0&0&1&1&0&1&0&0&0 \\
1&0&1&0&0&0&1&0&0 \\
0&1&0&1&0&0&0&1&0 \\
1&1&1&1&0&0&0&0&1
\end{pmatrix}
\]
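The column condition is easy to test for a given parity check matrix. This is a minimal sketch, assuming the criterion is that all columns are nonzero and pairwise different; the function name `corrects_one_bit_errors` and the variable names are illustrative.

```python
def corrects_one_bit_errors(H):
    """True if the columns of H are all nonzero and pairwise different."""
    cols = [tuple(row[i] for row in H) for i in range(len(H[0]))]
    return all(any(c) for c in cols) and len(set(cols)) == len(cols)


H_PARITY = [[1, 1, 1]]                    # parity check code p = x1 + x2
H_EXAMPLE = [[1, 1, 0, 1, 0],             # p1 = x1 + x2
             [0, 1, 1, 0, 1]]             # p2 = x2 + x3
H_2D = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]

for name, H in [("parity", H_PARITY), ("example", H_EXAMPLE), ("2D", H_2D)]:
    print(name, corrects_one_bit_errors(H))
```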
summary
definition of linear codes
  vector space, parity bits defined by linear equations
generator matrix
  matrix of basis codewords
  contributes to the encoding
parity check matrix
  represents constraints among symbols
  determines syndrome
error correction and error detection
exercise
Consider an “odd” parity check code C whose codewords are (x1, …, xk, p) with p = x1 + … + xk + 1. Is C a linear code?
Construct a 2D code for 6-bit information (a1, ..., a6) as follows.
  a1 a2 a3 | p1
  a4 a5 a6 | p2
  q1 q2 q3 | r
(a1, ..., a6) → (a1, ..., a6, p1, p2, q1, q2, q3, r)
  determine the generator and parity check matrices
  encode 011001 using the generator matrix
  correct an error in the sequence 110111001010