exercise in the previous class
p: the probability that symbols are delivered correctly
C: 00 → 00000, 01 → 01011, 10 → 10101, 11 → 11110
What is the threshold of p above which using C is good, and below which it is bad?
without coding, correct probability = p^2 = A (both symbols arrive correctly)
with coding, correct probability = p^5 + 5p^4(1 - p) = B (no error, or exactly one error, which C corrects)
B > A if p^5 + 5p^4(1 - p) - p^2
= p^2(p - 1)(-4p^2 + p + 1) > 0
(roots = -0.39, 0, 0.64, 1)
Using C is better if 0.64 < p < 1.
[Figure: plot against p from 0 to 1; vertical axis from -0.1 to 0.15]
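The comparison between A and B can be checked numerically. The following is a minimal sketch; the function names `prob_uncoded` and `prob_coded` are hypothetical and only restate the two formulas of this exercise.

```python
import math


def prob_uncoded(p: float) -> float:
    """A: two symbols are sent without coding; both must arrive correctly."""
    return p ** 2


def prob_coded(p: float) -> float:
    """B: five symbols are sent with C; at most one of them may be wrong."""
    return p ** 5 + 5 * p ** 4 * (1 - p)


# Nontrivial roots of B - A = p^2 (p - 1)(-4p^2 + p + 1):
# solve 4p^2 - p - 1 = 0 with the quadratic formula.
root_low = (1 - math.sqrt(17)) / 8
root_high = (1 + math.sqrt(17)) / 8

if __name__ == "__main__":
    print(f"roots: {root_low:.2f}, {root_high:.2f}")
    for p in (0.5, 0.6, 0.7, 0.9):
        print(p, prob_coded(p) - prob_uncoded(p))
```

The sign of `prob_coded(p) - prob_uncoded(p)` changes at `root_high`, which is the threshold of the exercise.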
what did we learn?
motivation and models of communication channels
simple examples of linear codes
even parity check codes
(a1, …, ak) → (a1, …, ak, p) , p = a1 + … + ak mod 2
error detection only
horizontal and vertical parity check (2D) code
(a1, a2, a3, a4) →
a1 a2 | x1
a3 a4 | x2
------+---
y1 y2 | z
→ (a1, a2, a3, a4, x1, x2, y1, y2, z)
one-bit error correcting, two-bit error detecting
additional remark (cont'd), corrected
We expect that 2D codes detect all two-bit errors.
If we don't use the parity of parity, then...
codewords: 0 0 0 0 0 0 0 0 0 and 0 0 0 0 1 1 0 1 0
received with a 1-bit error: 0 0 0 0 1 0 0 0 0 and 0 0 0 0 1 0 0 1 0
each received vector is decoded to the nearest codeword
some two-bit errors are not detected; instead, they are decoded to a wrong codeword.
for example, 0 0 0 0 1 0 0 0 0 is two bits away from the codeword 0 0 0 0 1 1 0 1 0 but only one bit away from 0 0 0 0 0 0 0 0 0.
today’s class
linear codes
definition
encoding
decoding (error detection and correction)
(one of) definition(s) of linear codes
a binary code C is linear
⇔ C is a vector space
⇔ for any u, v ∈ C, we have u + v ∈ C
parity check code with length 3:
 +  | 000 011 101 110
----+----------------
000 | 000 011 101 110
011 | 011 000 110 101
101 | 101 110 000 011
110 | 110 101 011 000
how about parity check codes with other lengths?
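The closure condition u + v ∈ C can be tested by brute force for small codes. This is a sketch only; `parity_check_code` and `is_linear` are hypothetical helper names, and the addition of binary vectors is implemented as bitwise XOR.

```python
from itertools import product


def parity_check_code(k):
    """All codewords (x1, ..., xk, p) with p = x1 + ... + xk mod 2."""
    return {bits + (sum(bits) % 2,) for bits in product((0, 1), repeat=k)}


def is_linear(code):
    """A binary code is linear if u + v (bitwise XOR) is always in the code."""
    return all(
        tuple(a ^ b for a, b in zip(u, v)) in code
        for u in code
        for v in code
    )


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, is_linear(parity_check_code(k)))
```

With k = 2 the function reproduces the four codewords of the table, and the check also passes for the other small lengths tried here. A brute-force check is not a proof, of course.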
linearity of parity check codes
C: parity check code for length k data (x1, ..., xk)
... codewords in C are (x1, ..., xk, p) with p = x1 + ... + xk
Theorem: parity check codes are linear
Proof: confirm that u + v ∈ C for any u, v ∈ C.
u = (u1, ..., uk, p), (p = u1 + ... + uk),
v = (v1, ..., vk, q), (q = v1 + ... + vk),
u + v = (u1 + v1, ..., uk + vk, p + q).
because p + q = u1 + ... + uk + v1 + ... + vk = (u1 + v1) + ... + (uk + vk), p + q is a valid parity bit for (u1 + v1, ..., uk + vk). Hence u + v is a codeword.
Theorem: 2D codes are linear (proof omitted)
computation of parity bits
parity check codes, 2D codes...
for information bits (x1, x2, ..., xk),
parity bits are defined by linear equations
p = a1x1 + a2x2 + ... + akxk (a1, a2, ..., ak ∈ {0, 1})
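parity check code: (x1, ..., xk, p)
p = x1 + ... + xk
2D code with b rows of a information bits: (x1, ..., xab, p1, ..., pb, q1, ..., qa, r)
\[
p_i = \sum_{j=1}^{a} x_{(i-1)a+j}, \qquad
q_j = \sum_{i=1}^{b} x_{(i-1)a+j}, \qquad
r = \sum_{i=1}^{ab} x_i = \sum_{i=1}^{b} p_i
\]
In both codes, parity bits are sums of some of the information bits.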
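The parity bits of a 2D code can be computed directly from the grid layout. The sketch below assumes the information bits are arranged in b rows of a bits each, read row by row; the function name `two_d_parity` is hypothetical.

```python
def two_d_parity(bits, a, b):
    """Return (p, q, r) for a*b information bits laid out in b rows of a bits."""
    if len(bits) != a * b:
        raise ValueError("expected a*b information bits")
    rows = [bits[i * a:(i + 1) * a] for i in range(b)]
    p = [sum(row) % 2 for row in rows]                       # row parities p1..pb
    q = [sum(row[j] for row in rows) % 2 for j in range(a)]  # column parities q1..qa
    r = sum(bits) % 2                                        # parity of all information bits
    return p, q, r


if __name__ == "__main__":
    print(two_d_parity([0, 1, 1, 1], a=2, b=2))
```

Every parity bit is a sum mod 2 of selected information bits, which is exactly the form p = a1x1 + ... + akxk with coefficients 0 or 1.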
linear parity bits make the code linear
Lemma: for a linear equation f(x1, x2, ..., xk) = a1x1 + a2x2 +...+ akxk,
f(u1, u2, ..., uk) + f(v1, v2, ..., vk) = f(u1 + v1, u2 + v2, ..., uk + vk)
Theorem:
If parity bits are defined by linear equations, then C is linear.
Proof: add two codewords position by position.
   codeword (u1, u2, ..., uk, ..., f(u1, u2, ..., uk), ...)
+) codeword (v1, v2, ..., vk, ..., f(v1, v2, ..., vk), ...)
 = (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1, u2, ..., uk) + f(v1, v2, ..., vk), ...)
 = (u1 + v1, u2 + v2, ..., uk + vk, ..., f(u1 + v1, u2 + v2, ..., uk + vk), ...)   (by the Lemma)
which is a codeword
example of linear code construction
k = 3: information bits are (x1, x2, x3).
determine parity bits as you like...
p1 = x1 + x2
p2 = x2 + x3
⇒ the codeword is (x1, x2, x3, p1, p2)
  +   | 00000 00101 01011 01110 10010 10111 11001 11100
------+------------------------------------------------
00000 | 00000 00101 01011 01110 10010 10111 11001 11100
00101 | 00101 00000 01110 01011 10111 10010 11100 11001
01011 | 01011 01110 00000 00101 11001 11100 10010 10111
01110 | 01110 01011 00101 00000 11100 11001 10111 10010
10010 | 10010 10111 11001 11100 00000 00101 01011 01110
10111 | 10111 10010 11100 11001 00101 00000 01110 01011
11001 | 11001 11100 10010 10111 01011 01110 00000 00101
11100 | 11100 11001 10111 10010 01110 01011 00101 00000
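The eight codewords of this example follow mechanically from the two parity equations. A small sketch, with `COEFFS`, `encode` and `all_codewords` as hypothetical names:

```python
from itertools import product

# Coefficients of the parity equations: p1 = x1 + x2, p2 = x2 + x3.
COEFFS = [(1, 1, 0), (0, 1, 1)]


def encode(info, coeffs=COEFFS):
    """Append one parity bit per linear equation (sums are taken mod 2)."""
    parity = tuple(sum(a * x for a, x in zip(row, info)) % 2 for row in coeffs)
    return tuple(info) + parity


def all_codewords(k=3, coeffs=COEFFS):
    """Codewords for all 2^k information vectors, as bit strings."""
    return ["".join(map(str, encode(x, coeffs))) for x in product((0, 1), repeat=k)]


if __name__ == "__main__":
    print(all_codewords())
```

Changing `COEFFS` gives another linear code; the parity bits may be chosen freely as long as each one is a linear equation of the information bits.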
codewords for unit vectors
Assume that...
we define m parity bits for k-bit information x1, x2, ..., xk,
with the j-th parity bit defined by
pj = aj,1x1 + aj,2x2 + ... + aj,kxk (aj,i = 0 or 1)
the codeword for the i-th unit vector (0, ..., 0, 1, 0, ..., 0), with the 1 in position i (1 ≤ i ≤ k), is ...
ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i)
example with k = 3, m = 2:
p1 = x1 + x2
p2 = x2 + x3
(a1,1 a1,2 a1,3) = (1 1 0)
(a2,1 a2,2 a2,3) = (0 1 1)
the parity part of each ci is the transposition of these coefficients:
c1 = 1 0 0 1 0
c2 = 0 1 0 1 1
c3 = 0 0 1 0 1
“basis” in a linear code
a vector space has a basis b1, b2, ..., bk:
any u in the space is written as
u = u1b1 + u2b2 + ... + ukbk (ui ∈ {0, 1})
Theorem:
Codewords c1, c2, ..., ck for unit vectors constitute a basis of C.
any codeword c ∈ C is written as
c = u1c1 + u2c2 + ... + ukck (ui ∈ {0, 1})
basis of the k = 3 example
k = 3: information bits are (x1, x2, x3)
p1 = x1 + x2
p2 = x2 + x3
c1 = 1 0 0 1 0, c2 = 0 1 0 1 1, c3 = 0 0 1 0 1
codewords:
00000 = 0·10010 + 0·01011 + 0·00101
00101 = 0·10010 + 0·01011 + 1·00101
01011 = 0·10010 + 1·01011 + 0·00101
01110 = 0·10010 + 1·01011 + 1·00101
10010 = 1·10010 + 0·01011 + 0·00101
10111 = 1·10010 + 0·01011 + 1·00101
11001 = 1·10010 + 1·01011 + 0·00101
11100 = 1·10010 + 1·01011 + 1·00101
generator matrix
if the j-th parity bit is pj = aj,1x1 + aj,2x2 + ... + aj,kxk, then
ci = (0, ..., 1, ..., 0, a1,i, a2,i, ..., am,i)
c1, c2, ..., ck is a basis of C
any codeword u is written as u = u1c1 + u2c2 + ... + ukck
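\[
(u_1\ u_2\ \cdots\ u_k)
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
= (u_1\ u_2\ \cdots\ u_k\ p_1\ p_2\ \cdots\ p_m)
\]
the k × (k + m) matrix is the generator matrix of C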
encoding = multiplication of the generator matrix and a vector
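Encoding as a vector-matrix product can be sketched in a few lines. The code below assumes the parity coefficients are given as an m × k list `coeffs` (row j holds aj,1 ... aj,k); the names `generator_matrix` and `encode` are hypothetical, and all arithmetic is mod 2.

```python
def generator_matrix(coeffs):
    """Build G = [I_k | A^T] from the m x k parity coefficients A = (a_{j,i})."""
    k = len(coeffs[0])
    return [
        [1 if c == i else 0 for c in range(k)] + [row[i] for row in coeffs]
        for i in range(k)
    ]


def encode(u, G):
    """Compute the codeword uG over GF(2)."""
    n = len(G[0])
    return [sum(u[i] * G[i][c] for i in range(len(G))) % 2 for c in range(n)]


if __name__ == "__main__":
    # p1 = x1 + x2, p2 = x2 + x3
    G = generator_matrix([(1, 1, 0), (0, 1, 1)])
    for row in G:
        print(row)
    print(encode([1, 0, 1], G))
```

The rows of `G` are the codewords c1, c2, c3 for the unit vectors, so `encode` simply adds the rows selected by the 1 bits of u.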
the structure of the generator matrix
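\[
G =
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
\]
the generator matrix has two parts:
left k columns: k × k identity matrix
right m columns: transposition of the coefficients of the linear equations of parity bits
row vectors = codewords for unit vectors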
example
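parity check code p = x1 + x2 + x3
\[
(u_1\ u_2\ u_3)
\begin{pmatrix}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1
\end{pmatrix}
= (u_1\ u_2\ u_3\ p)
\]
2D code
x1 x2 | p1
x3 x4 | p2
------+---
q1 q2 | r
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
generator matrix of the 2D code:
\[
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
\]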
encoding
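encoding of a 2D code
\[
(u_1\ u_2\ u_3\ u_4)
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
= (u_1\ u_2\ u_3\ u_4\ p_1\ p_2\ q_1\ q_2\ r)
\]
to encode 0111...
\[
(0\ 1\ 1\ 1)
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
= (0\ 1\ 1\ 1\ 1\ 0\ 1\ 0\ 1)
\]
the codeword is 011110101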
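Since the arithmetic is mod 2, multiplying by the generator matrix amounts to XOR-ing the rows selected by the 1 bits of the data. The sketch below writes the four rows of the 2D generator matrix as bit strings; `G_2D` and `encode_2d` are hypothetical names.

```python
# Rows of the 2D-code generator matrix: the codewords for the unit vectors
# 1000, 0100, 0010, 0001 (bit order x1 x2 x3 x4 p1 p2 q1 q2 r).
G_2D = ["100010101", "010010011", "001001101", "000101011"]


def encode_2d(data: str) -> str:
    """XOR together the rows of G_2D selected by the 1 bits of the data."""
    word = 0
    for bit, row in zip(data, G_2D):
        if bit == "1":
            word ^= int(row, 2)
    return format(word, "09b")


if __name__ == "__main__":
    print(encode_2d("0111"))
```

For the data 0111 this adds the second, third and fourth rows, which is the same computation as the matrix product above.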
hardware encoder
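\[
(u_1\ u_2\ u_3\ u_4)
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
= (u_1\ u_2\ u_3\ u_4\ p_1\ p_2\ q_1\ q_2\ r)
\]
[Figure: encoder circuit. The data bits u1 u2 u3 u4 are passed through and also fed to XOR gates; the output codeword is u1 u2 u3 u4 p1 p2 q1 q2 r]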
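The encoder circuit can be mimicked in software with one XOR expression per parity bit. This is only an illustration of the idea, not a description of the actual circuit in the figure; `xor_encoder` is a hypothetical name.

```python
def xor_encoder(u1, u2, u3, u4):
    """Software model of a 2D-code encoder built from XOR gates only."""
    p1 = u1 ^ u2            # parity of the first row
    p2 = u3 ^ u4            # parity of the second row
    q1 = u1 ^ u3            # parity of the first column
    q2 = u2 ^ u4            # parity of the second column
    r = u1 ^ u2 ^ u3 ^ u4   # parity of all data bits
    return (u1, u2, u3, u4, p1, p2, q1, q2, r)


if __name__ == "__main__":
    print(xor_encoder(0, 1, 1, 1))
```

No memory element is needed: every output depends only on the current inputs.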
summary of the first half
A code is linear if and only if...
it is a vector space
parity bits are defined by linear equations
codewords of unit vectors are a basis of the code
the generator matrix ...
contains basis codewords as its row vectors
gives the mathematical principle for encoding
an encoder is realizable by a combinational circuit
condition for a vector to be a codeword (1)
consider a 2D code
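p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
\[
(x_1\ x_2\ x_3\ x_4)
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
= (x_1\ x_2\ x_3\ x_4\ p_1\ p_2\ q_1\ q_2\ r)
\]
a vector (x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
if and only if it satisfies the parity equations above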
condition for a vector to be a codeword (2)
(x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
⇔
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
move the lhs to rhs...
⇔
x1 + x2 - p1 = 0
x3 + x4 - p2 = 0
x1 + x3 - q1 = 0
x2 + x4 - q2 = 0
x1 + x2 + x3 + x4 - r = 0
in binary world, x - y = x + y
⇔
x1 + x2 + p1 = 0
x3 + x4 + p2 = 0
x1 + x3 + q1 = 0
x2 + x4 + q2 = 0
x1 + x2 + x3 + x4 + r = 0
these are the parity check equations
condition for a vector to be a codeword (3)
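(x1 x2 x3 x4 p1 p2 q1 q2 r) is a codeword
⇔
x1 + x2 + p1 = 0
x3 + x4 + p2 = 0
x1 + x3 + q1 = 0
x2 + x4 + q2 = 0
x1 + x2 + x3 + x4 + r = 0
⇔
\[
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \\ x_8 \\ x_9
\end{pmatrix}
=
\begin{pmatrix}
0 \\ 0 \\ 0 \\ 0 \\ 0
\end{pmatrix}
\]
Is (x1 x2 x3 x4 x5 x6 x7 x8 x9) a codeword?
transpose the vector, multiply it by this matrix from the right, and see if the result is 0 or not.
zero ⇒ it's a codeword
nonzero ⇒ it's not a codeword
the 5 × 9 matrix is the parity check matrix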
parity check matrix for error detection
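consider a 2D code with the parity check matrix
\[
H =
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
Is 110101101 a codeword? ... yes
\[ H\,(1\ 1\ 0\ 1\ 0\ 1\ 1\ 0\ 1)^T = (0\ 0\ 0\ 0\ 0)^T \]
Is 011011010 a codeword? ... no
\[ H\,(0\ 1\ 1\ 0\ 1\ 1\ 0\ 1\ 0)^T = (0\ 0\ 1\ 0\ 0)^T \]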
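The two checks can be reproduced with a direct implementation of the product Hv^T mod 2. A minimal sketch; `H_2D`, `syndrome` and `is_codeword` are hypothetical names, and the rows of `H_2D` are the five parity check equations of the 2D code in the order p1, p2, q1, q2, r.

```python
# Parity check matrix of the 2D code (columns: x1 x2 x3 x4 p1 p2 q1 q2 r).
H_2D = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],  # x1 + x2 + p1 = 0
    [0, 0, 1, 1, 0, 1, 0, 0, 0],  # x3 + x4 + p2 = 0
    [1, 0, 1, 0, 0, 0, 1, 0, 0],  # x1 + x3 + q1 = 0
    [0, 1, 0, 1, 0, 0, 0, 1, 0],  # x2 + x4 + q2 = 0
    [1, 1, 1, 1, 0, 0, 0, 0, 1],  # x1 + x2 + x3 + x4 + r = 0
]


def syndrome(H, v):
    """Compute Hv^T over GF(2), one entry per parity check equation."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def is_codeword(H, word: str) -> bool:
    """A vector is a codeword exactly when all check equations give 0."""
    return not any(syndrome(H, [int(ch) for ch in word]))


if __name__ == "__main__":
    for word in ("110101101", "011011010"):
        print(word, is_codeword(H_2D, word))
```

For 011011010 only the third equation fails, which matches the nonzero result shown above.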
generator and check matrices
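definition of parity bits:
p1 = a1,1x1 + a1,2x2 + ... + a1,kxk
p2 = a2,1x1 + a2,2x2 + ... + a2,kxk
:
pm = am,1x1 + am,2x2 + ... + am,kxk
generator matrix (k rows): identity, followed by the coefficients transposed
\[
G =
\begin{pmatrix}
1 & 0 & \cdots & 0 & a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\
0 & 1 & \cdots & 0 & a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\
\vdots & & \ddots & \vdots & \vdots & & & \vdots \\
0 & 0 & \cdots & 1 & a_{1,k} & a_{2,k} & \cdots & a_{m,k}
\end{pmatrix}
\]
parity check equations:
a1,1x1 + a1,2x2 + ... + a1,kxk + p1 = 0
a2,1x1 + a2,2x2 + ... + a2,kxk + p2 = 0
:
am,1x1 + am,2x2 + ... + am,kxk + pm = 0
parity check matrix (n - k rows): the coefficients, followed by identity
\[
H =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} & 1 & 0 & \cdots & 0 \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} & 0 & 1 & \cdots & 0 \\
\vdots & & & \vdots & \vdots & & \ddots & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,k} & 0 & 0 & \cdots & 1
\end{pmatrix}
\]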
example of matrices
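2D code: n = 9, k = 4, m = n - k = 5.
x1 x2 | p1
x3 x4 | p2
------+---
q1 q2 | r
p1 = x1 + x2
p2 = x3 + x4
q1 = x1 + x3
q2 = x2 + x4
r = x1 + x2 + x3 + x4
generator matrix:
\[
G =
\begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
\]
parity check matrix:
\[
H =
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]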
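Both matrices come from the same coefficient table, so they can be generated together. In the sketch below, `A` holds the coefficients of the five parity equations of the 2D code; the helper names are hypothetical. The last function checks that every row of G satisfies every parity check equation, that is, that G times the transpose of H is the zero matrix mod 2.

```python
# Coefficients a_{j,i} of the parity equations of the 2D code (m = 5, k = 4).
A = [
    (1, 1, 0, 0),  # p1 = x1 + x2
    (0, 0, 1, 1),  # p2 = x3 + x4
    (1, 0, 1, 0),  # q1 = x1 + x3
    (0, 1, 0, 1),  # q2 = x2 + x4
    (1, 1, 1, 1),  # r  = x1 + x2 + x3 + x4
]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def generator_matrix(A):
    """G = [I_k | A^T]: identity, then the coefficients transposed."""
    k = len(A[0])
    I = identity(k)
    return [I[i] + [row[i] for row in A] for i in range(k)]


def parity_check_matrix(A):
    """H = [A | I_m]: the coefficients, then identity."""
    m = len(A)
    I = identity(m)
    return [list(A[j]) + I[j] for j in range(m)]


def times_transpose(G, H):
    """Compute G H^T over GF(2)."""
    return [[sum(g * h for g, h in zip(grow, hrow)) % 2 for hrow in H] for grow in G]


if __name__ == "__main__":
    G, H = generator_matrix(A), parity_check_matrix(A)
    print(times_transpose(G, H))
```

The product is zero because each entry adds the same coefficient twice, once from the parity part of G and once from the coefficient part of H.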
syndrome
For a parity check matrix H and a vector v,
if Hv^T = 0, then v ∈ C
if Hv^T ≠ 0, then v ∉ C
the vector Hv^T is called the syndrome of v:
if the syndrome of v is zero, then v ∈ C
if the syndrome of v is nonzero, then v ∉ C
The syndrome is more useful than a yes/no answer,
because it contains the information of errors.
syndrome and error
send a codeword u to a binary symmetric channel
an error vector e is added to the codeword in the channel
the received vector is v = u + e
[Figure: the codeword u enters the channel, noise e is added, and v = u + e is received]
if e = 0 (no error),
then the syndrome of v is...
Hv^T = Hu^T = 0
if e ≠ 0 (error occurs), then the syndrome of v is...
Hv^T = H(u + e)^T = Hu^T + He^T = He^T
the syndrome is determined solely by e, independently of u
... if you see the syndrome, then you can say what e is.
error patterns determine the syndrome
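\[
H =
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
• 000000000 is sent, 000100000 is received... H(0 0 0 1 0 0 0 0 0)^T = (0 1 0 1 1)^T.
• 110000110 is sent, 110100110 is received... H(1 1 0 1 0 0 1 1 0)^T = (0 1 0 1 1)^T.
⇒ if the syndrome is (0 1 0 1 1), then the fourth bit is in error,
independently of the sent codeword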
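The independence from the sent codeword can be confirmed exhaustively for the 2D code, which has only 16 codewords. In this sketch `encode`, `syndrome` and `syndromes_for_error` are hypothetical helpers; the error vector is added to every codeword and the set of resulting syndromes is collected.

```python
from itertools import product

# Parity check matrix of the 2D code (columns: x1 x2 x3 x4 p1 p2 q1 q2 r).
H = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]


def encode(x):
    """2D-code codeword for the information bits (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    return [x1, x2, x3, x4, x1 ^ x2, x3 ^ x4, x1 ^ x3, x2 ^ x4, x1 ^ x2 ^ x3 ^ x4]


def syndrome(v):
    return tuple(sum(h * b for h, b in zip(row, v)) % 2 for row in H)


def syndromes_for_error(e):
    """Syndromes of u + e over all 16 codewords u."""
    return {
        syndrome([a ^ b for a, b in zip(encode(x), e)])
        for x in product((0, 1), repeat=4)
    }


if __name__ == "__main__":
    print(syndromes_for_error([0, 0, 0, 1, 0, 0, 0, 0, 0]))
```

The returned set has a single element for any error vector, because Hu^T = 0 for every codeword u.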
error correction
if you know the correspondence between
error patterns and syndromes, then you can correct errors.
received vector v = u + e
→ compute the syndrome: s = Hv^T
→ look up s in the table of errors / syndromes to obtain the error pattern e
→ decoding result: u = v + e
one-bit error
Let n be the code length, and let hi be the i-th column vector of H:
\[ H = (h_1\ h_2\ \cdots\ h_n) \]
the syndrome of a one-bit error e = (0 0 ... 0 1 0 ... 0), with the 1 in position i, is...
\[ He^T = (h_1\ h_2\ \cdots\ h_n)\,(0\ \cdots\ 0\ 1\ 0\ \cdots\ 0)^T = h_i \]
only one-bit error at the i-th symbol position ⇔ syndrome equals the i-th column vector of H
(the table of errors / syndromes is not needed)
example of error correction
2D code
codewords:
000000000, 000101011, 001001101, 001100110, 010010011, 010111000, 011011110, 011110101,
100010101, 100111110, 101011000, 101110011, 110000110, 110101101, 111001011, 111100000.
parity check matrix:
\[
H =
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
• if 101001000 is received... ⇒ the syndrome is H(1 0 1 0 0 1 0 0 0)^T = (1 0 0 0 0)^T ⇒ this is the fifth column of H ⇒ the fifth bit is in error, 101011000 must have been sent
• if 101100110 is received... ⇒ the syndrome is H(1 0 1 1 0 0 1 1 0)^T = (1 0 1 0 1)^T ⇒ this is the first column of H ⇒ the first bit is in error, 001100110 must have been sent
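The correction procedure of these two cases can be written as a short routine: compute the syndrome, find the column of H that equals it, and flip that bit. This is a sketch under the assumption that at most one bit is in error; `correct_single_error` is a hypothetical name.

```python
# Parity check matrix of the 2D code (columns: x1 x2 x3 x4 p1 p2 q1 q2 r).
H = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]


def correct_single_error(word: str) -> str:
    """Correct at most one bit error by matching the syndrome to a column of H."""
    v = [int(ch) for ch in word]
    s = [sum(h * x for h, x in zip(row, v)) % 2 for row in H]
    if not any(s):
        return word  # zero syndrome: already a codeword
    columns = [list(col) for col in zip(*H)]
    if s not in columns:
        raise ValueError("syndrome matches no column: more than one bit is in error")
    i = columns.index(s)  # position of the erroneous bit (0-based)
    v[i] ^= 1
    return "".join(map(str, v))


if __name__ == "__main__":
    for received in ("101001000", "101100110"):
        print(received, "->", correct_single_error(received))
```

No table of errors and syndromes is stored; the columns of H play that role.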
parity check matrix and the ability of codes (1)
one-bit error at the i-th symbol position
⇔ syndrome equals the i-th vector of H
if several column vectors in H are the same, then
different error patterns result in the same syndrome
the error pattern is not uniquely determined
parity check code p = x1 + x2:
\[ H = (1\ 1\ 1) \]
the k = 3 example code with
p1 = x1 + x2
p2 = x2 + x3
\[
H =
\begin{pmatrix}
1 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 1
\end{pmatrix}
\]
⇒ error correction NOT possible
parity check matrix and the ability of codes (2)
one-bit error at the i-th symbol position
⇔ syndrome equals the i-th vector of H
if all column vectors in H are different, then
different error patterns result in different syndromes
the error pattern is uniquely determined
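2D code:
\[
H =
\begin{pmatrix}
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]
⇒ error correction possible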
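The column condition is easy to test mechanically. The sketch below compares the three parity check matrices discussed here; `can_correct_single_errors` is a hypothetical name. It also requires every column to be nonzero, which is an added assumption beyond the slide: a zero column would give the same syndrome as no error at all.

```python
def can_correct_single_errors(H):
    """True if all columns of H are nonzero and pairwise different."""
    columns = list(zip(*H))
    all_nonzero = all(any(col) for col in columns)
    return all_nonzero and len(set(columns)) == len(columns)


H_PARITY = [[1, 1, 1]]            # parity check code p = x1 + x2
H_EXAMPLE = [[1, 1, 0, 1, 0],     # p1 = x1 + x2
             [0, 1, 1, 0, 1]]     # p2 = x2 + x3
H_2D = [
    [1, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 1],
]

if __name__ == "__main__":
    for name, H in (("parity", H_PARITY), ("example", H_EXAMPLE), ("2D", H_2D)):
        print(name, can_correct_single_errors(H))
```

In `H_EXAMPLE` the first and fourth columns coincide, so an error in x1 cannot be told apart from an error in p1.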
summary
definition of linear codes
vector space, parity bits defined by linear equations
generator matrix
matrix of basis codewords
contributes to the encoding
parity check matrix
represents constraints among symbols
determines syndrome
error correction and error detection
exercise
Consider an “odd” parity check code C whose codewords are
(x1, …, xk, p) with p = x1+…+xk+1. Is C a linear code?
Construct a 2D code for 6-bit information (a1, ..., a6) as follows.
a1 a2 a3 | p1
a4 a5 a6 | p2
---------+---
q1 q2 q3 | r
(a1, ..., a6) → (a1, ..., a6, p1, p2, q1, q2, q3, r)
determine the generator and parity check matrices
encode 011001 using the generator matrix
correct an error in the sequence 110111001010