Information and Coding Theory
Introduction
Lecture times: Tuesdays 12:30, Thursdays 10:30
Information theory
One of the few fields with an identifiable beginning:
C. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, 1948
C. Shannon and W. Weaver, "The Mathematical Theory of Communication", 1949 (Claude Elwood Shannon)
IT courses became very popular in universities, until the subject became too broad. Whether "IT" is a good term is disputable (communication?).
First applications: space communications, military.
End of the road?
~ 1971 - lack of suitable hardware
~ 2001 - in some cases we have already achieved the theoretical limits
Information theory
http://www.vf.utwente.nl/~neisser/public/reject.pdf
Information theory
Error correcting codes
There is no single "discoverer", but the first effective codes are due to R. Hamming (around 1950).
Richard Hamming
Some other popular codes:
- Golay codes (Voyager spacecraft, around 1980)
- Reed-Solomon codes (CDs, DVDs, DSL, RAID-6, etc.)
- BCH (Bose & Chaudhuri & Hocquenghem) codes
The course will be more oriented towards ECC than IT (so, expect more algebra and not that much probability theory :)
Applications of IT and/or ECC
Voyager 1: launched 05.09.1977, now 127 AU from Earth
Voyager 2: launched 20.08.1977, now 103 AU from Earth
Error correction:
- (24,12,8) Golay code
- Viterbi-decoded convolutional code, rate 1/2, constraint length k=7
- later concatenation with (255,223) Reed-Solomon codes over GF(256) added
Applications of IT and/or ECC
CD (1982): (32,28) + (28,24) RS codes
CD-ROM (1989): the same as above + (26,24) + (45,43) RS codes
DVD (1995): (208,192) + (182,172) RS codes
Blu-ray Disc (2006): (248,216) + (62,30) RS codes (LDC + BIS), "picket" encoding
Applications of IT and/or ECC
Error correction can be drive specific.
Initially mostly based on Reed-Solomon codes.
From 2009, increased use of LDPC (low-density parity-check) codes with performance close to Shannon's limit.
Applications of IT and/or ECC
One of the first modems that employed error correction and reached 9600 bps transfer rate.
Introduced in 1971.
Priced at "only" around $11,000.
Origins of information theory
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point"
Shannon, C.E. (1948), "A Mathematical Theory of Communication", Bell System Technical Journal, 27, pp. 379–423 & 623–656, July & October, 1948.
Information transmission
Noiseless channel
Noiseless channel
Are there any non-trivial problems concerning noiseless channels? E.g., how many bits do we need to transfer a particular piece of information?
All possible n-bit messages, each with probability 1/2^n  →  [noiseless channel]  →  Receiver
Obviously n bits will be sufficient.
Also, it is not hard to guess that n bits will be necessary to distinguish between all possible messages.
Noiseless channel
All possible n-bit messages:

Msg.       Prob.
000000...  ½
111111...  ½
other      0

→  [noiseless channel]  →  Receiver
n bits will still be sufficient.
However, we can do quite nicely with just 1 bit!
Noiseless channel
All possible n-bit messages, the probability of message i being p_i  →  [noiseless channel]  →  Receiver
n bits will still be sufficient.
If all p_i > 0 we will also need n or more bits for some messages, since we need to distinguish all of them. But what is the smallest average number of bits per message we can do with?
Derived from the Greek εντροπία "a turning towards" (εν- "in" + τροπή "a turning").
Binary entropy function
Entropy of a Bernoulli trial as a function of success probability, often called the binary entropy function, H_b(p).
The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.
[Adapted from www.wikipedia.org]
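The binary entropy function just described is easy to check numerically; a minimal Python sketch (the function name is mine):

```python
import math

def binary_entropy(p: float) -> float:
    """H_b(p) = -p*log2(p) - (1-p)*log2(1-p), with H_b(0) = H_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Maximized at 1 bit per trial for an unbiased coin:
print(binary_entropy(0.5))   # 1.0
print(binary_entropy(0.11))  # ~0.5 - a biased coin carries about half a bit
```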
Encoding over noiseless channels
The problem. Given a set M of n messages, message m_i with probability p_i, find a code (a mapping from M to {0,1}*) such that the average number of bits for message transmission is as small as possible (i.e. a code that minimizes W = Σ p_i·c(m_i), where c(m_i) is the number of bits used for encoding of m_i).
What do we know about this?
- it turns out that for any code we will have W ≥ E, where E is the entropy
- there are codes that (up to an extent) can approach E arbitrarily closely
- some codes we will have a closer look at: Huffman codes, Shannon codes, arithmetic codes
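The relation between the average length W and the entropy E can be checked on a small example. Below is a Huffman-coding sketch in Python (helper names are mine), using a dyadic distribution for which Huffman coding meets the entropy exactly:

```python
import heapq, math

def huffman_lengths(probs):
    """Return codeword lengths of a Huffman code for the given distribution."""
    # Each heap entry: (probability, tie-breaking counter, message indices in subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, m1 = heapq.heappop(heap)
        p2, _, m2 = heapq.heappop(heap)
        for i in m1 + m2:          # every merge adds one bit to all leaves below
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, m1 + m2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(probs)
W = sum(p * l for p, l in zip(probs, lengths))
E = -sum(p * math.log2(p) for p in probs)
print(W, E)  # 1.75 1.75 - for dyadic probabilities Huffman achieves W = E
```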
Noisy channel
In practice channels are always noisy (though sometimes this can be ignored).
There are several types of noisy channels one can consider.
We will restrict our attention to binary symmetric channels (BSC).
Noisy channel
Some other types of noisy channels.
Binary erasure channel
Noisy channel
Noisy channel - the problem
Assume a BSC with probability of transmission error p.
In this case we assume that we have already decided on the optimal string of bits for transmission - i.e. each bit could have value 1 or 0 with equal probability ½.
We want to maximize our chances of receiving the message without errors; to do this we are allowed to modify the message that we have to transmit.
Usually we will assume that the message is composed of blocks of m bits each, and we are allowed to replace a given m-bit block with an n-bit block of our choice (likely we should have n ≥ m :)
Such a replacement procedure we will call a block code.
We also would like to maximize the ratio m/n (code rate).
Noisy channel - the problem
If p > 0, can we guarantee that the message will be received without errors?
No - any number of bits within each block could be corrupted with positive probability (e.g. all n bits with probability p^n)...
If we transmit just an unmodified block of m bits, the probability of error is 1 − (1−p)^m. Can we reduce this?
Repetition code: replace each bit with 3 bits of the same value (0→000, 1→111).
We will have n = 3m and probability of error 1 − ((1−p)^3 + 3p(1−p)^2)^m = 1 − (1 − 3p^2 + 2p^3)^m.
Note that 1 − p < 1 − 3p^2 + 2p^3, if 0 < p < ½.
Repetition code R3
Probability of error of transmission of a single bit using no coding and R3. [figure]
Repetition codes Rn
R3 - the probability of unrecoverable error is 3p^2 − 2p^3
For R_N, the probability of unrecoverable error is the probability that more than N/2 of the N transmitted copies are flipped.
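These error probabilities are binomial tail sums: majority-vote decoding of R_N fails when more than N/2 of the N copies are flipped. A minimal sketch (the function name is mine):

```python
from math import comb

def rep_code_error(p: float, N: int) -> float:
    """Probability that majority-vote decoding of R_N fails (N odd):
    more than N/2 of the N transmitted copies are flipped."""
    return sum(comb(N, k) * p**k * (1 - p)**(N - k)
               for k in range(N // 2 + 1, N + 1))

p = 0.1
print(rep_code_error(p, 3))   # ~0.028 = 3p^2 - 2p^3
print(rep_code_error(p, 1))   # no coding: just p
# The rate falls as 1/N while the error probability shrinks:
for N in (1, 3, 5, 7):
    print(N, rep_code_error(p, N))
```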
Can we design something better thanrepetition codes?
Hamming code [7,4]
G - generator matrix:

1 0 0 0 0 1 1
0 1 0 0 1 0 1
0 0 1 0 1 1 0
0 0 0 1 1 1 1

A (4-bit) message x is encoded as xG, i.e. if x = 0110 then c = xG = 0110011.
Decoding?
- there are 16 codewords; if there are no errors, we can just find the right one...
- also we can note that the first 4 digits of c are the same as x :)
Hamming code [7,4]
What to do, if there are errors?
- we assume that the number of errors is as small as possible - i.e. we can find the codeword c (and the corresponding x) that is closest to the received vector y (using Hamming distance)
- consider vectors a = 0001111, b = 0110011 and c = 1010101
-- if y is received, compute y·a, y·b and y·c (inner products over GF(2)), e.g., for y = 1010010 we obtain y·a = 1, y·b = 0 and y·c = 0
-- this represents a binary number (100, i.e. 4, in the example above), and we conclude that the error is in the 4th digit; the corrected codeword is 1011010, i.e. x = 1011
Easy, but why does this method work?
Hamming code [7,4]
Parity bits of H(7,4)
No errors - all p_i-s correspond to the d_i-s
Error in d1,...,d3 - a pair of p_i-s is wrong
Error in d4 - all three p_i-s are wrong
Error in p_i - this differs from an error in any of the d_i-s
So:
- we can correct any single error
- since this is unambiguous, we should be able to detect any 2 errors
Hamming code [7,4]
a = 0001111, b = 0110011 and c = 1010101
H - parity check matrix (its columns are a, b and c, so row j of H is the binary representation of j):

0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
Why does it work? We can check that without errors yH = 000, and that with 1 error yH gives the (binary) index of the damaged bit...
General case: there always exists a matrix H for checking orthogonality, yH = 0. Finding the damaged bits, however, isn't that simple.
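The encoding and syndrome-decoding procedure can be sketched in a few lines of Python. The generator matrix below is my reconstruction under the slides' assumptions (systematic code, every codeword orthogonal to a, b and c); helper names are mine:

```python
# Parity-check vectors from the slides, positions 1..7:
a = [0,0,0,1,1,1,1]
b = [0,1,1,0,0,1,1]
c = [1,0,1,0,1,0,1]

# Systematic generator matrix: the first 4 bits carry the message,
# the 3 parity bits make every codeword orthogonal to a, b and c.
G = [[1,0,0,0, 0,1,1],
     [0,1,0,0, 1,0,1],
     [0,0,1,0, 1,1,0],
     [0,0,0,1, 1,1,1]]

def encode(x):
    """Encode a 4-bit message as x*G over GF(2)."""
    return [sum(x[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def decode(y):
    """Syndrome decoding: (y.a, y.b, y.c) read as a binary number
    is the position (1..7) of the flipped bit, or 0 if none."""
    y = list(y)
    syndrome = 0
    for check in (a, b, c):
        syndrome = 2 * syndrome + sum(yi * ci for yi, ci in zip(y, check)) % 2
    if syndrome:
        y[syndrome - 1] ^= 1   # correct the single error
    return y[:4]               # the message is the first four bits

cw = encode([0,1,1,0])
print(cw)          # [0, 1, 1, 0, 0, 1, 1] - matches the slides' example
cw[3] ^= 1         # flip the 4th bit
print(decode(cw))  # [0, 1, 1, 0] - the error is corrected
```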
Block codes
- the aim: for given k and n, correct as many errors as possible
- if the minimal distance between codewords is d, we will be able to correct up to ⌊(d−1)/2⌋ errors
- in principle we can choose any set of codewords, but it is easier to work with linear codes
- decoding could still be a problem
- even more restricted and more convenient is the class of cyclic codes
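The minimal-distance claim can be checked by brute force for the [7,4] Hamming code from the previous slides (the generator matrix is my reconstruction of the one used there):

```python
from itertools import product

# Generator matrix of the [7,4] Hamming code (systematic form).
G = [[1,0,0,0,0,1,1],
     [0,1,0,0,1,0,1],
     [0,0,1,0,1,1,0],
     [0,0,0,1,1,1,1]]

# All 16 codewords: every GF(2) combination of the rows of G.
codewords = []
for x in product([0, 1], repeat=4):
    codewords.append(tuple(sum(x[i] * G[i][j] for i in range(4)) % 2
                           for j in range(7)))

# For a linear code the minimal distance equals the minimal
# weight of a nonzero codeword.
d = min(sum(cw) for cw in codewords if any(cw))
print(d, (d - 1) // 2)  # 3 1 - the code corrects up to one error
```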
Some more complex approaches
- we have formulated the lossless communication problem in terms of correction of a maximal number of bits in each block of an [n,k] code, and will study the methods for constructing and analyzing such codes
- errors quite often occur in bursts...
- it is possible to “spread out” individual blocks (interleaving)
- it turns out that methods that just try to minimize transmission errors (without guarantees regarding the number of corrected bits) work better
- there are recently developed methods that allow such codes to be used efficiently in practice, and they are close to "optimal":
-- low-density parity-check (LDPC) codes
-- turbo codes
Limits of noisy channels
Given an [n,k] code, we define the rate of the code as R = k/n. The aim is to get R as large as possible for a given error-correction capacity.
Assume a BSC with error rate p. Apparently there should be some limits on how large a value of R can be achieved.
A bit more about entropy.
Conditional entropy
Mutual information
Binary entropy function
Limits of noisy channels
Channel capacity
For BSC there is just a “fixed distribution” defined by p.
C = 1 + (1−p) log(1−p) + p log p  (logs base 2), i.e. C = 1 − H_b(p)
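The BSC capacity C = 1 − H_b(p) is easy to evaluate numerically; a minimal sketch (the function name is mine):

```python
import math

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p:
    C = 1 + (1-p)*log2(1-p) + p*log2(p) = 1 - H_b(p)."""
    if p in (0.0, 1.0):
        return 1.0
    return 1 + (1 - p) * math.log2(1 - p) + p * math.log2(p)

print(bsc_capacity(0.0))   # 1.0 - noiseless channel
print(bsc_capacity(0.5))   # 0.0 - output independent of input
print(bsc_capacity(0.11))  # ~0.5 - at most half a bit per channel use
```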
Shannon Channel Coding Theorem
Shannon's original proof just shows that such codes exist. With LDPC and turbo codes it is actually possible to approach Shannon's limit as closely as we wish.
Some recent codes
LDPC (Low Density Parity Check) codes
Turbo codes
- interleaving (try to combine several reasonably good codes)
- feedback (decoding of the next row depends on errors in the previous ones)
Convolutional codes
Shannon Channel Coding Theorem
Main topics covered by the course
Transmission over noiseless channels (data compression)
• Notion of entropy, its relation to data compression
• Optimal compression codes (Huffman code)
• Heuristic compression methods (Lempel-Ziv code)
Transmission over noisy channels (transmission error correction)
• Notion of entropy, its relation to channel capacity and theoretical possibilities for error correction (Shannon information theory)
• Practical methods for transmission error correction (block error correction codes)
– As little "reminding" of finite fields and linear algebra as will be needed to discuss this topic :)
– Definition and basic properties of block error-correcting codes
– Hamming codes correcting single errors
– Multiple error correction - BCH and Reed-Solomon codes
– Some applications of error correction (e.g. error correction on CDs)
Plans for nearest future :)
We will need to start with some facts and results from algebra; to make the most mathematical parts somewhat less intense I propose to mix different subjects a bit:
12.02 (Thu) 10:30 - Block codes and linear block codes. Some examples. Groups, fields, vector spaces, codes - basic definitions.
17.02 (Tue) 12:30 - Codes: syndrome decoding, some more definitions, again something from algebra :)
19.02 (Thu) 10:30 - Entropy: basic definitions, its relation to data compression.
Requirements
• 5-6 homeworks
In principle these are short-term - up to a 2-week deadline.
The requirement is a bit relaxed this year; still, at least half of the homeworks must be submitted before the exam session starts.
80% of grade
• Exam
In written form; will consist of practical exercises and, probably, some theoretical questions.
A take-home exam - you get the questions and bring them back within 2-3 days.
20% of grade
• To qualify for grade 10 you may be asked to cope with some additional question(s)/problem(s)
Academic honesty
You are expected to submit only your own work!
Sanctions:
Receiving a zero on the assignment (under no circumstances will a resubmission be allowed)
No admission to the exam and no grade for the course
Textbooks
Vera Pless
Introduction to the Theory of Error-Correcting Codes
Wiley-Interscience, 1998 (3rd ed)
Course textbook
Textbooks
W.Cary Huffman, Vera Pless
Fundamentals of Error-Correcting Codes
Cambridge University Press, 2003
Textbooks
Neil J. A. Sloane, Florence Jessie MacWilliams
The Theory of Error-Correcting Codes
North Holland, 1983
Textbooks
Todd K. Moon
Error Correction Coding: Mathematical Methods and Algorithms
Wiley-Interscience, 2005
Textbooks
David J. C. MacKay
Information Theory, Inference and Learning Algorithms
Cambridge University Press, 2007 (6th ed)
http://www.inference.phy.cam.ac.uk/mackay/itila/
Textbooks
Somewhat “heavy” on probabilities-related material, but very “user friendly” and recommended for entropy-related topics.
Textbooks
Juergen Bierbrauer
Introduction to Coding Theory
Chapman & Hall/CRC, 2004
Textbooks
Norman Biggs
Codes: An Introduction to Information Communication and Cryptography
Springer, 2008
Textbooks
Thomas M. Cover, Joy A. Thomas
Elements of Information Theory
Wiley-Interscience, 2006 (2nd ed)
Web page
http://susurs.mii.lu.lv/juris/courses/ict2015.html
It is expected to contain:
• short summaries of lectures
• announcements
• power point presentations
• homework problems
• frequently asked questions (???)
• other useful information
Contact information
Juris Vīksna
Room 421, Rainis boulevard 29
phone: 67213716