
Chapter 1: Introduction to Information Theory

Yayu Gao

Dian Group, School of Electronic Information and Communications

Email: yayugao@hust.edu.cn
Homepage: http://122.205.5.5:8084/~yayugao/
Course Site: http://122.205.5.5:8084/~yayugao/infotheory_2019.html

April 20th, 2019

Chapter 1: Introduction to Information Theory

- What is information?
- What is information theory?
- Information theory in communication

What is information?

- Eliminate redundancy?
- Extract essential characteristics?
- Symbolic presentation?
- What do you see in the above drawing?
- Inspiration?

How to define information?

- No universal definition
- Lack of a complete, clear, universally recognized concept of information
- Reason: we do not completely understand the nature of information
- Characteristics of information
  - Syntax: presentation
  - Semantics: meaning
  - Pragmatics: utility

Understand information at different levels

- Broad view
  - Syntax + Semantics + Utility
- Technical view
  - Syntax + Semantics
- Statistical view
  - Syntax only
  - Formulated by mathematics
  - Statistical characteristics of presentation

A snapshot of Shannon’s information theory

- Statistical information
  - Advantages:
    - Clear definition
    - Statistical properties independent of presentation
- Entropy

H(X) = −∑ p(x) log p(x)
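
As a quick numeric illustration of this formula (a minimal sketch, not from the slides), a fair coin carries exactly one bit of entropy, and a biased coin carries less:

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum_x p(x) log2 p(x), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit
print(entropy([0.9, 0.1]))  # biased coin: about 0.469 bits
print(entropy([1.0]))       # certain outcome: 0.0 bits
```

The `if p > 0` guard implements the usual convention 0 log 0 = 0.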

Long before Shannon's information theory

- 1775 B.C., Greek letters

- 1400 B.C., Chinese oracle bone script

- 800 B.C., making a fool of the seigneurs by lighting false signal fires

Talking Drums in West Africa

- Message: "Come back home."

- Translated by the drummers as: "Make your feet come back the way they went, make your legs come back the way they went, plant your feet and your legs below, in the village which belongs to us."

Talking Drums in West Africa

- Tonal languages
  - You are happy ↗ vs. You are happy ↘
  - alambaka boili [- – ] = he watched the riverbank
  - alambaka boili [—- - ] = he boiled his mother-in-law
- The spoken languages of Africa elevated tonality to a crucial role. The drum language employed tone and only tone.
  - Songe (the moon) → songe li tange la manga (the moon looks down at the earth)
  - Koko (the fowl) → koko olongo la bokiokio (the fowl, the little one that says kiokio)

Development of modern communication technologies

- In 1838, Morse code

- In 1875, telephone invented by Bell

- In 1877, gramophone invented by Edison

- In 1901, wireless telegraph invented by Marconi

- In 1927, two radio programs broadcast by NBC

- In 1938, the radio show "The War of the Worlds" caused panic

- In 1939, television broadcast

- In 1944, first computer built at Harvard University

Technical preparation before Shannon’s information theory

- Telegraph (Morse, 1830s)

- Telephone (Bell, 1876)

- Wireless telegraph (Marconi, 1887)

- AM radio (early 1900s)

- SSB modulation (Carson, 1922)

- Television (1925-1927)

- Telex (1931)

- FM radio (Armstrong, 1936)

- Pulse-code modulation (PCM) (Reeves, 1937-1939)

- Vocoder (Dudley, 1939)

- Spread spectrum (1940s)

Long before Shannon’s information theory

- In 1924, Nyquist found that the maximum signaling rate grows as log N, where N is the number of signal levels [2]
  - Are Morse codes optimal?
  - What is the gain with the optimal Morse codes?
  - How to design the optimal codes?

- In 1928, Nyquist sampling theorem [3]

Technical preparation before Shannon’s information theory

- In 1928, Hartley introduced "rate of communication," "inter-symbol interference," and "capacity of a system to transmit information" [4]
  - "The point of view developed is useful in that it provides a ready means of checking whether or not claims made for the transmission possibilities of a complicated system lie within the range of physical possibility."

- Hartley: "capacity" requires a quantitative metric of information
  - H = n log s, where n is the number of selections and s is the number of symbols selected from

- "Information arises from the selection among a limited set of possibilities"
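
Hartley's measure can be read as the log of the number of equally likely messages. A small sketch of this reading (illustrative, not from the slides): selecting n symbols from an alphabet of s gives s^n possible messages, so H = n log s bits.

```python
import math

def hartley(n, s):
    """Hartley's measure H = n * log2(s): n selections from s symbols, in bits."""
    return n * math.log2(s)

# Selecting 3 symbols from an alphabet of 8 gives 8**3 = 512 possible messages.
print(hartley(3, 8))      # 9.0 bits
print(math.log2(8 ** 3))  # same value: log2 of the message count
```

Note that this coincides with Shannon's entropy only when all symbols are equiprobable, which is exactly the limitation discussed below.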

Comments on Nyquist’s and Hartley’s work

- Contributions:
  - A definition of information
  - A metric of information quantity

- Limitations:
  - Noise is ignored
  - The randomness of source symbols is ignored

Impact after Shannon’s information theory

- The Internet (Ethernet, DSL, WiFi, telephone-line modems, etc.)

- The Web (Google, eBay, Amazon, MySpace, Facebook, YouTube, etc.)

- Cellular communications (GSM, IS-95, cdma2000, WCDMA, 3GPP, etc.)

- The personal computer (Mac, Windows, etc.)

- Bluetooth

- VHS, CD, DVD, Blu-ray

- Digital TV & radio (DirecTV, XM Radio, Sirius, digital CATV, HDTV, TiVo)

- JPEG, MPEG, MP3

- GPS

- DSP chips; flash memory

- MIMO, MUD

- TCM, Turbo codes, Raptor codes, Fountain codes

Major challenges for establishing a communication theory

- How much information is transmitted?
  - Quantify the information carried by symbols

- How to evaluate the performance of communication systems?
  - Transmission efficiency in communication systems
  - Accuracy of information transmission

- Nature of real-world scenarios?
  - Noise interference

- Core problems: efficiency vs. reliability

- Pioneering work by Claude E. Shannon and Norbert Wiener

Efficiency vs. Reliability

- f u cn rd ths, u cn bcm a sec & gt a gd jb w hi pa!

- If you can read this, you can become a secretary and get a good job with high pay!

Shannon’s perspective on information

- "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."

- "Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem."

- "The significant aspect is that the actual message is one selected from a set of possible messages."

- "The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design."

What is the goal of communication?

- The essence of communication is to transmit information.

- Fundamental question: how to quantify information?
  - Pioneering work: Shannon's information theory

How much information does a message contain?

- Related to the cost of the transmission?

- Related to the length of the message?

- Related to how surprising the message is.

- The essence of information is to eliminate uncertainty.
  - The more uncertainty is eliminated after receiving the message, the more information is transmitted.

How to measure information?

- Define a quantitative measure of "information"
  - Consider statistical information
  - Uncertainty is eliminated by information.

- Mathematical tools:
  - Uncertainty can be described by probability theory, stochastic processes, and so on.

- Modeling steps:
  1. Investigate the properties of information
  2. Model them in probabilities

Step 1: Investigate the properties of information

- Property 1: Information contained in events ought to be defined in terms of some measure of the uncertainty of the events.

- Property 2: Less certain events ought to contain more information than more certain events.

- Property 3: The information of unrelated/independent events taken as a single event should equal the sum of the information of the unrelated events.

Step 2: Model the properties of information in probabilities

- First, we investigate the measure of information I(a) for a single outcome (event) a:

I(a) = ?

- Property 1
  - Information contained in events ought to be defined in terms of some measure of the uncertainty of the events.

- Modeling
  - A natural measure of the uncertainty of event a is its probability P(a).
  - Define the information in terms of P(a):

I(a) = f(P(a)).

Step 2: Model the properties of information in probabilities

- Property 2
  - Less certain events ought to contain more information than more certain events.

- Modeling
  - Information decreases as probability increases:

P(a) > P(b) ⇒ I(a) < I(b)

  - Non-linear mapping from probability to information:

P(a) = 1 ⇒ I(a) = 0
P(a) = 0 ⇒ I(a) = ∞

Step 2: Model the properties of information in probabilities

- Property 3
  - The total information of independent events should be the sum of the information of each event.

- Modeling
  - Suppose a and b are two independent events with P(a) = p1 and P(b) = p2.
  - Considered together as a single event, P(a, b) = p1 · p2, and we require

I(a, b) = I(a) + I(b)

  - In terms of f, this is the functional equation

f(p1 · p2) = f(p1) + f(p2)

- What is f ?
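
Before the answer on the next slide, it is worth sketching why the logarithm is forced. Assuming f is continuous (an assumption beyond the three properties), the multiplicative-to-additive requirement reduces to Cauchy's functional equation:

```latex
% Sketch, assuming f continuous; not from the slides.
\begin{align*}
  &\text{Let } g(t) := f(e^{-t}) \text{ for } t \ge 0.
    \text{ Then } f(p_1 p_2) = f(p_1) + f(p_2) \\
  &\quad\Longleftrightarrow\ g(t_1 + t_2) = g(t_1) + g(t_2)
    \quad \text{(Cauchy's functional equation)}, \\
  &\text{whose continuous solutions are } g(t) = c\,t,
    \text{ i.e. } f(p) = -c \log p. \\
  &\text{Property 2 forces } c > 0;
    \text{ choosing } c \text{ amounts to choosing the base of the logarithm.}
\end{align*}
```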

Information of event and source

- Information of an event
  - For an event x with probability p(x), its self-information is defined by

I(x) = −log p(x) = log [1 / p(x)].

  - This is the only form that satisfies all the properties of information.

- Information of a source
  - Model the source as a random variable X with probability mass function p(x).
  - The average information, or entropy, is defined by

H(X) = −∑ p(x) log p(x).

- More details of these concepts follow in the chapter on source entropy.
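
The two definitions fit together directly: entropy is the probability-weighted average of self-information. A sketch with an illustrative four-symbol source (the alphabet and probabilities are made up for the example):

```python
import math

def self_information(p):
    """I(x) = -log2 p(x): bits of surprise for an outcome of probability p."""
    return -math.log2(p)

def entropy(pmf):
    """H(X) = sum_x p(x) I(x): average self-information of the source, in bits."""
    return sum(p * self_information(p) for p in pmf.values() if p > 0)

# Illustrative source: rarer symbols carry more self-information.
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
for x, p in pmf.items():
    print(x, self_information(p))  # a: 1.0, b: 2.0, c: 3.0, d: 3.0 bits
print(entropy(pmf))                # 1.75 bits
```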

What is information theory?

- The major issue in information theory is to discover mathematical laws of communicating or manipulating information.

- Information theory sets up quantitative measures of information and of the capacity of various systems to transmit, store, and otherwise process information.
  - Information of an event: I(A)
  - Information of a source: H(X)

What is information theory?

- Classical information theory was published in the landmark 1948 paper "A Mathematical Theory of Communication" by Claude E. Shannon (1916-2001).

- Before 1948, what did we know about communication systems?
  - A technique.
  - No theoretical framework.
  - Intuition: the faster you transmit, the more errors you have.

- After 1948, what do we know about communication systems?
  - How to quantify information.
  - A model of a communication system.
  - The limits of communication...

Goals of this course

- An understanding of the intrinsic properties of the transmission of information

- The relation between coding and the fundamental limits of information transmission in the context of communications

- NOT a comprehensive introduction to the field of information theory

- Does NOT touch in a significant manner on important topics such as modern coding methods and complexity

Applications in communication

- Main application area: coding

- Three coding theorems established by Shannon:
  - Source coding theorem
  - Channel coding theorem
  - Rate-distortion theorem

- Practical methods invented and implemented after Shannon:
  - Source coding: Huffman codes (compact), Lempel-Ziv (compress, gzip)
  - Channel coding: error-correcting codes (Hamming, Reed-Solomon, convolutional, Trellis, Turbo)
  - Rate-distortion coding: vocoders, MiniDiscs, MP3, JPEG, MPEG

Relationship with other fields

Comments on Shannon’s theory

- Limitations of Shannon's information theory
  - It models the information source by a sample space and describes the outcomes by probabilities.
  - What about other cases? Uncountable spaces, unknown probabilities, ...
  - It does not involve subjective ideas.

- Other approaches in information theory
  - Noise theory, signal filtering and detection, statistical detection and prediction, modulation, ...

- In this course, we focus only on Shannon's theory.

Information theory in communication

- Typical model of a communication system

Block diagram of communication systems

- The transmission and processing of information in communication systems

Source coding vs. Channel coding

- Source coding
  - Core problem: efficiency
  - Efficiency: having an average code length that is as small as possible
  - Example: use shorter codes for the English letters that appear frequently, so as to reduce the average code length

- Channel coding
  - Core problem: reliability
  - Reliability: coping with errors in the transmission
  - Example: send the same sequence multiple times, so as to recover from the errors in the channel
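
The channel-coding example above can be sketched concretely as a 3-fold repetition code with majority-vote decoding over a binary symmetric channel (the channel model and error probability are illustrative choices, not from the slides):

```python
import random

def encode(bits, n=3):
    """Repetition code: send each bit n times (efficiency cost: n-fold expansion)."""
    return [b for b in bits for _ in range(n)]

def channel(bits, p_err, rng):
    """Binary symmetric channel: flip each bit independently with probability p_err."""
    return [b ^ (rng.random() < p_err) for b in bits]

def decode(bits, n=3):
    """Majority vote over each block of n received bits."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]

rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(10_000)]

received_raw = channel(message, 0.1, rng)                    # uncoded transmission
received_coded = decode(channel(encode(message), 0.1, rng))  # coded transmission

raw_errors = sum(a != b for a, b in zip(message, received_raw))
coded_errors = sum(a != b for a, b in zip(message, received_coded))
print(raw_errors / len(message))    # about 0.10: the raw channel error rate
print(coded_errors / len(message))  # about 0.028 = 3p^2(1-p) + p^3: majority vote fails
                                    # only when 2 or 3 of the 3 copies are flipped
```

This makes the efficiency/reliability trade-off of the next slide explicit: the error rate drops from p to roughly 3p², but three channel symbols are spent per message bit.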

Reliability vs. Efficiency

- The eternal issue of information theory
  - Losing reliability to achieve higher efficiency
  - Losing efficiency to achieve higher reliability

- Balance between efficiency and reliability
  - Efficiency:
    - Digital case: send as few symbols as possible
    - Analog case: reduce the time that the channel or the bandwidth is used
  - Reliability:
    - Digital case: make the error probability as small as possible
    - Analog case: reduce the noise as much as possible

Summary

What is information?

What is information theory?

Information theory in communication

Reference

[1] T. M. Cover and J. A. Thomas, Elements of Information Theory (2nd Edition), Hoboken, NJ: Wiley, 2006.

[2] H. Nyquist, "Certain factors affecting telegraph speed," Bell Syst. Tech. J., vol. 3, pp. 324-352, Apr. 1924.

[3] H. Nyquist, "Certain topics in telegraph transmission theory," AIEE Trans., vol. 47, pp. 617-644, Apr. 1928.

[4] R. V. L. Hartley, "Transmission of information," Bell Syst. Tech. J., vol. 7, pp. 535-563, July 1928.
