TRANSCRIPT
The Price of Uncertainty in Communication
Brendan Juba (Washington U., St. Louis) with Mark Braverman (Princeton)
SINCE WE ALL AGREE ON A PROB. DISTRIBUTION OVER WHAT I MIGHT SAY, I CAN COMPRESS IT TO: “THE 9,232,142,124,214,214,123,845TH MOST LIKELY MESSAGE.” THANK YOU!
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Coding schemes
[Figure: “messages” (Bird, Chicken, Cat, Dinner, Pet, Lamb, Duck, Cow, Dog) mapped to “encodings”]
Ambiguity
[Figure: the same messages (Bird, Chicken, Cat, Dinner, Pet, Lamb, Duck, Cow, Dog), now with encodings shared by more than one message]
Prior distributions
[Figure: the same messages (Bird, Chicken, Cat, Dinner, Pet, Lamb, Duck, Cow, Dog), with a prior distribution over them]
Decode to a maximum-likelihood message
Source coding (compression)
• Assume encodings are binary strings
• Given a prior distribution P and a message m, choose the minimum-length encoding that decodes to m.
FOR EXAMPLE: HUFFMAN CODES AND SHANNON-FANO (ARITHMETIC) CODES.
NOTE: THE ABOVE SCHEMES DEPEND ON THE PRIOR.
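The prior-dependence is easy to see concretely. Below is a minimal, textbook Huffman construction (an illustration, not code from the talk): codeword lengths track the prior, so changing P changes the code.

```python
import heapq

def huffman_code(prior):
    """Build a prefix-free Huffman code: message -> bitstring."""
    # Heap entries carry (probability, tiebreak id, partial codebook).
    heap = [(p, i, {m: ""}) for i, (m, p) in enumerate(sorted(prior.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # merge the two least likely subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {m: "0" + code for m, code in c1.items()}
        merged.update({m: "1" + code for m, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Likelier messages get shorter codewords; a different prior yields a
# different code -- which is exactly why a shared prior matters here.
P = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "cow": 0.125}
code = huffman_code(P)
assert len(code["cat"]) == 1 and len(code["dog"]) == 2
assert len(code["bird"]) == 3 and len(code["cow"]) == 3
```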
SUPPOSE ALICE AND BOB SHARE THE SAME ENCODING SCHEME, BUT DON’T SHARE THE SAME PRIOR… (ALICE HOLDS PRIOR P; BOB HOLDS PRIOR Q.)
CAN THEY COMMUNICATE? HOW EFFICIENTLY?
THE CAT. THE ORANGE CAT. THE ORANGE CAT WITHOUT A HAT.
Closeness and communication
• Priors P and Q are α-close (α ≥ 1) if for every message m, αP(m) ≥ Q(m) and αQ(m) ≥ P(m)
• Disambiguation and closeness together suffice for communication:
If, for every m′ ≠ m, P[m|e] > α²P[m′|e] (m is “α²-disambiguated”), then:
Q[m|e] ≥ (1/α)P[m|e] > αP[m′|e] ≥ Q[m′|e]
SO, IF ALICE SENDS e, THEN MAXIMUM-LIKELIHOOD DECODING GIVES BOB m AND NOT m′…
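A quick numeric sanity check of this chain of inequalities; the priors and the pair m, m′ below are illustrative values, not from the talk.

```python
# Check the slide's lemma on concrete numbers: if P and Q are alpha-close
# and P[m|e] > alpha^2 * P[m'|e], then Q also ranks m above m'.

def alpha_close(P, Q, alpha):
    """The talk's closeness condition: alpha*P(m) >= Q(m) and
    alpha*Q(m) >= P(m) for every message m."""
    return all(alpha * P[m] >= Q[m] and alpha * Q[m] >= P[m] for m in P)

# Treat these as the conditional distributions P[.|e] and Q[.|e] over the
# messages consistent with some encoding e (illustrative values).
P = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
Q = {"cat": 0.4, "dog": 0.4, "bird": 0.2}
alpha = 2.0
assert alpha_close(P, Q, alpha)

m, m_prime = "cat", "bird"
# m is alpha^2-disambiguated against m' under P: 0.6 > 4 * 0.1
assert P[m] > alpha**2 * P[m_prime]
# The chain from the slide: Q[m] >= P[m]/alpha > alpha*P[m'] >= Q[m']
assert Q[m] >= P[m] / alpha > alpha * P[m_prime] >= Q[m_prime]
```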
Construction of a coding scheme
(J-Kalai-Khanna-Sudan’11, inspired by B-Rao’11)
Pick an infinite random string Rm for each m; put (m, e) ∈ E ⇔ e is a prefix of Rm.
Alice encodes m by sending the shortest prefix of Rm s.t. m is α²-disambiguated under P.
Gives an expected encoding length of at most H(P) + 2log α + 2
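A toy simulation of this construction, with the “infinite” random strings Rm simulated by a seeded RNG (a sketch under these assumptions, not the authors’ code). Alice encodes under her prior P; by the closeness lemma, Bob’s maximum-likelihood decoding under a 2-close prior Q still recovers m.

```python
import random

def R(m, n, shared_seed="demo"):
    """First n bits of the shared random string R_m (simulated)."""
    rng = random.Random(f"{shared_seed}:{m}")
    return "".join(rng.choice("01") for _ in range(n))

def encode(m, P, alpha):
    """Shortest prefix e of R_m with P(m) > alpha^2 * P(m') for every
    other message m' whose string also starts with e."""
    for n in range(1, 256):
        e = R(m, n)
        rivals = [m2 for m2 in P if m2 != m and R(m2, n) == e]
        if all(P[m] > alpha**2 * P[m2] for m2 in rivals):
            return e
    raise RuntimeError("no disambiguating prefix within 256 bits")

def decode(e, Q):
    """Maximum-likelihood decoding under the receiver's prior Q."""
    cands = [m for m in Q if R(m, len(e)) == e]
    return max(cands, key=lambda m: Q[m])

# Alice and Bob hold different (2-close) priors, yet decoding succeeds:
P = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "cow": 0.125}
Q = {"cat": 0.25, "dog": 0.5, "bird": 0.125, "cow": 0.125}
for m in P:
    assert decode(encode(m, P, alpha=2), Q) == m
```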
Remark
Mimicking the disambiguation property of natural language provided an efficient strategy for communication.
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Our results
1. The JKKS’11/BR’11 encoding is near-optimal:
– H(P) + 2log α – 3log log α – O(1) bits are necessary
(cf. the achieved H(P) + 2log α + 2 bits)
2. Analysis of the positive-error setting [Haramaty-Sudan’14]: if incorrect decoding w.p. ε is allowed—
– Can achieve H(P) + log α + log 1/ε bits
– H(P) + log α + log 1/ε – (9/2)log log α – O(1) bits are necessary for ε > 1/α
An ε-error coding scheme
(Inspired by J-Kalai-Khanna-Sudan’11, B-Rao’11)
Pick an infinite random string Rm for each m; put (m, e) ∈ E ⇔ e is a prefix of Rm.
Alice encodes m by sending the prefix of Rm of length log 1/P(m) + log α + log 1/ε.
Analysis
Claim. m is decoded correctly w.p. 1 – ε.
Proof. There are at most 1/Q(m) messages with Q-probability greater than Q(m) ≥ P(m)/α.
The probability that Rm′, for any one of these m′, agrees with the first log 1/P(m) + log α + log 1/ε ≥ log 1/Q(m) + log 1/ε bits of Rm is at most εQ(m).
By a union bound, the probability that any of these agree with Rm (and hence could be wrongly chosen) is at most ε.
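A Monte Carlo sanity check of this claim (a sketch, not an experiment from the paper): with fixed-length prefixes as above and α-close priors, the measured error rate should stay below ε. All priors and parameters below are illustrative.

```python
import math
import random

def R(m, n, seed):
    """First n bits of the simulated shared random string R_m."""
    rng = random.Random(f"{seed}:{m}")
    return "".join(rng.choice("01") for _ in range(n))

def encode(m, P, alpha, eps, seed):
    """Fixed-length prefix of R_m, per the epsilon-error scheme."""
    n = math.ceil(math.log2(1 / P[m]) + math.log2(alpha) + math.log2(1 / eps))
    return R(m, n, seed)

def decode(e, Q, seed):
    """Maximum likelihood among messages whose string starts with e."""
    cands = [m for m in Q if R(m, len(e), seed) == e]
    return max(cands, key=lambda m: Q[m]) if cands else None

# Illustrative 2-close priors for sender (P) and receiver (Q):
P = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "cow": 0.125}
Q = {"cat": 0.25, "dog": 0.5, "bird": 0.0625, "cow": 0.1875}
alpha, eps = 2, 0.25

rng = random.Random(1)
trials, errors = 2000, 0
for t in range(trials):
    m = rng.choices(list(P), weights=list(P.values()))[0]
    # A fresh seed per trial models a fresh draw of the strings R_m.
    if decode(encode(m, P, alpha, eps, seed=t), Q, seed=t) != m:
        errors += 1
assert errors / trials < eps
print(f"error rate: {errors / trials:.3f} (bound: {eps})")
```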
Length lower bound 1—reduction to deterministic encodings
• Min-max Theorem: it suffices to exhibit a distribution over priors for which deterministic encodings must be long
Length lower bound 2—hard priors
[Figure: the hard-prior construction on a log-probability scale, with levels ≈ 0, –log α, and –2log α; a special message m* and a set S of messages; the two priors shown are α-close. Lemma 1: H(P) = O(1). Lemma 2 concerns the α-close priors.]
Length lower bound 3—short encodings have collisions
• Encodings of expected length < 2log α – 3log log α encode m1 ≠ m2 identically with nonzero probability
• With nonzero probability over the choice of P & Q, m1, m2 ∈ S and m* ∈ {m1, m2}
• Decoding error with nonzero probability
☞ Errorless encodings have expected length ≥ 2log α – 3log log α = H(P) + 2log α – 3log log α – O(1)
Length lower bound 4—very short encodings often collide
• If the encoding has expected length < log α + log 1/ε – (9/2)log log α, then m* collides with ∼(ε log α)∙α other messages
• The probability that our α draws for S miss all of these messages is < 1 – 2ε
• Decoding error with probability > ε
☞ Error-ε encodings have expected length ≥ H(P) + log α + log 1/ε – (9/2)log log α – O(1)
Recap. We saw a variant of source coding for which (near-)optimal solutions resemble natural languages in interesting ways.
The problem. Design a coding scheme E so that for any sender and receiver with α-close prior distributions, the communication length is minimized.
(In expectation w.r.t. sender’s distribution)
Questions?