addressing and hashing question: for a network of computers it is desirable to be able to send a...


Upload: pauline-richard

Post on 04-Jan-2016


TRANSCRIPT

Page 1

Addressing and hashing

• Question: For a network of computers it is desirable to be able to send a message from A to B without B knowing that a message is on its way.

• Model the network as a graph $G$ and assign an address from $\{0,1\}^k$ to each vertex, so that the distance between two vertices in the graph equals the Hamming distance between their addresses.

• This is equivalent to regarding $G$ as an induced subgraph of the hypercube $H_k$ in which distances are preserved. This is impossible for $K_3$.

Example addresses over {0,1,*} for a graph on five vertices:

vertex  address
1       1 1 1 * *
2       1 0 * 1 *
3       * 0 0 0 1
4       0 0 1 * *
5       0 0 0 0 0

[Figure: the example graph on vertices 1, 2, 3, 4, 5.]

Page 2

Introduce a new alphabet $\{0,1,*\}$ and form addresses by taking $n$-tuples from it.

The distance between two addresses is defined to be the number of places where one has a 0 and the other a 1.

For an addressing of a graph G, we require that the distance of any two vertices in G is equal to the distance of their addresses.

Denote by N(G) the minimum value of n for which there exists an addressing of G with length n.
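The distance rule above can be sketched in a few lines of Python. This is only an illustration: the function names `distance` and `distance_table` are made up here, and the addresses are the five listed for the example graph on page 1.

```python
def distance(a, b):
    """Number of coordinates where one address has 0 and the other has 1.

    A star never contributes, whatever it is compared with.
    """
    return sum(1 for s, t in zip(a, b) if {s, t} == {"0", "1"})


# Addresses of the five-vertex example (vertex label -> address).
addresses = {
    1: "111**",
    2: "10*1*",
    3: "*0001",
    4: "001**",
    5: "00000",
}


def distance_table(addr):
    """Pairwise address distances, keyed by (i, j) with i < j."""
    keys = sorted(addr)
    return {(i, j): distance(addr[i], addr[j])
            for i in keys for j in keys if i < j}


table = distance_table(addresses)
for pair, d in sorted(table.items()):
    print(pair, d)
```

For an addressing of a graph, every entry of `table` has to equal the graph distance between the corresponding vertices.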

Page 3

Address a tree

For a tree we can do without the stars. The proof is by induction on the number of vertices; a tree with two vertices has the trivial addressing of length 1.

Suppose we can address trees with $k$ vertices. Let $x_0, \dots, x_k$ be the vertices of a tree $T$ in which $x_0$ has degree 1, consider an addressing of $T \setminus \{x_0\}$, and let $a_i$ be the address of $x_i$.

Suppose $x_0$ is adjacent to $x_1$. Then address $x_0$ as $(1, a_1)$ and change the other addresses to $(0, a_i)$ for $i = 1, \dots, k$.

Thus $N(T) \le |V(T)| - 1$.
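The induction can be turned directly into a recursive procedure. The sketch below is one possible reading of it, not code from the slides: `address_tree` and `tree_distance` are hypothetical helper names, the tree is given as an adjacency dict, and the six-vertex tree is just a sample input.

```python
from collections import deque


def address_tree(adj):
    """Star-free addresses of length |V| - 1 via the leaf-removal induction."""
    vertices = list(adj)
    if len(vertices) == 1:
        return {vertices[0]: ""}            # one vertex needs no coordinates
    leaf = next(v for v in vertices if len(adj[v]) == 1)
    (neighbour,) = adj[leaf]                # the unique neighbour of the leaf
    smaller = {v: [u for u in adj[v] if u != leaf]
               for v in vertices if v != leaf}
    addr = address_tree(smaller)
    result = {v: "0" + a for v, a in addr.items()}   # (0, a_i)
    result[leaf] = "1" + addr[neighbour]             # (1, a_1)
    return result


def tree_distance(adj, source):
    """Breadth-first distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Sample tree (an assumption for illustration): edges 0-1, 1-2, 0-3, 3-4, 4-5.
tree = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0, 4], 4: [3, 5], 5: [4]}
addr = address_tree(tree)
hamming = lambda a, b: sum(s != t for s, t in zip(a, b))
```

Each recursive call prepends one coordinate, so a tree on $n$ vertices ends up with addresses of length $n-1$.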

Page 4

Address a complete graph $K_m$

Consider the identity matrix of size $m-1$. Replace the zeros above the diagonal with stars and add a new row of zeros. The $m$ rows are the addresses, so

$N(K_m) \le m-1$.

Example for $K_5$:

1 * * *
0 1 * *
0 0 1 *
0 0 0 1
0 0 0 0

[Figure: the five-vertex example graph from page 1, shown again with its address table.]
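A short sketch of this construction, under the assumption that addresses are represented as strings over `0`, `1`, `*`; the function names are illustrative only.

```python
def complete_graph_addresses(m):
    """Addresses for K_m: identity matrix of size m-1 with the zeros above
    the diagonal replaced by stars, followed by an all-zero row."""
    n = m - 1
    rows = ["".join("1" if j == i else ("*" if j > i else "0")
                    for j in range(n))
            for i in range(n)]
    rows.append("0" * n)
    return rows


def distance(a, b):
    """Number of coordinates where one address has 0 and the other has 1."""
    return sum(1 for s, t in zip(a, b) if {s, t} == {"0", "1"})


k5 = complete_graph_addresses(5)
for row in k5:
    print(" ".join(row))
```

Any two rows conflict in exactly one coordinate (the column of the diagonal 1 of the upper row), which is the distance required in a complete graph.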

Page 5

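vertex  address
x1      1 1 1 * *
x2      1 0 * 1 *
x3      * 0 0 0 1
x4      0 0 1 * *
x5      0 0 0 0 0

For each column, multiply the sum of the variables with a 1 in that column by the sum of the variables with a 0 in that column:

From the 1st column we have $(x_1+x_2)(x_4+x_5)$.
From the 2nd column we have $x_1(x_2+x_3+x_4+x_5)$.
From the 3rd column we have $(x_1+x_4)(x_3+x_5)$.
From the 4th column we have $x_2(x_3+x_5)$.
From the 5th column we have $x_3 x_5$.

Sum them up and have $\sum_{i<j} d_{ij}\, x_i x_j$, where $d_{ij}$ is the distance between $x_i$ and $x_j$. What does this mean?

[Figure: the example graph on vertices 1, 2, 3, 4, 5.]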
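As a worked check of the sum (an expansion added here for illustration, using only the five column products listed above), the coefficient of $x_i x_j$ counts the columns in which one of the two addresses has a 1 and the other a 0, which is exactly the address distance:

```latex
\begin{aligned}
&(x_1+x_2)(x_4+x_5) + x_1(x_2+x_3+x_4+x_5) + (x_1+x_4)(x_3+x_5)
  + x_2(x_3+x_5) + x_3x_5 \\
&\quad = x_1x_2 + 2x_1x_3 + 2x_1x_4 + 3x_1x_5
  + x_2x_3 + x_2x_4 + 2x_2x_5 + x_3x_4 + x_3x_5 + x_4x_5 .
\end{aligned}
```

For instance, the coefficient 3 of $x_1x_5$ comes from columns 1, 2 and 3, and the addresses of $x_1$ and $x_5$ indeed have distance 3.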

Page 6

Thm 9.1. Let $n_+$, respectively $n_-$, be the number of positive, respectively negative, eigenvalues of the distance matrix $(d_{ij})$ of the graph $G$. Then $N(G) \ge \max\{n_+, n_-\}$.

Pf: From the previous example, $(x_1+x_2)(x_4+x_5)$ can be represented by $\tfrac{1}{2} x^{\top} A x$, where $x = (x_1, x_2, x_3, x_4, x_5)^{\top}$ and $A$ is the following matrix of rank 2 and trace 0:

0 0 0 1 1
0 0 0 1 1
0 0 0 0 0
1 1 0 0 0
1 1 0 0 0

Therefore it has one positive and one negative eigenvalue. An addressing of length $n$ gives one such matrix for each of its $n$ columns. Since $(d_{ij})$ is the sum of the matrices corresponding to these quadratic forms, it can have at most $n$ positive eigenvalues and, in the same way, at most $n$ negative eigenvalues.

Thm 9.2. N(Km) = m-1.

Thm 9.3. If T is a tree on n vertices, then N(T)=n-1.
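Two small computations may help here; they are added as an illustration and are not taken from the slides. The first confirms the eigenvalue claim for the matrix $A$ of the example, the second shows how Thm 9.1 gives the lower bound in Thm 9.2 ($J$ denotes the all-ones matrix, $I$ the identity, $\mathbf{1}$ the all-ones vector).

```latex
% Nonzero eigenvalues of A from the example:
A\,(1,1,0,1,1)^{\top} = 2\,(1,1,0,1,1)^{\top}, \qquad
A\,(1,1,0,-1,-1)^{\top} = -2\,(1,1,0,-1,-1)^{\top}.

% Distance matrix of K_m:
D = J - I, \qquad D\,\mathbf{1} = (m-1)\,\mathbf{1}, \qquad
D\,v = -v \ \text{ whenever } \ \mathbf{1}^{\top} v = 0 .

% Hence n_+ = 1 and n_- = m-1, so by Thm 9.1
N(K_m) \ \ge\ \max\{1,\, m-1\} \ =\ m-1 .
```

Together with the explicit addressing of length $m-1$ this gives $N(K_m) = m-1$.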

Page 7

The conjecture that $N(G) \le |V(G)| - 1$ for all connected graphs $G$ was proved by Peter Winkler in 1983. ($200)

Pick a vertex x0, then construct a spanning tree T by BFS, and number the vertices by DFS.

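Define $P(i) := \{\, j : x_j \text{ is on the path from } x_0 \text{ to } x_i \text{ in } T \,\}$.

Define $i \wedge j := \max(P(i) \cap P(j))$ and $i' := \max(P(i) \setminus \{i\})$. So $x_{i \wedge j}$ is the last common vertex of the two paths from $x_0$, and $x_{i'}$ is the parent of $x_i$ in $T$.

[Figure: a spanning tree on the vertices x0, x1, ..., x9.]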

Page 8

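Define $i \sim j \iff P(i) \subseteq P(j)$ or $P(j) \subseteq P(i)$. Define $c(i,j) := d_T(x_i,x_j) - d_G(x_i,x_j)$.

Lemma 9.4.
(i) $c(i,j) = c(j,i) \ge 0$;
(ii) if $i \sim j$, then $c(i,j) = 0$;
(iii) if not $(i \sim j)$, then $c(i,j') \le c(i,j) \le c(i,j') + 2$.

Pf: (i) is trivial.

(ii) $d_G(x_i,x_j) \ge |d_G(x_i,x_0) - d_G(x_j,x_0)| = d_T(x_i,x_j)$. (Why? The inequality is the triangle inequality. The equality holds because the BFS tree preserves distances from $x_0$ and one of $x_i$, $x_j$ lies on the tree path from $x_0$ to the other.) Together with $d_G \le d_T$ this gives $c(i,j) = 0$.

(iii) Since $x_j$ and $x_{j'}$ are adjacent, $|d_G(x_i,x_j) - d_G(x_i,x_{j'})| \le 1$. Since $x_{j'}$ lies on the tree path from $x_i$ to $x_j$, $d_T(x_i,x_j) = 1 + d_T(x_i,x_{j'})$. Hence
$c(i,j) = d_T(x_i,x_j) - d_G(x_i,x_j) = 1 + d_T(x_i,x_{j'}) - d_G(x_i,x_j) = c(i,j') + 1 + d_G(x_i,x_{j'}) - d_G(x_i,x_j)$,
which lies between $c(i,j')$ and $c(i,j') + 2$.

[Figure: the tree paths from x0 to xi and from x0 to xj.]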

Page 9

Example: $c(i,j) = d_T(x_i,x_j) - d_G(x_i,x_j)$

c(0,1)=0, c(0,2)=0, c(0,3)=0, c(0,4)=0, c(0,5)=0
c(1,2)=0, c(1,3)=1, c(1,4)=1, c(1,5)=1
c(2,3)=1, c(2,4)=3, c(2,5)=3
c(3,4)=0, c(3,5)=0
c(4,5)=0

[Figure: the example graph on vertices 0, 1, 2, 3, 4, 5 with its spanning tree.]

Addresses with stars:
a0=(0,0,0,0,0)
a1=(1,0,0,0,0)
a2=(1,1,0,*,0)
a3=(*,0,1,0,0)
a4=(*,*,1,1,0)
a5=(*,*,1,1,1)

Addresses without stars (the spanning tree alone):
a0=(0,0,0,0,0)
a1=(1,0,0,0,0)
a2=(1,1,0,0,0)
a3=(0,0,1,0,0)
a4=(0,0,1,1,0)
a5=(0,0,1,1,1)
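The table of $c(i,j)$ values can be recomputed with two breadth-first searches per vertex. Note the assumption: the figure did not survive, so the edge lists below are a reconstruction that is consistent with the listed $c$-values and addresses (tree edges 0-1, 1-2, 0-3, 3-4, 4-5 plus the non-tree edges 1-3 and 2-4); they are not taken from the source. The helper names are hypothetical.

```python
from collections import deque


def adjacency(n, edges):
    """Adjacency lists for an undirected graph on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def bfs_distances(adj, source):
    """Breadth-first distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Reconstructed example (assumption, see the text above).
tree_edges = [(0, 1), (1, 2), (0, 3), (3, 4), (4, 5)]
extra_edges = [(1, 3), (2, 4)]

G = adjacency(6, tree_edges + extra_edges)
T = adjacency(6, tree_edges)
d_G = {v: bfs_distances(G, v) for v in G}
d_T = {v: bfs_distances(T, v) for v in T}

# c(i, j) = d_T(x_i, x_j) - d_G(x_i, x_j) for i < j
c = {(i, j): d_T[i][j] - d_G[i][j] for i in range(6) for j in range(i + 1, 6)}
for pair, value in sorted(c.items()):
    print(pair, value)
```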

Page 10

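Let $n = |V(G)| - 1$. Let $a_i \in \{0,1,*\}^n$ be the address of $x_i$, $a_i = (a_i(1), \dots, a_i(n))$, where

$$
a_i(j) =
\begin{cases}
1 & \text{if } j \in P(i), \\
* & \text{if } c(i,j) - c(i,j') = 2, \text{ or} \\
  & \quad c(i,j) - c(i,j') = 1,\ i < j,\ c(i,j) \text{ even, or} \\
  & \quad c(i,j) - c(i,j') = 1,\ i > j,\ c(i,j) \text{ odd}, \\
0 & \text{otherwise.}
\end{cases}
$$

[Figure: the tree rooted at x0 with a vertex xi and vertices xj with parent xj', illustrating the cases j > i and j < i.]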
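The definition translates almost line by line into code. The following is a sketch under stated assumptions, not a reference implementation: the vertices are assumed to be already numbered $0, \dots, n$ in DFS order of a BFS spanning tree rooted at $x_0$, the tree is passed as a hypothetical `parent` list (`parent[j]` plays the role of $j'$), and the sample input is the reconstructed six-vertex example (tree edges 0-1, 1-2, 0-3, 3-4, 4-5, extra edges 1-3 and 2-4).

```python
from collections import deque


def bfs(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def winkler_addresses(parent, extra_edges):
    """Addresses a_0..a_n of length n, plus the graph distances d_G."""
    n = len(parent) - 1
    tree_edges = [(parent[j], j) for j in range(1, n + 1)]

    def adjacency(edges):
        adj = {v: [] for v in range(n + 1)}
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        return adj

    G = adjacency(tree_edges + list(extra_edges))
    T = adjacency(tree_edges)
    d_G = [bfs(G, v) for v in range(n + 1)]
    d_T = [bfs(T, v) for v in range(n + 1)]

    def c(i, j):
        return d_T[i][j] - d_G[i][j]

    # P[i]: indices on the tree path from x_0 to x_i (parents come first).
    P = [{0}]
    for i in range(1, n + 1):
        P.append(P[parent[i]] | {i})

    addresses = []
    for i in range(n + 1):
        word = []
        for j in range(1, n + 1):
            step = c(i, j) - c(i, parent[j])
            if j in P[i]:
                word.append("1")
            elif step == 2 or (step == 1 and (c(i, j) % 2 == 0) == (i < j)):
                word.append("*")      # even c for i < j, odd c for i > j
            else:
                word.append("0")
        addresses.append("".join(word))
    return addresses, d_G


def distance(a, b):
    return sum(1 for s, t in zip(a, b) if {s, t} == {"0", "1"})


addresses, d_G = winkler_addresses([None, 0, 1, 0, 3, 4], [(1, 3), (2, 4)])
for i, a in enumerate(addresses):
    print(f"a{i} =", " ".join(a))
```

On this input the function reproduces the starred addresses listed for the example on page 9.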

Page 11

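Thm 9.5. $d(a_i, a_k) = d_G(x_i, x_k)$.

Pf: Assume $i < k$.

(1) Suppose $i \sim k$, i.e. $i$ is in $P(k)$. For $j \in P(i)$, $a_k(j) = 1$ and $a_i(j) = 1$. For $j \in P(k) \setminus P(i)$, $a_k(j) = 1$ and $a_i(j) \ne 1$. For these $j$ we have $i \sim j$, so $c(i,j) = 0$, hence $a_i(j) = 0$. For $j \notin P(k)$, neither address has a 1. Then $d(a_i, a_k) = |P(k) \setminus P(i)| = d_G(x_i, x_k)$.

[Figure: the tree path from x0 through xi and xj to xk.]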

Page 12

Thm 9.5 (continued). $d(a_i,a_k) = d_G(x_i,x_k)$.

(2) Suppose not $(i \sim k)$, i.e. $i$ is not in $P(k)$.

Exactly one of $a_i(j)$, $a_k(j)$ equals 1 when $j$ is in $P(i) \setminus P(k)$ or $P(k) \setminus P(i)$. We need to prove that there are exactly $c(i,k)$ *'s in these coordinates. This will yield
$d(a_i, a_k) = |P(k) \setminus P(i)| + |P(i) \setminus P(k)| - c(i,k) = d_T(x_i,x_k) - c(i,k) = d_G(x_i,x_k)$.

By Lemma 9.4, consider the chains
$c(i,k) \ge c(i,k') \ge c(i,k'') \ge \dots \ge c(i, i \wedge k) = 0$,
$c(i,k) \ge c(i',k) \ge c(i'',k) \ge \dots \ge c(i \wedge k, k) = 0$,
where $i \wedge k = \max(P(i) \cap P(k))$.

We will obtain one * in $a_i$ for each even $m$ with $0 < m \le c(i,k)$ and one * in $a_k$ for each odd $m$ with $0 < m \le c(i,k)$.

For each even $m$ with $0 < m \le c(i,k)$, let $j$ be the unique vertex in the first chain such that $c(i,j) \ge m$ and $m > c(i,j')$. Note that $m$ may not be in the list, but $j$ does exist! Each $j$ is distinct, because $c$ changes by at most 2 with each step. The DFS ordering guarantees $i < j$ for each $j$ in $P(k) \setminus P(i)$. Thus $a_i(j) = *$ exactly for those $j$ in $P(k) \setminus P(i)$ that correspond to some even $m$.

[Figure: the tree with root x0 and the two branches leading to xi and xk.]

Page 13

Similarly, for each odd $m$ with $0 < m \le c(i,k)$, let $j$ be the unique vertex such that $c(j,k) \ge m$ and $m > c(j',k)$.

Each $j$ is distinct because $c$ changes by at most 2 with each step.

The DFS ordering guarantees $j < k$ for each $j$ in $P(i) \setminus P(k)$.

Thus $a_k(j) = *$ for the $j$ in $P(i) \setminus P(k)$ corresponding to some odd $m$.

Thus we have counted the *'s in $P(i) \setminus P(k)$ and $P(k) \setminus P(i)$. The number of *'s is exactly the number of even integers plus the number of odd integers between 1 and $c(i,k)$, which together equals $c(i,k)$.

Thm 9.6. $N(G) \le |V(G)| - 1$.

[Figure: the tree with root x0 and the two branches leading to xi and xk.]

Page 14

Def: An ABD(k,w), associative block design, is a set of $b := 2^w$ elements of $\{0,1,*\}^k$ with the following properties: if the elements are the rows of a $b \times k$ matrix $C$, then
(i) each row of $C$ has $k-w$ stars;
(ii) each column of $C$ has $b(k-w)/k$ stars;
(iii) any two distinct rows have distance at least 1.

Ex: ABD(4, 3)

* 0 0 0
0 * 1 0
0 0 * 1
0 1 0 *
* 1 1 1
1 * 0 1
1 1 * 0
1 0 1 *

This implies each $k$-bit 0-1 vector has distance 0 to exactly one row of $C$.

A row with four stars, such as * * * * 0 0 1 0 0, represents $2^4 = 16$ 0-1 vectors.

So an ABD(k,w) uses $2^w$ $k$-tuples over $\{0,1,*\}$ to indicate all $2^k$ $k$-bit 0-1 vectors.
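The three defining properties are easy to test mechanically. The sketch below checks them for the ABD(4, 3) example; `matches` and `is_abd` are illustrative names. It relies on one observation: with $2^w$ rows of $k-w$ stars each, property (iii) is equivalent to every 0-1 vector having distance 0 to exactly one row, so the code checks the exact cover instead of comparing all pairs of rows.

```python
from itertools import product

ABD_4_3 = ["*000", "0*10", "00*1", "010*", "*111", "1*01", "11*0", "101*"]


def matches(row, bits):
    """True if the 0-1 vector `bits` has distance 0 to `row`."""
    return all(r == "*" or r == x for r, x in zip(row, bits))


def is_abd(rows, k, w):
    """Check the definition of an ABD(k, w) for a list of strings."""
    b = len(rows)
    if b != 2 ** w or any(len(r) != k for r in rows):
        return False
    # (i) each row has k - w stars
    if any(r.count("*") != k - w for r in rows):
        return False
    # (ii) each column has b(k - w)/k stars
    if any(sum(r[j] == "*" for r in rows) * k != b * (k - w)
           for j in range(k)):
        return False
    # (iii) via exact cover: every k-bit vector matches exactly one row
    return all(sum(matches(r, bits) for r in rows) == 1
               for bits in product("01", repeat=k))


print(is_abd(ABD_4_3, 4, 3))
```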

Page 15

Thm 9.7. If an ABD(k,w) exists, then:
(1) it has exactly $bw/(2k)$ 0's and $bw/(2k)$ 1's in each column;
(2) for each $k$-bit $x$ it has exactly $\binom{w}{u}$ rows which agree with $x$ in $u$ positions;
(3) the parameters satisfy $w^2 \ge 2k(1 - 1/b)$;
(4) for any row, the number of rows with stars in the same positions is even.

Pf: Let $C$ be the ABD(k,w).

Page 16

(1) It has exactly $bw/(2k)$ 0's and $bw/(2k)$ 1's in each column.

Pf: Each row in $C$ represents $2^{k-w}$ $k$-bit 0-1 vectors.

Consider the $j$-th column. There are $2^{k-1}$ $k$-bit 0-1 vectors with the $j$-th bit equal to 0. How many rows of $C$ are needed to cover these $2^{k-1}$ vectors?

Each row in $C$ with a 0 in the $j$-th position covers $2^{k-w}$ of them, and each row in $C$ with a * in the $j$-th position covers $2^{k-w-1}$ of them.

Thus (# of 0's in the $j$-th column) $\cdot\, 2^{k-w}$ + (# of *'s in the $j$-th column) $\cdot\, 2^{k-w-1} = 2^{k-1}$.

Hence (# of 0's in the $j$-th column) $= 2^{w-1} - b(k-w)/(2k) = b/2 - b(k-w)/(2k) = bw/(2k)$. Similarly for the 1's.

Page 17

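(2) For each $k$-bit $x$, it has exactly $\binom{w}{u}$ rows which agree with $x$ in $u$ positions.

Pf: Let $x$ be any $k$-bit 0-1 vector. Denote by $n_i$ the number of rows of $C$ which agree with $x$ in $i$ positions.

There are $\binom{k}{l}$ $k$-bit vectors which agree with $x$ in exactly $l$ positions. Each of them is represented by exactly one row, and a row agreeing with $x$ in $i$ positions represents $\binom{k-w}{l-i}$ of them.

Therefore $\binom{k}{l} = \sum_i n_i \binom{k-w}{l-i}$, and

$\sum_l \binom{k}{l} z^l = \sum_l \sum_i n_i \binom{k-w}{l-i} z^l = \sum_i n_i z^i \sum_l \binom{k-w}{l-i} z^{l-i}$.

I.e., $(1+z)^k = (1+z)^{k-w} \sum_i n_i z^i$. This proves $n_i = \binom{w}{i}$.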
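Parts (1) and (2) can be checked numerically on the ABD(4, 3) example, where $bw/(2k) = 3$ and $\binom{3}{u} = 1, 3, 3, 1$. This is only a sanity check on one design, with illustrative names.

```python
from collections import Counter
from itertools import product
from math import comb

rows = ["*000", "0*10", "00*1", "010*", "*111", "1*01", "11*0", "101*"]
k, w = 4, 3
n_rows = len(rows)                      # b = 2^w = 8

# (1) number of 0's and of 1's in each column
zeros = [sum(r[j] == "0" for r in rows) for j in range(k)]
ones = [sum(r[j] == "1" for r in rows) for j in range(k)]


def agreement_profile(x):
    """Map u -> number of rows agreeing with x in exactly u positions.

    x contains no stars, so a position counts only if the row has the
    same 0 or 1 there.
    """
    return Counter(sum(r_j == x_j for r_j, x_j in zip(r, x)) for r in rows)


print(zeros, ones)
print(sorted(agreement_profile("0000").items()))
```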

Page 18

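(3) The parameters satisfy $w^2 \ge 2k(1 - 1/b)$.

Pf:
• The sum of the distances between pairs of rows of $C$ is $k\,(bw/(2k))^2$ by (1). Why? Each column contributes (# of 0's) $\cdot$ (# of 1's) $= (bw/(2k))^2$ to the sum.

• Since any two rows have distance at least 1, this sum is at least $\binom{b}{2}$.

• Thus $k\,(bw/(2k))^2 \ge \binom{b}{2} = b(b-1)/2$, so $(bw)^2 \ge 4k \cdot b(b-1)/2$, i.e. $w^2 \ge 2k(1 - 1/b)$.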
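The counting step can be seen on the ABD(4, 3) example: the sum of all pairwise distances should be $k\,(bw/(2k))^2 = 4 \cdot 3^2 = 36$, which is at least $\binom{8}{2} = 28$. A minimal check with illustrative names:

```python
from itertools import combinations
from math import comb

rows = ["*000", "0*10", "00*1", "010*", "*111", "1*01", "11*0", "101*"]
k, w = 4, 3
n_rows = len(rows)                      # b = 2^w = 8


def distance(a, b):
    """Number of coordinates where one row has 0 and the other has 1."""
    return sum(1 for s, t in zip(a, b) if {s, t} == {"0", "1"})


# Sum over all unordered pairs of rows.
total = sum(distance(r, s) for r, s in combinations(rows, 2))

# The same sum, column by column: (# of 0's) * (# of 1's).
per_column = [sum(r[j] == "0" for r in rows) * sum(r[j] == "1" for r in rows)
              for j in range(k)]

print(total, per_column, comb(n_rows, 2))
```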

Page 19

(4) For any row, the number of rows with stars in the same positions is even.

Pf:
• Consider a row of $C$. Count the $k$-bit vectors which have 0's in the positions where the row has *'s.

• E.g. for (* * * 1 1 0 0), there are $2^4$ vectors (0 0 0 _ _ _ _) with 0's in the *-positions. In general, there are $2^w$.

• Let $R$ be the number of rows with the same *-pattern, including the row itself. These $R$ rows indicate $R$ of the 0-1 vectors with 0's in the *-positions.

• Each row with a different star pattern represents an even number of such vectors, whereas a row with the same *-pattern represents exactly one such vector. Why? A row with a different pattern has a star outside the *-positions, so the vectors it represents come in pairs that differ only in that coordinate.

• Thus $R$ + (even number) $= 2^w$, which implies $R$ is even.

[Illustration: rows with the same *-pattern, **0001, **1010, **0101, **1100, and rows with other *-patterns, 0011**, 0**001, 11**01.]

Page 20

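Thm 9.8. Let $C$ be an ABD(k,w) with $w > 3$.
(1) If two rows of $C$ agree in all but one position, then $\binom{w}{2} \ge k$.
(2) Otherwise $w^2 > 2k$.

Proof of (1): Let $c_1$ and $c_2$ be two rows of $C$ differing only in position 1. Then all the other rows of $C$ must differ from $c_1$ in some other position.

Thus, by Thm 9.7 (1), $b - 2 \le (w-1)\,(bw/(2k))$.
So $kb - 2k \le (w-1)\,bw/2$,
and $k \le \binom{w}{2}\, \dfrac{b}{b-2} = \binom{w}{2}\left(1 + \dfrac{2}{b-2}\right)$.
Since $k$ is an integer, $k \le \binom{w}{2}$, because $2\binom{w}{2}/(b-2) < 1$.

[Figure: a matrix with the rows 0 * * 0 1 1 and 1 * 0 0 1 * on top, followed by rows of the forms 0 _ _ _ _ _, 1 _ _ _ _ _ and * _ _ _ _ _.]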

Page 21

Thm 9.8 (continued). Let $C$ be an ABD(k,w) with $w > 3$.
(1) If two rows of $C$ agree in all but one position, then $\binom{w}{2} \ge k$, i.e. $w(w-1) \ge 2k$.
(2) Otherwise $w^2 > 2k$.

Proof of (2): Let $c_1$ and $c_2$ be two rows of $C$ with the same star-pattern; such a pair exists by Thm 9.7 (4), and in this case the two rows differ in more than one position.

Count the sum of the distances from $c_1$ to the other rows, which is $w\,(bw/(2k))$ by Thm 9.7 (1), and at least $(b-2) + 2 = b$.

Thus $w\,(bw/(2k)) \ge b$, i.e. $w^2 \ge 2k$. We need to show that equality does NOT hold.

If equality holds, then $c_1$ and $c_2$ have distance 2 and all the other rows have distance 1 to $c_1$ and to $c_2$. And this holds for any pair of rows with the same star-pattern.

[Figure: a matrix with the rows 0 * * 1 1 * and 1 * * 0 1 * on top, followed by rows of the forms 0 _ _ _ _ _, 1 _ _ _ _ _ and * _ _ _ _ _.]

Page 22

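Assume c1 = (* * * ... 0 0 ... 0 0 0) and c2 = (* * * ... 0 0 ... 0 1 1) to be such a pair.

The other $bw/(2k) - 1$ rows ending in a '1' would end with '01'. Why? Such a row has distance 1 to $c_1$, so it conflicts with $c_1$ only in the last position; it also has distance 1 to $c_2$, so its second-to-last entry must be 0.

Similarly, there are $bw/(2k) - 1$ rows ending in '10'.

A row ending in '01' and a row ending in '10' have distance at least 2. But with $w\,(bw/(2k)) = b$, each row has distance 2 to exactly one other row, so $bw/(2k) - 1 \le 1$, which implies $bw \le 4k$, i.e. $bw \le 2w^2$ since $w^2 = 2k$. Then $2^w = b \le 2w$, which is impossible for $w > 3$. Therefore $w^2 > 2k$.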

Page 23

Cor. An ABD(8,4) does not exist.
Pf: By Thm 9.8.

Open problem: The smallest open case is the question whether an ABD(12, 6) exists.

Thm 9.9. If an ABD($k_i$, $w_i$) exists for $i = 1, 2$, then an ABD($k_1 k_2$, $w_1 w_2$) exists.

Pf: Assume $w_2 > 0$. Partition the rows of ABD($k_2$, $w_2$) into two classes $R_0$ and $R_1$ of equal size. In ABD($k_1$, $w_1$) replace each star by a row of $k_2$ stars, each 0 by a row from $R_0$ and each 1 by a row from $R_1$, in all possible ways. Then check it is an ABD as claimed. (Exercise!)

Example: ABD(4, 3)

* 0 0 0
0 * 1 0
0 0 * 1
0 1 0 *
* 1 1 1
1 * 0 1
1 1 * 0
1 0 1 *

Cor: An ABD($4^t$, $3^t$) exists.
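The substitution in the proof of Thm 9.9 can be tried out on two copies of the ABD(4, 3) example, which should give an ABD(16, 9) with $2^9 = 512$ rows. The sketch below is an illustration under one arbitrary choice: $R_0$ and $R_1$ are taken to be the first and second half of the row list (the proof allows any split into equal halves). The function names are hypothetical.

```python
from itertools import combinations, product

ABD_4_3 = ["*000", "0*10", "00*1", "010*", "*111", "1*01", "11*0", "101*"]


def product_abd(outer, inner):
    """Replace * by a block of stars, 0 by a row of R0, 1 by a row of R1,
    in all possible ways."""
    half = len(inner) // 2
    classes = {"0": inner[:half], "1": inner[half:]}   # R0 and R1 (a choice)
    stars = "*" * len(inner[0])
    rows = []
    for row in outer:
        choices = [[stars] if s == "*" else classes[s] for s in row]
        rows.extend("".join(blocks) for blocks in product(*choices))
    return rows


def conflict(a, b):
    """True if the rows have distance at least 1."""
    return any(s != t and "*" not in (s, t) for s, t in zip(a, b))


big = product_abd(ABD_4_3, ABD_4_3)
k, w = 16, 9
stars_per_row = {r.count("*") for r in big}
stars_per_column = {sum(r[j] == "*" for r in big) for j in range(k)}
all_distinct = all(conflict(r, s) for r, s in combinations(big, 2))
print(len(big), stars_per_row, stars_per_column, all_distinct)
```

The three printed checks correspond to properties (i), (ii) and (iii) of the definition with $k = 16$, $w = 9$: $k - w = 7$ stars per row and $b(k-w)/k = 224$ stars per column.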

Page 24

Thm 9.10. Let $w > 0$. Suppose an ABD(k, w) exists, where $k = s \cdot 2^h$ with $s$ odd. Then an ABD(k, w+is) exists for $i = 0, \dots, (k-w)/s$.

Pf: It suffices to consider $i = 1$.

Let $C$ be the ABD(k, w) and define a matrix $A$ of the same size by requiring $a_{ij} = 1$ if $C_{ij} = *$ and $a_{ij} = 0$ otherwise.

Then $A$ has $k-w$ 1's in each row and $b(k-w)/k$ 1's in each column.

By Thm 7.3, there is a matrix $A'$ obtained from $A$ with $k-w-s$ 1's in each row and $b(k-w-s)/k$ 1's in each column. Note that $bs/k = 2^{w-h}$ is integral. Thus the matrix $A'' = A - A'$ has $s$ 1's in each row and $bs/k = 2^{w-h}$ 1's in each column.

In a row of $C$, replace * by – if the * occurs in a position where $A''$ has a 1. Each $k$-tuple over 0, 1, *, – then represents all the $k$-tuples over 0, 1, * obtained by replacing each – by 0 or 1. This produces an ABD(k, w+s).

Page 25

Thm 9.11. If an ABD(k, w) exists and R≥1 is a number such that Rk and Rw are integers, then an ABD(Rk, Rw) exists.

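Pf: It suffices to show that an ABD(k+l, w+m) exists for $(k+l)/(w+m) = k/w$ and $\gcd(l,m) = 1$.

Let $k = s \cdot 2^h$ with $s$ odd. By (ii) of the ABD's definition, $s \mid w$.

From $(k+l)/(w+m) = k/w$, we have $w(k+l) = (w+m)k$, i.e. $wl = mk$.

Since $\gcd(l,m) = 1$, $l$ is a power of 2.

Consider the $l \times l$ circulant matrix with a row of $l-m$ stars and $m$ minus signs as the first row.

Since $l \mid b$, adjoin a column of $b/l$ copies of this circulant to the matrix $C$ of the ABD(k,w). With each – again standing for 0 or 1, as in Thm 9.10, the larger matrix is an ABD(k+l, w+m).

[Figure: the matrix C with $2^w$ rows and $k$ columns, extended on the right by $l$ further columns made of $b/l$ stacked copies of the $l \times l$ circulant.]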