
13

Entropy Estimation for Optical PUFs Based on Context-Tree Weighting Methods©

Tanya Ignatenko, Frans Willems, Geert-Jan Schrijen, Boris Škorić, and Pim Tuyls

© Figures 13.1, 13.2, 13.3, 13.5, and 13.7 appeared in T. Ignatenko, G. J. Schrijen, B. Škorić, P. Tuyls, and F. M. J. Willems, Estimating the secrecy rate of physical uncloneable functions with the context-tree weighting method. In Proceedings of the IEEE International Symposium on Information Theory 2006, pages 499-503, Seattle, WA, July 2006 (on CD-ROM).

In this chapter we discuss estimation of the secrecy rate of fuzzy sources, more specifically of optical physical unclonable functions (PUFs), using context-tree weighting (CTW) methods [291]. We show that the entropy of a stationary 2-D source is the limit of a series of conditional entropies [6] and extend this result to the conditional entropy of one 2-D source given another one. Furthermore, we show that the general CTW method approaches the source entropy also in the 2-D stationary case. Moreover, we generalize Maurer's result [196] to the ergodic case, thus showing that we get realistic estimates of the achievable secrecy rate. Finally, we use these results to estimate the secrecy rate of speckle patterns from optical PUFs.

13.1 Generating a Shared Secret Key

Consider a source that generates sequences of random variables from a finite alphabet. A shared secret key can be produced by two terminals if these terminals observe dependent sequences and at least one of the terminals is allowed to transmit a message to the other one. Although the transmitted message is public, it need not reveal information about the secret key that is generated. This concept was described by Maurer [196] when he realized that the secrecy capacity of a broadcast channel could be significantly enhanced if a public feedback link from the (legitimate) receiver to the transmitter was present. A little later, Ahlswede and Csiszár [4] investigated similar problems and called the situation in which the terminals observe dependent sequences the source-type model; see Fig. 13.1.


There, an encoder forms a secret $S$ after observing a sequence $X^N = (X_1, X_2, \ldots, X_N)$ of symbols from the finite alphabet $\mathcal{X}$. The encoder sends a public helper message $M \in \mathcal{M} = \{1, 2, \ldots, |\mathcal{M}|\}$ to a decoder. The decoder observes the sequence $Y^N = (Y_1, Y_2, \ldots, Y_N)$ of symbols from the finite alphabet $\mathcal{Y}$ and produces an estimate $\hat{S}$ of the secret $S$ using the helper message $M$. It was assumed in [196] and [4] that the sequence pair $(X^N, Y^N)$ is independent and identically distributed (i.i.d.); that is, $P[X^N = x^N, Y^N = y^N] = \prod_{n=1}^{N} Q(x_n, y_n)$ for some distribution $\{Q(x,y), x \in \mathcal{X}, y \in \mathcal{Y}\}$.

Fig. 13.1. Generating a shared secret key. © 2006 IEEE

In the described model, the terminals want to produce as much key information as possible. The probability that the estimated secret $\hat{S}$ is not equal to the secret $S$ should be close to zero, and the information that the helper message reveals about the secret should also be negligible. Finally, we are interested in the number of helper-message bits that are needed. More formally, a secrecy rate $R_s$ is achievable if for all $\delta > 0$ and all large enough $N$ there exist encoders and decoders such that

$$H(S) \geq N(R_s - \delta), \qquad P[\hat{S} \neq S] \leq \delta, \qquad I(S; M) \leq N\delta. \qquad (13.1)$$

Theorem 13.1. The largest possible achievable secrecy rate $R_s$ is equal to $I(X;Y)$. Moreover, for all $\delta > 0$ and all large enough $N$, a helper rate

$$\frac{1}{N} \log_2 |\mathcal{M}| \leq H(X|Y) + \delta \qquad (13.2)$$

suffices to achieve $R_s = I(X;Y)$. Both $I(X;Y)$ and $H(X|Y)$ are based on the joint distribution $\{Q(x,y), x \in \mathcal{X}, y \in \mathcal{Y}\}$.

For a detailed proof of Theorem 13.1, see [196] and [4]. Here, we only provide a sketch. The achievability proof relies on random binning of the space $\mathcal{X}^N$ (i.e., partitioning the set of typical $x^N$-sequences into codes for the channel from $X$ to $Y$). There are roughly $2^{NH(X|Y)}$ such codes. The index of the code containing $x^N$ is sent to the decoder. Each of these codes contains approximately $2^{NI(X;Y)}$ codewords. The decoder now uses $y^N$ to recover $x^N$. If we define the secret to be the index of $x^N$ within its code, the code index reveals practically no information about this index.
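As a toy illustration of these two quantities (our own example with an assumed crossover probability, not from the original text), the following snippet evaluates the achievable secrecy rate $I(X;Y)$ and the sufficient helper rate $H(X|Y)$ for a binary joint distribution $Q$ in which $Y$ is a noisy observation of a uniform bit $X$:

import numpy as np

# Hypothetical joint distribution Q(x, y): X is a uniform bit and Y is X
# observed through a binary symmetric channel with crossover probability 0.1.
p = 0.1
Q = np.array([[0.5 * (1 - p), 0.5 * p],
              [0.5 * p, 0.5 * (1 - p)]])   # Q[x, y]

def entropy(probs):
    """Entropy in bits of a probability vector (zero entries are skipped)."""
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log2(probs)))

H_X = entropy(Q.sum(axis=1))
H_Y = entropy(Q.sum(axis=0))
H_XY = entropy(Q.flatten())
print("secrecy rate I(X;Y) =", round(H_X + H_Y - H_XY, 4))   # ~0.5310 bit
print("helper rate  H(X|Y) =", round(H_XY - H_Y, 4))         # ~0.4690 bit

At crossover probability 0.1, roughly half a bit of secret key can be distilled per observed symbol pair, at the cost of about half a bit of public helper data.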

13.2 Physical Unclonable Functions

Measured responses from PUFs, which were already discussed in Chapters 1 and 12, are a good example of dependent random sequences. A typical PUF-based authentication and key-agreement protocol involves an enrollment measurement of a challenge-response pair (CRP) and a verification measurement of the same CRP (see Chapter 16). Since these measurements are separated in time, and often performed using different hardware, there is inevitably some measurement noise, caused, for example, by differences in temperature, moisture, and calibration. We identify the enrollment response with the sequence $X^N$ in Fig. 13.1 and the verification response with the sequence $Y^N$.

The Maurer scheme guarantees that the helper data reveal only a negligible amount of information about the extracted key. There is, on the other hand, no guarantee that the information revealed about the PUF response $X^N$ is also small. This could pose a problem: an attacker might mimic or reproduce the PUF based on the information leaked by the helper data. However, the unclonability properties of the PUF prevent this attack. Therefore, PUFs are very suitable as a source of common randomness for the Maurer scheme.

In this chapter we concentrate on optical PUFs. This type of PUF consists of a transparent material with randomly distributed scatterers. Different challenges are obtained by shining a laser beam under different angles onto the PUF. These challenges lead to speckle patterns (responses) that are recorded by a camera. We have analyzed data that were acquired with the experimental setup described in Section 15.3.1 and the glass samples of Section 15.3.2. For this setup, the measurement noise is mainly caused by inaccuracies in repositioning the samples. We have investigated five PUFs (labeled "A" through "E"), and for each of these five PUFs we have considered two challenges (laser angles, labeled "0" and "1"). We mapped each speckle pattern to a 2-D binary image by Gabor filtering and thresholding, as proposed by Pappu [221]; see Section 16.5. Each of the 10 challenges resulted in two binary images: an enrollment image X and a verification image Y. Our aim is to find out how much secret-key information can be extracted from the image pairs (X, Y) and how this matches the results obtained with the algorithm described in Section 16.4.
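The precise pre-processing pipeline is specified in Section 16.5. Purely as an illustration of the Gabor-filter-and-threshold idea, the following sketch (with assumed filter parameters such as the wavelength, sigma, and kernel size, which are not the chapter's actual values) turns a grayscale speckle image into a binary image:

import numpy as np
from scipy.signal import convolve2d

def gabor_binarize(image, wavelength=8.0, theta=np.pi / 4, sigma=4.0, size=15):
    """Filter with the odd (imaginary) part of a Gabor kernel oriented at
    angle theta and threshold the response at zero, giving a binary image."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    kernel = np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) \
             * np.sin(2 * np.pi * xr / wavelength)
    response = convolve2d(image, kernel, mode='valid')
    return (response > 0).astype(np.uint8)

# speckle: 2-D numpy array holding the recorded speckle pattern; subsampling
# the binarized result (e.g., binary = gabor_binarize(speckle)[::8, ::8])
# would then yield the 64 x 64 images analyzed below.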

13.3 Entropy of a Two-Dimensional Stationary Process

In order to find out how large the mutual information is between an enrollment image and a verification image for optical PUFs, we consider 2-D processes.


Let $\{X_{v,h} : (v,h) \in \mathbb{Z}^2\}$ (also called a random field) be such a process and assume that it is stationary (homogeneous); that is,

$$P[X_T = x_T] = P[X_{T+(s_v,s_h)} = x_T], \qquad (13.3)$$

for any template $T$, any shift $(s_v, s_h)$, and any observation $x_T$. A template is a set of coordinate pairs (i.e., $T \subset \mathbb{Z}^2$), and $T + (s_v, s_h)$ denotes the set of coordinate pairs that results when the integer shift pair $(s_v, s_h)$ is added to each coordinate pair from $T$. We assume that all symbols take values from the finite alphabet $\mathcal{X}$. If we first define, for positive integers $L$,

$$H^{(L)}(X) \triangleq \frac{1}{L^2}\, H\big(X_{1,1}^{L,L}\big), \qquad (13.4)$$

where $X_{a,b}^{c,d}$ denotes the array of symbols $\{X_{v,h} : a \leq v \leq c,\ b \leq h \leq d\}$, then the entropy of a 2-D stationary process can be defined¹ as

$$H^{(\infty)}(X) \triangleq \lim_{L \to \infty} H^{(L)}(X). \qquad (13.5)$$

It follows from the stationarity of the stochastic process $X$, the chain rule for entropies, and the fact that conditioning can only decrease entropy that

$$N\, H\big(X_{1,1}^{M,N+1}\big) - (N+1)\, H\big(X_{1,1}^{M,N}\big) = N\, H\big(X_{1,N+1}^{M,N+1} \,\big|\, X_{1,1}^{M,N}\big) - H\big(X_{1,1}^{M,N}\big) \leq 0. \qquad (13.6)$$

The inequality holds because $H\big(X_{1,1}^{M,N}\big) = \sum_{k=1}^{N} H\big(X_{1,k}^{M,k} \,\big|\, X_{1,1}^{M,k-1}\big)$ and, by stationarity and the fact that conditioning can only decrease entropy, each of these $N$ terms is at least $H\big(X_{1,N+1}^{M,N+1} \,\big|\, X_{1,1}^{M,N}\big)$.

Lemma 13.1. The limit defined in relation (13.5) exists.

Proof. Using inequality (13.6) for $(M,N) = (L,L)$ and, subsequently, a transposed version of this inequality for $(M,N) = (L, L+1)$, it follows that $H^{(L+1)}(X) - H^{(L)}(X) \leq 0$. Hence, the sequence $H^{(L)}(X)$ is a non-increasing non-negative sequence in $L$. This concludes the proof.
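Spelled out (our own interpolation of this step, using the array shorthand introduced with relation (13.4)), the two applications of (13.6) chain as follows:

$$\frac{H\big(X_{1,1}^{L+1,L+1}\big)}{(L+1)^2} \;\leq\; \frac{H\big(X_{1,1}^{L,L+1}\big)}{L(L+1)} \;\leq\; \frac{H\big(X_{1,1}^{L,L}\big)}{L^2}, \qquad \text{that is,} \qquad H^{(L+1)}(X) \leq H^{(L)}(X).$$

The right inequality is (13.6) with $(M,N) = (L,L)$, divided by $L$; the left inequality is the transposed version with $(M,N) = (L, L+1)$, divided by $L+1$.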

The definition of entropy in relation (13.5) focuses on block entropies. We will show next that the entropy of a stationary 2-D process can also be expressed as a limit of conditional entropies. To this end, we define the conditional entropy

$$G^{(L)}(X) \triangleq H\big(X_{L,L} \,\big|\, X_{1,1}^{L-1,2L-1},\ X_{L,1}^{L,L-1}\big); \qquad (13.7)$$

that is, $X_{L,L}$ is conditioned on the $L-1$ complete rows of width $2L-1$ above it and on the $L-1$ symbols to its left in its own row. A visualization of this definition is presented in Fig. 13.2.

¹ In information theory, the entropy of a stationary process is usually denoted by $H(X)$. However, in cryptography this notation is used for min-entropy. Therefore, to avoid confusion, we use the notation $H^{(\infty)}(X)$.

Fig. 13.2. The symbol $X_{L,L}$ and the symbols on which it is conditioned in relation (13.7). © 2006 IEEE

Lemma 13.2. The limit

$$G^{(\infty)}(X) \triangleq \lim_{L \to \infty} G^{(L)}(X) \qquad (13.8)$$

exists.

Proof. From stationarity and the fact that conditioning never increases entropy, it follows that the sequence $G^{(L)}(X)$ is non-increasing in $L$. Since $G^{(L)}(X) \geq 0$, the proof follows.

In order to demonstrate that the limits (13.5) and (13.8) are equal, we first observe (using the chain rule, stationarity, and the fact that conditioning never increases entropy) that

$$H^{(L)}(X) = \frac{1}{L^2} \sum_{v=1}^{L} \sum_{h=1}^{L} H\big(X_{v,h} \,\big|\, X_{1,1}^{v-1,L},\ X_{v,1}^{v,h-1}\big) \geq G^{(L)}(X), \qquad (13.9)$$

since each conditioning set in this sum is, after a suitable shift, contained in the conditioning set of relation (13.7). On the other hand, it follows (using similar arguments) that

$$H^{(j+2L-2)}(X) \leq \frac{H(\sqcap) + j(j+L-1)\, G^{(L)}(X)}{(j+2L-2)^2}, \qquad (13.10)$$

where $H(\sqcap)$ corresponds to all the symbols in the horseshoe region; see Fig. 13.3. Since the number of horseshoe symbols grows only linearly in $j$, whereas the total number of symbols grows quadratically, these observations yield

$$H^{(\infty)}(X) = \lim_{j \to \infty} H^{(j+2L-2)}(X) \leq G^{(L)}(X). \qquad (13.11)$$


Theorem 13.2. The limits $H^{(\infty)}(X)$ and $G^{(\infty)}(X)$ are equal; that is,

$$G^{(\infty)}(X) = H^{(\infty)}(X). \qquad (13.12)$$

Proof. Follows directly from (13.9) and (13.11).


Fig. 13.3. Horseshoe region in a square of size $(j+2L-2)^2$. © 2006 IEEE

Our arguments are a generalization of the arguments for (1-D) stationary sources that can be found in Gallager [119]. Moreover, they are only slightly different from those given by Anastassiou and Sakrison [6], who first showed that in the 2-D case the block-entropy limit equals the conditional-entropy limit.

We conclude that the entropy of a 2-D stationary process can be computed by considering the conditional entropy of a single symbol given more and more neighboring symbols.
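As a concrete (if naive) illustration of this conclusion, and not part of the original chapter, the following plug-in estimator computes the empirical version of $G^{(L)}(X)$ for a binary image by counting template configurations. Because the template of relation (13.7) already contains $2L^2 - 2L$ symbols, the number of possible contexts explodes even for moderate $L$; this is precisely why the CTW methods of the following sections are preferable.

import numpy as np
from collections import Counter

def plugin_G_estimate(img, L):
    """Naive plug-in estimate of G^(L)(X) in bit/symbol for a binary image:
    the empirical entropy of a pixel given the L-1 rows above it (width
    2L-1) and the L-1 pixels to its left in its own row."""
    nrows, ncols = img.shape
    joint, ctx = Counter(), Counter()
    for v in range(L - 1, nrows):
        for h in range(L - 1, ncols - L + 1):
            above = img[v - L + 1:v, h - L + 1:h + L]   # (L-1) x (2L-1) block
            left = img[v, h - L + 1:h]                  # L-1 symbols
            c = (above.tobytes(), left.tobytes())
            joint[(c, int(img[v, h]))] += 1
            ctx[c] += 1
    n = sum(ctx.values())
    h_joint = -sum(k / n * np.log2(k / n) for k in joint.values())
    h_ctx = -sum(k / n * np.log2(k / n) for k in ctx.values())
    return h_joint - h_ctx   # H(pixel, context) - H(context)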

13.4 Conditional Entropy of a Two-Dimensional Stationary Process Given a Second One

Next, we consider the 2-D joint process $\{XY_{v,h} : (v,h) \in \mathbb{Z}^2\}$. We assume that it is stationary; that is,

$$P[XY_T = xy_T] = P[XY_{T+(s_v,s_h)} = xy_T], \qquad (13.13)$$

for any template $T$, any shift $(s_v, s_h)$, and any observation $xy_T$. Again, we assume that the $X$ symbols and $Y$ symbols take values from the finite alphabets $\mathcal{X}$ and $\mathcal{Y}$, respectively.

We may consider the joint entropy $H^{(\infty)}(XY)$ of the joint process $XY$, for which Theorem 13.2 obviously holds. We can then compute this joint entropy by considering conditional entropies.

It also makes sense to look at the conditional entropy $H^{(\infty)}(X|Y)$ and to find out whether a theorem similar in style to Theorem 13.2 can be proved for this situation. This turns out to be possible if we define, for positive integers $L$,

$$H^{(L)}(X|Y) \triangleq \frac{1}{L^2}\, H\big(X_{1,1}^{L,L} \,\big|\, Y_{1,1}^{L,L}\big) \qquad (13.14)$$

(with the analogous array shorthand for $Y$) and define the conditional entropy of a 2-D joint stationary process $XY$ as

$$H^{(\infty)}(X|Y) \triangleq \lim_{L \to \infty} H^{(L)}(X|Y). \qquad (13.15)$$

We first observe that, since conditioning never increases entropy, the following inequality holds:

$$H\big(X_{1,1}^{L,L} \,\big|\, Y_{1,1}^{L+1,L+1}\big) \leq H\big(X_{1,1}^{L,L} \,\big|\, Y_{1,1}^{L,L}\big). \qquad (13.16)$$

Lemma 13.3. The limit in relation (13.15) exists.

Proof. The proof that the sequence $H^{(L)}(X|Y)$ is non-increasing in $L$ follows from arguments similar to those used to show that $H^{(L)}(X)$ is non-increasing (see the proof of Lemma 13.1), together with inequality (13.16).

In order to demonstrate that the conditional entropy $H^{(\infty)}(X|Y)$ can be expressed as a limit of entropies of a single symbol conditioned on surrounding $X$ symbols and $Y$ symbols, we define

$$G^{(L)}(X|Y) \triangleq H\big(X_{L,L} \,\big|\, X_{1,1}^{L-1,2L-1},\ X_{L,1}^{L,L-1},\ Y_{1,1}^{2L-1,2L-1}\big); \qquad (13.17)$$

that is, $X_{L,L}$ is conditioned on the same "causal" $X$ symbols as in relation (13.7) and, additionally, on all $Y$ symbols $Y_{1,1}, \ldots, Y_{2L-1,2L-1}$ in the surrounding square. For a visualization, we refer to Fig. 13.4.

Lemma 13.4. The limit

$$G^{(\infty)}(X|Y) \triangleq \lim_{L \to \infty} G^{(L)}(X|Y) \qquad (13.18)$$

exists.

Proof. It is easy to see that $G^{(L+1)}(X|Y) \leq G^{(L)}(X|Y)$, using arguments as in the proof of Lemma 13.2, from which the proof follows.

Fig. 13.4. The symbol $X_{L,L}$ and its conditioning symbols. Note that the $X$ symbols are drawn on top of a square with the $Y$ symbols.



Fig. 13.5. Edge region in a square of size $(j+2L-2)^2$. © 2006 IEEE

In order to demonstrate that the limits (13.15) and (13.18) are equal, we observe that (according to the same arguments as used for relations (13.9) and (13.10))

$$H^{(L)}(X|Y) \geq G^{(L)}(X|Y), \qquad (13.19)$$

$$H^{(j+2L-2)}(X|Y) \leq \frac{H(E) + j^2\, G^{(L)}(X|Y)}{(j+2L-2)^2}, \qquad (13.20)$$

where $H(E)$ corresponds to the $X$ symbols in the edge region; see Fig. 13.5. Hence, we obtain

$$H^{(\infty)}(X|Y) = \lim_{j \to \infty} H^{(j+2L-2)}(X|Y) \leq G^{(L)}(X|Y). \qquad (13.21)$$

Theorem 13.3. The limits $H^{(\infty)}(X|Y)$ and $G^{(\infty)}(X|Y)$ are equal; that is,

$$G^{(\infty)}(X|Y) = H^{(\infty)}(X|Y). \qquad (13.22)$$

Proof. The proof follows from relations (13.19) and (13.21).

We conclude that, in the stationary case, the conditional entropy of one 2-D process $X$ given a second 2-D process $Y$ can also be computed by considering the conditional entropy of a single $X$ symbol given more and more "causal" neighboring $X$ symbols and more and more "non-causal"² neighboring $Y$ symbols.

² "Causal" symbols are past symbols, whereas "non-causal" symbols might be both future and past symbols with respect to a certain symbol.

13.5 Mutual Information Estimation: Convergence

We estimate the mutual information $I^{(\infty)}(X;Y)$ either by estimating $H^{(\infty)}(X)$, $H^{(\infty)}(Y)$, and $H^{(\infty)}(XY)$ or by estimating $H^{(\infty)}(X)$ and $H^{(\infty)}(X|Y)$ (or, equivalently, $H^{(\infty)}(Y)$ and $H^{(\infty)}(Y|X)$) using CTW methods.


CTW is a universal data compression method that achieves optimal redundancy behavior for tree sources. It weights the coding distributions corresponding to all bounded-memory tree-source models and so realizes an efficient coding distribution for unknown models with unknown parameters. This weighted distribution can be used for sequential data compression. In a sequential scheme, the codeword is constructed by processing the source symbols one after the other.

The basic CTW method was proposed in [293]. In [294], it was shown how to deal with more general context structures (which is necessary to determine, e.g., $H^{(\infty)}(X|Y)$). In [291], it was shown that the CTW method approaches the entropy in the 1-D ergodic case. The following theorem applies to the 2-D case.

Theorem 13.4. For joint processes $XY$, the general CTW method achieves the entropy $H^{(\infty)}(XY)$, as well as $H^{(\infty)}(X)$ and $H^{(\infty)}(Y)$, and the conditional entropies $H^{(\infty)}(X|Y)$ and $H^{(\infty)}(Y|X)$, in the 2-D ergodic case.

Proof. From Theorems 13.2 and 13.3 we conclude that we can focus on conditional entropies of a single symbol (or pair of symbols). These are the entropies that the CTW method achieves when the observed image gets larger and larger and more and more context symbols become relevant. It is important to use the right ordering of the context symbols: the symbols for $L = 2$ should be included first, then those for $L = 3$, and so on. The rest of the proof is similar to that in [291].

13.6 The Maurer Scheme in the Ergodic Case

Section 13.1 contains the theorem on the amount of secret-key material that can be generated from a pair of correlated sequences in the i.i.d. setting. The coding strategy outlined there is actually Slepian-Wolf coding [256], as was observed by Ahlswede and Csiszár [4]. Cover [67] proved that the Slepian-Wolf result holds not only for i.i.d. sequences but carries over to the ergodic case. Therefore, we can generalize Theorem 13.1 to the ergodic case; see Theorem 13.5. Using the ideas of Cover, one can prove achievability. The converse given by Maurer [196] also applies to the ergodic case.

Theorem 13.5. Theorem 13.1 also holds for the ergodic case if we replace $I(X;Y)$ by $I^{(\infty)}(X;Y) = H^{(\infty)}(X) + H^{(\infty)}(Y) - H^{(\infty)}(XY)$ and $H(X|Y)$ by $H^{(\infty)}(X|Y) = H^{(\infty)}(XY) - H^{(\infty)}(Y)$.

13.7 Context-Tree Weighting Methods

Consider a source that has produced a sequence $\ldots, x_{t-2}, x_{t-1}$ so far. At time $t$, this source generates a new symbol $x_t$. The context for this symbol $x_t$ in the basic context-tree method [293] consists of the previous $D$ symbols $x_{t-D}, \ldots, x_{t-2}, x_{t-1}$. For our purposes, however, we need more flexibility in choosing the context symbols.


This flexibility is provided in [294], where four weighting methods are described. Here, we consider the two simplest classes: class IV and class III. To be more specific, we denote the context symbols for symbol $x_t$ by $z_{t1}, z_{t2}, \ldots, z_{tD}$. Observe that each of these symbols can be any symbol available at both the encoder and the decoder while encoding/decoding $x_t$; for example, the basic context-tree method corresponds to the assignment $z_{td} = x_{t-d}$. On the other hand, if there is a "side-information" sequence available, then we could take $z_{td} = y_{t+d-1}$, but combinations of both past $x$ symbols and past and/or future $y$ symbols are also possible.

In a class IV method, it is assumed that the actual probability of the next symbol $x_t$ being 1 is based on the first $d$ context symbols $z_{t1}, z_{t2}, \ldots, z_{td}$, where $d$ depends on the context $z_{t1}, z_{t2}, \ldots, z_{tD}$ that occurred; for example, if the source model corresponds to the tree in Fig. 13.6(a) and the context $z_{t1}, z_{t2}, \ldots, z_{tD}$ at time $t$ is $011\epsilon\cdots\epsilon$, the probability $\theta_{011}$ of the next symbol $x_t$ being 1 can be found in the leaf 011. We have denoted a "don't care" context symbol by $\epsilon$ here. The subscript 011 refers to the values 0, 1, and 1 of the context symbols $z_{t1}$, $z_{t2}$, and $z_{t3}$, respectively.

Class III models can also be described using a tree. However, the ordering of the context symbols is not fixed as in class IV. For a source model corresponding to the tree in Fig. 13.6(b), when the context $z_{t1}, z_{t2}, \ldots, z_{tD}$ at time $t$ is $\epsilon 001\epsilon\cdots\epsilon$, the probability $\theta^{243}_{010}$ of the next symbol $x_t$ being 1 can be found in leaf 010. Note that the superscript 243 denotes the context ordering; that is, first $z_{t2}$ is used, then $z_{t4}$, and, finally, $z_{t3}$. The subscript 010 now refers to the values 0, 1, and 0 of these context symbols $z_{t2}$, $z_{t4}$, and $z_{t3}$, respectively.

Context-weighting methods are based on splitting up the observations corresponding to the nodes in a data structure. In class IV methods, this splitting is done by first splitting according to the first context symbol, then according to the second context symbol, and so on; see Fig. 13.6(a). In class III, each splitting operation can be performed according to any of the context symbols that have not been used in previous splittings; see Fig. 13.6(b).

For both model classes, a CTW encoder (implicitly) specifies the context structure and the corresponding parameters to the decoder. This results in an increased codeword length, or redundancy. The redundancy caused by the parameter specification is called parameter redundancy; specifying the context structure leads to model redundancy. It will be clear that class III methods are more general than class IV methods. Since they adapt better to the source, the performance of class III methods should therefore be better. Indeed, the parameter redundancy is smaller for class III than for class IV, but since class III is richer than class IV, its model redundancy is also larger. Which of the two effects dominates depends on the length of the source sequence. For small lengths, class IV methods will outperform class III methods; for large lengths, the effect of model redundancy becomes negligible and a class III method gives a smaller codeword length.
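To make the weighting idea concrete, here is a minimal sketch of a binary CTW estimator with Krichevsky-Trofimov (KT) estimators in the nodes. It implements only the basic fixed-ordering weighting (splitting on the first context symbol first, class IV-style); the class III machinery of [294], which additionally weights over context orderings, is omitted. This is our own illustrative code, not the implementation used in this chapter.

import math

class Node:
    """Node of a binary context tree with Krichevsky-Trofimov (KT) counts."""
    def __init__(self):
        self.counts = [0, 0]   # numbers of zeros and ones seen at this node
        self.lkt = 0.0         # log2 of the KT estimator probability
        self.lw = 0.0          # log2 of the weighted probability
        self.children = {}     # context symbol (0/1) -> child Node

def _log2_mix(la, lb):
    """log2(0.5 * 2**la + 0.5 * 2**lb), computed without underflow."""
    m = max(la, lb)
    return m - 1.0 + math.log2(1.0 + 2.0 ** (-abs(la - lb)))

def _update(node, context, symbol):
    """Update the tree along the context path with one observed symbol."""
    # KT sequential update: P(next = s) = (count_s + 1/2) / (total + 1).
    node.lkt += math.log2((node.counts[symbol] + 0.5)
                          / (node.counts[0] + node.counts[1] + 1))
    node.counts[symbol] += 1
    if not context:                      # leaf at maximum depth
        node.lw = node.lkt
        return
    child = node.children.setdefault(context[0], Node())
    _update(child, context[1:], symbol)
    # Weight the KT estimate against the product of the children's
    # weighted probabilities (absent children contribute log2(1) = 0).
    lw_children = sum(c.lw for c in node.children.values())
    node.lw = _log2_mix(node.lkt, lw_children)

def ctw_entropy_estimate(bits, contexts):
    """Ideal CTW codeword length divided by the number of symbols, in
    bit/symbol; contexts[t] lists the context symbols for bits[t], the
    first entry being the first symbol used for splitting."""
    root = Node()
    for symbol, context in zip(bits, contexts):
        _update(root, list(context), symbol)
    return -root.lw / len(bits)

For the template of Fig. 13.8 below, contexts[t] would list the four previously processed neighbors of pixel t, and the returned value is the entropy estimate in bit/pixel.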

13.8 Analysis of Speckle Patterns

We use the methods described in the previous sections to estimate the mutual information between noisy speckle measurements. From [108] it is known that the two-point intensity correlations in a speckle pattern are translation invariant. Therefore, we may conclude that a speckle pattern can be modeled as a stationary process. Moreover, the process is also ergodic, due to the statistical properties of speckle patterns; namely, the spatial distribution of intensities is the same as the PUF-ensemble distribution of intensities [136]. Therefore, the methods given in the previous sections are applicable.

The secrets are extracted from pre-processed speckle patterns. Pre-processing includes Gabor filtering at 45°, thresholding, and subsampling (see Section 16.5). As X- and Y-sequences we use 64 x 64 binary images. An example of a pair X, Y is depicted in Fig. 13.7. We observe that the enrollment and verification images differ slightly due to the measurement noise. Moreover, we see that application of a 45° Gabor filter results in diagonal stripes. These stripes are caused by the high Gabor-component correlations perpendicular to the direction of the filter [287]. Since the correlation decreases with distance, it is natural to consider the positions for context candidates shown in Fig. 13.8. This template turns out to have a good balance between performance and complexity. We have also considered a larger template; however, using this larger template did not lead to smaller entropy estimates. We can calculate the mutual information with two alternative formulas: either by estimating it as $I^{(\infty)}(X;Y) = H^{(\infty)}(X) + H^{(\infty)}(Y) - H^{(\infty)}(XY)$ or as $I^{(\infty)}(X;Y) = H^{(\infty)}(X) - H^{(\infty)}(X|Y)$. Note that for each of the entropies involved in these formulas, we have to compress an image (or a pair of images) using a CTW method. In what follows we describe in more detail the analysis that we have conducted.
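Schematically (our own illustration, with hypothetical variable names), the two estimates are obtained from CTW codeword lengths as follows:

def mi_estimates(lam_x, lam_y, lam_xy, lam_x_given_y, n):
    """Mutual information estimates in bit/pixel from CTW codeword lengths
    lam_* (in bits), with n the number of compressed pixel positions."""
    symmetric = (lam_x + lam_y - lam_xy) / n      # H(X) + H(Y) - H(XY)
    conditional = (lam_x - lam_x_given_y) / n     # H(X) - H(X|Y)
    return symmetric, conditional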

Fig. 13.6. Example of class IV (a) and class III (b) models.



13.8.1 Class IV Analysis

1. The basic approach that we have used is based on the template shown in Fig. 13.8. This template contains four context positions. Using the class IV method, we have determined the codeword lengths $\lambda(X)$ and $\lambda(Y)$ and the joint codeword length $\lambda(XY)$. Note that $\lambda(XY)$ results from compressing a quaternary image, since both symbols in an XY symbol pair are binary. Using the symmetric mutual information formula, we computed a mutual information estimate for each of the 10 experiments ("A0," "A1," "B0," etc.). Table 13.1(a) lists these estimates in the column labeled "bas." Table 13.2(a) shows the corresponding entropy estimates H(X), H(Y), and H(XY). The mutual information averaged over the 10 experiments turns out to be 0.2911 bit/pixel. Figure 13.9 shows the codeword lengths for experiment A0.

2. The second approach is based on the assumption that the statistics of binarized Gabor-filtered speckle patterns are symmetric; that is, the probability of a binary symbol $x$ given context symbols $c_1, c_2, c_3, c_4$ is the same as the probability of $1-x$ given $1-c_1, 1-c_2, 1-c_3, 1-c_4$. There are good reasons for this assumption: although the statistics of the original, unfiltered speckle pattern are not symmetric under dark-bright reversal (due to the exponential intensity distribution [136]), the binarization of the Gabor coefficients discards most of the asymmetry-related effects.
The symmetry assumption reduces the number of parameters that need to be handled by the CTW method (a sketch of such parameter tying is shown after this list) and should therefore result in more reliable estimates of the entropy and, consequently, more reliable estimates of the mutual information.

Table 13.1. Mutual information estimates

(a) Class IV
Exp    bas      sym      sym+lar   sym+con
A0     0.2236   0.2224   0.2231    0.2469
A1     0.3348   0.3393   0.3386    0.3586
B0     0.2782   0.2825   0.2824    0.3075
B1     0.2664   0.2722   0.2731    0.2769
C0     0.3271   0.3408   0.3368    0.3567
C1     0.2735   0.2834   0.2794    0.2919
D0     0.3233   0.3310   0.3293    0.3407
D1     0.2951   0.3078   0.3046    0.3183
E0     0.2699   0.2742   0.2748    0.2869
E1     0.3192   0.3203   0.3193    0.3378
Ave.   0.2911   0.2974   0.2961    0.3122
Std.   0.0352   0.0374   0.0365    0.0369

(b) Class III
Exp    bas      sym      sym+lar   sym+con
A0     0.2288   0.2211   0.2246    0.2522
A1     0.3394   0.3414   0.3416    0.3644
B0     0.2851   0.2899   0.2906    0.3127
B1     0.2666   0.2704   0.2759    0.2837
C0     0.3384   0.3458   0.3403    0.3670
C1     0.2763   0.2816   0.2786    0.2964
D0     0.3252   0.3313   0.3292    0.3447
D1     0.2990   0.3068   0.3043    0.3236
E0     0.2770   0.2778   0.2784    0.2935
E1     0.3285   0.3228   0.3240    0.3463
Ave.   0.2964   0.2989   0.2987    0.3184
Std.   0.0363   0.0385   0.0366    0.0376

Fig. 13.7. Images X (left) and Y (right) resulting from experiment A0 with Gabor angle $\varphi = 45°$. © 2006 IEEE

Fig. 13.8. Template showing four context symbols and their ordering. Note that the ordering is only important for class IV. The arrow indicates the direction in which the image is processed.



Table 13.2. Entropy estimates

(a) Class IV
           H(X)                       H(Y)                       H(XY)                      H(X|Y)
Exp    bas      sym      sym+lar  bas      sym      sym+lar  bas      sym      sym+lar  sym
A0     0.5194   0.5125   0.5135   0.5241   0.5181   0.5193   0.8198   0.8081   0.8097   0.2656
A1     0.5213   0.5142   0.5154   0.5189   0.5119   0.5126   0.7054   0.6868   0.6895   0.1557
B0     0.5289   0.5216   0.5229   0.5284   0.5217   0.5230   0.7791   0.7609   0.7635   0.2141
B1     0.5188   0.5122   0.5126   0.5219   0.5161   0.5170   0.7743   0.7561   0.7565   0.2353
C0     0.5238   0.5173   0.5166   0.5116   0.5056   0.5041   0.7083   0.6822   0.6839   0.1606
C1     0.5404   0.5339   0.5327   0.5384   0.5321   0.5318   0.8053   0.7826   0.7851   0.2420
D0     0.5305   0.5253   0.5246   0.5273   0.5228   0.5236   0.7345   0.7171   0.7190   0.1846
D1     0.5260   0.5192   0.5188   0.5194   0.5126   0.5117   0.7503   0.7241   0.7259   0.2009
E0     0.5291   0.5223   0.5235   0.5346   0.5285   0.5294   0.7938   0.7767   0.7780   0.2355
E1     0.5492   0.5420   0.5423   0.5296   0.5232   0.5234   0.7596   0.7449   0.7465   0.2042
Ave.   0.5287   0.5221   0.5223   0.5254   0.5193   0.5196   0.7630   0.7439   0.7458   0.2098
Std.   0.0096   0.0096   0.0093   0.0079   0.0080   0.0085   0.0390   0.0412   0.0411   0.0358

(b) Class III
Exp    bas      sym      sym+lar  bas      sym      sym+lar  bas      sym      sym+lar  sym
A0     0.5177   0.5113   0.5136   0.5219   0.5167   0.5187   0.8108   0.8068   0.8077   0.2591
A1     0.5196   0.5133   0.5157   0.5163   0.5103   0.5116   0.6965   0.6823   0.6857   0.1488
B0     0.5270   0.5207   0.5234   0.5270   0.5208   0.5234   0.7688   0.7516   0.7562   0.2080
B1     0.5171   0.5114   0.5123   0.5208   0.5156   0.5176   0.7713   0.7566   0.7541   0.2276
C0     0.5223   0.5166   0.5174   0.5100   0.5045   0.5040   0.6939   0.6753   0.6811   0.1496
C1     0.5395   0.5332   0.5331   0.5365   0.5310   0.5312   0.7997   0.7826   0.7857   0.2368
D0     0.5289   0.5244   0.5254   0.5265   0.5223   0.5246   0.7301   0.7155   0.7207   0.1796
D1     0.5245   0.5185   0.5185   0.5183   0.5123   0.5122   0.7438   0.7240   0.7204   0.1948
E0     0.5265   0.5209   0.5232   0.5323   0.5274   0.5291   0.7818   0.7705   0.7738   0.2274
E1     0.5478   0.5416   0.5434   0.5270   0.5219   0.5233   0.7463   0.7407   0.7427   0.1952
Ave.   0.5271   0.5212   0.5226   0.5236   0.5183   0.5196   0.7543   0.7406   0.7434   0.2027
Std.   0.0098   0.0098   0.0096   0.0078   0.0080   0.0085   0.0398   0.0421   0.0410   0.0365

From a comparison of the columns "sym" and "bas" in Table 13.2(a), we conclude that the symmetry assumption leads to improved (smaller) entropy estimates for all Gabor images. This implies that the symmetry assumption is reasonable. The corresponding estimates of the mutual information are listed in the column "sym" of Table 13.1(a). From Table 13.1(a) we see that the average of the 10 "sym" estimates is larger than the average found using the basic approach; more specifically, 9 out of 10 estimates are larger than for the basic approach.

3. In the third approach, we have increased the template size from four to six context symbols; see Fig. 13.10. Just as in the previous approach, we assumed symmetry of the statistics. The resulting entropy estimates (column "sym+lar") show that we do not gain from increasing the template size.


Fig. 13.9. Codeword lengths $\lambda(X)$, $\lambda(Y)$, $\lambda(XY)$, and $\lambda(X) + \lambda(Y) - \lambda(XY)$ as a function of the number of processed positions.

Fig. 13.10. Template showing the increased number of context symbols and their ordering.


4. In the fourth approach, we have determined the mutual information using the conditional formula $I(X;Y) = H(X) - H(X|Y)$. To determine the codeword length $\lambda(X|Y)$, we selected seven context symbols in total from both the X and Y images. The resulting template is shown in Fig. 13.11. Again, we assumed that the statistics are symmetric. This method leads to higher mutual information estimates than the estimates based on $H(X) + H(Y) - H(XY)$; see the column labeled "sym+con."
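A minimal sketch of the parameter tying implied by the symmetry assumption of item 2 (our own illustration; the convention of complementing whenever the first context symbol is 1 is an assumption, not the chapter's stated rule): each (symbol, context) pair and its bitwise complement are mapped to a single representative, so both update the same counts.

def canonical(symbol, context):
    """Map (x; c1, ..., cD) and (1-x; 1-c1, ..., 1-cD) to one representative
    so that complementary observations share one set of KT counts."""
    if context and context[0] == 1:
        return 1 - symbol, tuple(1 - c for c in context)
    return symbol, tuple(context)

# Before updating the context tree, replace (symbol, context) by
# canonical(symbol, context); this roughly halves the parameter count.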

13.8.2 Class III Analysis

The same analysis was performed using the class III CTW method. We used the same context positions as before, but note that the ordering is irrelevant now.

Fig. 13.11. Template showing the context symbols and their ordering for the computation of $\lambda(X|Y)$. The current position (marked X) in the X image corresponds to position 1 in the Y image.

Tables 13.1(b) and 13.2(b) describe the results of the class III analysis. Just as for class IV, the estimates based on the symmetry assumption are more reliable than those obtained with the basic approach. Moreover, for class III, a larger template does not improve the estimates, and here, too, the conditional formula leads to the highest mutual information estimates.

From the entropy estimates in Table 13.2, we may conclude that the estimates for class III are smaller and, consequently, more reliable than the estimates for class IV. Therefore, we have more confidence in the mutual information estimates obtained from class III weighting methods than in those from class IV methods. The difference between corresponding estimates is always quite small. These small differences can be explained by noting that the template ordering was optimized to perform well for class IV methods.

Remark 13.1. Looking at the entropy estimates in Table 13.2, we notice that for both class IV and class III models, $H(XY) - H(Y) > H(X|Y)$. From this we conclude that the conditional entropy estimate from $\lambda(X|Y)$ is more reliable than the estimate from $\lambda(XY) - \lambda(Y)$. As a consequence, the conditional formula for the mutual information leads to more accurate estimates than the symmetric formula.

13.9 Conclusions

We have used CTW methods to estimate the secrecy rate of binarized Gabor-filtered speckle patterns obtained from optical PUFs. Several alternative approaches lead to the conclusion that secrecy rates of 0.31 bit/pixel are possible. This corresponds to 0.7 bit per average speckle area. (The average speckle width is approximately 12 pixels in the unfiltered image; see Section 15.3.1.)

Class III gives more reliable and slightly higher estimates of the secrecy rate than class IV, since it is based on a richer model class. In theory, our methods only converge to the entropy for asymptotically large images if there is no bound on the context size. Note that we have definitely not reached this situation here.

In the present chapter we have focused on estimating the secrecy rate from 45° Gabor images. It is obvious that similar estimates can be found for images that result from 135° Gabor filtering. The 45° and 135° Gabor images are very weakly correlated to each other [287], representing almost independent data, but their statistics are equivalent; therefore, it is possible to compress both images using the same context tree [150]. The estimates obtained in this way are, in principle, more reliable than estimates based only on 45° Gabor images.

The secrecy rate estimates obtained in this chapter are significantly larger (approximately by a factor of 7) than those obtained with the schemes proposed in Chapter 16. This indicates that there is still much room for improvement in designing secret-key extraction techniques.

Finally, we mention that techniques like the ones that we have applied here can be used to estimate the identification capacity of biometric systems [292]; see also Chapter 4.
