
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY

Efficient and Privacy-Preserving Cryptographic Key Derivation from Continuous Sources

Enrique Argones Rúa, Aysajan Abidin, Roel Peeters, Jac Romme

Abstract—The procedure for extracting a cryptographic key from noisy sources, such as biometrics and Physically Uncloneable Functions (PUFs), is known as a Fuzzy Extractor (FE). Although FE constructions deal with discrete sources, most noisy sources are continuous. In the continuous case, the source must first be transformed into a discrete one. We introduce (i) a model-based uncoupling construction that deals directly with the continuous noisy source and produces helper data uncoupling the discrete representation from the noisy source, guaranteeing the diversity of the discrete representation and making it more robust; and (ii) a strengthened uncoupled fuzzy extractor, suitable for privacy-preserving applications, that integrates an additional fixed authentication factor and obtains a key uncoupled from the noisy sources and unlinkable helper data. We present optimal model-based uncoupling constructions for Gaussian sources. Specifically, we show how to extract (i) one or multiple bits from a single Gaussian source, and (ii) one bit from several unreliable Gaussian sources; and we provide a general procedure to obtain an optimal uncoupled FE from Gaussian source(s). Our experiments show that the proposed constructions achieve much higher security levels for wide operational scenarios, approximately doubling the obtained effective key length without affecting false rejection rates.

I. INTRODUCTION

WITH the Internet of Things comes an ever growing number of devices that need to communicate securely with each other. To set up secure communication among all these devices, solutions increasingly make use of noisy sources such as biometrics (including behaviometrics) to authenticate users; Physically Uncloneable Functions (PUFs) for deriving inherent device-unique keys without the need for programming these devices; and sensor inputs to derive short-lived session keys, e.g., to exchange data with devices that are in the same physical environment.

The construction for extracting cryptographic keys from noisy sources is known as a Fuzzy Extractor (FE) [1]. In general, an FE consists of a generation and a reproduction procedure. In the generation procedure, one derives from an inherently noisy discrete source a stable string, which is indistinguishable from a random string, together with helper data. In the reproduction procedure, one recovers the same string from the same source and the helper data computed during the generation procedure.

Enrique Argones Rúa, Aysajan Abidin, and Roel Peeters are with imec—Computer Security and Industrial Cryptography, Departement Elektrotechniek - ESAT, KU Leuven. Jac Romme is with imec, The Netherlands.

This work was supported by the Flemish government through the FWO SBO project SPITE S002417N and by imec through the Security & Privacy Centre projects on Biometrics & Authentication and Secure Distance Bounding.

Manuscript received March 22, 2019.

However, FEs do not deal with continuous sources, which are the most common in the real world.

In this paper, we first propose a model-based uncoupling construction, which extracts from a continuous source a discrete representation and metadata useful for reproduction. This construction is a generalisation of the Helper Data Schemes proposed by Groot et al. [2]. The metadata is essential for ensuring the maximum reliability of the discrete representation, which is uncoupled from the continuous source. For the case of Gaussian sources, we propose a complete procedure that ensures a minimum number of errors in the reproduction procedure when sampling from the genuine source(s), but full entropy when sampling from the general population.

Note that the metadata extracted within the model-based uncoupling construction discloses identifiable information about the Gaussian source: it is only guaranteed that a bounded amount of entropy remains hidden, disclosing the rest about the continuous source. For the derivation of one-time session keys, e.g., based on sensor data, which can be assumed to be public and sufficiently different from one realisation to another (due to highly fluctuating sources), the information leakage on the continuous source is not a concern. The metadata could thus remain unprotected. However, the information leakage becomes a privacy concern when the signal itself may be considered as sensitive information, e.g., biometrics, or in general when the FE is used to derive cryptographic keys based on non-variant noisy inputs. In these cases, the metadata enables cross-linking devices (PUFs) or even users (biometrics) [3] among different applications. For these cases, we propose to use a second factor in the strengthened uncoupled fuzzy extractor to ensure complete unlinkability among the stored information in different instances of templates from the same noisy sources. This construction is consistent with the existing standard recommendations for high security access control systems [4]. The uncoupling between the discrete representation and the continuous noisy source ensures the maximum diversity of the obtained discrete representations, enabling revocation and renewal of the obtained templates. These properties make the strengthened uncoupled fuzzy extractor suitable for privacy-preserving applications.

Concretely, our contributions are as follows:

• The formalization of the Strengthened Uncoupled Fuzzy Extractor, which (i) obtains a discrete representation of the biometric model uncoupled from the biometric model by means of the Model-based Uncoupling Construction, then (ii) protects the produced metadata, linkable with the continuous source, using a fixed authentication factor to avoid the disclosure of any information about the protected template, and (iii) derives helper data which allows verification and does not disclose information about the continuous source.

• Introduction and theoretical analysis of optimal Model-based Uncoupling Constructions for Gaussian sources, using zero-leakage metadata extraction procedures for reliable Gaussian sources and optimal feature combination for unreliable ones, allowing maximum reliability of the discrete representation for genuine attempts while ensuring uniform distribution for impostor attempts.

• Description of the complete design process of a strengthened uncoupled fuzzy extractor for Gaussian features, supported by the previous model-based uncoupling constructions, which closes the gap between Gaussian features and fuzzy extractors.

Finally, we conducted an empirical evaluation of the proposed constructions in different use cases for key agreement based on shared noisy radio frequency (RF) channel representations. This evaluation shows the advantages of the proposed constructions over uniform quantization, which is currently the most widespread approach, in terms of effective key length and False Rejection Rate (FRR).

II. RELATED WORK

Since the publication of Juels and Wattenberg's fuzzy commitment scheme [5], there have been many advances in obtaining cryptographic keys from noisy sources. Dodis et al. [1] formalized the well-known FE scheme. However, a degradation in performance is commonly observed when integrating the fuzzy extraction in verification chains. The extraction of robust biometric features suitable for the construction of FEs has been the focus of some works in biometrics, such as the work by Tong et al. [6]. Another example can be found in [7], where eigenmodel features are used within a fuzzy commitment scheme for online signature verification, but the degradation in performance is still observable even after the fusion of features before the quantization. Billeb et al. [8] proposed to use the same type of features for speaker verification, and then to apply feature selection based on a reliability measure for the inclusion of the obtained binary features in the FE. Van der Veen et al. [9] proposed the use of a fuzzy commitment scheme for face verification, also highlighting the importance of selecting the most reliable bits before including them in the FE. This trend can be observed in other works on biometric template protection for other modalities such as iris [10]. The use of sophisticated error correcting codes such as Turbo-Codes was explored by Maiorana et al. [11] for the features extracted in [7].

FEs have also become increasingly important in PUF-based key generation. One challenging task in IoT and/or embedded devices is the generation and storage of cryptographic keys. Such devices have little protection against unauthorized access to their non-volatile memory. PUFs together with an FE can offer a solution to this challenge. The idea is to use an FE to first derive a cryptographic key and helper data from the PUF response (which is typically a bitstring), and then to use the helper data together with a noisy PUF response to recover the key again. Note that, depending on the number of challenges that the PUF can process, it is categorized as either a strong PUF or a weak PUF. While the former can process a large number (ideally, exponential in the length of the response) of challenges, the latter can only support a small number of challenges (usually, only one challenge). As such, strong PUFs are mostly used for authentication, while weak PUFs are used for key derivation. There has been a lot of research on PUF-based key generation using FEs [12], [13], [14], [15], [16], [17]. However, these PUF-based key derivation solutions only use FEs to derive a key from the (possibly) noisy PUF response, and do not take into account the physical measurements that produce the PUF response. Therefore, our parametrized binarisation approach can be applied to quantize the physical measurements inside weak PUFs to reduce bit error rates.

Ignatenko [18] analyzed the information loss due to the helper data in FEs, although her work mainly focused on dealing with quantized biometrics. The combination with other authentication factors was analyzed from an information theoretic point of view in [19], again dealing with fixed quantization schemes for the biometrics. Verbitskiy et al. [20] analyzed the problem of fuzzy extraction from continuous sources, pointing out that the distribution of the continuous features can be used to derive uniformly distributed keys. Groot et al. [2] showed a zero-leakage helper data scheme for fuzzy extraction for general distributions of continuous features, which is of practical interest for the construction of FEs when dealing with reliable continuous features, though they do not cover the case of unreliable features, where a combination of these is needed.

III. STRENGTHENED UNCOUPLED FUZZY EXTRACTOR

This section formally introduces the strengthened uncoupled fuzzy extractor and presents a practical implementation based on symmetric encryption, universal one-way hash functions and error correcting codes, as defined in Appendix A. Below we recall the necessary concepts and naming conventions:

A population $\mathcal{O} = \{O_1, \ldots, O_P\}$ is a set of a variable number of individuals (persons or objects) that share some common measurable characteristics.

Given a population $\mathcal{O}$ of individuals, a random characterization $Q$ of an individual $O \in \mathcal{O}$ is a random variable obtained as a noisy measurement of some characteristics of the individual which are common among the individuals in the population: $Q = \mathrm{measure}(O) \in \mathcal{Q}$, where $\mathcal{Q}$ is the measurements' range. We denote by $q \leftarrow Q$ a realization of the random characterization, or sample. Let $\mathcal{M}$ be a model space, in which a model $M \in \mathcal{M}$ of an individual $O \in \mathcal{O}$ is a set of random parameters describing a random characterization of $O$. We denote by $m \leftarrow M$ a realization of the model $M$, or template.

A. Formal Constructions

Now we present the formal constructions, first those dealing with the models of the noisy sources (the individuals within a population), and then the strengthened uncoupled fuzzy extractor, integrating the fuzzy extraction, the fuzzy model characterisations, and the fixed authentication factor.

Definition III.1 (Model estimation function). A model estimation function $\mathrm{model} : \mathcal{Q}^E \to \mathcal{M}$ maps a vector of $E$ enrolment samples $[q_1, \ldots, q_E]$ from a random characterization $Q$ of an individual $O$ to a template of the individual, $m = \mathrm{model}(q_1, \ldots, q_E)$.

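Definition III.2 (Model-based verifier). A function $\mathrm{ver}_{\mathcal{M}} : \mathcal{M} \times \mathcal{Q}^V \to \{0, 1\}$ mapping a template $m_i$ of an individual $O_i$ and a vector $\mathbf{q}_j = [q_j^1, \ldots, q_j^V]$ of $V$ verification samples from $Q_j = \mathrm{measure}(O_j)$ to a decision on whether $O_i$ and $O_j$ are the same is called a $(\mathcal{Q}, \mathcal{M}, V, \mathrm{FAR}, \mathrm{FRR})$-model-based verifier, characterized by the following error probabilities:

\[
\begin{aligned}
\mathrm{FRR} &= \Pr\left\{ \mathrm{ver}_{\mathcal{M},\mathcal{Q}^V}(m_i, \mathbf{q}_j) = 0 \mid O_i = O_j \right\}, \\
\mathrm{FAR} &= \Pr\left\{ \mathrm{ver}_{\mathcal{M},\mathcal{Q}^V}(m_i, \mathbf{q}_j) = 1 \mid O_i \neq O_j \right\}.
\end{aligned}
\tag{1}
\]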
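As an illustration of Definitions III.1 and III.2, the following minimal Python sketch estimates a template and measures empirical error rates. The threshold verifier `ver`, the threshold `thr` and the trial format are hypothetical choices made only for this example; the definitions above do not prescribe any particular verifier.

```python
from statistics import fmean

def model(samples):
    """Model estimation: maximum-likelihood mean of the enrolment samples."""
    return fmean(samples)

def ver(template, samples, thr):
    """Hypothetical verifier: accept (1) if the sample mean is within thr of the template."""
    return int(abs(fmean(samples) - template) <= thr)

def error_rates(trials, thr):
    """Empirical (FRR, FAR) over trials of the form (template, samples, same_individual)."""
    genuine = [ver(m, q, thr) for m, q, same in trials if same]
    impostor = [ver(m, q, thr) for m, q, same in trials if not same]
    frr = genuine.count(0) / len(genuine)    # genuine attempts that were rejected
    far = impostor.count(1) / len(impostor)  # impostor attempts that were accepted
    return frr, far

# Toy data: one enrolled template and a handful of single-sample attempts (V = 1).
m = model([1.0, 1.2, 0.8])
trials = [
    (m, [1.1], True), (m, [1.6], True),
    (m, [0.9], False), (m, [3.0], False), (m, [2.5], False), (m, [-1.0], False),
]
frr, far = error_rates(trials, thr=0.3)
print(f"FRR={frr:.2f} FAR={far:.2f}")
```

The two rates are the empirical counterparts of the probabilities in (1): FRR conditions on $O_i = O_j$, FAR on $O_i \neq O_j$.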

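Definition III.3 (Model-based uncoupling construction). Let $(\mathrm{ME}, \mathrm{PM})$ be a construction comprising:

(i) a metadata extractor $\mathrm{ME} : \mathcal{M} \times \mathcal{C}^n \to \mathcal{P}$, and
(ii) a parametrized mapping $\mathrm{PM} : \mathcal{Q}^V \times \mathcal{P} \to \mathcal{C}^n$,

where $\mathcal{P}$ is the metadata domain. If, for a given model $M$ of an individual $O$, a random string $C \in \mathcal{C}^n$ independent from $M$, where $\mathcal{C}$ is a finite alphabet, and the metadata $P_{M,C} = \mathrm{ME}(M, C)$, the following equations hold:

\[
I(C, P_{M,C}) = 0, \qquad H(M \mid P_{M,C}) \geq n,
\tag{2}
\]

where $I(\cdot, \cdot)$ is the mutual information and $H(\cdot \mid \cdot)$ is the conditional entropy, then $(\mathrm{ME}, \mathrm{PM})$ is a $(\mathcal{Q}, \mathcal{M}, \mathcal{P}, \mathcal{C}, V, n)$-model-based uncoupling construction.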
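The first condition in (2) can be checked exactly on a toy example. The sketch below assumes a hypothetical one-bit metadata extractor $P = B \oplus C$, where $B$ is a bit derived from the model (for instance a sign bit) and $C$ is a uniform bit independent of $B$. This toy extractor is not one of the constructions proposed in this paper; it only shows how $I(C, P_{M,C})$ behaves.

```python
from math import log2

def mutual_information(joint):
    """I(X;Y) in bits from a dict {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def joint_c_p(prob_b_one):
    """Joint law of (C, P) for P = B xor C, with C uniform and independent of B."""
    joint = {}
    for c in (0, 1):
        for b in (0, 1):
            pb = prob_b_one if b else 1.0 - prob_b_one
            key = (c, b ^ c)
            joint[key] = joint.get(key, 0.0) + 0.5 * pb
    return joint

print(mutual_information(joint_c_p(0.5)))  # unbiased model bit
print(mutual_information(joint_c_p(0.8)))  # biased model bit
```

With an unbiased model bit the metadata reveals nothing about $C$; with a biased one the mutual information is strictly positive, so this toy extractor would violate (2).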

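Definition III.4 (Uncoupled model-based verifier). A function $\mathrm{ver}_{\mathrm{ME},\mathrm{PM}} : \mathcal{M} \times \mathcal{Q}^V \to \{0, 1\}$ defined as:

\[
\mathrm{ver}_{\mathrm{ME},\mathrm{PM}}(m_i, \mathbf{q}_j) =
\begin{cases}
1 & \text{if } d(c, \hat{c}) \leq t, \\
0 & \text{otherwise},
\end{cases}
\tag{3}
\]

where $m_i \in \mathcal{M}$ is a template of an individual $O_i$, $\mathbf{q}_j = [q_j^1, \ldots, q_j^V]$ is a vector of $V$ verification samples from $Q_j = \mathrm{measure}(O_j)$, $c \in \mathcal{C}^n$ is a random string independent from $m_i$, $d(\cdot, \cdot)$ is a distance in $\mathcal{C}^n$, $(\mathrm{ME}, \mathrm{PM})$ is a $(\mathcal{Q}, \mathcal{M}, \mathcal{P}, \mathcal{C}, V, n)$-model-based uncoupling construction, $p_{m_i,c} = \mathrm{ME}(m_i, c)$, and $\hat{c} = \mathrm{PM}(\mathbf{q}_j, p_{m_i,c})$, is called a $(\mathcal{Q}, \mathcal{M}, \mathcal{P}, \mathcal{C}, V, n, t, \mathrm{FAR}, \mathrm{FRR})$-uncoupled model-based verifier, characterized by the following error probabilities:

\[
\mathrm{FRR} = \Pr\left\{ \mathrm{ver}_{\mathrm{ME},\mathrm{PM}}(m_i, \mathbf{q}_j) = 0 \mid O_i = O_j \right\},
\tag{4}
\]

\[
\mathrm{FAR} = \Pr\left\{ \mathrm{ver}_{\mathrm{ME},\mathrm{PM}}(m_i, \mathbf{q}_j) = 1 \mid O_i \neq O_j \right\}.
\tag{5}
\]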

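Definition III.5 (Strengthened uncoupled fuzzy extractor). The procedures $(\mathrm{Gen}, \mathrm{Rep})$ constitute an $(\mathcal{FS}, \mathcal{Q}, \mathcal{M}, \mathcal{PS}, \mathcal{C}, V, n, k, \mathrm{FAR}, \mathrm{FRR})$-uncoupled extractor when the following conditions hold:

1) Gen is a probabilistic generation procedure $\mathrm{Gen} : \mathcal{FS} \times \mathcal{M} \to \mathcal{PS} \times \mathcal{C}^k$ which, on input $(\mathit{fs}, m_i)$, where $m_i$ is a template from an individual $O_i$ and $\mathit{fs} \in \mathcal{FS}$ is a fixed secret sampled from the random variable $FS$ independent from $M_i$, outputs a public string $\mathit{ps} \leftarrow PS \in \mathcal{PS}$ and a secret uniformly random string $\mathit{ss} \leftarrow SS \sim \mathcal{U}_{\mathcal{C}^k}$ (hereafter $\mathcal{U}_{?}$ is the uniform distribution over the domain $?$), such that:

(i) $I(PS, M_i) = 0$.
(ii) $I(SS, M_i) = 0$.
(iii) $I(M_i, (FS, PS)) \leq H(M_i) - n$.

2) Rep is a function $\mathrm{Rep} : \mathcal{FS} \times \mathcal{PS} \times \mathcal{Q}^V \to \mathcal{C}^k \cup \{\emptyset\}$ which takes as input $(\mathit{fs}, \mathit{ps}, \mathbf{q}_j)$ and, given that $(\mathit{ps}, \mathit{ss}) = \mathrm{Gen}(\mathit{fs}, m_i)$, where $m_i \in \mathcal{M}$ is a template of an individual $O_i$, satisfies the following equations:

(i) $\Pr\{\mathrm{Rep}(\mathit{fs}, \mathit{ps}, \mathbf{q}_j) = \emptyset \mid O_i = O_j\} = \mathrm{FRR}$.
(ii) $\Pr\{\mathrm{Rep}(\mathit{fs}, \mathit{ps}, \mathbf{q}_j) = \mathit{ss} \mid O_i = O_j\} = 1 - \mathrm{FRR}$.
(iii) $\Pr\{\mathrm{Rep}(\mathit{fs}, \mathit{ps}, \mathbf{q}_j) = \mathit{ss} \mid O_i \neq O_j\} = \mathrm{FAR}$.
(iv) $\Pr\{\mathrm{Rep}(\mathit{fs}, \mathit{ps}, \mathbf{q}_j) = \emptyset \mid O_i \neq O_j\} = 1 - \mathrm{FAR}$.

3) $\Pr\{\mathrm{Rep}(\mathit{fs}', \mathit{ps}', \mathbf{q}_j) = \mathit{ss}\} < \mathrm{negl}(k)$ if $\mathit{fs}' \neq \mathit{fs}$ or $\mathit{ps}' \neq \mathit{ps}$, where $\mathrm{negl}(\cdot)$ is a negligible function.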

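Theorem 1 (Strengthened uncoupled fuzzy extractor from a model-based uncoupling construction, an error correcting code, a universal one-way hash function and a perfectly secure symmetric encryption scheme). Let

• $(\mathrm{KeyGen}, \mathrm{ENC}, \mathrm{DEC})$ be a $(\mathcal{FS}, \mathcal{P} \times \mathcal{H} \times \mathcal{W}, \mathcal{PS})$-perfectly secure symmetric encryption scheme,
• $(\mathrm{ECC\text{-}enc}, \mathrm{ECC\text{-}dec})$ be a $(\mathcal{C}, n, k, t)$-error correcting code,
• $(\mathrm{ME}, \mathrm{PM})$ be a $(\mathcal{Q}, \mathcal{M}, \mathcal{P}, \mathcal{C}, V, n)$-model-based uncoupling construction with an associated $\mathrm{ver}_{\mathrm{ME},\mathrm{PM}}$ $(\mathcal{Q}, \mathcal{M}, \mathcal{P}, \mathcal{C}, V, n, t, \mathrm{FAR}, \mathrm{FRR})$-uncoupled model-based verifier,
• and $\mathcal{W}$ be a family of universal one-way hash functions. Let $\mathsf{H} \subset \mathcal{W}$ be the collection of functions in $\mathcal{W}$ from $\mathcal{C}^k$ to $\mathcal{H}$, which is the range of $\mathsf{H}$.

Then, the procedures:

• $\mathrm{Gen} : \mathcal{FS} \times \mathcal{M} \to \mathcal{PS} \times \mathcal{C}^k$, which takes as input a fixed secret $\mathit{fs}$ and a model $m_i$ of an individual $O_i$, samples $\mathit{ss} \leftarrow \mathcal{U}_{\mathcal{C}^k}$ and $h \leftarrow \mathcal{U}_{\mathsf{H}}$, and computes $\delta = h(\mathit{ss})$, $c = \mathrm{ECC\text{-}enc}(\mathit{ss})$, $p_{m_i,c} = \mathrm{ME}(m_i, c)$, and $\mathit{ps} = \mathrm{ENC}_{\mathit{fs}}(p_{m_i,c} \| \delta \| h)$, where $\|$ stands for concatenation, and outputs $(\mathit{ps}, \mathit{ss})$; and
• $\mathrm{Rep} : \mathcal{FS} \times \mathcal{PS} \times \mathcal{Q}^V \to \mathcal{C}^k$, which takes as input the fixed secret $\mathit{fs}$, the public string $\mathit{ps}$, and a vector $\mathbf{q}_j = [q_j^1, \ldots, q_j^V] \in \mathcal{Q}^V$ of samples from $Q_j = \mathrm{measure}(O_j)$, with $O_j$ an individual from population $\mathcal{O}$, computes $p_{m_i,c} \| \delta \| h = \mathrm{DEC}_{\mathit{fs}}(\mathit{ps})$, $\hat{c} = \mathrm{PM}(\mathbf{q}_j, p_{m_i,c})$, $\widehat{\mathit{ss}} = \mathrm{ECC\text{-}dec}(\hat{c})$, and $\hat{\delta} = h(\widehat{\mathit{ss}})$, and outputs $\widehat{\mathit{ss}}$ if $\hat{\delta} = \delta$, $\emptyset$ otherwise;

constitute an $(\mathcal{FS}, \mathcal{Q}, \mathcal{M}, \mathcal{PS}, \mathcal{C}, V, n, k, \mathrm{FAR}', \mathrm{FRR}')$-uncoupled extractor, with $\mathrm{FRR}' \in (\mathrm{FRR} - \mathrm{negl}(k), \mathrm{FRR}]$ and $\mathrm{FAR}' \in [\mathrm{FAR}, \mathrm{FAR} + \mathrm{negl}(k))$.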

The proof of this theorem follows from the definitions of the constructions used.

The strengthened uncoupled fuzzy extractor presented in Definition III.5, and instantiated using symmetric encryption, error correcting codes, universal one-way hash functions, and model-based uncoupling constructions in Theorem 1, is illustrated in Fig. 1.

Fig. 1. Gen and Rep procedures of a strengthened uncoupled fuzzy extractor from a model-based uncoupling construction, an error correcting code, a universal one-way hash function and a perfectly secure symmetric encryption scheme.

The Gen procedure uses a secret key $\mathit{fs}$ and a template $m_i$ of a noisy source $O_i$ as input, and outputs a secret string $\mathit{ss}$ and a public string $\mathit{ps}$. The secret string is randomly generated and encoded into a codeword $c$. This codeword and the template are the inputs to the ME procedure of the model-based uncoupling construction, which computes the metadata $p_{m_i,c}$. The randomly chosen hash descriptor $h$, the hash of the secret string $\delta = h(\mathit{ss})$, and the metadata are concatenated and encrypted using the key $\mathit{fs}$ into the public string $\mathit{ps}$. The Rep procedure uses the public string, the secret key, and a collection of samples $\mathbf{q}_j$ from the noisy source $O_j$ as inputs, and outputs the secret string $\mathit{ss}$ only when the noisy sources $O_i$ and $O_j$ match. In this procedure, the public string is decrypted, obtaining the metadata, the hash descriptor, and the hash of the secret string. The metadata and the samples from the noisy source are input to the PM procedure of the model-based uncoupling construction, which outputs an estimate of the codeword $\hat{c}$. This estimate is decoded into the estimate of the secret string $\widehat{\mathit{ss}}$, and the hash of this estimate is compared to the hash of the actual secret string in order to release the secret string.
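The Gen and Rep data flow described above can be sketched in a few lines of Python. Every building block here is a toy stand-in chosen for brevity and is an assumption of this example, not the instantiation used in the paper: a repetition code plays the role of the error correcting code, SHA-256 stands in for the universal one-way hash function (so no hash descriptor $h$ is stored), a one-time pad with key `fs` plays the role of the perfectly secure encryption, and the hypothetical `me`/`pm` pair simply XORs sign bits with the codeword.

```python
import hashlib
import secrets

REP = 5        # toy repetition code: each secret bit is repeated REP times
HASH_LEN = 32  # SHA-256 digest length in bytes

def ecc_enc(ss):
    """Repetition encoder: k secret bits -> n = k * REP codeword bits."""
    return [b for b in ss for _ in range(REP)]

def ecc_dec(c):
    """Majority-vote decoder, corrects up to REP // 2 errors per block."""
    return [int(sum(c[i:i + REP]) > REP // 2) for i in range(0, len(c), REP)]

def sign_bits(x):
    """Hypothetical binarisation of real-valued features (1 if positive)."""
    return [int(v > 0) for v in x]

def me(m, c):
    """Toy metadata extractor: template sign bits XOR codeword."""
    return [b ^ ci for b, ci in zip(sign_bits(m), c)]

def pm(q, p):
    """Toy parametrized mapping: sample sign bits XOR metadata."""
    return [b ^ pi for b, pi in zip(sign_bits(q), p)]

def h(ss):
    """SHA-256 stands in for the universal one-way hash function."""
    return hashlib.sha256(bytes(ss)).digest()

def xor(a, b):
    """Bytewise XOR, used as one-time-pad encryption and decryption."""
    return bytes(x ^ y for x, y in zip(a, b))

def gen(fs, m):
    """Return (ps, ss) for a fixed secret fs and a template m."""
    assert len(m) % REP == 0 and len(fs) == len(m) + HASH_LEN
    ss = [secrets.randbelow(2) for _ in range(len(m) // REP)]
    p = me(m, ecc_enc(ss))
    ps = xor(fs, bytes(p) + h(ss))  # encrypt metadata || hash of the secret
    return ps, ss

def rep(fs, ps, q):
    """Return the secret string, or None when the hash check fails."""
    msg = xor(fs, ps)
    p, delta = list(msg[:-HASH_LEN]), msg[-HASH_LEN:]
    ss_hat = ecc_dec(pm(q, p))
    return ss_hat if h(ss_hat) == delta else None

# Usage with a hypothetical 40-feature template (k = 8 secret bits).
m = [(-1) ** i * (0.5 + 0.1 * i) for i in range(40)]
fs = secrets.token_bytes(len(m) + HASH_LEN)
ps, ss = gen(fs, m)
print(rep(fs, ps, [v + 0.05 for v in m]) == ss)  # genuine sample, same signs as m
print(rep(fs, ps, [-v for v in m]))              # every sign flipped
```

The sketch only mirrors the order of operations in Fig. 1; it makes no claim about the FAR, FRR or leakage properties required by Definition III.5.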

This construction can be understood as an adaptation of the FE aimed at introducing a second authentication factor, a fixed (non-variable) secret key. This key can be a knowledge- or possession-based authentication factor, and it protects the metadata that drives the parametrized mapping of the noisy authentication factor in the model-based uncoupling construction. If the fixed authentication factor is compromised, the adversary may only access the mapping metadata, in contrast to any key agreement protocol based on pre-existing shared secrets, where the security would be totally compromised. This metadata discloses some information about the noisy authentication factor, which can be used in some cases for linking the noisy authentication factor across different services, and therefore it has to be protected. However, this disclosure would not pose a threat to security, since the target codeword and the metadata cannot be linked, unlike in the original FE. Most importantly, the public and secret strings produced by this construction are perfectly uncoupled from the two authentication factors used (the mutual information is 0), and it is trivial to produce unlinkable public and secret strings by simply changing the key, which is an additional advantage with respect to the original FE.

The role of the metadata in the model-based uncoupling construction is two-fold:

1) It uncouples the target codeword from the noisy source, making it possible to produce diverse secret strings for a given noisy source.

2) It improves the reliability of the parametrized mapping, resulting in an improved FAR and FRR of the construction, and therefore in higher entropies of the secret string.

The first aspect follows directly from the definitions, since the codeword is independent from the noisy source. However, the second is not obvious, since it entails designing a robust quantization of the noisy source. In the following sections we will show how to design an uncoupled model-based classifier for a common family of noisy sources, achieving both goals simultaneously.

IV. EFFICIENT MODEL-BASED UNCOUPLING CONSTRUCTION FOR GAUSSIAN SOURCES

One widely used feature representation is based on the projection of the original features into an eigenspace. This kind of representation provides several advantages, including feature dimensionality reduction, denoising and decorrelation of the projected features. This approach has been used in different biometric modalities. For instance, in the case of face biometrics, some examples are the eigenface approach [21], or [22]. Eigenspace techniques have also been successfully applied to characterize temporal sequences, such as speech or online signatures. In the case of speaker recognition, its use is even more prevalent, and approaches such as eigenvoices [23], joint factor analysis [24] or i-vectors [25], which also rely on an eigenspace transformation, are the core methods of the state of the art. A similar approach has also been explored in [7] for characterizing online signature templates. These Gaussian features can be defined for the case of single and multiple sources as follows:

• Single Gaussian source: A single Gaussian source (or feature) is a measurement $Q_i \sim \mathcal{N}(\mu, \sigma^2)$ from an individual $O_i$ in a population $\mathcal{O}$. This population determines the a priori distribution of the mean of the Gaussian sources. For simplicity but without loss of generality, let the Gaussian sources within the population $\mathcal{O}$ be normalized, making the a priori distribution of the mean a standard Gaussian, i.e. $\mu \sim \mathcal{N}(0, 1)$.

• Multiple Gaussian sources: Multiple Gaussian sources are defined as a set of independent Gaussian sources or features $\mathbf{Q}_i = \{Q_i^1, \ldots, Q_i^F\}$ characterizing individual $O_i$ in a population $\mathcal{O}$, where $F$ is the number of features in the set. Each feature follows a Gaussian distribution, i.e. $Q_i^f \sim \mathcal{N}(\mu_f, \sigma_f^2)$, and we also assume the a priori distribution of the mean to be a standard Gaussian. The independence assumption is justified by the common possibility of using feature transformations which provide uncorrelated features from jointly Gaussian features, as explained above.

We will focus on Gaussian sources with known standard deviations, but with means estimated from $E$ enrolment samples. We will only consider a model-based classifier which uses a single sample ($V = 1$) in the verification phase, which is the most realistic case. Also, we will constrain ourselves to the most widely used binary alphabet $\mathcal{C} = \{0, 1\}$, although equivalent constructions can be easily derived for other alphabets.

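In order to construct an efficient uncoupled model-based verifier, we need a model-based uncoupling construction that aims at minimizing both FAR and FRR, defined as:

\[
\mathrm{FAR} = \Pr\left\{ d\left(c, \mathrm{PM}(\mathbf{q}_j, \mathrm{ME}(m_i, c))\right) \leq t \mid O_i \neq O_j \right\},
\tag{6}
\]

\[
\mathrm{FRR} = \Pr\left\{ d\left(c, \mathrm{PM}(\mathbf{q}_j, \mathrm{ME}(m_i, c))\right) > t \mid O_i = O_j \right\}.
\tag{7}
\]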

Minimizing the FAR can be achieved by ensuring that $\mathrm{PM}(\mathbf{q}_j, \mathrm{ME}(m_i, c)) \mid O_i \neq O_j \sim \mathcal{U}_{\mathcal{C}^n}$, thus providing $\mathrm{FAR} = \sum_{\ell=0}^{t} \binom{n}{\ell} (|\mathcal{C}| - 1)^{\ell} / |\mathcal{C}|^n$. On the other hand, minimizing the FRR for a given Gaussian source $O_i$ is equivalent to minimizing the expected Bit Error Rate (BER) for the genuine case ($O_i = O_j$), defined as:

\[
\mathrm{BER} = \mathbb{E}\left\{ \frac{d\left(C, \mathrm{PM}(Q_i, \mathrm{ME}(M_i, C))\right)}{n} \right\},
\tag{8}
\]

where $\mathbb{E}\{\cdot\}$ stands for expected value. Finally, we must keep in mind that the metadata $P_{M_i,C} = \mathrm{ME}(M_i, C)$ cannot disclose information about the discrete representation $C$, which is equivalent to stating that the probability density function (pdf) of $P_{M_i,C} \mid C$ is equal for all the possible values of $C$.

These three conditions, i.e. (a) the minimization of the FAR, (b) the minimization of the BER for genuine matches, and (c) the absence of mutual information between the discrete representation and the metadata, will drive the design of the proposed model-based uncoupling construction. We will distinguish two different cases. The first is devoted to obtaining as many reliable bits as possible, i.e., with associated $\mathrm{BER} \leq \mathrm{BER}_{\max}$, from a single Gaussian source. The second is devoted to obtaining a single reliable bit from a set of Gaussian sources, which is useful when these features cannot achieve $\mathrm{BER} \leq \mathrm{BER}_{\max}$ individually. These two derivations will allow us to design a general robust model-based uncoupling construction.
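The FAR expression for a uniformly distributed impostor output is easy to evaluate directly. The short Python sketch below (the function name `far_uniform` is chosen here for illustration) counts the strings of $\mathcal{C}^n$ within distance $t$ of the target codeword and divides by $|\mathcal{C}|^n$.

```python
from math import comb

def far_uniform(n, t, alphabet_size=2):
    """FAR when the impostor output is uniform over C^n and up to t errors are tolerated."""
    within_t = sum(comb(n, l) * (alphabet_size - 1) ** l for l in range(t + 1))
    return within_t / alphabet_size ** n

print(far_uniform(7, 1))  # 8 of the 128 binary strings lie within distance 1 -> 0.0625
print(far_uniform(127, 10))
```

For the binary alphabet the expression reduces to the cumulative binomial probability with parameter 1/2, so raising the tolerated number of errors $t$ increases the FAR for a fixed $n$.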

A. Efficient model-based uncoupling construction for a single Gaussian source

Let $m = \{\hat{\mu}, \sigma, E\}$ be the template of a single Gaussian source, defined in terms of a known standard deviation $\sigma$, the Maximum Likelihood estimate of the mean $\hat{\mu}$, and the number of enrolment samples $E$ used to obtain this estimate.

The optimal zero-leakage helper data, defined in [2], is denoted as quantile helper data. For fuzzy extraction constructions, the authors of [2] propose to use uniform quantization, with the value of the conditional cumulative distribution function (ccdf) of the mean estimate as helper data. This can be incorporated into a model-based uncoupling construction where the ME and PM functions are defined as follows:

\[
\mathrm{ME}(m, c) = F_{\hat{\mu} \mid \sigma, E, \hat{\mu} \in \mathrm{bin}(c)}(\hat{\mu}),
\tag{9}
\]

\[
\mathrm{PM}(q, p_{m,c}) = \operatorname*{argmax}_{c} \left\{ f_{q \mid p_{m,c}, \sigma, E, \hat{\mu} \in \mathrm{bin}(c)}(q) \right\},
\tag{10}
\]

where $F_{\hat{\mu} \mid \sigma, E, \hat{\mu} \in \mathrm{bin}(c)}(\hat{\mu})$ is the ccdf of the mean estimate in the uniform quantization case, and $f_{q \mid p_{m,c}, \sigma, E, \hat{\mu} \in \mathrm{bin}(c)}(q)$ is the conditional pdf of the verification sample $q$, given that the metadata is $p_{m,c}$, and in both cases given that the mean estimated from $E$ enrolment samples is $\hat{\mu} \in \mathrm{bin}(c)$. Zero-leakage approaches are defined by using any continuous monotonic function of the cumulative distribution function. We will theoretically compare this approach with the baseline case, where uniform quantization is used, but no helper data is incorporated.

Gain in False Rejection Rate of the zero-leakage approach: In order to evaluate the gain provided by the use of the helper data in the zero-leakage helper data model-based uncoupling construction, we will evaluate the BER for the genuine case in our construction, denoted as $\mathrm{BER}^{\text{0-leakage}}_{n\text{-bit}}(\sigma, E)$, and we will compare it with the average BER in the baseline case, denoted as $\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, E)$.

Let us denote the set of thresholds defining the uniform quantization bins as $\Theta = \{\theta_i, i \in \{0, \ldots, 2^n\}\}$, where:

\[
\theta_i = \sqrt{1 + \frac{\sigma^2}{E}}\; \mathrm{erfinv}\left( \frac{i - 2^{n-1}}{2^{n-1}} \right).
\tag{11}
\]

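Let us denote the cumulative distribution function of the genuine sample $q \sim \mathcal{N}(\mu, \sigma^2)$ as $F_q(q) = \frac{1}{2}\left[ 1 + \mathrm{erf}\left( \frac{q - \mu}{\sigma\sqrt{2}} \right) \right]$. Let us also denote the $n$-bit Gray encoding of the $i$-th bin as $G^n_i$, and $\tau_i(\hat{\mu}) = F_{\hat{\mu} \mid \sigma, E, \hat{\mu} \in \mathrm{bin}(G^n_i)}(\hat{\mu})$, defined as:

\[
\tau_i(\hat{\mu}) =
\frac{
\mathrm{erf}\left( \hat{\mu} / \sqrt{2\left(1 + \frac{\sigma^2}{E}\right)} \right)
- \mathrm{erf}\left( \theta_i / \sqrt{2\left(1 + \frac{\sigma^2}{E}\right)} \right)
}{
\mathrm{erf}\left( \theta_{i+1} / \sqrt{2\left(1 + \frac{\sigma^2}{E}\right)} \right)
- \mathrm{erf}\left( \theta_i / \sqrt{2\left(1 + \frac{\sigma^2}{E}\right)} \right)
}.
\tag{12}
\]

Finally, let us denote the value of the estimated mean in the bin $j$ that would produce $\tau = \tau_j(\hat{\mu}_j) = \tau_i(\hat{\mu})$ as:

\[
\begin{aligned}
\hat{\mu}_j(\tau) &= F^{-1}_{\hat{\mu} \mid \sigma, E, \hat{\mu} \in \mathrm{bin}(G^n_j)}(\tau) \\
&= \sqrt{2\left(1 + \sigma^2/E\right)}\; \mathrm{erfinv}\left\{
\tau \left[
\mathrm{erf}\left( \frac{\theta_{j+1}}{\sqrt{2\left(1 + \sigma^2/E\right)}} \right)
- \mathrm{erf}\left( \frac{\theta_j}{\sqrt{2\left(1 + \sigma^2/E\right)}} \right)
\right]
+ \mathrm{erf}\left( \frac{\theta_j}{\sqrt{2\left(1 + \sigma^2/E\right)}} \right)
\right\}.
\end{aligned}
\tag{13}
\]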
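The quantile helper data of (12) and its inverse (13) can be sketched with the Python standard library. In this example the thresholds are passed in as a plain list with infinite end points and are purely illustrative values, and the decoder picks the bin whose reconstruction point $\hat{\mu}_j(\tau)$ is closest to the sample, which is one reading of the decision intervals used in (15). The function names are chosen here for illustration. Since $\Phi(x) = \frac{1}{2}[1 + \mathrm{erf}(x/\sqrt{2})]$, ratios of erf differences equal ratios of normal cdf differences, so `NormalDist` can be used directly.

```python
from math import inf, sqrt
from statistics import NormalDist

def _prior(sigma, E):
    """Distribution of the mean estimate over the population: N(0, 1 + sigma^2 / E)."""
    return NormalDist(0.0, sqrt(1.0 + sigma ** 2 / E))

def tau(mu_hat, i, thetas, sigma, E):
    """Helper data of (12): relative position of mu_hat inside bin i, in [0, 1]."""
    prior = _prior(sigma, E)
    lo, hi = prior.cdf(thetas[i]), prior.cdf(thetas[i + 1])
    return (prior.cdf(mu_hat) - lo) / (hi - lo)

def mu_hat_in_bin(t, j, thetas, sigma, E):
    """Inverse of (12), as in (13): point of bin j with helper data t (needs 0 < t < 1)."""
    prior = _prior(sigma, E)
    lo, hi = prior.cdf(thetas[j]), prior.cdf(thetas[j + 1])
    return prior.inv_cdf(t * (hi - lo) + lo)

def decode_bin(q, t, thetas, sigma, E):
    """Pick the bin whose reconstruction point is nearest to the verification sample q."""
    points = [mu_hat_in_bin(t, j, thetas, sigma, E) for j in range(len(thetas) - 1)]
    return min(range(len(points)), key=lambda j: abs(q - points[j]))

# Illustrative 2-bit setting: four bins with hypothetical thresholds.
thetas = [-inf, -0.7, 0.0, 0.7, inf]
sigma, E = 0.5, 4
t = tau(0.3, 2, thetas, sigma, E)              # enrolled mean estimate falls in bin 2
print(t, mu_hat_in_bin(t, 2, thetas, sigma, E))
print(decode_bin(0.35, t, thetas, sigma, E))   # sample close to the enrolled mean
```

The decoded bin index would then be mapped to bits through the Gray encoding $G^n_j$, which is omitted here.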

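Then, $\mathrm{BER}^{\text{0-leakage}}_{n\text{-bit}}(\sigma, E)$ is shown in (15). The asymptotic case where $E = \infty$ is shown in (16), where $\mu_j(\tau)$ is defined as:

\[
\begin{aligned}
\mu_j(\tau) &= \lim_{E \to \infty} F^{-1}_{\hat{\mu} \mid \sigma, E, \hat{\mu} \in \mathrm{bin}(G^n_j)}(\tau) \\
&= \sqrt{2}\; \mathrm{erfinv}\left\{
\tau \left[
\mathrm{erf}\left( \frac{\theta_{j+1}}{\sqrt{2}} \right)
- \mathrm{erf}\left( \frac{\theta_j}{\sqrt{2}} \right)
\right]
+ \mathrm{erf}\left( \frac{\theta_j}{\sqrt{2}} \right)
\right\}.
\end{aligned}
\tag{14}
\]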


In the baseline case, where no helper data is used, the expression for the average BER for the genuine case, denoted as $\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}$, is shown in (17). The asymptotic case where $E = \infty$ is shown in (18). Although there is no analytical solution to the integral shown in this equation, it is numerically well behaved and can therefore be approximated numerically. Fig. 2 shows $\mathrm{BER}^{\text{0-leakage}}_{n\text{-bit}}(\sigma, E)$ and $\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, E)$ for the asymptotic case. This clearly illustrates the gain provided by the zero-leakage helper data regarding the BER for the genuine case, which is equivalent to a gain in terms of False Rejection Rate. Fig. 3 focuses on the case $n = 3$, showing $\mathrm{BER}^{\text{0-leakage}}_{n\text{-bit}}(\sigma, E)$ for different numbers of enrolment samples, and $\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, E)$ for the asymptotic case. This figure illustrates the gain provided by using more samples during the enrolment when using a zero-leakage approach, but it also shows that using this helper data provides better error rates than the baseline approach even when the latter has perfect knowledge of the distribution of the genuine samples.

B. Efficient model-based uncoupling construction for multiple unreliable Gaussian sources

In the previous section we have focused on extracting as many bits as possible from a single Gaussian source. However, if our construction must provide $\mathrm{BER} \leq \mathrm{BER}_{\max}$, it may be the case that for some Gaussian features $\mathrm{BER}^{\text{0-leakage}}_{1\text{-bit}}(\sigma, E) > \mathrm{BER}_{\max}$. This can be inferred from Fig. 2, where it is clear that the BER grows with the standard deviation of the Gaussian source.

A feature $f \sim \mathcal{N}(\mu, \sigma^2)$ is considered as $\mathrm{BER}_{\max}$-unreliable for $E$ enrolment samples if the following condition holds:

\[
\sigma > \max \left\{ \sigma' \;\middle|\; \mathrm{BER}^{\text{0-leakage}}_{1\text{-bit}}(\sigma', E) \leq \mathrm{BER}_{\max} \right\}.
\tag{19}
\]

The simplest approach to deal with unreliable Gaussian sources is to discard them. However, this approach does not use all the available fuzzy information, which degrades the overall performance of the construction in terms of key length. In this section, we show how to optimally combine unreliable Gaussian features to obtain reliable bits.

We will focus on designing a general procedure to combine $F$ $\mathrm{BER}_{\max}$-unreliable independent Gaussian sources $\mathcal{F} = \{f_1, \ldots, f_F\}$, with $f_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$, into the maximum possible number $n$ of binary sources $\mathcal{B} = \{b_1, \ldots, b_n\}$ such that $\max_{i \in \{1, \ldots, n\}} \mathrm{BER}_{b_i} \leq \mathrm{BER}_{\max}$. In this case, our Gaussian sources are described by the template:

\[
m = \left\{ (\hat{\mu}_1, \sigma_1), \ldots, (\hat{\mu}_F, \sigma_F), E \right\},
\tag{20}
\]

where $E$ is the number of available enrolment samples, $\hat{\mu}_i$ is the estimate of the mean of the Gaussian source $f_i$, and $\sigma_i$ is the standard deviation (we assume it is known). We first describe the combination procedure for obtaining one single binary source with minimum BER from two Gaussian sources, then from $F$ Gaussian sources, and then we will explain the general procedure to combine $F$ Gaussian sources into the maximum number of binary sources with a BER below a certain $\mathrm{BER}_{\max}$.

1) One-bit efficient model-based uncoupling construction for two unreliable Gaussian sources: We analyze the combination of two Gaussian features $f_1 \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $f_2 \sim \mathcal{N}(\mu_2, \sigma_2^2)$ into a new one $f = f_1 + a f_2 \sim \mathcal{N}(\mu_1 + a\mu_2, \sigma_1^2 + a^2\sigma_2^2)$, when the standard deviations are known, but the means are estimated from $E$ samples $q^i_1, \ldots, q^i_E$, i.e. $\hat{\mu}_i = \frac{1}{E} \sum_{j=1}^{E} q^i_j$. The metadata extraction and parametrized mapping functions would in this case be simply defined as:

\[
\mathrm{ME}(m, c) = a, \qquad \mathrm{PM}(\mathbf{q}, a) = \frac{\mathrm{sign}(q_1 + a q_2) + 1}{2},
\tag{21}
\]

where $m = \{(\hat{\mu}_1, \sigma_1), (\hat{\mu}_2, \sigma_2), E\}$, $a \in \mathbb{R}$, $\mathbf{q} \in \mathbb{R}^2$, $c \in \{0, 1\}$, and $\mathrm{sign}(x) = x/|x|$ for $x \neq 0$, and $\mathrm{sign}(0) = 0$. Our goal is to minimize the BER when $q_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ in (21), i.e., in the genuine case, where:

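\[
\begin{aligned}
\mathrm{BER}(\hat{\mu}_1, \hat{\mu}_2, \sigma_1, \sigma_2, a, E) =
\int_{\mu_1 = -\infty}^{+\infty} \int_{\mu_2 = -\infty}^{+\infty}
&\frac{1}{2} \left\{ 1 - \mathrm{sign}\left[ (\mu_1 + a\mu_2)(\hat{\mu}_1 + a\hat{\mu}_2) \right]
\mathrm{erf}\left( \frac{|\mu_1 + a\mu_2|}{\sqrt{2\left(\sigma_1^2 + a^2\sigma_2^2\right)}} \right) \right\} \\
&\times f_{\mu_1, \mu_2 \mid \hat{\mu}_1, \hat{\mu}_2}(\mu_1, \mu_2) \, d\mu_2 \, d\mu_1,
\end{aligned}
\tag{22}
\]

where:

\[
f_{\mu_1, \mu_2 \mid \hat{\mu}_1, \hat{\mu}_2}(\mu_1, \mu_2) =
\frac{E}{2\pi\sigma_1\sigma_2}\,
e^{-\frac{E(\mu_1 - \hat{\mu}_1)^2}{2\sigma_1^2}}\,
e^{-\frac{E(\mu_2 - \hat{\mu}_2)^2}{2\sigma_2^2}}.
\tag{23}
\]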

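This equation also provides the solution to the known $\mu$ case:

\[
\begin{aligned}
\mathrm{BER}(\mu_1, \mu_2, \sigma_1, \sigma_2, a, \infty)
&= \lim_{E \to \infty} \mathrm{BER}(\hat{\mu}_1, \hat{\mu}_2, \sigma_1, \sigma_2, a, E) \\
&= \frac{1}{2} \left[ 1 - \mathrm{erf}\left( \frac{|\mu_1 + a\mu_2|}{\sqrt{2}\sqrt{\sigma_1^2 + a^2\sigma_2^2}} \right) \right]
= \frac{1}{2} \left[ 1 - \mathrm{erf}\left( \frac{|\rho|}{\sqrt{2}} \right) \right],
\end{aligned}
\tag{24}
\]

where $\rho = (\mu_1 + a\mu_2) / \sqrt{\sigma_1^2 + a^2\sigma_2^2}$. If for any Gaussian variable $? \sim \mathcal{N}(\mu_?, \sigma_?^2)$ we define $\rho_? = \mu_?/\sigma_?$, and its estimate as $\hat{\rho}_? = \hat{\mu}_?/\sigma_?$, and $s = \sigma_1/\sigma_2$, then Equation (22) can be expressed as:

\[
\begin{aligned}
\mathrm{BER}(\hat{\rho}_1, \hat{\rho}_2, s, a, E) =
\int_{\rho_1 = -\infty}^{+\infty} \int_{\rho_2 = -\infty}^{+\infty}
&\frac{1}{2} \left\{ 1 - \mathrm{sign}\left[ (s\rho_1 + a\rho_2)(s\hat{\rho}_1 + a\hat{\rho}_2) \right]
\mathrm{erf}\left( \frac{|s\rho_1 + a\rho_2|}{\sqrt{2\left(s^2 + a^2\right)}} \right) \right\} \\
&\times f_{\rho_1, \rho_2 \mid \hat{\rho}_1, \hat{\rho}_2}(\rho_1, \rho_2) \, d\rho_2 \, d\rho_1,
\end{aligned}
\tag{25}
\]

where:

\[
f_{\rho_1, \rho_2 \mid \hat{\rho}_1, \hat{\rho}_2}(\rho_1, \rho_2) =
\frac{E}{2\pi}\,
e^{-\frac{E(\rho_1 - \hat{\rho}_1)^2}{2}}\,
e^{-\frac{E(\rho_2 - \hat{\rho}_2)^2}{2}}.
\tag{26}
\]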

It must be noted that the optimal value of a is independentof E , and it is equal to:

aopt = argmina

BER (ρ1, ρ2, s, a, E)

= s

ρ2ρ1

. (27)
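As a quick sanity check of Eq. (27), the following minimal sketch evaluates the asymptotic BER of Eq. (24) in its normalized form and compares the closed-form optimum with a coarse grid search. The numeric setup is the one quoted in the caption of Fig. 4; the function name, the grid range and the grid step are illustrative assumptions, not part of the original construction.

```python
import math

def ber_asymptotic(rho1, rho2, s, a):
    """Asymptotic (E -> infinity) BER of f = f1 + a*f2, Eq. (24) in normalized form."""
    rho = abs(s * rho1 + a * rho2) / math.sqrt(s * s + a * a)
    return 0.5 * math.erfc(rho / math.sqrt(2.0))

# Setup quoted in the caption of Fig. 4.
rho1, rho2, s = 2.0, 2.0 / 3.0, 5.0 / 6.0
a_opt = s * rho2 / rho1                      # Eq. (27)

# Coarse grid search over a in [-2, 2] (range and step are illustrative choices).
grid = [i / 1000.0 for i in range(-2000, 2001)]
a_grid = min(grid, key=lambda a: ber_asymptotic(rho1, rho2, s, a))

print(a_opt, a_grid, ber_asymptotic(rho1, rho2, s, a_opt))
```

At the optimum the combined reliability equals $\sqrt{\rho_1^2 + \rho_2^2}$, which is the Cauchy-Schwarz upper bound of $|s\rho_1 + a\rho_2|/\sqrt{s^2 + a^2}$.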

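$$\begin{aligned}
\mathrm{BER}^{0\text{-leakage}}_{n\text{-bit}}(\sigma, E) &= \mathbb{E}\left[\mathrm{BER}^{0\text{-leakage}}_{n\text{-bit}}(\mu, \hat\mu, \sigma, E)\right] \\
&= \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} \frac{d_H(G^n_i, G^n_j)}{n} \Pr\Big(\hat\mu \in [\theta_i, \theta_{i+1}),\ q \in \Big[\tfrac{\mu_{j-1}(\tau_i(\hat\mu)) + \mu_j(\tau_i(\hat\mu))}{2}, \tfrac{\mu_j(\tau_i(\hat\mu)) + \mu_{j+1}(\tau_i(\hat\mu))}{2}\Big) \ \Big|\ q \sim \mathcal{N}(\mu, \sigma^2),\ \mu \sim \mathcal{N}(0, 1),\ \hat\mu \sim \mathcal{N}\big(\mu, \tfrac{\sigma^2}{E}\big)\Big) \\
&= \frac{\sqrt{E}}{2n\pi\sigma} \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} d_H(G^n_i, G^n_j) \int_{\mu=-\infty}^{+\infty} \int_{\hat\mu=\theta_i}^{\theta_{i+1}} \Big[F_q\Big(\tfrac{\mu_j(\tau_i(\hat\mu)) + \mu_{j+1}(\tau_i(\hat\mu))}{2}\Big) - F_q\Big(\tfrac{\mu_{j-1}(\tau_i(\hat\mu)) + \mu_j(\tau_i(\hat\mu))}{2}\Big)\Big] e^{-\frac{E(\hat\mu-\mu)^2}{2\sigma^2}}\, d\hat\mu\ e^{-\frac{\mu^2}{2}}\, d\mu
\end{aligned} \tag{15}$$

$$\begin{aligned}
\mathrm{BER}^{0\text{-leakage}}_{n\text{-bit}}(\sigma, \infty) &= \lim_{E\to\infty} \mathbb{E}\left[\mathrm{BER}^{0\text{-leakage}}_{n\text{-bit}}\right] \\
&= \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} \frac{d_H(G^n_i, G^n_j)}{n} \Pr\Big(\mu \in [\theta_i, \theta_{i+1}),\ q \in \Big[\tfrac{\mu_{j-1}(\tau_i(\mu)) + \mu_j(\tau_i(\mu))}{2}, \tfrac{\mu_j(\tau_i(\mu)) + \mu_{j+1}(\tau_i(\mu))}{2}\Big) \ \Big|\ q \sim \mathcal{N}(\mu, \sigma^2),\ \mu \sim \mathcal{N}(0, 1)\Big) \\
&= \frac{1}{n\sqrt{2\pi}} \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} d_H(G^n_i, G^n_j) \int_{\mu=\theta_i}^{\theta_{i+1}} \Big[F_q\Big(\tfrac{\mu_j(\tau_i(\mu)) + \mu_{j+1}(\tau_i(\mu))}{2}\Big) - F_q\Big(\tfrac{\mu_{j-1}(\tau_i(\mu)) + \mu_j(\tau_i(\mu))}{2}\Big)\Big] e^{-\frac{\mu^2}{2}}\, d\mu
\end{aligned} \tag{16}$$

$$\begin{aligned}
\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, E) &= \mathbb{E}\left[\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}\right] \\
&= \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} \frac{d_H(G^n_i, G^n_j)}{n} \Pr\Big(\hat\mu \in [\theta_i, \theta_{i+1}),\ q \in [\theta_j, \theta_{j+1}) \ \Big|\ q \sim \mathcal{N}(\mu, \sigma^2),\ \mu \sim \mathcal{N}(0, 1),\ \hat\mu \sim \mathcal{N}\big(\mu, \tfrac{\sigma^2}{E}\big)\Big) \\
&= \frac{\sqrt{E}}{2n\pi\sigma} \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} d_H(G^n_i, G^n_j) \int_{\mu=-\infty}^{+\infty} \int_{\hat\mu=\theta_i}^{\theta_{i+1}} \big[F_q(\theta_{j+1}) - F_q(\theta_j)\big]\, e^{-\frac{E(\hat\mu-\mu)^2}{2\sigma^2}}\, d\hat\mu\ e^{-\frac{\mu^2}{2}}\, d\mu
\end{aligned} \tag{17}$$

$$\begin{aligned}
\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, \infty) &= \lim_{E\to\infty} \mathbb{E}\left[\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}\right] \\
&= \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} \frac{d_H(G^n_i, G^n_j)}{n} \Pr\Big(\mu \in [\theta_i, \theta_{i+1}),\ q \in [\theta_j, \theta_{j+1}) \ \Big|\ q \sim \mathcal{N}(\mu, \sigma^2),\ \mu \sim \mathcal{N}(0, 1)\Big) \\
&= \frac{1}{n\sqrt{2\pi}} \sum_{i=0}^{2^n-1} \sum_{j=0}^{2^n-1} d_H(G^n_i, G^n_j) \int_{\mu=\theta_i}^{\theta_{i+1}} \big[F_q(\theta_{j+1}) - F_q(\theta_j)\big]\, e^{-\frac{\mu^2}{2}}\, d\mu
\end{aligned} \tag{18}$$

Fig. 2. $\mathrm{BER}^{\text{baseline}}_{n\text{-bit}}(\sigma, E)$ and $\mathrm{BER}^{0\text{-leakage}}_{n\text{-bit}}(\sigma, E)$ for the asymptotic case $E = \infty$ as a function of the standard deviation of the genuine distribution $\sigma$. (Curves shown: baseline and 0-leakage BER for 1 bit and 2 bits; horizontal axis $\sigma$, vertical axis BER.)

However, the BER is bigger when the uncertainty on the value of $\mu_1$ and $\mu_2$ is high (small values of $E$), as it can be observed in Fig. 4. Also, when $a = a_{\mathrm{opt}}$ we obtain the following rotation-invariant expression, independent of $s$: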

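$$\mathrm{BER}_{\mathrm{opt}}(\hat\rho_1, \hat\rho_2, E) = \int_{\rho_1=-\infty}^{+\infty} \int_{\rho_2=-\infty}^{+\infty} \frac{1}{2}\left\{1 - \operatorname{erf}\left(\frac{\rho_1\hat\rho_1 + \rho_2\hat\rho_2}{\sqrt{2(\hat\rho_1^2 + \hat\rho_2^2)}}\right)\right\} \frac{E}{2\pi}\, e^{-\frac{E(\rho_1-\hat\rho_1)^2}{2}}\, e^{-\frac{E(\rho_2-\hat\rho_2)^2}{2}}\, d\rho_2\, d\rho_1. \tag{28}$$

This rotation invariance allows us to simplify this expression in terms of a simpler integral. Let $\hat\rho = \sqrt{\hat\rho_1^2 + \hat\rho_2^2}$. Then $\mathrm{BER}_{\mathrm{opt}}$ can be calculated as:

$$\mathrm{BER}_{\mathrm{opt}}(\hat\rho, E) = \int_{\rho=-\infty}^{+\infty} \frac{1}{2}\left[1 - \operatorname{erf}\left(\frac{\rho}{\sqrt{2}}\right)\right] \frac{\sqrt{E}}{\sqrt{2\pi}}\, e^{-\frac{E(\rho-\hat\rho)^2}{2}}\, d\rho. \tag{29}$$

Fig. 3. $\mathrm{BER}^{\text{baseline}}_{3\text{-bits}}(\sigma, E = \infty)$ and $\mathrm{BER}^{0\text{-leakage}}_{3\text{-bits}}(\sigma, E)$ for $E \in \{1, 2, 4, 8, 16, \infty\}$ as a function of the standard deviation of the genuine distribution $\sigma$.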


Fig. 4. BER for different values of $E$ and a given setup ($\hat\rho_1 = 2$, $\hat\rho_2 = \frac{2}{3}$, $s = \frac{5}{6}$, $a_{\mathrm{opt}} = \frac{5}{18}$).

Fig. 5. $\mathrm{BER}_{\mathrm{opt}}(\hat\rho, E)$ for different values of $E$ versus the estimated reliability of the combined Gaussian feature $\hat\rho$.

In general, for a combination of $F$ features with estimated reliabilities $\hat\rho_1 = \hat\mu_1/\sigma_1, \dots, \hat\rho_F = \hat\mu_F/\sigma_F$, if we define the squared global estimated reliability as $\hat\rho^2 = \sum_{i=1}^{F} \hat\rho_i^2$, then $\mathrm{BER}_{\mathrm{opt}}$ can be calculated using (29). Fig. 5 shows $\mathrm{BER}_{\mathrm{opt}}$ versus the estimated reliability $\hat\rho$ for different values of $E$, including the asymptotic case $E = \infty$.
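Eq. (29) is the expectation of a Gaussian tail probability under $\rho \sim \mathcal{N}(\hat\rho, 1/E)$, so it can be evaluated numerically with very little code. The sketch below integrates (29) with a trapezoidal rule and compares the result with the expression $\Phi\big(-\hat\rho\sqrt{E/(E+1)}\big)$, where $\Phi$ is the standard Gaussian CDF. This closed form is not stated in the text; it is an assumption of this sketch that follows from the standard identity $\mathbb{E}[\Phi(a + bZ)] = \Phi(a/\sqrt{1 + b^2})$ for $Z \sim \mathcal{N}(0, 1)$. Function names and integration settings are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, used for its CDF

def ber_opt_numeric(rho_hat, E, half_width=12.0, steps=20001):
    """Trapezoidal evaluation of Eq. (29) for a finite number of samples E."""
    sd = 1.0 / math.sqrt(E)                      # rho ~ N(rho_hat, 1/E)
    lo, hi = rho_hat - half_width * sd, rho_hat + half_width * sd
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for k in range(steps):
        rho = lo + k * h
        bit_error = 0.5 * math.erfc(rho / math.sqrt(2.0))   # (1/2)[1 - erf(rho/sqrt(2))]
        density = math.sqrt(E / (2.0 * math.pi)) * math.exp(-E * (rho - rho_hat) ** 2 / 2.0)
        weight = 0.5 if k in (0, steps - 1) else 1.0         # trapezoid end points
        total += weight * bit_error * density * h
    return total

def ber_opt_closed_form(rho_hat, E):
    """Phi(-rho_hat * sqrt(E / (E + 1))); E = math.inf gives the asymptotic case."""
    scale = 1.0 if math.isinf(E) else math.sqrt(E / (E + 1.0))
    return STD_NORMAL.cdf(-rho_hat * scale)

for E in (1, 4, 16):
    print(E, ber_opt_numeric(2.0, E), ber_opt_closed_form(2.0, E))
```

Under this reading, a finite $E$ simply shrinks the effective reliability by a factor $\sqrt{E/(E+1)}$, which is consistent with the BER growing for small $E$.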

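2) One-bit efficient model-based uncoupling construction for multiple unreliable Gaussian sources: We are interested in obtaining the model-based uncoupling construction consisting of the following metadata extraction and parametrized mapping functions:

$$\mathrm{ME}(m_f, c) = w, \tag{30}$$

$$\mathrm{PM}(q, w) = \frac{\operatorname{sign}(w^t q) + 1}{2}, \tag{31}$$

where $q = [q_1, \dots, q_F]^t \in \mathbb{R}^F$, with $m_f$ defined as in (20), and such that the BER is minimum when $q_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$.

In order to achieve this goal, we sort the Gaussian sources according to $|\hat\rho_i| = |\hat\mu_i|/\sigma_i$, obtaining $\mathcal{F}_{\mathrm{sorted}} = \{f_{i_1}, \dots, f_{i_F}\}$, such that $|\hat\rho_{i_1}| \ge |\hat\rho_{i_2}| \ge \cdots \ge |\hat\rho_{i_F}|$. The combined feature is defined as $g = \sum_{j=1}^{F} w_{i_j} f_{i_j}$, where the components of the weights vector $w = [w_1, \dots, w_F]^t$ are defined as:

$$w_{i_k} = \begin{cases} \operatorname{sign}(2c - 1)\, \operatorname{sign}(\hat\mu_{i_1}), & \text{for } k = 1, \\[4pt] \dfrac{\hat\rho_{i_k}\, \operatorname{sign}(w_{i_1}\hat\rho_{i_1})\, \sqrt{\sum_{j=1}^{k-1} w_{i_j}^2 \sigma_{i_j}^2}}{\sigma_{i_k} \sqrt{\sum_{j=1}^{k-1} \hat\rho_{i_j}^2}}, & \text{for } k > 1. \end{cases} \tag{32}$$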
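The recursion in Eq. (32) can be sketched in a few lines. The code below is a minimal illustration, assuming the most reliable source has a non-zero estimated mean; the function names `combination_weights` and `combined_reliability` are hypothetical helpers introduced here, and the sample values are arbitrary. It also computes the signed estimated reliability of the combined feature $g$, whose square should equal $\sum_i \hat\rho_i^2$.

```python
import math

def combination_weights(mu_hat, sigma, c):
    """Weights of Eq. (32); mu_hat and sigma are per-source lists, c is the bit in {0, 1}."""
    rho_hat = [m / s for m, s in zip(mu_hat, sigma)]
    # Source indices sorted by decreasing |rho_hat| (most reliable first).
    order = sorted(range(len(mu_hat)), key=lambda i: -abs(rho_hat[i]))
    w = [0.0] * len(mu_hat)
    first = order[0]
    w[first] = math.copysign(1.0, 2 * c - 1) * math.copysign(1.0, mu_hat[first])
    sign_first = math.copysign(1.0, w[first] * rho_hat[first])
    var_sum = (w[first] * sigma[first]) ** 2      # running sum of w^2 * sigma^2
    rho_sq_sum = rho_hat[first] ** 2              # running sum of rho_hat^2
    for i in order[1:]:
        w[i] = rho_hat[i] * sign_first * math.sqrt(var_sum) / (sigma[i] * math.sqrt(rho_sq_sum))
        var_sum += (w[i] * sigma[i]) ** 2
        rho_sq_sum += rho_hat[i] ** 2
    return w

def combined_reliability(w, mu_hat, sigma):
    """Signed estimated reliability of g = sum_i w_i * f_i."""
    mean = sum(wi * m for wi, m in zip(w, mu_hat))
    std = math.sqrt(sum((wi * s) ** 2 for wi, s in zip(w, sigma)))
    return mean / std

mu_hat, sigma = [0.3, -1.2, 0.5], [1.0, 2.0, 0.5]   # arbitrary illustrative values
for bit in (0, 1):
    w = combination_weights(mu_hat, sigma, bit)
    print(bit, w, combined_reliability(w, mu_hat, sigma))
```

In this sketch the sign of the combined reliability encodes the bit $c$, so that $\mathrm{PM}$ in (31) recovers $c$ when the verification sample is close to the estimated means.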

With the weight vectors chosen as above, the reliability of the combined feature is the optimal one, i.e. $\hat\rho_g^2 = \sum_{i=1}^{F} \hat\rho_i^2$, and the BER can be computed using (29).

3) General case for unreliable Gaussian features: Multiple-bit efficient model-based uncoupling construction for multiple unreliable Gaussian sources: Here we will focus on the case where a set of $\mathrm{BER}_{\max}$-unreliable features must provide the maximum possible number of bits $b_i$ such that their associated average BER is $\mathrm{BER}(b_i) \le \mathrm{BER}_{\max}$ in the genuine case. The corresponding metadata extraction and parametrized mapping functions can be defined as:

$$\mathrm{ME}(m_f, c) = (A_1, \dots, A_n, w), \tag{33}$$

$$\mathrm{PM}(q, w) = \begin{bmatrix} \dfrac{\operatorname{sign}\left(\sum_{i \in A_1} w_i q_i\right) + 1}{2} \\ \vdots \\ \dfrac{\operatorname{sign}\left(\sum_{i \in A_n} w_i q_i\right) + 1}{2} \end{bmatrix}. \tag{34}$$

We need each of the combined features $g_{A_i}$ to be reliable enough, i.e. $\hat\rho_{A_i}^2 \ge \left[\mathrm{BER}_{\mathrm{opt}}^{-1}(\mathrm{BER}_{\max}, E)\right]^2$. Therefore, the number of bits that can be allocated from a set of features $\mathcal{F}$ with $\hat\rho_{\mathcal{F}}^2 = \sum_{i=1}^{F} \hat\rho_i^2$ is:

$$n_{\max}(\hat\rho_{\mathcal{F}}) \le \frac{\hat\rho_{\mathcal{F}}^2}{\left[\mathrm{BER}_{\mathrm{opt}}^{-1}(\mathrm{BER}_{\max}, E)\right]^2}. \tag{35}$$

Finding the optimal sets $A_i$ is an instance of the Multi-Way Number Partitioning problem [26]. Although this is an NP-complete problem, there are multiple approaches to tackle it, including optimal ones such as those described in [27]. However, the following greedy approach can perform reasonably well:

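1) Initialize:
   a) Compute $\hat\rho_{\mathcal{F}}^2 = \sum_{f_i \in \mathcal{F}} \hat\rho_i^2$.
   b) Set $n = n_{\max}(\hat\rho_{\mathcal{F}})$.
   c) Define $\mathcal{D} = \mathcal{F} = \{f_1, \dots, f_F\}$ as the set of features that need to be allocated.
2) Define $A_1 = \emptyset, \dots, A_n = \emptyset$ as the sets of features allocated to each binary source $b_1, \dots, b_n$ respectively.
3) Input $c \in \{0, 1\}^n$.
4) Set iteration counter $it = 1$.
5) While $it \le F$ do:
   a) Compute $\hat\rho_{A_1}, \dots, \hat\rho_{A_n}$:
      • If $A_i = \emptyset$ then $\hat\rho_{A_i} = 0$.
      • Otherwise, $\hat\rho_{A_i} = \sqrt{\sum_{f_j \in A_i} \hat\rho_j^2}$.
   b) Let $i_{it} = \operatorname{argmax}_{f_i \in \mathcal{D}} |\hat\rho_i|$ be the index of the most reliable Gaussian source not allocated yet.
   c) Let $j_{it} = \operatorname{argmin}_j \hat\rho_{A_j}$ be the index of the least reliable bit.
   d) If $A_{j_{it}} \neq \emptyset$ then define $w_{i_{it}} = \dfrac{\hat\rho_{i_{it}}}{\hat\rho_{A_{j_{it}}}} \dfrac{\sigma_{A_{j_{it}}}}{\sigma_{i_{it}}}$. Else, define $w_{i_{it}} = \operatorname{sign}(\hat\mu_{i_{it}})\, \operatorname{sign}(2c_{j_{it}} - 1)$.
   e) Set $A_{j_{it}} = A_{j_{it}} \cup \{f_{i_{it}}\}$.
   f) Set $\mathcal{D} = \mathcal{D} - \{f_{i_{it}}\}$.
   g) Set $it = it + 1$.
6) If $\min_j \hat\rho_{A_j} \ge \mathrm{BER}_{\mathrm{opt}}^{-1}(\mathrm{BER}_{\max}, E)$, then return $(A_1, \dots, A_n, w)$; else set $n = n - 1$, $\mathcal{D} = \mathcal{F}$, and go to step 2.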
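The allocation part of the greedy procedure above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it only builds the sets $A_1, \dots, A_n$ (the weights can then be obtained per set as in Eq. (32)), and it inverts $\mathrm{BER}_{\mathrm{opt}}$ through the expression $\mathrm{BER}_{\mathrm{opt}}(\hat\rho, E) = \Phi\big(-\hat\rho\sqrt{E/(E+1)}\big)$, which is an assumption of this sketch obtained from Eq. (29) via a standard Gaussian identity. All function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist

def min_reliability(ber_max, E):
    """Smallest rho_hat with BER_opt(rho_hat, E) <= ber_max (requires ber_max < 0.5).

    Assumes BER_opt(rho_hat, E) = Phi(-rho_hat * sqrt(E / (E + 1)))."""
    return -NormalDist().inv_cdf(ber_max) * math.sqrt((E + 1.0) / E)

def greedy_allocate(rho_hat, ber_max, E):
    """Return index sets A_1..A_n, or [] if not even one bit is reliable enough."""
    rho_min_sq = min_reliability(ber_max, E) ** 2
    rho_sq = [r * r for r in rho_hat]
    n = int(sum(rho_sq) // rho_min_sq)            # upper bound of Eq. (35)
    order = sorted(range(len(rho_hat)), key=lambda i: -rho_sq[i])
    while n > 0:
        groups = [[] for _ in range(n)]
        totals = [0.0] * n                        # squared reliability of each bit
        for i in order:                           # most reliable source first
            j = min(range(n), key=totals.__getitem__)   # currently weakest bit
            groups[j].append(i)
            totals[j] += rho_sq[i]
        if min(totals) >= rho_min_sq:
            return groups
        n -= 1                                    # step 6: retry with one bit fewer
    return []

print(greedy_allocate([2.0, 1.5, 1.5, 1.0, 1.0, 0.5], ber_max=0.05, E=8))
```

Assigning the strongest remaining source to the weakest bit keeps the per-bit reliabilities balanced, which is the property the threshold test in step 6 depends on.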

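C. General efficient model-based uncoupling construction for Gaussian sources

The constructions presented in Sect. IV-A and IV-B can be combined into a general procedure for building a robust model-based uncoupling construction for Gaussian sources. The starting point is a set of Gaussian sources, and the requirement is a given $\mathrm{BER}_{\max}$. The model-based uncoupling construction for Gaussian sources must encode the maximum possible number of reliable bits $n_{\max}$ with a BER below $\mathrm{BER}_{\max}$ using the robust model-based uncoupling constructions from the previous sections. First, we need to calculate the maximum possible number of reliable bits that we can extract from each single Gaussian source $Q_i \sim \mathcal{N}(\mu_i, \sigma_i^2)$ with model $m_i = (\hat\mu_i, \sigma_i, E)$, with $i \in \{1, \dots, F\}$:

$$n_i(m_i, \mathrm{BER}_{\max}) = \max\left\{ j \mid \mathrm{BER}^{0\text{-leakage}}_{j\text{-bit}}(\sigma_i, E) \le \mathrm{BER}_{\max} \right\}, \tag{36}$$

where for convenience we set $\mathrm{BER}^{0\text{-leakage}}_{0\text{-bit}}(\sigma_i, E) = 0$. Let us define the set of the indexes of the $\mathrm{BER}_{\max}$-reliable Gaussian sources as $\mathcal{R} = \{i \mid n_i > 0\}$, the set of the indexes of the $\mathrm{BER}_{\max}$-unreliable Gaussian sources as $\mathcal{U} = \{i \mid n_i = 0\}$, and their corresponding models as $m_{\mathcal{R}} = \{m_i \mid i \in \mathcal{R}\} = \{m_{\mathcal{R}}^1, \dots, m_{\mathcal{R}}^{|\mathcal{R}|}\}$ and $m_{\mathcal{U}} = \{m_i \mid i \in \mathcal{U}\} = \{m_{\mathcal{U}}^1, \dots, m_{\mathcal{U}}^{|\mathcal{U}|}\}$ respectively, where $m$ is defined as in (20), and $|\mathcal{U}| + |\mathcal{R}| = F$. Then, we can perform the procedure in Sect. IV-B3 in order to find the maximum number of reliable bits that we can encode with the Gaussian sources in $\mathcal{U}$, which we denote as $n_{\mathcal{U}}(m_{\mathcal{U}}, \mathrm{BER}_{\max})$. The number of bits encoded by the Gaussian sources in $\mathcal{R}$ can be computed using (36) as $n_{\mathcal{R}}(m_{\mathcal{R}}, \mathrm{BER}_{\max}) = \sum_{i \in \mathcal{R}} n_i(m_i, \mathrm{BER}_{\max})$. Therefore, we can compute the number of bits $n$ as:

$$n_{\max}(m, \mathrm{BER}_{\max}) = n_{\mathcal{R}}(m_{\mathcal{R}}, \mathrm{BER}_{\max}) + n_{\mathcal{U}}(m_{\mathcal{U}}, \mathrm{BER}_{\max}). \tag{37}$$

If we define the input $c$ as the concatenation of the bits associated to the $\mathrm{BER}_{\max}$-reliable Gaussian sources $c_{\mathcal{R}}$ and the bits associated to the $\mathrm{BER}_{\max}$-unreliable Gaussian sources $c_{\mathcal{U}}$, i.e. $c = c_{\mathcal{R}} \| c_{\mathcal{U}}$, and we denote by $\mathrm{ME}_{\mathcal{R}}, \mathrm{PM}_{\mathcal{R}}$ and $\mathrm{ME}_{\mathcal{U}}, \mathrm{PM}_{\mathcal{U}}$ the metadata extractor and parametrized mapping functions of the model-based uncoupling constructions described in Sect. IV-A and IV-B3 respectively, then we can define the corresponding metadata extraction and parametrized mapping functions as follows:

$$\begin{aligned} \mathrm{ME}(m, c) = p_{m,c} &= p_{m_{\mathcal{R}}^1, c_{\mathcal{R}}^1} \,\|\, \cdots \,\|\, p_{m_{\mathcal{R}}^{|\mathcal{R}|}, c_{\mathcal{R}}^{|\mathcal{R}|}} \,\|\, p_{m_{\mathcal{U}}, c_{\mathcal{U}}} \\ &= \mathrm{ME}_{\mathcal{R}}(m_{\mathcal{R}}^1, c_{\mathcal{R}}^1) \,\|\, \cdots \,\|\, \mathrm{ME}_{\mathcal{R}}(m_{\mathcal{R}}^{|\mathcal{R}|}, c_{\mathcal{R}}^{|\mathcal{R}|}) \,\|\, \mathrm{ME}_{\mathcal{U}}(m_{\mathcal{U}}, c_{\mathcal{U}}), \end{aligned} \tag{38}$$

$$\mathrm{PM}(q, p_{m,c}) = \mathrm{PM}_{\mathcal{R}}(q_{\mathcal{R}}^1, c_{\mathcal{R}}^1) \,\|\, \cdots \,\|\, \mathrm{PM}_{\mathcal{R}}(q_{\mathcal{R}}^{|\mathcal{R}|}, c_{\mathcal{R}}^{|\mathcal{R}|}) \,\|\, \mathrm{PM}_{\mathcal{U}}(q_{\mathcal{U}}, c_{\mathcal{U}}), \tag{39}$$

where $c = [c_{\mathcal{R}}^1, \dots, c_{\mathcal{R}}^{|\mathcal{R}|}, c_{\mathcal{U}}]$, $q = [q_{\mathcal{R}}^1, \dots, q_{\mathcal{R}}^{|\mathcal{R}|}, q_{\mathcal{U}}^t]^t$, and $q_{\mathcal{R}}^i$ and $m_{\mathcal{R}}^i$ are the verification sample and template corresponding to the $i$th index in $\mathcal{R}$ respectively.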

V. GENERAL PROCEDURE TO OBTAIN OPTIMAL FUZZY EXTRACTORS FROM GAUSSIAN SOURCES

In this section we explain how to design fuzzy extractors from Gaussian sources using operational requirements based on security and usability.

We assume that we can use the procedure from Sect. IV-C to get $n = n_{\max}(m, \mathrm{BER}_{\max})$ $\mathrm{BER}_{\max}$-reliable binary features. The operational requirements of a FE are as follows:

• The maximum allowed False Rejection Rate ($\mathrm{FRR}_{\max}$). Taking into account the definition of the $(\mathcal{C}, n, k, t)$-error correcting code, the False Rejection Rate, the $\mathrm{BER}_{\max}$ and the code parameters are related as follows:

$$\mathrm{FRR}(n, t) \le \Pr\{X \sim \mathrm{Bi}(\mathrm{BER}_{\max}, n) > t\} = \sum_{i=t+1}^{n} \binom{n}{i} \mathrm{BER}_{\max}^{\,i} (1 - \mathrm{BER}_{\max})^{n-i}. \tag{40}$$

• The maximum allowed False Acceptance Rate ($\mathrm{FAR}_{\max}$). Under the statistical assumptions we have considered for the Gaussian sources, this parameter is related to the error correcting code parameters. Using the definition of the code $(\mathcal{C}, n, k, t)$, the False Acceptance Rate (FAR) for a malicious adversary attacking the hash function in the Fuzzy Commitment scheme can be computed as follows:

$$\mathrm{FAR}_{\mathrm{hash}}(k) = \Pr\{X \sim \mathrm{Bi}(0.5, k) = 0\} = 2^{-k}, \tag{41}$$

whereas the FAR for an honest adversary, who just samples a different Gaussian source and tries to impersonate the original one, is computed as follows:

$$\mathrm{FAR}_{\mathrm{honest}}(n, t) = \Pr\{X \sim \mathrm{Bi}(0.5, n) \le t\} = \frac{\sum_{i=0}^{t} \binom{n}{i}}{2^n} = \mathrm{FAR}_{\mathrm{hash}}(k)\, \frac{\sum_{i=0}^{t} \binom{n}{i}}{2^{n-k}}. \tag{42}$$

For any binary error correcting code, the number of error patterns that can be corrected cannot exceed the number of syndromes, and therefore we have:

$$\sum_{i=0}^{t} \binom{n}{i} \le 2^{n-k} \implies \mathrm{FAR}_{\mathrm{hash}}(k) \ge \mathrm{FAR}_{\mathrm{honest}}(n, t). \tag{43}$$

Attacks on the hash are therefore at least as likely to succeed as the honest attack on the fuzzy commitment:

$$\mathrm{FAR}(n, k, t) = \max\{\mathrm{FAR}_{\mathrm{hash}}(k), \mathrm{FAR}_{\mathrm{honest}}(n, t)\} = \mathrm{FAR}_{\mathrm{hash}}(k). \tag{44}$$

It is important to note that this definition of the FAR covers both the fuzzy-factor security (represented by the honest adversary) and the cryptographic security (represented by the malicious adversary).
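The relations (40) to (44) are straightforward to evaluate for a concrete code. The sketch below does so for one of the length-127 BCH parameter triples that appear in Table I; the value of `ber_max` is an illustrative assumption and is not taken from the experiments, and the function names are hypothetical.

```python
from math import comb

def frr_bound(n, t, ber_max):
    """Right-hand side of Eq. (40): Pr{Bi(ber_max, n) > t}."""
    return sum(comb(n, i) * ber_max**i * (1.0 - ber_max)**(n - i) for i in range(t + 1, n + 1))

def far_hash(k):
    """Eq. (41): success probability of a blind guess against the k-bit key."""
    return 2.0 ** (-k)

def far_honest(n, t):
    """Eq. (42): Pr{Bi(0.5, n) <= t}."""
    return sum(comb(n, i) for i in range(t + 1)) / 2.0 ** n

n, k, t = 127, 64, 10          # BCH parameters listed in Table I
ber_max = 0.02                 # illustrative value, not taken from the paper
print(frr_bound(n, t, ber_max), far_hash(k), far_honest(n, t))
```

For this triple the sphere-packing inequality in (43) holds with a wide margin, so the overall FAR in (44) is governed by $2^{-k}$.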

We propose to adopt one of the following approaches, depending on the specific use case:

• A security-driven approach, i.e. setting the parameter $\mathrm{FAR}_{\max}$ as the security requirement, and then minimizing the FRR while ensuring $\mathrm{FAR} \le \mathrm{FAR}_{\max}$.

• A usability-driven approach, i.e. setting the parameter $\mathrm{FRR}_{\max}$ as the usability requirement, and then minimizing the FAR while ensuring $\mathrm{FRR} \le \mathrm{FRR}_{\max}$.

A. Security-driven approach

This approach starts from setting the parameter $\mathrm{FAR}_{\max}$ and then minimizes the FRR while ensuring $\mathrm{FAR} \le \mathrm{FAR}_{\max}$.

For a given family of error correcting codes $\mathcal{E}$, the set of valid code parameters of $\mathcal{E}$ can be defined as:

$$\mathcal{V}_{\mathcal{E}} = \{(n, k, t) \mid \exists\, (\mathcal{C}, n, k, t) \in \mathcal{E}\}. \tag{45}$$

Let us define the subset of $\mathrm{FAR}_{\max}$-secure code parameters:

$$\mathcal{S}_{\mathcal{E}, \mathrm{FAR}_{\max}} = \{(n, k, t) \in \mathcal{V}_{\mathcal{E}} \mid \mathrm{FAR}(n, k, t) \le \mathrm{FAR}_{\max}\}. \tag{46}$$

We denote by $n_{\max}(m, \mathrm{BER}_{\max})$ the maximum number of $\mathrm{BER}_{\max}$-reliable bits that we can extract from Gaussian sources with template $m$, as defined in (37). The procedure to compute this was shown in Sect. IV-C. Let us define the minimum possible BER for a given $n$ as:

$$\mathrm{BER}_{\min}(m, n) = \min\{\mathrm{BER}_{\max} \mid n_{\max}(m, \mathrm{BER}_{\max}) \ge n\}. \tag{47}$$

Denote the minimum achievable FRR for a given number $n$ of extracted bits and an error correction capability $t$ as:

$$\mathrm{FRR}_{\min}(m, n, t) = \sum_{i=t+1}^{n} \binom{n}{i} \mathrm{BER}_{\min}(m, n)^i \left(1 - \mathrm{BER}_{\min}(m, n)\right)^{n-i}. \tag{48}$$

The optimal parameters of the code using the security-driven approach for a given security parameter $\mathrm{FAR}_{\max}$ can be defined as those providing the best possible usability (i.e. the minimum possible FRR) while abiding by the security requirement (i.e., $\mathrm{FAR}(n, k, t) \le \mathrm{FAR}_{\max}$):

$$(n, k, t)_{\mathrm{FAR}_{\max}\text{-secure}} = \operatorname*{argmin}_{(n,k,t) \in \mathcal{S}_{\mathcal{E}, \mathrm{FAR}_{\max}}} \mathrm{FRR}_{\min}(m, n, t). \tag{49}$$

B. Usability-driven approach

This approach starts from setting the parameter $\mathrm{FRR}_{\max}$ and then minimizes the FAR while ensuring $\mathrm{FRR} \le \mathrm{FRR}_{\max}$. Therefore, we have to explore the solutions providing $\mathrm{FRR} \le \mathrm{FRR}_{\max}$ and find the one providing the minimum FAR.

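Using the definitions in (45), (47), and (48), we can define the set of $\mathrm{FRR}_{\max}$-usable code parameters for a family of Error Correcting Codes $\mathcal{E}$ and Gaussian sources with template $m$:

$$\mathcal{U}_{\mathcal{F}, \mathcal{E}, \mathrm{FRR}_{\max}} = \{(n, k, t) \in \mathcal{V}_{\mathcal{E}} \mid \mathrm{FRR}_{\min}(m, n, t) \le \mathrm{FRR}_{\max}\}. \tag{50}$$

The optimal parameters of the code using the usability-driven approach for a given usability requirement $\mathrm{FRR}_{\max}$ are those providing the best possible security (i.e. the lowest possible FAR) while abiding by the usability requirement (i.e. $\mathrm{FRR} \le \mathrm{FRR}_{\max}$), which can be defined as follows:

$$(n, k, t)_{\mathrm{FRR}_{\max}\text{-usable}} = \operatorname*{argmin}_{(n,k,t) \in \mathcal{U}_{\mathcal{F}, \mathcal{E}, \mathrm{FRR}_{\max}}} \mathrm{FAR}(n, k, t). \tag{51}$$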
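Both selection rules, (49) and (51), reduce to filtering a finite list of code parameters and taking a minimum. The following sketch illustrates this with the length-127 BCH triples that appear in Table I. The callable `ber_min_of_n`, which stands in for $\mathrm{BER}_{\min}(m, n)$ of Eq. (47), and the constant value 0.02 used in the usage lines are placeholders assumed for illustration only; in practice this value would come from the procedure of Sect. IV-C.

```python
from math import comb

# (n, k, t) triples of length-127 BCH codes that appear in Table I.
BCH_127 = [(127, 120, 1), (127, 113, 2), (127, 106, 3), (127, 92, 5), (127, 78, 7),
           (127, 64, 10), (127, 50, 13), (127, 43, 14), (127, 36, 15), (127, 29, 21),
           (127, 22, 23), (127, 15, 27), (127, 8, 31)]

def frr_min(n, t, ber_min):
    """Eq. (48) for a given value of BER_min(m, n)."""
    return sum(comb(n, i) * ber_min**i * (1.0 - ber_min)**(n - i) for i in range(t + 1, n + 1))

def far(k):
    """Eq. (44): FAR(n, k, t) = FAR_hash(k) = 2^-k."""
    return 2.0 ** (-k)

def usability_driven(codes, ber_min_of_n, frr_max):
    """Eq. (51): lowest FAR among the FRR_max-usable codes of Eq. (50); None if none qualifies."""
    usable = [(n, k, t) for (n, k, t) in codes if frr_min(n, t, ber_min_of_n(n)) <= frr_max]
    return min(usable, key=lambda c: far(c[1]), default=None)

def security_driven(codes, ber_min_of_n, far_max):
    """Eq. (49): lowest FRR_min among the FAR_max-secure codes of Eq. (46); None if none qualifies."""
    secure = [(n, k, t) for (n, k, t) in codes if far(k) <= far_max]
    return min(secure, key=lambda c: frr_min(c[0], c[2], ber_min_of_n(c[0])), default=None)

placeholder_ber = lambda n: 0.02   # hypothetical BER_min(m, n), constant for illustration
print(usability_driven(BCH_127, placeholder_ber, frr_max=0.01))
print(security_driven(BCH_127, placeholder_ber, far_max=2.0 ** -64))
```

Because the code family is small, exhaustive search is sufficient here; a larger family would only change the size of the list being filtered.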

Fig. 6. Fuzzy extraction from RF channels using (1) the strengthened uncoupled fuzzy extractor, and (2) the model-based uncoupling construction.

VI. APPLICATION EXAMPLE: FUZZY EXTRACTION FROM RF CHANNELS

In this section, we demonstrate the usefulness of the model-based uncoupling construction and the strengthened uncoupled fuzzy extractor in illustrative use cases. Let us consider two parties communicating with each other through an RF channel. These parties observe the same channel in both directions, and this can be used to establish a key shared by both parties. The involved parties sample the RF frequency response during the first phase of their communication, obtaining an estimate of the RF channel frequency response at multiple frequencies. Since these values are highly correlated for frequencies close to each other, we first decorrelate them by using Whitened Principal Component Analysis (WPCA).

We illustrate two different approaches (equivalent regarding FAR and FRR) to perform this key establishment:

a) Using the strengthened uncoupled fuzzy extractor: In this case, both parties get a common secret key in a setup phase, which can be used as a fixed authentication factor. The key establishment using a strengthened uncoupled fuzzy extractor is illustrated in Fig. 6(1). Both parties obtain their RF channel frequency estimates, and their corresponding channel representations. Then, one party runs the Gen procedure of the strengthened uncoupled fuzzy extractor shown in Fig. 1 using the secret key and its channel representation, stores the generated private string, and transmits the generated public string to the other party. The other party runs the Rep procedure using as inputs the received public string and its own channel representation, recovering the same private string. This private string can be used as a session key for communications between the parties. Even if the common secret key and the transmitted metadata are disclosed, the decrypted metadata contain no information about the agreed session key, thus providing forward secrecy.

b) Using the model-based uncoupling construction: Linkability among different services is impossible in this use case, since the noisy factor will be different each time the system is used. Thus, it makes sense to avoid using the second authentication factor by relying on the model-based uncoupling construction. In this case, illustrated in Fig. 6(2), there is no setup phase. During the key agreement, both parties obtain their RF channel frequency estimates, and their corresponding WPCA representations. Then, one party runs the ME procedure of the model-based uncoupling construction using its channel representation and a randomly chosen codeword from an error correcting code as inputs. The output metadata is transmitted to the other party. The other party uses the metadata and its own channel representation as inputs to the PM procedure of the model-based uncoupling construction, obtaining an estimate of the original codeword as output. The errors in this estimate are then corrected, and both parties derive a session key from it. Since the transmitted metadata is unrelated to the chosen codeword, the derived key can only be known by both parties.

A detailed security analysis of the proposed practical schemes is out of the scope of this paper; their main purpose is to illustrate the advantages of using the proposed model-based uncoupling construction in terms of key length and FRR. However, it is worth mentioning that the first approach provides the additional advantage of unlinkability, which is important when the continuous noisy source is persistent, e.g., a biometric or a PUF. The second approach is more practical for time-varying continuous noisy sources, e.g., those derived from sensory data.

A. Experimental Setup

The radio channel measurements have been conducted using a 4-port vector network analyzer (VNA) to include multiple antenna orientations and polarizations. The measurement was conducted in an indoor environment in which the channel was characterized for the 2.4 GHz ISM band, as this band is widely used for indoor applications. The measurement was carried out inside a meeting room of 7 by 12 meters, inside an office building. The meeting room was cleared of any furniture. The VNA used is a Keysight PNA-X N5242A 4-port network analyzer. The channel has been measured from 2.2 to 2.6095 GHz at 4096 discrete frequency points, which corresponds to a frequency step-size of 100 kHz. Here we only use measurements relating to the 80 MHz available in the 2.4 GHz ISM band, where we assume a step-size of 500 kHz. Hence, 160 tones are used for feature extraction.

Two antenna pairs are used in the measurement, each pair having a horizontally and a vertically polarized antenna. The antennas are quarter-wave whip antennas. The antennas are connected to the VNA using phase-stable and armored cables. For channel measurements, the cables and connectors are calibrated out using an Agilent E-cal, i.e. the measurement plane is at the antenna connector.

One of the antenna pairs has been placed at a fixed location in a corner of the meeting room, while the other antenna pair was moved along a fine spatial grid through the meeting room, resulting in 4 times 550 unique radio channel responses. The distance between antenna pairs varies between 1.5 and 10.6 meters.

Each transceiver obtains a channel characterization for each antenna pair, consisting of 160 amplitudes. We do not use the phase information in our experiment. Since each transceiver has 2 antennas, there are 4 different channels that we characterize, thus obtaining a multiple channel characterization $r = [r_1, \dots, r_L]^t$ of length $L = 4 \times 160 = 640$.

B. Feature extraction: WPCA

From a pool of $M$ RF channel frequency responses $R = [r_1 | \cdots | r_M]$, representing the population of possible frequency responses in the considered application scenarios, we calculate the empirical mean $\mu_r = \frac{1}{M} \sum_{m=1}^{M} r_m$ and the empirical covariance matrix $\Sigma = \frac{1}{M-1} (R - \mu_r \mathbf{1}_M^t)(R - \mu_r \mathbf{1}_M^t)^t$, where $\mathbf{1}_M$ is the column vector with $M$ ones. Let us denote by $\Gamma$, $\Lambda = [v_1 | \cdots | v_\Gamma]$, and $\lambda = [\lambda_1, \dots, \lambda_\Gamma]^t$ the rank of $\Sigma$, its eigenvectors matrix and its eigenvalues vector respectively. As a covariance matrix, $\Sigma$ is positive semi-definite, and thus its $\Gamma$ non-zero eigenvalues satisfy $\lambda_i > 0\ \forall i \in \{1, \dots, \Gamma\}$. Then, we define the Whitened Principal Component Analysis projection matrix as:

$$\Omega = \left[\frac{v_1}{\sqrt{\lambda_1}} \,\Big|\, \cdots \,\Big|\, \frac{v_\Gamma}{\sqrt{\lambda_\Gamma}}\right]. \tag{52}$$

WPCA representations of a channel $r$ can be computed as $q = \Omega^t (r - \mu_r)$. It can be shown that, population-wise, the WPCA channel representations satisfy $q_i \sim \mathcal{N}(0, 1)$. However, since each characterization $q_i$ corresponds to a sample from a random variable whose mean's pdf should be a standard Gaussian in the proposed model-based uncoupling construction for Gaussian features, we need to normalize each of the WPCA features. The frequency response estimates at each side are noisy, i.e. $r_{i,\mathrm{side}*} = r_i + \eta_{i,\mathrm{side}*}$, where $r_i$ is the actual channel response value, and $\eta_{i,\mathrm{side}*} \sim \mathcal{N}(0, \sigma^2)$ is a Gaussian additive noise. It can be shown that $q_i \sim \mathcal{N}\left(v_i^t (r - \mu_r), \frac{\sigma^2}{\lambda_i}\right)$ for this specific channel. The normalized features are computed as $q'_i = q_i / \sqrt{\sigma_{q_i}^2 - \sigma^2/\lambda_i}$. After this, the population-wise pdf of the mean of $q'_i$ is the standard Gaussian, and the pdf of the channel-specific $q'_i$ also changes accordingly.
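The two "it can be shown" statements above follow from the eigen-decomposition of $\Sigma$. The short derivation below is a sketch under two assumptions that are made explicit here: the eigenvectors $v_i$ are orthonormal, and the measurement noise is uncorrelated across frequencies with covariance $\sigma^2 I$.

```latex
\begin{align*}
% Whitening: \Sigma v_j = \lambda_j v_j and v_i^t v_j = \delta_{ij}
\left[\Omega^t \Sigma\, \Omega\right]_{ij}
  &= \frac{v_i^t \Sigma\, v_j}{\sqrt{\lambda_i \lambda_j}}
   = \frac{\lambda_j\, v_i^t v_j}{\sqrt{\lambda_i \lambda_j}}
   = \delta_{ij}
  && \Rightarrow\ \operatorname{Cov}(q) = I_\Gamma \text{ population-wise}, \\
% Noise propagation through the projection, with \operatorname{Cov}(\eta) = \sigma^2 I
\left[\operatorname{Cov}\!\left(\Omega^t \eta\right)\right]_{ij}
  &= \sigma^2 \left[\Omega^t \Omega\right]_{ij}
   = \sigma^2\, \frac{v_i^t v_j}{\sqrt{\lambda_i \lambda_j}}
   = \frac{\sigma^2}{\lambda_i}\, \delta_{ij}
  && \Rightarrow\ \operatorname{Var}(q_i \mid r) = \frac{\sigma^2}{\lambda_i}.
\end{align*}
```

The second line shows why components with small eigenvalues are the noisiest after whitening: the same measurement noise is amplified by $1/\lambda_i$.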

C. Experimental Results

In the experiments, we compared the proposed approaches using the model-based uncoupling construction for Gaussian features described in Section V with the baseline construction, which does not use any quantization helper data, as presented in Section IV-B. We use a simplified version of the usability-driven approach described in Section V-B, where we only use reliable Gaussian sources. This is reasonable in our use case, since we want the security level to be the same for all the RF channels. If we incorporated the unreliable Gaussian sources, this would no longer hold, because the number of bits extracted from them depends on the concrete value of the samples.

The codes are selected from the BCH family with length 127. We set the target usability in our experiment to $\mathrm{FRR}_{\max} = 1\%$. We added white Gaussian noise to the RF channel frequency responses in order to extend the comparison in performance between the proposed approaches and the baseline to different levels of Signal to Noise Ratio (SNR), which is the main factor affecting the performance of the fuzzy extraction. In both approaches we used BCH codes for error correction.

TABLE I
CODE PARAMETERS (n, k, t), SNR, AND OBTAINED FRR FOR BOTH THE BASELINE AND PROPOSED CONSTRUCTIONS.

Approach   SNR [dB]                  n     k     t    FRR [%]
Baseline   22.05 (no added noise)    127   50    13   4.41
Proposed   22.05 (no added noise)    127   120   1    6.82
Baseline   21.71                     127   50    13   4.45
Proposed   21.71                     127   120   1    6.77
Baseline   19.47                     127   43    14   4.00
Proposed   19.47                     127   113   2    5.73
Baseline   17.23                     127   36    15   3.73
Proposed   17.23                     127   106   3    5.41
Baseline   15.79                     127   29    21   4.09
Proposed   15.79                     127   92    5    4.68
Baseline   14.22                     127   29    21   3.55
Proposed   14.22                     127   78    7    3.64
Baseline   12.47                     127   29    21   3.00
Proposed   12.47                     127   64    10   3.00
Baseline   10.66                     127   22    23   2.41
Proposed   10.66                     127   43    14   2.41
Baseline   8.79                      127   15    27   1.14
Proposed   8.79                      127   29    21   1.18
Baseline   6.88                      127   8     31   1.00
Proposed   6.88                      127   15    27   1.05

We show the obtained code parameters and FRR for both the baseline and proposed approaches in Table I. One can see that using the proposed method results in a much longer security parameter $k$ for comparable FRR. Notice that the resulting FRR is in both cases higher than the target usability parameter $\mathrm{FRR}_{\max}$. This seems to be caused by a mismatch between our statistical assumptions and the actual nature of the noise. Although in most cases this noise behaves as i.i.d. Gaussian, matching our assumptions, there are also outliers where the noise is of higher amplitude and correlated among the different frequencies. However, the obtained FRR remains in an acceptable range for our proposed use case, since in the case that a false rejection is produced, the cost of repeating these procedures would be affordable. We can observe that when the added noise becomes more prevalent (for low values of SNR) the statistical assumptions on the noise become more accurate, and the obtained FRR gradually approaches the target $\mathrm{FRR}_{\max}$.

VII. CONCLUSIONS

In this paper, we formally introduced two new constructions: the model-based uncoupling construction, which deals directly with noisy sources and produces metadata which (i) is unrelated to the discrete representation, (ii) guarantees the diversity of the discrete representation, and (iii) is useful for the conversion between the continuous and discrete domains; and the strengthened uncoupled fuzzy extractor, which specifies how the helper data derived by the model-based uncoupling construction and the additional fixed authentication factor must be handled to avoid disclosure of information. The key extracted using the strengthened uncoupled fuzzy extractor is uncoupled from the noisy source, and can be used for cryptographic purposes. Moreover, the helper data do not leak any information about the key and are unlinkable after the encryption, which makes the strengthened uncoupled fuzzy extractor suitable for privacy-preserving applications.

These characteristics make the model-based uncoupling construction best suited for being used in cases where privacy of the noisy source characteristics or linkability of the noisy source among services may be a concern. On the other hand, the strengthened uncoupled fuzzy extractor is best suited for privacy-preserving long-term cryptographic key derivation.

Furthermore, we presented the optimal model-based uncoupling construction for Gaussian sources, covering the full range of possible reliabilities. Based on these procedures, a unified design of key binding schemes from continuous data has been derived, both from a usability and a security viewpoint.

Finally, we also demonstrated the effectiveness of the proposed techniques in a use case of fuzzy extraction from RF channels. The proposed model-based uncoupling construction for Gaussian noisy sources almost doubles the amount of entropy extracted from the RF channels when compared to the baseline method, while maintaining a comparable FRR (see Table I). These experiments demonstrate the advantages in terms of robustness of the proposed construction, which is able to provide a much better balance between security (increased length of the derived key) and convenience (comparable false rejection rate) when compared to fuzzy extractors from uniformly quantized continuous sources.

APPENDIX A
BACKGROUND DEFINITIONS

Definition A.1 (Error Correcting Codes). A $(\mathcal{C}, n, k, t)$-error correcting code $C$ consists of two functions:
• $\text{ECC-enc} : \mathcal{C}^k \to \mathcal{C}^n$ takes as input a message $m \in \mathcal{C}^k$, and returns a codeword $w = \text{ECC-enc}(m) \in \mathcal{C}^n$.
• $\text{ECC-dec} : \mathcal{C}^n \to \mathcal{C}^k$ takes as input $w' \in \mathcal{C}^n$, and returns a message $m \in \mathcal{C}^k$, if $d_{\mathcal{C}}(w, w') \le t$ for $w = \text{ECC-enc}(m)$, with $d_{\mathcal{C}}(\cdot, \cdot)$ the distance in $\mathcal{C}^n$.

Definition A.2 (Symmetric encryption scheme). A $(\mathcal{K}, \mathcal{P}, \mathcal{E}, \Lambda)$-symmetric encryption scheme is defined by a 3-tuple $(\text{KeyGen}, \text{ENC}, \text{DEC})$ of functions. $\text{KeyGen} : \Lambda \to \mathcal{K}$ generates the encryption and decryption key $K \in \mathcal{K}$ using a security parameter $\lambda \in \Lambda$ as input, i.e., $K = \text{KeyGen}(\lambda)$. $\text{ENC} : \mathcal{K} \times \mathcal{P} \to \mathcal{E}$ takes as input a plaintext $p \in \mathcal{P}$ and a key $K \in \mathcal{K}$ and outputs a ciphertext $e \in \mathcal{E}$, i.e., $e = \text{ENC}_K(p)$. $\text{DEC} : \mathcal{K} \times \mathcal{E} \to \mathcal{P}$ takes as input a ciphertext $e \in \mathcal{E}$ and a key $K \in \mathcal{K}$ and outputs a plaintext $p \in \mathcal{P}$, i.e., $p = \text{DEC}_K(e)$. If the encryption scheme is perfectly secure, then we drop $\Lambda$ from the notation.

Definition A.3 (Universal one-way hash functions (adapted from [28])). Let $\{n_{1_i}\}$ and $\{n_{0_i}\}$ be two increasing sequences such that $n_{0_i} \le n_{1_i}\ \forall i$, but $\exists q$ a polynomial, such that $q(n_{0_i}) \ge n_{1_i}$. Let $H_k$ be a collection of functions such that $\forall h \in H_k$, $h : \mathcal{C}^{n_{1_k}} \to \mathcal{C}^{n_{0_k}}$, and let $W = \bigcup_k H_k$. Let $A$ be a PPT adversary that, on input $k$, outputs an initial value $x \in \mathcal{C}^{n_{1_k}}$. Then, given a random $h \in H_k$, $A$ attempts to find $y \in \mathcal{C}^{n_{1_k}}$ such that $h(y) = h(x)$, but $x \neq y$. $W$ is called a family of universal one-way hash functions if for all $A$:

1) If $x \in \mathcal{C}^{n_{1_k}}$ is $A$'s initial value, then $\Pr\{A(h, x) = y,\ h(x) = h(y),\ x \neq y\} < \text{negl}(n_{1_k})$.
2) $\forall h \in H_k$ $\exists$ a description of $h$ of length polynomial in $n_{1_k}$, s.t. given $h$ and $x$, $h(x)$ is computable in polynomial time.
3) $H_k$ is accessible: $\exists$ an algorithm $G$ s.t. on input $k$, $G$ generates uniformly at random a description of $h \in H_k$.

REFERENCES

[1] Y. Dodis, R. Ostrovsky, L. Reyzin, and A. Smith, “Fuzzy extractors:How to generate strong keys from biometrics and other noisy data,”SIAM J. Comput., vol. 38, no. 1, pp. 97–139, 2008.

[2] J. de Groot, B. Škoric, N. de Vreede, and J.-P. Linnartz, “Quantization inzero leakage helper data schemes,” EURASIP J. on Advances in SignalProcessing, vol. 2016:54, 2016.

[3] K. Simoens, P. Tuyls, and B. Preneel, “Privacy weaknesses in biometricsketches,” in 30th IEEE Symp. on Security and Privacy (S&P 2009),17-20 May 2009, Oakland, California, USA, 2009, pp. 188–203.

[4] “Information technology – Security techniques – Information securitymanagement systems – Requirements, ISO 27001:2013(E),” Int. Orga-nization for Standardization, Geneva, CH, Standard, 2013.

[5] A. Juels and M. Wattenberg, “A fuzzy commitment scheme,” in ACMCCS’99. ACM Press, 1999, pp. 28–36.

[6] V. V. T. Tong, H. Sibert, J. Lecoeur, and M. Girault, “Biometric fuzzyextractors made practical: A proposal based on fingercodes,” in Proc. ofthe 2007 Int. Conf. on Advances in Biometrics, ser. ICB’07. Springer-Verlag, 2007, pp. 604–613.

[7] E. Argones Rua, E. Maiorana, J. Alba Castro, and P. Campisi, “Biometrictemplate protection using universal background models: An applicationto online signature,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 1,pp. 269–282, Feb 2012.

[8] S. Billeb, C. Rathgeb, H. Reininger, K. Kasper, and C. Busch, “Bio-metric template protection for speaker recognition based on universalbackground models,” IET Biometrics, vol. 4, no. 2, pp. 116–126, 2015.

[9] M. van der Veen, T. Kevenaar, G.-J. Schrijen, T. H. Akkermans, andF. Zuo, “Face biometrics with renewable templates,” Proc. SPIE, vol.6072, pp. 60 720J–60 720J–12, 2006.

[10] C. Rathgeb and A. Uhl, “Iris-biometric fuzzy commitment schemesunder signal degradation,” in Proc. of the 5th Int. Conf. on Image andSignal Processing, ser. ICISP’12. Springer-Verlag, 2012, pp. 217–225.

[11] E. Maiorana, D. Basi, and P. Campisi, “Biometric template protectionusing turbo codes and modulation constellations,” in Workshop onInform. Forensics and Security 2012, Tenerife, Spain, 2012.

[12] C. Bösch, J. Guajardo, A.-R. Sadeghi, J. Shokrollahi, and P. Tuyls,“Efficient helper data key extractor on FPGAs,” in CHES 2008, ser.LNCS, E. Oswald and P. Rohatgi, Eds., vol. 5154. Springer BerlinHeidelberg, 2008, pp. 181–197.

[13] V. van der Leest, B. Preneel, and E. van der Sluis, Soft DecisionError Correction for Compact Memory-Based PUFs Using a SingleEnrollment, ser. LNCS. Springer Berlin Heidelberg, 2012, vol. 7428,pp. 268–282.

[14] R. Maes, V. van der Leest, E. van der Sluis, and F. Willems, “Securekey generation from biased pufs: extended version,” J. of CryptographicEngineering, vol. 6, no. 2, pp. 121–137, Jun 2016.

[15] R. Maes, P. Tuyls, and I. Verbauwhede, Low-Overhead Implementationof a Soft Decision Helper Data Algorithm for SRAM PUFs, ser. LNCS.Springer Berlin Heidelberg, 2009, vol. 5747, pp. 332–347.

[16] R. Maes, P. Tulys, and I. Verbauwhede, “A soft decision helper dataalgorithm for SRAM PUFs,” in 2009 IEEE Int. Symp. on Inform. Theory,2009, pp. 2101–2105.

[17] M. D. Yu, R. Sowell, A. Singh, D. M’Raïhi, and S. Devadas, “Perfor-mance metrics and empirical results of a puf cryptographic key genera-tion asic,” in 2012 IEEE Int. Symp. on Hardware-Oriented Security andTrust, 2012, pp. 108–115.

[18] T. Ignatenko and F. M. J. Willems, “Information leakage in fuzzycommitment schemes,” IEEE Trans. Inf. Forensics Security, vol. 5, no. 2,pp. 337–348, June 2010.

[19] T. Ignatenko, “Secret-key rates and privacy leakage in biometric systems,” Ph.D. dissertation, Technische Universiteit Eindhoven, Eindhoven, 2009.

[20] E. A. Verbitskiy, P. Tuyls, C. Obi, B. Schoenmakers, and B. Skoric, “Key extraction from general nondiscrete signals,” IEEE Trans. Inf. Forensics Security, vol. 5, no. 2, pp. 269–279, June 2010.

[21] M. Turk and A. Pentland, “Face recognition using eigenfaces,” in Proc. of IEEE Conf. on Comput. Vision and Pattern Recognition, 1991, pp. 586–591.

[22] L. Huang, H. Zhuang, S. Morgera, and W. Zhang, “Multi-resolution pyramidal Gabor-eigenface algorithm for face recognition,” in Image and Graphics (ICIG’04), Third Int. Conf. on, December 2004, pp. 266–269.

[23] R. Kuhn, P. Nguyen, J.-C. Junqua, L. Goldwasser, N. Niedzielski, S. Fincke, K. Field, and M. Contolini, “Eigenvoices for Speaker Adaptation,” in Int. Conf. on Spoken Language Processing, Dec. 1998.

[24] P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, “Joint factor analysis versus eigenchannels in speaker recognition,” IEEE/ACM Trans. Audio, Speech, Language Process., vol. 15, no. 4, pp. 1435–1447, 2007.

[25] N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” IEEE/ACM Trans. Audio, Speech, Language Process., vol. 19, no. 4, pp. 788–798, May 2011.

[26] R. E. Korf, “Multi-way number partitioning,” in Proc. of the 21st Int. Joint Conf. on Artificial Intelligence, ser. IJCAI’09. Morgan Kaufmann Publishers Inc., 2009, pp. 538–543.

[27] R. E. Korf, E. L. Schreiber, and M. D. Moffitt, “Optimal sequential multi-way number partitioning,” in Int. Symp. on Artificial Intelligence and Mathematics, 2014.

[28] M. Naor and M. Yung, “Universal one-way hash functions and their cryptographic applications,” in Proc. of the Twenty-first Annual ACM Symp. on Theory of Computing, ser. STOC ’89. ACM, 1989, pp. 33–43.

Enrique Argones Rúa received the M.Sc. and Ph.D. degrees in telecommunications engineering from the University of Vigo, Spain, in 2003 and 2008, respectively. He was with Gradiant, an ICT R&D centre, from 2012 to 2015, where he was the technical leader of biometric technologies. Since 2015, he has been with the imec–COSIC research group at KU Leuven, Belgium. His research interests are focused on the enhancement of privacy and security in authentication systems.

Aysajan Abidin received the B.Sc. degree in computational science from Xinjiang University in China in 2006, the M.Sc. degree in engineering mathematics from Chalmers University of Technology in Sweden in 2007, and the Ph.D. degree in information coding from Linköping University in Sweden in 2013. He worked on privacy-preserving biometric authentication as a post-doctoral researcher at Chalmers during 2014–2015. Since 2015, he has been a research fellow with the imec–COSIC research group, Department of Electrical Engineering (ESAT), KU Leuven. His research interests include information security, privacy, authentication, and applied cryptography.

Roel Peeters was a postdoctoral researcher at the research group COSIC at KU Leuven (Belgium). He obtained a Master’s degree in Electrical Engineering and a Ph.D. in Engineering Sciences at KU Leuven in 2007 and 2012, respectively. His research interests are mainly in the analysis and design of cryptographic protocols. He has authored more than 30 scientific publications and participated in various research projects. Currently he is the CTO of nextAuth, a company providing mobile user authentication solutions.

Jac Romme was born in Breda, the Netherlands, in 1975. He received the M.Sc. degree in 2000 in electrical engineering from Eindhoven University of Technology, the Netherlands, and a Ph.D. degree from the Signal Processing and Speech Communication Laboratory (SPSC) at the Graz University of Technology, Austria. From 2000 to 2010, he was a researcher for wireless radio systems at IMST GmbH, Germany. Since 2010, he has been employed by Imec, the Netherlands. Currently, he is a principal member of the technical staff of the IoT department. Since 2016 he has also been a part-time Guest Researcher at the Circuit and Systems Group at TU Delft, the Netherlands. His research interests are radio communication and localization, (statistical) signal processing, Internet-of-Things, and low-power IC design and modelling.
