user authentication and cryptographic primitives brad karp ucl computer science cs gz03 / 4030 19 th...
TRANSCRIPT
User Authentication and Cryptographic Primitives
Brad KarpUCL Computer Science
CS GZ03 / 403019th November, 2007
2
Outline
• Authenticating users– Local users: hashed passwords– Remote users: s/key– Unexpected covert channel: the Tenex
password-guessing attack• Symmetric-key-cryptography• Public-key cryptography usage model• RSA algorithm for public-key cryptography
– Number theory background– Algorithm definition– Cryptographic strength of RSA– Ease of misusing RSA
3
Authentication of Local Users
• Goal: only file’s owner can access file• UNIX authentication policy:
– Each file has an owner principal: an integer user ID– Each file has associated owner permissions (read,
write, execute, &c.)– Each process runs with integer user ID; only can
access file as owner if matches file’s owner user ID– OS assigns user ID to user’s shell process at login
time, authenticated by username and password– Shell process creates new child processes with
same user ID• How does UNIX know the correspondence
among <username, user ID, password>, for all users?
4
Straw Man:Plaintext Password Database
• Keep password database in a file, e.g.:bkarp:3715:secretpw
mjh:4212:multicast
• Passwords stored in file in plaintext• Make file readable only by privileged
superuser (root)• /bin/login program prompts for usernames
and passwords on console; runs as root, so can read password database
• How well does this scheme meet original goal?
5
Cryptographic Primitive:Cryptographic Hash Function
• Don’t want someone who sees the password database to learn users’ passwords
• Cryptographic hash function, y=H(x) such that:– H() is preimage-resistant: given y, and with
knowledge of H(), computationally infeasible to recover x
– H() is second-preimage-resistant: given y, computationally infeasible to find x’x s.t. H(x)=H(x’)=y
• Widely used cryptographic hash functions:– MD-5: output is 128 bits, broken– SHA-1: output is 160 bits; on verge of being broken– SHA-256: output is 256 bits, best current practice
6
Better Plan:Hashed Password Database
• Keep password database in a file:bkarp:3715:Xc8zOP0ZHJkpmjh:4212:p6FsAtQl4cwi
• Instead of password plaintext x, store H(x)
• Make file readable by all (!)• One-wayness of H() means no one
can recover x from H(x), right?– WRONG! Users choose memorable
passwords…
7
Insight: Counting Possible Passwords
• If users pick random n-character passwords using c possible characters, how many guesses expected to guess one password?
cn/2e.g., 8 characters, each ~90 possibilities, 4.3 x 1015
• Do users pick random passwords?– Of course not; very hard to remember– Common choice: word in native language
• How many words in common use in modern English?– 50,000-70,000 (or far fewer, if you read Metro)
8
Dictionary Attack on Hashed Password Databases
• Suppose hacker obtains copy of password file (until recently, world-readable on UNIX)
• Compute H(x) for 50K common words• String compare resulting hashed words
against passwords in file• Learn all users’ passwords that are
common English words after only 50K computations of H(x)!
• Same hashed dictionary works on all password files in world!
9
Salted Password Hashes
• Generate a random string of bytes, r• For user password x, store [H(r,x), r] in
password file• Result: same password produces different
result on every machine– So must see password file before can hash
dictionary– …and single hashed dictionary won’t work for
multiple hosts
• Modern UNIX: password hashes salted; hashed password database readable only by root
Dictionary attack still possible after attacker sees password file!Users should pick passwords that aren’t close to dictionary words.
10
Tenex Password Attack:A Covert Channel
• Tenex OS stored directory passwords in plaintext
• OS supported system call:– pw_validate(directory, pw)
• Implementation simply compared pw to stored password in directory, char by char
• Clever attack:– Make pw span two VM pages, put 1st char of guess in
first page, rest of guess in second page– See whether get a page fault—if not, try next value
for 1st char, &c.; if so, first char correct!– Now position 2nd char of guess at end of 1st page, &c.– Result: guess password in time linear in length!
Lessons:Don’t store passwords in cleartext.Covert channels are real, and can be extremely difficult to find and eliminate.
11
Remote User Authentication
• Consider the case where Alice wants to log in remotely, across LAN or WAN from server
• Suppose network links can be eavesdropped by adversary, Eve
• Want scheme immune to replay: if Eve overhears messages, shouldn’t be able to log in as Alice by repeating them to server
• Clear non-solutions:– Alice logs in by sending {alice, password}– Alice logs in by sending {alice, H(password)}
12
Remote User Authentication (2)
• Desirable properties:– Message from Alice must change
unpredictably at each login– Message from Alice must be verifiable at
server as matching secret value known only to Alice
• Can we achieve these properties using only a cryptographic hash function?
13
Remote User Authentication: s/key
• Denote by Hn(x) n successive applications of cryptographic hash function H() to x– i.e., H3(x) = H(H(H(x)))
• Store in server’s user database:alice:99:H99(password)
• At first login, Alice sends:{alice, H98(password)}
• Server then updates its database to contain:alice:98:H98(password)
• At next login, Alice sends:{alice, H97(password)}– and so on…
14
Properties of s/key
• Just as with any hashed password database, Alice must store her secret on the server securely (best if physically at server’s console)
• Alice must choose total number of logins at time of storing secret
• When logins all “used”, must store new secret on server securely again
15
Secrecy through Symmetric Encryption
• Two functions: E() encrypts, D() decrypts
• Parties share secret key K• For message M:
– E(K, M) C– D(K, C) M
• M is plaintext; C is ciphertext• Goal: attacker cannot derive M from C
without K
16
Idealized Symmetric Encryption:One-Time Pad
• Secretly share a truly random bit string P at sender and receiver
• Define as bit-wise XOR• C = E(M) = M P• M = D(C) = C P• Use bits of P only once; never use
them again!
17
Stream Ciphers:Pseudorandom Pads
• Generate pseudorandom bit sequence (stream) at sender and receiver from short key
• Encrypt and decrypt by XOR’ing message with sequence, as with one-time pad
• Again, never, ever re-use bits from pseudorandom sequence!
• What’s wrong with reusing the stream?– Alice Server: c1 = E(s, “Visa card number”)– Server Alice: c2 = E(s, “Transaction
confirmed”)– Suppose Eve hears both messages– Eve can compute:
m = c1 c2 “Transaction confirmed”
18
Symmetric Encryption: Block Ciphers
• Divide plaintext into fixed-size blocks (typically 64 or 128 bits)
• Block cipher maps each plaintext block to same-length ciphertext block
• Best today to use AES (others include Blowfish, DES, …)
• Of course, message of arbitrary length; how to encrypt message of more than one block?
19
Using Block Ciphers: ECB Mode
• Electronic Code Book method• Divide message M into blocks of
cipher’s block size• Simply encrypt each block individually
using the cipher• Send each encrypted block to receiver• Presume cipher provides secrecy, so
attacker cannot decrypt any block• Does ECB mode provide secrecy?
20
Avoid ECB Mode!
• ECB mode does not provide robust secrecy!
• What if there are repeated blocks in the plaintext? Repeated as-is in ciphertext!
• What if sending sparse file, with long runs of zeroes? Non-zero regions obvious!
• WW II U-Boat example (Bob Morris):– Each day at same time, when no news, send
encrypted message: “Nichts zu melden.”– When there’s news, send the news at that time.– Obvious when there’s news– Many, many ciphertexts of same known plaintext
made available to adversary for cryptanalysis—a worry even if encryptions of same plaintext produce different ciphertexts!
21
Using Block Ciphers: CBC Mode
• Better plan: make encryptions of successive blocks depend on one another, and initialization vector known to receiver
22
Integrity with Symmetric Crypto:Message Authentication Codes
• How does receiver know if message modified en route?
• Message Authentication Code:– Sender and receiver share secret key K– On message M, v = MAC(K, M)– Attacker cannot produce valid {M, v} without K
• Append MAC to message for tamper-resistance:– Sender sends {M, MAC(K, M)}– M could be ciphertext, M = E(K’, m)– Receiver of {M, v} can verify that v = MAC(K, M)
• Beware replay attacks—replay of prior {M, v} by Eve!
23
HMAC: A MAC Based on Cryptographic Hash Functions
• HMAC(K, M) =H(K opad . H(K ipad . M))
• where:– . denotes string concatenation– opad = 64 repetitions of 0x36– ipad = 64 repetitions of 0x5c– H() is a cryptographic hash function, like
SHA-256• Fixed-size output, even for long
messages
24
Public-Key Encryption: Interface
• Two keys:– Public key: K, published for all to see– Private (or secret) key: K-1, kept secret
• Encryption: E(K, M) {M}K
• Decryption: D(K-1, {M}K) M• Provides secrecy, like symmetric encryption:
– Can’t derive M from {M}K without knowing K-1
• Same public key used by all to encrypt all messages to same recipient– Can’t derive K-1 from K
25
Number Theory Background:Modular Arithmetic Primer (1)
• Recall the “mod” operator: returns remainder left after dividing one integer by another, the modulus– e.g., 15 mod 6 = 3
• That is:a mod n = r
which just meansa = kn + r for some integers k and r
• Note that 0 <= r < n
26
Modular Arithmetic Primer (2)
• In modular arithmetic, constrain range of integers to be only the residues [0, n-1], for modulus n– e.g., (12 + 13) mod 24 = 1– We may also write
• Modular arithmetic retains familiar properties: commutative, associative, distributive
• Same results whether mod taken at each arithmetic operation, or only at end, e.g.:(a + b) mod n = ((a mod n) + (b mod n)) mod n(ab) mod n = (a mod n)(b mod n) mod n
24) (mod11312
27
Modular Arithmetic: Advantages
• Limits precision required: working mod n, where n is k bits long, any single arithmetic operation yields at most 2k bits– …so results of even seemingly expensive
ops, like exponentiation (ax) fit in same number of bits as original operand(s)
– Lower precision means faster arithmetic• Some operations in modular arithmetic
are computationally very difficult:– e.g., computing discrete logarithms:
find integer x s.t. n) (mod bax
Cryptography leverages “difficult” operations; want reversing encryption without key to be computationally intractable!
28
Modular Arithmetic: Inverses (1)
• In real arithmetic, every integer has a multiplicative inverse—its reciprocal—and their product is 1– e.g., 7x = 1 x = (1/7)
• What does an inverse in modular arithmetic (say, mod 11) look like?
– that is, 7x = 11k + 1 for some x and k– so x = 8 (where k = 5)
11 mod17x
29
Aside: Prime Numbers
• Recall: prime number is integer > 1 that is evenly divisible only by 1 and itself
• Two integers a and b are relatively prime if they share no common factors but 1; i.e., if gcd(a, b) = 1
• There are infinitely many primes• Large primes (512 bits and longer)
figure prominently in public-key cryptography
30
Modular Arithmetic: Inverses (2)
• In general, finding modular inverse means finding x s.t.
• Does modular inverse always exist?– No! Consider
• In general, when a and n are relatively prime, modular inverse x exists and is unique
• When a and n not relatively prime, x doesn’t exist
• When n prime, all of [1…n-1] relatively prime to n, and have an inverse in that range
n) (modx a-1
8) (modx 2-1
Algorithm to find modular inverse: extended Euclidean Algorithm. Tractable; requires O(log n) divisions.
31
Euler’s Phi Function: Efficient Modular Inverses on Relative
Primes• φ(n) = number of integers < n that are
relatively prime to n• If n prime, φ(n) = n-1• If n=pq, where p and q prime:
φ(n) = (p-1)(q-1)• If a and n relatively prime, Euler’s
generalization of Fermat’s little theorem:aφ(n) mod n = 1
• and thus, to find inverse x s.t. x = a-1 mod n:x = aφ(n)-1 mod n
32
RSA Algorithm (1)
• [Rivest, Shamir, Adleman, 1978]• Recall that public-key cryptosystems
use two keys per user:– K, the public key, made available to all– K-1, the private key, kept secret by user
33
RSA Algorithm (2)
• Choose two random, large primes, p and q, of equal length, and compute n=pq
• Randomly choose encryption key e, s.t. e and (p-1)(q-1) are relatively prime
• Use extended Euclidean algorithm to compute d, s.t. d = e-1 mod ((p-1)(q-1))
• Public key: K = (e, n)• Private key: K-1 = d• Discard p and q
34
RSA Algorithm (3)
• Encryption:– Divide message M into blocks mi, each
shorter than n
– Compute ciphertext blocks ci with:ci = mi
e mod n
• Decryption– Recover plaintext blocks mi with:
mi = cid mod n
35
Why Does RSA Decryption Recover Original Plaintext?
• Observe:ci
d = (mie)d = mi
ed = mik(p-1)(q-1)+1 (mod n)
– because e and d are inverses mod (p-1)(q-1)
• mik(p-1)(q-1)+1 = mimi
k(p-1)(q-1) (mod n)
= (mi)(1) = mi
(mod n)– by Euler’s generalization of Fermat’s little
theorem:• n prime, so mi and n relatively prime• thus mi
(p-1)(q-1) = miφ(n) = 1 (mod n)
36
Misuses of RSA Break Secrecy
• When encrypting, what if plaintext drawn from very small set (e.g., {“yes”, “no”})?
• Employees escrow secret documents, encrypted with company’s public key– Upon firing or death of one employee,
company releases plaintext to another– Employee E takes employee A’s ciphertext
c = me mod n, escrows c2e mod n– Employee E fired; co-conspirator F gets 2m!
• Chosen ciphertext attack (CCA): eavesdrop a ciphertext c; submit specially concocted messages for decryption; study resulting plaintexts; learn plaintext, m = cd mod n
37
RSA: Not Quite Exponentiation
• At first glance, RSA operations appear to be raising a message to a power
• But they’re not, really…the mod n means RSA in fact a trap-door permutation– Map one element, m, of set {0, …, n-1} to
another, c– Not invertible without knowing d
• Non-invertibility applies to whole of m and c; not to individual bits of m and c, or other properties over m and c, e.g., parity of m– In escrow attack, multiplicative relationship
among RSA ciphertexts exists, despite non-invertibility
• It’s possible that learning even one bit of m may help recover all of m from c
38
Adaptive Chosen Ciphertext Attack on RSA in SSL 3.0
• SSL 3.0 encrypted with RSA by padding plaintext into blocks using PKCS #1 standard, as follows:– 0x00 | 0x02 |
8 or more non-zero random bytes | 0x00 |plaintext block
• SSL decrypts received ciphertext, checks if result in this format; returns “format error” if not!
• Bleichenbacher’s adaptive CCA attack: with about one million messages to server, attacker can recover m for previously eavesdropped ciphertext c = me mod n– When chosen ciphertext accepted by server, attacker
knows first two plaintext bytes with certainty!
39
Making RSA Secure Against Adaptive CCA Attacks
• Intuition: want plaintext input to RSA to be all-or-nothing transform of actual message– e.g., so that multiplicative property over
ciphertexts doesn’t reveal message, and knowing one bit doesn’t reveal anything about whole message
• Desirable transform properties:– Randomness: unique plaintext for repeated
identical messages– Redundancy: make most strings invalid
ciphertexts– Entanglement: knowing partial information about
input to RSA should reveal nothing about message– Invertibility: of course, must be able to recover
original message when decrypting
40
Practical Padding for RSA:OAEP+ [Shoup]
• Transforms message M into RSA input M’• Not proven adaptive CCA secure, but heuristically so
41
Digital Signatures with RSA
• RSA trap-door permutation also useful for digital signatures
• Public-key signature operations:– Sign: S(K-1, m) {m}K
-1
– Verify: V(K, {m}K-1, m} {true, false}
• Provides integrity, like a MAC:– Cannot produce valid <m, {m}K
-1> pair without knowing K-1
• With RSA:– Sign using private key, using trap-door applied
when decrypting– Verify using public key, using permutation
applied when encrypting
42
Multiplicative Attack Against RSA Signatures
• As in CCA, attacker may try to exploit multiplicative relationship among RSA permutation inputs and outputs, to decrypt eavesdropped ciphertexts
• Eve stores ciphertext c encrypted for Alice, wants to recover corresponding m
• Using Alice’s public key, {n, e}, Eve:– Chooses random number r < n– Computes y = cre mod n– Eve asks Alice to sign y– Alice sends Eve yd mod n = cdred mod n = rcd
mod n– Eve computes r-1 mod n, then recovers
m = cd mod n = r-1rcd mod n
Lesson:Don’t sign whole messages presented to you by others!
43
Only Sign Message Hashes with RSA!
• Again, want all-or-nothing transform over message before signing with trap door
• Full-domain hash:– Before signing message, compute hash of
message sized to be same number of bits as RSA modulus n
– Sign the hash, not the message– Hash reveals nothing about underlying
message, nor messages arithmetically related to it
44
Costs of Cryptography
• Public-key operations significantly more computationally expensive than symmetric-key ones
• Modern CPU can symmetrically encrypt and MAC faster than 100 Mbps
• Public-key encryption typically 100X slower than symmetric crypto– This relationship changes as hardware
changes!
• Result: tend to use public-key encryption and signatures only on short messages
45
Hybrid Cryptography
• Goal: mix speed of symmetric-key flexibility of public-key cryptography
• Send symmetric key encrypted with public key; message encrypted with symmetric key
46
Pitfall: Public Key Provenance
• Suppose client wishes to know it’s talking to particular server
• Where does client get server’s public key?• How does client know it has correct public
key for real server, and not attacker?• Man-in-the-middle attack:
– Client connects to attacker– Attacker gives client attacker’s public key– Client believes communicating with real
server