Authentication and passwords
Passwords
The Key Idea
Prover sends a password to a Verifier.
• The channel must be private
  – If an attacker obtains a user’s password, he can authenticate as her.
• Passwords must be hard to guess
  – But easy to remember, so that they don’t have to be written down.
• We need to balance security and usability. E.g.,
  – Passwords should be changed frequently
  – But this change shouldn’t be mandatory
Issues with passwords
Need to examine:
• Choosing the password
  – And how to set and change it
• Storing the password on each side
  – cryptography
  – software / hardware security
• Using/typing the password:
  – *** vs shoulder surfing
• Transmitting the password
  – encrypted in some way?
  – not guaranteed to be sufficient…
Bad User?
Users have the right to be bad.
Attacks Taxonomy
• Discovering
  – shoulder surfing
  – device malware
  – keyboard logger
  – eavesdropping / sniffing from network
  – stealing from a server
  – spoofing (fake login page)
• Guessing and verifying the guess
  – online
  – offline
Password guessing
• Online guessing
  – Attacker attempts to access system and submits a guess of the password
  – Attacker can be thwarted by rate limiting
• Offline guessing
  – Attacker checks a guess of the password against captured data
  – Attacker does checking in private; cannot be thwarted by rate limiting
Target:
• against one particular user
• many users, target any one: can be easier!
• target many users
Password strength
Entropy of a Source
Let X be a random variable (with a finite or infinite number of possible outcomes \(x_i\)).
The entropy of X [Shannon] is: \(H(X) = -\sum_x \Pr[X=x] \log_2 \Pr[X=x]\)
Suppose \(x_i\) has probability \(p_i\) of occurring. Then \(H(X) = -\sum_i p_i \log_2 p_i = \sum_i p_i \log_2 (1/p_i)\)
Examples
• Let X be a toss of a fair coin
  – \(x_1\) = heads, \(p_1 = 0.5\); \(x_2\) = tails, \(p_2 = 0.5\)
  – \(H(X) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5) = 1\)
  – Thus, exactly as one would expect, a fair coin toss carries exactly one bit of information.
• Let X be a toss of a biased coin
  – \(x_1\) = heads, \(p_1 = 0.3\); \(x_2\) = tails, \(p_2 = 0.7\)
  – \(H(X) = -0.3 \log_2(0.3) - 0.7 \log_2(0.7) \approx 0.88\)
  – So a biased coin carries less information (it’s more predictable).
• By the same calculation, a 0.01, 0.99 biased coin carries about 0.08 bits of information.
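The entropy formula is easy to check numerically. A minimal sketch, assuming the distribution is given as a plain list of probabilities (the function name `shannon_entropy` is ours, not from the slides):

```python
import math

def shannon_entropy(probabilities):
    """Return H(X) in bits for a discrete distribution given as probabilities."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with probability 0 contribute nothing, so they are skipped.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

fair = shannon_entropy([0.5, 0.5])      # fair coin
biased = shannon_entropy([0.3, 0.7])    # biased coin
print(f"fair coin:   {fair:.2f} bits")
print(f"biased coin: {biased:.2f} bits")
```

Any other distribution can be passed in the same way, as long as the probabilities sum to 1.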
Application to passwords
• Let X be a password selected from 8-character strings formed from the alphabet A-Z a-z 0-9 !@$%^&*()_-=+\|”’;.,
  – Say the alphabet has 26+26+10+18 = 80 characters
  – Passwords have length 8, so there are \(80^8 \approx 2^{50.6}\) possibilities.
  – If every password is equally likely, the entropy is about 50.6 bits:
  – \(\sum_{i=1}^{80^8} 80^{-8} \log_2(80^8) = \log_2(80^8) \approx 50.6\)
• Sounds great – but that’s assuming every password is as likely as every other one.
  – We know “password123”, “letmein” are much more likely
  – And “j&fR}5=X” very unlikely.
• So what is the true entropy in typical password scenarios?
Approximating real-life passwords
Let’s assume that
• 20% of the users pick a very bad password
  – Passwords like “password”, “pa$$w0rd”, and each of the (say) 20 other variations…
  – Also words like “letmein”, “123456”, “qwerty”,…, and their variations
  – Say there are 200 similar passwords.
• Another 30% of the users chose dictionary words with a number at the end, like “laptop3”
  – Say there are 1000 words, 10 numbers = 10000 such passwords
• Another 40% of them did it slightly better: “lovelife^$&”
  – Say there are 1000 words, 1000 decorations = \(10^6\) such passwords
• That leaves a remaining 10%...
  – 4% picked from \(10^8\) passwords
  – 4% picked from \(10^{20}\) passwords
  – 2% picked from a far larger space, taken here to be \(2^{240} \approx 10^{72}\) passwords
Calculating the entropy
• There are 200 passwords, each with probability \(0.2 \cdot 1/200\)
  – \(-\sum_{i=1}^{200} \frac{0.2}{200} \log_2 \frac{0.2}{200} \approx 1.99\)
• And 10000 passwords, each with probability \(0.3 \cdot 1/10000\)
  – \(-\sum_{i=1}^{10000} \frac{0.3}{10000} \log_2 \frac{0.3}{10000} \approx 4.51\)
• \(10^6\) passwords, each with probability \(0.4 \cdot 1/10^6\)
  – \(-\sum_{i=1}^{10^6} \frac{0.4}{10^6} \log_2 \frac{0.4}{10^6} \approx 8.5\)
• \(10^8\) passwords, each with probability \(0.04 \cdot 1/10^8\)
  – \(-\sum_{i=1}^{10^8} \frac{0.04}{10^8} \log_2 \frac{0.04}{10^8} \approx 1.25\)
• \(10^{20}\) passwords, each with probability \(0.04 \cdot 1/10^{20}\)
  – \(-\sum_{i=1}^{10^{20}} \frac{0.04}{10^{20}} \log_2 \frac{0.04}{10^{20}} \approx 2.8\)
• \(10^{72}\) passwords, each with probability \(0.02 \cdot 1/10^{72}\)
  – \(-\sum_{i=1}^{10^{72}} \frac{0.02}{10^{72}} \log_2 \frac{0.02}{10^{72}} \approx 4.9\)
• Total: about 24 bits of entropy on average
  – And check that our probabilities add up to 1.
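The sum above can be reproduced in a few lines. This is a sketch under the slide's assumed model: each class is a (share of users, number of equally likely passwords) pair, and a class with share s and size n contributes s · log2(n/s) bits. The names `CLASSES`, `class_entropy_bits` and `total_entropy_bits` are ours.

```python
import math

# (share of users, number of equally likely passwords in the class).
# These figures are the assumed ones from the model above.
CLASSES = [
    (0.20, 200),
    (0.30, 10_000),
    (0.40, 10**6),
    (0.04, 10**8),
    (0.04, 10**20),
    (0.02, 10**72),
]

def class_entropy_bits(share, size):
    """Contribution of one class: size * p * log2(1/p), with p = share / size."""
    return share * math.log2(size / share)

def total_entropy_bits(classes):
    return sum(class_entropy_bits(share, size) for share, size in classes)

# Sanity check: the shares must add up to 1.
assert math.isclose(sum(share for share, _ in CLASSES), 1.0)

for share, size in CLASSES:
    print(f"{share:.2f} of users, {size:.0e} passwords: "
          f"{class_entropy_bits(share, size):.2f} bits")
print(f"total: {total_entropy_bits(CLASSES):.1f} bits")
```

Changing the shares shows how strongly the total depends on the few users who pick from the huge class, even though most users sit in the tiny ones.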
Battery horse staple
• Let’s say you use a dictionary of 90,000 words
  – /usr/share/dict/words has 99,171 words on my system
• Computer chose 4 words uniformly at random
  – Like correct battery horse staple
  – This gives us \(90000^4\) possibilities, that’s \(2^{65.8}\) possibilities
  – So it’s about 66 bits of entropy
• Let’s calculate the entropy with the log formula…
  – There are \(2^{65.8}\) passwords, each with probability \(1/2^{65.8}\)
    • \(-\sum_{i=1}^{2^{65.8}} 2^{-65.8} \log_2(2^{-65.8}) = 65.8\)
• That’s great. But how usable is it? Say you have 40 passwords to remember...
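A sketch of machine-chosen passphrases, assuming the words are drawn independently and uniformly with Python's `secrets` module. `DEMO_WORDS` is a tiny stand-in list for illustration only; the 90,000-word dictionary from the slide is not reproduced here, and the function names are ours.

```python
import math
import secrets

def passphrase_entropy_bits(dictionary_size, num_words):
    """Entropy of num_words drawn uniformly and independently from the dictionary."""
    return num_words * math.log2(dictionary_size)

def generate_passphrase(words, num_words=4):
    """Pick num_words uniformly at random using the OS random source."""
    return " ".join(secrets.choice(words) for _ in range(num_words))

# Tiny stand-in list for illustration; a real list would have ~90,000 entries.
DEMO_WORDS = ["correct", "battery", "horse", "staple", "laptop", "window"]

print(generate_passphrase(DEMO_WORDS))
print(f"{passphrase_entropy_bits(90_000, 4):.1f} bits")
```

The entropy figure only holds if the machine picks the words; a passphrase chosen by the user from the same dictionary is not uniformly distributed.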
User-chosen vs system-generated?
99% of systems, including most banks, allow people to choose their passwords.
This makes them much less secure.
• lower entropy in general
• even for security-aware users, humans are UNABLE to generate really random numbers; entropy is just lower
But it is hard to make system-generated passwords memorable.
Core password problem
• Impossible situation for humans:
  – Secure => uniformly random (chosen by machine) => not memorable by humans
  – Passwords must be different for every service.
• Usually implies compromise between memorable and random
  – Passwords are chosen by the user (memorable)
  – But aim to be as random as possible, e.g.
    • Passwords must have mix of A-Z a-z 0-9 $%^&*()…
    • Must change every 6 months, must be different from previously chosen 10 pwds
• Increasingly popular: password managers
  – Passwords chosen by machine, stored in pwd manager
  – User just remembers one password, to unlock the manager
  – Problem: single point of failure, from both security and availability points of view.
Secondary passwords (pwd recovery service)
Two problems:
• Usually less secure, backdoor entry point
  – Sarah Palin email hack: Yahoo email account of USA vice presidential candidate was accessed by David Kernell in Sept 2008, who looked up bio details including high school and birthdate and used Yahoo’s password recovery service.
• Legitimate users fail to pass.
  – Most of these questions, such as “name of your first pet” or “your first car”, do not have a unique answer…
    • problems with spelling, capital letters etc.
    • users may deliberately put incorrect information anyway
Password Storage
Password Storage and Verification
How to store a password p?
Method 1: store p. VERY BAD!
Unnecessary point of failure. Attacker might obtain all the passwords.
If a website is able to remind you of your pwd, it’s storing them: rubbish website.
Not needed! Key concept: OWF = One-Way Function.
Password Storage and Verification
Method 2: store h(p). Better but…
Brute force attacks possible, even though h is an OWF.
If an attacker obtains all the password hashes, he can try to guess the passwords, and check the guesses offline.
Simple use of an OWF does protect strong passwords, but it doesn’t adequately protect weak ones.
Password Storage and Verification
Method 2: store h(p).
• This method allows an attacker to guess a password and verify it against all the users in one go
  – For each guessed p, just check if h(p) is in the password file.
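A small sketch of why unsalted hashes let one guess be tested against every user at once. SHA-256 stands in for h, and the password file, user names and guess list are all made up for illustration.

```python
import hashlib

def h(password):
    """Stand-in one-way function: unsalted SHA-256 (illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical leaked password file: user -> h(p)
password_file = {
    "alice": h("letmein"),
    "bob": h("j&fR}5=X"),
    "carol": h("letmein"),
    "dave": h("123456"),
}

def offline_dictionary_attack(password_file, guesses):
    """Hash each guess once and match it against every user at the same time."""
    users_by_hash = {}
    for user, digest in password_file.items():
        users_by_hash.setdefault(digest, []).append(user)
    cracked = {}
    for guess in guesses:
        for user in users_by_hash.get(h(guess), []):
            cracked[user] = guess
    return cracked

print(offline_dictionary_attack(password_file, ["password", "letmein", "123456"]))
```

The attacker's cost is one hash per guess, independent of the number of users, and identical passwords are visible as identical entries in the file.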
Password Storage and Verification
Method 3:
Key idea: make sure identical passwords are stored differently.
  – Example 1: store h(name, p).
  – Example 2: store h(salt, p), salt.
• With salt being a random “shadow ID” for this user.
• Unix originally stored h(p) in readable /etc/passwd
• Now it stores h(salt,p), salt in /etc/shadow, readable only by root: defence in depth
  – now cannot relate passwords from different users,
  – removes the faster dictionary attack from the previous slide
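A minimal sketch of Example 2, storing h(salt, p) together with the salt. SHA-256 stands in for h and the 16-byte salt length is our assumption; a plain fast hash is used only to show the effect of the salt, not as a recommended storage scheme.

```python
import hashlib
import hmac
import os

def store_password(password):
    """Return (salt, digest) for h(salt, p), with a fresh random salt per user."""
    salt = os.urandom(16)  # salt length is an assumption of this sketch
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify_password(password, salt, digest):
    """Recompute h(salt, p) with the stored salt and compare in constant time."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, digest)

alice = store_password("letmein")
carol = store_password("letmein")
# Same password, different salts, so the stored digests differ
# (with overwhelming probability).
print(alice[1] != carol[1])
print(verify_password("letmein", *alice))
```

A guess now has to be hashed separately for each user's salt, so the one-guess-against-all-users shortcut no longer applies.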
So is it better to store:
Method 3A: store h(name, machine ID, salt, password), salt?
Slow hash functions
• Instead of using a plain hash function, one can use one that is deliberately slow
  – This slows down the attacker who is doing offline guessing. It adds small cost to the verification process – hopefully negligible.
• PBKDF2 (standardised in 2000) iterates 1000 times (or 10000).
  – Unfortunately, can be done very fast on ASICs or GPUs
  – bcrypt (1999) needs more RAM, resists better.
  – scrypt can use arbitrarily large amounts of memory, resists better
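A sketch of slow hashing with PBKDF2, using `hashlib.pbkdf2_hmac` from the Python standard library. The iteration count of 10,000 is the figure from the slide and HMAC-SHA-256 is our choice of underlying function; both are assumptions of this sketch, not recommendations.

```python
import hashlib
import hmac
import os

ITERATIONS = 10_000  # figure from the slide; an assumption, tune for your hardware

def slow_hash(password, salt, iterations=ITERATIONS):
    """PBKDF2 with HMAC-SHA-256: each guess costs `iterations` HMAC rounds."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def store_password(password):
    """Return (salt, iterations, digest); the count is stored so it can be raised later."""
    salt = os.urandom(16)
    return salt, ITERATIONS, slow_hash(password, salt)

def verify_password(password, salt, iterations, digest):
    return hmac.compare_digest(slow_hash(password, salt, iterations), digest)

record = store_password("correct battery horse staple")
print(verify_password("correct battery horse staple", *record))
print(verify_password("letmein", *record))
```

The legitimate server pays the cost once per login, while an offline attacker pays it for every guess against every salt.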
Lots of password leaks
In June 2012 a file containing over six million password hashes which allegedly originated from LinkedIn was widely circulated over the Internet.
• Hashes were not salted.
• Later, hackers recovered many of the passwords using rainbow tables and dictionary attacks.
  – Many cracked passwords contained "linked" or even "linkedin"; for example "lawrencelinkedin".
  – Even passwords such as "parikh093760239", "a06v1203n08" and "376417miata? " have already been cracked…
Ashley Madison
In July 2015, a group calling itself “The Impact Team” stole the user data of Ashley Madison, a website aiming to enable extramarital affairs.
• Passwords were salted, and hashed with bcrypt.
• This means strong passwords are safe, but weak ones can still be brute-forced.
  – Five days of cracking revealed 4000 passwords out of 36M (about 0.01%).
    • 123456 (202), password (105), 12345 (99), qwerty (32), 12345678 (31), ashley (28), baseball (27), etc.
  – Caveat: maybe some users picked weak passwords because they entered throw-away data…
Case study: Firefox password mgt
(see other slides)
Limited disclosure schemes
Used by many banks: “please type digits 1, 3, 4 and the last”.
Addresses: malware on client; shoulder surfing; keyboard logger
Does not address: theft of password file from server
• Quiz: if failed, should the system ask the same or different subset?
Limited disclosure schemes
How to store these passwords?
• Store in the usual way, h(p,salt),salt? Doesn’t work!
• Store individual characters, h(c1,salt1),h(c2,salt2),…? Insecure!
• Something else?
One-Time Passwords (OTP)
Key properties:
• The password is changed each time
• The attacker cannot know it in advance,
  – real-time man-in-middle attacks remain possible
Lamport OTP Scheme
Based on OWF. Use hash chains, go backwards.
Let \(x_1 = h(x)\), \(x_2 = h(x_1)\), …, \(x_{1000} = h(x_{999})\).
Store \(x_{1000}\) on the server. Small storage. Fast.
Go backwards: passwords are \(x_{999}\), then \(x_{998}\), then…
Each \(x_i\) allows the user to log in only once.
If the user submits p, the server checks whether h(p) equals its stored value. If true, the stored value is replaced by p, and login is accepted.
User keeps a sheet with \(x_{999}, x_{998}, x_{997}, x_{996}\)…. Problem: can be photocopied…
  – and the user still has it, naively thinking it is secure…
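The hash chain can be sketched as follows, with SHA-256 standing in for the one-way function h. The seed value and the class name `LamportServer` are made up for illustration.

```python
import hashlib
import hmac

def h(value):
    """Stand-in one-way function."""
    return hashlib.sha256(value).digest()

def hash_chain(seed, n):
    """Return [x_1, ..., x_n] with x_1 = h(seed) and x_{i+1} = h(x_i)."""
    chain = []
    current = seed
    for _ in range(n):
        current = h(current)
        chain.append(current)
    return chain

class LamportServer:
    def __init__(self, last_link):
        self.stored = last_link  # x_n; the server never sees the seed

    def login(self, password):
        if not hmac.compare_digest(h(password), self.stored):
            return False
        self.stored = password  # the next login must present the preceding link
        return True

chain = hash_chain(b"hypothetical secret seed", 1000)
server = LamportServer(chain[-1])   # server stores x_1000
print(server.login(chain[-2]))      # x_999: accepted
print(server.login(chain[-2]))      # replay of x_999: rejected
print(server.login(chain[-3]))      # x_998: accepted
```

An eavesdropper who sees x_999 would have to invert h to obtain x_998, which is why each captured password is useless for the next login.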
Time-synchronized OTP
• The code is fixed for 30-60 s. Window of opportunity: 30 s, during which a second session can be opened from another location…
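The slides do not name a specific scheme. One standardised instance of a time-synchronised OTP is TOTP (RFC 6238), built on HOTP (RFC 4226), and the sketch below follows those two RFCs. The 30 s step and the one-step drift tolerance are assumptions of this sketch.

```python
import hashlib
import hmac
import struct
import time

def hotp(key, counter, digits=6):
    """HMAC-based one-time code (RFC 4226 dynamic truncation)."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

def totp(key, now=None, step=30, digits=6):
    """Time-synchronised code: the counter is the number of elapsed time steps."""
    if now is None:
        now = time.time()
    return hotp(key, int(now) // step, digits)

def verify(key, submitted, now=None, step=30, drift_steps=1):
    """Accept codes from adjacent steps to tolerate clock drift."""
    if now is None:
        now = time.time()
    current = int(now) // step
    return any(
        hmac.compare_digest(hotp(key, current + d), submitted)
        for d in range(-drift_steps, drift_steps + 1)
        if current + d >= 0
    )

shared_key = b"12345678901234567890"  # test key from the RFCs, not a real secret
print(totp(shared_key))
```

Note that `verify` accepts the same code repeatedly within its window; a server that wants to close the second-session gap has to remember which codes it has already accepted.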
Challenge-Response Protocols
Challenge is a random nonce
A and B share the key K.
B → A: \(\mathit{random}_B\)
A → B: A, \(\mathrm{MAC}_K(\mathit{random}_B, B)\)
“Challenge” is time or counter
A and B share the key K.
A → B: A, \(t_c\), \(\mathrm{MAC}_K(t_c, B)\)
Can also use a block or stream cipher, used as a MAC.
Counters, nonces, timestamps
• Challenge = random nonce (best solution)
• Counter/sequence number as a “static” challenge
  – E.g., wireless car key sends id, \(\mathrm{MAC}_K(\mathit{id}, \mathit{counter})\), where K is shared between key and car
• Time as a challenge is difficult to make secure
  – Need reliable, synchronised clocks
  – Challenge to make granularity small enough
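A sketch of the random-nonce variant, with HMAC-SHA-256 standing in for MAC_K. The class and function names, the 16-byte nonce and the length-prefixed encoding of the MAC input are assumptions of this sketch, not part of the protocol as given on the slides.

```python
import hashlib
import hmac
import secrets

def mac(key, *fields):
    """MAC_K over length-prefixed fields (prefixing avoids ambiguous concatenation)."""
    m = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        m.update(len(field).to_bytes(4, "big") + field)
    return m.digest()

class Verifier:  # party B
    def __init__(self, key, identity=b"B"):
        self.key, self.identity = key, identity
        self.pending = None

    def challenge(self):
        self.pending = secrets.token_bytes(16)  # fresh random nonce
        return self.pending

    def check(self, response):
        if self.pending is None:
            return False
        nonce, self.pending = self.pending, None  # each nonce is used once
        return hmac.compare_digest(response, mac(self.key, nonce, self.identity))

def prover_response(key, nonce, verifier_identity=b"B"):  # party A
    return mac(key, nonce, verifier_identity)

K = secrets.token_bytes(32)           # shared key, set up out of band
b = Verifier(K)
nonce = b.challenge()                 # B -> A: random_B
response = prover_response(K, nonce)  # A -> B: MAC_K(random_B, B)
print(b.check(response))              # fresh response: accepted
b.challenge()
print(b.check(response))              # replay against a new nonce: rejected
```

Because the nonce is fresh and unpredictable, a recorded response is useless later, and neither side needs a synchronised clock or a stored counter.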
Unilateral vs bilateral authentication
Unilateral auth is historically very popular. Examples:
• password -> login
• SIM card -> GSM base station (fixed in 3G)
• offline bank card transactions -> Point of Sale terminal
Problems:
• login page spoofing etc.
• false GSM base stations
• false ATMs
Bilateral authentication
Really important on the web, to try to prevent phishing attacks
• TLS
Problems: • Key certification