CS363 Week 15 - Monday
Last time
What did we talk about last time? Ethics
Questions?
Project 3
Security tidbit of the day
Hacktivist group Anonymous is worried that people in developing countries cannot always rely on dedicated Internet connections and cell towers
They are engineering Airchat, a free tool that uses radio waves to send messages and connect to the Internet without any centralized infrastructure
Tracking people would be harder
No connection between IP and location
Follow the story: http://www.ibtimes.co.uk/anonymous-airchat-aims-allow-communication-without-needing-phone-internet-access-1445888
Case Studies of Ethics
Case VII: Accuracy of Information
Emma is a researcher who is analyzing the nutritional content of a cereal called Raw Bits
She gets a statistical programmer Paul to analyze the data
His analysis shows that Raw Bits is not nutritious and may be harmful
He suggests that another set of correlations could show Raw Bits in a more favorable light
He claims he could argue any side of any issue with statistics
Analysis
Is it ethical for Paul to suggest analyzing data to support two different conclusions?
Is Paul obligated to present both positive and negative analyses? Is he responsible for their use?
Is it ethical for Emma to accept positive or negative conclusions if she doesn't understand the statistics?
She suspects that the company will
Get a new researcher if she sends them only the negative results
Publicize only the positive results if she sends them both
What course of action should she take?
Case VIII: Ethics of Hacking or Cracking
Goli is an independently wealthy computer security specialist
She works only for fun
She attacks commercial products for vulnerabilities and is good at finding them
She probes systems on the Internet and, when she finds vulnerabilities, she contacts the owners of the sites to offer her services to fix them
She loves good pastry and plants programs that slow the performance of web sites of bakeries that don't use enough butter in their pastries
Analysis
Is it ethical for Goli to probe for vulnerabilities in systems?
What if her probing sometimes causes failures or performance problems?
How much and to whom should she report the vulnerabilities she finds?
What if she damaged websites based on an issue more serious than butter?
What if she only damaged websites for companies with records of human rights abuses?
Week 1 Review
Terminology
A vulnerability is a weakness in a security system
A threat is a set of circumstances that can cause loss or harm
Performing an attack is exploiting a vulnerability
A control is a protection against an attack by reducing a vulnerability
“A threat is blocked by control of a vulnerability.”
Threats
• Interception: Someone read something they weren’t supposed to
• Interruption: Something became unavailable or unusable
• Modification: Someone changed something they weren’t supposed to
• Fabrication: Someone created fake things
Method, opportunity, motive
As with traditional crime, an attacker must have these three things:
• Method: Skills and tools to perform the attack
• Opportunity: Time and access to accomplish the attack
• Motive: A reason to perform the attack
The basics of computer security:
Confidentiality
Integrity
Availability
Confidentiality
You don’t want other people to be able to read your stuff
Some of your stuff, anyway
Cryptography, the art of encoding information so that it is only readable by those knowing a secret (key or password), is a principal tool used here
Confidentiality is also called secrecy or privacy
Integrity
You don’t want people to mess up your stuff
You want to know:
That your important data cannot be easily changed
That outside data you consider trustworthy cannot be easily changed either
There are many different ways that data can be messed up, and every application has different priorities
Availability
You want to be able to use your stuff
Many attacks are based on denial of service, simply stopping a system from functioning correctly
Availability can mean any of the following:
The service is present in usable form
There is enough capacity for authorized users
The service is making reasonable progress
The service completes in an acceptable period of time
Methods of defense
There are five common ways of dealing with attacks, many of which can be used together
• Prevent: Remove the vulnerability from the system
• Deter: Make the attack harder to execute
• Deflect: Make another target more attractive (perhaps a decoy)
• Detect: Discover that the attack happened, immediately or later
• Recover: Recover from the effects of the attack
Controls
Many different controls can be used to achieve the five methods of defense
Week 2 Review
Terminology
A system popularized by Ron Rivest uses Alice and Bob as the two parties communicating
Carl or another “C” name can be used if three people are involved
Trent is a trusted third party
Eve is used for an evil user who often eavesdrops
Mallory is used for a malicious user who is usually trying to modify messages
More terminology
Encryption takes a message and hides its meaning
Decryption is the reverse process
Encode and encipher can mean the same as encrypt
Decode and decipher can mean the same as decrypt
A system for encrypting and decrypting messages is a cryptosystem
Plaintext (often represented as P) is the original message
Ciphertext (often represented as C) is the encrypted version
E() and D() are used as functions to represent the encryption and decryption processes
C = E(P)
P = D(C)
Encryption algorithms
The algorithms for encryption often rely on a secret piece of information, called a key
We can notate the use of a specific key in either of the two following ways:
$C = E_K(P)$
$C = E(K, P)$
In symmetric (or private key) encryption, the encryption key and the decryption key are the same
In asymmetric (or public key) encryption, the encryption key and the decryption key are different
Symmetric vs. asymmetric
[Figure: Symmetric Encryption: Plaintext P, Encryption with Key K, Ciphertext C, Decryption with Key K, Plaintext P]
[Figure: Asymmetric Encryption: Plaintext P, Encryption with Encryption Key KE, Ciphertext C, Decryption with Decryption Key KD, Plaintext P]
Cryptography and cryptanalysis
Cryptography means “secret writing”
A cryptographer is someone who specializes in using cryptography to make messages secret
A cryptanalyst is someone who is trying to break the cryptography and discover the plaintext or the key
A cryptanalyst could:
▪ Break a single message
▪ Find patterns in the encryption that allow future messages to be decrypted
▪ Discover information in the messages without fully decrypting them
▪ Discover the key
▪ Find weaknesses in the implementation of the encryption
▪ Find weaknesses in the encryption that may or may not be able to lead to breaks in the future
Cryptanalysis
There are two kinds of security for encryption schemes
Unconditionally secure
▪ No matter how much time or energy an attacker has, it is impossible to determine the plaintext
Computationally secure
▪ The cost of breaking the cipher exceeds the value of the encrypted information
▪ The time required to break the cipher exceeds the useful lifetime of the information
We focus on computationally secure, because there is only one practical system that is unconditionally secure
"I want them to remain secret for as long as men are capable of evil" -Avi from Cryptonomicon
Attacks
Cryptography is supposed to prevent people from reading certain messages
Thus, we measure a cryptosystem based on its resistance to an adversary or attacker
Kinds of attacks:
Ciphertext only: Attacker only has access to an encrypted message, with a goal of decrypting it
Known plaintext: Attacker has access to a plaintext and its matching ciphertext, with a goal of discovering the key
Chosen plaintext: Attacker may ask to encrypt any plaintext, with a goal of discovering the key
Others, less common
Substitution ciphers
Substitution ciphers cover a wide range of possible ciphers, including the shift cipher
In a substitution cipher, each element of the plaintext is replaced by some corresponding element of the ciphertext
Monoalphabetic substitution ciphers always use the same substitutions for a letter (or given sequence of letters)
Polyalphabetic substitution ciphers use different substitutions throughout the encryption process
Example: Simple Monoalphabetic Substitution Cipher
We can map to a random permutation of letters
For example:
E(“MATH IS GREAT”) = “UIYP TQ ABZIY”
26! possible permutations
Hard to check every one
Plaintext:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext: I N O V Z H A P T R G E U F D W S B Q Y L K M J C X
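The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names `encrypt` and `decrypt` are made up for the example, and the table is the one shown on the slide.

```python
import string

# Substitution table copied from the slide: plaintext alphabet -> ciphertext alphabet.
PLAIN = string.ascii_uppercase
CIPHER = "INOVZHAPTRGEUFDWSBQYLKMJCX"

ENCRYPT_TABLE = str.maketrans(PLAIN, CIPHER)
DECRYPT_TABLE = str.maketrans(CIPHER, PLAIN)


def encrypt(plaintext: str) -> str:
    """Replace each letter using the table; spaces and punctuation pass through."""
    return plaintext.upper().translate(ENCRYPT_TABLE)


def decrypt(ciphertext: str) -> str:
    """Invert the substitution by applying the reversed table."""
    return ciphertext.upper().translate(DECRYPT_TABLE)


if __name__ == "__main__":
    print(encrypt("MATH IS GREAT"))
    print(decrypt(encrypt("MATH IS GREAT")))
```

Because the same letter always maps to the same letter, letter frequencies in the plaintext survive in the ciphertext.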
Frequency Attack
English language defeats us
Some letters are used more frequently than others: ETAOINSHRDLU
Longer texts will behave more and more consistently
Make a histogram, break the cipher
Digram analysis can help too
Vigenère cipher
The Vigenère cipher is a form of polyalphabetic substitution cipher
In this cipher, we take a key word and repeat it, over and over, until it is as long as the message
Then, we add the repetitions of keywords to our message mod 26
Vigenère example
Key: BENCH
Plaintext: A LIMERICK PACKS LAUGHS ANATOMICAL
Key:        B E N C H B E N C H B E N C H B E N C H B E N C H B E N C H
Plaintext:  A L I M E R I C K P A C K S L A U G H S A N A T O M I C A L
Ciphertext: B P V O L S M P M W B G X U S B Y T J Z B R N V V N M P C S
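The example above can be reproduced with a short Python sketch. The function name `vigenere` and its `decrypt` flag are assumptions made for this illustration; it treats A as 0 through Z as 25 and drops anything that is not a letter.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract) the repeated key to the message, letter by letter, mod 26."""
    sign = -1 if decrypt else 1
    # Repeat the key for as long as the message needs it.
    shifts = cycle([ord(k) - ord("A") for k in key.upper()])
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":  # skip spaces and punctuation
            shift = next(shifts)
            out.append(chr((ord(ch) - ord("A") + sign * shift) % 26 + ord("A")))
    return "".join(out)


if __name__ == "__main__":
    ciphertext = vigenere("A LIMERICK PACKS LAUGHS ANATOMICAL", "BENCH")
    print(ciphertext)
    print(vigenere(ciphertext, "BENCH", decrypt=True))
```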
Cryptanalysis of Vigenère
The index of coincidence measures the differences in the frequencies in the ciphertext
It is the probability that two randomly chosen letters from the ciphertext are the same
$IC = \frac{1}{N(N-1)} \sum_{i=0}^{25} F_i (F_i - 1)$
Period | 1 | 2 | 3 | 4 | 5 | 10 | Large
Expected IC | 0.066 | 0.052 | 0.047 | 0.045 | 0.044 | 0.041 | 0.038
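As a rough sketch, the index of coincidence can be computed directly from letter counts. Here N is taken to be the number of letters in the ciphertext and F_i the number of times letter i occurs; the function name `index_of_coincidence` is made up for this example.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn (without replacement) from text are equal."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    # Sum of F_i * (F_i - 1) over all letters, divided by N * (N - 1).
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


if __name__ == "__main__":
    print(index_of_coincidence("BPVOLSMPMWBGXUSBYTJZBRNVVNMPCS"))
```

Short texts give noisy values, so the comparison with the expected IC table is only meaningful for reasonably long ciphertexts.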
Kasiski method
If the IC indicates that a period of more than 1 is being used, look for repeated sequences
Look at the gaps between long sequences
Try to find the GCD of gaps between long sequences
If you have a reasonable guess for the length of the key, break the ciphertext into groups based on the corresponding letter of the key
If the IC is high (in the range of a single letter), then you have probably found the key length
After the length is known…
The rest is easy
Try various shifts for each letter of the key so that high frequency letters (E, T, A) occur with high frequency and low frequency letters (Q, X, Z) occur with low frequency
Guess and check
One-Time Pad
A One-Time Pad is similar to the Vigenère cipher, except that the key is as long as the message
What will this do to the index of coincidence?
Any given ciphertext could be decrypted into any plaintext, provided that you have the right key
One-Time Pad example
Key: THISISTHESECRETPASSWORD
Plaintext: SOMEBODY SHOUTED MCINTYRE
Plaintext:  S O M E B O D Y S H O U T E D M C I N T Y R E
Key:        T H I S I S T H E S E C R E T P A S S W O R D
Ciphertext: L V U W J G W F W Z S W K I W B C A F P M I H
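A small Python sketch can reproduce this example and also show why any ciphertext can be decrypted into any plaintext of the same length. The helper names `add_mod26` and `sub_mod26` and the alternative plaintext are invented for this illustration.

```python
A = ord("A")


def add_mod26(text: str, key: str) -> str:
    """Encrypt: add key letters to text letters mod 26 (key must be at least as long)."""
    if len(key) < len(text):
        raise ValueError("one-time pad key must be as long as the message")
    return "".join(chr((ord(t) - A + ord(k) - A) % 26 + A) for t, k in zip(text, key))


def sub_mod26(text: str, key: str) -> str:
    """Decrypt: subtract key letters from text letters mod 26."""
    return "".join(chr((ord(t) - ord(k)) % 26 + A) for t, k in zip(text, key))


if __name__ == "__main__":
    key = "THISISTHESECRETPASSWORD"
    ciphertext = add_mod26("SOMEBODYSHOUTEDMCINTYRE", key)
    print(ciphertext)

    # Pick any other plaintext of the same length (hypothetical message).
    other = "ATTACKTHEBAKERYATNOONXX"
    fake_key = sub_mod26(ciphertext, other)  # the key that "explains" it
    print(sub_mod26(ciphertext, fake_key))   # prints the other plaintext
```

Without the real key, an attacker has no way to prefer one candidate plaintext over another.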
Perfect secrecy
A One-Time Pad has the property of perfect secrecy or Shannon secrecy
Perfect secrecy means that P(M) = P(M|C)
Thus, learning the ciphertext tells you nothing about the plaintext
One-Time Pad weaknesses
You can only use it one time
Otherwise, recovering the key is trivial
Completely vulnerable to known plaintext attack
The key is as long as the message
If you have a way of sending a key that long securely, why not send the message the same way?
Generating keys with appropriate levels of randomness presents a problem
Week 3 Review
Block ciphers
A block cipher is a symmetric key cipher that works on a block of data of a given size
For compatibility with hardware, block sizes are often powers of two: 64 bits, 128 bits, 256 bits, etc.
Block ciphers are a fundamental part of many modern cryptosystems
To encrypt a message longer than a single block:
First break the message into blocks
Then, each block could be encrypted individually
Or data from the first block can be used in the encryption of the second, and so on
DES
Data Encryption Standard
DES is a typical block cipher
It was chosen as the government's standard for encryption in 1976 (but has since been deprecated)
DES works on blocks 64 bits in size
DES uses a 56 bit key
NSA helped design it… amidst some controversy
DES internals
DES has 16 rounds
The book calls them cycles
In each round, the input is broken into 2 halves, manipulated, and combined with part of the key
[Figure: one DES round. Input, Permutation, Left0 and Right0, function f with Key1, XOR (+), Left1 and Right1]
S-boxes
DES uses bitwise operations as well as lookup tables
DES has 8 substitution boxes (S-boxes) which take 6 bits of data and give back 4
The function from the F circle
The expansion permutation takes 32 input bits and expands them into 48 bits while permuting them
16 bits are repeated
These 48 bits are XORed with the round key
The resulting 48 bits are substituted through S-boxes which produces a 32 bit result
The final 32 bits are permuted
[Figure: Expansion Permutation, then XOR with Key, then S-box, then P-box]
NSA controversy
The NSA tinkered with DES
They shortened the key length from the original 128 bits of Lucifer to 56
They changed the S-boxes
People were concerned that the NSA had introduced a trapdoor so that they could read messages
Eventually, the NSA released information about the choice of S-boxes:
▪ No S-box is a linear or affine function of its input
▪ Changing 1 bit of the S-box input changes at least 2 bits of its output
▪ If a single bit is held constant, changing the others should not radically change the total number of 1s or 0s in the output
DES strengths
DES is fast
Easy to implement in software or hardware
Encryption is the same as decryption
Triple DES is still standard for many financial applications
Resistant to differential and linear cryptanalysis ($2^{47}$ and $2^{43}$ known pairs required, respectively)
DES weaknesses
Short key size
Brute force attack by EFF in 1998 in 56 hours, then in 1999 in just over 22 hours
Brute force attack by University of Bochum and Kiel in 9 days in 2006 (but, using a machine costing only $10,000)
If you could check 1,000,000,000 keys per second (which is unlikely with a commodity PC), it would take an average of 417 days to recover a key
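The 417-day figure follows from a quick calculation: on average, half of the 2^56 keys must be tried. A minimal sketch of that arithmetic, with a made-up function name:

```python
SECONDS_PER_DAY = 86_400


def average_days_to_brute_force(key_bits: int, keys_per_second: float) -> float:
    """Expected search time when, on average, half the key space is tried."""
    average_trials = 2 ** (key_bits - 1)
    return average_trials / keys_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # 56-bit DES key at one billion keys per second.
    print(round(average_days_to_brute_force(56, 1_000_000_000)))
```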
Double DES attack
[Figure: two tables of toy values for keys K1 through K8, labeled Encrypt P1 and Decrypt C1; the value 688857766 appears in both tables (under K2 in one and K7 in the other)]
Two pairs of plaintexts and ciphertexts are needed
Encrypt P1 with all possible keys and save them
Decrypt C1 with all possible keys
If the result matches anything in the list, use the key to encrypt P2
If that matches C2, you win!
On the left, I show all the decryptions, but only the encryptions need to be stored
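The steps above can be sketched on a toy cipher. Everything here is an assumption for illustration: `toy_encrypt` is an invented, insecure 16-bit block cipher with an 8-bit key standing in for DES, chosen only so that the whole key space can be enumerated instantly.

```python
MASK = 0xFFFF                      # 16-bit blocks
MULT = 0x2F35                      # odd, so invertible mod 2**16
MULT_INV = pow(MULT, -1, 1 << 16)  # modular inverse (Python 3.8+)


def rotl(x: int, r: int) -> int:
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr(x: int, r: int) -> int:
    return ((x >> r) | (x << (16 - r))) & MASK


def toy_encrypt(key: int, block: int) -> int:
    """Invented toy cipher (NOT DES): add key, rotate, XOR key material, multiply."""
    x = rotl((block + key) & MASK, 5)
    x ^= (key * 0x0123) & MASK
    return (x * MULT) & MASK


def toy_decrypt(key: int, block: int) -> int:
    """Undo toy_encrypt step by step in reverse order."""
    x = (block * MULT_INV) & MASK
    x ^= (key * 0x0123) & MASK
    return (rotr(x, 5) - key) & MASK


def double_encrypt(k1: int, k2: int, block: int) -> int:
    return toy_encrypt(k2, toy_encrypt(k1, block))


def meet_in_the_middle(p1: int, c1: int, p2: int, c2: int) -> list:
    # Encrypt P1 under every first key and store the middle values.
    table = {}
    for k1 in range(256):
        table.setdefault(toy_encrypt(k1, p1), []).append(k1)
    found = []
    # Decrypt C1 under every second key and look for a match in the table.
    for k2 in range(256):
        for k1 in table.get(toy_decrypt(k2, c1), []):
            # Confirm the candidate pair on the second plaintext/ciphertext pair.
            if double_encrypt(k1, k2, p2) == c2:
                found.append((k1, k2))
    return found


if __name__ == "__main__":
    k1, k2 = 0x3A, 0xC5
    p1, p2 = 0x1234, 0xBEEF
    print(meet_in_the_middle(p1, double_encrypt(k1, k2, p1), p2, double_encrypt(k1, k2, p2)))
```

The attack does about 2 × 256 cipher operations instead of 256 × 256, which is why doubling the key length this way adds very little security.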
Triple DES
Although susceptible to a brute force attack, DES has no other major weaknesses
Double DES can be defeated by an extension of the brute force attack
What about triple DES?
Let $E_K(X)$ and $D_K(X)$ be encryption and decryption using DES with key K
Triple DES uses keys K1, K2, and K3
$C = E_{K1}(D_{K2}(E_{K3}(M)))$
Setting K1 = K2 = K3 allows for compatibility with single DES systems
Triple DES is still a standard for financial transactions with no known practical attacks
AES
Advanced Encryption Standard
Block cipher designed to replace DES
Block size of 128 bits
Key sizes of 128, 192, and 256 bits
Like DES, has a number of rounds (10, 12, or 14 depending on key size)
Originally called Rijndael, after its Belgian inventors
Competed with 14 other algorithms over a 5 year period before being selected by NIST
AES internals
AES keeps an internal state of 128 bits in a 4 x 4 table of bytes
There are four operations on the state:
Substitute bytes
Shift rows
Mix columns
Add round key
AES rounds
AES supports key sizes of 128, 192, and 256 bits
Rijndael supports unlimited key size, in principle, as well as other block sizes
128 bit keys use 10 rounds, 192 use 12, and 256 use 14
First Round: Add round key
Normal Round: Substitute bytes, Shift rows, Mix columns, Add round key
Last Round: Substitute bytes, Shift rows, Add round key
AES pros and cons
Strengths
Strong key size
Fast in hardware and software
Rich algebraic structure
Well-studied, open standard
Weaknesses
Almost none
A few theoretical attacks exist on reduced round numbers of AES
No practical attacks other than side channel attacks
Side channel attacks
Attacks that rely on timing, measuring cache, energy consumption, or other ways an implementation leaks data are called side channel attacks
Several practical side channel attacks for AES do exist
In 2005, Bernstein found a cache-timing attack that broke an OpenSSL implementation of AES using 200 million chosen plaintexts and a server that would give him precise timing data
Later in 2005, Osvik et al. found an attack that recovered a key after 800 encryptions in only 65 milliseconds, with software running on the target machine
In 2009, Saha et al. found an attack on hardware using differential fault analysis to recover a key with a complexity of $2^{32}$
In 2010, Bangerter et al. found a cache-timing attack that required no knowledge of plaintexts or ciphertexts and could work in about 3 minutes after monitoring 100 encryptions
AES vs. DES
Property | DES | AES
Date | 1976 | 1999
Block size | 64 bits | 128 bits
Key length | 56 bits | 128, 192, 256 bits
Encryption primitives | Substitution, permutation | Substitution, shift, bit mixing
Cryptographic primitives | Confusion, diffusion | Confusion, diffusion
Design | Open | Open
Design rationale | Closed | Open
Selection process | Secret | Secret with public comment
Source | IBM with NSA help | Independent Belgians
Security | Broken if you’ve got the resources | No practical attacks yet
Public key cryptography
Sometimes, we need something other than a shared secret
We want a public key that anyone can use to encrypt a message to Alice
Alice has a private key that can decrypt such a message
The public key can only encrypt messages; it cannot be used to decrypt them
Prime
RSA depends in large part on the difficulty of factoring large composite numbers (particularly those that are a product of only 2 primes)
For those of you who aren't in Formal Methods, an integer p is prime if
p > 1
p is not divisible by any positive integers other than 1 and itself
Fundamental theorem of arithmetic
Any integer greater than 1 can be factored into a unique series of prime factors:
Example: $52 = 2^2 \cdot 13$
Two integers a and b (greater than 1) are relatively prime or coprime if and only if a shares no prime factors with b
Greatest common divisor
The greatest common divisor or GCD of two numbers gives the largest factor they have in common
Example:
GCD( 12, 18 ) =
GCD( 42, 56 ) =
For small numbers, we can determine GCD by doing a complete factorization
Euclid's algorithm
For large numbers, we can use Euclid's algorithm to determine the GCD of two numbers
Algorithm GCD( a, b )
1. If b = 0
▪ Return a
2. Else
▪ temp = a mod b
▪ a = b
▪ b = temp
3. Goto Step 1
Example: GCD( 1970, 1066 )
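The algorithm above translates almost line for line into Python. This is a minimal sketch with the loop replacing the "Goto Step 1"; the trace printing is an addition for illustration only.

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is 0."""
    while b != 0:
        a, b = b, a % b
    return a


if __name__ == "__main__":
    # Show each step for the slide's example.
    a, b = 1970, 1066
    while b != 0:
        print(f"{a} mod {b} = {a % b}")
        a, b = b, a % b
    print("GCD =", a)
```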
Week 4 Review
Fermat’s Little Theorem
If p is prime and a is a positive integer not divisible by p, then:
$a^{p-1} \equiv 1 \pmod{p}$
Euler's in the mix too
Euler’s totient function $\phi(n)$
$\phi(n)$ = the number of positive integers less than n and relatively prime to n (including 1)
If p is prime, then $\phi(p) = p - 1$
If we have two primes p and q (which are different), then:
$\phi(pq) = \phi(p) \cdot \phi(q) = (p - 1)(q - 1)$
Take that, Fermat
Euler’s Theorem:
For every a and n that are relatively prime,
$a^{\phi(n)} \equiv 1 \pmod{n}$
This generalizes Fermat’s Theorem because $\phi(p) = p - 1$ if p is prime
Proof is messier
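Both theorems are easy to spot-check numerically. The sketch below is a brute-force check, not a proof; `totient` is a made-up helper that counts directly from the definition, so it is only practical for small n (n ≥ 2).

```python
from math import gcd


def totient(n: int) -> int:
    """Count the positive integers less than n that are relatively prime to n."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)


def euler_holds(n: int) -> bool:
    """Check a**totient(n) == 1 (mod n) for every a in 1..n-1 coprime to n."""
    phi = totient(n)
    return all(pow(a, phi, n) == 1 for a in range(1, n) if gcd(a, n) == 1)


if __name__ == "__main__":
    print(totient(7), totient(35))        # a prime, and a product of two primes
    print(euler_holds(7), euler_holds(35))
```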
RSA Algorithm
Named for Rivest, Shamir, and Adleman
Take a plaintext M converted to an integer
Create a ciphertext C as follows:
$C = M^e \bmod n$
Decrypt C back into M as follows:
$M = C^d \bmod n = (M^e)^d \bmod n = M^{ed} \bmod n$
The pieces
Term | Details | Source
M | Message to be encrypted | Sender
C | Encrypted message | Computed by sender
n | Modulus, n = pq | Known by everyone
p | Prime number | Known by receiver
q | Prime number | Known by receiver
e | Encryption exponent | Known by everyone
d | Decryption exponent | Computed by receiver
$\phi(n)$ | Totient of n | Known by receiver
How it Works
To encrypt:
$C = M^e \bmod n$
e is often 3, but is always publicly known
To decrypt:
$M = C^d \bmod n = M^{ed} \bmod n$
We get d by finding the multiplicative inverse of e mod $\phi(n)$
So, $ed \equiv 1 \pmod{\phi(n)}$
Why it Works
We know that $ed \equiv 1 \pmod{\phi(n)}$
This means that $ed = k\phi(n) + 1$ for some nonnegative integer k
$M^{ed} = M^{k\phi(n) + 1} \equiv M \cdot (M^{\phi(n)})^k \pmod{n}$
By Euler’s Theorem $M^{\phi(n)} \equiv 1 \pmod{n}$
So, $M \cdot (M^{\phi(n)})^k \equiv M \pmod{n}$
Why it’s safe
You can’t compute the multiplicative inverse of e mod $\phi(n)$ unless you know what $\phi(n)$ is
If you know p and q, finding $\phi(n)$ is easy
Finding $\phi(n)$ is equivalent to finding p and q by factoring n
No one knows an efficient way to factor a large composite number
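A toy run of RSA makes the pieces concrete. The primes, exponent, and message below are illustrative values chosen for this sketch (far too small for real use), and real RSA also needs padding, which is omitted here.

```python
# Toy parameters (assumed for illustration; real keys use primes of 1024+ bits).
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)    # totient of n, known only to the receiver
e = 17                     # public exponent, relatively prime to phi
d = pow(e, -1, phi)        # private exponent: inverse of e mod phi (Python 3.8+)


def encrypt(m: int) -> int:
    """C = M^e mod n, computable by anyone who knows (n, e)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """M = C^d mod n, computable only with the private exponent d."""
    return pow(c, d, n)


if __name__ == "__main__":
    message = 65
    ciphertext = encrypt(message)
    print("n =", n, "phi =", phi, "d =", d)
    print("ciphertext =", ciphertext)
    print("decrypted  =", decrypt(ciphertext))
```

An attacker who could factor n = 3233 into 61 and 53 could recompute phi and d in the same way, which is why the security rests on factoring being hard for large n.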
Key management
Once you have great cryptographic primitives, managing keys is still a problem
How do you distribute new keys?
When you have a new user
When old keys have been cracked or need to be replaced
How do you store keys?
As with the One Time Pad, if you could easily send secret keys confidentially, why not send messages the same way?
Notation for sending
We will refer to several schemes for sending data
Let X and Y be parties and Z be a message
{ Z } k means message Z encrypted with key k
Thus, our standard notation will be:
X → Y: { Z } k
Which means, X sends message Z, encrypted with key k, to Y
X and Y will be participants like Alice and Bob and k will be a clearly labeled key
A || B means concatenate message A with B
Kinds of keys
Typical to key exchanges is the idea of interchange keys and session keys
An interchange key is a key associated with a particular user over a (long) period of time
A session key is a key used for a particular set of communication events
Why have both kinds of keys?
Classical exchange: Attempt 0
If Bob and Alice have no prior arrangements, classical cryptosystems require a trusted third party Trent
Trent and Alice share a secret key k_Alice and Trent and Bob share a secret key k_Bob
Here is the protocol:
1. Alice → Trent: { request session key to Bob } k_Alice
2. Trent → Alice: { k_session } k_Alice || { k_session } k_Bob
3. Alice → Bob: { k_session } k_Bob
Classical key exchange
Purpose
Exchange a session key between two parties
Weaknesses
A trusted third party is required
Protocols are complicated
Some protocols have hard to spot security risks
Practice looking for the holes in the protocols
They always have a "man in the middle" aspect
Always assume that Eve can completely control all communication
Public key exchange
Suddenly, the sun comes out!
Public key exchanges should be really easy
The basic outline is:
1. Alice → Bob: { k_session } e_Bob
e_Bob is Bob's public key
Only Bob can read it, everything's perfect!
Problems can still happen if parties cannot get each other’s public keys reliably
Hash function definition
A cryptographic (or one-way) hash function (called a cryptographic checksum in the book) takes a variable sized message M and produces a fixed-size hash code H(M)
Not the same as hash functions from data structures
The hash code produced is also called a digest
It can be used to provide authentication of both the integrity and the sender of a message
It allows us to store some information about a message that an attacker cannot use to recover the message
Collisions
When two messages hash to the same value, this is called a collision
Because of the pigeonhole principle, collisions are unavoidable
The key feature we want from our hash functions is that collisions are difficult to predict
Crucial properties
• Preimage Resistance: Given a digest, should be hard to find a message that would produce it (one-way property)
• Second Preimage Resistance: Given a message m, it should be hard to find a different message that has the same digest
• Collision Resistance: Should be hard to find any two messages that hash to the same digest (collision)
Additional properties
• Avalanching: A small change in input should correspond to a large change in output
• Applicability: Hash function should work on a block of data of any size
• Uniformity: Output should be a fixed length
• Speed: It should be fast to compute a digest in software and hardware, no longer than retrieval from secondary storage
Password dilemma resolved
Instead of storing the actual passwords, Windows and Unix machines store the hash of the passwords
When someone logs on, the operating system hashes the password and compares it to the stored version
No one gets to see your original password!
Any Problems?
What’s the probability that Ahmad has the same password (or a password that hashes to the same value) as Bai Li?
Very small!
What’s the probability that anyone has the same password (or a password that hashes to the same value) as anyone else?
Not nearly as small!
Salt
If you are the administrator of a large system, you might notice that two people have the same password hash
With people's password habits, the odds are very high that their passwords are the same
To add to the semantic security of such schemes, random salt (often 8 bits or so) is added to the end of a password
When checking a password against the hash, the system tries all $2^8$ possible values for the salt
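The scheme described above can be sketched as follows. This is an illustration under stated assumptions, not how any particular operating system does it: SHA-256 stands in for the system's hash, the function names are invented, and the salt is a single random byte that is not stored, so the checker tries all 256 values. Production systems typically store a longer salt alongside the hash and use a deliberately slow password hashing function.

```python
import hashlib
import secrets

SALT_BITS = 8  # "often 8 bits or so", per the slide


def hash_password(password: str, salt: int) -> str:
    """Hash the password with one salt byte appended to the end."""
    return hashlib.sha256(password.encode("utf-8") + bytes([salt])).hexdigest()


def enroll(password: str) -> str:
    """Pick a random salt and keep only the resulting hash."""
    return hash_password(password, secrets.randbelow(1 << SALT_BITS))


def check_password(candidate: str, stored_hash: str) -> bool:
    """Try every possible salt value; accept if any of them reproduces the hash."""
    return any(
        hash_password(candidate, salt) == stored_hash
        for salt in range(1 << SALT_BITS)
    )


if __name__ == "__main__":
    stored = enroll("correct horse")
    print(check_password("correct horse", stored))
    print(check_password("wrong horse", stored))
```

Two users with the same password now usually end up with different stored hashes, so an administrator can no longer spot shared passwords at a glance.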
Common Hash Functions
MD5
Message Digest Algorithm 5
Very popular hashing algorithm
Designed by Ron Rivest (of RSA fame)
Digest size: 128 bits
Security
Completely broken
Reasonable size attacks ($2^{32}$) exist to create two messages with the same hash value
MD5 hashes are still commonly used to check to see if a download finished without error
SHA family
Secure Hash Algorithm
Created by NIST
SHA-0 was published in 1993, but it was replaced in 1995 by SHA-1
The difference between the two is only a single bitwise rotation, but the NSA said it was important
Digest size: 160 bits
Security
Mostly broken
Attacks running in $2^{51}$ to $2^{57}$ time exist
SHA-2 is a successor family of hash functions
224, 256, 384, 512 bit digests
Better security, but not as widely used
Designed by the NSA
SHA-3 is now available
Variable length digests
Week 5 Review
Birthday attack
If a hash value is made up of k bits, $2^k$ can be big
So, we need to check one hash against $2^{k-1}$ other hashes to have a 50% probability of matching
But, by the birthday paradox
We need a much smaller number to get a collision!
Number of hashes $\approx \sqrt{2(\ln 2)2^k} \approx 1.18 \cdot 2^{k/2}$
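The approximation is easy to evaluate for concrete sizes. In this sketch, `birthday_bound` is a made-up helper that takes the number of possible values (365 birthdays, or 2^k digests) and returns roughly how many random samples give a 50% chance of a collision.

```python
import math


def birthday_bound(num_values: int) -> float:
    """Approximate sample count for a 50% collision chance: sqrt(2 * ln 2 * N)."""
    return math.sqrt(2 * math.log(2) * num_values)


if __name__ == "__main__":
    print(birthday_bound(365))            # the classic birthday paradox
    for k in (128, 160, 256):
        # Express the result as a multiple of 2^(k/2).
        print(k, birthday_bound(2 ** k) / 2 ** (k // 2))
```

For a 128-bit digest such as MD5, this is about 2^64 hashes rather than the 2^127 needed to match one specific hash.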
IDEA Forms
Upcoming
Next time…
Review up to Exam 2
Richard Fenoglio presents
Reminders
Review Chapters 3 through 7
Keep cracking each other's Project 3
Final report due this Friday