![Page 1: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/1.jpg)
Language Modeling and Encryption on
Packet Switched Networks
Kevin McCurley
![Page 2: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/2.jpg)
A.A. Markov of St. Petersburg
1856-1922
Андрей Андреевич Марков
![Page 3: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/3.jpg)
The science of cryptology
1. Devise a mathematical model of communication
2. Devise a mathematical model of an adversary
3. Construct a cryptosystem
4. Prove that the theoretical adversary cannot break the cryptosystem (perhaps under assumptions)
![Page 4: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/4.jpg)
The application of cryptology
• Choose a construction for a cryptosystem
• Adapt it to your model of communication
• Watch it get broken by adversaries who don’t fit the model.
The choice of a model is at least as important as a proof.
![Page 5: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/5.jpg)
Security models
The real world
Complexity-theoretic security
Information-theoretic security
![Page 6: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/6.jpg)
Attacks that have fallen outside the scope of security models
• Electromagnetic radiation measurements
• Timing analysis
• Power analysis
• Fault analysis
• Acoustic attacks
• Cache attacks
…
![Page 7: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/7.jpg)
Micali and Reyzin, “Physically Observable Cryptography”
TCC 2004
• Extension of security models to include physical instantiation of computation.
• Better description of adversary results in a more robust model.
![Page 8: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/8.jpg)
Are these the only attacks?
• Micali & Reyzin address the physical act of computation
• What about the physical act of communication?
How do we even define communication?
![Page 9: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/9.jpg)
Claude E. Shannon
“The mathematical theory of communication” (1948)
“The communication theory of secrecy systems” (1949)
![Page 10: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/10.jpg)
Shannon’s linear model of communication (1948)
Information source → Transmitter → Channel → Receiver → Destination
Noise (enters at the channel)
A discrete Markov process on a finite domain
![Page 11: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/11.jpg)
Information-theoretic perfect secrecy
• Messages are drawn from an underlying known probability distribution
• An encryption system has perfect secrecy if P(ciphertext | plaintext) is independent of plaintext.
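The condition above can be written out as follows. This is a sketch of the standard formulation; the random-variable names M (plaintext) and C (ciphertext) are notation assumed here, not taken from the slide.

```latex
\[
  \Pr[C = c \mid M = m_0] \;=\; \Pr[C = c \mid M = m_1]
  \qquad \text{for all plaintexts } m_0, m_1 \text{ and all ciphertexts } c .
\]
\[
  \text{Equivalently, } \Pr[M = m \mid C = c] = \Pr[M = m]
  \quad \text{whenever } \Pr[C = c] > 0 .
\]
```

The second form says that observing the ciphertext does not change the adversary's a priori distribution on messages.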
![Page 12: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/12.jpg)
One time pad
The one time pad has perfect secrecy.
BUT: messages must all be the same length or else you lose perfect secrecy.
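A minimal sketch of why length matters for the one time pad, assuming a byte-wise XOR pad. The helper names `otp_encrypt` and `otp_decrypt` are hypothetical, and the two messages are the stock broker messages used as an example later in the talk.

```python
import secrets

def otp_encrypt(plaintext: bytes, pad: bytes) -> bytes:
    """XOR each plaintext byte with the matching pad byte."""
    if len(pad) != len(plaintext):
        raise ValueError("pad must be exactly as long as the message")
    return bytes(p ^ k for p, k in zip(plaintext, pad))

# Decryption is the same XOR with the same pad.
otp_decrypt = otp_encrypt

for message in (b"buy IBM", b"sell IBM"):
    pad = secrets.token_bytes(len(message))  # fresh random pad, used once
    ciphertext = otp_encrypt(message, pad)
    assert otp_decrypt(ciphertext, pad) == message
    # The ciphertext bytes are uniformly random, but their count is not hidden.
    print(len(ciphertext))
```

The two ciphertexts have different lengths, so an observer who knows the message space can tell them apart without touching the pad.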
![Page 13: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/13.jpg)
What about infinite message spaces?
Theorem. (Chor-Kushilevitz, 1989) It is impossible to construct an information-theoretically secure cryptosystem on a countably infinite message space.
![Page 14: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/14.jpg)
Theorem (Goldreich) A semantically secure encryption scheme cannot hide the length of the plaintext against polynomial time adversaries.
Are these just theoretical concerns, or do they show up in practice?
Is leaking the message length unavoidable?
![Page 15: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/15.jpg)
Example: communication to a stock broker with messages of “buy IBM” or “sell IBM”
Example: in a file system, sizes of many files may provide evidence that encrypted files are copies of known files
Example: in the military, voluminous communications may indicate a command or intelligence center with multimedia
What can be learned from the length of a message?
![Page 16: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/16.jpg)
How can we deal with message lengths?
• Pad the messages to be all the same length
• Keep talking at all times
In practical systems, bandwidth and storage are not free!
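The first option can be sketched as below, assuming a fixed message size and a one-byte length prefix. Both choices, and the names `BLOCK`, `pad_fixed`, and `unpad_fixed`, are illustrative assumptions rather than anything prescribed in the talk.

```python
BLOCK = 16  # hypothetical fixed message size in bytes

def pad_fixed(message: bytes, size: int = BLOCK) -> bytes:
    """Prefix a 1-byte length, then zero-fill to exactly `size` bytes."""
    if size > 256 or len(message) > size - 1:
        raise ValueError("message does not fit in the fixed size")
    return bytes([len(message)]) + message + bytes(size - 1 - len(message))

def unpad_fixed(padded: bytes) -> bytes:
    """Read the length prefix and return the original message."""
    length = padded[0]
    return padded[1:1 + length]

for message in (b"buy IBM", b"sell IBM"):
    padded = pad_fixed(message)
    assert unpad_fixed(padded) == message
    print(len(padded))  # the same size for every message
```

The padded message would then be encrypted. The cost is visible immediately: every message consumes `BLOCK` bytes regardless of its content, which is the bandwidth and storage price the slide refers to.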
![Page 17: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/17.jpg)
i3or uqpcs hrt nbqpdn 0xcae opx
How does one approach the question?
• Is it encrypted or just gibberish?
• What language is it?
• What is the character set?
• Who said it?
• When did they say it?
• How was it encrypted?
What is this the encryption of?
![Page 18: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/18.jpg)
A related problem: communication is segmented
• Human spoken language is broken into syllables, words, sentences.
• Human written communication is broken into paragraphs, chapters, articles, books.
• Movies are broken into frames.
• Internet communication is packet switched.
![Page 19: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/19.jpg)
Suppose I told you where the spaces were
# ####### ### ## ##### #### ###### ### ######## ######## #### #### #### ##### #### ## # #######
###### ########
A cryptanalyst would have a tremendous advantage in guessing the message!
I usually get my stuff from people who promised somebody else that they would keep it a secret.
- Walter Winchell
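The mask on this slide can be reproduced with a few lines. This is a sketch in which `length_mask` and `word_lengths` are hypothetical helper names, and punctuation is counted as part of the word, which matches the mask shown.

```python
def length_mask(text: str) -> str:
    """Replace every non-space character with '#', keeping the spaces."""
    return "".join(ch if ch.isspace() else "#" for ch in text)

def word_lengths(text: str) -> list[int]:
    """Lengths of the whitespace-separated words, punctuation included."""
    return [len(word) for word in text.split()]

quote = ("I usually get my stuff from people who promised somebody "
         "else that they would keep it a secret.")
print(length_mask(quote))
print(word_lengths(quote))
```

The sequence of word lengths is all the observer sees, and it is exactly the kind of side information that a language model can exploit.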
![Page 20: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/20.jpg)
Word lengths in 3 translations of a Tolstoy novel
![Page 21: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/21.jpg)
Word length transitions
[Figure: word-length transition plots for French, English, and German; both axes run over word lengths 1 to 8.]
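A plot of this kind could be produced from a first-order Markov estimate of word-length transitions. The sketch below is one way to compute it; clamping long words to length 8 and counting punctuation as part of a word are assumptions, since the slides do not say how the figure was built, and `word_length_transitions` is a hypothetical name.

```python
from collections import Counter

MAX_LEN = 8  # the figure's axes run from 1 to 8

def word_length_transitions(text: str, max_len: int = MAX_LEN) -> list[list[float]]:
    """Estimate P(next word length = j | current word length = i).

    Row i-1, column j-1 holds the estimate. Words longer than max_len
    are clamped to max_len (an assumption made for this sketch).
    """
    lengths = [min(len(word), max_len) for word in text.split()]
    counts = Counter(zip(lengths, lengths[1:]))  # consecutive length pairs
    matrix = []
    for i in range(1, max_len + 1):
        row_total = sum(counts[(i, j)] for j in range(1, max_len + 1))
        matrix.append([
            counts[(i, j)] / row_total if row_total else 0.0
            for j in range(1, max_len + 1)
        ])
    return matrix

sample = "Packets are formed and transmitted according to the language of the application"
for row in word_length_transitions(sample):
    print(" ".join(f"{p:.2f}" for p in row))
```

Running this on the same text in several languages would give one matrix per language, which is the comparison the figure makes.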
![Page 22: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/22.jpg)
Internet Communications Are Packet Switched
Packets are formed and transmitted according to the “language of the application”
Complications arise from buffering, Nagle’s algorithm, etc.
![Page 23: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/23.jpg)
The “language” of ssh (simplified)
SYN
SYN-ACK
ACK
Banner
Login:
mab
password:
w
h
. . .
![Page 24: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/24.jpg)
Cryptanalytic attacks based on packet timings and size
• Keystroke timings (Song, Wagner, Tian)
• Probable plaintext (Bellovin)
• Several traffic classification studies:
– Moore and Zuev (2005): 95% accuracy with supervised Bayesian analysis
![Page 25: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/25.jpg)
Another looming attack: voice over IP
• VoIP has high quality of service demands.
• VoIP packets are easy to recognize.
• VoIP supports “silence suppression” for bandwidth savings.
![Page 26: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/26.jpg)
IPSEC traffic padding (TFC)
• Omitted in early versions of IPSEC
• Added in recent versions of tunnel mode to obfuscate traffic patterns.
• No theoretical basis for security arguments.
• No guidance on how to generate dummy traffic.
![Page 27: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/27.jpg)
Note: simple padding is NOT enough
If a known packet is repeatedly encrypted, then the padding distribution may be recognizable, and can be subtracted.
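A small simulation illustrates the point, assuming the padding is drawn uniformly from a fixed range. The distribution, the numbers, and the minimum-based estimator are all assumptions chosen for this sketch; other padding distributions would call for other estimators (for example, subtracting the mean pad).

```python
import random

def padded_length(true_length: int, max_pad: int, rng: random.Random) -> int:
    """Observed size: the true size plus a uniform random pad in [0, max_pad]."""
    return true_length + rng.randint(0, max_pad)

def estimate_true_length(observed: list[int]) -> int:
    """With pads drawn from [0, max_pad], the smallest observation is the
    natural estimate: it equals the true length once a zero pad has occurred."""
    return min(observed)

rng = random.Random(2024)        # fixed seed only to make the sketch repeatable
TRUE_LENGTH, MAX_PAD = 137, 15   # hypothetical values
observed = [padded_length(TRUE_LENGTH, MAX_PAD, rng) for _ in range(1000)]
print(min(observed), max(observed), estimate_true_length(observed))
```

With enough repetitions the observed sizes fill out the whole padding range, so the random pad hides little about the underlying packet length.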
![Page 28: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/28.jpg)
Do we have to packetize data?
• Success of Internet depends on it
• Fairness in a shared medium
• Quality of service
• Buffering
• Error correction, retransmission
![Page 29: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/29.jpg)
Can we afford to keep the Internet channels full?
• Everyone depends on everyone else using only what they need.
• Internet exists as a sparse graph:
– O(n) total nodes to connect n endpoints
– To support circuits, we would need a much denser graph (on the order of n^2 links)
• Alternatives in onion routing, mixes
![Page 30: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/30.jpg)
Shannon’s epic 1949 paper “Communication Theory of Secrecy Systems”
1. Concealment systems, that obscure the existence of communication.
2. Privacy systems, requiring special communication hardware.
3. Secrecy systems, utilizing mathematical transformations.
Shannon addressed only secrecy systems, calling concealment a “psychological problem”
![Page 31: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/31.jpg)
What is an appropriate definition of communication?
![Page 32: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/32.jpg)
Shannon’s communication model revisited
Information source → Transmitter → Channel → Receiver → Destination
Noise (enters at the channel)
“Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.”
![Page 33: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/33.jpg)
Deficiencies in Shannon’s model
• Communication should be bidirectional.
• Communication is a physical process with side effects.
• Communication is segmented.
• Communication has context.
![Page 34: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/34.jpg)
Temporal aspects of communication
Do nothing secretly; for time sees and
hears all things, and discloses all.
- Sophocles, 496-406 B.C.
![Page 35: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/35.jpg)
Two versions of Shannon’s paper
• Bell System Technical Journal (1948)
– “A Mathematical Theory of Communication”
• University of Illinois Press (1949)
– “The Mathematical Theory of Communication”
– With a section by Warren Weaver: “Recent Contributions to The Mathematical Theory of Communication”
![Page 36: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/36.jpg)
Weaver’s three levels of communication
Level A. How accurately can the symbols of communication be transmitted? (the technical problem)
Level B. How precisely do the transmitted symbols convey the desired meaning? (the semantic problem)
Level C. How effectively does the received meaning affect conduct in the desired way? (the effectiveness problem)
![Page 37: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/37.jpg)
Definitions of communication
• Shannon:
– Reproducing the output from a stochastic process either approximately or exactly.
• Lasswell’s definition:
– Who says what to whom in what channel with what effect
• The exchange of messages that change the a priori expectation of events.
• Griffin: the management of messages for the purpose of creating meaning
• Schramm: a purposeful effort to establish a commonness between a source and receiver
![Page 38: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/38.jpg)
The DIKW Hierarchy (from the 1980s)
• Data (bits or raw symbols)
– ‘k’, 51771
• Information (symbols with relationships)
– ‘e’ is more common than ‘z’
• Knowledge (useful information)
– ‘e’ is energy; ‘e’ is a mathematical constant
• Wisdom (understanding of knowledge)
– Extrapolative, non-deterministic process.
![Page 39: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/39.jpg)
DIKW Hierarchy (the origins)
• Where is the wisdom we have lost in knowledge?
• Where is the knowledge we have lost in information?
T.S. Eliot, 1934
“The Rock”
![Page 40: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/40.jpg)
Moving up the DIKW Hierarchy
• Data → Information
– Understanding structure
• Information → Knowledge
– Understanding patterns, context, relationships
• Knowledge → Wisdom
– Understanding principles, applications, meaning, interpretation.
![Page 41: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/41.jpg)
At what layer does a cryptanalyst work?
• Data: what is the fifth symbol in this message?
• Information: Are there more 1’s than 0’s in this message? Is there ever a sequence 00010000000?
• Knowledge: what language is being spoken in this communication?
• Wisdom: did the initiator just ask a question or give a command? Should I expect a response? Who is in command?
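The data and information layer questions are the ones that are easy to mechanize. A minimal sketch, with hypothetical helper names, operating on a message given as bytes:

```python
def to_bits(data: bytes) -> str:
    """Render a byte string as a string of '0' and '1' characters."""
    return "".join(f"{byte:08b}" for byte in data)

def fifth_symbol(bits: str) -> str:
    """Data layer: what is the fifth symbol in this message?"""
    return bits[4]

def more_ones_than_zeros(bits: str) -> bool:
    """Information layer: are there more 1's than 0's?"""
    return bits.count("1") > bits.count("0")

def contains_pattern(bits: str, pattern: str = "00010000000") -> bool:
    """Information layer: does the given bit sequence ever occur?"""
    return pattern in bits

bits = to_bits(b"\x10\x00")
print(fifth_symbol(bits), more_ones_than_zeros(bits), contains_pattern(bits))
```

The knowledge and wisdom layer questions have no comparably short test, which is part of the argument that those layers need better definitions.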
![Page 42: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/42.jpg)
At what layer does the crypto designer work?
• Data: make sure that this bit is indistinguishable from a random bit.
• Information: make sure that the adversary can’t estimate the distribution of bits accurately.
• Knowledge: make sure that the adversary cannot recover the encoded knowledge.
• Wisdom: make sure the cryptanalyst has no understanding of what they observe.
![Page 43: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/43.jpg)
Closing thoughts
• We need better models of communication in order to advance cryptology.
• We need better definitions of knowledge and wisdom in order to advance cryptology.
• Absolute security for internet communication is probably impossible.
![Page 44: Language Modeling and Encryption on Packet Switched Networks Kevin McCurley](https://reader030.vdocuments.us/reader030/viewer/2022032707/56649e295503460f94b1723c/html5/thumbnails/44.jpg)
Security is mostly a superstition. ... Life is nothing if not a grand adventure.
- Helen Keller
http://mccurley.org/papers/traffic/ for updates