CNS - 20 - TLS IV
Lecture Info
Dates:
Lecturer: Giuseppe Bianchi
Slides:
Introduction: In this lecture we have seen two more attacks to TLS: the BEAST attack, which exploits the predicatbility of the IV in TLS v1.0, and the CRIME attack, which exploited the compression step.
1 CBC for Multiple Messages
If we have to encrypt multiple messages using a block cipher with a CBC mode of operation, it intuitively makes sense that each message has to be encrypted with a different initialization vector.
While this condition is necessary, it is not sufficient to guarantee security. Indeed, the initialization vector must also be unpredictable.
Consider now a big message \(M\) that is obtained by combining two different messages \(m_1\) and \(m_2\). To encrypt this message you'd need a single IV, and the IV for \(m_2\) would be the last ciphertext block of \(m_1\). It thus seems equivalent to simply send the two message separately, and while we add the IV to encrypted version of the first message, we send the encrypted version of the second message implicitly assuming that the IV for this second message is actually the last cipher block of the first message. If the first construction is secure (which it is), then intuitively it seems like also the second construction is secure.
1.1 What if Predictable IV?
Suppose you have the following situation
and assume the following:
You know that the \(i\) th block of the plaintext message contains a password.
The possible values for the password are either JOE or UGO.
You can do a Choosen Plaintext Attack (CPA).
Even with the following assumption, if the encryption scheme is IND-CPA (semantically secure), then you'd still be unable to understand whether the password was JOE or UGO. We will now see that if the IV are predictable, then we break the semantic security property.
Assume now that the IV is predictable.
the basic idea of the attack is that by choosing a proper first block of the plaintext message, the attacker can pass arbitrary values to the PRP. Indeed, if the attacker has
\[\text{newmsg}[0] = X \mathbin{\oplus} \text{Something}\]
has the first plaintext block, then the first ciphertext block is obtained by passing to the PRP the value \(\text{Something}\). To actually guess which password was sent in the original message then the something has have the following form
\[\text{Something} = \text{FF54B1F1} \mathbin{\oplus} \text{"PASS=JOE"}\]
if I get as a result \(\text{132FA776}\), then the password was JOE, otherwise it was UGO, semantic security is broken, and I'm able to perform a dictionary attack.
2 BEAST Attack (TLS v1.0)
The fact that in CBC block encryption the IV should be unpredictrable was well known since at least 1999, where in an email Rogaway explained very clearly this regarding an IPSec issue. Then in 2004 Moller explained that in TLS v1.0 there was a standardization mistake, in which the IV of the new message is the last \(c[i]\) of the previous message, making it thus very predictable. This mistake was corrected in TLS v1.1, in which a new explicit IV was generated per new msg (this was mandatory for DTLS). Finally, in 2011 T. Duong and J. Rizzo came out with the BEAST attack.
This kind of attack seemed unpractical for various reasons, such as:
Software issues: You would need to inject an active agent code running on the browser of the user in order to perform the choosen plaintext attack. This however was later found feasible with modern web technologies such as websockets and html5.
Complexity issues: Even if you are able to attack the system, in order to decrypt the concent of the block you would need to brute force all of the \(2^{128}\) possible plaintexts, which is way too large. Once again, this was fixed by the creativity of the attackers, which came out with a chosen boundary attack which is linear with respect to the authentication cookie size.
The attack is showcased in the following youtube video BEAST vs HTTPS.
2.1 Chosen Boundary Attack
The main breakthrough of the attack was the introduction of the following new idea: inserting some preamble text in order to align the plaintext that we want to break such that only a single unknown character is in inside a particular block. The extra preamble text needed to allign the first character of the password is injected by the code we have put on the victim machine and this injection takes place before the message is encrypted and sent to the server.
In the BEAST attack in particular the idea was to align the authentication token so that only the first character of the password was within a given block.
After the first letter is discovered, the preamble text can be changed so that in the block the first two letters appear
With this approach the complexity for an \(N\) bytes cookie is \(256 * N\), against a full brute force approach which would take \(256^N\) guesses. For an \(8\) byte password this would instantiate in
\[\begin{split} 256 \cdot 8 &= 2^{11} = 2048 \\ 256^{8} &= 2^{64} = 18 446 744 073 709 551 616 \\ \end{split}\]
3 CRIME Attack (2012)
The same guys that did the BEAST attack, T. Duong and J. Rizzo, also came out with the CRIME attack in 2012. This attack exploits the fact that in TLS before being encrypted the data is also compressed.
3.1 Basic Idea
Consider to send messages which contain only letters of the alphabet. With this assumption a possible compression scheme would be the following: anytime you have a letter appearing multiple times in a row, you compress it by prefixing a number to a single instance of the letter. For example,
\[\begin{split} \text{AAAABC} \,\, &\overset{\text{compress}}{\longrightarrow} \,\, \text{4ABC} &\overset{\text{encrypt}}{\longrightarrow} \text{X&%\$} \\ \text{ABCDEF} \,\, &\overset{\text{compress}}{\longrightarrow} \,\, \text{ABCDEF} &\overset{\text{encrypt}}{\longrightarrow} \text{&\$%A%\$} \\ \end{split}\]
The underlyign idea behind the CRIME attack was then the following:
Add a preamble to the message to be compressed and then encrypted, just like it was done in the BEAST attack. This preamble should be easily compressible.
See the size of the resulting encryption. By checking the size one can infer information about the original plaintext, such as, for example, the first letter.
Graphically we get,
3.2 Actual Attack
The actual attack CRIME stands for Compression Ratio Info-Leak Made Easy. This attack works irrespective of the cipher used, so it can be applied to both stream ciphers and block ciphers.
To actually make this attack work the attacker has to be able to inject a suitably choosen text prior to the user data. The text choosen depends on the particularities of the compression algorithm used. In the CRIME attack the compression algorithm which was attacked was DEFLATE.
The DEFLATE algorithm jointly uses two sub-algorithm:
Huffman Coding (bit oriented).
LZ77 (bytes oriented): replaces repeated 3+ char strings within a given window size with (offset, size) pointers. For example, give the following string
GIUSEPPE BIANCHI AND MARCO BIANCHINI
then the compressed result is the following
GIUSEPPE BIANCHI AND MARCO(-17, 8)NI
3.2.1 How to Choose Input
Consider the following HTTP request that a valid user makes to authenticate to twitter
GET / HTTP/1.1 Cokkie: twid=flavia\r\n
An attacker can now inject the following text, which is not passed on the server side
/comment:twid=a GET / HTTP/1.1 Cokkie: twid=flavia\r\n
Now we can guess each letter by trying all the possibilities, one after the other. When the resulting encryption is shorter than the default case, then we know we found a new letter. This makes the attack linear.
3.3 Aftermath
After the CRIME attack in september 2012 compression in TLS was disabled in Chrome.
The problem of compression is actually much more fundamental, and it doesnt concern only TLS. Indeed, similar attacks were made also in other settings, such as:
The TIME attack.
The BREACH attack.
Thus, anytime you see COMPRESS then ENCRYPT, you have to be careful of whats going on.
Observation: In 2002, 10 years prior the actual exploitation, Kelsey wrote a paper named "Compression and Information Leakage of *Plaintext" which discussed these issues.