# 0CTF 2017 - Integrity crypto challenge

Published on by

This year me and some friends (mostly TRU/e Security Master students) decided to join a CTF (Capture The Flag) game, 0CTF 2017. For most of us, it would be the first (global) CTF so this day was mainly for fun and learning. This write-up is intended to be accessible for my fellow students and will therefore explain the approach in more detail.

One of the challenges that we managed to solve is Integrity:

Just a simple scheme.
nc 202.120.7.217 8221
integrity_f2ed28d6534491b42c922e7d21f59495.zip

The zip archive contains a Python 2 script, integrity.py:

```#!/usr/bin/python -u

from Crypto.Cipher import AES
from hashlib import md5
from Crypto import Random
from signal import alarm

BS = 16
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
unpad = lambda s : s[0:-ord(s[-1])]

class Scheme:
def __init__(self,key):
self.key = key

def encrypt(self,raw):
raw = md5(raw).digest() + raw

cipher = AES.new(self.key,AES.MODE_CBC,iv)

return ( iv + cipher.encrypt(raw) ).encode("hex")

def decrypt(self,enc):
enc = enc.decode("hex")

iv = enc[:BS]
enc = enc[BS:]

cipher = AES.new(self.key,AES.MODE_CBC,iv)
blob = cipher.decrypt(enc)

checksum = blob[:BS]
data = blob[BS:]

if md5(data).digest() == checksum:
else:
return

scheme = Scheme(key)

alarm(30)

print "Welcome to 0CTF encryption service!"
while True:
cmd = raw_input()

if not cmd:
break

if cmd[0]=='r' :
name = raw_input().strip()

if(len(name) > 32):
break
print "You cannot use this name!"
break
else:
print scheme.encrypt(name)

elif cmd[0]=='l':
data = raw_input().strip()
name = scheme.decrypt(data)

print flag
else:
print "Welcome %s!" % name
else:
print "Unknown cmd!"
break
```

A high-level overview of what the program is doing:

• You have 30 seconds to exploit the program before it terminates (due to `alarm(30)`).
• The registration command accepts a name (at most 32 characters, cannot be `admin`) and then prints the encrypted name.
• The login command accepts the ciphertext and then greets the decrypted name. If the name was `admin` it will show the flag that was read from the `flag` file. (Give me, give me!)
• The encryption uses the AES block cipher in CBC mode using a random 128-bit (16 byte) key (fixed for the duration of the session) and a random 128-bit initialization vector (IV). Data is padded to the AES block size (16 bytes) using the PKCS#7 padding scheme.
• A 128-bit MD5 checksum is prepended to the original (padded) data to preserve the integrity (hey, that is the name of this challenge!).
• On decryption, no name is returned if the checksum validation fails.

An example of a normal session (our input is highlighted):

```\$ python2 integrity.py
Welcome to 0CTF encryption service!
r
user
13bd65d4bbd9bec99c9f63876eb80720fc4dc0819270f3f655d7a437cf3855d2c0c3bbde892a555b872b4043785813db
l
13bd65d4bbd9bec99c9f63876eb80720fc4dc0819270f3f655d7a437cf3855d2c0c3bbde892a555b872b4043785813db
Welcome user!
```

Let's have a deeper look at the encryption scheme, starting with the `Scheme.encrypt` method:

1. `raw = pad(raw)`
Add PKCS#7 padding to ensure that the data size is a multiple of the block size (`BS = 16`). If the input happened to be an exact multiple of the block size, then a full block is added as padding (16 times `"\x10"`).
2. `raw = md5(raw).digest() + raw`
Prepend the MD5 hash of the padded name. Observe that the size of the MD5 digest matches the block size of AES (16 bytes).
3. `iv + cipher.encrypt(raw)`
Encrypt the previous concatenation of the checksum and padded data using AES in CBC mode using the fixed 128-bit key and a new, random 128-bit IV. This IV is then prepended.

The end result can be partitioned in 16 byte blocks as follows:

```+---------------+---------------+---------------+
|       IV      | Enc(checksum) |   Enc(data)   |
+---------------+---------------+---------------+
```

If the name exceeds 15 bytes, then we have another block:

```+---------------+---------------+---------------+---------------+
|       IV      | Enc(checksum) |   Enc(data1)  |   Enc(data2)  |
+---------------+---------------+---------------+---------------+
```

where `Enc(x)` denotes encryption of `x`. Decryption is handled by `Scheme.decrypt` as follows:

1. `iv = enc[:BS]`; `enc = enc[BS:]`
Split the result in the 16-byte IV and the ciphertext.
2. `blob = cipher.decrypt(enc)`
Decrypt the ciphertext using the fixed key and the received IV.
3. `checksum = blob[:BS]`; `data = blob[BS:]`
Split the decrypted blob, obtaining the 16-byte checksum and data (the padded name).
4. `if md5(data).digest() == checksum: return unpad(data)`
If the actual checksum matches the received one, strip the padding and return the name. Otherwise, nothing is returned. Note that the padding is removed by reading the last byte, then removing that number of bytes from the end.

Now, what could possibly go wrong? Maybe we can send a specially crafted name of 16 bytes, like `"admins"` (length 6), padded with 10 bytes containing the `"\x0b"` byte (representing padding length 11). The idea was that the receiver then removes the 10 padding bytes and the letter `s`, obtaining name `"admin"`. This did not work though since the encryption oracle always adds a full 16 byte block which we cannot remove since it would invalidate the checksum.

The next idea is to look at attacking the CBC block mode. The operation of encryption of decryption of our data using AES in CBC mode is best explained with this diagram (modelled after the figure from this article):

```                                        data under our control
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
+---------------+ +---------------+-------------------------------+
|       IV      | + - - - - - - - + - - - - - - - + - - - - - - - +
|               | |   checksum    |     data1     |     data2     |
+---------------+ +---------------+---------------+---------------+
|                |               |               |
├--------------⟶ ⊕    ╭--------⟶ ⊕    ╭--------⟶ ⊕
|                ↓    |          ↓    |          ↓
|              +---+  |        +---+  |        +---+
ENCRYPT  |        key ⟶ |AES|  |  key ⟶ |AES|  |  key ⟶ |AES|
|              +---+  |        +---+  |        +---+
|                |    |          |    |          |
|                ├----╯          ├----╯          |
↓                ↓               ↓               ↓
+---------------+ +-----------------------------------------------+
|               | |         C  I  P  H  E  R  T  E  X  T          |
|       IV      | + - - - - - - - + - - - - - - - + - - - - - - - +
|               | | Enc(checksum) |   Enc(data1)  |   Enc(data2)  |
+---------------+ +---------------+---------------+---------------+
|                |               |               |
|                ├----╮          ├----╮          |
|                ↓    |          ↓    |          ↓
|              +---+  |        +---+  |        +---+
DECRYPT  |        key ⟶ |AES|  |  key ⟶ |AES|  |  key ⟶ |AES|
|              +---+  |        +---+  |        +---+
|                ↓    |          ↓    |          ↓
╰--------------⟶ ⊕    ╰--------⟶ ⊕    ╰--------⟶ ⊕
↓               ↓               ↓
+---------------+---------------+---------------+
|   checksum    |     data1     |     data2     |
+ - - - - - - - + - - - - - - - + - - - - - - - +
+---------------+-------------------------------+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

As you can see, the decryption of the individual blocks from the ciphertext also involves an XOR operation with the previous ciphertext block (or in case of the first block, the IV).

What happens when we drop the first block of our encrypted data? Well, things will be interpreted a bit differently...

```                                        data under our control
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
+---------------+ +---------------+-------------------------------+
|       IV      | + - - - - - - - + - - - - - - - + - - - - - - - +
|               | |   checksum    |     data1     |     data2     |
+---------------+ +---------------+---------------+---------------+
|                |               |               |
├--------------⟶ ⊕    ╭--------⟶ ⊕    ╭--------⟶ ⊕
|                ↓    |          ↓    |          ↓
|              +---+  |        +---+  |        +---+
ENCRYPT  |        key ⟶ |AES|  |  key ⟶ |AES|  |  key ⟶ |AES|
|              +---+  |        +---+  |        +---+
|                |    |          |    |          |
x                ├----╯          ├----╯          |
↓               ↓               ↓
+-----------------------------------------------+
ORIGINAL        |         C  I  P  H  E  R  T  E  X  T          |
SEMANTICS        + - - - - - - - + - - - - - - - + - - - - - - - +
| Enc(checksum) |   Enc(data1)  |   Enc(data2)  |
+---------------+---------------+---------------+
NEW         +---------------+ +-------------------------------+
SEMANTICS      |        IV'    | | C  I  P  H  E  R  T  E  X  T '|
+---------------+ +-------------------------------+
|               |               |
|               ├----╮          |
|               ↓    |          ↓
|             +---+  |        +---+
DECRYPT                   |       key ⟶ |AES|  |  key ⟶ |AES|
|             +---+  |        +---+
|               ↓    |          ↓
╰-------------⟶ ⊕    ╰--------⟶ ⊕
↓               ↓
+---------------+---------------+
ORIGINAL SEMANTICS             |     data1     |     data2     |
+---------------+---------------+
+---------------+---------------+
NEW SEMANTICS                |   checksum'   |   pad(name')  |
+---------------+---------------+
^^^^^^^^^^^^^^^
```

Something interesting has happened. Now the first block of the original ciphertext is interpreted as IV while the following blocks are still interpreted as ciphertext. The decryption will still succeed due to how the CBC mode works.

In our normal scenario, this modified input will still be rejected since the decrypted `checksum'` does not match `MD5(pad(name'))`. But as an attacker, observe that we have full control of this `checksum'` (`data1`) field!

In order to impersonate the admin, we should now follow the protocol and compute the MD5 checksum over the padded data and prepend this to the original name (`"admin"`). Now all that remains is sending this payload to the server, read back the ciphertext, drop the first 16 bytes (or 32 hexadecimal characters) and send this data as login. Bingo, we got our flag!

As for the implementation of the above idea, since we have only 30 seconds to input something and since writing binary data via the terminal is awkward, we wrote a Python script to communicate with the server. For ease of testing, we initially started a dummy server using socat:

```echo dummy > flag
```

The Python 3 source code of the implementation as used during the CTF:

```#!/usr/bin/env python
import hashlib
import socket

lines = []

global lines
while len(lines) < required_lines:
chunk = sock.recv(4096)
assert chunk, "maybe EOF?"
chunk = chunk.strip()
if chunk:
lines += chunk.split(b"\n")

def expect_line(needle=b''):
global lines
print("Expecting", needle, "...", end="")
line = lines[0]
assert needle in line, '%r not in %r' % (needle, line)
lines = lines[1:]
print("found", line)
return line

# Skip hello
expect_line(b'Welcome to')

# Register
sock.sendall(b"r\n")
width = 16
data = hashlib.md5(data).digest() + data
print("Data:", data)

sock.sendall(data + b"\n")

# "Here is your response", try to get ciphertext
expect_line(b'Here is your')
ciphertext = expect_line()
print("Ciphertext:", ciphertext)
# Ciphertext = IV || AES-CBC(md5(data) || data)
# IV, checksum are both 16 bytes.
# Drop IV, then we have:
#              ^^^^^^^^^ treated as new IV!
# Server will decrypt and see that the checksum over pad("admin") is correct
ciphertext = ciphertext[width*2:]

# Next round, try to login
sock.sendall(b"l\n")
sock.sendall(ciphertext + b"\n")
welcome = expect_line(b"Welcome ")
# Hopefully the flag
flag = expect_line(b'')
print("Flag:", flag)
else:
print("No flag found :(")
```

The careful reader might notice that our data as constructed above is exactly 16 bytes which would normally be subject to padding. Then why would this still work if the padding was not stripped from the resulting ciphertext? It turns out that the used padding (`0b`) is the vertical tab which is stripped by the `strip()` function (in `name = raw_input().strip()`). Coincidence, but we only noticed this after the CTF 😁.

And finally, here is the flag that we obtained from the server:

```Expecting b'Welcome to' ...found b'Welcome to 0CTF encryption service!'