PlaidCTF 2012 "simple" writeup
Written by Pierre Bourdon
simple is a binary that waits for a string on stdin and returns whether the input was the valid key or not. It does that in a very interesting way: there are only 112 bytes of executable x86 code in this 45K binary. After a bit of static analysis in IDA we found out that these 112 bytes implement a common One Instruction Set Computer virtual machine for the subleq architecture with two very simple twists.
A subleq computer works in a very simple way: only one instruction can be executed:
subleq a, b, c(with
cthree addresses). This instruction computes
*b = *b - *athen jumps to
*b <= 0.
The binary starts by loading
espregister, then for each iteration of the loop pops 3 values from the stack. This is the "fetch" state of the virtual machine. After that, a normal subleq computer would simply execute a substraction and a conditional jump. Here, the VM starts by checking if
ais an invalid address (less than 0). If it is, it executes a
readsyscall to read a single character to
*b. Then, it checks if
bis an invalid address. If it is, it executes a
writesyscall to write a single character from
*a. Then if both addresses are valid it executes a standard
We started to reverse the whole bytecode for that virtual machine, but reversing something made of several thousands of identical instructions is really hard and we weren't really able to find anything interesting through static analysis.
We then tried to combine static and dynamic analysis by tracing the bytecode to see which instructions were executed and in which order. Using gdb and breaking on the first instruction of the VM loop, it is easy to get the address of the current instruction. When we were reading some traces we noticed that the bytecode was (obviously) doing loops: to read characters until a newline, but also probably to check if the input string is valid. That means that we can actually count how much loop iterations were performed for our input by counting how much instructions were executed. This is basically a timing attack applied to a virtual machine.
We used this to bruteforce the input string one character by one character using the charset
[a-zA-Z0-9_-]. For each character added to the input we counted the number of instructions the VM executed and noticed that this number varied slightly:
1 2 3 4 5 6 7 8 9
trying key 'h'... 2997 trying key 'i'... 2997 trying key 'j'... 2997 trying key 'k'... 2997 trying key 'l'... 2977 trying key 'm'... 2993 trying key 'n'... 2993 trying key 'o'... 2993 trying key 'p'... 2993
lhad a lower count and every character after it was slightly lower than characters before it. We assumed that it was because it was a valid first character. We then bruteforced all the other characters before getting to the final solution of this problem:
1 2 3 4 5 6 7 8 9 10 11
trying key 'l3ss_1s_m0r3z'... 12309 trying key 'l3ss_1s_m0r3_'... 12309 trying key 'l3ss_1s_m0r3-'... 12309 trying key 'l3ss_1s_m0r3!'... 14559 trying key 'l3ss_1s_m0r30'... 12309 trying key 'l3ss_1s_m0r31'... 12309 trying key 'l3ss_1s_m0r32'... 12309 $ ./simple l3ss_1s_m0r3! Key: l3ss_1s_m0r3!
Here is the script we used to bruteforce the key character by character:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57
#! /bin/sh current_guess="$1" charset="a b c d e f g h i j k l m n o p q r s t u v w x y z" charset="$charset _ - ! 0 1 2 3 4 5 6 7 8 9" # Simple GDB script: disable the paging with "set height 0", then define a # recursive function which prints "instr" and continues to the next breakpoint. # GDB has a recursion limit so we have to execute "countinstrs" several times :/ cat > gdbscript <<EOF set height 0 define countinstrs p "instr" c countinstrs end b *0x31341 run < guess countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs EOF for c in $charset; do guess="$current_guess$c" echo "$guess" > guess echo -n "trying key '$guess'... " gdb ./simple < gdbscript 2>&1 \ | grep '^\$.*= "instr"$' \ | tail -1 \ | cut -d ' ' -f 1 \ | cut -c 2- done
PlaidCTF 2012 "stego" writeup
Written by Pierre-Marie de Rodat
1 2 3 4 5
We are a little unsure what the robots fascination with Star Trek is but it would seem from the amount of accesses this image has been getting that it holds something interesting for them. Can you figure out what it is? http://i.imgur.com/MjYUJ.gif
stego is an animated image (GIF) made from a Star Trek sequence. The first task was to learn more about the GIF format. I used the spec itself. The main things to know are:
- Images are represented as an array of palette-based pixels;
- There is one global color palette and image frames can embed their own local palette;
- A GIF file contains a “stream” of sections: image frames, metadata sections, etc.
The first try was to decode completely the given file in order to check the sections against unusual metadata blocks (to maybe find embedded information) or hidden frames, but everything was just usual.
Then, I took a closer look to the palettes: no image frame had a local palette, so I just looked at the global palette, and I found something surprising: many palette entries had the same color! The consequence was that in some image frames, one could see an uniform area instead of different colors, hidding shapes in the same way the Ishihara test would for color deficient people.
To reveal these shapes, I replaced the whole palette using random colors in order to remove color “aliasing”. With an image editor, I could then see that the first frame displayed “You’re on the right track but you have to go deeper”. Meh.
As a next try, I still looked at palette-related issues (the given hint was “Palette” after all!): are some palette colors over-used, or under-used? Nothing raised from this search, and after some time, someone from the team asked me “have you tried to
xorall image frames together?”. Damn, I had not, but could it be so simple… and so palette-unrelated?
Using a simple Python script, I tried to
xorpalette indexes for each image frames and it gave me the previous text, plus