simple is a binary that waits for a string on stdin and returns whether the input was the valid key or not. It does that in a very interesting way: there are only 112 bytes of executable x86 code in this 45K binary. After a bit of static analysis in IDA we found out that these 112 bytes implement a common One Instruction Set Computer virtual machine for the subleq architecture with two very simple twists.
A subleq computer works in a very simple way: only one instruction can be
executed: subleq a, b, c
(with a
, b
and c
three addresses). This
instruction computes *b = *b - *a
then jumps to c
if *b <= 0
.
The binary starts by loading 0x323ac
into the esp
register, then for each
iteration of the loop pops 3 values from the stack. This is the “fetch” state
of the virtual machine. After that, a normal subleq computer would simply
execute a substraction and a conditional jump. Here, the VM starts by checking
if a
is an invalid address (less than 0). If it is, it executes a read
syscall to read a single character to *b
. Then, it checks if b
is an
invalid address. If it is, it executes a write
syscall to write a single
character from *a
. Then if both addresses are valid it executes a standard
subleq
.
We started to reverse the whole bytecode for that virtual machine, but reversing something made of several thousands of identical instructions is really hard and we weren’t really able to find anything interesting through static analysis.
We then tried to combine static and dynamic analysis by tracing the bytecode to see which instructions were executed and in which order. Using gdb and breaking on the first instruction of the VM loop, it is easy to get the address of the current instruction. When we were reading some traces we noticed that the bytecode was (obviously) doing loops: to read characters until a newline, but also probably to check if the input string is valid. That means that we can actually count how much loop iterations were performed for our input by counting how much instructions were executed. This is basically a timing attack applied to a virtual machine.
We used this to bruteforce the input string one character by one character
using the charset [a-zA-Z0-9_-]
. For each character added to the input we
counted the number of instructions the VM executed and noticed that this number
varied slightly:
trying key 'h'... 2997
trying key 'i'... 2997
trying key 'j'... 2997
trying key 'k'... 2997
trying key 'l'... 2977
trying key 'm'... 2993
trying key 'n'... 2993
trying key 'o'... 2993
trying key 'p'... 2993
The character l
had a lower count and every character after it was slightly
lower than characters before it. We assumed that it was because it was a valid
first character. We then bruteforced all the other characters before getting to
the final solution of this problem:
trying key 'l3ss_1s_m0r3z'... 12309
trying key 'l3ss_1s_m0r3_'... 12309
trying key 'l3ss_1s_m0r3-'... 12309
trying key 'l3ss_1s_m0r3!'... 14559
trying key 'l3ss_1s_m0r30'... 12309
trying key 'l3ss_1s_m0r31'... 12309
trying key 'l3ss_1s_m0r32'... 12309
$ ./simple
l3ss_1s_m0r3!
Key: l3ss_1s_m0r3!
Here is the script we used to bruteforce the key character by character:
```bash #! /bin/sh
current_guess=”$1”
charset=”a b c d e f g h i j k l m n o p q r s t u v w x y z” charset=”$charset _ - ! 0 1 2 3 4 5 6 7 8 9”
Simple GDB script: disable the paging with “set height 0”, then define a
recursive function which prints “instr” and continues to the next breakpoint.
GDB has a recursion limit so we have to execute “countinstrs” several times :/
cat > gdbscript «EOF set height 0 define countinstrs p “instr” c countinstrs end
b *0x31341 run < guess
countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs countinstrs EOF
for c in $charset; do guess=”$current_guess$c” echo “$guess” > guess echo -n “trying key ‘$guess’… “ gdb ./simple < gdbscript 2>&1 \ | grep ‘^$.*= “instr”$’ \ | tail -1 \ | cut -d ‘ ‘ -f 1 \ | cut -c 2- done