Retrospective on r2wars 2018
Between 5th and 8th of September I attended r2con, the annual radare2 conference in Barcelona, Spain. This highly-technical, single-track event consisted of multiple training sessions, workshops, and talks, all revolving around topics such as reverse engineering, binary exploitation, malware analysis, software instrumentation, niche hardware platforms, and even chiptune music production. Check out the event page and the r2con2018 Github repository for more details.
This post is about my entries in the r2wars competition that took place during r2con2018. If you’re into wargames, CTFs, competitive programming, low-level programming tricks, or preparing for a future r2wars event, keep reading.
This is the “official” description of the game:
r2wars is a game similar to corewars, where 2 programs run on a shared memory space trying to catch each other in order to trash their code and make them crash.
More details can be found here, but the crux of the matter is the following:
- The bots must be written in
- On every run, exactly 2 programs compete against each other.
- Both programs share the same address space of 1024 bytes, and a separate but shared (I think) stack.
- On every run, both programs are loaded at random locations within the shared address space.
- The r2wars scheduler will execute one instruction from each program, on every cycle. Execution of complex instructions, such as the
REPinstruction family on
x86, gets adjusted accordingly in order to reflect the “one operation per cycle” principle.
- In order to win a run, a program must either overwrite the opponent, or simply outlive it by waiting for it to execute an illegal instruction or reference an illegal memory address.
- There is a limit of 8000 cycles per run, which results in a draw when reached.
- Every game iteration is a tournament, where the winners are the top 3 highest-scoring bots.
You’re doing it wrong!
If you’re a seasoned assembly programmer or spent the best years of your life optimizing shellcode, you’ll notice that these bots are far from optimal -and you’ll be right. All my submissions were hacked together during breaks, in crowded and noisy rooms, in an attempt to meet submission deadlines.
Development & Debugging
The r2wars environment that was used during the event can be found here. Unfortunatelly, it’s written in C# and I did not have the patience to resolve the dependencies and the build errors on my FreeBSD laptop.
I also have to admit that my r2 skills are not something to write home about, so I used the following ghetto approach to develop and debug my bots:
x86instructions in vim
- Encode them into hex-pairs:
$ rasm2 -b 32 -a x86 -f <filename>
radare2with 1024 bytes of addressable memory, initialize ESIL, write the hex-pairs into the address space, initialize the stack and program counter registers, fire-up the debugger in visual mode:
$ radare2 -ax86 -b32 malloc://1024 [0x00000000]> aei [0x00000000]> aeim [0x00000000]> wx [hex-pairs] @100 [0x00000000]> aer PC=100 [0x00000000]> aer SP=200 [0x00000000]> V!
- Step through the code using F8 while monitoring the content of the registers
- Repeat Ad Nauseum
The Good One
After contemplating various strategies, I decided on a 2-stage offensive approach:
- Copy the active payload to a known location within the 1KB address space.
- Jump into the active payload and start trashing the address space, from top to bottom.
I used variations of this strategy in 4 out of 5 games in total. This bot made it into the winning triplet every time that it was used, so it’s safe to say it wasn’t a bad approach.
You can find all my bots on Github
Game #1 [2nd place]
This is what my initial submission looked like:
mov eax, 0xc3c3c3c3 ;  mov ebx, eax mov ecx, eax mov edx, eax mov ebp, eax mov edi, eax mov esp, 1020 ;  mov esi, 1021 ;  mov [esi], 0xe6ff60 ;  jmp esi ; 
My intent here is to keep the active payload as small and fast as possible. The payload consists of only 3 bytes, and it’s the following:
pushad ; 60 jmp esi ; ffe6
pushad instruction is amazing. Not only you get to write 32 bytes worth of data (all
x86 general-purpose registers) wherever your stack register is pointing, it also decreases the stack pointer by the amount of bytes it pushed, all in a single byte.
As you can see in
, I copy the aforementioned active payload into addresses 1021-1023, set the
esi register to always point to the active payload, and start executing it. I’ve set the stack pointer to point right below my active payload (
), so this loop will trash the entire address space, top-to-bottom, 32 bytes at a time. (Remember that
x86 stack grows towards smaller addresses.)
, the stack will be mostly full of
0xc3 bytes, which is the
ret instruction in
x86 ISA. So, if another
x86 bot attempts to execute instructions within an area I already wrote-through, it will pop a random value from the stack and try to return there, which will most likely trigger an illegal memory access and result in a crash.
Now, the obvious problem with this bot is that it can trash the executable address space only once, before the stack pointer reaches 0 and wraps-up, triggerring memory access violation. So nothing fancy here, but it was a good start and it earned me the 2nd place in that round.
Game #2 [1st place]
In this iteration of the game I patched the aforementioned issue. This increased the size of my active payload and made my bot a bit slower, but now, as long as my active payload is not corrupted, my bot will not crash itself.
mov eax, 0xc3c3c3c3 ;  mov ebx, eax mov ecx, eax mov edx, eax mov edi, 32 ;  mov ebp, 0x3fc ;  mov edx, 1014 ;  mov esp, edx ;  mov esi, 1015 ;  mov [esi], 0x7ffc3960 ;  mov [esi+4], 0xffd489fb ;  mov [esi+8], 0xe6 ;  jmp esi ; 
The new active playload is now 9 bytes (
) and it lives at address range [1015-1023]. It consists of the following instructions:
pushad ; 60 cmp esp, edi ; 39fc jg 0x3f7 ; 7ffb mov esp, edx ; 89d4 jmp esi ; ffe6
The reason behind comparing registers to registers, instead registers to constant values, is that the opcodes for register-to-register comparison are shorter, resulting in a smaller payload.
I’m checking whether the stack pointer has surpassed the 32nd byte of the address space, since performing a
pushad beyond this point will result in memory access violation. If that’s the case, I set the stack pointer back to its initial state -just below the active payload- and start trashing the address space once again.
Now, there are 2 issues with this approach:
- I don’t overwrite the last 32 bytes of the address space, so a competitor can hide there and eventually take me out.
- There are no measures to prevent other
x86-based competitors from executing my payload. This happened a couple of times and resulted in time-outs and draws.
Game #3 aka “The One With The Stack” [3rd place]
My bot now trashes the entire address space, except the region of my active payload. It also trashes the active payload delivery sequence before jumping into the active payload. This prevents my opponents from successfully executing my code since their registers are out-of-sync with mine.
mov eax, 0xc3c3c3c3 ;  mov ebx, eax mov ecx, eax mov edx, eax mov edi, 32 mov ebp, 0x3fc mov edx, 1011 mov esi, 1012 mov [esi], 0x7ffc3960 ;  mov [esi+4], 0x60fc89fb ;  mov [esi+8], 0xe6ffd489 ;  call self_pwn ;  self_pwn: ;  pop esp ;  sub esp, 8 ;  pushad ;  mov esp, edx ;  jmp esi ; 
The active payload this time is 12 bytes (
pushad ; 60 cmp esp, edi ; 39fc jg 0x3f4 ; 7ffb mov esp, edi ; 89fc pushad ; 60 mov esp, edx ; 89d4 jmp esi ; ffe6
Whenever my stack pointer ends-up below the first 32 bytes of the address space, I set it back to 32 and perfom a
pushad which successfully overwrites the area that I missed earlier. Then I set it again just below my active payload, and start trashing the address space again.
I also overwrite the initial sequence that generates the active payload and jumps into it, in
. I use the
pop esp sequence to acquire the address of that initial sequence, tweak the stack pointer, and proceed to trash it with
pushad. Once this is done, I set the stack pointer back to its upper limit, jump into my active payload and start trashing the address space.
Now this is where things get rather dramatic. @Aissn and @kbeckmann come up with the same idea: use the stack for code execution. This increased the duration of the game from the typical runtime of 10m, to 1h 15m. No other bots were actively using the stack, so this game had a lot of time-outs and draws. The game ended with
@kbeckmann getting the first 2 places, and my bot was ranked 3rd.
After this, SkUaTeR patched the game environment and set the stack to be non-executable.
Game #4 [1st place]
The 4th game of r2wars took place on Saturday morning. I just used the same version and ranked 1st.
This was probably a good time to stop changing my bot. Instead, I rewrote my bot to use a completely different strategy. Although I can’t know for sure, I suspect this led to my demise during the final game. Which brings us to the next section…
The Not-So-Good One
At some point during the conference, @kbeckmann mentioned something about keeping the payload within general-purpose registers at all times, and somehow executing it. I decided to experiment with this idea. The bad news is that my bot ranked 4th in this game. The good news is that I suspect that I found bugs in ESIL, which was pretty much the entire point of r2wars.
Game #5 [4th place]
The main strategy behind this bot is the following:
- Keep the active payload within
- Descend the address space 32 bytes at a time using
pushad, and jump to
esp, which will always be pointing into the contents of
This approach overwrites the entire address space, and the active payload is constantly on the move. It’s like surfing a wave:
mov eax, 0xc3c3c3c3 ;  mov ebx, 1024 ;  mov ecx, 0 ;  mov edx, 0x90909000 ;  mov esi, 0xe4ff60e3 ;  mov edi, 0x440fcc39 ;  call pwn ;  pwn: ;  pop esp ;  sub esp,4 ;  pushad ;  mov esp, ebx ;  pushad ;  jmp esp ; 
I reused multiple ideas from my previous bots:
- Filling the stack with
- Keeping the payload short by using register-to-register comparison and storing the limits in
- Trashing the initial stage to prevent enemy bots from executing my code (
The active payload is the following:
cmp esp, ecx ; 39cc cmove esp, ebx ; 0f44e3 pushad ; 60 jmp esp ; ffe4
In this scenario, I start at the very top of the address space (1024), and trash it by 32 bytes at a time. Part of the data that I use to trash the address space is my actual payload, and
esp is always pointing there, so every time I overwrite the address space I jump into
esp, shifting my active payload by 32 bytes. Whenever the stack pointer reaches the address 0, I reset it back to 1024 with a
cmove, and start all over.
The r2wars competition was definitelly one of the highlights of r2con2018. The organization team was also very generous and rewarded r2wars competitors with some really cool
Something that I didn’t pay attention to during the games but realized during this retrospective, is that the r2wars event was great demonstration of the power of the
radare2 framework. Working through this competition with
rasm2 was a breeze and saved me a ton of time and effort. I could just concentrate on what I was actually doing instead of wasting time and energy fighting my toolchain, which would be the case if I played this event using some combination of
objdump, Python, C and
So yeah. Come and fight me on r2wars @ r2con2019.