Binary exploitation tools

Conventions

<n> : user provided value
[n] : an optional argument, usually depending on PEDA
[<n>] : user provided value, but optional

gdb-peda

Get peda

Get peda with (copy-paste) :

WD=/tmp/my_tools
mkdir -p $WD
cd $WD
chmod 777 $WD
curl -LO https://github.com/longld/peda/archive/master.zip
busybox unzip master.zip
chmod -R a+w $WD
alias gdb-peda="gdb -x $WD/peda-master/peda.py -q " 
cd

Most useful peda commands

pdisass <function_name> : disassemble with color
pattern create <n> : generates a pattern of size n
pattern search : searches for the pattern in memory and in registers
searchmem <pattern> [<address range>] : searches for a pattern (supports regex) in memory. The address range lets you specify where to search (usually the stack)
b *<function_name>+<offset> : a more user-friendly way of setting up breakpoints
shellcode generate x86/linux exec : generates a Linux x86-32 execve(/bin/sh) shellcode.
r < <(<command>) : gdb has no pipe syntax for the debugged program's input, but this runs command (through bash process substitution) and sends its output to the stdin of the debugged program.
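
To make the idea behind pattern create / pattern search concrete, here is a minimal sketch of a cyclic pattern generator and offset lookup. It is NOT PEDA's own algorithm: it uses a simpler 'Aa0Aa1Aa2...' layout, and the names pattern_create and pattern_offset as well as the "crashed" register value are made up for the illustration.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Return up to `length` bytes of an 'Aa0Aa1Aa2...' style pattern.

    Hypothetical helper: the layout tops out at 20280 bytes.
    """
    chunks = []
    size = 0
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if size >= length:
            break
        chunks.append(upper + lower + digit)
        size += 3
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(pattern, value):
    """Return the offset of a 32-bit little-endian value in the pattern, or -1."""
    needle = struct.pack("<I", value)
    return pattern.find(needle)


pattern = pattern_create(200)

# Pretend EIP held these four pattern bytes at crash time (made-up for the demo).
crashed_eip = struct.unpack("<I", pattern[112:116])[0]
offset = pattern_offset(pattern, crashed_eip)
```

The offset found this way is the amount of padding to put in front of the return address in the payload.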

Cutter

Cutter is great for disassembly and static analysis. Even though it's not as stable as Binary Ninja, it opens x86-64 binaries (PE32+, ELF64, Mach-O 64) for free.

It's a GUI, so there aren't many commands to type, but don't forget to switch to the dark theme or you won't be able to read the graphs easily (God damned communist software): Cutter => Preferences, then in the panel change Qt Theme from Default to Dark.

Don't forget to use right-click to find X-refs, rename variables, add comments etc.

Use + and - to zoom in and out on the graph

python2

Yes, use Python, and Python 2 specifically. NOT Python 3. Python 3 treats strings as Unicode text and keeps them separate from bytes: mixing the two raises a TypeError, and printing a string re-encodes it, which mangles binary data such as shellcode and return addresses. It sucks: you will need to treat everything as bytes or a bytearray.
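
If you do end up stuck with Python 3, a bytes-only style along these lines should behave the same on both versions. This is only a sketch; the padding size and the address bytes are placeholders.

```python
# Keep every piece of the payload as bytes literals (b"..."), never plain text.
padding = b"A" * 8
ret = b"\xef\xbe\xad\xde"          # placeholder address, already little-endian
payload = padding + ret

# Indexing bytes returns an int on Python 3 and a 1-char string on Python 2,
# so slice instead of index when you need a single byte.
first = payload[:1]

# A bytearray is mutable, which is handy for patching a payload in place.
buf = bytearray(payload)
buf[0] = 0x42                      # overwrite the first 'A' with 'B'
patched = bytes(buf)
```

When writing such a payload to stdout on Python 3, go through sys.stdout.buffer so that no text encoding is applied.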

Python2 is great because it is straightforward, you can focus on your exploit and not a missing semicolon at the end of a line.

Since Python doesn't require variables to be declared, I recommend using a linter (Anaconda/SublimeLinter) to catch potential errors like:

test = 1
tset = test + 2
test + 3 * test + 150

This won't crash, but it won't yield the right result either. A linter will tell you tset is assigned but never used and highlight the line.

struct

struct is probably the most useful package, and it is available on every machine that runs Python. It basically helps you convert integers, longs, etc. into a byte representation, taking endianness into account. The following code yields '\xef\xbe\xad\xde'

import struct

struct.pack('<I', 0xdeadbeef)

More info on struct

It will help you build your payloads while keeping the code readable and flexible.
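
As a sketch of what that looks like in practice, here is a payload made of padding followed by a packed return address. OFFSET and both addresses are placeholders for the illustration; the real values come from your own debugging session.

```python
import struct

OFFSET = 76                      # hypothetical distance to the saved return address
RET_ADDR_32 = 0xdeadbeef         # placeholder 32-bit address
RET_ADDR_64 = 0x00007fffdeadbeef # placeholder 64-bit address

# '<I' packs an unsigned 32-bit integer, little-endian (x86-32).
payload_32 = b"A" * OFFSET + struct.pack("<I", RET_ADDR_32)

# '<Q' packs an unsigned 64-bit integer, little-endian (x86-64).
payload_64 = b"A" * OFFSET + struct.pack("<Q", RET_ADDR_64)
```

Changing the offset or the address then means editing one constant instead of hand-rewriting escaped bytes.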

subprocess

subprocess is a very flexible way to run your vulnerable binaries.

The following code runs /home/level0/level0 with a modified PATH environment variable and a pipe on its standard input. Make sure to call flush() so the data actually reaches the reading program instead of sitting in a buffer (useful for multi-stage payloads/exploits)

import os
import subprocess

# Get the env, change PATH variable
new_env = os.environ.copy()
new_env.update({"PATH": "/tmp/bin:/tmp/usr"})

# Build the argv array
args = ['/home/level0/level0', 'the arg1', 'the arg2']

# Run the program, piping the stdin
proc = subprocess.Popen(args, env=new_env, stdin=subprocess.PIPE)

# Write to the stdin pipe, then flush so the data is sent now instead of waiting in the buffer
proc.stdin.write("lmao I write on the stdin using a pipe !\n")
proc.stdin.flush()
proc.stdin.write("lmao I write on the stdin using a pipe AGAIN!\n")
proc.stdin.flush()
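
If you also want to read what the program prints back, piping stdout works the same way. The sketch below uses a stand-in target (a Python one-liner that upper-cases its input) purely so that it can run anywhere; swap in your vulnerable binary and arguments.

```python
import subprocess
import sys

# Stand-in for the vulnerable binary (hypothetical): echoes stdin in upper case.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

# communicate() writes the data, closes stdin, and reads stdout until EOF,
# which avoids the deadlocks you can get with manual read()/write() calls.
out, _ = proc.communicate(b"hello from the pipe\n")
```

communicate() is a one-shot exchange; for multi-stage exploits that need several round trips, stick with write() and flush() as above.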

Use spaces, not tabs

Tabs are great for adapting indentation to the reader's preference, but they suck when copy-pasting (terminals often ask whether you want to keep them or convert them). Plus, spaces are what PEP 8, Python's style guide, recommends.

copy-paste, don't scp

Often it won't be easy (long unreadable passwords etc.) to copy your script to the vulnerable VM; copy-pasting into nano will do the job. rm -f solve.py ; nano solve.py works fine and won't re-indent the pasted code the way vim would.

Build your own tools

Although it's not necessary to reinvent the wheel, you will be led to repeat some actions; script them when possible. Python has a gazillion libraries, the most useful ones are generally already installed, and development is really easy.

There is no shame in automation, only in doing the same thing over and over just to show everyone you're a real h4xx0rz.

reverse-engineering

Reverse engineering is tricky to learn. It's boring and frustrating for beginners, but it's not that hard to train.

Train on your own (simple) programs

Just create a simple C program and progressively add complexity. Start with a simple variable assignment, then add some function calls, then structures, then compile using different levels of optimization (-O1, -O2, -O3) etc. Just do NOT forget to compile for the same architecture as the one you are learning to reverse. Arguments are not passed the same way on Linux x86-64 (registers) as on Linux x86-32 (stack), extra instructions are available, etc.

Learn how to recognize data types

It seems obvious, but when you read movzx eax, BYTE [eax + ecx], eax is probably a string pointer and ecx the index into it.

This gist may be updated in the future, so star it so it shows up in your feed when I push an update.

hlequien/binary_exploitation_tools.md