<n>
: user provided value
[n]
: an optional argument, usually depending on PEDA
[<n>]
: user provided value, but optional
Get peda with (copy-paste):
WD=/tmp/my_tools
mkdir -p $WD
cd $WD
chmod 777 $WD
curl -LO https://github.com/longld/peda/archive/master.zip
busybox unzip master.zip
chmod -R a+w $WD
alias gdb-peda="gdb -x $WD/peda-master/peda.py -q "
cd
pdisass <function_name>
: disassemble with color
pattern create <n>
: generates a pattern of size n
pattern search
: searches for the pattern in memory and in registers
searchmem <pattern> [<address range>]
: search for a pattern (supports regex) in memory. The address range helps you specify where to search (usually stack)
b *<function_name>+<offset>
: a more user-friendly way of setting up breakpoints
shellcode generate x86/linux exec
: generates a Linux x86-32 execve(/bin/sh) shellcode
r < <(<command>)
: gdb doesn't support pipes, but this command will run <command> and send its output to the stdin of the debugged program
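To see why pattern create and pattern search help you find an overflow offset, here is a minimal sketch of the idea. It is not PEDA's own generator: it assumes the well-known "Aa0Aa1Aa2..." scheme, and pattern_create / pattern_offset are hypothetical helper names. It should run unchanged on Python 2 and 3.

```python
import itertools
import string
import struct


def pattern_create(length):
    """Return up to `length` bytes of an 'Aa0Aa1Aa2...' pattern.

    Each 3-character chunk (upper, lower, digit) is distinct, so the
    pattern tops out at 26 * 26 * 10 * 3 = 20280 bytes.
    """
    triples = itertools.product(string.ascii_uppercase,
                                string.ascii_lowercase,
                                string.digits)
    chunks = []
    for upper, lower, digit in triples:
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length].encode("ascii")


def pattern_offset(register_value, length=1024):
    """Locate a 32-bit little-endian register value inside the pattern.

    Returns the byte offset, or -1 if the value is not part of the pattern.
    """
    needle = struct.pack("<I", register_value)
    return pattern_create(length).find(needle)


print(pattern_create(12))
# 0x41326141 is the example value a register might hold after the crash
print(pattern_offset(0x41326141))
```

Feed the pattern to the program, read the overwritten register after the crash, and the offset tells you how much padding goes before your return address.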
Cutter is great for disassembly and static analysis. Even though it's not as stable as Binary Ninja, it opens x86-64 binaries (PE32+, ELF64, Mach-O 64) for free.
It's a GUI, so there are not a lot of commands to type, but don't forget to use the dark theme, or you won't be able to easily read graphs etc. (God damned communist software)
Cutter => Preferences, in the panel change Qt Theme to Dark instead of Default.
Don't forget to use right-click to find X-refs, rename variables, add comments etc.
Use + and - to zoom in and out on the graph.
Yes, use Python, and Python 2 specifically. NOT Python 3. Python 3 assumes data stored as strings is Unicode and tries to encode or decode it when you mix it with raw bytes, which causes errors when you handle binary data such as shellcodes and return addresses. It sucks: you will need to treat everything as a bytearray.
Python 2 is great because it is straightforward: you can focus on your exploit and not on a missing semicolon at the end of a line.
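If you do end up treating everything as raw bytes, a sketch like the following keeps the payload binary from start to finish, so no implicit text decoding ever happens. The NOP sled size, the padding and the return address are made-up values for illustration only. The snippet is written to behave the same on Python 2 and 3.

```python
import binascii

# Hypothetical values, for illustration only
nop_sled = b"\x90" * 8              # x86 NOP instructions
padding = b"A" * 4                  # filler up to the saved return address
ret_addr = b"\x10\xa0\x04\x08"      # made-up address, already little-endian

# A bytearray is mutable, so the payload can be built piece by piece
payload = bytearray()
payload += nop_sled
payload += padding
payload += ret_addr

# Hex dump of the final payload, handy to check it before sending
print(binascii.hexlify(bytes(payload)))
```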
Since Python does not require variables to be declared, I recommend using a linter (Anaconda/SublimeLinter) to catch potential errors like:
test = 1
tset = test + 2
test + 3 * test + 150
This won't crash, but it won't yield the expected result. A linter will tell you that tset is set but never used and will highlight the line.
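To give an idea of how a linter spots this, here is a rough sketch using the standard ast module. It is far simpler than a real linter (no scopes, no imports, no attributes), and assigned_but_unused is a hypothetical helper name.

```python
import ast


def assigned_but_unused(source):
    """Return the names that are assigned somewhere but never read."""
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    return sorted(stored - loaded)


# The typo example from above: tset is written once and never read
snippet = "test = 1\ntset = test + 2\ntest + 3 * test + 150\n"
print(assigned_but_unused(snippet))
```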
struct is probably the most useful package, and it is available on every machine that runs Python. It basically helps you convert integers, longs, etc. into a byte representation, taking endianness into account. The following code will yield '\xef\xbe\xad\xde':
import struct
struct.pack('<I', 0xdeadbeef)
It will help you build your payloads while keeping the code readable and flexible.
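A few more struct calls that tend to come up, as a sketch: the same value in both byte orders, a 64-bit pack, several values at once, and the reverse operation with unpack. The two addresses in chain are made up for the example.

```python
import struct

# Same value, little-endian ('<') versus big-endian ('>')
little = struct.pack("<I", 0xdeadbeef)
big = struct.pack(">I", 0xdeadbeef)

# 'Q' is an unsigned 64-bit integer, useful for x86-64 addresses
wide = struct.pack("<Q", 0xdeadbeef)

# Several values in one call (hypothetical addresses)
chain = struct.pack("<II", 0x08048000, 0x0804a010)

# unpack goes the other way, e.g. to decode a leaked address
(leaked,) = struct.unpack("<I", b"\xef\xbe\xad\xde")

print(repr(big))
print(hex(leaked))
```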
subprocess is a very flexible way to run your vulnerable binaries.
The following code runs /home/level0/level0 with a modified PATH environment variable and a pipe on the standard input. Make sure to use flush() to ensure the reading program receives your write (useful for multi-stage payloads/exploits).
import os
import subprocess
# Get the env, change PATH variable
new_env = os.environ.copy()
new_env.update({"PATH": "/tmp/bin:/tmp/usr"})
# Build the argv array
args = ['/home/level0/level0', 'the arg1', 'the arg2']
# Run the program, piping the stdin
proc = subprocess.Popen(args, env=new_env, stdin=subprocess.PIPE)
# Write in the pipe to the stdin, flush it to make sure it does not wait for more data
proc.stdin.write("lmao I write on the stdin using a pipe !\n")
proc.stdin.flush()
proc.stdin.write("lmao I write on the stdin using a pipe AGAIN!\n")
proc.stdin.flush()
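For multi-stage exploits you usually also want to read what the program answers before sending the next stage. Here is a sketch of that loop. The child process is a stand-in, not a real vulnerable binary: a small Python one-liner that echoes each line back, so the example can run anywhere.

```python
import subprocess
import sys

# Stand-in target (assumption): echoes every stdin line back with a prefix
child_code = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write('got: ' + line)\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# Stage one: write, flush, then read the answer before going on
proc.stdin.write(b"stage one\n")
proc.stdin.flush()
first_reply = proc.stdout.readline()

# Stage two can now depend on what stage one returned
proc.stdin.write(b"stage two\n")
proc.stdin.flush()
second_reply = proc.stdout.readline()

# Closing stdin lets the child see EOF and exit
proc.stdin.close()
proc.wait()

print(first_reply)
print(second_reply)
```

Without the flush() calls the first readline() could block forever, because the child would still be waiting for data sitting in your buffer.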
Tabs are great for adapting indentation to the reader's preference, but they are a pain when copy-pasting (terminals often ask whether you want to keep them or convert them). Plus, using spaces is PEP 8, Python's best practices.
Often it won't be easy (long unreadable passwords etc.) to copy your script to the vulnerable VM; copy-pasting into nano will do the job. rm -f solve.py ; nano solve.py will easily do the job without auto-indenting the code as vim would.
Although it's not necessary to reinvent the wheel, you will be led to repeat some actions; script them when possible. Python has a gazillion libraries, the most useful ones are generally installed, and development is really easy.
There is no shame in automation, only in doing the same thing over and over just to show everyone you're a real h4xx0rz.
Reverse engineering is tricky to learn. It's boring and frustrating for beginners, but it's not that hard to train.
Just create a simple C program and progressively add complexity. Start with a simple variable assignment, then add some function calls, then structures, then compile using different levels of optimization (-O1, -O2, -O3) etc. Just do NOT forget to compile for the same architecture as the one you are learning to reverse. Arguments are not passed the same way on Linux x86-64 (registers) as on Linux x86-32 (stack), instructions are added, etc.
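In the spirit of scripting what you repeat, here is a small sketch that builds the gcc command lines for one training source at several optimization levels. train.c, the output names and the extra -O0 baseline are assumptions for the example; -m32 targets x86-32 and may need the 32-bit libraries installed on a 64-bit host.

```python
def build_commands(source, arch_flag="-m32",
                   levels=("-O0", "-O1", "-O2", "-O3")):
    """Return one gcc command (as an argv list) per optimization level."""
    stem = source.rsplit(".", 1)[0]
    commands = []
    for level in levels:
        # e.g. train.c at -O2 becomes the binary train_O2
        output = "%s_%s" % (stem, level.lstrip("-"))
        commands.append(["gcc", arch_flag, level, "-o", output, source])
    return commands


# train.c is a hypothetical training program
for command in build_commands("train.c"):
    print(" ".join(command))
```

Each list can be handed to subprocess.check_call, and the resulting binaries compared side by side with pdisass.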
It seems obvious, but when you read mov eax, BYTE [eax + ecx], eax is probably a string pointer and ecx is its index.
This gist may be updated in the future, so star it so it shows up in your feed when I push an update.