PostgreSQL Page Checksum Protection

This gist summarises a way to deliberately corrupt a data page in a PostgreSQL cluster and watch data checksums catch it. Most of the material is adapted from Luca's article.

Set up the database

Create and start the server with data checksums enabled

initdb -k -D cluster
pg_ctl -D cluster start

Create some fake data (get the dellstore dataset here)

createdb $USER
psql -f dellstore2-normal-1.0.sql

Corrupt a page

The goal is to corrupt $PGDATA/base/{oid}/{relfilenode}.

First, find the relfilenode of the user table with the largest number of pages (relpages); reltuples is shown for reference.

psql -c "SELECT relname, relpages, reltuples, relfilenode FROM pg_class
WHERE relkind = 'r' AND relname NOT LIKE 'pg%'
ORDER BY relpages DESC
LIMIT 1;"

Get the OID of the database

psql -c "SELECT datname, oid FROM pg_database;"
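The two query results combine into the on-disk path of the table. A minimal sketch, assuming the data directory is `cluster` and using the example values 16384 (database OID) and 16397 (relfilenode) that appear later in this gist; `relation_path` is a hypothetical helper name, and your own numbers will likely differ:

```shell
#!/bin/sh
# relation_path DATA_DIR DB_OID RELFILENODE
# Builds DATA_DIR/base/DB_OID/RELFILENODE, the layout described above.
relation_path() {
    printf '%s/base/%s/%s\n' "$1" "$2" "$3"
}

# Example values only; substitute the output of the two queries above.
target=$(relation_path cluster 16384 16397)
echo "$target"

# With the server running, PostgreSQL can also report the path
# (relative to the data directory) directly:
#   psql -Atc "SELECT pg_relation_filepath('customers');"
```

The helper only concatenates strings; it does not check that the file exists.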

Add a simple perl script (corrupt.pl) to corrupt the database:

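#!/usr/bin/env perl
# Usage: corrupt.pl <relation file> <byte offset within block 1>

open my $db_file, "+<", $ARGV[0]
    or die "Cannot open the file!\n\n";
# Skip block 0 (one 8 kB page), then move to the given offset inside block 1.
seek $db_file, ( 8 * 1024 ) + $ARGV[1], 0;
print { $db_file } "Hello Corrupted Database!";
close $db_file;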
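To see what the script's seek arithmetic does without touching a real relation file, the same overwrite can be tried on a throwaway file. This is only a sketch: it assumes the default 8 kB page size and the offset 5 used in the run below, and the scratch file created with `mktemp` merely stands in for the table file.

```shell
#!/bin/sh
BLOCK_SIZE=8192          # default PostgreSQL page size (8 kB)
OFFSET_IN_PAGE=5         # second argument passed to corrupt.pl
PAYLOAD='Hello Corrupted Database!'

scratch=$(mktemp)
# Two zero-filled "pages", standing in for blocks 0 and 1.
dd if=/dev/zero of="$scratch" bs="$BLOCK_SIZE" count=2 2>/dev/null

# Absolute byte position: skip block 0, then move into block 1.
pos=$(( BLOCK_SIZE + OFFSET_IN_PAGE ))

# conv=notrunc overwrites in place and keeps the file length unchanged,
# like opening the file with "+<" in Perl.
printf '%s' "$PAYLOAD" | dd of="$scratch" bs=1 seek="$pos" conv=notrunc 2>/dev/null

size=$(wc -c < "$scratch" | tr -d ' ')
written=$(dd if="$scratch" bs=1 skip="$pos" count=${#PAYLOAD} 2>/dev/null)
echo "pos=$pos size=$size written=$written"
rm -f "$scratch"
```

The write lands at byte 8197, inside block 1, which matches the block number reported in the error message further down.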

Run the script (you may have to run it multiple times)

sudo perl corrupt.pl cluster/base/16384/16397 5

Restart the database

pg_ctl -D cluster restart

Query the table to verify that an error is raised

psql -c "SELECT * FROM customers;"

# ERROR:  invalid page in block 1 of relation base/16384/16397

Stop the server

pg_ctl -D cluster stop

Verify the corruption offline

We can also confirm the corruption with Google's pg_page_verification tool, which lets us "verify checksums on PostgreSQL data pages without having to load each page into shared buffer cache."

To set it up, download a copy of the PostgreSQL source code and copy the tool's code to src/include/pg_page_verification.c and src/include/Makefile.

Afterward we can simply make and run the binary:

./pg_page_verification -D cluster

# CORRUPTION FOUND: 1
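The idea of checking a file page by page can be illustrated on a scratch file. This is a toy sketch, not how pg_page_verification or PostgreSQL computes checksums: `cksum` (the POSIX CRC) stands in for the real page checksum, and a baseline recorded before the overwrite stands in for the value PostgreSQL stores in each page header. `page_sums` is a hypothetical helper name.

```shell
#!/bin/sh
BLOCK_SIZE=8192
f=$(mktemp)
# Three zero-filled "pages": blocks 0, 1 and 2.
dd if=/dev/zero of="$f" bs="$BLOCK_SIZE" count=3 2>/dev/null

# Print one checksum per 8 kB block of the given file.
page_sums() {
    file=$1
    blocks=$(( $(wc -c < "$file") / BLOCK_SIZE ))
    i=0
    while [ "$i" -lt "$blocks" ]; do
        dd if="$file" bs="$BLOCK_SIZE" skip="$i" count=1 2>/dev/null \
            | cksum | cut -d' ' -f1
        i=$(( i + 1 ))
    done
}

page_sums "$f" > "$f.before"
# Same overwrite as corrupt.pl with offset 5: it lands inside block 1.
printf 'Hello Corrupted Database!' \
    | dd of="$f" bs=1 seek=$(( BLOCK_SIZE + 5 )) conv=notrunc 2>/dev/null
page_sums "$f" > "$f.after"

# Report zero-based block numbers whose checksum changed.
bad_blocks=$(paste "$f.before" "$f.after" | awk '$1 != $2 { print NR - 1 }')
echo "mismatched blocks: $bad_blocks"
rm -f "$f" "$f.before" "$f.after"
```

Only block 1 is reported, because the other two pages are byte-for-byte unchanged.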
