Although Cypher seems to be a nice RPG system, its shitty PDF layout makes creating characters a pain, especially when choosing their abilities. I kept flipping back and forth, trying to remember which page I had been reading before. This is insane. Also, with the abilities parsed, it's way easier to create beautiful, automated character sheets in Google Docs and whatnot.
To run it, first install the Xpdf command line tools and add them to your $PATH.
Also, make sure you have the same official digital version of the PDF that I do ("Cypher System Rulebook, Revised Edition"), or alter the variables and margin parameters to match your copy.
$ md5sum Cypher_System_-_Revised_Edition.pdf
de36ce487fb46b9f2373d713205579f4 Cypher_System_-_Revised_Edition.pdf
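Both prerequisites can be checked before running anything. The following is only a sketch: the function names (`have_pdftotext`, `md5_matches`, `preflight`) are made up for this example, and it assumes `md5sum` is available as in the command above.

```bash
#!/usr/bin/env bash
# Sketch: preflight checks before running the extraction script.
# The function names are hypothetical; EXPECTED_MD5 is the checksum quoted above.

EXPECTED_MD5="de36ce487fb46b9f2373d713205579f4"

# Succeeds if pdftotext can be found on $PATH.
have_pdftotext() {
  command -v pdftotext >/dev/null 2>&1
}

# Succeeds if file $1 exists and its MD5 checksum equals $2.
md5_matches() {
  local file="$1" expected="$2" actual
  [ -f "$file" ] || return 1
  actual="$(md5sum "$file" | cut -d' ' -f1)"
  [ "$actual" = "$expected" ]
}

# Prints a warning for each failed check; returns non-zero if any failed.
preflight() {
  local pdf="$1" status=0
  if ! have_pdftotext; then
    echo "pdftotext not found on \$PATH; install the Xpdf command line tools" >&2
    status=1
  fi
  if ! md5_matches "$pdf" "$EXPECTED_MD5"; then
    echo "$pdf is missing or is a different edition; the margins may need adjusting" >&2
    status=1
  fi
  return "$status"
}
```

Calling `preflight "Cypher_System_-_Revised_Edition.pdf" || exit 1` at the top of the script would stop it early instead of producing garbage columns.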
#!/usr/bin/env bash
INPUT="Cypher_System_-_Revised_Edition.pdf"
START_PAGE=109
END_PAGE=201
mkdir -p pages
for i in $(seq "$START_PAGE" "$END_PAGE"); do
  if (( i % 2 == 0 )); then
    # even pages
    echo "Processing page $i, column A"
    pdftotext -f "$i" -l "$i" -nodiag -nopgbrk -marginl 70 -margint 40 -marginr 342 -marginb 0 "$INPUT" "pages/${i}_1.txt"
    echo "Processing page $i, column B"
    pdftotext -f "$i" -l "$i" -nodiag -nopgbrk -marginl 262 -margint 40 -marginr 148 -marginb 0 "$INPUT" "pages/${i}_2.txt"
  else
    # odd pages
    echo "Processing page $i, column A"
    pdftotext -f "$i" -l "$i" -nodiag -nopgbrk -marginl 150 -margint 40 -marginr 258 -marginb 0 "$INPUT" "pages/${i}_1.txt"
    echo "Processing page $i, column B"
    pdftotext -f "$i" -l "$i" -nodiag -nopgbrk -marginl 344 -margint 40 -marginr 40 -marginb 0 "$INPUT" "pages/${i}_2.txt"
  fi
done
cat pages/*.txt > abilities.txt
printf "Name\tCost\tDescription\tType\tFull Parsed String\n" > output.tsv
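# Pipeline steps (the sed expressions rely on GNU sed extensions):
#   1. Remove the "ABILITIES" page headers
#   2. Join multiline (huge text) abilities into one line per ability;
#      a line starting with "11" is treated as the start of a new ability
#   3. Sort the abilities
#   4. Parse to TSV: https://regex101.com/r/RTyt36/3
grep -Ev '^ABILITIES' abilities.txt \
  | sed ':a;N;/\n11/!s/\n/ /;ta;P;D' \
  | sort \
  | sed -E 's/^11(([^(:]+\b) ?(\(([^)]+)\))?:? (.*(Action[^.]*|Enabler).*|.*))/\2\t\4\t\5\t\6\t\1/gm' \
  >> output.tsv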
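The multiline join is the least obvious part of the pipeline, so here it is in isolation. This is just a sketch: the sample lines are made up (they are not text from the rulebook), `join_entries` is a hypothetical helper name, and it needs GNU sed like the script itself.

```bash
#!/usr/bin/env bash
# Sketch of the "join multiline abilities" step on made-up sample lines.
# Requires GNU sed. join_entries is a hypothetical name for this example.

join_entries() {
  # N appends the next input line to the pattern space. Unless that line
  # starts a new "11" entry, the newline is replaced by a space and the
  # loop repeats (ta). Otherwise the finished entry is printed (P) and the
  # cycle restarts with the remaining line (D).
  sed ':a;N;/\n11/!s/\n/ /;ta;P;D'
}

printf '%s\n' \
  '11Sample Ability (2 Might points): First line of the' \
  'description, continued on a second line. Action.' \
  '11Other Ability: Single-line description. Enabler.' \
  | join_entries
# Output:
# 11Sample Ability (2 Might points): First line of the description, continued on a second line. Action.
# 11Other Ability: Single-line description. Enabler.
```

With one ability per line, the final `sed -E` expression can then split each line into the name, cost, description and type columns.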