@kevinfoerster
Created March 12, 2019 13:29
This script is based on https://robservatory.com/find-and-fix-non-searchable-pdfs/, but additionally runs OCRKit on any non-searchable PDFs it finds in the current working directory and its subdirectories.
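#!/bin/bash
# Find PDFs below the current directory that list no fonts
# (most likely image-only scans) and run OCRKit on each of them.

# Split the find output on newlines only, so paths with spaces stay intact.
# The trailing \b keeps the newline from being stripped by $(...).
saveIFS=$IFS
IFS=$(echo -en "\n\b")
FilesToCheck=$(find "$(pwd)" -maxdepth 99 -name "*.pdf")
for i in $FilesToCheck
do
    # pdffonts prints a header and a dashed separator, then one line per font.
    # If the last line still starts with "-", no fonts were listed.
    errCheck=$(pdffonts "$i" 2>&1 | tail -1)
    if [[ $errCheck =~ ^- ]]
    then
        printf 'running OCR on %s\n' "$i"
        /Applications/OCRKit\ Pro.app/Contents/MacOS/OCRKit\ Pro --lang de --format pdf "$i"
    fi
done
IFS=$saveIFS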
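
The same check can also be written without touching IFS, by passing NUL-delimited paths from find to the loop. The following is only a sketch of that variant, not part of the original script: the function names `lists_no_fonts` and `find_unsearchable_pdfs` are made up for illustration, and it assumes bash and a `pdffonts` binary on the PATH.

```shell
#!/bin/bash

# Hypothetical helper: succeeds when the pdffonts-style text on stdin lists
# no fonts, i.e. its last line is still the dashed separator under the header.
lists_no_fonts() {
    local last
    last=$(tail -n 1)
    [[ $last == -* ]]
}

# Hypothetical helper: prints, NUL-delimited, every *.pdf below directory $1
# (default: current directory) for which pdffonts lists no fonts.
find_unsearchable_pdfs() {
    local dir=${1:-.} file
    find "$dir" -type f -name '*.pdf' -print0 |
        while IFS= read -r -d '' file; do
            if pdffonts "$file" 2>&1 | lists_no_fonts; then
                printf '%s\0' "$file"
            fi
        done
}
```

The result could then be handed to OCRKit one file at a time, for example with `find_unsearchable_pdfs . | xargs -0 -n 1 "/Applications/OCRKit Pro.app/Contents/MacOS/OCRKit Pro" --lang de --format pdf`. Keeping the paths NUL-delimited means names containing spaces, newlines or glob characters should pass through unchanged.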