@raylee
Last active January 6, 2021 14:34
notes on exporting a wordpress site
#!/bin/bash
# Create a list of pages from the website, one URL per line, in a file named "pages". The
# easiest way for small sites is to get them from the admin interface: open each page in a
# tab, then copy all tab URLs to the clipboard with a Chrome extension:
# https://chrome.google.com/webstore/detail/copy-all-urls/djdmadneanknadilpjiknlnanaolmbfk?hl=en
# Have ripgrep and wget handy. The heavy lifting is done by an HTML-to-Markdown library, wrapped here:
# https://github.com/suntong/html2md#usage
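# Optional preflight, a minimal sketch that is not part of the original notes: warn early
# if any of the tools named above is not on PATH. The helper name missing_tools is
# hypothetical; it prints each missing tool name, one per line, and nothing otherwise.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
missing=$(missing_tools html2md rg wget)
[ -z "$missing" ] || echo "warning: missing tools:" $missing >&2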
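# Convert each page to Markdown, naming the output after the last path segment of its URL.
while IFS= read -r page; do
    [ -n "$page" ] || continue    # skip blank lines
    name=$(basename "$page")
    echo "$name"
    html2md -i "$page" > "$name"
done < pages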
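# A possible refinement, sketched under assumptions the notes above do not make: give each
# output file a .md extension and drop any ?query or #fragment before naming it. The
# function name page_outfile is hypothetical.
page_outfile() {
    local url="${1%%[?#]*}"    # cut everything from the first ? or # onward
    printf '%s.md\n' "$(basename "$url")"
}
# Usage inside the loop above: html2md -i "$page" > "$(page_outfile "$page")"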
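# Collect every .png/.jpg URL found inside Markdown link parentheses, then download each once.
rg -Iio '\([^()]*\.(png|jpg)\)' | tr -d '()' | sort -u | while IFS= read -r image; do
    wget "$image"
done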
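# A stricter variant of the extraction step above, as a sketch only. It reads Markdown on
# stdin and keeps just absolute http(s) image URLs, so two links on one line are not merged
# into a single match. The name extract_image_urls and the extra .jpeg/.gif extensions are
# assumptions; plain grep is swapped in for ripgrep so the helper runs on its own.
extract_image_urls() {
    grep -Eio '\(https?://[^()[:space:]]+\.(png|jpe?g|gif)\)' | tr -d '()' | sort -u
}
# Usage: cat -- * | extract_image_urls | while IFS= read -r image; do wget "$image"; done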