Script for Downloading and Parsing Retrosheet Data
# modified from: https://github.com/davidbmitchell/Baseball-PostgreSQL
# install the necessary tool
brew install chadwick
# create folders
mkdir -p /path/to/retrosheet/{unparsed,parsed}
cd /path/to/retrosheet/unparsed
# you can set startDecade to whichever decade you like
startDecade=1910 endDecade=2020
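while [ "$startDecade" -le "$endDecade" ] ; do
  # download each decade's event-file archive (curl is assumed here; wget works too)
  curl -fLO "http://www.retrosheet.org/events/${startDecade}seve.zip"
  startDecade=$((startDecade + 10))
done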
# unzip the downloaded files
find . -name "*.zip" -exec unzip {} \; -exec /bin/rm {} \;
# parse the data
# variables for first and last year
x=1910 y=2023
for (( i=$x; i<=$y; i++)); do cwevent -n -f 0-96 -x 0-60 -y "$i" "$i"*.EV* > ../parsed/all"$i".csv; done
for (( i=$x; i<=$y; i++)); do cwgame -n -f 0-83 -x 0-94 -y "$i" "$i"*.EV* > ../parsed/games"$i".csv; done
for (( i=$x; i<=$y; i++)); do cwsub -n -f 0-9 -y "$i" "$i"*.EV* > ../parsed/sub"$i".csv; done
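# Optional: the loops above run once for every year from x to y, even when no
# event files were downloaded for that year. In that case the "$i"*.EV* glob
# matches nothing and the tool is handed the literal pattern. The sketch below
# is one way to skip such years. has_event_files and years_with_events are
# hypothetical helper names, not part of Chadwick or of the original script.

```shell
# Succeed only if at least one event file for the given year exists.
# $1 = year, $2 = directory (defaults to the current directory)
has_event_files() {
  for f in "${2:-.}/$1"*.EV*; do
    # An unmatched glob stays literal, so test that the path really exists.
    [ -e "$f" ] && return 0
  done
  return 1
}

# Print every year between $1 and $2 (inclusive) that has event files in $3.
years_with_events() {
  year=$1
  while [ "$year" -le "$2" ]; do
    if has_event_files "$year" "${3:-.}"; then
      echo "$year"
    fi
    year=$((year + 1))
  done
}

# Example use, with the same cwevent flags as above:
# for i in $(years_with_events 1910 2023 .); do
#   cwevent -n -f 0-96 -x 0-60 -y "$i" "$i"*.EV* > ../parsed/all"$i".csv
# done
```

# The same loop shape should work for the cwgame and cwsub steps.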