@jbielick
Last active April 16, 2016 14:17
Download, convert, clean, and import US Census TIGER shapefiles into RethinkDB tables.
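# Census layers to fetch; the manifest rule upper-cases each name to form the FTP directory.
objects=zcta5 county concity
tiger_url=ftp://ftp.census.gov/geo/tiger/TIGER2015

zcta5: zcta5.import
county: county.import
concity: concity.import

# Import the cleaned GeoJSON into the RethinkDB table geo.<layer>.
%.import: %.geo.json
	rethinkdb import -f $< --table geo.$*
	@echo '$* import complete'

# Reproject to CRS:84 and stream the GeoJSON through the cleaning script.
# Output goes to a temp file first so a failed run leaves no partial target.
%.geo.json: %.shp
	ogr2ogr -t_srs crs:84 -f "GeoJSON" /vsistdout/ $< | \
	./clean_collection.js > $@.tmp
	mv $@.tmp $@

%.shp: %.zip
	unzip -d $@ $<

# Download the first archive listed in the manifest.
%.zip: %.manifest
	curl $(shell head -n 1 $<) -o $@

# List the layer's FTP directory and write one full URL per line.
%.manifest:
	$(eval url := $(tiger_url)/$(shell echo $* | tr -s '[:lower:]' '[:upper:]')/)
	curl -l $(url) | \
	sort -n | \
	sed 's,^,$(url)/,' > $@

all: $(objects)

clean:
	rm *.manifest

.PRECIOUS: %.zip %.geo.json
.PHONY: all clean %.import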
@brwnll

brwnll commented Apr 14, 2016

What language is this in?

@brwnll

brwnll commented Apr 14, 2016

Figured out this is a makefile, but attempts to execute it result in "curl: no URL specified!". It looks like the problem is on line 25; if this is a parameter I'm supposed to be supplying, I'm not sure what it should be.

@jbielick
Author

@brwnll, you were right that I didn't get a notification about gist comments :P Thanks for the email!

Are you importing into rethinkdb? I would suggest mongodb for a few reasons, but if you're already set on rethink, I've fixed up this makefile with the updated FTP URL and a few other small fixes (take a look at the revisions for more info).

This particular gist is pretty old. There are more examples of this makefile's usage, and of imports into rethink / mongo, in this repo, and I use this makefile for actual production imports for this project app. Take a look at the Makefile for either of those if you're interested, and if you'd like some specific help with importing this data into either mongo or rethinkdb, I can certainly help out! This gist also references a script, clean_collection.js, which is not present here. This makefile is a better example of simply generating the GeoJSON feature collection (delimited by line breaks) for general use in any import process. I had some issues with the rethinkdb import via the command line consuming too much memory, so a JS streaming script was the solution I eventually used.
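
Since clean_collection.js is not included in the gist, here is a rough sketch of what such a script's core transform might look like: it takes a GeoJSON FeatureCollection and emits one feature per line. Everything in it is an assumption rather than the original script: the function names `isUsable` and `toFeatureLines` are made up, the "cleaning" rule (dropping features without a geometry) is a guess, and the sample data is invented.

```javascript
'use strict';

// Hypothetical stand-in for the missing clean_collection.js.
// The real script's cleaning rules are unknown.

// Assumption: "cleaning" means dropping features that have no geometry.
function isUsable(feature) {
  return Boolean(feature && feature.type === 'Feature' && feature.geometry);
}

// Turn a GeoJSON FeatureCollection into newline-delimited JSON,
// one feature per line.
function toFeatureLines(collection) {
  if (
    !collection ||
    collection.type !== 'FeatureCollection' ||
    !Array.isArray(collection.features)
  ) {
    throw new TypeError('expected a GeoJSON FeatureCollection');
  }
  return collection.features
    .filter(isUsable)
    .map((feature) => JSON.stringify(feature))
    .join('\n');
}

// Tiny in-memory example (made-up data, not real TIGER output).
const sample = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      properties: { ZCTA5CE10: '00000' },
      geometry: { type: 'Point', coordinates: [-90, 40] },
    },
    // This feature has no geometry, so it is dropped.
    { type: 'Feature', properties: { ZCTA5CE10: '00001' }, geometry: null },
  ],
};

console.log(toFeatureLines(sample));
```

To sit in the pipeline after ogr2ogr, a script like this would still need to read the collection from stdin and write the lines to stdout. Parsing the whole collection in one go would probably not help with the memory problem on large layers; an incremental JSON parser would be the more likely fit there.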