@pnorman
Last active September 15, 2019 15:26
Draft install instructions for osm2pgsql + carto + renderd

Manually building a tile server

This page describes how to install, setup and configure all the necessary software to operate your own tile server. The step-by-step instructions are written for Ubuntu Linux 12.04 LTS (Precise Pangolin), however they should transfer fairly straightforwardly to other versions of Ubuntu or Linux distributions.

##Software installation

The OSM tile server stack is a collection of programs and libraries that work together to create a tile server. As so often with OpenStreetMap, there are many ways to achieve this goal and nearly all of the components have alternatives that have various specific advantages and disadvantages. This tutorial describes the most standard version that is also used on the main OpenStreetMap.org tile server.

This guide covers installation of osm2pgsql, loading a PostgreSQL/PostGIS database, and rendering tiles for an online webmap. The database can also be used to [develop stylesheets], or render data with other software.

Before starting you want to update your system.

sudo apt-get update && sudo apt-get -y upgrade

On a brand new system you also want to run sudo apt-get dist-upgrade && sudo shutdown -r now, which upgrades everything and then reboots.

Basic software installation

There is a lot of software to install. The goal is to use packages and official PPAs whenever possible.

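A PPA (Personal Package Archive) is an apt repository hosted on Launchpad. PPAs are used to distribute packages that are missing from the official Ubuntu archive or are newer than the versions in it.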

All of this needs to be done as root, so start a root shell with sudo -i now.

PostgreSQL + PostGIS

Although PostgreSQL 9.1 and PostGIS 2.0 will work, it's best to go with the most recent versions which are faster and have stability improvements. It is possible to use PostgreSQL 8.4 or PostGIS 1.5 but it is not recommended.

We will use the official PostgreSQL repo for packages, but first we need to add it

cat > /etc/apt/sources.list.d/pgdg.list <<EOF
deb http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main
#deb-src http://apt.postgresql.org/pub/repos/apt/ precise-pgdg main
EOF
wget --quiet -O - http://apt.postgresql.org/pub/repos/apt/ACCC4CF8.asc | apt-key add -
apt-get update

Now we want to install PostgreSQL + PostGIS + hstore

apt-get update
apt-get --no-install-recommends install -y postgresql-9.3-postgis-2.1 postgresql-contrib-9.3

Misc software

We need a bunch of assorted packages for the upcoming steps

apt-get --no-install-recommends install -y python-software-properties git unzip

If we're rendering tiles we need a webserver, and we also need it for munin for monitoring. If you're purely running a database and not serving tiles or running munin, you can omit this.

apt-get --no-install-recommends install -y apache2.2-bin apache2.2-common apache2-mpm-worker

You really want a way to monitor your system's performance, and munin is the standard way to do that.

apt-get --no-install-recommends install -y munin-node munin munin-plugins-extra libdbd-pg-perl \
sysstat iotop

sed -i "s|Allow from.*|Allow from all|" /etc/munin/apache.conf
service apache2 reload

OSM Software

We're now going to install a bunch of OSM-specific software from a PPA.

If you only want the database you can omit renderd. If you don't want to ever update your database you can omit osmosis.

echo 'yes' | add-apt-repository ppa:kakrueger/openstreetmap
apt-get update
apt-get --no-install-recommends install -y osm2pgsql osmosis

You can now exit the root shell with exit.

Loading data

Setting up the database

Before loading data you need to create a user and a database. Replace username with your own login name; gis is the name of the database.

sudo -u postgres createuser -s username
sudo -u postgres createdb -O username gis
psql -d gis -c "CREATE EXTENSION hstore; CREATE EXTENSION postgis;"

If you're running munin you want to update its plugins

sudo munin-node-configure --sh | sudo sh
sudo service munin-node restart

Getting some data

First you need to download some data. For testing you probably want to use an extract. Go to http://download.geofabrik.de/ and find a small region like Liechtenstein. These instructions use the full planet, but switching to an extract is just a matter of changing filenames. The full planet is about 25GB.

point at torrent which can be faster

For the full planet

wget http://planet.openstreetmap.org/pbf/planet-latest.osm.pbf.md5
wget http://planet.openstreetmap.org/pbf/planet-latest.osm.pbf
md5sum -c planet-latest.osm.pbf.md5 # Check that the download wasn't corrupted

For just Liechtenstein

wget http://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf.md5
wget http://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf
md5sum -c liechtenstein-latest.osm.pbf.md5 # Check that the download wasn't corrupted
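
If you script the download, it is worth refusing to continue when the checksum fails. The following is a minimal sketch of that check, not part of any OSM tool. The function name verify_download is made up for illustration, and it assumes the .md5 file sits next to the download and names the same file inside it, as the commands above expect.

```shell
# verify_download FILE: succeeds only if FILE matches the checksum in FILE.md5.
# Hypothetical helper; assumes FILE.md5 is in the same directory as FILE and
# is in md5sum format ("<hash>  <filename>").
verify_download() {
    file="$1"
    if [ ! -f "$file" ] || [ ! -f "$file.md5" ]; then
        echo "missing $file or $file.md5" >&2
        return 1
    fi
    # md5sum -c looks up the filename listed in the .md5 relative to the
    # current directory, so run it from the directory holding the download.
    ( cd "$(dirname "$file")" && md5sum -c "$(basename "$file").md5" )
}
```

A script could then run verify_download ~/osm/liechtenstein-latest.osm.pbf || exit 1 before starting the import.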

Adjusting settings

osm2pgsql uses overcommit like many scientific and large data applications, which requires adjusting a kernel setting.

sudo sysctl -w vm.overcommit_memory=1

If you want to permanently adjust this setting you can use

sudo tee /etc/sysctl.d/60-overcommit.conf <<EOF
# Overcommit settings to allow faster osm2pgsql imports
vm.overcommit_memory=1
EOF
# Load the new file; plain "sysctl -p" only reads /etc/sysctl.conf
sudo sysctl -p /etc/sysctl.d/60-overcommit.conf
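
To confirm the setting took effect before starting a long import, you can read it back from procfs. This is a small sketch; the function name check_overcommit is an assumption for illustration, and the optional path argument exists only so the check can be tried against a test file.

```shell
# check_overcommit [PATH]: report whether the kernel's overcommit mode is 1.
# Hypothetical helper. PATH defaults to the standard procfs location.
check_overcommit() {
    setting_file="${1:-/proc/sys/vm/overcommit_memory}"
    mode=$(cat "$setting_file") || return 2
    if [ "$mode" = "1" ]; then
        echo "vm.overcommit_memory=1, ready for osm2pgsql"
    else
        echo "vm.overcommit_memory=$mode, expected 1" >&2
        return 1
    fi
}
```

Running check_overcommit with no arguments checks the live kernel setting.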

do we want to set vm.overcommit_ratio?

The default PostgreSQL settings aren't great for very large databases like OSM databases. Proper tuning can just about double the performance you're getting. The most important PostgreSQL settings to change are maintenance_work_mem and work_mem, both of which should be increased for faster loading and faster queries while rendering.

More information on tuning can be found at

Figuring out the osm2pgsql command line

Before getting into possible command lines, you need to decide what you want to do with the database.

You need to decide

  • How often you want to update it
  • If you are using a standard style
  • Where all the files are

An osm2pgsql command line is complicated, so it helps to break it down into parts.

An example command line is

~/osm/osm2pgsql/osm2pgsql --create --slim --drop \
-C 20000 --number-processes 4 \
--flat-nodes ~/osm/flat.nodes \
-S ~/osm/osm2pgsql/default.style --hstore --multi-geometry \
~/osm/planet-latest.osm.pbf

  • --create specifies that anything in the database can be discarded
  • --slim specifies to use slim mode, which uses less RAM and additional database tables, and allows updates. Non-slim mode on the entire planet takes well over 64GB of RAM
  • --drop removes the slim tables after import, preventing updates but saving space and making a faster import
  • -C 20000 sets the node cache to 20000 MB of RAM
  • --number-processes 4 instructs osm2pgsql to use 4 processes. This should be the same as the number of CPU threads you have, to a maximum of 8
  • --flat-nodes ~/osm/flat.nodes specifies where to create a 20GB file with every node location
  • -S ~/osm/osm2pgsql/default.style sets the location of the .style file, which tells osm2pgsql what columns to create
  • --hstore tells osm2pgsql to create a tags column as an hstore and put any tags it didn't put into their own columns into there
  • --multi-geometry tells osm2pgsql not to split up detached polygons like some administrative boundaries
  • ~/osm/planet-latest.osm.pbf is the location of the OSM data to load

If you want to do regular updates you must not use the --drop option. If you are doing updates every week or less frequently it can be worth it to use --drop and just reload all the data each time.
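
The decisions above can be sketched as a small script that prints the command line instead of running it. This is only an illustration, using the options already described. The function names and the OSM_DIR and CACHE_MB variables are assumptions, and paths containing spaces are not handled.

```shell
# Illustrative defaults; adjust to your own layout and hardware.
OSM_DIR="${OSM_DIR:-$HOME/osm}"
CACHE_MB="${CACHE_MB:-20000}"

# osm2pgsql_procs [THREADS]: one process per CPU thread, to a maximum of 8.
# THREADS defaults to what nproc reports (or 1 if nproc is unavailable).
osm2pgsql_procs() {
    threads="${1:-$(nproc 2>/dev/null || echo 1)}"
    if [ "$threads" -gt 8 ]; then threads=8; fi
    echo "$threads"
}

# build_osm2pgsql_cmd DATAFILE [yes|no]: print an osm2pgsql command line.
# The second argument says whether you plan to apply updates later.
build_osm2pgsql_cmd() {
    datafile="$1"
    want_updates="${2:-no}"
    cmd="$OSM_DIR/osm2pgsql/osm2pgsql --create --slim"
    # --drop removes the slim tables, so only add it when no updates are planned.
    if [ "$want_updates" != "yes" ]; then cmd="$cmd --drop"; fi
    cmd="$cmd -C $CACHE_MB --number-processes $(osm2pgsql_procs)"
    cmd="$cmd --flat-nodes $OSM_DIR/flat.nodes"
    cmd="$cmd -S $OSM_DIR/osm2pgsql/default.style --hstore --multi-geometry"
    echo "$cmd $datafile"
}
```

For example, build_osm2pgsql_cmd ~/osm/planet-latest.osm.pbf yes prints a command without --drop that you can review and then paste into a shell.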

Running osm2pgsql

After starting osm2pgsql with the above command you'll get a lot of NOTICE messages about tables not existing and being skipped. These are normal.

You should get a line like this eventually

Reading in file: ~/osm/planet-latest.osm.pbf
Processing: Node(1889609k 1170.8k/s) Way(182531k 18.68k/s) Relation(1934380 62.07/s)  parse time: 42547s

Speeds will depend on your server, but the above are reasonable and were on an Amazon EC2 m2.2xlarge instance with local storage.
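
The parse time is reported in seconds, which is hard to read for a multi-hour import. Here is a small sketch that converts it. The helper name parse_time_hms is made up, and it assumes the line contains "parse time: <number>s" as in the sample above, which may differ in other osm2pgsql versions.

```shell
# parse_time_hms LINE: pull the "parse time: Ns" field out of an osm2pgsql
# progress line and print it as hours, minutes and seconds.
# Hypothetical helper; returns 1 if the field is not found.
parse_time_hms() {
    secs=$(printf '%s\n' "$1" | sed -n 's/.*parse time: \([0-9][0-9]*\)s.*/\1/p')
    [ -n "$secs" ] || return 1
    printf '%dh %02dm %02ds\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}
```

Applied to the sample line above, the 42547 seconds come out as 11h 49m 07s.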

how much to explain about the output?

The next step is pending ways. You may get warnings here

Going over pending ways...
Maximum node in persistent node cache: 2297430015
        110203098 ways are pending

Using 4 helper-processes
WARNING: Failed to fork helper process 1: Cannot allocate memory. Trying to recover.
WARNING: Failed to fork helper process 2: Cannot allocate memory. Trying to recover.
WARNING: Failed to fork helper process 3: Cannot allocate memory. Trying to recover.
Mid: loading persistent node cache from /mnt/flat/flat.nodes
Maximum node in persistent node cache: 2297430015
Helper process 0 out of 1 initialised
processing way (7685k) at 1.04k/s (done 0 of 1)

At this point if all you want is an osm2pgsql database you can stop here (after osm2pgsql has finished).


stuff below is still draft

##Installing software

To render tiles we need some more software

Node

Carto, the stylesheet pre-processor used for the openstreetmap.org stylesheets, is written using node.js. Using anything other than the latest version of node.js is a headache, so we'll get it from a PPA. We also need part of gdal.

sudo add-apt-repository ppa:chris-lea/node.js
sudo apt-get update
sudo apt-get  --no-install-recommends -y install nodejs gdal-bin libgdal1-dev

mapnik

Mapnik is the library used for rendering maps. We need version 2.1 or later and it's a pain to compile from source, so we'll use the official PPA

sudo add-apt-repository ppa:mapnik/v2.2.0
sudo apt-get update
sudo apt-get install libmapnik mapnik-utils python-mapnik libmapnik-dev

renderd/mod_tile

Cannot use PPA

sudo apt-get install apache2-dev
cd ~/osm
git clone https://github.com/openstreetmap/mod_tile.git
cd mod_tile
./autogen.sh
./configure
make
sudo make install
sudo make install-mod_tile

Setting up the styles

get osm-carto

clone

run shapefile script

install carto

process with carto

setting up renderd

adjust renderd.conf

adjust apache config?


more stuff?

server mode tilemill

You need node and mapnik as installed above, as well as assorted other packages already installed when you installed osm2pgsql

dependencies

sudo apt-get  --no-install-recommends -y install git build-essential protobuf-compiler libprotobuf-lite7 libprotobuf-dev libgdal1-dev libmapnik-dev mapnik-utils

If using tilemill on a headless server you need to remove topcube from package.json

cd ~/osm
git clone https://github.com/mapbox/tilemill.git
cd tilemill
sed -i "s/\"topcube\": *\"[^\"]*\",//" package.json
npm install millstone
npm install jsdom
npm install

Tuning PostgreSQL

The following should tune your configuration to something more reasonable

# variable for the PG conf location because we're working on it so much
PGCONF="/etc/postgresql/9.3/main/postgresql.conf"

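# Settings to allow 2GB of shared memory
# shmmax is in bytes, shmall is in 4kB pages (2147483648 / 4096 = 524288)
sed -i 's/#*kernel.shmmax = .*/kernel.shmmax = 2147483648/' /etc/sysctl.d/30-postgresql-shm.conf
sed -i 's/#*kernel.shmall = .*/kernel.shmall = 524288/' /etc/sysctl.d/30-postgresql-shm.conf
# Plain "sysctl -p" only reads /etc/sysctl.conf, so name the file explicitly
sysctl -p /etc/sysctl.d/30-postgresql-shm.conf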

sed -i 's/#*shared_buffers = .*/shared_buffers = 1GB/' $PGCONF
# Set to about 50% of system memory
sed -i 's/#*effective_cache_size = .*/effective_cache_size = 8GB/' $PGCONF

# If you have the RAM, 1GB is good
sed -i 's/#*maintenance_work_mem = .*/maintenance_work_mem = 256MB/' $PGCONF

# A larger work_mem helps speed up rendering queries
# The ^ anchor stops this pattern from also rewriting the maintenance_work_mem line
sed -i 's/^#*work_mem = .*/work_mem = 32MB/' $PGCONF

# Suggested parameters for bulk loading
sed -i 's/#*checkpoint_segments = .*/checkpoint_segments = 256/' $PGCONF
sed -i 's/#*checkpoint_completion_target = .*/checkpoint_completion_target = 0.9/' $PGCONF

# Parameter tuning
sed -i 's/#*random_page_cost = .*/random_page_cost = 2.0/' $PGCONF
# On older PG versions increase cpu_tuple_cost to 0.05-0.10

# Autovacuum tuning to minimize bloat
sed -i 's/#*autovacuum_vacuum_scale_factor = .*/autovacuum_vacuum_scale_factor = 0.04/' $PGCONF
sed -i 's/#*autovacuum_analyze_scale_factor = .*/autovacuum_analyze_scale_factor = 0.02/' $PGCONF
/etc/init.d/postgresql restart
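
The repeated sed calls above can be wrapped in a helper, sketched here under a few assumptions. The names set_conf and shmall_pages are made up for illustration, GNU sed is assumed for -i, and values containing | or & are not handled. The pattern is anchored at the start of the line so that setting work_mem cannot accidentally match inside maintenance_work_mem.

```shell
# set_conf FILE KEY VALUE: set "KEY = VALUE" in a postgresql.conf-style file,
# uncommenting the line if needed. Hypothetical helper.
# Anchored with ^ so that KEY only matches at the start of a line.
set_conf() {
    sed -i "s|^#*$2 = .*|$2 = $3|" "$1"
}

# shmall_pages BYTES [PAGE_SIZE]: kernel.shmall is counted in pages, so divide
# the byte limit by the page size (taken from getconf unless given).
shmall_pages() {
    page_size="${2:-$(getconf PAGE_SIZE)}"
    echo $(( $1 / page_size ))
}
```

With these, set_conf "$PGCONF" work_mem 32MB replaces one of the sed lines, and shmall_pages 2147483648 gives the kernel.shmall value that matches a 2GB kernel.shmmax.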
@steamboatid

sed -i 's/#*kernel.shmall = .*/kernel.shmmax = 524288/' /etc/sysctl.d/30-postgresql-shm.conf

it should be
sed -i 's/#*kernel.shmall = .*/kernel.shmall = 524288/' /etc/sysctl.d/30-postgresql-shm.conf
