@basnijholt
Last active July 5, 2023 12:24
Using Slurm with ipyparallel

Using ipyparallel on the cluster

One time only

Create a parallel profile:

ipython profile create --parallel --profile=slurm

Then cd into ~/.ipython/profile_slurm/ and add (or edit) the following files.

ipcontroller_config.py:

c.HubFactory.ip = u'*'
c.HubFactory.registration_timeout = 600

ipengine_config.py:

c.IPEngineApp.wait_for_url_file = 300
c.EngineFactory.timeout = 300

ipcluster_config.py:

c.IPClusterStart.controller_launcher_class = 'SlurmControllerLauncher'
c.IPClusterEngines.engine_launcher_class = 'SlurmEngineSetLauncher'
c.SlurmEngineSetLauncher.batch_template = """#!/bin/sh
#SBATCH --ntasks={n}
#SBATCH --mem-per-cpu=4G
#SBATCH --job-name=ipy-engine-
srun ipengine --profile-dir="{profile_dir}" --cluster-id=""
"""

Every time you want to connect to the cluster

Open a tmux or screen session (so the cluster keeps running if your SSH connection drops) and run:

ipcluster start --n=500 --profile=slurm

You will see something like:

2018-05-24 01:35:07.560 [IPClusterStart] Removing pid file: /gscratch/home/t-banijh/.ipython/profile_slurm/pid/ipcluster.pid
2018-05-24 01:35:07.562 [IPClusterStart] Starting ipcluster with [daemon=False]
2018-05-24 01:35:07.565 [IPClusterStart] Creating pid file: /gscratch/home/t-banijh/.ipython/profile_slurm/pid/ipcluster.pid
2018-05-24 01:35:07.569 [IPClusterStart] Starting Controller with SlurmControllerLauncher
2018-05-24 01:35:07.599 [IPClusterStart] Job submitted with job id: '8839'
2018-05-24 01:35:08.601 [IPClusterStart] Starting 500 Engines with SlurmEngineSetLauncher
2018-05-24 01:35:08.627 [IPClusterStart] Job submitted with job id: '8840'
2018-05-24 01:35:38.659 [IPClusterStart] Engines appear to have started successfully
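If you capture this output to a file, a short script can pull out the Slurm job ids and check for the success line. The sketch below assumes the log format shown above; the helper name `summarize_ipcluster_log` is hypothetical, and the message wording may differ between ipyparallel versions.

```python
import re

# Patterns taken from the example output above; other ipyparallel
# versions may word these messages differently.
JOB_ID_RE = re.compile(r"Job submitted with job id: '(\d+)'")
READY_MARKER = "Engines appear to have started successfully"


def summarize_ipcluster_log(text):
    """Return (list of Slurm job ids, whether the engines reported ready)."""
    job_ids = JOB_ID_RE.findall(text)
    ready = READY_MARKER in text
    return job_ids, ready


sample_log = """\
2018-05-24 01:35:07.599 [IPClusterStart] Job submitted with job id: '8839'
2018-05-24 01:35:08.627 [IPClusterStart] Job submitted with job id: '8840'
2018-05-24 01:35:38.659 [IPClusterStart] Engines appear to have started successfully
"""

job_ids, ready = summarize_ipcluster_log(sample_log)
print(job_ids, ready)
```

The first job id belongs to the controller and the second to the engine set, so they can be passed to `squeue -j` to check on the jobs or to `scancel` to stop them.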

Wait until you see "Engines appear to have started successfully", then go to the notebook that runs on the cluster and connect a client:

import ipyparallel
client = ipyparallel.Client(profile='slurm')
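Engines can take a while to register after the Slurm job starts, so it can help to wait until enough of them are visible before submitting work. The following is a minimal sketch of such a polling loop. The helper `wait_for_engines` is hypothetical and takes any callable that returns the current engine ids, so it can be tried without a cluster; with a real client you would pass `lambda: client.ids`.

```python
import time


def wait_for_engines(get_ids, n_expected, timeout=600.0, poll=1.0):
    """Poll get_ids() until it reports at least n_expected engines.

    Returns the list of engine ids, or raises TimeoutError if fewer than
    n_expected engines have registered once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        ids = list(get_ids())
        if len(ids) >= n_expected:
            return ids
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(ids)} of {n_expected} engines registered"
            )
        time.sleep(poll)


# Stand-in for a cluster: engines "register" one poll at a time.
fake_states = iter([[], [0], [0, 1], [0, 1, 2]])
engine_ids = wait_for_engines(lambda: next(fake_states), 3, timeout=5, poll=0)
print(engine_ids)
```

With the client from above, this would be called as `wait_for_engines(lambda: client.ids, 500)`. After that, `client[:]` gives a view over all engines and `client.load_balanced_view()` gives one that schedules tasks dynamically.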