snakemake notes gathered during Boston Snakemake Days 2021, https://koesterlab.github.io/bsd2021.html

Tips from Johannes Köster - Author of Snakemake on advanced use of snakemake

When defining conda environments, prefer channel priority in this order:

bioconda > conda-forge > anything else

When defining dependencies for a conda env, pin versions no tighter than major.minor (e.g., 1.2), and only rarely to major.minor.patch (e.g., 1.2.3).
Never pin a full version with a build string (major.minor.patch plus build), such as libgcc-ng=7.2.0=hdf63c60_3. Fully pinned specs will eventually break future snakemake runs, because the conda channels cannot keep every intermediate build of a package available.
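The pinning advice above can be applied mechanically to the output of an environment export. The following is a minimal sketch, assuming specs in the `name=version=build` form; `relax_pin` is a hypothetical helper written for illustration and is not part of conda or snakemake.

```python
def relax_pin(spec: str, keep_patch: bool = False) -> str:
    """Loosen a conda spec such as 'name=1.2.3=build' to 'name=1.2'.

    Hypothetical helper for illustration only. It assumes the single-'='
    form name=version=build and does not handle '==' or '>=' constraints.
    """
    parts = spec.strip().split("=")
    name = parts[0]
    if len(parts) == 1 or not parts[1]:
        return name  # no version given, nothing to loosen
    version_fields = parts[1].split(".")
    keep = 3 if keep_patch else 2  # major.minor by default, patch on request
    return f"{name}={'.'.join(version_fields[:keep])}"


# The build-pinned spec is the one quoted in the notes; the others are plain examples.
exported = ["libgcc-ng=7.2.0=hdf63c60_3", "samtools=1.13", "bwa"]
print([relax_pin(spec) for spec in exported])
print(relax_pin("libgcc-ng=7.2.0=hdf63c60_3", keep_patch=True))
```

The build string is always dropped, and the patch level is kept only when asked for, which mirrors the "major.minor by default, patch rarely" rule.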
Instead of conda envs, you may containerize a mature snakemake workflow using snakemake --containerize > Dockerfile. You can then run the workflow by specifying the resulting container. The container should be compatible with both docker and singularity. More at https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html
For frequently used rules, e.g., samtools sort, bwa align, etc., you may use the conda-based snakemake wrappers and their website. Also check out the community-supported snakemake workflow catalog and its website.
For large files that are shared across several workflows, e.g., bwa indexes, aggregated variant calls, bam files, etc., you may leverage snakemake's between-workflow caching. https://snakemake.readthedocs.io/en/stable/executing/caching.html
Prefer using best practices when working with and distributing your workflow. https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html
You can also combine one or more rules from other snakemake workflows (published on version control websites). https://snakemake.readthedocs.io/en/stable/snakefiles/deployment.html#using-and-combining-pre-exising-workflows
Follow best practices while working with snakemake.
For log files, you can redirect stdout and stderr either in the shell command or inside the script file. For script files, set up language-specific log capture at the top of the script, e.g., the sink function in R and the sys module in Python:

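# R script: send regular output and messages (warnings, errors) to the snakemake log file
log <- file(snakemake@log[[1]], open="wt")
sink(log)
sink(log, type="message")

# Python script: send stderr to the snakemake log file
import sys
sys.stderr = open(snakemake.log[0], "w")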

snakemake can use input functions: plain Python functions that receive the wildcards (as defined by the output) and return the input file(s). Similar functions can be used in the params section, including Python lambda functions. If an input function needs to return more than one named file, have it return a dictionary and wrap it in snakemake's unpack function, so that the key-value pairs become named inputs.
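As a rough illustration of input functions and unpack, the sketch below defines two such functions and calls them with a stand-in wildcards object so it runs without snakemake. The sample table, file paths, and rule name are made up for the example; the rule itself is shown only as a comment.

```python
from types import SimpleNamespace

# Hypothetical sample sheet; in a real workflow this would usually come from the config.
SAMPLES = {
    "A": {"r1": "reads/A_1.fastq.gz", "r2": "reads/A_2.fastq.gz"},
    "B": {"r1": "reads/B_1.fastq.gz", "r2": "reads/B_2.fastq.gz"},
}


def get_fastqs(wildcards):
    """Input function: list of files for the sample named by the output wildcard."""
    sample = SAMPLES[wildcards.sample]
    return [sample["r1"], sample["r2"]]


def get_fastqs_named(wildcards):
    """Input function returning a dict; meant to be wrapped in unpack() in a rule."""
    return dict(SAMPLES[wildcards.sample])


# Sketch of how these might be referenced in a Snakefile (not executed here):
#
# rule map_reads:
#     input:
#         unpack(get_fastqs_named)      # available as input.r1 and input.r2
#     output:
#         "mapped/{sample}.bam"
#     params:
#         prefix=lambda wildcards: f"tmp/{wildcards.sample}"
#     shell:
#         "..."

# Stand-in for the wildcards object that snakemake passes to input functions.
wc = SimpleNamespace(sample="A")
print(get_fastqs(wc))
print(get_fastqs_named(wc))
```

Returning a dictionary plus unpack keeps the shell command readable, since each file is addressed by name rather than by position.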
In the current version, 6.8.0, snakemake will not rerun the entire workflow if you add input files via the configfile, e.g., more fastqs for the first rule, while the final calls/all.vcf is already present. You can override this behavior with --list-input-changes, e.g., snakemake -R $(snakemake --list-input-changes). An upcoming release is expected to introduce snakemake --rerun-changes, which would rerun the entire workflow when (presumably) one or more of the input files change.
snakemake has experimental support for logging to Slack. Ref.: the snakemake logging documentation and engenegr/log_handler_slack.py
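A custom log handler might look roughly like the sketch below. It assumes that snakemake calls a function named `log_handler(msg)` with one dictionary per log record when started with `--log-handler-script`, and that records carry keys such as `level`, `msg`, `name`, and `jobid`; these key names are assumptions and should be checked against the installed version. The `post` function is a hypothetical stand-in that only collects messages, so no Slack credentials are needed to try it.

```python
SENT = []  # stand-in for a chat channel, so the sketch runs offline


def post(text):
    """Hypothetical delivery function; swap in a real Slack client call."""
    SENT.append(text)


def format_for_chat(msg):
    """Turn a log record (a dict) into one short line, or None to skip it."""
    level = msg.get("level")
    if level == "error":
        return f"workflow error: {msg.get('msg', '')}"
    if level == "job_error":
        return f"job failed: rule {msg.get('name', '?')} (jobid {msg.get('jobid', '?')})"
    return None  # ignore info, progress, and other routine records


def log_handler(msg):
    """Entry point that snakemake is assumed to call once per log record."""
    text = format_for_chat(msg)
    if text is not None:
        post(text)


# Simulated records, only to show which ones would be forwarded.
for record in [
    {"level": "info", "msg": "Building DAG of jobs..."},
    {"level": "job_error", "name": "bwa_map", "jobid": 3},
]:
    log_handler(record)
print(SENT)
```

Filtering on the record level keeps the channel quiet during normal runs and forwards only failures.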
snakemake supports scatter-gather, similar to HPC array-job logic. Also check the dna-seq-varlociraptor workflow.
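The scatter-gather pattern itself can be sketched in plain Python: split the work into n labelled chunks, process each chunk independently, then merge the results. This shows the general idea only, not snakemake's implementation; the function names, the region list, and the `1-of-n` label format are assumptions chosen for the example. The commented rule fragment shows how the corresponding Snakefile directive is believed to look.

```python
def scatter(items, n):
    """Split items into n chunks, keyed by labels such as '1-of-3'."""
    return {f"{i + 1}-of-{n}": items[i::n] for i in range(n)}


def process(chunk):
    """Placeholder for the per-chunk job, e.g., calling variants per region."""
    return [f"{region}.vcf" for region in chunk]


def gather(results):
    """Merge the per-chunk results back into one sorted list."""
    merged = []
    for label in results:
        merged.extend(results[label])
    return sorted(merged)


# Assumed Snakefile counterpart (not executed here):
#
# scattergather:
#     split=2
#
# rule split:
#     output:
#         scatter.split("splitted/{scatteritem}.txt")
#
# rule gather:
#     input:
#         gather.split("processed/{scatteritem}.txt")

regions = ["chr1", "chr2", "chr3", "chr4", "chr5"]
chunks = scatter(regions, 2)
results = {label: process(chunk) for label, chunk in chunks.items()}
print(chunks)
print(gather(results))
```

Each chunk is independent of the others, which is what lets a scheduler run them in parallel, much like the tasks of an HPC array job.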

