#! /bin/bash
## See also https://github.com/nextflow-io/nextflow/discussions/4308
## Run this from the parent directory of a Nextflow pipeline execution, i.e. the one that contains the .nextflow and work directories
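## Directory for the temporary listing files (first argument; falls back to the current directory)
WORKDIR="${1:-.}"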
## Find work directories essential to the last pipeline run, as absolute paths
nextflow log last > "$WORKDIR/preserve_dirs.txt"
## Find all work directories, as absolute paths
find "$(readlink -f ./work)" -maxdepth 2 -type d -path '**/work/*/*' > "$WORKDIR/all_dirs.txt"
## Concatenate, sort, and count, keeping the paths that show up only once (i.e., just once, from all_dirs.txt)
## Note: this assumes every path in preserve_dirs.txt is also listed in all_dirs.txt
sort "$WORKDIR/all_dirs.txt" "$WORKDIR/preserve_dirs.txt" | uniq -c | grep " 1 /" | grep -Po '/.+' > "$WORKDIR/to_delete_dirs.txt"
## Delete the extraneous work directories
xargs -r -d '\n' -P 4 -n 1 rm -rf < "$WORKDIR/to_delete_dirs.txt"
## Clean up
rm -f "$WORKDIR/preserve_dirs.txt" "$WORKDIR/all_dirs.txt" "$WORKDIR/to_delete_dirs.txt"
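
The filtering step in the script keeps every path that occurs exactly once in the combined lists, which would also catch a path that appears only in preserve_dirs.txt. A minimal sketch of an alternative is a plain set difference with `grep -vxFf`, which only ever prints lines taken from the full listing. The function name `stale_dirs` and the demo paths are made up for illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: print every line of
# the first file that has no exact match in the second file.
stale_dirs() {
    # -F fixed strings, -x whole-line match, -f read patterns from a file, -v invert
    grep -vxFf "$2" "$1"
}

# Demo with made-up paths
demo=$(mktemp -d)
printf '%s\n' /data/work/aa/111 /data/work/bb/222 /data/work/cc/333 > "$demo/all_dirs.txt"
printf '%s\n' /data/work/bb/222 /elsewhere/work/dd/444 > "$demo/preserve_dirs.txt"
stale_dirs "$demo/all_dirs.txt" "$demo/preserve_dirs.txt"
rm -rf "$demo"
```

The demo prints `/data/work/aa/111` and `/data/work/cc/333`. The preserved path under `/elsewhere` is never a deletion candidate, whereas a count-based filter would list it because it occurs only once. With GNU grep, an empty preserve list still makes every listed directory a candidate, so it is worth checking that `nextflow log last` produced output before deleting anything.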
IdoBar (author) commented Sep 27, 2024

Takes as input a path (a temporary directory or the current one) in which to write the temporary files that list the work directories.
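
A small guard for that argument could look like the sketch below. The function name `resolve_list_dir` is hypothetical, and treating a missing argument as the current directory is an assumption based on the note above.

```shell
#!/bin/bash
# Hypothetical helper: validate the directory that will hold the listing files.
resolve_list_dir() {
    local dir=${1:-.}    # assumption: no argument means "use the current directory"
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "error: '$dir' is not a writable directory" >&2
        return 1
    fi
    # Print the absolute, symlink-resolved path
    (cd "$dir" && pwd -P)
}

# Example: create a throwaway directory and resolve it
scratch=$(mktemp -d)
resolve_list_dir "$scratch"
rmdir "$scratch"
```

Failing early here avoids writing the listing files to an unintended location when the argument is missing or mistyped.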
