@clintval
Last active March 28, 2024 04:40
Bioinformatics example of a multi-stage Dockerfile

Multi-Stage Dockerfile for Bioinformatics

A set of directives creates a "builder" stage that serves as the base for every dependency stage and for the final stage.

Samtools and BLAST are each built in their own stage on top of the builder stage. Because neither stage depends on the other, BuildKit can build them in parallel.

The final stage inherits from the builder stage and copies in only the binaries from the samtools and BLAST stages.

Altering a directive in the samtools stage does not invalidate the cache for the BLAST stage (and vice versa), so caching is more efficient.
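
The stage layout described above can be sketched in a stripped-down form. This is a minimal illustration rather than the Dockerfile from this gist: the stage names `tool-a` and `tool-b`, and the placeholder scripts they write, are hypothetical stand-ins for real build steps.

```dockerfile
# syntax=docker/dockerfile:1.3
# Shared base: anything installed here is inherited by every later stage.
FROM openjdk:8-slim-buster AS builder
RUN apt-get update && apt-get install -y git

# Hypothetical tool stage; the script stands in for a real compile step.
FROM builder AS tool-a
RUN printf '#!/bin/sh\necho tool-a\n' > /usr/local/bin/tool-a \
    && chmod +x /usr/local/bin/tool-a

# Second hypothetical tool stage. It does not reference tool-a, so editing
# one of the two stages leaves the other stage's cache intact.
FROM builder AS tool-b
RUN printf '#!/bin/sh\necho tool-b\n' > /usr/local/bin/tool-b \
    && chmod +x /usr/local/bin/tool-b

# Final stage: start again from builder and copy in only the finished files.
FROM builder
COPY --from=tool-a /usr/local/bin/tool-a /usr/local/bin/tool-a
COPY --from=tool-b /usr/local/bin/tool-b /usr/local/bin/tool-b
```

Passing `--target tool-a` to `docker build` stops the build at that stage, which is a quick way to debug one tool without building the others.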

Build the Dockerfile

docker build ./ --no-cache
[+] Building 57.5s (16/16) FINISHED
 => [internal] load build definition from Dockerfile                                                           0.0s
 => => transferring dockerfile: 129B                                                                           0.0s
 => [internal] load .dockerignore                                                                              0.0s
 => => transferring context: 2B                                                                                0.0s
 => resolve image config for docker.io/docker/dockerfile:1.3                                                   0.5s
 => CACHED docker-image://docker.io/docker/dockerfile:1.3@sha256:42399d4635eddd7a9b8a24be879d2f9a930d0ed040a6  0.0s
 => [internal] load build definition from Dockerfile                                                           0.0s
 => [internal] load .dockerignore                                                                              0.0s
 => [internal] load metadata for docker.io/library/openjdk:8-slim-buster                                       0.4s
 => CACHED [builder 1/3] FROM docker.io/library/openjdk:8-slim-buster@sha256:021eeaf954b19feb4129cc9f6c7831b4  0.0s
 => [builder 2/3] RUN apt-get update && apt-get install -y git                                                10.1s
 => [builder 3/3] RUN mkdir -p -m 0600 ~/.ssh     && ssh-keyscan github.com >> ~/.ssh/known_hosts              0.9s
 => [blast 1/1] RUN apt-get update     && apt-get install -y wget     && wget https://ftp.ncbi.nlm.nih.gov/b  26.3s
 => [samtools 1/2] RUN apt-get update && apt-get -y install     autoconf     automake     build-essential     14.2s
 => [samtools 2/2] RUN wget https://github.com/samtools/samtools/releases/download/1.11/samtools-1.11.tar.bz  30.4s
 => [stage-3 1/2] COPY --from=blast    /usr/local/bin/blastn   /bin/blastn                                     0.1s
 => [stage-3 2/2] COPY --from=samtools /usr/local/bin/samtools /bin/samtools                                   0.0s
 => exporting to image                                                                                         0.7s
 => => exporting layers                                                                                        0.7s
 => => writing image sha256:edb3b52fb45bd9e1e8d61f10265ca2d46332e07d8127d5a92587e53d8cc4e0f3                   0.0s

Test the Image

docker run -it edb3b52fb45b samtools collate
Usage: samtools collate [-Ou] [-o <name>] [-n nFiles] [-l cLevel] <in.bam> [<prefix>]

Options:
      -O       output to stdout
      -o       output file name (use prefix if not set)
      -u       uncompressed BAM output
      -f       fast (only primary alignments)
      -r       working reads stored (with -f) [10000]
      -l INT   compression level [1]
      -n INT   number of temporary files [64]
      --no-PG  do not add a PG line
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
      --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
  -@, --threads INT
               Number of additional threads to use [0]
      --verbosity INT
               Set level of verbosity
  <prefix> is required unless the -o or -O options are used.

Dockerfile

# syntax=docker/dockerfile:1.3
FROM openjdk:8-slim-buster AS builder
RUN apt-get update && apt-get install -y git
# Not actually needed by any subsequent commands, but shows how you can bake
# things into the builder layer if they are commonly needed by other layers.
RUN mkdir -p -m 0600 ~/.ssh \
    && ssh-keyscan github.com >> ~/.ssh/known_hosts
################################################################################
FROM builder AS samtools
RUN apt-get update && apt-get -y install \
    autoconf \
    automake \
    build-essential \
    gcc \
    libbz2-dev \
    libcurl4-gnutls-dev \
    libncurses5-dev \
    libssl-dev \
    liblzma-dev \
    make \
    perl \
    wget \
    zlib1g-dev
RUN wget https://github.com/samtools/samtools/releases/download/1.11/samtools-1.11.tar.bz2 \
    && tar jxf samtools-1.11.tar.bz2 \
    && cd samtools-1.11 \
    && ./configure \
    && make -j 2 \
    && mv samtools /usr/local/bin/samtools \
    && cd .. \
    && rm -r samtools-1.11/ \
    && rm samtools-1.11.tar.bz2
################################################################################
FROM builder AS blast
RUN apt-get update \
    && apt-get install -y wget \
    && wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-linux.tar.gz \
    && tar -xvzf ncbi-blast-2.12.0+-x64-linux.tar.gz \
    && rm ncbi-blast-2.12.0+-x64-linux.tar.gz \
    && mv ncbi-blast-2.12.0+/bin/blastn /usr/local/bin/blastn \
    && rm -r ncbi-blast-2.12.0+
################################################################################
FROM builder
COPY --from=blast /usr/local/bin/blastn /bin/blastn
COPY --from=samtools /usr/local/bin/samtools /bin/samtools
ENV PATH="/bin:${PATH}"
CMD ["/bin/bash"]
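
The `known_hosts` entry baked into the builder stage becomes useful once a later stage needs to clone over SSH. The following is a sketch under assumptions, not part of this gist: the stage name `private-tool` and the repository `git@github.com:example-org/private-tool.git` are hypothetical, and it assumes the build is started with `docker build --ssh default ./` so that the host's SSH agent is forwarded into the build.

```dockerfile
# syntax=docker/dockerfile:1.3
FROM openjdk:8-slim-buster AS builder
RUN apt-get update && apt-get install -y git
# Record GitHub's host key once so that every derived stage trusts it.
RUN mkdir -p -m 0600 ~/.ssh \
    && ssh-keyscan github.com >> ~/.ssh/known_hosts

# Hypothetical stage that clones a private repository (assumed name and URL).
FROM builder AS private-tool
# The SSH agent socket is mounted only for this RUN step, so no key material
# is written into an image layer.
RUN --mount=type=ssh \
    git clone git@github.com:example-org/private-tool.git /opt/private-tool
```

Because the host key lives in the builder stage, any number of stages like this one can clone over SSH without repeating the `ssh-keyscan` step.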