- Container: an environment that runs an application independently of the host OS. Kind of like a lightweight VM. Containers are stateless; if you need to update the components inside, create a new container instead.
- Image: template to create a container. Its components are defined by a `Dockerfile`.
- Volume: storage area detached from the container for maintaining state. There are 4 types:
    - Volumes: managed by Docker itself, they're stored in a separate part of the host filesystem (on Linux this is `/var/lib/docker/volumes` by default). Useful for production deployments.
    - Bind mounts: mount a specific directory or file from the host to the container, such as a project folder. Useful for development.
    - Tmpfs mounts: mount a temporary file system in the container, which is stored in the host system's memory only, not on disk. Useful for sensitive information that should not be persisted between container restarts, or for performance reasons (RAM is faster than disk).
    - Named pipes: create a pipe which can be accessed in a FIFO manner from both the container and the host system. Used primarily for communication between the host and containers when running Docker on Windows.
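The first three mount types can be requested directly from `docker run`. A quick sketch (image and path names here are placeholders):

```sh
# Named volume managed by Docker (created automatically on first use)
docker run -it --rm -v my-volume:/app/data image_name:tag

# Bind mount: map a host directory into the container
docker run -it --rm -v "$(pwd)":/app image_name:tag

# Tmpfs mount: in-memory only, not persisted between restarts
docker run -it --rm --tmpfs /app/cache image_name:tag
```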
- Foreground/interactive vs background/detached: a detached container runs in the background, whereas an interactive container usually attaches a terminal that you can interact with.
List your local images
docker images
Delete images (many ways)
- You may use `docker image rm`, `docker image remove` or `docker rmi` for all of these commands.
- Delete an image based on its ID
docker rmi <image_id>
> It will return an error if the image has multiple tags attached to it.
- Force delete an image based on its ID (it will delete all tags associated with it)
docker rmi -f <image_id>
- Delete an image based on its repository and tag
docker rmi <repository>:<tag>
Clean up images
docker images -q -f dangling=true
- `-q` is for quiet mode (only display image IDs).
- `-f dangling=true` filters the results to only show untagged images.
docker image prune
> Removes all dangling images.
- You may also use `docker image prune -a` to remove all unused images (images not referenced by any containers).
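The filtered listing above can also be combined with `docker rmi` to delete dangling images explicitly, with much the same effect as `docker image prune`:

```sh
# Delete all dangling (untagged) images in one go
# (errors out with a usage message if there are none)
docker rmi $(docker images -q -f dangling=true)
```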
List your running containers
docker ps
docker ps -a
> List all containers, including stopped ones.
Stop a running container
docker stop <container_id>
Remove a container
docker rm <container_id>
docker rm -v <container_id>
> This will remove both the container and its volumes (named volumes however will not be deleted)
List volumes
docker volume ls
Delete a volume
docker volume rm <volume_id>
Remove unused volumes
docker volume prune
> Removes anonymous volumes not used by at least one container.
docker volume prune -a
> Removes all unused volumes, both named and anonymous.
Run a Docker image inside a container
docker run -it --rm image_name:tag
- `-it` is a combination of `-i` (interactive mode) and `-t` (allocate a terminal).
- `--rm` means that the container will be removed when exited.
- You may find Docker images at the Docker Hub.
- This command will use the entrypoint defined by the image. It won't necessarily open a terminal inside the container.
Run a Docker image inside a container and override the entrypoint
docker run -it --rm --entrypoint=bash image_name:version
- This will override the entrypoint of your image and open a bash terminal inside the container instead.
- Some images with specific start-up sequences may behave unexpectedly when the entrypoint is overridden.
Run a Docker image inside a container and map a port in the container to a port in the host machine
docker run -it --rm -p 9999:9696 image_name:tag
- The syntax for port mapping is `host:container` ("outside:inside").
- In this example, port 9696 within the container will be mapped to port 9999 on the host. You may access the container by accessing the host's port 9999.
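As a sketch (assuming the containerized app listens on 9696 and serves HTTP), you can verify the mapping from the host:

```sh
# Run detached with the port mapped, then hit it from the host
docker run -d --rm -p 9999:9696 image_name:tag
curl localhost:9999
```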
Create a Dockerfile with instructions to create a basic custom Docker image.
```dockerfile
# set base image
FROM python:3.9

# set the working directory in the container
WORKDIR /app

# copy dependencies to the working directory
COPY requirements.txt .

# install dependencies
RUN pip install -r requirements.txt

# copy all files within your project folder to the working directory of the image
COPY . /app

# command to run on container start
CMD ["python", "./main.py"]
```
- Docker processes each line as a layer. Layers are cached, so in order to speed up build time, copy and install the parts that rarely change (such as dependencies) first, and your own code last, as shown in this example.
- Base images are useful because they save a lot of work and build time. Choose a lean base image and avoid unnecessary packages.
- Alpine or distroless images are a good starting point.
- For Python, go with an Alpine base image. Get the name from the hub page.
- You can also use the default Python images if size isn't too important, as in the example. They are Debian-based and come with `apt-get` support.
- Each container should only have one concern. Decouple applications into multiple containers.
- You may create an additional `.dockerignore` file in which you may declare any files and folders that must be ignored when using the `COPY . /app` command. Its syntax is identical to `.gitignore` files.
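A minimal `.dockerignore` sketch (the entries are illustrative; adjust to your project):

```
# Exclude local environments, caches and VCS metadata from the build context
.git
.venv
__pycache__/
*.pyc
.env
```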
Create a slightly more complex Dockerfile with pipenv dependencies and specific entrypoints.
```dockerfile
# set base image
FROM python:3.9

# (pipenv) install pipenv
RUN pip install pipenv

# set the working directory in the container
WORKDIR /app

# (pipenv) copy dependencies to the working directory
COPY ["Pipfile", "Pipfile.lock", "./"]

# (pipenv) install dependencies
# (pipenv) we don't need a virtualenv in Docker, so we can install dependencies to the system
RUN pipenv install --system --deploy

# copy the model
COPY ["predict.py", "model.bin", "./"]

# expose a port on the container
# remember to map the port to a port in the host when running the container!
EXPOSE 9696

# specify entrypoint
ENTRYPOINT ["gunicorn", "--bind=0.0.0.0:9696", "predict:app"]
```
- The `COPY` instruction has 2 forms, both shown here. The second (JSON array) form, used for pipenv in this example, must be used if any paths contain whitespace. The last param is always the destination directory, which may be `.` or `./` for copying to the directory specified by `WORKDIR`.
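Both `COPY` forms side by side (the file names are illustrative):

```dockerfile
# Plain form: source(s) followed by destination
COPY requirements.txt .

# JSON array form: required when any path contains whitespace
COPY ["my model.bin", "predict.py", "./"]
```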
Build an image based on a Dockerfile
docker build -f Dockerfile -t my_image .
- The default Dockerfile that the command will look for is `$PATH/Dockerfile`. If your `Dockerfile` is in the same directory you run the command from and you have not named it something else, `-f Dockerfile` can be removed from the command.
- `my_image` will be the name of your image. You may optionally tag it like so: `my_image:my_tag`.
- The final `.` is the build context: the directory whose contents are sent to the Docker daemon for the build.
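A typical build-then-run sequence, as a sketch (the image name and ports are placeholders):

```sh
# Build the image from the Dockerfile in the current directory
docker build -t my_image:my_tag .

# Run it, mapping the container's port to the host
docker run -it --rm -p 9999:9696 my_image:my_tag
```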
Check how much storage space all of the images, containers and volumes are taking.
docker system df
> Shows disk usage of images, containers and volumes.
docker system info
> Displays system-wide information about the Docker installation.
Clean up unused stuff.
docker system prune
> Deletes stopped containers, dangling images, unused networks and build cache.
docker system prune -a
> Same as previous, but also deletes all images without at least one associated container.
docker system prune -af
> Same as previous, but forced (no confirmation prompt).
Example docker-compose.yaml file below. You may find a more elaborate example at https://docs.docker.com/compose/compose-application-model/
```yaml
version: "3.9"
services:
  model-server:
    image: zoomcamp-10-model:v1
  gateway:
    image: zoomcamp-10-gateway:v2
    environment:
      - TF_SERVING_HOST=model-server:8500
      - MY_CUSTOM_VAR=${MY_CUSTOM_VAR}
    ports:
      - "9999:9696"
```
- `version` used to be required by `docker-compose`, but not anymore. However, since there have been syntax changes across versions, it's convenient to declare which version the file was written for.
- The app has 2 components (services): `model-server` and `gateway`.
- Each service must have a Docker `image`.
- You may specify environment variables with `environment` and port mappings with `ports`.
- The dash (`-`) means that the entry is a list (array). In this example there are 2 lists; the `environment` list contains 2 elements and the `ports` list contains just one.
    - The array syntax requires the use of `=` for assigning values to variables, like in `TF_SERVING_HOST=model-server:8500`.
    - Entries in `ports` must always be surrounded with quotes. Entries in other lists do not require quotes.
    - Port syntax is `host:container` ("outside:inside"). In this example, port 9696 within the container will be mapped to port 9999 on the host. You may access the container by accessing the host's port 9999.
    - IMPORTANT: you may access the container from your localhost with localhost:9999, but any container instantiated by the Docker Compose file must access that container using port 9696, because all containers declared within the Docker Compose file are inside the same virtual network and thus are not aware of the host's external ports.
- Environment variables do not have to be hardcoded in the `docker-compose.yml` file. By default, you can create a `.env` file in the same folder as the `docker-compose.yml` file and define your variables there.
    - The `TF_SERVING_HOST` variable is hardcoded in the `docker-compose.yml` file.
    - The `MY_CUSTOM_VAR` variable is defined in the `.env` file. Within this file, declare one variable per line, using the `VAR_NAME=VAR_VALUE` syntax (note the lack of spaces; wrapping keys or values with quotes is optional).
- In the compose file itself, you may use either the array syntax or the map syntax for `environment`:
```yaml
# Array syntax
environment:
  - TF_SERVING_HOST=model-server:8500
  - MY_CUSTOM_VAR=${MY_CUSTOM_VAR}

# Map syntax
environment:
  TF_SERVING_HOST: model-server:8500
  MY_CUSTOM_VAR: ${MY_CUSTOM_VAR}
```
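A matching `.env` file might look like this (the value is illustrative):

```
# One variable per line, no spaces around the =
MY_CUSTOM_VAR=some_value
```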
Run the app. The command assumes that your compose file is named docker-compose.yml and is placed in the directory you're invoking the command from.
docker-compose up
Run the app in detached mode.
docker-compose up -d
Shut down the app
docker-compose down
Shut down the app and remove volumes: both named volumes declared in the compose file and anonymous volumes attached to containers.
docker-compose down -v
If you have multiple compose files with non-standard names and you want to specify which one to deploy, you may use `-f path/to/compose_file.yml` with both `up` and `down`.
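For example (the file path is a placeholder):

```sh
# Deploy a specific compose file in detached mode, then tear it down
docker-compose -f path/to/compose_file.yml up -d
docker-compose -f path/to/compose_file.yml down
```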
Create local cluster
kind create cluster
Delete local cluster
kind delete cluster
Load an image to the local cluster
kind load docker-image docker-image:tag
Create a default cluster on EKS.
eksctl create cluster
Create a cluster with a config YAML file
eksctl create cluster -f eks-config.yaml
Example eks-config.yaml
```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: mlzoomcamp-eks
  region: eu-west-1

nodeGroups:
  - name: ng-m5-xlarge
    instanceType: m5.xlarge
    desiredCapacity: 1
```
- `metadata` contains both the `name` of the cluster as well as the AWS `region`.
- `nodeGroups` contains a list of node groups. In this example the list has a single entry.
- `desiredCapacity` contains the number of nodes inside the node group.
- `instanceType` is the desired AWS EC2 instance type for the node group. All nodes will be of that instance type.
Delete a cluster
eksctl delete cluster -f eks-config.yaml
Example deployment.yaml file
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <deployment-name>
spec:
  replicas: 1
  selector:
    matchLabels:
      app: <app-name>
  template:
    metadata:
      labels:
        app: <app-name>
    spec:
      containers:
        - name: <my-container>
          image: my-component-image:some-tag
          resources:
            limits:
              memory: "128Mi"
              cpu: "100m"
          ports:
            - containerPort: 9696
          env:
            - name: TF_SERVING_HOST
              value: <service-name>.<namespace>.svc.cluster.local:8500
```
- `kind` must be `Deployment`.
- `metadata.name` contains the name of the deployment.
- `spec.replicas` states how many pods should be replicated in the deployment. This example file only states 1 replica.
- `spec.selector` defines how the deployment finds which pods to manage.
    - `spec.selector.matchLabels` is a rule that will match a label in the pod template (the label in this case is `app: <app-name>`).
- `spec.template` contains the blueprint for the pods:
    - `metadata` in this example contains the labels we use for the pods so that the deployment can find and manage them.
    - `spec.containers` contains a plethora of info:
        - `name` is the name of the containers inside the pod.
        - `image` is the Docker image to be used by the containers.
        - `resources` states the physical resource limits.
            - For CPU, `100m` means 100 milliCPUs, or 10% of the computing time of a single CPU core.
        - `ports` contains the ports to be used by the containers.
        - `env` contains names and values for environment variables, useful for apps to be able to find other containers by their internal cluster URL.
            - When defining a service, Kubernetes publishes a DNS entry inside the cluster so that pods can find other pods. These DNS entries follow the `<service-name>.<namespace>.svc.cluster.local:<port>` format.
            - The default namespace is `default`.
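A hedged sketch of applying and inspecting the deployment with `kubectl` (the file name is a placeholder):

```sh
# Apply the manifest, then check the deployment and its pods
kubectl apply -f deployment.yaml
kubectl get deployments
kubectl get pods
```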
Example service.yaml file.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: <service-name>
spec:
  type: LoadBalancer
  selector:
    app: <app-name>
  ports:
    - port: 80
      targetPort: 9696
```
- `kind` must be `Service`.
- `metadata.name` contains the name of the service.
- `spec.type` specifies the type of service.
    - Internal services are of type `ClusterIP`. This is the default service type if this field is not stated in the file.
    - External services are of type `LoadBalancer` and are assigned an external IP.
- `spec.selector` contains the label used to find the deployment to which the service belongs.
- `spec.ports` contains both the port of the service (`port`) as well as the port of the deployment's pods (`targetPort`).
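A sketch of applying the service and finding its external address (the file name is a placeholder):

```sh
# Apply the service, then check the external IP assigned by the load balancer
kubectl apply -f service.yaml
kubectl get services
```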
Tool to run Ubuntu VMs easily from the command line.
List available instances
multipass list
Create and launch a new instance using the latest LTS release
multipass launch --name my_instance
Access the instance shell
multipass shell my_instance
Mount a shared folder in the instance
multipass mount path/to/local/folder my_instance:path/to/instance/folder
Unmount all mounted folders of instance
multipass umount my_instance
Stop an instance
multipass stop my_instance
Start a previously created instance
multipass start my_instance
Get info on a specific instance
multipass info my_instance
Delete an instance (send it to the recycle bin)
multipass delete my_instance
Recover a deleted instance
multipass recover my_instance
Permanently delete all deleted instances
multipass purge