@onlyphantom
Last active September 11, 2024 20:59
Demystifying Docker Volumes for Mac and PC Users

  1. Docker runs on a Linux kernel

Docker can be confusing to Mac and Windows users because many tutorials on the topic assume you're using a Linux machine.

As a Linux user, you learn that volumes are stored in a part of the host filesystem managed by Docker, namely /var/lib/docker/volumes. When you're running Docker on a Windows or macOS machine, you will read the same documentation and instructions but feel frustrated because that path doesn't exist on your system. This simple note is my answer to that.

When you use Docker on a Windows PC, you're typically doing one of these two things:

  • Run Linux containers in a full Linux VM (what Docker typically does today)
  • Run Linux containers with Hyper-V isolation

With the first option, when you bind mount volumes using docker run -v, the files are stored on the Windows NTFS filesystem, and there are known incompatibilities with many popular services. This is what the Microsoft documentation says:

These applications all require volume mapping and will not start or run correctly.

  • MySQL
  • PostgreSQL
  • WordPress
  • Jenkins
  • MariaDB
  • RabbitMQ

When using Docker for Mac, you're actually running an instance of Alpine Linux through a lightweight virtualization layer. The hypervisor provides filesystem and network sharing that is more "Mac-native".

Why does this matter? Because the majority of online tutorials do not take these differences into account when showing code snippets and explanations on Docker volumes.
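One way to see this for yourself is to ask the Docker daemon which operating system and kernel it runs on. The sketch below wraps docker info in a small helper; the function name daemon_platform is made up for this note. On Docker Desktop for Mac or Windows it should report a Linux OS type even though your own machine is not running Linux.

```shell
# Hypothetical helper: print the OS type, OS name, and kernel version
# reported by the Docker daemon (not by the machine you are typing on).
daemon_platform() {
  docker info --format '{{ .OSType }} | {{ .OperatingSystem }} | {{ .KernelVersion }}'
}
```

Run daemon_platform in a terminal while Docker is running and compare the result with uname -r on your host.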

  2. Docker Volumes and Storage

Following the official documentation, you will learn that Docker stores data on the local filesystem by creating this directory structure under /var/lib/docker. This is where Docker stores all its data (files related to images and containers running on the host):

/var/lib/docker
  /aufs
  /containers
  /image
  /volumes

Any volumes created are stored under volumes:

# the following command:
docker volume create data_volume
# creates the following directory
/var/lib/docker
  /volumes
    /data_volume
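Rather than hard-coding the path, you can ask Docker where a volume keeps its files. A minimal sketch, with volume_mountpoint being a helper name invented for this note:

```shell
# Hypothetical helper: print the path (on the Docker host, or inside the
# Docker VM on Mac and Windows) where the named volume keeps its files.
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

With the default local driver, volume_mountpoint data_volume typically prints a path ending in /volumes/data_volume/_data. Keep in mind that this path belongs to the machine the Docker daemon runs on, which is not necessarily the machine you are typing on.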

Now this is where the confusion begins. Non-Linux users try to cd to the path given in the documentation or tutorial and cannot find it, resulting in threads like Link 1 and Link 2.

As it turns out, you will need to get into the Docker VM on your machine. I'll provide the example for a Mac user.

Suppose I run the mysql service and then inspect my volumes. I will find the following:

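# the official mysql image needs a root password to initialize; my-secret-pw is a placeholder
docker run -d -e MYSQL_ROOT_PASSWORD=my-secret-pw -v data_volume:/var/lib/mysql mysql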
docker volume ls
DRIVER              VOLUME NAME
local               2dc8364
local               7909f81
...                 ...
local               data_volume

Now open a second terminal and connect to the Docker VM's tty using this (Mac):

screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty

tty is short for teletype, known today as the terminal. The screen command lets you use multiple terminal sessions from a single console. When a session is detached, its process keeps running and you can reattach to the session later. In the command above, we use screen to connect to the Docker VM's terminal (tty).

You will now be in the Docker VM:

cd /var/lib/docker/volumes
pwd
# returns: /var/lib/docker/volumes
uname -r
# returns: 4.9.184-linuxkit
ls
# returns:
# 2dc8364
# 7909f81
# ...
# data_volume

Pro-tip: Do not connect to the tty again from another terminal tab while a screen session is still attached, as you will just see garbled text. Detach from the screen session using Ctrl-a + d; this keeps the session active so you can reattach to it later using screen -r. Use screen -ls to list your screen sessions. To kill the session and exit, use Ctrl-a + k.
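If connecting to the VM's tty is not an option on your setup, a throwaway container that mounts the same volume is another way to look inside it. A sketch, assuming the alpine image is acceptable as a scratch environment; the helper name list_volume and the mount point /inspect are chosen purely for this example:

```shell
# Hypothetical helper: list the contents of a named volume by mounting it
# read-only (:ro) into a short-lived Alpine container that is removed
# again as soon as ls finishes (--rm).
list_volume() {
  docker run --rm -v "$1":/inspect:ro alpine ls -la /inspect
}
```

Calling list_volume data_volume should show the files that mysql wrote to the volume, without you having to enter the Docker VM at all.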

  3. Default location varies by service

This is not specific to Mac or Windows users, but knowing where each of your services stores its data by default is important for configuring your volume mounts.

For example, mysql by default stores its data in /var/lib/mysql, and if we wish to mount the data_volume volume we created earlier at that location, we could do the following:

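# the official mysql image needs a root password to initialize; my-secret-pw is a placeholder
docker run -d -e MYSQL_ROOT_PASSWORD=my-secret-pw -v data_volume:/var/lib/mysql mysql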

Now all data created by the mysql service is written to data_volume on the Docker host, so the data persists in that volume even after the container is destroyed. To inspect data_volume, follow the instructions in the Docker Volumes and Storage section above. Other services have different defaults, so read their documentation thoroughly. Postgres, for example, stores its database files in /var/lib/postgresql/data, so your docker-compose.yaml file would have the following configuration instead (the ./ prefix makes this particular example a bind mount to a host directory rather than a named volume):

volumes:
  - ./postgres-data:/var/lib/postgresql/data

  4. Default to volume mounting, not bind mounts

You can create the volume explicitly using docker volume create db_vol or implicitly:

  • When you provide a docker-compose file with a volume mount configuration
  • When you docker run -v db_vol:/var/lib/mysql mysql and db_vol does not yet exist (that is, it was not created explicitly with docker volume create)

The above options (both explicit and implicit) create a directory for the volume under /var/lib/docker/volumes:

/var/lib/docker
  /volumes
    /db_vol
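In a script, the choice between the explicit and implicit paths can be spelled out. The sketch below creates a volume only when docker volume inspect cannot find it; ensure_volume is a name made up for this note.

```shell
# Hypothetical helper: create the named volume only if it does not exist yet.
# docker volume inspect exits with a non-zero status for an unknown volume.
ensure_volume() {
  if docker volume inspect "$1" >/dev/null 2>&1; then
    echo "volume $1 already exists"
  else
    docker volume create "$1" >/dev/null && echo "volume $1 created"
  fi
}
```

Running ensure_volume db_vol twice should report that the volume was created the first time and that it already exists the second time.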

This is a different concept from bind mounting. Consider the case where you already have your data persisted in some other storage location on the Docker host (in this example: /data/) that is not in the default /var/lib/docker directory. You can provide the full path when doing the mount. The code snippet below shows the difference between the two types of mount:

# volume mounts (default to /var/lib/docker on Docker host)
docker run -v db_vol:/var/lib/mysql mysql
# bind mounts
docker run -v /data/mysql:/var/lib/mysql mysql

  • Volume mount: Mounts the volume from the volumes directory. Best way to persist data.
  • Bind mount: Mounts a directory from any location on the Docker host, even important system files or directories
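The only visible difference between the two commands is the part before the first colon. As a rough rule of thumb (an approximation for illustration, not a copy of Docker's own parsing), a source that looks like a path is treated as a bind mount and anything else as a volume name. The helper mount_kind below is hypothetical and ignores Windows drive-letter paths:

```shell
# Hypothetical helper: guess whether a -v argument is a bind mount or a
# named volume by looking at its source (the text before the first colon).
mount_kind() {
  src=${1%%:*}
  case "$src" in
    /*|./*|../*) echo "bind mount" ;;
    *)           echo "named volume" ;;
  esac
}

mount_kind db_vol:/var/lib/mysql        # prints: named volume
mount_kind /data/mysql:/var/lib/mysql   # prints: bind mount
```

The same check applied to ./postgres-data:/var/lib/postgresql/data reports a bind mount, which is worth keeping in mind when copying compose snippets.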

When you create a volume, it is stored within a directory on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine, whereas a bind mount can expose a directory from any location on the host.

A volume can be mounted into multiple containers simultaneously. When no container is using it, it can be removed with docker volume prune.
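Before pruning, it can help to see which volumes Docker considers unused. A sketch using the dangling filter of docker volume ls; the helper name unused_volumes is invented here:

```shell
# Hypothetical helper: print the names of volumes that are not referenced
# by any container (running or stopped).
unused_volumes() {
  docker volume ls --quiet --filter dangling=true
}
```

Newer Docker releases reportedly prune only anonymous volumes by default, so if a named volume such as db_vol survives a prune, removing it by name with docker volume rm db_vol is the direct route.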

When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its full path on the host machine, so bind mounts rely on the host machine's filesystem having a specific directory structure available. Bind mounts also allow access to sensitive files on the host filesystem: a container can create, modify, or delete important system files or directories, which affects non-Docker processes on the host system.

@Arun-Karunakaran

Arun-Karunakaran commented Mar 9, 2022

Thanks for such detailed info on verifying volumes on Docker for Mac. This approach did not serve my needs, as I could not locate anything like tty under ~/Library/Containers/com.docker.docker/Data/vms/0/ on my Mac. And when I run the command screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty, I get the error command not found. This problem was a nightmare for me. I had to run debian with nsenter and sh into it to locate the Docker shared volume files on Mac.
We need to use the command:
docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
and then navigate to /var/lib/docker/volumes to find our results.
Thanks,
Arun K

@onlyphantom
Author

Thanks for adding to the conversation with that info Arun! I'm on my Linux box a lot more than I'm on my Mac and so I appreciate the additional clarification on what works for you.

@juangea

juangea commented Apr 6, 2022

I have a question, what happens if I want to use a hard drive I have with 4Tb to store the data of a docker image, is there a way to do so?

Thanks!

@Arun-Karunakaran

> I have a question, what happens if I want to use a hard drive I have with 4Tb to store the data of a docker image, is there a way to do so?
>
> Thanks!

Yes, you can do it. The best way would be to use a Docker host on a separate machine, where you can easily scale your disk space, and use your build machine with the Docker client installed to connect to your remote Docker host.

@onlyphantom
Author

onlyphantom commented Apr 7, 2022

Yes @juangea, the steps are pretty much the same as the ones in the guide.

But as @Arun-Karunakaran mentioned, you may want to configure your hardware separately for easier hardware scaling.

@jkulak

jkulak commented Apr 14, 2022

Thank you!

Also, in current Docker Desktop version 4.7.0 there is a tab "Volumes" where you can browse data per container and "Save As..." and delete when needed.

@gagamil

gagamil commented Mar 4, 2023

Simply thank you!
Clear and comprehensive.
