05 - Docker Data Management
Containers are ephemeral: anything written to a container's writable layer is lost when the container is removed. Docker offers several ways to persist and share data beyond that lifecycle.
Why data management matters
- Containers are disposable; without external storage, their data is lost on removal.
- Multiple containers often need to share data.
- Some data (e.g., database files) must outlive the container that created it.
- Performance characteristics differ by storage type.
Options
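- Volumes: storage created and managed by Docker, with a lifecycle independent of any container.
- Bind mounts: host files or directories mounted into a container.
- tmpfs: temporary data held in host memory and never written to disk by Docker (Linux hosts only).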
Volumes (recommended)
Benefits
- Easier to back up/migrate than bind mounts.
- Manageable via Docker CLI/API.
- Safe sharing across containers.
- Drivers can store data remotely.
- A new, empty volume is pre-populated with whatever the image already contains at the mount path.
Create/manage volumes
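docker volume create my-vol     # create a named volume
docker volume ls                # list volumes
docker volume inspect my-vol    # show driver, mountpoint, and labels
docker volume rm my-vol         # remove a volume (fails while a container uses it)
docker volume prune             # remove unused volumes (asks for confirmation)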
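Before pruning, it can help to see which volumes are currently unused. The sketch below wraps `docker volume ls` with the `dangling=true` filter; the function name `preview_prune` is hypothetical, not a Docker command. On recent Docker releases `docker volume prune` may remove only anonymous volumes unless `--all` is given, so this list can be longer than what prune actually deletes.

```shell
# preview_prune: print the volumes that no container references.
# (hypothetical helper name; only `docker volume ls` is a real command here)
preview_prune() {
    unused=$(docker volume ls --quiet --filter dangling=true)
    if [ -z "$unused" ]; then
        echo "no unused volumes"
    else
        echo "unused volumes:"
        echo "$unused"
    fi
}

# Usage: preview_prune && docker volume prune
```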
Use a volume
docker run -d \
--name devtest \
-v my-vol:/app \
nginx:latest
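The same mount can also be written with `--mount`, which spells out each field as a key=value pair. A minimal sketch, wrapped in a hypothetical `run_with_volume` function so the container name, volume, target path, and image become parameters:

```shell
# run_with_volume NAME VOLUME TARGET IMAGE
# Equivalent to: docker run -d --name NAME -v VOLUME:TARGET IMAGE
# (hypothetical helper name)
run_with_volume() {
    docker run -d \
        --name "$1" \
        --mount "type=volume,source=$2,target=$3" \
        "$4"
}

# Usage: run_with_volume devtest my-vol /app nginx:latest
```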
Bind mounts
Bind mounts map a host path into a container. They suit development because edits made on the host appear in the container immediately.
Pros
- Fast iteration in dev.
- Share config between host/container.
- Good I/O performance on Linux hosts (Docker Desktop on macOS/Windows adds file-sharing overhead).
Example
docker run -d \
--name devtest \
-v "$(pwd)":/app \
nginx:latest
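Note: bind mounts depend on the host's directory layout, so they are less portable than volumes. With -v, Docker creates the host path as a directory if it does not exist.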
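When the container only needs to read the host files (configuration, static content), the mount can be made read-only. A sketch using `--mount` with the `readonly` option; `run_ro_bind` and the container name `devtest-ro` are illustrative, and the explicit directory check is there to give a clear error up front.

```shell
# run_ro_bind HOST_DIR: mount HOST_DIR read-only at /app.
# (hypothetical helper; HOST_DIR must be an absolute path)
run_ro_bind() {
    if [ ! -d "$1" ]; then
        echo "host directory not found: $1" >&2
        return 1
    fi
    docker run -d \
        --name devtest-ro \
        --mount "type=bind,source=$1,target=/app,readonly" \
        nginx:latest
}

# Usage: run_ro_bind "$(pwd)"
```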
tmpfs
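tmpfs mounts keep data in host memory and exist only while the container runs; the contents are discarded when the container stops. They are available only on Linux hosts and cannot be shared between containers.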
Pros
- Good for secrets (not persisted to disk).
- Fits non-persistent temp data.
- Very fast (in RAM).
Example
docker run -d \
--name tmptest \
--tmpfs /app \
nginx:latest
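Because a tmpfs mount consumes host memory, it is often worth capping its size. The sketch below uses `--mount` with the `tmpfs-size` (bytes) and `tmpfs-mode` (octal permissions) options; `run_tmpfs` and the container name `tmptest-capped` are illustrative.

```shell
# run_tmpfs SIZE_MIB: start nginx with a size-capped tmpfs at /app.
# (hypothetical helper name)
run_tmpfs() {
    size_bytes=$(( $1 * 1024 * 1024 ))   # tmpfs-size is given in bytes
    docker run -d \
        --name tmptest-capped \
        --mount "type=tmpfs,destination=/app,tmpfs-size=$size_bytes,tmpfs-mode=1770" \
        nginx:latest
}

# Usage: run_tmpfs 64    # caps the mount at 64 MiB
```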
Data volume containers
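A data volume container is a dedicated container whose only purpose is to hold volumes that other containers mount with --volumes-from. This is a legacy pattern; named volumes cover the same use case more simply.
# dbstore uses a small image so that only /dbdata is shared (a postgres-based
# dbstore would also share its data directory); "secret" is a placeholder.
docker create -v /dbdata --name dbstore alpine /bin/true
docker run -d --volumes-from dbstore --name db1 -e POSTGRES_PASSWORD=secret postgres:latest
docker run -d --volumes-from dbstore --name db2 -e POSTGRES_PASSWORD=secret postgres:latest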
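The same kind of sharing can be sketched with a named volume instead of a holder container. In this example one short-lived container writes a file and a second one reads it back through a read-only mount; `share_demo` and the volume name `shared-data` are illustrative.

```shell
# share_demo: one container writes to a named volume, another reads it.
# (hypothetical helper; "shared-data" is an illustrative volume name)
share_demo() {
    docker volume create shared-data >/dev/null
    docker run --rm -v shared-data:/data alpine \
        sh -c 'echo hello > /data/msg'
    docker run --rm -v shared-data:/data:ro alpine \
        cat /data/msg
}

# Usage: share_demo
```

The `:ro` suffix on the second mount keeps the reader from modifying the shared data.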
Backup and restore
Backup
docker run --rm \
-v my-vol:/source \
-v "$(pwd)":/backup \
alpine \
tar -czf /backup/my-vol-backup.tar.gz -C /source .
Restore
docker run --rm \
-v my-vol:/target \
-v "$(pwd)":/backup \
alpine \
tar -xzf /backup/my-vol-backup.tar.gz -C /target
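For regular backups, a timestamped archive name avoids overwriting earlier copies, and listing the archive afterwards is a cheap readability check. A sketch under those assumptions; `backup_volume` is a hypothetical helper, and the destination must be an absolute host path.

```shell
# backup_volume VOLUME DEST_DIR
# Writes DEST_DIR/VOLUME-<timestamp>.tar.gz and prints its path.
# (hypothetical helper; DEST_DIR must be an absolute host path)
backup_volume() {
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$1-$stamp.tar.gz"
    # Mount the volume read-only so the backup cannot alter it.
    docker run --rm \
        -v "$1":/source:ro \
        -v "$2":/backup \
        alpine \
        tar -czf "/backup/$archive" -C /source . || return 1
    # List the archive to confirm it can be read back.
    docker run --rm \
        -v "$2":/backup:ro \
        alpine \
        tar -tzf "/backup/$archive" >/dev/null || return 1
    echo "$2/$archive"
}

# Usage: backup_volume my-vol "$(pwd)"
```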
Best practices
- Use volumes for persistent data (e.g., databases).
- Use bind mounts in dev for fast code iteration.
- Use tmpfs for sensitive or non-persistent data.
- Mind permissions when mounting: the UID/GID of the container process must have access to the mounted files.
- Back up important data regularly and test restores.
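The permissions point can be illustrated with a bind mount: files created by a container running as root end up root-owned on the host. One common approach, sketched here, is to run the container with the host user's UID and GID via `--user`; `run_as_host_user` and the file name are illustrative.

```shell
# run_as_host_user: create a file in the current directory from inside
# a container, owned by the invoking host user rather than root.
# (hypothetical helper name)
run_as_host_user() {
    docker run --rm \
        --user "$(id -u):$(id -g)" \
        -v "$(pwd)":/work \
        -w /work \
        alpine \
        touch created-by-container
}

# Usage: run_as_host_user && ls -l created-by-container
```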
Practical examples
MySQL with persistent data
docker volume create mysql-data
docker run -d \
--name mysql-db \
-e MYSQL_ROOT_PASSWORD=secret \
-v mysql-data:/var/lib/mysql \
mysql:5.7
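To see the persistence in action, the container can be removed and recreated against the same volume; existing databases should still be there. A sketch with a hypothetical `recreate_mysql` helper. As far as the official image's documentation describes, `MYSQL_ROOT_PASSWORD` is only applied when the data directory is initialized, so on the second start the original password remains in effect.

```shell
# recreate_mysql: drop the container, keep the volume, start a new one.
# (hypothetical helper name)
recreate_mysql() {
    # Removes only the container; the mysql-data volume is untouched.
    docker rm -f mysql-db
    docker run -d \
        --name mysql-db \
        -e MYSQL_ROOT_PASSWORD=secret \
        -v mysql-data:/var/lib/mysql \
        mysql:5.7
}

# Usage: recreate_mysql
```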
Node.js dev with bind mount
docker run -d \
--name node-app \
-p 3000:3000 \
-v "$(pwd)":/app \
-w /app \
node:14 \
npm start
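A common variation keeps `node_modules` in an anonymous volume, so dependencies installed inside the container do not mix with the host's copy. This is a sketch, assuming the current directory holds a `package.json` with a `start` script; `run_node_dev` and the container name `node-app-isolated` are illustrative.

```shell
# run_node_dev: bind-mount the source, but keep node_modules in an
# anonymous volume that shadows /app/node_modules from the host.
# (hypothetical helper name)
run_node_dev() {
    docker run -d \
        --name node-app-isolated \
        -p 3000:3000 \
        -v "$(pwd)":/app \
        -v /app/node_modules \
        -w /app \
        node:14 \
        sh -c 'npm install && npm start'
}

# Usage: run_node_dev
```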
Summary
Volumes, bind mounts, and tmpfs mounts each serve different needs. Pick volumes for persistence, bind mounts for development convenience, and tmpfs for in-memory or sensitive data. Next: Docker networking, which covers how containers talk to each other and the outside world.