Introduction to Containers and CI/CD
Week 5 Assignment: Containerize and Ship a Data Pipeline
Containers and CI bring reliability, but a few mistakes can make your pipeline unstable. Read these before you debug for hours.
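latest for every image
The Misconception: latest always points to the newest image.
The Reality: latest is just a tag, and the default one when you give none. It points to whichever image was last built or pushed under that name, which is not necessarily the newest.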
The Fix: Use explicit version tags, and use the commit SHA in CI.
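A minimal sketch of tagging by commit SHA, assuming a git checkout and a hypothetical image name my-pipeline. GITHUB_SHA is the variable GitHub Actions sets; other CI systems use different names. The docker calls are wrapped in a function, so nothing is built until you call it.

```shell
# Hypothetical image name; replace with your own registry/repository.
IMAGE="my-pipeline"

# Use the CI-provided commit SHA if present (GitHub Actions sets GITHUB_SHA),
# otherwise ask git, and fall back to "dev" outside a repository.
FULL_SHA="${GITHUB_SHA:-$(git rev-parse --verify --quiet HEAD 2>/dev/null || echo dev)}"
SHORT_SHA="$(printf '%s' "$FULL_SHA" | cut -c1-7)"
TAG="${IMAGE}:${SHORT_SHA}"
echo "Image tag: ${TAG}"

# Build and push under the commit-specific tag instead of latest.
build_and_push() {
  docker build -t "$1" . && docker push "$1"
}
```

Calling build_and_push "$TAG" then produces an image you can trace back to one exact commit.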
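.dockerignore
The Misconception: Docker builds are fast even if you copy the entire repo.
The Reality: Without a .dockerignore, large folders like .venv/ or .git/ are sent to the build context, which slows builds, and files like .env can leak secrets into the image.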
The Fix: Add a .dockerignore with virtual envs, caches, and .env.
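As a starting point, the snippet below writes a small .dockerignore. The entries are common Python-project examples rather than a complete list, and the command overwrites any existing .dockerignore in the current directory.

```shell
# Write a starter .dockerignore next to the Dockerfile.
# Note: this replaces an existing .dockerignore in the current directory.
cat > .dockerignore <<'EOF'
# Version control and local virtual environments
.git
.venv
venv

# Caches and compiled files, at any depth
**/__pycache__
**/*.pyc
.pytest_cache

# Local secrets: never send these to the build context
.env
EOF
```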
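The Misconception: Docker caching works no matter what order you use.
The Reality: If you copy all files first, every small code change invalidates the cache for that layer and every layer after it, including the dependency install.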
The Fix: Copy requirements.txt first, install, then copy source code.
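The ordering can be sketched as follows. The base image python:3.11-slim and the entry point main.py are assumptions for illustration; the file is written as Dockerfile.example so it does not overwrite a real Dockerfile.

```shell
# Write an example Dockerfile that installs dependencies before copying code.
cat > Dockerfile.example <<'EOF'
# Assumed base image and entry point; adjust both to your project.
FROM python:3.11-slim
WORKDIR /app

# 1. Dependencies change rarely: copy only the manifest, then install.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 2. Source changes often: copy it last so only this layer is rebuilt.
COPY . .

CMD ["python", "main.py"]
EOF
```

You can try it with docker build -f Dockerfile.example . and then change one source file: the second build should reuse the cached pip install layer.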
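The Misconception: Setting ENV API_KEY=... in the Dockerfile is safe.
The Reality: That value becomes part of the image metadata and history, so anyone who can pull the image can read it with docker inspect or docker history.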
The Fix: Pass secrets at runtime or use cloud secret managers.
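One way to pass a secret at runtime is sketched below, assuming the app reads an environment variable named API_KEY. The helper name run_with_secret is hypothetical.

```shell
# Pass the secret at runtime; the image itself never contains it.
# "-e API_KEY" with no value forwards API_KEY from the calling environment,
# so the variable must be exported in your shell first.
run_with_secret() {
  image="$1"
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set in this shell" >&2
    return 1
  fi
  docker run --rm -e API_KEY "$image"
}
```

Usage would look like export API_KEY, with the value typed or loaded from a secret store, followed by run_with_secret with your image tag. docker run --env-file .env is an alternative when the values live in a local file that is kept out of git and out of the image.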
localhost inside containers
The Misconception: Your app can connect to localhost inside a container.
The Reality: Inside a container, localhost points to the container itself, not your host machine.
The Fix: Use the external hostname or IP of the service you are connecting to. In Azure, use the connection string provided by the portal.
The Misconception: CI will use the same Python version as your laptop.
The Reality: CI uses whatever you specify in the workflow, or a default you did not expect.
The Fix: Pin the Python version in your workflow.
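A sketch of a pinned workflow, assuming GitHub Actions as the CI system and 3.11 as the version you develop with. The snippet writes the file with a heredoc and overwrites any existing .github/workflows/ci.yml.

```shell
# Create a minimal GitHub Actions workflow with a pinned Python version.
mkdir -p .github/workflows
cat > .github/workflows/ci.yml <<'EOF'
name: ci
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"   # quoted so YAML does not read 3.10 as 3.1
      - run: pip install -r requirements.txt
      - run: python -m pytest
EOF
```

The last two steps assume that pytest is listed in requirements.txt.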
The Misconception: Local tests guarantee CI success.
The Reality: CI is a clean environment without cached data, locally installed packages, or uncommitted files.
The Fix: Run tests in a fresh virtual environment before pushing.
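The helper below sketches a fresh-environment test run. It assumes python3 is on your PATH and that pytest is listed in requirements.txt; the function name fresh_venv_test is hypothetical.

```shell
# Run the test suite in a throwaway virtual environment.
fresh_venv_test() {
  venv_dir="$(mktemp -d)" || return 1
  (
    python3 -m venv "$venv_dir" &&
    "$venv_dir/bin/pip" install --quiet -r requirements.txt &&
    "$venv_dir/bin/python" -m pytest
  )
  status=$?
  # Clean up whether the tests passed or failed.
  rm -rf "$venv_dir"
  return "$status"
}
```

Run fresh_venv_test from the project root. It only catches missing dependencies; checking git status before pushing is still the simplest way to spot files that CI will not see.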
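The Misconception: All files are available inside the container.
The Reality: Only files that come with the base image, are copied in by the Dockerfile, or are mounted at runtime exist inside the container.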
The Fix: Ensure your Dockerfile copies any data files the app needs.
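The Misconception: If the image builds, it is ready to deploy.
The Reality: A successful build only proves that the image assembles. A missing environment variable, a wrong start command, or a failing import shows up only when the container runs.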
The Fix: Add a smoke test step in CI that runs the container briefly.
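A smoke test for a long-running service could look like the sketch below: start the container, wait a few seconds, and fail if it has already exited. The function name and the 5-second default are assumptions. For a batch job that is meant to exit, checking the exit code of docker run --rm is the simpler test.

```shell
# Start the image, wait briefly, and fail if the container has already died.
smoke_test() {
  image="$1"
  wait_seconds="${2:-5}"

  # Start detached and remember the container ID.
  cid="$(docker run -d "$image")" || return 1
  sleep "$wait_seconds"

  # A service that crashed on startup is no longer running by now.
  running="$(docker inspect -f '{{.State.Running}}' "$cid")"
  if [ "$running" != "true" ]; then
    echo "Container exited early; logs follow:" >&2
    docker logs "$cid" >&2
  fi

  # Always remove the container, then report the result.
  docker rm -f "$cid" >/dev/null
  [ "$running" = "true" ]
}
```

In CI this would run right after the build step, for example smoke_test followed by the tag you just built.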
"Image" and "container" are the same thing.
The Reality: An image is a template. A container is a running instance. You can start many containers from one image. If you edit files inside a running container, those changes are not saved to the image.
The Fix: Think image = blueprint, container = running process. Rebuild the image when you change code.
The Misconception: docker run my-api will make the API reachable on your host.
The Reality: By default, container ports are not published to your host. For a web app, you must map ports with -p. Also, if the process exits immediately, the container stops, and in detached mode you will not see the error unless you check the logs.
The Fix: Use -p 8000:8000 (or your app's port) for web services. Run without -d first to see logs; switch to -d once it works.
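The two modes can be sketched as a pair of helpers. The image name my-api and port 8000 come from the text above; the function names are hypothetical.

```shell
# Image name and port from the example; use the port your app listens on.
IMAGE="my-api"
PORT=8000

# Foreground: logs print to the terminal, Ctrl+C stops the container.
run_foreground() {
  docker run --rm -p "${PORT}:${PORT}" "$IMAGE"
}

# Detached: same port mapping, container keeps running in the background.
run_detached() {
  docker run -d -p "${PORT}:${PORT}" "$IMAGE"
}
```

If the port is mapped but requests still fail, check that the app inside the container listens on 0.0.0.0 rather than 127.0.0.1.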
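The Misconception: docker build always picks up your latest code changes.
The Reality: Docker reuses cached layers. COPY and ADD layers are rebuilt when the copied files change, but a RUN layer is reused as long as its command text and the layers before it are unchanged, so anything it downloads (unpinned packages, a cloned repository) can be stale. Changes also go missing when the file is excluded by .dockerignore or when you run a container from an older image tag.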
The Fix: Use docker build --no-cache when debugging strange behavior. For production, keep the Dockerfile ordered so code changes invalidate only the final layers.
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 (https://hackyourfuture.net/)
