Week 5 - Containers & CI/CD

Dependency Management

Your pipeline is only as reliable as its dependencies. A pipeline that runs today can break tomorrow if a library updates or a teammate installs a different version. Dependency management solves that by making your Python environment reproducible.

This chapter compares requirements.txt and uv, then shows how to manage dependencies with either option. In Week 1 you learned the basics of virtual environments, package installs, and lock files. See: Python Setup

By the end of this chapter, you should be able to choose between requirements.txt and uv, generate a lock file, and explain why uv sync --frozen is required in Docker and CI.

Concepts

requirements.txt vs uv

In Week 1 you saw both approaches at a high level. Here we go deeper because this choice also matters for Docker and CI/CD. Both approaches are valid, but they solve reproducibility at different levels.

A package manager decides how your project installs external libraries. In this chapter, the practical choice is between the classic pip plus requirements.txt workflow and the modern uv workflow with pyproject.toml and uv.lock.

For this track, uv is the recommended route because uv.lock pins the full dependency tree, including upstream dependencies that your direct packages pull in.

<aside> 💡 Pick one workflow and use it consistently across local dev, CI, and Docker. If you are starting fresh, prefer uv.

</aside>

The same problem exists in every ecosystem, and the solution always looks similar.

<aside> 📘 Core program connection: In the Core program with JavaScript you used npm install, package.json, and package-lock.json to manage project dependencies. In the Data Track you solve the same problem in Python with requirements.txt or with pyproject.toml plus uv.lock. The goal is the same in both tracks: declare what your project needs, lock exact versions, and make installs reproducible across machines and CI. Refresh the Core program chapter here: Core Program - Package Managers

</aside>

Option A: requirements.txt (classic workflow)

This is the simplest and most portable approach. You list your direct packages and pin versions.

pandas==2.2.1
requests==2.31.0
pydantic==2.6.1

Then install with pip inside a virtual environment:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

<aside> 💡 On Windows, replace source venv/bin/activate with venv\Scripts\activate.

</aside>

This works well, but requirements.txt on its own does not lock the full dependency tree the way uv.lock does.

Pinning your direct dependencies is a good start, but it does not pin their dependencies. You might list requests==2.31.0, but requests depends on urllib3. If urllib3 releases a breaking change, pip can pull in a newer version the next time someone runs pip install, even though your requirements.txt did not change.

# requirements.txt contains:
# requests==2.31.0
#
# urllib3 is not listed, so pip resolves a version at install time

import requests

response = requests.get("https://api.example.com/data")

Two teammates running pip install -r requirements.txt a week apart can end up with different environments. A broken CI run with no code change is often the first sign that this is happening.

<aside> ⚠️ Pinning top-level packages controls what you install directly. It does not fully control what those packages install underneath.

</aside>
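One way to make this drift visible is to compare the output of pip freeze from two machines. The sketch below is a minimal illustration, not part of the course tooling: the function names and the two sample outputs are hypothetical, and the version numbers are only illustrative.

```python
def parse_freeze(text: str) -> dict[str, str]:
    """Map package name -> version from `pip freeze`-style lines (name==version)."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.strip().lower()] = version.strip()
    return versions


def diff_environments(a: str, b: str) -> dict[str, tuple]:
    """Return packages whose versions differ between two freeze outputs."""
    env_a, env_b = parse_freeze(a), parse_freeze(b)
    return {
        name: (env_a.get(name), env_b.get(name))
        for name in sorted(env_a.keys() | env_b.keys())
        if env_a.get(name) != env_b.get(name)
    }


# Hypothetical freeze outputs from two teammates who installed a week apart.
alice = "requests==2.31.0\nurllib3==2.0.7\n"
bob = "requests==2.31.0\nurllib3==2.2.1\n"

print(diff_environments(alice, bob))
# {'urllib3': ('2.0.7', '2.2.1')}
```

Both teammates have the same requests version, yet the environments differ because urllib3 was resolved at install time.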

The format is simple: each line is a package name with an optional version specifier, a comment starting with #, or a blank line. Tooling that processes requirements.txt files needs to handle all three cases.
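As a starting point, the three cases can be told apart with a few string checks. This is a simplified sketch with a hypothetical helper name, classify_line. It assumes each line is one of the three cases described above and ignores other things pip accepts in these files, such as option lines or trailing inline comments.

```python
def classify_line(line: str) -> str:
    """Classify one requirements.txt line as 'blank', 'comment', or 'requirement'."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("#"):
        return "comment"
    return "requirement"


sample = ["pandas==2.2.1", "", "# HTTP client", "requests==2.31.0"]
for line in sample:
    print(f"{classify_line(line):12} {line!r}")
```

Stripping whitespace first means an indented comment or a line of spaces is still classified correctly.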

<aside> ⌨️ Hands on: Write a function parse_requirements(lines) that takes a list of strings from a requirements.txt and returns only the package names, ignoring blank lines, comments, and version specifiers.

</aside>

<aside> 🚀 Try it in the widget: Parse Requirements exercise

</aside>

Option B: uv with pyproject.toml and uv.lock

uv uses pyproject.toml as the source of truth and writes exact versions, including transitive dependencies, to uv.lock.

[project]
name = "weather-pipeline"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
  "pandas==2.2.1",
  "requests==2.31.0",
]

# Install dependencies pinned in uv.lock
uv sync

To run a command inside that managed environment, prefix it with uv run (for example, uv run pytest), so you do not have to activate the virtual environment by hand.

<aside> 💡 Commit uv.lock. It is the record of the exact versions your CI and teammates should use.

</aside>

This is the main reason uv is recommended in this track: you get a faster workflow and stronger guarantees that CI, your laptop, and production install the same dependency graph.

⌨️ Hands on: generate a lock file and freeze it

The theory only sticks if you produce a lock file and watch what happens when it drifts from pyproject.toml. In a fresh directory, scaffold a project and add one dependency:

uv init weather-pipeline-lockdemo
cd weather-pipeline-lockdemo
uv add "pandas==2.2.1"

Open uv.lock. Notice it pins not just pandas but every transitive dependency (numpy, python-dateutil, pytz, and more) with exact versions. Your direct dependencies live in pyproject.toml; the full resolved graph lives in uv.lock.

Now simulate a CI install, which should use the lock file exactly:

uv sync --frozen

To see what the lock file protects you against, edit pyproject.toml and bump pandas to 2.2.2 without running uv lock. Run uv sync --frozen again: it still succeeds and installs pandas 2.2.1, because --frozen treats uv.lock as the source of truth and does not check it against pyproject.toml. Now run uv sync --locked and read the error. That error is what you want CI to throw: it means somebody changed dependencies without committing an updated lock file, so what gets installed no longer matches what pyproject.toml declares.

Refresh the lock file on purpose with uv lock, confirm uv sync --locked passes again, and commit both files together.

<aside> ⚠️ Plain uv sync (without --frozen) will happily re-resolve dependencies and rewrite uv.lock if pyproject.toml and uv.lock disagree. That is convenient locally but exactly the behavior you do not want in CI or Docker. Always add --frozen there, or --locked if the install should fail when the two files disagree.

</aside>

Which package manager should you choose?

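Use requirements.txt when you want the simplest, most portable setup, or when an existing project already relies on it.

Use uv when you are starting fresh and want the full dependency tree, including transitive dependencies, pinned in uv.lock.

For this track, the recommendation is uv.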

For the Week 5 assignment specifically, do not agonize over the choice: use whatever your Week 3 or Week 4 pipeline already has (usually a requirements.txt). Reach for uv only if you are starting that project from scratch. Either option passes the assignment, because the goal is a reproducible install, not a particular tool.

<aside> 💡 Using AI to help: Paste two uv.lock snapshots (before and after a dependency update) into an LLM and ask it to explain what changed and why it matters. (⚠️ Ensure no PII or sensitive company data is included!)

</aside>

Runtime vs dev dependencies

Keep production dependencies separate from development tools like linters and test runners. This keeps your container images smaller and your CI runs faster.

requirements.txt approach:

uv approach:

[project.optional-dependencies]
dev = [
  "pytest==8.2.0",
  "ruff==0.5.1",
]

<aside> ⚠️ If you install dev tools in production containers, you increase image size and risk extra vulnerabilities.

</aside>

pytest and ruff appear in every dev dependency list in this track. You set them up in Week 2: Testing with pytest and Linting and Formatting. Separating them from production dependencies is how you keep those tools out of the final container image.

<aside> 📘 Core program connection: The dev/production split mirrors the pattern you saw with JavaScript: devDependencies in package.json stay off the production server. Python's [project.optional-dependencies] groups serve the same purpose. Review: Core Program - Package managers

</aside>

What reproducible CI actually means

A reproducible CI run means the same commit installs the same dependency set every time the pipeline runs. CI is your safety check before a deploy, and the signal gets noisy fast when resolvers pick different transitive packages between runs.

uv sync --frozen is the guardrail: it installs exactly what uv.lock records and never re-resolves. If you also want CI to fail loudly when pyproject.toml and uv.lock have drifted apart, run uv sync --locked, which refuses to install in that case.

The concept of deterministic builds applies across every language and toolchain. Checking whether a dependency is pinned is one of the simplest automated quality gates you can add to a project.

<aside> ⌨️ Hands on: Write a function is_pinned(line) that returns True if a requirements.txt line specifies an exact version with ==, and False for any other specifier or a bare package name.

</aside>

<aside> 🚀 Try it in the widget: Is Pinned Version exercise

</aside>

Lock files formalized this idea across every language ecosystem. The same discipline applies to the tools that check your code: pinning ruff and pytest in your dev dependencies ensures every developer and CI runner uses identical versions of the linter and test runner.

<aside> 📘 Core program connection: In the Core program you were introduced to code style and autoformatting tools in JavaScript. ruff in Python serves the same role. Review the Core program chapter: Core Program - Style: Autoformatting

</aside>

That version pinning story extends to every ecosystem.

<aside> 🤓 Curious Geek: Lock files became standard

</aside>

Exercises

  1. Explain one advantage of requirements.txt and one advantage of uv.
  2. Add a dev dependency group and describe when you would install it.
  3. Identify one dependency in your Week 3 or Week 4 pipeline that should be pinned and explain why.

Knowledge Check

<aside> 🚀 Try it in the widget: Interactive Quiz: Dependency Management

</aside>

https://lasse.be/simple-hyf-teach-widget/mcq.html?bank=week_5_ch2_dependency_management&embed=1

If lock files and the difference between uv and pip felt unclear, this video walks through uv from a practical angle.

<aside> 🎬 Struggling with this concept? Watch this beginner-friendly video:

</aside>

https://www.youtube.com/watch?v=qh98qOND6MI

Extra reading