History of Containers and CI/CD
This page is optional. Nothing here is required for Week 5's learning goals or the assignment. It exists for students who want to understand how the tools you used this week came to be. Read it in one sitting, or skip it and come back when you are curious.
Week 6's History of Cloud Computing traces how infrastructure evolved: bare metal, virtual machines, cloud, serverless. This page covers a different arc: how applications got packaged, isolated, and shipped. The two stories run in parallel.
Background for Chapter 1: Introduction to Containers and CI/CD
Before containers, deploying software meant copying code onto a machine and hoping the environment matched your laptop. Different OS versions, missing libraries, wrong Python interpreter: every new deployment was a small archaeology project. The phrase "works on my machine" dates from this era and is older than most of your teachers.
Teams tried two things to fix it. Virtual machines (VMware 1999, Xen 2003) let you package an entire operating system so you could ship a known-good environment. They worked, but a full VM is heavy: gigabytes of disk, minutes to boot, a full kernel you do not need. Configuration management tools like Puppet (2005), Chef (2009), and Ansible (2012) took a different approach: describe the desired state of a machine in code, and let the tool install packages and edit configs until reality matched. Also useful, also complicated, and still brittle when OS updates moved the ground underneath you.
Containers split the difference: ship the application and its dependencies, but share the host's kernel. Faster than a VM, more reproducible than config management.
Background for Chapter 3: Docker Fundamentals
Containers are not a single invention. They are a combination of Unix isolation features that were added one at a time over forty years.
chroot (1979). Added in Version 7 Unix, chroot lets you pin a process to a subtree of the filesystem so it cannot see anything outside. This was the first filesystem isolation primitive. It was never meant as a security boundary, but it was the first crack in the idea that every process shared everything.

Later kernel features built on that crack. FreeBSD jails (2000) meant you could finally isolate a whole "virtual server" on one physical host. Linux namespaces (added gradually from 2002) gave a group of processes its own private view of the filesystem, process table, and network, while cgroups (merged in 2008, originally from Google) capped how much CPU and memory it could use. By 2013, Linux had all the building blocks for containers. What it lacked was a developer-friendly way to use them.
Background for Chapter 3: Docker Fundamentals
LXC (Linux Containers, 2008) was the first attempt to package namespaces + cgroups into a single "container" abstraction. It worked, but using it meant writing long configuration files and orchestrating kernel features by hand. LXC stayed a tool for systems engineers.
Docker (2013) was released by Solomon Hykes at a company called dotCloud. It started from LXC (later replaced by Docker's own libcontainer) and added the aufs layered filesystem, a simple command-line interface, a public image registry (Docker Hub), and the Dockerfile as a declarative image-build recipe. A single docker run redis replaced pages of LXC config. Developer adoption exploded within months.
The critical insight: Docker's breakthrough was not new technology. Every isolation primitive Docker used had existed for years. The breakthrough was developer ergonomics: a Dockerfile, a registry, and a two-word CLI. The container wars from 2013 to 2017 (Docker vs. CoreOS rkt vs. systemd-nspawn) were won by the tool that was easiest to use, not the tool with the best isolation.
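To make those ergonomics concrete, here is a minimal sketch of a Dockerfile. The base image and file names (requirements.txt, pipeline.py) are placeholders for illustration, not the files from this week's assignment.

```dockerfile
# Minimal sketch, not the assignment's actual Dockerfile.
# Base image and file names are placeholders.
FROM python:3.12-slim

WORKDIR /app

# Install dependencies first so this layer is reused when only the code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code.
COPY pipeline.py .

# What the container runs when started with docker run.
CMD ["python", "pipeline.py"]
```

Building and running it is the two-command experience the chapter describes: `docker build -t my-pipeline .` followed by `docker run my-pipeline`.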
Background for Chapter 3: Docker Fundamentals
As Docker dominated, the ecosystem worried about vendor lock-in: what if every container image only worked with Docker? In 2015, Docker, CoreOS, Google, and Red Hat founded the Open Container Initiative (OCI) to standardize two things: the image format (how a container image is laid out on disk) and the runtime (how a container is actually started and stopped).
Docker split its codebase to match the standard:
- containerd: the long-running daemon that pulls images, manages storage and networking, and supervises running containers.
- runc: the low-level OCI runtime that actually calls clone(), sets up namespaces, and starts your container process.

Why this matters today: in 2022, Kubernetes removed direct Docker Engine support in favor of talking to containerd directly. Nothing about your images changed; OCI standardization means the same image runs under Docker, containerd, Podman, CRI-O, and a dozen other runtimes. The image you built from your Dockerfile in Chapter 3 is, strictly speaking, an OCI image.
Background for Chapter 3: Docker Fundamentals and the Going Further Kubernetes primer
Once containers were easy to run on one machine, teams wanted to run thousands of them across many machines, with automatic restarts, rolling updates, and load balancing. This is the orchestration problem. Three systems competed to solve it in the 2010s: Docker Swarm, Apache Mesos, and Kubernetes (open-sourced by Google in 2014 and donated to the CNCF in 2015).
By 2017 Kubernetes had effectively won. Its design choices explain why: declarative state (describe what you want, not the steps to get there), extensible resource types (Pods, Services, Ingress, and third-party ones via CustomResourceDefinition), and vendor neutrality (CNCF ownership made every cloud provider comfortable building on it).
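As an illustration of declarative state, here is a minimal sketch of a Kubernetes Deployment manifest; the names, image, and replica count are made up for illustration. You declare the result you want, and Kubernetes continuously reconciles the cluster toward it.

```yaml
# Minimal sketch of declarative state; names, image, and replica count are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-pipeline
spec:
  replicas: 3                    # "keep three copies running", not "start three containers"
  selector:
    matchLabels:
      app: data-pipeline
  template:
    metadata:
      labels:
        app: data-pipeline
    spec:
      containers:
        - name: pipeline
          image: ghcr.io/example/data-pipeline:latest   # any OCI image works here
```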
Today every major cloud offers managed Kubernetes: Azure Kubernetes Service (AKS), Amazon EKS, and Google GKE. Serverless front-ends like Azure Container Apps, AWS Fargate, and Google Cloud Run accept the same container images, and several of them run Kubernetes under the hood. Even when you don't write Kubernetes YAML, you are often using it indirectly.
Background for Chapter 5: Python CI Pipeline
Containers made deployment reproducible, but you still need something to build, test, and push the image on every commit. That is what continuous integration (CI) and continuous deployment (CD) pipelines do.
CruiseControl (2001) was an early CI server: a Java application you installed and managed yourself, with builds described in XML configuration on the server. Travis CI (2011) moved the configuration into the repository: you committed a .travis.yml and builds ran on Travis's servers, free for open source projects. GitHub Actions (2019) folded the same model into GitHub itself.

The shift from CruiseControl's XML in 2001 to GitHub Actions' YAML in 2019 is the same pattern you saw with containers: the underlying machinery (run commands when code changes) barely changed. What changed was where the config lives (a central server → your repo) and how easy it was to start (install and manage a Java app → push a YAML file).
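As an illustration of that repo-local YAML model, here is a minimal sketch of a workflow that builds and pushes an image on every push to main. The action versions, registry, and image name are assumptions, not the exact workflow from Chapter 5.

```yaml
# Minimal sketch: build an image on every push to main and publish it to a registry.
# Registry and image name are placeholders, not the course's exact workflow.
name: build-and-push

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write              # allow pushing to GitHub Container Registry
    steps:
      - uses: actions/checkout@v4

      - name: Log in to the registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push the image
        uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/example/data-pipeline:latest   # replace with your own image name
```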
Every step in this timeline removed a reason a developer might say "works on my machine":
| Era | What shipped | What could still go wrong (or what finally worked) |
|---|---|---|
| 1990s | Source code + README | Wrong OS, missing libraries, manual steps |
| 2000s | VMs (VMware, Xen) | Heavy, slow to boot, image drift |
| Early 2010s | Config management (Puppet, Chef) | Non-deterministic: works today, not tomorrow |
| 2013+ | Docker images | Pinned dependencies, reproducible builds |
| 2015+ | OCI + containerd + K8s | Runs the same image anywhere, from laptop to cloud |
| 2019+ | CI-built OCI images pushed on every commit | Every commit produces a deployable artifact |
In Week 5 you worked at the last row: a GitHub Actions workflow builds a container image on every push and publishes it to a registry. That pipeline only looks simple because the last forty years of Unix, cloud, and CI history converged to make it so.
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*
