Project weeks have their own failure modes. Most of them are about planning and scope, not code.
Each gotcha starts with what you might assume (the misconception), followed by what actually happens (the reality).
"I should build something impressive. One data source is too simple, so I will add three APIs, a transformation layer, and a dashboard."
One week is not enough for an ambitious project. If you try to build too much, you finish nothing. A working pipeline with one API is worth more than a half-finished system with three.
The Fix: Define your MVP on Day 1: one data source, one pipeline, one storage target. Only add features after the core pipeline works end to end.
<aside> 💡 Write your MVP in one sentence: "My pipeline fetches X from Y and stores it in Z." If you cannot say it in one sentence, your scope is too large.
</aside>
"I found a cool API online. I will build my whole project around it and figure out the details later."
Some APIs require paid access, have strict rate limits (for example, 10 requests per day), need OAuth flows, or return inconsistent data. If you discover this on Day 3, you have lost half your project time.
The Fix: Verify your data source on Day 1. Call the API, inspect the response, and parse it into your Pydantic model. Have a backup data source ready (Open-Meteo is always a safe fallback).
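The Day 1 check can be as small as the sketch below. It assumes a made-up payload shape: the `Reading` class, the `parse_reading` function, and the `city` and `temperature_c` fields are hypothetical. A standard-library dataclass stands in for your Pydantic model so the example runs without extra dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Stand-in for your Pydantic model; the field names are hypothetical."""

    city: str
    temperature_c: float


def parse_reading(payload: dict) -> Reading:
    """Turn one raw API record into a Reading, failing loudly on surprises."""
    missing = [key for key in ("city", "temperature_c") if key not in payload]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return Reading(
        city=str(payload["city"]),
        temperature_c=float(payload["temperature_c"]),
    )


# Paste in one record saved from a real call on Day 1 (these values are made up).
sample = {"city": "Amsterdam", "temperature_c": "11.5"}
print(parse_reading(sample))
```

If this fails on a real record, you learn about the mismatch on Day 1 instead of Day 3.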
"I will develop directly on Azure. It is faster to push my code and see if it works in the cloud."
Every Azure deployment cycle takes minutes: build image, push to ACR, update job, start execution, wait for logs. If your code has a bug, you wait 5 minutes to see an error you could have caught in 5 seconds locally.
The Fix: Always run docker build and docker run locally first. Confirm the output before pushing to ACR. Use a local .env file with test values for local runs.
"I will get the code working first and deploy to Azure on the last day. Deployment should only take a few commands."
Azure deployment has its own set of issues: firewall rules, missing --registry-server, wrong environment variable names, image pull errors. If you hit these on the last day, you have no time to fix them and prepare for the interview.
The Fix: Deploy a minimal container to Azure on Day 1, even a "hello world" container that only prints a message. This proves your infrastructure works. Then replace it with your real pipeline when the code is ready.
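A minimal sketch of such a "hello world" entrypoint follows. Besides printing a message, it reports which expected environment variables are visible inside the container, which helps catch wrong variable names early. `POSTGRES_URL` is a hypothetical name, so substitute the variables your pipeline actually reads. Only names are printed, never values.

```python
import os

# DB_SCHEMA is used later in this chapter; POSTGRES_URL is a hypothetical example.
EXPECTED_VARS = ("DB_SCHEMA", "POSTGRES_URL")


def startup_report(environ=os.environ) -> str:
    """Build a short report showing which expected variables are set."""
    lines = ["hello from the container"]
    for name in EXPECTED_VARS:
        status = "set" if environ.get(name) else "MISSING"
        lines.append(f"{name}: {status}")
    return "\n".join(lines)


print(startup_report())
```

If the execution logs in Azure show this report, image pulls, registry access, and environment variables are all working before your real code exists.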
"I will write the README after the code is done. Documentation is the last step."
On the last day, you are fixing bugs, deploying, and preparing for the interview. The README gets forgotten or becomes a single sentence: "Run the pipeline." The interviewer opens your repo and has no idea what your project does.
The Fix: Start the README on Day 1. Write the headings: What, How to Run Locally, How to Deploy, How to Verify. Fill in each section as you build. The README is also your interview script.
"I built the project, so I can explain it. I do not need to practice."
Building and explaining are different skills. If you skip interview prep, you may struggle to articulate your decisions, get flustered by unexpected questions, and forget to show the live deployment.
The Fix: The interview has four parts. Prepare for each one specifically.
Part 1: Technical questions (5-7 min): You will be asked one question at each difficulty level. Examples:
Part 2: Demo (5-7 min): Walk through your project live. Show: the Container App Job in Azure, a successful execution in the job history, logs from that execution, rows in Postgres, and blobs in storage. Practice this sequence so you are not hunting around in the portal during the interview.
Part 3: Code discussion (during demo): The interviewer will ask questions about your code as you show it. "Why did you use a Pydantic model here instead of if statements?" / "What happens if I remove this try/except?" Knowing why you made each decision matters more than what you built.
Part 4: Code comprehension exercise (5 min): You will receive a short piece of unfamiliar Python pipeline code. Read it, explain what it does, find the bug, and suggest improvements. Copilot will be off. Practice reading code you did not write: reviewing the starter template or a classmate's code is good preparation.
<aside> ⌨️ Hands on: Before the interview, do a full dry run: open your terminal, navigate to your project, and walk through all four parts out loud. Time the demo: it should take under 7 minutes. If you stumble or cannot explain a decision, that is the area to prepare.
</aside>
"I removed the connection string from the code before pushing. It is safe now."
Git remembers everything. If you committed a connection string and then deleted it, the secret is still in the commit history. Anyone with access to the repo can find it with git log -p.
The Fix: Use environment variables from the start. Never put real credentials in code, even temporarily. If you accidentally committed a secret, rotate it immediately (ask your teacher for a new connection string).
<aside> ⚠️ Even private repositories are not safe for secrets. If the repo ever becomes public, or if a collaborator's account is compromised, every secret in the history is exposed.
</aside>
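One way to catch a credential before it reaches a commit is a quick scan of the files you are about to stage. The sketch below is a rough heuristic, not a replacement for a dedicated secret scanner: it only recognises URL-style credentials such as `scheme://user:password@host`, and the function name is hypothetical. It checks current file contents, not the Git history.

```python
import re

# Matches scheme://user:password@ ; a plain host:port URL has no "@" and is ignored.
_URL_CREDENTIALS = re.compile(r"[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@/]+@", re.IGNORECASE)


def lines_with_credentials(text: str) -> list[int]:
    """Return the 1-based line numbers that appear to embed a password in a URL."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _URL_CREDENTIALS.search(line)
    ]


source = (
    "import os\n"
    'conn = "postgresql://admin:s3cret@db.example.com:5432/app"\n'
    'conn = os.environ["POSTGRES_URL"]\n'
)
print(lines_with_credentials(source))  # [2]
```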
"I called my table weather_readings and used CREATE TABLE IF NOT EXISTS. It is safe to run multiple times."
Every student shares the same Postgres server. Without a schema, your table lands in public alongside every other student's tables. Two outcomes, both bad:
Another student has already created public.weather_readings with different columns. Your CREATE TABLE IF NOT EXISTS silently does nothing, and your INSERT crashes because the columns do not match:
psycopg2.errors.UndefinedColumn: column "city" of relation "weather_readings" does not exist
The Fix: Set your personal schema at connection time using the DB_SCHEMA environment variable. The starter template already does this:
schema = os.environ.get("DB_SCHEMA", "public")
cur.execute(f"CREATE SCHEMA IF NOT EXISTS {schema}")
cur.execute(f"SET search_path TO {schema}")
Set DB_SCHEMA=dev_alice in your .env and in the --env-vars of your Container App Job (replace alice with your GitHub handle). All CREATE TABLE and INSERT statements after SET search_path run in your schema automatically.
You practiced this pattern in Week 6 Chapter 4. If you skipped that exercise, now is the time to revisit it.
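The snippet above interpolates the schema name into SQL with an f-string, because identifiers cannot be passed as ordinary query parameters. A typo or stray character in DB_SCHEMA therefore ends up directly in the statement. The sketch below shows one possible guard. The function name is hypothetical, and the pattern deliberately accepts only a conservative subset of what Postgres allows in identifiers.

```python
import os
import re

# A letter or underscore, followed by letters, digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def schema_from_env(environ=os.environ) -> str:
    """Read DB_SCHEMA and refuse anything that is not a plain identifier."""
    schema = environ.get("DB_SCHEMA", "public")
    # Postgres truncates identifiers longer than 63 bytes, so reject those too.
    if not _IDENTIFIER.fullmatch(schema) or len(schema) > 63:
        raise ValueError(f"DB_SCHEMA is not a safe schema name: {schema!r}")
    return schema
```

Calling this once at startup, before any CREATE SCHEMA or SET search_path statement, turns a confusing SQL error into a clear configuration error.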
<aside>
💡 In production, teams use schemas to isolate environments (dev, staging, prod) or departments (analytics, finance). The pattern is the same.
</aside>
"My tests pass locally, so CI should pass too."
The CI workflow runs ruff format --check src/ as a separate step. If any file would be reformatted, the step fails with 1 file would be reformatted, even if the logic is correct and all tests pass. Common triggers: a long line that the formatter would wrap, inconsistent quote style, or trailing whitespace. Unsorted imports are reported by ruff check, not ruff format, and only when the import-sorting rules are enabled.
Would reformat: src/storage.py
1 file would be reformatted, 2 files already formatted
Error: Process completed with exit code 1.
The Fix: Before every push, run both commands locally:
uv run ruff format src/ # rewrites files in place
uv run ruff check src/ # catches lint errors (add --fix to auto-correct simple ones)
Re-stage any files ruff changed, then push. Adding this to your muscle memory on Day 1 saves repeated CI failures throughout the week.
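If you would rather not rely on memory, the same checks can be wrapped in a small script. This is a sketch assuming `uv` and `ruff` are installed as in the commands above. The `failed_checks` name is hypothetical. It uses `--check` so that, like CI, it reports problems instead of rewriting files.

```python
import subprocess

# Mirrors the CI steps: report formatting and lint problems without changing files.
CHECKS = [
    ["uv", "run", "ruff", "format", "--check", "src/"],
    ["uv", "run", "ruff", "check", "src/"],
]


def failed_checks(run=subprocess.run) -> list[str]:
    """Run every check and return the command lines that exited non-zero."""
    failures = []
    for command in CHECKS:
        result = run(command)
        if result.returncode != 0:
            failures.append(" ".join(command))
    return failures
```

Call `failed_checks()` from a script before pushing and exit with a non-zero status if the list is not empty. The `run` parameter exists only so the function can be exercised without invoking the real tools.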
"I fixed a bug, pushed to GitHub, and CI built a new image. My job will pick it up automatically."
A Container Apps Job stores the exact image reference, tag included, that you gave it at creation time. If your job was created with my-pipeline:1.0 and CI later pushes a new image under a different tag (or latest), the job keeps running the old image indefinitely. Your fix never runs.
The Fix: Use latest as your image tag consistently:
# Every time you push a new version manually:
docker build --platform linux/amd64 -t hyfregistry.azurecr.io/my-pipeline:latest .
docker push hyfregistry.azurecr.io/my-pipeline:latest
Create your job with :latest from the start. Azure pulls a fresh copy of latest on each execution, so every push automatically gets picked up without updating the job.
If you accidentally created your job with a pinned tag (:1.0), update it:
az containerapp job update \
--name my-pipeline-job \
--resource-group rg-hyf-data \
--image hyfregistry.azurecr.io/my-pipeline:latest
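To see at a glance whether a job is pinned, you can inspect the tag in its image reference. The helper below is a hypothetical sketch: it takes the image string from your job's configuration and returns its tag. It ignores digest references (`name@sha256:...`), which this chapter does not use.

```python
def image_tag(image: str) -> str:
    """Return the tag of an image reference, or "latest" when no tag is given."""
    # Look only at the last path segment so a registry port (host:5000/...) is not
    # mistaken for a tag.
    last_segment = image.rsplit("/", 1)[-1]
    _name, separator, tag = last_segment.partition(":")
    return tag if separator else "latest"


print(image_tag("hyfregistry.azurecr.io/my-pipeline:1.0"))  # 1.0
```

Any result other than `latest` means the job will keep running that specific version until you update it.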
"The project is done and graded, so I do not need to do anything else."
Your Container App Job still exists in Azure. If it has a cron trigger, it keeps running and using resources. Even manual jobs consume environment resources. The shared Postgres server costs money when it is running.
The Fix: Delete your Container App Job after evaluation:
az containerapp job delete --name <your-job-name> --resource-group <rg> --yes
This is not optional. See Week 6 Chapter 6.
<aside> 💡 Using AI to help: If you hit a deployment error you do not understand, paste the full error message into an LLM and ask: "What does this Azure Container Apps error mean and how do I fix it?" This is faster than searching documentation for cryptic error codes. (⚠️ Ensure no PII or sensitive company data is included!)
</aside>
The term "scope creep" has a long history in software project management.