Week 7 - Mid-Track Project

Week 7 Gotchas & Pitfalls

Project weeks have their own failure modes. Most of them are about planning and scope, not code.

Each gotcha starts with what you might assume (the misconception), followed by what actually happens (the reality).

1. Scope creep

The misconception

"I should build something impressive. One data source is too simple, so I will add three APIs, a transformation layer, and a dashboard."

The reality

One week is not enough for an ambitious project. If you try to build too much, you finish nothing. A working pipeline with one API is worth more than a half-finished system with three.

The Fix: Define your MVP on Day 1: one data source, one pipeline, one storage target. Only add features after the core pipeline works end to end.

<aside> 💡 Write your MVP in one sentence: "My pipeline fetches X from Y and stores it in Z." If you cannot say it in one sentence, your scope is too large.

</aside>

2. Unreliable data source

The misconception

"I found a cool API online. I will build my whole project around it and figure out the details later."

The reality

Some APIs require paid access, have strict rate limits (sometimes as low as 10 requests/day), need OAuth flows, or return inconsistent data. If you discover this on Day 3, you have lost half your project time.

The Fix: Verify your data source on Day 1. Call the API, inspect the response, and parse it into your Pydantic model. Have a backup data source ready (Open-Meteo is always a safe fallback).
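A Day-1 smoke test can be as small as the sketch below. It is written against Open-Meteo's documented `current_weather` payload; if you pick a different API, swap in that API's field names. The model and function names here are illustrative, not part of any starter template.

```python
# Day-1 data source check: fetch once, validate the shape with Pydantic.
# Field names match Open-Meteo's current_weather response; adapt as needed.
from pydantic import BaseModel


class CurrentWeather(BaseModel):
    temperature: float  # degrees Celsius
    windspeed: float    # km/h
    time: str           # ISO 8601 timestamp


def parse_current_weather(payload: dict) -> CurrentWeather:
    """Validate the raw API response; raises a clear error if the shape is wrong."""
    return CurrentWeather(**payload["current_weather"])


if __name__ == "__main__":
    import requests  # live call: run this once on Day 1 to verify the source

    resp = requests.get(
        "https://api.open-meteo.com/v1/forecast",
        params={"latitude": 52.52, "longitude": 13.41, "current_weather": True},
        timeout=10,
    )
    resp.raise_for_status()
    print(parse_current_weather(resp.json()))
```

If this script fails on Day 1 (auth error, rate limit, unexpected shape), you have learned it cheaply and can switch to your backup source immediately.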

3. Skipping local testing

The misconception

"I will develop directly on Azure. It is faster to push my code and see if it works in the cloud."

The reality

Every Azure deployment cycle takes minutes: build image, push to ACR, update job, start execution, wait for logs. If your code has a bug, you wait 5 minutes to see an error you could have caught in 5 seconds locally.

The Fix: Always run `docker build` and `docker run` locally first. Confirm the output before pushing to ACR. Use a local `.env` file with test values for local runs.

4. Last-minute deployment

The misconception

"I will get the code working first and deploy to Azure on the last day. Deployment should only take a few commands."

The reality

Azure deployment has its own set of issues: firewall rules, missing `--registry-server`, wrong environment variable names, image pull errors. If you hit these on the last day, you have no time to fix them and prepare for the interview.

The Fix: Deploy a minimal container to Azure on Day 1. Even a "hello world" container that prints a message. This proves your infrastructure works. Then replace it with your real pipeline when the code is ready.

5. Forgetting documentation

The misconception

"I will write the README after the code is done. Documentation is the last step."

The reality

On the last day, you are fixing bugs, deploying, and preparing for the interview. The README gets forgotten or becomes a single sentence: "Run the pipeline." The interviewer opens your repo and has no idea what your project does.

The Fix: Start the README on Day 1. Write the headings: What, How to Run Locally, How to Deploy, How to Verify. Fill in each section as you build. The README is also your interview script.

6. Not preparing for the interview

The misconception

"I built the project, so I can explain it. I do not need to practice."

The reality

Building and explaining are different skills. If you skip interview prep, you may struggle to articulate your decisions, get flustered by unexpected questions, and forget to show the live deployment.

The Fix: The interview has four parts. Prepare for each one specifically.

Part 1: Technical questions (5-7 min): You will be asked one question at each difficulty level.

Part 2: Demo (5-7 min): Walk through your project live. Show: the Container App Job in Azure, a successful execution in the job history, logs from that execution, rows in Postgres, and blobs in storage. Practice this sequence so you are not hunting around in the portal during the interview.

Part 3: Code discussion (during demo): The interviewer will ask questions about your code as you show it. "Why did you use a Pydantic model here instead of if statements?" / "What happens if I remove this try/except?" Knowing why you made each decision matters more than what you built.

Part 4: Code comprehension exercise (5 min): You will receive a short piece of unfamiliar Python pipeline code. Read it, explain what it does, find the bug, and suggest improvements. Copilot will be off. Practice reading code you did not write: reviewing the starter template or a classmate's code is good preparation.
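To warm up for Part 4, try a tiny self-test. The snippet below is a hypothetical example in the spirit of the exercise (not the actual exam code): read the first function, find the bug, then compare with the fixed version.

```python
# Spot the bug: what happens when this pipeline step receives no readings?

def average_temperature(readings: list[dict]) -> float:
    """Return the mean 'temp' value across readings."""
    total = 0.0
    for r in readings:
        total += r["temp"]
    return total / len(readings)  # Bug: ZeroDivisionError on an empty list


def average_temperature_fixed(readings: list[dict]) -> float:
    """Same calculation, but with the empty-input edge case handled."""
    if not readings:
        return 0.0
    return sum(r["temp"] for r in readings) / len(readings)
```

Explaining *why* the empty-list case matters in a pipeline (an API can legitimately return zero rows) is exactly the kind of reasoning the interviewer is listening for.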

<aside> ⌨️ Hands on: Before the interview, do a full dry run: open your terminal, navigate to your project, and walk through all four parts out loud. Time the demo: it should take under 7 minutes. If you stumble or cannot explain a decision, that is the area to prepare.

</aside>

7. Secrets in git history

The misconception

"I removed the connection string from the code before pushing. It is safe now."

The reality

Git remembers everything. If you committed a connection string and then deleted it, the secret is still in the commit history. Anyone with access to the repo can find it with `git log -p`.

The Fix: Use environment variables from the start. Never put real credentials in code, even temporarily. If you accidentally committed a secret, rotate it immediately (ask your teacher for a new connection string).
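In practice this means your code only ever reads credentials from the environment. A minimal sketch (the variable name `DATABASE_URL` is an example; use whatever your project defines):

```python
import os


def get_database_url() -> str:
    """Read the connection string from the environment, never from code."""
    url = os.environ.get("DATABASE_URL")
    if url is None:
        # Fail fast with a clear message instead of crashing later mid-pipeline.
        raise RuntimeError("DATABASE_URL is not set; check your .env file or Azure secrets")
    return url
```

Locally the value comes from your `.env` file; on Azure it comes from the Container App Job's environment variables or secrets. The code stays identical in both places.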

<aside> ⚠️ Even private repositories are not safe for secrets. If the repo ever becomes public, or if a collaborator's account is compromised, every secret in the history is exposed.

</aside>

8. Table already exists with different columns

The misconception

"`CREATE TABLE IF NOT EXISTS` will set up my table. It is safe to run multiple times."

The reality

`IF NOT EXISTS` checks only whether a table with that name exists. It does not compare columns. If a classmate (or a previous project) already created `weather_readings` with different columns, the statement silently does nothing. Your `INSERT` then crashes because the columns do not match.

```
psycopg2.errors.UndefinedColumn: column "city" of relation "weather_readings" does not exist
```

This is especially common when multiple students share the same Postgres database.

The Fix: Use a unique table name that includes your project name or username (e.g., `alice_weather_readings`). Alternatively, check the existing schema before inserting:

```sql
SELECT column_name FROM information_schema.columns WHERE table_name = 'weather_readings';
```
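You can also automate this guard in code. The sketch below uses SQLite's `PRAGMA table_info` so it runs anywhere; on Postgres you would run the `information_schema.columns` query instead. The table and column names are illustrative.

```python
import sqlite3

# The columns your pipeline expects to insert into (example schema).
EXPECTED_COLUMNS = {"city", "temperature", "recorded_at"}


def check_table_columns(conn, table: str, expected: set[str]) -> set[str]:
    """Return the expected columns missing from the live table (empty set = OK)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    actual = {row[1] for row in rows}  # column name is the second field
    return expected - actual


# Demo: someone else already created the table with different columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather_readings (location TEXT, temp REAL)")
missing = check_table_columns(conn, "weather_readings", EXPECTED_COLUMNS)
```

Running this check at startup turns a cryptic `UndefinedColumn` crash mid-pipeline into a clear error message before any data is fetched.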

<aside> 💡 When sharing a database, prefix your table names to avoid collisions. This is a real-world practice: production databases often use schemas (`analytics.weather_readings`) to separate teams.

</aside>

9. Ignoring cost cleanup

The misconception

"The project is done and graded, so I do not need to do anything else."

The reality

Your Container App Job still exists in Azure. If it has a cron trigger, it keeps running and using resources. Even manual jobs consume environment resources. The shared Postgres server costs money when it is running.

The Fix: Delete your Container App Job after evaluation:

```shell
az containerapp job delete --name <your-job-name> --resource-group <rg> --yes
```

This is not optional. See [Week 6 Chapter 6](<../Week 6/week_6__6_cost_awareness.md>).

<aside> 💡 Using AI to help: If you hit a deployment error you do not understand, paste the full error message into an LLM and ask: "What does this Azure Container Apps error mean and how do I fix it?" This is faster than searching documentation for cryptic error codes. (⚠️ Ensure no PII or sensitive company data is included!)

</aside>

The term "scope creep" has a long history in software project management.

Extra reading


The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*

Built with ❤️ by the HackYourFuture community · Thank you, contributors

Found a mistake or have a suggestion? Let us know in the feedback form.