🎒 Week 1 Assignment: The Data Cleaning Pipeline

In this assignment, you will build a robust command-line tool to clean a "messy" dataset. This mimics a very common real-world task for data engineers: taking raw, inconsistent data and transforming it into a clean, usable format.

Task 1 – The Cleaner Pipeline

week_1__messy_users.csv

You have been given a file data/messy_users.csv. It contains user data, but it is full of errors: whitespace issues, inconsistent capitalization, missing fields, and badly formatted numbers.

Instead of one big script, you will build a modular pipeline:

  1. src/utils.py: Create functions for cleaning individual fields (e.g., clean_salary, clean_name).
  2. src/cleaner.py: The main entry point that uses functions from utils.py to process the entire file.
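
One possible shape for the entry point, as a minimal sketch rather than the required design: the function name run_pipeline and the clean_row callback are hypothetical, and the sketch assumes the CSV has a header row. It reads every row, passes it through a cleaning function (which would be built from your utils.py helpers), and writes the result as JSON.

```python
import csv
import json
from pathlib import Path
from typing import Callable, Dict, List

Row = Dict[str, str]


def run_pipeline(in_path: Path, out_path: Path, clean_row: Callable[[Row], Row]) -> int:
    """Read a CSV file, clean every row, and write the rows out as JSON.

    `clean_row` is a hypothetical hook: any function that takes one raw row
    (a dict keyed by the CSV header) and returns the cleaned row.
    Returns the number of rows written.
    """
    # newline="" is what the csv module expects when opening files for reading.
    with in_path.open(newline="", encoding="utf-8") as f:
        cleaned: List[Row] = [clean_row(row) for row in csv.DictReader(f)]

    # Create the output folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(cleaned, f, indent=2)

    return len(cleaned)
```

In cleaner.py this could be called with Path("data/messy_users.csv") and Path("output/clean_users.json"), passing a clean_row function that applies the field-level helpers from utils.py. Keeping the file handling separate from the field rules makes each part easier to test on its own.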

Cleaning Rules

  1. Name: Remove any leading/trailing whitespace.
  2. Email: Convert to lowercase.
  3. Department: If missing, set to "Unknown".
  4. Salary:
  5. Validation:
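
The three rules that are fully spelled out above (name, email, department) could look like this in src/utils.py. This is a minimal sketch under assumptions: the names clean_email and clean_department are hypothetical (the assignment only names clean_name and clean_salary as examples), and treating None or whitespace-only text as "missing" is a choice made here, not a stated rule. The salary and validation rules are not covered.

```python
from typing import Optional


def clean_name(raw: Optional[str]) -> str:
    """Rule 1: remove leading/trailing whitespace from a name."""
    return (raw or "").strip()


def clean_email(raw: Optional[str]) -> str:
    """Rule 2: convert an email address to lowercase."""
    return (raw or "").lower()


def clean_department(raw: Optional[str]) -> str:
    """Rule 3: return "Unknown" when the department is missing.

    Assumption: None, an empty string, and whitespace-only text
    all count as "missing".
    """
    value = (raw or "").strip()
    return value if value else "Unknown"
```

For example, clean_name("  Alice ") returns "Alice", clean_email("Bob@Example.COM") returns "bob@example.com", and clean_department("") returns "Unknown". Small single-purpose functions like these are easy to check one at a time before wiring them into cleaner.py.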

Technical Requirements

Task 2 – AI Debugging Report

We want you to practice using AI as a tool for debugging, not just generating code.

  1. Introduce a bug into your code intentionally (or use a real one you encountered).
  2. Ask an LLM (ChatGPT, Claude, etc.) to help you fix it.
  3. Create a file AI_DEBUG.md and document:

Task 3 – Azure Setup

Data engineering often happens in the cloud. We need to verify you are ready for the upcoming cloud modules.

  1. Log into portal.azure.com.
  2. Take a screenshot of the portal dashboard showing your account logged in.
  3. Save the image as assets/azure_proof.png.

Submission

  1. Ensure your project structure looks like this:
    week1-assignment/
    ├── src/
    │   ├── cleaner.py
    │   └── utils.py
    ├── data/
    │   └── messy_users.csv
    ├── output/
    │   └── (clean_users.json will be generated here)
    ├── assets/
    │   └── azure_proof.png
    ├── AI_DEBUG.md
    └── README.md
  2. Create a git branch week1/your-name.
  3. Commit your changes.
  4. Push to the repository and open a Pull Request.

The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0

*https://hackyourfuture.net/*
