Week 2 - Structuring Data Pipelines


The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 (https://hackyourfuture.net/)


Built with ❤️ by the HackYourFuture community · Thank you, contributors



Welcome to Week 2! Now that you know the Python basics, it's time to move from "writing scripts" to "building engineering systems." This week is all about architecture. You will learn how to structure your code so that it is readable, testable, and robust against the messy reality of production data.

By the end of this week, you will have refactored a messy "god script" into a professional, modular pipeline that survives missing files, malformed rows, and missing config, with configuration, data modeling, and business logic each living in their own module.

Learning goals


Chapters

  1. Introduction to Data Pipelines
  2. Configuration & Secrets
  3. Separation of Concerns
  4. OOP vs Functional Programming
  5. Dataclasses for Data Objects
  6. Functional Composition
  7. Testing with Pytest
  8. Linting and Formatting with Ruff
  9. Practice
  10. Gotchas & Pitfalls
  11. Assignment

Supplementary

Lesson plan: live-quiz answer cribs, workshop reference solutions, and the assignment rubric. Deliberately excluded from scripts/notion_mapping.json so students cannot read the answer key.