Functions and Modules

Organizing code into functions and modules is essential for building maintainable data pipelines.

Functions Review

You learned functions in Core. Here's what's important for data engineering:

def clean_value(value: str, default: str = "") -> str:
    """Clean and normalize a string value.

    Args:
        value: The string to clean
        default: Value to return if input is empty

    Returns:
        Cleaned string, lowercase and stripped
    """
    if not value:
        return default
    return value.strip().lower()

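Three things worth noticing in the example above: the type hints on the parameters and the return value, the default argument that callers may leave out, and the docstring that documents the arguments and the return value.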

<aside> 💡 Always write docstrings. They help your future self and teammates.

</aside>
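
A short sketch of how the function above behaves when called. The sample inputs ("  Amsterdam ", "unknown") are made up for illustration, and the docstring is shortened to keep the block compact.

```python
def clean_value(value: str, default: str = "") -> str:
    """Clean and normalize a string value."""
    if not value:
        return default
    return value.strip().lower()


# Normal input: whitespace is stripped and the text is lowercased.
print(clean_value("  Amsterdam "))              # amsterdam

# Empty input: the default is returned instead.
print(clean_value("", default="unknown"))       # unknown

# A whitespace-only string is not empty, so the default is NOT used here;
# the result is an empty string after stripping.
print(repr(clean_value("   ", default="unknown")))  # ''

# The docstring is available at runtime, which is what help() and IDEs display.
print(clean_value.__doc__)                      # Clean and normalize a string value.
```

The whitespace-only case is worth knowing about when cleaning real data: if blank cells should also fall back to the default, one option is to strip first and check for emptiness afterwards.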

Modules

A module is simply a .py file. Other files can import the functions, classes, and variables it defines.

# `utils.py`
def clean_value(value):
    return value.strip().lower()

# `main.py`
from utils import clean_value
print(clean_value("  HELLO  "))
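
The two-file example above can be tried in a single runnable sketch. It writes a throwaway module to a temporary directory and imports it in two common styles. The module name demo_utils and the temporary-directory setup are assumptions made only to keep the example self-contained; in a real project you would simply place the .py file next to main.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source for a throwaway module (hypothetical name: demo_utils).
module_source = '''
def clean_value(value):
    return value.strip().lower()
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_utils.py").write_text(module_source)
    sys.path.insert(0, tmp)           # tell Python where to look for the file
    importlib.invalidate_caches()     # make sure the new file is noticed
    try:
        import demo_utils                      # style 1: import the whole module
        from demo_utils import clean_value     # style 2: import a single name

        result_a = demo_utils.clean_value("  HELLO  ")
        result_b = clean_value("  HELLO  ")
    finally:
        sys.path.remove(tmp)          # undo the path change

print(result_a)  # hello
print(result_b)  # hello
```

Both styles call the same function. Importing the whole module keeps it visible where each name comes from, while importing a single name is shorter at the call site.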

The __name__ == "__main__" Pattern

This pattern relies on two dunder names (Python's convention for double-underscore identifiers) to let a file work both as a module AND as a script:

# `utils.py`
def clean_value(value):
    return value.strip().lower()

if __name__ == "__main__":
    # Only runs when executed directly
    print(clean_value("  TEST  "))
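
To see the two modes side by side, the sketch below writes a small file and then both runs it as a script and imports it, each in a fresh Python process. The file name guard_demo.py and the extra print of __name__ are assumptions added for illustration; they are not part of the course's utils.py.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A tiny module that reports its own __name__ (hypothetical file: guard_demo.py).
module_source = '''
def clean_value(value):
    return value.strip().lower()

print("__name__ is", __name__)

if __name__ == "__main__":
    # Only runs when executed directly
    print(clean_value("  TEST  "))
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "guard_demo.py")
    script.write_text(module_source)

    # Mode 1: run the file directly, like `python guard_demo.py`.
    as_script = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    # Mode 2: import the file from another program.
    as_import = subprocess.run(
        [sys.executable, "-c", "import guard_demo"],
        capture_output=True, text=True, check=True, cwd=tmp,
    ).stdout.splitlines()

print(as_script)  # ['__name__ is __main__', 'test']
print(as_import)  # ['__name__ is guard_demo']
```

When the file is imported, the guarded block is skipped, so only the module-level print runs.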

<aside> ⌨️ Hands on: Create utils.py with a function, then import it in main.py.

</aside>


<aside> 🚀 Try it in the widget: https://lasse.be/simple-hyf-teach-widget/?week=1&chapter=functions_and_modules&exercise=modules_demo&lang=python

</aside>

The dunder names in the pattern above have an interesting backstory.

<aside> 🤓 Curious Geek: The __name__ == "__main__" idiom

Python files are dual-purpose by design: a .py file can be imported as a library (from utils import clean_value) or run directly as a script (python utils.py). The __name__ variable is how Python tells the file which mode it is in: it is set to the module's name when the file is imported, and to the literal string "__main__" when the file is executed directly. The idiom dates back to Python's earliest releases in the early 1990s and is characteristic of Python: many other languages instead have a dedicated main() function (C, Java, Go) or a separate "entry-point" declaration. Putting script-only code under the guard means importers get the function definitions without triggering the file's smoke test or CLI.

</aside>

You will hit this pattern again every time you write a script that should also be importable: it is a cheap way to make a file dual-purpose.

<aside> 📝 Practice: The week's Practice chapter has two exercises that lean on functions: Ex 1 (the Temperature Logger: write a "production-ready" c_to_f() with a docstring and a default) and Ex 4 (Grade Processor: refactor a 50-line script into named functions). Each runs in your venv in a few minutes.

</aside>

🧠 Knowledge Check


<aside> 🚀 Try it in the widget: Interactive Quiz: Functions and Modules

</aside>

https://lasse.be/simple-hyf-teach-widget/mcq.html?bank=week_1_ch4_functions_modules_quiz&embed=1

Extra reading


Next up: Type Hints for Clearer Code, where you turn the optional value: str annotations into a habit and meet the tooling (mypy, IDE inference) that uses them to catch bugs before your pipeline runs.


The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*


Built with ❤️ by the HackYourFuture community · Thank you, contributors
