Week 1 - Python Foundations

Control Flow: Logic and Loops

Data engineering is all about processing streams of data. To do that, you need to make decisions (conditionals) and repeat actions (loops). This lesson covers the logic you'll use in almost every script.

Conditionals (if/elif/else)

Conditionals let your code make decisions based on data.

Basic Syntax

score = 85

if score >= 90:
    print("Grade: A")
elif score >= 80:
    print("Grade: B")
else:
    print("Grade: C or lower")

Checking for "Truthiness"

Python allows you to check if lists, strings, or numbers are "empty" or "zero" directly:

# Check if a list has items
users = []
if not users:
    print("No users found!")

# Check if a value is missing (None and "" are both falsy)
name = None
if not name:
    print("Name is missing")

# Check if a number is non-zero
count = 0
if count:
    print(f"Count is {count}")
else:
    print("Count is zero")
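One caveat with truthiness, as a minimal sketch: `0`, `""`, `[]`, and `None` are all falsy, so `if not value` alone cannot tell a legitimate zero from a missing value. When that difference matters, check for `None` explicitly first. The `readings` list below is made-up sample data.

```python
readings = [None, 0, 7]  # illustrative values: missing, zero, non-zero

labels = []
for value in readings:
    if value is None:        # explicit check: the value is absent
        labels.append("missing")
    elif not value:          # falsy but not None: here that means 0
        labels.append("zero")
    else:
        labels.append(f"count is {value}")

print(labels)  # ['missing', 'zero', 'count is 7']
```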

Loops

for Loops

Use for loops when you want to iterate over a collection (like a list of files or rows in a CSV). One pass through the loop body is one iteration, and the loop ends on its own once the collection has no more items to hand out. Anything you can loop over this way is called an iterable: lists, tuples, strings, dicts, files, and range() objects all qualify. To get a running counter alongside each value, wrap the iterable in enumerate().

# Loop over a list
files = ["data1.csv", "data2.csv", "data3.csv"]
for filename in files:
    print(f"Processing {filename}...")

# Loop with an index using enumerate()
for i, filename in enumerate(files):
    print(f"File {i+1}: {filename}")

# Loop over a dictionary
user = {"name": "Alice", "role": "Engineer"}
for key, value in user.items():
    print(f"{key}: {value}")
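The paragraph above also names range() objects and strings as iterables. A short sketch of both follows, along with `enumerate(..., start=1)` as an alternative to writing `i+1`. The variable names are illustrative only.

```python
files = ["data1.csv", "data2.csv", "data3.csv"]

# range(3) produces 0, 1, 2 on demand
batch_ids = []
for batch in range(3):
    batch_ids.append(batch)

# A string is iterable too: one character per iteration
letters = []
for char in "csv":
    letters.append(char)

# start=1 makes the counter begin at 1, so no i+1 is needed
labels = []
for position, filename in enumerate(files, start=1):
    labels.append(f"File {position}: {filename}")

print(batch_ids)  # [0, 1, 2]
print(letters)    # ['c', 's', 'v']
print(labels[0])  # File 1: data1.csv
```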

while Loops

Use while loops when you don't know how many times to repeat, but you have a condition to stop (e.g., waiting for a file to appear, or retrying a network request).

import time

retries = 3
while retries > 0:
    print(f"Connecting to database... ({retries} retries left)")
    # simulate connection attempt
    success = False 

    if success:
        print("Connected!")
        break  # Exit the loop immediately

    retries -= 1
    time.sleep(1) # Wait 1 second before retrying

if retries == 0:
    print("Failed to connect.")
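A variation on the retry loop above, as a sketch: instead of inferring failure from the counter after the loop, you can track success in an explicit flag and put both conditions in the while header. The `simulated_outcomes` list is a hypothetical stand-in for real connection attempts, and the delay is left out to keep the example short.

```python
# Hypothetical stand-in for real attempts: fails twice, then succeeds
simulated_outcomes = [False, False, True]

max_retries = 3
attempts = 0
connected = False

# Keep going while attempts remain AND we are not connected yet
while attempts < max_retries and not connected:
    connected = simulated_outcomes[attempts]  # outcome of this attempt
    attempts += 1
    print(f"Attempt {attempts}: {'connected' if connected else 'failed'}")

if connected:
    print(f"Connected after {attempts} attempts")
else:
    print("Failed to connect.")
```

Whether to use `break` or a flag is mostly a readability choice; the flag makes the final success check independent of how the counter is managed.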

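break and continue

continue skips the rest of the current iteration and moves on to the next item; break exits the loop entirely.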

records = [10, 20, -1, 30, -5, 40]

valid_records = []
for record in records:
    if record < 0:
        print(f"Skipping invalid record: {record}")
        continue  # Skip negative numbers

    if record > 50:
        print("Limit reached, stopping.")
        break     # Stop processing if value is too high

    valid_records.append(record)
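In the list above no value exceeds 50, so `break` never fires and `valid_records` ends up as `[10, 20, 30, 40]`. The sketch below uses a modified list (the value 75 is an assumption added for illustration) to show `break` stopping the loop early.

```python
records = [10, 20, -1, 30, 75, 40]  # 75 exceeds the limit of 50

valid_records = []
for record in records:
    if record < 0:
        continue  # skip this record, move on to the next one
    if record > 50:
        break     # stop the whole loop; 40 is never examined
    valid_records.append(record)

print(valid_records)  # [10, 20, 30]
```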

List Comprehensions

List comprehensions are a concise, "Pythonic" way to create lists. They are extremely popular in data engineering for simple transformations.

Syntax

# [expression for item in iterable if condition]

Examples

1. Transform a list (Map)

# Old way
numbers = [1, 2, 3, 4]
doubled = []
for num in numbers:
    doubled.append(num * 2)

# List comprehension way
doubled = [num * 2 for num in numbers]
# Result: [2, 4, 6, 8]

2. Filter a list

# Get only even numbers
evens = [num for num in numbers if num % 2 == 0]
# Result: [2, 4]

3. Clean data strings

raw_names = ["  Alice ", "Bob", "  Charlie  "]
clean_names = [name.strip().lower() for name in raw_names]
# Result: ['alice', 'bob', 'charlie']
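Transforming and filtering can be combined in one comprehension. A minimal sketch, assuming the raw data may also contain empty or whitespace-only entries (these extra entries are not in the example above):

```python
raw_names = ["  Alice ", "", "Bob", "   ", "  Charlie  "]

# Strip and lowercase each name, keeping only entries that are
# non-empty after stripping ("" and "   " are dropped)
clean_names = [name.strip().lower() for name in raw_names if name.strip()]

print(clean_names)  # ['alice', 'bob', 'charlie']
```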

<aside> ⚠️ Pro Tip: If your list comprehension is getting too complex (e.g., nested loops or multiple distinct conditions), switch back to a regular for loop for readability.

</aside>
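To make the tip concrete, here is one possible illustration of a comprehension that has grown too dense, next to the same logic as a regular loop. The `orders` data and the price limits are invented for the example.

```python
orders = [
    {"customer": "Alice", "items": [12.5, -1, 30.0]},
    {"customer": "Bob", "items": [5.0, 0, 8.0]},
]

# Dense version: two loops and two conditions in a single expression
dense = [price for order in orders for price in order["items"] if price > 0 if price < 20]

# Same logic as a regular loop: each rule gets its own line
readable = []
for order in orders:
    for price in order["items"]:
        if price <= 0:
            continue  # skip invalid prices
        if price >= 20:
            continue  # skip prices at or above the limit
        readable.append(price)

print(readable)           # [12.5, 5.0, 8.0]
print(dense == readable)  # True
```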

Nested Loops

Sometimes you need a loop inside a loop.

departments = {
    "Engineering": ["Alice", "Bob"],
    "Sales": ["Charlie", "David"]
}

for dept, employees in departments.items():
    print(f"--- {dept} ---")
    for employee in employees:
        print(f"  - {employee}")

Output:

--- Engineering ---
  - Alice
  - Bob
--- Sales ---
  - Charlie
  - David

This pattern is useful when processing grouped data (e.g., records by department, transactions by customer).
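As a sketch of the "transactions by customer" case, the outer loop can walk the groups while the inner loop accumulates a value per group. The customer names and amounts below are made up for illustration.

```python
transactions_by_customer = {
    "Alice": [20.0, 35.5],
    "Bob": [10.0],
}

totals = {}
for customer, amounts in transactions_by_customer.items():
    total = 0.0
    for amount in amounts:   # inner loop: one customer's transactions
        total += amount
    totals[customer] = total

print(totals)  # {'Alice': 55.5, 'Bob': 10.0}
```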

<aside> 🤓 Curious Geek: Why list comprehensions exist

List comprehensions arrived in Python 2.0 in October 2000 via PEP 202, inspired by similar syntax in Haskell and (further back) the set-builder notation from mathematics: {x² | x ∈ ℕ, x < 10}. Guido van Rossum accepted them into the language; they give a compact form to the explicit for-loop-and-append pattern, which shows up in almost every Python program. The same impulse later gave Python generator expressions (PEP 289), dict comprehensions, and set comprehensions: each one is "a for loop where the only goal was to build a new collection."

</aside>
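For reference, a brief sketch of the three sibling forms mentioned above, using an illustrative `names` list:

```python
names = ["alice", "bob", "alice", "charlie"]

# Set comprehension: curly braces, duplicates collapse
unique_names = {name for name in names}

# Dict comprehension: key: value pairs inside curly braces
name_lengths = {name: len(name) for name in unique_names}

# Generator expression: parentheses, items produced one at a time
total_chars = sum(len(name) for name in names)

print(sorted(unique_names))     # ['alice', 'bob', 'charlie']
print(name_lengths["charlie"])  # 7
print(total_chars)              # 20
```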

Try it yourself

Test your understanding of loops and logic with this interactive exercise:

<aside> 🚀 Try it in the widget: https://lasse.be/simple-hyf-teach-widget/?exercise=control_flow

</aside>

Challenge: The code in the widget loops through a list but has logic errors. Use break and continue to filter the data correctly as per the instructions in the widget.

<aside> 📝 Practice: The week's Practice chapter has two exercises that build on this chapter: Ex 2 (the Data Cleaner: loops + conditionals over a dirty list) and Ex 4 (Grade Processor: branching logic on a dictionary). Both take a few minutes and run in your venv.

</aside>

🧠 Knowledge Check


<aside> 🚀 Try it in the widget: Interactive Quiz: Control Flow

</aside>

https://lasse.be/simple-hyf-teach-widget/mcq.html?bank=week_1_ch3_control_flow_quiz&embed=1

Cheatsheet

Conditionals


if x > 10:
    pass
elif x == 5:
    pass
else:
    pass

Loops


# For loop
for item in items:
    pass  # do something

# With index
for index, item in enumerate(items):
    pass  # do something

# List comprehension
new_list = [transform(x) for x in old_list if condition(x)]

# Dictionary iteration
for key, value in my_dict.items():
    pass  # do something

Control

break     # exit the loop immediately
continue  # skip to the next iteration


Next up: Functions and Modules, where you package the loops and conditionals you just learned into reusable building blocks for your pipelines.


The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 (https://hackyourfuture.net/)
