Week 1 - Python Foundations

🐛 Errors and Debugging

Welcome to the inevitable! No matter how experienced you become, you will write code that crashes, as I’m sure you’ve experienced plenty of times in the core program. In fact, as a Data Engineer, you will often deal with messy data that breaks your pipelines.

Learning to read errors and fix them (debugging) is a superpower. In this chapter, we will learn how Python tells us something went wrong and how to investigate it.


Types of Errors

Just like in JavaScript, errors in Python generally fall into three categories:

1. Syntax Errors

These happen when Python doesn’t understand your code because you broke the rules of the language. Python catches these before it runs your program.

# ❌ SyntaxError: expected ':'
if True
    print("This won't run")

Common Syntax Errors:

Since you have the VS Code Python extension installed, your editor flags these as you type. The checking is done by Pylance, VS Code’s default Python language support tool.
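To make this category concrete, here is a minimal sketch that hands a few broken snippets to Python's built-in compile() and confirms that each one is rejected before any code runs. The snippets and the helper name is_syntax_error are illustrative assumptions, not an official list. The exact error messages differ between Python versions, so none are shown.

```python
# Illustrative snippets (assumed examples), each breaking one rule of the language.
broken_snippets = {
    "missing colon": "if True\n    print('hi')",
    "unclosed parenthesis": "print('hi'",
    "unclosed string": "name = 'Alice",
    "missing indentation": "if True:\nprint('hi')",
}


def is_syntax_error(source):
    """Return True if Python refuses to compile the source (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")  # parses only; nothing is executed
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return True
    return False


for label, source in broken_snippets.items():
    print(f"{label}: rejected = {is_syntax_error(source)}")
```

Because compile() only parses the text, none of the print calls inside the snippets ever run, which matches the point above: syntax errors are caught before execution starts.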

2. Runtime Errors (Exceptions)

These happen while the program is running. The syntax is correct, but something illegal happened during execution.

# ❌ ZeroDivisionError: division by zero
result = 10 / 0

# ❌ NameError: name 'x' is not defined
print(x)
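Note that a script stops at the first uncaught exception, so running the two lines above in one file would only ever show the ZeroDivisionError. As a rough sketch, the snippet below triggers each error separately and records its type. The function names and the try/except wrapper are assumptions added for the demonstration, and catching the broad Exception class is done here only to observe the errors, not as a recommended pattern.

```python
def divide_by_zero():
    return 10 / 0


def use_undefined_name():
    return undefined_variable  # deliberately never defined anywhere


observed = []
for action in (divide_by_zero, use_undefined_name):
    try:
        action()
    except Exception as error:  # broad catch, for demonstration only
        observed.append(type(error).__name__)
        print(f"{type(error).__name__}: {error}")
```

Both functions are valid Python as far as the parser is concerned. The problems only show up when the lines actually execute, which is what makes them runtime errors.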

3. Logical Errors

The program runs without crashing, but it does the wrong thing. These are the hardest to catch because Python won’t give you an error message.

# 🐛 Logical Error: Calculating average incorrectly
numbers = [10, 20, 30]
average = sum(numbers) / 2  # Should be divided by len(numbers), which is 3!
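Since Python stays silent about logical errors, one common way to expose them is to compare the result against an answer you worked out by hand. The sketch below does that for the average example. The function names buggy_average and average are hypothetical, chosen only for this illustration.

```python
def buggy_average(numbers):
    return sum(numbers) / 2  # bug: hard-coded 2 instead of the list length


def average(numbers):
    return sum(numbers) / len(numbers)


sample = [10, 20, 30]
expected = 20  # worked out by hand: (10 + 20 + 30) / 3

print(buggy_average(sample) == expected)  # False: the bug is exposed
print(average(sample) == expected)        # True
```

Neither version raises an exception. Only the comparison with a known answer reveals which one is wrong.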

Reading Stack Traces

When Python crashes, it prints a “Stack Trace” (or Traceback). It looks intimidating, but it’s actually very helpful. It tells you exactly where the problem is.

Unlike some JavaScript error messages which can be vague, Python’s traceback is usually very precise.

Example Traceback:

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    calculate_total(10, 0)
  File "main.py", line 2, in calculate_total
    return a / b
ZeroDivisionError: division by zero

How to read it:

  1. Start at the bottom: The last line tells you the type of error (ZeroDivisionError) and the message (division by zero).
  2. Look just above it: It tells you the file (main.py), the line number (line 2), and the code that caused the crash.
  3. Go up: If the error happened inside a function, the lines above show you who called that function.
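The same bottom-up reading order can be sketched in code with the standard library traceback module. This is only an illustration of the three steps above: calculate_total mirrors the function from the example traceback, and the surrounding try/except is an assumption added so the script can inspect the error instead of crashing.

```python
import traceback


def calculate_total(a, b):
    return a / b


try:
    calculate_total(10, 0)
except ZeroDivisionError as error:
    # One entry per "File ..." line in the traceback, oldest call first.
    frames = traceback.extract_tb(error.__traceback__)

    # Step 1: the bottom line gives the error type and message.
    print(f"Error type: {type(error).__name__}")
    print(f"Message: {error}")

    # Step 2: the last frame is where the crash actually happened.
    crash = frames[-1]
    print(f"Crashed in: {crash.name}, line {crash.lineno}")

    # Step 3: the frames above it show who called that function.
    for frame in reversed(frames[:-1]):
        print(f"Called from: {frame.name}, line {frame.lineno}")
```

You rarely need to do this by hand, since the printed traceback already contains the same information, but it shows that a traceback is simply a list of calls ending at the line that failed.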

🖐 Hands-on: Be the Detective

Copy the following code into a file named buggy.py and run it. Look at the traceback. Which line actually caused the crash? Which line called the function that crashed?

def greet(name):
    return "Hello " + name

def welcome_users(users):
    for user in users:
        print(greet(user))

# There is a bug here!
user_list = ["Alice", "Bob", 123]
welcome_users(user_list)

Debugging Techniques

Option 1: “Print” Debugging

The simplest way to debug is often the most effective. If your code isn’t doing what you expect, print() the values of your variables at different steps.

def add_tax(amount):
    print(f"DEBUG: amount is {amount}")  # 👀 Check input
    tax = amount * 0.21
    print(f"DEBUG: tax calculated is {tax}")  # 👀 Check intermediate value
    return amount + tax
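As a small variation on the function above, the sketch below uses the f-string = specifier (available from Python 3.8), which prints both the variable name and its value. Sending the debug lines to sys.stderr is an assumption made here so they stay separate from the program's normal output. It is one possible habit, not a requirement.

```python
import sys


def add_tax(amount):
    print(f"DEBUG: {amount=}", file=sys.stderr)  # prints the name and the value
    tax = amount * 0.21
    print(f"DEBUG: {tax=}", file=sys.stderr)
    return amount + tax


total = add_tax(100)
print(total)
```

Writing {amount=} saves you from typing the label yourself, which helps when you are printing many variables in a row.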

🖐 Hands-on: Fix the Logic

The following code tries to find the largest number in a list, but it returns the wrong answer. Use print() statements to trace the loop and fix the logical error.

numbers = [1, 5, 2, 9, 3]
max_num = 0

for n in numbers:
    if n < max_num:  # 🤔 Is this correct?
        max_num = n

print(f"The largest number is {max_num}")

Option 2: VS Code Debugger

Using print() is fine for small scripts, but for larger applications, it gets messy. Imagine having to delete 50 print statements before committing your code!

Visual Studio Code has a built-in Debugger. It allows you to pause your code in the middle of execution and look at the variables live.

1. Setting a Breakpoint

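A breakpoint is a stop sign for your code. When Python reaches this line, it will pause. To set one, click in the gutter just to the left of the line number (or press F9 with the cursor on that line); a red dot 🔴 appears.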

2. Starting the Debugger

Instead of clicking the “Play” button at the top right:

  1. Click the Run and Debug icon on the left sidebar (it looks like a bug with a play button ▷🐛).
  2. Click the big blue Run and Debug button.
  3. Select Python Debugger -> Python File if asked.

Your code will start running and freeze at your red dot.

3. Controlling the Flow

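Once paused, a floating toolbar appears at the top. Here are the most important buttons, from left to right (with their default keyboard shortcuts):

  1. Continue (F5): run until the next breakpoint or the end of the program.
  2. Step Over (F10): run the current line and pause on the next one, without entering any function it calls.
  3. Step Into (F11): move into the function called on the current line.
  4. Step Out (Shift+F11): finish the current function and pause back at the line that called it.
  5. Restart and Stop: rerun the session from the start, or end it.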

4. Inspecting Variables

Look at the Variables panel on the left side. You can see the value of every variable at that exact moment. No more print(variable) needed!

🖐 Hands-on: The Runaway Batch Job

You are writing a script to fetch data records from an API. You want to collect exactly 20 records to form a “batch” before saving them. The API gives you records in chunks of 3.

This code never stops on its own (an infinite loop) and floods your terminal with output. Do not fix it by guessing! Use the debugger to find out why it misses the target.

  1. Copy the code below into VS Code.
  2. Set a breakpoint 🔴 on the line current_count += 3.
  3. Start the Debugger (Run and Debug).
  4. Watch the current_count variable in the Variables panel on the left.
  5. Keep clicking Continue (▷). What value does current_count have when the loop should stop, but doesn’t?

target_batch_size = 20
current_count = 0

print("--- Starting Batch Collection ---")

# We need exactly 20 records to close the batch
while current_count != target_batch_size:
    print(f"Status: We have {current_count} records...")

    # Simulate fetching 3 records at a time
    current_count += 3

print("Batch successfully collected!")

Once you see the variable skip past 20 in the debugger, you’ll realize why != (not equal) is dangerous here. How would you change the while condition to make it safe?


🧠 Knowledge Check

  1. What is the difference between a syntax error and a runtime error (exception)?
  2. In a Python traceback, where should you look first to understand what went wrong?
  3. What happens to your code execution when it reaches a breakpoint (the red dot 🔴)?
  4. If the debugger is paused at a line that calls a function, and you want to go inside that function to see how it works, should you use Step Over or Step Into?

📚 Extra Reading

If you want to dive deeper into how Python handles errors and how to use the VS Code debugger efficiently, check out these resources: