Week 4 - Data Processing with Pandas

Introduction to Pandas

DataFrame operations

Grouping and Aggregation

Joining and Merging

Different Data Types

Advanced Transformations

Writing Data

Alternatives to Pandas

Practice

Assignment

Gotchas & Pitfalls


Welcome to Week 4! You have learned how to structure code (Week 2) and how to ingest and validate data (Week 3). Now it's time to process that data at scale. This week introduces Pandas, the industry-standard Python library for manipulating tabular data. You will also learn about modern data architectures (ETL vs ELT) and efficient storage formats such as Parquet.

By the end of this week, you will be able to load complex datasets, transform them efficiently using vectorized operations, and describe the architectural trade-offs between traditional ETL and modern ELT pipelines.
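To make "vectorized operations" concrete, here is a minimal sketch that compares a row-by-row loop with a column-wise expression. The `orders` DataFrame and its column names are invented for illustration and are not part of the course material.

```python
import pandas as pd

# Hypothetical order data, invented purely for illustration.
orders = pd.DataFrame(
    {
        "product": ["apple", "banana", "cherry"],
        "quantity": [3, 10, 2],
        "unit_price": [0.5, 0.25, 4.0],
    }
)

# Row-by-row approach: a Python-level loop over every row.
totals_loop = [
    row.quantity * row.unit_price for row in orders.itertuples(index=False)
]

# Vectorized approach: one expression applied to whole columns at once.
orders["total"] = orders["quantity"] * orders["unit_price"]

# Both approaches produce the same values.
assert orders["total"].tolist() == totals_loop

print(orders)
```

On three rows the difference is invisible, but on large datasets the column-wise form is usually much faster, because the arithmetic runs in compiled code rather than in a Python loop.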

Learning goals





CC BY-NC-SA 4.0

*https://hackyourfuture.net/*
