Week 9 - SQL for Analytics
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0
*https://hackyourfuture.net/*

Built with ❤️ by the HackYourFuture community · Thank you, contributors
Found a mistake or have a suggestion? Let us know in the feedback form.
Introduction to SQL for Analytics
2. Joins, CTEs, and Aggregations
3. Data Validation Queries
4. Data Modeling Concepts
5. Building Views in Azure PostgreSQL
6. Gotchas & Pitfalls
7. Practice
8. Assignment
Lesson Plan
Week 9 - SQL for Analytics
Welcome to Week 9! This week shifts focus from Python pipelines to the SQL layer. You will learn how to write analytical queries, validate ingested data with SQL, and understand the architecture of modern data warehouses. By the end of the week, you will deliver a fact and dimension view in Azure PostgreSQL.
Dataset
This week uses a curated subset of the NYC Taxi & Limousine Commission (TLC) Trip Record Data. The dataset includes:
- Trips (fact data): ~57K rows of green taxi trips for January 2024 (pickup/dropoff times, locations, fares, payment types)
- Zones (dimension data): 265 rows mapping location IDs to borough and zone names (taxizonelookup.csv)
Trip data is available as monthly Parquet files at:
https://d37ci6vzurychx.cloudfront.net/trip-data/greentripdataYYYY-MM.parquet
A loading script will be provided to load a curated subset into your Azure PostgreSQL database. This same dataset continues into Week 10 (dbt) and Week 11 (dashboarding).
Learning goals
- Write complex SQL queries using joins, CTEs, and aggregations to answer analytical questions
- Build validation queries that detect duplicates, nulls, and row count mismatches in ingested data
- Distinguish between OLAP (data warehouse) and OLTP (transactional database) workloads and choose the right tool
- Describe the architecture of modern data warehouses such as Snowflake, BigQuery, Redshift, and Fabric
- Apply data modeling concepts including raw/staging/mart layers, grain, keys, facts, and dimensions
- Avoid common SQL performance pitfalls such as accidental cartesian joins
- Deliver a fact view and a dimension view in Azure PostgreSQL
Chapters
- Introduction to SQL for Analytics
- Joins, CTEs, and Aggregations
- Data Validation Queries
- Data Modeling Concepts
- Building Views in Azure PostgreSQL
- Practice
- Assignment
- Gotchas & Pitfalls
Lesson plan