Week 9 - SQL for Analytics
Introduction to SQL for Analytics
Joins, CTEs, and Aggregations
3. Data Validation Queries
5. Data Modeling Concepts
6. Building Views in Azure PostgreSQL
7. Gotchas & Pitfalls
8. Practice
9. Assignment
Lesson Plan
8. Practice
Practice
Content coming soon...
Suggested Activities
- Write a JOIN query that combines trips with zones to show the zone name for each pickup location
- Use a CTE to calculate daily trip counts, then query the CTE to find the busiest day
- Refactor a nested subquery into CTEs for readability
- Build validation queries: find trips with NULL pickuplocationid, detect duplicate trip records, check for orphaned location IDs
- Create vwfacttrips and vwdimzones as SQL views in Azure PostgreSQL
- Query your views to answer: "Which borough had the highest total fare revenue?" and "What are the top 5 pickup zones by trip count?"
- Compare query plans for a cartesian join vs a properly filtered join using EXPLAIN
Suggested Conceptual Questions
- What is the grain of the trips table? (Answer: one row per trip)
- If you join trips to zones without specifying a join condition, how many rows do you get and why? (Answer: cartesian product, trips x zones = millions of rows)
- Why should you validate for orphaned location IDs before building a fact view that joins to the zones dimension?
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0
*https://hackyourfuture.net/*

Built with ❤️ by the HackYourFuture community · Thank you, contributors
Found a mistake or have a suggestion? Let us know in the feedback form.