Content

Practice

<aside> 📝 These exercises are optional but strongly recommended. Each one drills a single skill that the chapter hands-on sections did not cover end-to-end. Work through them in order: Exercise 1 is a hard prerequisite for the fct_trips mart, and the later exercises assume the project state from Ch2–Ch6.

</aside>

By the end of this chapter, you should have:

A stg_trips model with tip_pct, fare_per_mile, and payment_type_label columns, plus a safe_divide macro.
One singular test in the tests/ folder that catches a business-rule violation.
Muscle memory for the compile → read error → fix debugging loop.
One hands-on experience of propagating a column change from a staging model all the way to a mart.

Exercise 1: Macros and computed columns for `stg_trips` (SQL and Jinja Templating)

Materializations & Layers builds the fct_trips mart assuming stg_trips exposes tip_pct, fare_per_mile, and payment_type_label. This exercise adds those columns. Do it before you run dbt build --select +fct_trips in the assignment.

Create macros/safe_divide.sql with the macro from SQL and Jinja Templating.
Update models/staging/stg_trips.sql to select two new derived columns:

{{ safe_divide('tip_amount', 'fare_amount') }} as tip_pct
{{ safe_divide('fare_amount', 'trip_distance') }} as fare_per_mile

Add a payment_type_label column using the {% for %} mapping from the Jinja chapter (codes 1-6 → human labels).
Run dbt compile --select stg_trips and open target/compiled/nyc_taxi/models/staging/stg_trips.sql. Confirm every {{ ... }} has been resolved to plain SQL.
Run dbt run --select stg_trips and spot-check:

   SELECT payment_type, payment_type_label, tip_pct, fare_per_mile
   FROM dev_<your_name>.stg_trips
   LIMIT 10;

Success criteria: every payment_type code resolves to a readable label, and tip_pct / fare_per_mile are non-null for rows with non-zero denominators.

Exercise 2: Write a singular test (dbt Tests)

Singular tests are plain .sql files that return the bad rows: introduced in dbt Tests but not drilled. Write one yourself.

Create tests/assert_fare_amount_non_negative.sql containing a select against {{ ref('stg_trips') }} that returns every row whose fare_amount is negative. Recall: an empty result passes, any returned rows fail.
Run dbt test --select test_type:singular. The test will fail with FAIL 182 against the January 2024 dataset: there are 182 rows where the TLC has recorded a negative fare (most often credit-card refunds or voided trips re-entered as corrections).
Open the compiled test at target/compiled/nyc_taxi/tests/assert_fare_amount_non_negative.sql and run it directly in psql to inspect the offending rows. The compiled form is the "show me the bad rows" query you would write by hand: no debugging setup needed.
Decide what to do: the test is working as designed, but a failing error-severity test blocks the pipeline. Real teams either (a) downgrade to warn severity by moving the test into YAML with a custom test or keeping it as a singular test and editing the singular-test config in dbt_project.yml, or (b) filter negative fares out in stg_trips so fct_trips never sees them. For this exercise, edit stg_trips.sql to add and fare_amount >= 0 to the where clause, re-run dbt run --select stg_trips, and re-run dbt test --select test_type:singular. The test should now pass.

Success criteria: you can show that the test originally failed with 182 rows, and after filtering negative fares out of stg_trips the test passes cleanly. You can state in one sentence why a singular test is the right tool for this kind of check rather than a generic YAML test.

Exercise 3: Debug a broken `ref()` (SQL and Jinja Templating)

When dbt run fails because of a bad ref(), the error lands at parse or compile time: not at run time: and the fix lives in a different file than the error points at. Drill the compile-to-find-the-typo workflow now, before you meet it under pressure in the assignment.

In models/marts/fct_trips.sql, introduce a deliberate typo: change one {{ ref('stg_trips') }} to {{ ref('stg_trps') }}.
Run dbt compile --select fct_trips. Read the error message carefully: note which node dbt names and which file it points at.
Fix the typo.
Re-run dbt compile --select fct_trips. Confirm it now compiles cleanly.

Success criteria: you can state in one sentence what dbt's error told you, and can explain why the error appears at compile time rather than at run time. (Hint: compare with what would happen if you had typed the full table name in a plain SQL view instead.)

Exercise 4: Propagate a column change (Materializations & Layers)

Adding a column to a staging model and making it appear in the mart is the single most common dbt change in a real job. The chapter hands-on did not exercise it end to end. Do one now.

Add trip_duration_minutes to models/staging/stg_trips.sql:

   extract(epoch from (dropoff_datetime - pickup_datetime)) / 60 as trip_duration_minutes

Run dbt run --select stg_trips. Confirm the view rebuilds.
Check the column exists in the view:

   SELECT trip_duration_minutes FROM dev_<your_name>.stg_trips LIMIT 3;

Now try to query the same column from the mart:

   SELECT trip_duration_minutes FROM dev_<your_name>.fct_trips LIMIT 3;

This should error. Explain why.

Add t.trip_duration_minutes to the select list in models/marts/fct_trips.sql.
Rebuild the mart and its upstream: dbt run --select +fct_trips.
Re-run the mart query from step 4. It now returns rows.

Success criteria: you can state in one sentence why stg_trips had the column but fct_trips did not, and what the + prefix in --select +fct_trips does that plain --select fct_trips would not.

Conceptual questions

These do not require running dbt: discuss with a classmate or write down your answers.

Exercise 3 had you introduce a typo and catch it at compile time. If fct_trips were a plain SQL CREATE VIEW in Postgres instead of a dbt model, at which stage (write / save / query) would the same typo surface? Why does dbt's model compile its SQL separately from executing it?
In Exercise 4, you added a column to stg_trips but querying it from fct_trips still failed until you also edited the mart. What does that tell you about how dbt propagates changes (and about why --select +fct_trips is the right pattern on a real team)?
You add an accepted_values test on payment_type allowing [1, 2, 3, 4, 5]. A trip with payment_type = 6 (voided) appears in the source. What happens when you run dbt test, and what severity would you set on that test?

The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*

CC BY-NC-SA 4.0 Icons

Built with ❤️ by the HackYourFuture community · Thank you, contributors

Found a mistake or have a suggestion? Let us know in the feedback form.