Introduction to Big Data and Streaming
Apache Spark Core Concepts
Databricks
Streaming Theory
Streaming Platforms: Kafka and Azure Event Hubs
Practice
Assignment
Gotchas & Pitfalls
Week 13 Lesson Plan (Teachers)
Assignment
Content coming soon...
Suggested Project Scope
- Part 1 (Big Data — required): Run a data transformation in a Databricks notebook using PySpark on a provided dataset
- Part 2 (Streaming — stretch goal): Send sample messages to Azure Event Hubs and consume them using the Kafka-compatible client
- Document the differences between batch and streaming approaches for your use case (required)
- Deliverables (required): Databricks notebook export, brief write-up comparing batch vs streaming
- Deliverables (stretch goal): Python scripts that send and receive messages via Azure Event Hubs
- Requirements: working Databricks transformation, at least basic error handling, comparative write-up
- Scope note: The Event Hubs streaming exercise is a stretch goal. The required deliverables are the Databricks transformation and the comparative write-up.
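For the comparative write-up, the core difference can be illustrated with a small stdlib-only sketch (plain Python, not PySpark or Event Hubs; the records and field names are invented for the example): a batch job aggregates a complete dataset in one pass, while a streaming job keeps running state and updates its result as each event arrives.

```python
from collections import Counter

# Toy event log standing in for the provided dataset (invented for this sketch).
events = [
    {"user": "ana", "action": "click"},
    {"user": "ben", "action": "view"},
    {"user": "ana", "action": "view"},
    {"user": "ana", "action": "click"},
]

def batch_counts(records):
    """Batch style: the whole dataset exists up front; aggregate once."""
    return Counter(r["action"] for r in records)

def streaming_counts(record_iter):
    """Streaming style: events arrive one at a time; maintain running
    state and emit an updated snapshot after every event."""
    state = Counter()
    for record in record_iter:
        state[record["action"]] += 1
        yield dict(state)  # snapshot of the running aggregate so far

print("batch result:", batch_counts(events))

final = None
for snapshot in streaming_counts(iter(events)):
    final = snapshot  # in a real pipeline each snapshot could be acted on
print("streaming result:", final)
```

Both approaches converge on the same totals; the difference is *when* results are available — only at the end for batch, continuously for streaming — which is the trade-off the write-up should discuss for your use case.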
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0

*https://hackyourfuture.net/*