Week 6 - Cloud and Azure Essentials
Quick definitions for every term introduced in bold across Week 6 (Chapters 1-6). Click the chapter backlinks to jump to where the term is first used in context.
Infrastructure as a Service. You rent virtual machines and manage the operating system, runtime, and application yourself. Example: Azure Virtual Machines. (Used in Chapter 1: Introduction to Cloud and Azure)
Platform as a Service. The provider manages the infrastructure and runtime; you deploy your code or container. This week's services (Azure Container Apps, Azure Database for PostgreSQL) are PaaS, which is the sweet spot for data engineering. (Used in Chapter 1: Introduction to Cloud and Azure)
Software as a Service. Ready-to-use applications you log into rather than deploy. Examples: Microsoft 365, Slack. (Used in Chapter 1: Introduction to Cloud and Azure)
The top-level identity for an organization in Azure. In this course, HackYourFuture is the tenant, and every subscription and resource lives under it. (Used in Chapter 1: Introduction to Cloud and Azure)
The billing and access boundary in Azure. Your Azure for Students subscription lives under the shared tenant and controls what you can create and who pays for it. (Used in Chapter 1: Introduction to Cloud and Azure)
A logical container that groups related Azure resources for a project. When you delete a resource group, everything inside it is deleted too. Resource groups follow a naming pattern like rg-hyf-data. (Used in Chapter 1: Introduction to Cloud and Azure)
The physical Azure location where a resource is created (for example westeurope). Keep all resources for a project in the same region to reduce latency and cost. (Used in Chapter 1: Introduction to Cloud and Azure)
Object storage is a way of storing files (objects) as single units identified by a key, instead of in a directory tree. It is flat, scales to petabytes, and is accessed over an HTTP API. It is the core storage layer for cloud data pipelines. (Used in Chapter 3: Azure Blob Storage)
A storage account is the top-level namespace for all your storage in Azure Blob Storage. Your teacher created a shared storage account for the class; it contains the containers that hold your blobs. (Used in Chapter 3: Azure Blob Storage)
A logical grouping of blobs inside a storage account, like a top-level folder. This is a storage container, not a Docker container; the shared account has pre-created raw and processed containers. (Used in Chapter 3: Azure Blob Storage)
A single file stored in Azure Blob Storage: JSON, CSV, Parquet, an image, a log, anything. You write a blob once and replace the whole object rather than editing it in place. (Used in Chapter 3: Azure Blob Storage)
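The flat, key-based model described above can be illustrated with a toy in-memory stand-in. This is not the Azure SDK: the class `ToyBlobContainer` and its methods are hypothetical names invented for this sketch. It only shows that a blob name such as `2024/03/readings.json` is a single key (the slashes are part of the name, not folders) and that an update replaces the whole object.

```python
class ToyBlobContainer:
    """Hypothetical in-memory model of a blob container (not the Azure SDK)."""

    def __init__(self):
        # One flat mapping from blob name (the key) to the blob's bytes.
        self._blobs = {}

    def upload(self, name, data, overwrite=False):
        # A blob is written as a whole; replacing it means uploading it again.
        if name in self._blobs and not overwrite:
            raise FileExistsError(name)
        self._blobs[name] = data

    def download(self, name):
        return self._blobs[name]

    def list_names(self, prefix=""):
        # "Folders" are only a naming convention: listing filters keys by prefix.
        return sorted(n for n in self._blobs if n.startswith(prefix))


raw = ToyBlobContainer()
raw.upload("2024/03/readings.json", b'{"temp": 11.5}')
raw.upload("2024/04/readings.json", b'{"temp": 14.0}')

# Replace the whole object; there is no in-place edit.
raw.upload("2024/03/readings.json", b'{"temp": 12.0}', overwrite=True)

print(raw.list_names(prefix="2024/03/"))
```

The prefix filter is what makes a flat namespace feel like a directory tree when you browse it.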
A URL that contains everything needed to connect to a service: for Postgres, the host, port, database name, username, password, and SSL mode. Connection strings contain secrets, so you keep them in environment variables and never commit them to git. (Used in Chapter 4: Azure PostgreSQL Databases)
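As a rough illustration of what a Postgres connection string carries, the sketch below reads one from an environment variable and splits it into its parts using only the standard library. The variable name `DATABASE_URL`, the helper `parse_postgres_url`, and every value in `EXAMPLE_URL` are assumptions made for this example, not names taken from the course material.

```python
import os
from urllib.parse import urlsplit, parse_qs


def parse_postgres_url(url):
    """Split a postgresql:// URL into the fields a client needs."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # 5432 is the standard Postgres port
        "dbname": parts.path.lstrip("/"),
        "user": parts.username,
        "password": parts.password,
        "sslmode": query.get("sslmode", [None])[0],
    }


# Placeholder so the sketch runs anywhere; all values here are made up.
EXAMPLE_URL = (
    "postgresql://pipeline_user:s3cret@"
    "example-server.postgres.database.azure.com:5432/hyf?sslmode=require"
)

# In a real pipeline the value would come only from the environment.
config = parse_postgres_url(os.environ.get("DATABASE_URL", EXAMPLE_URL))

# Never print the password; show everything else.
print({key: value for key, value in config.items() if key != "password"})
```

Reading the string from the environment keeps the secret out of the code and therefore out of git.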
SSL/TLS is the encryption layer that protects the connection between your code and the database. Azure Postgres requires it by default; on the client side you request it with sslmode=require in the connection string. (Used in Chapter 4: Azure PostgreSQL Databases)
A managed service is one where the provider handles updates, backups, disk space, security patches, and monitoring for you. With a managed database like Azure Database for PostgreSQL, you write SQL and Azure runs the infrastructure. (Used in Chapter 4: Azure PostgreSQL Databases)
SERIAL is a PostgreSQL column type that auto-generates incrementing integer IDs from a sequence. Declared as SERIAL PRIMARY KEY, it replaces SQLite's INTEGER PRIMARY KEY AUTOINCREMENT. (Used in Chapter 4: Azure PostgreSQL Databases)
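TIMESTAMPTZ (TIMESTAMP WITH TIME ZONE) is a PostgreSQL timestamp type that stores a date and time with timezone awareness: values are normalized to UTC when written and converted to the session's time zone when read. Use it instead of storing timestamps as TEXT, which is what you often did in SQLite. (Used in Chapter 4: Azure PostgreSQL Databases)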
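On the Python side, a timezone-aware column pairs naturally with timezone-aware `datetime` values. The sketch below is one possible guard, assuming you want to reject naive datetimes before they reach the database; the helper name `to_utc_iso` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc_iso(dt):
    """Return an ISO 8601 string in UTC, refusing naive datetimes."""
    if dt.tzinfo is None:
        # A naive datetime has no offset, so its meaning is ambiguous.
        raise ValueError("naive datetime: attach a timezone before storing")
    return dt.astimezone(timezone.utc).isoformat()


# 09:30 at UTC+1 is the same instant as 08:30 UTC.
local = datetime(2024, 3, 1, 9, 30, tzinfo=timezone(timedelta(hours=1)))
print(to_utc_iso(local))  # 2024-03-01T08:30:00+00:00
```

Storing the offset with the value means two pipelines running in different time zones still agree on when an event happened.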
An upsert inserts a row, or updates it if a row with the same key already exists. In Postgres you write INSERT ... ON CONFLICT (...) DO UPDATE SET ..., which lets you re-run a pipeline on the same data without creating duplicates or failing on a unique constraint. (Used in Chapter 4: Azure PostgreSQL Databases)
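The insert-or-update pattern above can be tried without a Postgres server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, because recent SQLite versions (3.24 and later) accept the same `ON CONFLICT ... DO UPDATE SET` clause. The table `readings` and its columns are hypothetical; with a Postgres driver such as psycopg the placeholders would be `%s` rather than `?`.

```python
import sqlite3


def upsert_readings(conn, rows):
    """Insert each (station_id, temperature) row, or update it on key conflict."""
    conn.executemany(
        """
        INSERT INTO readings (station_id, temperature)
        VALUES (?, ?)
        ON CONFLICT (station_id)
        DO UPDATE SET temperature = excluded.temperature
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (station_id TEXT PRIMARY KEY, temperature REAL)"
)

# First pipeline run.
upsert_readings(conn, [("ams-01", 11.5), ("rtm-02", 12.0)])
# Re-run with overlapping keys: no duplicates, no constraint error.
upsert_readings(conn, [("ams-01", 13.0), ("rtm-02", 12.0)])

rows = conn.execute(
    "SELECT station_id, temperature FROM readings ORDER BY station_id"
).fetchall()
print(rows)  # [('ams-01', 13.0), ('rtm-02', 12.0)]
```

`excluded` refers to the row that was proposed for insertion, so the update takes the new value rather than keeping the old one.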
Least privilege means giving each user or service only the minimum access it needs to do its job. Your pipeline connects with a dedicated user that can create tables and insert rows but cannot drop databases or read other teams' data, so a leaked credential does limited damage. (Used in Chapter 4: Azure PostgreSQL Databases)
Serverless is a model where you provide a container or function and the cloud provider manages the servers, scaling, and teardown. Servers still exist, but you do not manage them and you pay only for the time your code runs. (Used in Chapter 5: Azure Container Apps Jobs)
The two modes of Azure Container Apps. An App is a long-running service that handles HTTP requests (a web API or dashboard). A Job starts, does its work, and exits, which is the right fit for a data pipeline. (Used in Chapter 5: Azure Container Apps Jobs)
The --replica-timeout setting on a Container App Job: the number of seconds after which Azure kills a run that has not finished. Setting it (300 seconds is a good default) stops a hung pipeline from billing indefinitely. (Used in Chapter 5: Azure Container Apps Jobs)
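To show where the timeout sits in practice, the sketch below assembles an `az containerapp job create` command as an argument list without running it. The helper `build_job_create_command` and the job, environment, image, and registry names are hypothetical; a real command will likely need further flags (registry credentials, CPU and memory sizes) that are left out here.

```python
import shlex


def build_job_create_command(
    job_name, resource_group, environment, image, registry_server,
    replica_timeout=300,
):
    """Return the argv for creating a manually triggered Container App Job."""
    return [
        "az", "containerapp", "job", "create",
        "--name", job_name,
        "--resource-group", resource_group,
        "--environment", environment,
        "--trigger-type", "Manual",
        # Seconds after which Azure stops a run that has not finished.
        "--replica-timeout", str(replica_timeout),
        "--image", image,
        "--registry-server", registry_server,
    ]


command = build_job_create_command(
    job_name="example-pipeline-job",          # hypothetical
    resource_group="rg-hyf-data",
    environment="example-environment",        # hypothetical
    image="exampleregistry.azurecr.io/pipeline:latest",  # hypothetical
    registry_server="exampleregistry.azurecr.io",        # hypothetical
)

# Print a copy-pasteable shell line instead of executing anything.
print(shlex.join(command))
```

Building the command as a list keeps each flag and value separate, which avoids quoting mistakes if you later pass it to `subprocess.run`.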
The amount of usage a service gives you at no cost each month. The Container Apps consumption plan includes 180,000 vCPU-seconds per month free, so small, short pipeline runs typically cost nothing. (Used in Chapter 5: Azure Container Apps Jobs)
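A quick back-of-the-envelope check shows why short runs tend to stay inside the free allowance. The workload figures below (half a vCPU, two minutes per run, one run per day) are assumptions chosen for illustration, and the sketch ignores memory, which is metered separately.

```python
FREE_VCPU_SECONDS_PER_MONTH = 180_000  # figure quoted in the definition above


def monthly_vcpu_seconds(vcpu, seconds_per_run, runs_per_month):
    """vCPU-seconds consumed in a month by a job with a fixed run profile."""
    return vcpu * seconds_per_run * runs_per_month


# Assumed workload: 0.5 vCPU, 120 seconds per run, 30 runs per month.
usage = monthly_vcpu_seconds(vcpu=0.5, seconds_per_run=120, runs_per_month=30)
share = usage / FREE_VCPU_SECONDS_PER_MONTH

print(f"{usage:.0f} vCPU-seconds used, {share:.1%} of the free allowance")
# 1800 vCPU-seconds used, 1.0% of the free allowance
```

The same function makes it easy to see what a hung run would cost: a job stuck for a full day at 0.5 vCPU uses 43,200 vCPU-seconds on its own.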
A SKU is a specific size and pricing tier of an Azure resource (for example Standard_B1ms for a Postgres Flexible Server). Picking the smallest SKU that meets your needs is the simplest way to control cost. (Used in Chapter 6: Cost Awareness)
Locally redundant storage (LRS): the cheapest Blob Storage replication option, which keeps three copies of your data within a single datacenter in one region. It is enough for a dev environment; geo-redundant replication costs more and is for data you cannot afford to lose to a regional outage. (Used in Chapter 6: Cost Awareness)
Azure Container Registry (ACR) is the private registry where your container images live (introduced in Week 5). A Container App Job pulls its image from ACR each time it runs; you tell it which registry to use with the --registry-server flag when you create the job. (Used in Chapter 5: Azure Container Apps Jobs)
Azure Key Vault is Azure's managed store for secrets like connection strings and passwords. Week 6 retrieves the Postgres and storage connection strings from kv-hyf-data with az keyvault secret show. (Used in Chapter 5: Azure Container Apps Jobs)
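A minimal sketch of fetching one secret from a script, assuming the Azure CLI is installed and you are already logged in. The secret name `postgres-connection-string` and the helper names are hypothetical; only the vault name `kv-hyf-data` comes from the text above. Nothing is executed at import time, so the block is safe to read and run without Azure access.

```python
import subprocess


def build_secret_show_command(vault_name, secret_name):
    """Return the argv that prints only the secret's value."""
    return [
        "az", "keyvault", "secret", "show",
        "--vault-name", vault_name,
        "--name", secret_name,
        "--query", "value",   # select just the value field from the JSON
        "--output", "tsv",    # plain text, no surrounding quotes
    ]


def fetch_secret(vault_name, secret_name):
    """Run the command and return the secret; requires a logged-in az CLI."""
    result = subprocess.run(
        build_secret_show_command(vault_name, secret_name),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Hypothetical secret name, shown without calling Azure.
command = build_secret_show_command("kv-hyf-data", "postgres-connection-string")
print(" ".join(command))
```

Keep the returned value in memory or an environment variable; avoid printing or logging it.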
The command-line interface (az) for managing Azure resources. It follows a consistent az <service> <action> --flags pattern, is a Python wrapper around the Azure REST API, and is the primary tool you use in this track. Every az command is scriptable and reproducible, unlike portal clicks. (Used in Chapter 2: Azure CLI and the Portal)
The unique path that identifies an Azure resource, in the form /subscriptions/<SUB_ID>/resourceGroups/<RG>/providers/<PROVIDER>/<TYPE>/<NAME>. You see resource IDs in logs, error messages, and role assignments; you rarely construct one by hand but recognising the pattern helps when debugging. (Used in Chapter 2: Azure CLI and the Portal)
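Recognising the pattern is easier once you have split one apart. The sketch below parses the simple form shown above into named fields; the function `parse_resource_id` is a hypothetical helper, the example ID uses a placeholder subscription and a made-up storage account name, and nested child resources (which add further type/name pairs) are deliberately not handled.

```python
def parse_resource_id(resource_id):
    """Split a simple Azure resource ID into its named components."""
    segments = resource_id.strip("/").split("/")
    markers_ok = (
        len(segments) == 8
        and segments[0].lower() == "subscriptions"
        and segments[2].lower() == "resourcegroups"
        and segments[4].lower() == "providers"
    )
    if not markers_ok:
        raise ValueError(f"not a simple resource ID: {resource_id!r}")
    return {
        "subscription_id": segments[1],
        "resource_group": segments[3],
        "provider": segments[5],
        "type": segments[6],
        "name": segments[7],
    }


example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-hyf-data"
    "/providers/Microsoft.Storage/storageAccounts/examplestorage"
)
print(parse_resource_id(example_id))
```

When an error message quotes a long resource ID, the resource group and the final name are usually the two fields worth checking first.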
The web-based dashboard at portal.azure.com where you can browse resource groups, inspect deployments, view logs, and read connection strings. Useful for visual exploration and debugging, but slower than the CLI for repeated work because portal clicks are not scriptable. (Used in Chapter 2: Azure CLI and the Portal)
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 (https://hackyourfuture.net/)
