Week 6 - Cloud and Azure Essentials
As a data engineer, you often have the ability to create databases, virtual machines, and storage accounts. Each of those resources has a price, and costs add up quickly if you are not paying attention. Cost awareness is part of the job.
By the end of this chapter, you should understand how to estimate costs, choose the right resource size, and avoid unnecessary spending.
In this course, your Azure costs are covered by the shared tenant. In a real job, they are not. A pipeline that works but costs ten times more than it should is not a good pipeline.
Azure publishes pricing pages for every service. Before you provision a Postgres server or a storage account, look up what the SKU costs per hour or per month.
The Azure pricing calculator lets you estimate costs before committing. Use it.
<aside> 🖼️ Interactive: Week 6 Cost Calculator
</aside>
Try adjusting the sliders to see how different choices affect your monthly bill.
<aside>
⌨️ Hands on: Open the cost calculator above and experiment with different settings. Then open the official Azure pricing calculator and look up the monthly cost of a Standard_B1ms Postgres Flexible Server in West Europe. Compare the numbers.
</aside>
Here is a realistic estimate for the resources you use in this track:
| Resource | SKU / tier | Monthly cost (West Europe) |
|---|---|---|
| PostgreSQL Flexible Server | Standard_B1ms (1 vCore, 2 GB RAM) | ~€13 |
| Blob Storage (LRS) | 10 GB stored + light usage | ~€0.02 |
| Container Registry (Basic) | Shared across class | ~€5 (shared) |
| Container App Job | 1 run/day, 60 seconds each | Free (within 180,000 vCPU-sec/month free tier) |
| Total (excluding the shared registry) | | ~€13/month |
Almost all the cost comes from the Postgres server, because it runs 24/7 whether your pipeline uses it or not. Blob storage and Container App Jobs are nearly free at this scale.
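To make the proportions concrete, here is a minimal sketch that adds up the estimates from the table above and checks the job against the free tier. The figures are copied from the table; the vCPU size of the job (`ASSUMED_VCPUS`) is not stated in this chapter and is an assumption for illustration only.

```python
# Monthly estimates copied from the table above (EUR, West Europe).
# The registry is shared across the class, so it is left out here.
monthly_costs_eur = {
    "PostgreSQL Flexible Server (Standard_B1ms)": 13.00,
    "Blob Storage (LRS, 10 GB)": 0.02,
    "Container App Job (1 run/day)": 0.00,
}

total = sum(monthly_costs_eur.values())
shares = {name: cost / total for name, cost in monthly_costs_eur.items()}

for name, cost in monthly_costs_eur.items():
    print(f"{name}: ~EUR {cost:.2f} ({shares[name]:.1%} of total)")
print(f"Total: ~EUR {total:.2f}/month")

# Free-tier check for the job. The vCPU size is an assumption; the
# table only gives the run frequency and duration.
FREE_VCPU_SECONDS = 180_000
ASSUMED_VCPUS = 1.0
runs_per_month = 30
seconds_per_run = 60

used_vcpu_seconds = ASSUMED_VCPUS * runs_per_month * seconds_per_run
print(f"Job usage: {used_vcpu_seconds:.0f} of {FREE_VCPU_SECONDS} free vCPU-seconds")
```

Under these assumptions the job uses only a small slice of the free allowance, which is why it shows up as free in the table.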
Not all Azure resources cost money the same way. Some bill continuously, others only when used:
| Resource | Bills when idle? | Why |
|---|---|---|
| PostgreSQL Flexible Server | Yes: runs 24/7 | It is a running VM with reserved compute |
| Blob Storage | Yes: per GB stored | You pay for data at rest, even if nobody reads it |
| Container App Job | No: only during execution | No compute allocated between runs |
| Container Registry | Yes: fixed monthly fee | The registry exists whether you push images or not |
| Container Apps Environment | No: consumption plan | No charge when no apps or jobs are running |
The key takeaway: Postgres is the expensive one. Your teacher manages the shared server, but in a real project, stopping or scaling down the database when it is not needed is the single biggest cost saver.
A Standard_B1ms Postgres server costs a fraction of a Standard_D4s_v3. Start small and scale up only when you have evidence you need more.
The same applies to storage: do not pick geo-redundant replication for a dev environment where locally redundant storage (LRS) is enough.
A database that runs 24/7 for a pipeline that runs once a day is wasting money. Many Azure services can be stopped without deleting them. When stopped, you keep your data and configuration but stop paying for compute.
For PostgreSQL Flexible Server, you can stop and start it from the CLI:
# Stop the server (no compute charges while stopped)
az postgres flexible-server stop \
  --resource-group rg-weather-dev \
  --name hyf-data-pg
# Start it again when you need it
az postgres flexible-server start \
  --resource-group rg-weather-dev \
  --name hyf-data-pg
A stopped Postgres server still stores your data (you pay for storage), but the compute cost drops to zero. For a Standard_B1ms server, compute costs ~€0.018/hour, so running it 8 hours a day instead of 24 avoids about 16 idle hours a day, which comes to roughly €8.75/month.
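A rough way to sanity-check a savings figure like this is to multiply the idle hours by the hourly rate. The sketch below assumes the ~€0.018/hour compute rate quoted above and an average month of 730 hours. Published prices change, so treat the result as an order-of-magnitude check rather than a quote.

```python
HOURS_PER_MONTH = 730               # average month: 365 * 24 / 12
COMPUTE_RATE_EUR_PER_HOUR = 0.018   # approximate B1ms compute rate from the text


def monthly_compute_cost(hours_on_per_day: float) -> float:
    """Compute cost for a server that runs `hours_on_per_day` hours every day."""
    if not 0 <= hours_on_per_day <= 24:
        raise ValueError("hours_on_per_day must be between 0 and 24")
    fraction_on = hours_on_per_day / 24
    return HOURS_PER_MONTH * fraction_on * COMPUTE_RATE_EUR_PER_HOUR


always_on = monthly_compute_cost(24)
office_hours = monthly_compute_cost(8)
savings = always_on - office_hours

print(f"Running 24 h/day: ~EUR {always_on:.2f}/month")
print(f"Running  8 h/day: ~EUR {office_hours:.2f}/month")
print(f"Estimated saving: ~EUR {savings:.2f}/month")
```

Storage charges are not included, because they continue while the server is stopped.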
<aside> ⚠️ Azure automatically restarts a stopped Flexible Server after 7 days if you do not start it yourself. Set a reminder or script the stop/start cycle.
</aside>
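The same principle applies to other compute resources: VMs can be deallocated, and many managed databases have a stop option. An App Service plan is an exception: it bills for as long as the plan exists, even when its apps are stopped, so scale it down or delete it instead. Storage (blobs, disks) always bills for data at rest, but stopping the compute layer is where the real savings are.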
When to stop vs delete:
| Action | Use when | Data preserved? |
|---|---|---|
| Stop | You need the resource again soon (e.g. next class) | Yes |
| Delete | You are done with it permanently | No |
For this course, your teacher manages the shared server. But in a real project, scheduling stop/start around your pipeline's actual usage hours is the single easiest cost optimization.
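A scheduled stop/start cycle could be sketched as follows. This is an illustration, not part of the course setup: the usage window (weekdays, 08:00 to 18:00) and the helper names `desired_action` and `az_command` are assumptions. The `az` arguments are the ones shown in the stop/start commands earlier in this chapter.

```python
from datetime import datetime, time


def desired_action(now: datetime, start: time, end: time) -> str:
    """Return "start" inside the usage window on weekdays, otherwise "stop"."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    in_window = start <= now.time() < end
    return "start" if is_weekday and in_window else "stop"


def az_command(action: str, resource_group: str, server_name: str) -> list[str]:
    """Build the az CLI call for stopping or starting the server."""
    if action not in {"start", "stop"}:
        raise ValueError(f"unsupported action: {action}")
    return [
        "az", "postgres", "flexible-server", action,
        "--resource-group", resource_group,
        "--name", server_name,
    ]


if __name__ == "__main__":
    action = desired_action(datetime.now(), time(8, 0), time(18, 0))
    cmd = az_command(action, "rg-weather-dev", "hyf-data-pg")
    # Printed only; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(cmd))
```

Run from a scheduler a few times a day, a script like this would also stop the server again after Azure's automatic restart. A real version would need to handle the case where the server is already in the requested state.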
Tags like team=data and project=weather let you filter cost reports and see where money is going.
# Add tags to a resource group
az group update --name rg-weather-dev \
  --tags team=data project=weather env=dev
Add tags early, not at the end.
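To illustrate why tags matter for cost reports, here is a small sketch that groups costs by a tag value. The records are hypothetical and only loosely shaped like rows of an exported cost report; the resource names and the `cost_by_tag` helper are made up for this example.

```python
from collections import defaultdict

# Hypothetical cost records (not real Azure output).
records = [
    {"resource": "hyf-data-pg", "cost_eur": 13.00,
     "tags": {"team": "data", "project": "weather", "env": "dev"}},
    {"resource": "example-storage", "cost_eur": 0.02,
     "tags": {"team": "data", "project": "weather", "env": "dev"}},
    {"resource": "example-untagged-vm", "cost_eur": 30.00, "tags": {}},
]


def cost_by_tag(records, tag_key):
    """Sum cost per value of `tag_key`; resources without the tag are grouped together."""
    totals = defaultdict(float)
    for record in records:
        value = record["tags"].get(tag_key, "(untagged)")
        totals[value] += record["cost_eur"]
    return dict(totals)


for value, cost in cost_by_tag(records, "project").items():
    print(f"project={value}: EUR {cost:.2f}")
```

The untagged resource ends up in a bucket nobody owns, which is the situation early tagging is meant to prevent.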
Azure lets you create budget alerts that notify you when spending crosses a threshold. This catches runaway costs before they become a problem.
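The idea behind a budget alert can be sketched in a few lines. This is only an illustration of threshold logic, not how Azure implements it: the threshold percentages and the `crossed_thresholds` helper are assumptions, and in practice Azure evaluates the thresholds you configure and sends the notifications for you.

```python
def crossed_thresholds(spend_eur, budget_eur, thresholds=(0.5, 0.8, 1.0)):
    """Return the alert thresholds (as fractions of the budget) already reached."""
    if budget_eur <= 0:
        raise ValueError("budget_eur must be positive")
    used = spend_eur / budget_eur
    return [t for t in thresholds if used >= t]


budget = 20.0  # hypothetical monthly budget in EUR
for spend in (5.0, 17.0, 25.0):
    fired = crossed_thresholds(spend, budget)
    labels = ", ".join(f"{t:.0%}" for t in fired) or "none"
    print(f"Spend EUR {spend:.2f} of EUR {budget:.2f}: alerts fired = {labels}")
```

Several thresholds below 100% give you a warning while there is still time to react, instead of a single notification once the budget is already spent.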
<aside>
💡 Using AI to help: Paste your az resource creation commands into an LLM and ask "What will this cost per month in West Europe?" It can estimate based on published pricing. Always verify against the official pricing calculator. (⚠️ Ensure no PII or sensitive company data is included!)
</aside>
In professional settings, cost decisions are often shared.
Knowing what things cost and proactively optimizing is a signal of professional maturity.
<aside> 🤓 Curious Geek: Cloud cost horror stories
Cloud cost surprises are more common than you think. Troy Hunt (creator of Have I Been Pwned) documented how his cloud costs spiralled due to unexpected bandwidth charges on a service he assumed was cheap. A misconfigured auto-scaling rule or a forgotten GPU cluster can generate massive bills overnight. Companies like Netflix and Spotify employ dedicated "FinOps" teams to manage cloud spending. The lesson: always check pricing, always set alerts.
</aside>
1. Estimate the monthly cost of a Standard_B1ms Postgres server + 10 GB blob storage + a Container App Job running once daily for 60 seconds.
2. Add team and env tags to your resource group using the CLI.
3. Someone creates a Container App Job with --trigger-type Schedule and a cron expression that runs every minute. They forget about it over the weekend. What happens to the bill, and how could budget alerts have helped?

The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*
