Teachers

Implementation plan:

- Document-based databases (e.g., MongoDB)
  - Schema-less / flexible schema
  - Documents and collections
  - When to use: varying structure, rapid prototyping, nested data
- Key-value stores (e.g., Redis)
  - Simple key-value pairs
  - In-memory, extremely fast
  - When to use: caching, session storage, rate limiting
- Comparison: relational vs document vs key-value
- Choosing the right database for the use case
- Polyglot persistence: using multiple databases in one system

Week 9 — Types of Databases

You have been working with PostgreSQL since Week 5. It is a solid, battle-tested tool — but it is not the right tool for every problem. Real-world backends rarely rely on a single storage solution.

In this section you will learn about two other major categories of databases, understand how they compare to relational databases, and develop the judgement to choose the right one for a given situation.

By the end of this section you will be able to:

- Explain what a document database is and when it is a better choice than PostgreSQL
- Explain what a key-value store is and describe three real-world use cases
- Compare all three database types and articulate the trade-offs
- Describe what polyglot persistence means and why production systems use it

1. Document-Based Databases (e.g. MongoDB)

What is a Document Database?

A document database stores data as documents — self-contained, JSON-like objects. Instead of rows and columns in a table, you have documents grouped into collections.

Think of a document as a small package that carries everything related to one thing. A user document contains the user's name, email, preferences, and even their recent activity — all in one place, without needing to join other tables.

MongoDB is the most widely used document database. It stores data in a format called BSON (Binary JSON): a binary encoding of JSON-like documents that adds extra data types, such as dates and binary data. From a developer's perspective it looks and feels like plain JSON.

Schema-less / Flexible Schema

This is the biggest conceptual shift coming from PostgreSQL.

In a relational database, every row must follow the same column structure. If you want to add a new field, you must run ALTER TABLE and add the column to every row — even if most rows will leave it empty.

In MongoDB, each document in a collection can have a completely different shape. There is no migration. No ALTER TABLE. You just add the field to the document.

Relational (rigid schema):

products table
┌────┬─────────┬───────┬───────┬──────┬─────────┐
│ id │ name    │ price │ color │ size │ wattage │
├────┼─────────┼───────┼───────┼──────┼─────────┤
│ 1  │ T-Shirt │ 19.99 │ blue  │ M    │ NULL    │  ← wattage doesn't apply
│ 2  │ Lamp    │ 49.99 │ NULL  │ NULL │ 60W     │  ← color/size don't apply
└────┴─────────┴───────┴───────┴──────┴─────────┘

The table forces every product to have the same columns even if they are irrelevant. The result is many NULL values and a fragile schema that gets messier as you add more product types.

MongoDB (flexible schema):

```json
// A clothing product
{
  "_id": "prod-001",
  "name": "T-Shirt",
  "price": 19.99,
  "color": "blue",
  "sizes": ["S", "M", "L", "XL"]
}

// An electronics product: completely different shape, same collection
{
  "_id": "prod-002",
  "name": "LED Lamp",
  "price": 49.99,
  "wattage": 60,
  "energyRating": "A+",
  "compatible": ["E27", "B22"]
}
```

No NULL columns. No awkward workarounds. Each product carries exactly the fields that make sense for it.

Documents and Collections

MongoDB concept	Relational equivalent	Key difference
Database	Database	Same concept
Collection	Table	No fixed schema
Document	Row	Can have any shape
Field	Column	Does not need to exist in every document
`_id`	Primary key	Auto-generated if not provided

A collection is just a named group of documents. By default, MongoDB does not check whether the documents inside share the same fields; keeping them consistent is your responsibility.
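Because the database is not enforcing a shape here, that checking usually moves into application code. The following is a minimal sketch in plain JavaScript, not a MongoDB feature: the function name `validateProduct` and the rule that every product needs a `name` and a non-negative `price` are assumptions made for illustration.

```javascript
// Hypothetical application-level check; MongoDB itself is not involved here.
function validateProduct(doc) {
  const problems = [];
  if (typeof doc.name !== "string" || doc.name.trim() === "") {
    problems.push("name must be a non-empty string");
  }
  if (typeof doc.price !== "number" || Number.isNaN(doc.price) || doc.price < 0) {
    problems.push("price must be a non-negative number");
  }
  return problems; // an empty array means the document passed
}

const tShirt = { _id: "prod-001", name: "T-Shirt", price: 19.99, color: "blue" };
const broken = { _id: "prod-003", price: "49.99" }; // no name, price is a string

console.log(validateProduct(tShirt)); // []
console.log(validateProduct(broken)); // two problems reported
```

Optional fields such as `color` stay unconstrained, so the flexibility is kept while the fields the application depends on are still checked.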

Embedding vs Referencing

In a relational database, related data lives in separate tables and is connected with foreign keys. In MongoDB you have a choice: embed the related data inside the document, or reference it by ID.

Embedding — put related data directly inside the document:

```json
{
  "_id": "post-001",
  "title": "Getting Started with Spring Boot",
  "author": "Alice",
  "tags": ["java", "spring", "backend"],
  "comments": [
    { "user": "Bob", "text": "Great post!", "date": "2024-01-10" },
    { "user": "Carol", "text": "Very clear.", "date": "2024-01-11" }
  ]
}
```

In PostgreSQL this would require a posts table, a comments table, a join, and potentially a tags join table. In MongoDB it is one document, one read — no join needed.

When to embed: the related data always belongs to the parent (comments always belong to a post), it is small and will not grow without bound (a single MongoDB document is limited to 16 MB), and you always read it together with the parent.

When to reference: the related data is large, shared across many documents, or updated independently (e.g. a user's full profile referenced from many posts).
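The difference between the two approaches can be sketched with plain in-memory objects. This is an illustration only: the `users` map, the `authorId` field, and the `loadPostWithAuthor` helper are hypothetical names, and no MongoDB driver is involved.

```javascript
// Illustrative in-memory data; collection and field names are assumptions.
const users = new Map([
  ["user-1", { _id: "user-1", name: "Alice", bio: "Backend developer" }],
]);

// Embedded: comments live inside the post, so one lookup returns everything.
const embeddedPost = {
  _id: "post-001",
  title: "Getting Started with Spring Boot",
  comments: [{ user: "Bob", text: "Great post!" }],
};

// Referenced: the post stores only the author's id.
const referencedPost = {
  _id: "post-002",
  title: "Intro to Redis",
  authorId: "user-1",
};

// Resolving a reference takes a second lookup, done here by the application.
function loadPostWithAuthor(post) {
  const author = users.get(post.authorId) ?? null;
  return { ...post, author };
}

console.log(embeddedPost.comments.length);                    // 1
console.log(loadPostWithAuthor(referencedPost).author.name);  // "Alice"
```

With the referenced design, changing Alice's profile means updating one entry in `users`, and every post that points at it sees the change. With embedding, the same data would have to be updated inside each document that carries a copy.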

Basic MongoDB Operations

```javascript
// Find all documents in a collection
db.products.find()

// Find with a filter
db.products.find({ color: "blue" })

// Find where an array field contains a value
// (uses the posts collection, whose documents have a tags array)
db.posts.find({ tags: "java" })

// Find with a nested field
db.products.find({ "specs.ram": "16GB" })

// Insert one document
db.products.insertOne({ name: "Keyboard", price: 79.99, type: "mechanical" })

// Update a specific field (without replacing the whole document)
db.products.updateOne(
  { _id: "prod-001" },
  { $set: { price: 17.99 } }
)

// Delete a document
db.products.deleteOne({ _id: "prod-001" })
```

💡 Note: You will not be using MongoDB in your Spring Boot project this week — these operations are shown to build conceptual understanding of how the query model differs from SQL.
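To make the query model more concrete, here is a rough imitation of equality filters in plain JavaScript. It is a sketch under heavy simplification, not how MongoDB is implemented: the helpers `getPath`, `matches`, and `find` are hypothetical, and only exact-value, array-contains, and dotted-path matching are covered.

```javascript
// Read a possibly nested field using a dotted path such as "specs.ram".
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value != null ? value[key] : undefined),
    doc
  );
}

// Simplified matching: every key in the filter must match.
// If the stored value is an array, it matches when it contains the wanted value.
function matches(doc, filter) {
  return Object.entries(filter).every(([path, wanted]) => {
    const actual = getPath(doc, path);
    return Array.isArray(actual) ? actual.includes(wanted) : actual === wanted;
  });
}

// An empty filter matches every document, like find() with no arguments.
function find(collection, filter = {}) {
  return collection.filter((doc) => matches(doc, filter));
}

const products = [
  { _id: "prod-001", name: "T-Shirt", color: "blue", tags: ["clothing"] },
  { _id: "prod-002", name: "LED Lamp", wattage: 60 },
  { _id: "prod-004", name: "Laptop", specs: { ram: "16GB" }, tags: ["electronics"] },
];

console.log(find(products, { color: "blue" }).map((d) => d._id));
console.log(find(products, { "specs.ram": "16GB" }).map((d) => d._id));
console.log(find(products, { tags: "electronics" }).map((d) => d._id));
```

The filter is itself a document describing what the result should look like, which is the main contrast with SQL, where the condition is written as an expression in a `WHERE` clause. Documents that lack the filtered field are simply skipped rather than causing an error.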

When to Use MongoDB

✅ Good fit:

- Data shape varies between records: product catalogs where a laptop has different fields than a t-shirt
- Nested / hierarchical data that belongs together: a blog post with its comments, a recipe with its ingredients
- Rapid prototyping where the schema is still changing frequently
- Data coming from external APIs with unpredictable or evolving structure
- High write volumes where schema migrations would slow you down

❌ Not a good fit:

- Data with strong, consistent relationships between entities (orders → products → customers → invoices)
- Workloads that lean heavily on ACID transactions spanning many entities. MongoDB has supported multi-document transactions since version 4.0, but they carry extra overhead and are not the model the database is designed around
- Data that is highly structured and consistent: PostgreSQL will serve you better and be easier to query
- Situations where the team is more comfortable with SQL: do not switch for the sake of it

⚠️ A common beginner mistake is using MongoDB to avoid thinking about data modelling. A good schema design matters in MongoDB just as much as in PostgreSQL — it is just expressed differently.

2. Key-Value Stores (e.g. Redis)

What is a Key-Value Store?

A key-value store is the simplest possible database: you store a value under a key, and retrieve it instantly by that key. Think of it as a giant, extremely fast dictionary or HashMap — but one that can be shared across your whole backend.
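The dictionary analogy can be sketched in a few lines of JavaScript. This is a single-process stand-in, not Redis: the class name `TinyKeyValueStore`, the optional time-to-live argument, and the injectable clock are assumptions made so the example is self-contained and its expiry behaviour is easy to follow.

```javascript
// Hypothetical, single-process stand-in for a key-value store.
class TinyKeyValueStore {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock, so expiry can be simulated
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Store a value; ttlMs is optional and means "forget this after ttlMs milliseconds".
  set(key, value, ttlMs = null) {
    const expiresAt = ttlMs === null ? null : this.now() + ttlMs;
    this.entries.set(key, { value, expiresAt });
  }

  // Return the value, or null if the key is missing or has expired.
  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (entry.expiresAt !== null && this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  delete(key) {
    return this.entries.delete(key);
  }
}

let fakeTime = 0;
const store = new TinyKeyValueStore(() => fakeTime);
store.set("session:abc123", "user-42", 1000); // expires after one second
console.log(store.get("session:abc123")); // "user-42"
fakeTime = 1500;
console.log(store.get("session:abc123")); // null
```

The whole interface is set, get, and delete by key. There is no query language and no way to search by value, which is the trade-off that keeps lookups so cheap. A real key-value store adds what this sketch lacks: it runs as a separate server, so every instance of your backend sees the same data.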

Redis (Remote Dictionary Server) is the most widely used key-value store. It is open source, runs in memory, and is renowned for being extraordinarily fast.

In-Memory — What That Really Means

Every other database we have covered keeps its data on disk. When you run a query, PostgreSQL may have to read the relevant pages from disk into memory before it can process them and return the result. It caches frequently used pages in memory, but any query that misses that cache pays for disk I/O, and disk is much slower than RAM.

Redis keeps its entire dataset in RAM, so serving a request never waits on a disk read. The data already sits in main memory, which is far faster to access than any disk.
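One common way to take advantage of that speed difference is to check memory first and fall back to the slower store only on a miss. The sketch below illustrates the idea with a plain `Map` standing in for the in-memory store; `slowLookup` and `getUser` are hypothetical names, and the counter exists only to show how often the slow path runs.

```javascript
let slowLookups = 0;

// Hypothetical stand-in for a query against a disk-backed database.
function slowLookup(userId) {
  slowLookups += 1;
  return { id: userId, name: `User ${userId}` };
}

const cache = new Map(); // stand-in for an in-memory store

// Check memory first; only go to the slow store on a miss.
function getUser(userId) {
  const key = `user:${userId}`;
  if (cache.has(key)) return cache.get(key);
  const user = slowLookup(userId);
  cache.set(key, user);
  return user;
}

getUser(7);
getUser(7);
getUser(7);
console.log(slowLookups); // 1
```

Three reads cost only one trip to the slow store. The sketch leaves out the hard part of caching, which is deciding when a cached value is stale and must be refreshed or removed.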