Remember that the model's knowledge is frozen at training time. It has no access to your company's internal documents, last week's news, or your codebase. RAG is the standard solution to this problem.
RAG (Retrieval-Augmented Generation) is a technique that combines a retrieval system (such as a database search) with a language model. Instead of relying solely on what the model learned during training, RAG first fetches relevant information from an external source, such as a database, a document store, or a codebase, and injects it into the context window. The model then generates a response grounded in that retrieved content.
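The retrieve-then-inject flow described above can be sketched in a few lines of Python. This is a toy illustration, not a real RAG system: the `DOCUMENTS` list, the `retrieve` and `build_prompt` helpers, and the prompt wording are all made up for this example, and retrieval here is simple word overlap rather than the search a production system would use. No model is called; the sketch stops at the prompt that would be sent to one.

```python
import re

# Hypothetical stand-in for an external knowledge source (e.g. internal docs).
DOCUMENTS = [
    "Our office is closed on Fridays during August.",
    "The deploy script lives in tools/deploy.sh and requires Node 20.",
    "Expense reports must be submitted before the 5th of each month.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list, top_k: int = 1) -> list:
    """Return the top_k documents sharing the most words with the question.

    Assumption: word overlap is only a simple placeholder for a real
    retrieval system.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list) -> str:
    """Inject the retrieved text into the prompt the model would receive."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


question = "Where does the deploy script live?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)
```

The printed prompt contains the sentence about the deploy script, which the model could not have learned during training. Sending this prompt to a language model is the "generation" half of RAG.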
Here is a great video explaining the motivation behind RAG. It includes a live demo of a small program written in Python. No programming knowledge is required to follow it.
https://www.youtube.com/watch?v=of4UDMvi2Kw
The retrieval step is typically powered by a vector database, which stores content as numerical representations called ‘embeddings’. Vector databases and embeddings are fascinating topics but out of scope for this course. For now, knowing that RAG exists and why it matters is enough.
https://www.youtube.com/watch?v=UabBYexBD4k
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0
