RAG: Retrieval-Augmented Generation

Making LLMs smarter

Remember that the model's knowledge is frozen at training time. It has no access to anything that was not in its training data, such as private company documents, last week's news, or your codebase. RAG is the standard solution to this problem.

RAG (Retrieval-Augmented Generation) is a technique that combines a retrieval system (such as a database search) with a language model. Instead of relying solely on what the model learned during training, RAG first fetches relevant information from an external source, such as a database, a document store, or a codebase, and injects it into the context window. The model then generates a response grounded in that retrieved content.
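The retrieve-then-inject flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the `DOCUMENTS` list, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for this example. Real systems usually use a smarter retrieval method, and the final step of sending the prompt to an LLM is left out.

```python
import re

# Hypothetical "external source": a tiny in-memory document store.
DOCUMENTS = [
    "The office is closed on Fridays during July and August.",
    "Expense reports must be submitted before the 5th of each month.",
    "The staging server is redeployed every night at 02:00.",
]


def tokenize(text):
    """Split text into a set of lowercase words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Inject the retrieved documents into the prompt, ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = build_prompt("When are expense reports due?", DOCUMENTS)
print(prompt)  # This text is what would be sent to the language model.
```

The model never sees the whole document store, only the few snippets that the retrieval step selected for this particular question.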

Watch: A Helping Hand for LLMs (RAG) - Computerphile

Here is a great video explaining the motivation behind RAG. It includes a live demo of a small program written in Python. No programming knowledge is required to follow the video.

https://www.youtube.com/watch?v=of4UDMvi2Kw

Key takeaways from the video

RAG can improve an LLM's responses by supplying it with new information.
In the demo, the “database” the host used is just a simple HTTP data fetch. It is very similar to using a web search tool.

Vector databases

The retrieval step is typically powered by a vector database, which stores content as numerical representations called ‘embeddings’. Vector databases and embeddings are fascinating topics but out of scope for this course. For now, knowing that RAG exists and why it matters is enough.
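For the curious, here is an optional toy sketch of the idea behind embedding-based retrieval. Everything in it is an assumption made for illustration: the three-number vectors in `STORE` are invented by hand (real embeddings come from an embedding model and have hundreds of dimensions), and a real vector database uses far more efficient search than this loop over every entry.

```python
import math

# Made-up 3-dimensional "embeddings" for illustration only.
STORE = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Office opening hours": [0.1, 0.9, 0.1],
    "Deploying to the staging server": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Measure how closely two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, store):
    """Return the stored text whose embedding is most similar to the query."""
    return max(store, key=lambda text: cosine_similarity(query_vector, store[text]))


# Pretend this is the embedding of the question "I forgot my login".
query_vector = [0.8, 0.2, 0.1]
print(nearest(query_vector, STORE))
```

Texts with similar meaning end up with similar vectors, so the closest vector points to the most relevant content even when the question shares no words with it.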

Watch: Sometimes, RAG is not necessary

https://www.youtube.com/watch?v=UabBYexBD4k

Additional Resources

Video

https://www.youtube.com/watch?v=YDdKiQNw80c

Read

The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0 *https://hackyourfuture.net/*
