When you ask an AI chatbot to fix your code or explain a concept, it can feel like you're talking to something that understands you. But under the hood, an LLM is doing something much simpler and much stranger. It is not magic - it's math.
At its core, an LLM is a very large mathematical function that takes your prompt as input and calculates, token by token, what the most probable response looks like. No reasoning, no understanding, no knowledge base being queried. Just pattern matching at massive scale that produces something that looks like intelligence. Once you understand that, you'll have a much clearer picture of why LLMs are so capable in some situations, and so confidently wrong in others.
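That token-by-token loop can be sketched with a toy model. The probability table below is invented purely for illustration; a real LLM computes these probabilities with a neural network over a vocabulary of tens of thousands of tokens, but the generation loop is essentially the same:

```python
import random

# A toy "language model": for each context token, a hand-made probability
# distribution over possible next tokens. (These numbers are made up for
# illustration -- a real LLM learns billions of parameters instead.)
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.5, "dog": 0.3, "code": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"ran": 0.7, "sat": 0.3},
    "code": {"ran": 0.5, "sat": 0.5},
    "sat": {".": 1.0},
    "ran": {".": 1.0},
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.split()
    for _ in range(max_tokens):
        last = tokens[-1]
        probs = NEXT_TOKEN_PROBS.get(last)
        if probs is None:  # no known continuation: stop generating
            break
        # Sample the next token according to its probability
        choices, weights = zip(*probs.items())
        next_token = random.choices(choices, weights=weights)[0]
        tokens.append(next_token)
        if next_token == ".":
            break
    return " ".join(tokens)

print(generate("the"))  # e.g. "the cat sat ." -- varies between runs
```

Because the next token is *sampled* rather than looked up, running this twice can give different sentences, which is also why an LLM can answer the same prompt differently each time.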
An honest introduction to an LLM would be:
"Hi I am ChatGPT. I am a 1 terabyte zip file. My knowledge comes from the internet, which I read in its entirety about 6 months ago and remember only vaguely. My winning personality was programmed, by example, by human labelers at OpenAI :)"
Source: How I use LLMs by Andrej Karpathy [12:15]
<aside> 💭
Watch the following video up until minute 13:13.
</aside>
https://youtu.be/EWvNQjAaOHw?si=76BFNraP2qBaSfMK&t=164
Training an LLM is a complicated, lengthy, and expensive process. It requires vast amounts of data and powerful computers to process this information into a usable model. In this section, we'll briefly explain the key stages and terms involved in creating an LLM.
<aside> 💭
The result is the assistant you interact with: pre-training gives the model its knowledge, post-training gives it its personality.
</aside>
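Both stages rely on the same next-token objective, just on different data. A minimal sketch, with a made-up `TinyBigramModel` standing in for the real neural network (real training adjusts billions of parameters by gradient descent on GPU clusters):

```python
from collections import defaultdict

class TinyBigramModel:
    """A toy stand-in for an LLM: it simply counts which token follows which."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def train_step(self, context, target):
        # Learn from one (context, next-token) pair.
        self.counts[context[-1]][target] += 1

    def most_likely_next(self, token):
        options = self.counts.get(token)
        return max(options, key=options.get) if options else None

model = TinyBigramModel()

# Pre-training: predict the next token on raw internet-style text.
text = "the cat sat on the mat".split()
for i in range(len(text) - 1):
    model.train_step(text[: i + 1], text[i + 1])

# Post-training: the same objective, but on curated example conversations
# written by human labelers -- this is what shapes the assistant's behaviour.
conversation = "User: hi Assistant: hello".split()
for i in range(len(conversation) - 1):
    model.train_step(conversation[: i + 1], conversation[i + 1])

print(model.most_likely_next("the"))  # prints "cat" (ties go to the first-seen token)
```

The point of the sketch: nothing changes between the two stages except the data, yet the second stage is what turns a raw text predictor into a helpful-sounding assistant.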
The HackYourFuture curriculum is licensed under CC BY-NC-SA 4.0

Found a mistake or have a suggestion? Let us know in the feedback form.