LLMs can significantly enhance your application by adding intelligent features. By integrating LLMs, you can create more intuitive user experiences, automate complex tasks, and provide personalised responses to user queries.
Here are a few examples:
The most popular way to use an LLM from code is through an existing AI platform via HTTP API. Since we've already learned how to access HTTP APIs, this will feel natural. There are several popular platforms that provide LLM access via HTTP:
And many more platforms.

OpenAI platform usage dashboard. On the left side, you can see the different menu options available.
The biggest downside of using these platforms is that they are not free and can become very expensive if used extensively. However, some do offer a limited free trial to test them out. In this section, we will use GitHub Models to access different LLMs.
GitHub Models provides limited but free access to popular LLMs via a simple HTTP API.
In the GitHub Models Playground, you can find all the supported LLMs and test them out in a simple chat. You can also switch to the Parameters tab and play with LLM parameters like temperature or Top-P.
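The same parameters you adjust in the Playground can be sent in the request body. Here is a minimal sketch of such a body; the parameter values are arbitrary examples, not recommendations:

```javascript
// Sketch: a chat-completion request body with sampling parameters.
// The values chosen here are arbitrary examples for illustration.
const body = {
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Write a haiku about code." }],
  temperature: 0.2, // lower values make the output more deterministic
  top_p: 0.9 // nucleus sampling: sample only from the top 90% probability mass
};

console.log(JSON.stringify(body, null, 2));
```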
<aside> ⌨️
Hands on:
<aside> ❗
You can only send a limited number of free requests per day. The limit depends on the model size. Be mindful not to exceed this limit when interacting with LLMs.
Once you reach the limit, you won't be able to access the LLMs without paying or waiting until the next day.
Check out the full list of rate limits to learn more.
</aside>
For our studies, we will use a lower-tier LLM like GPT-4o mini. Lower-tier LLMs have a limit of 150 requests per day (as of 2026). This should be enough to learn and prototype a simple application.
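When you do hit the limit, the API typically responds with an HTTP 429 status. A small helper that inspects the response status can handle this gracefully; note that the Retry-After header is standard HTTP, but whether GitHub Models sends it is an assumption you should verify against an actual response:

```javascript
// Sketch: decide how to react to a rate-limited response.
// Whether a Retry-After header is present is an assumption -
// check the actual response headers from the API.
function checkRateLimit(status, headers = {}) {
  if (status === 429) {
    const retryAfter = Number(headers["retry-after"] ?? 0);
    return { limited: true, retryAfterSeconds: retryAfter };
  }
  return { limited: false, retryAfterSeconds: 0 };
}

console.log(checkRateLimit(429, { "retry-after": "60" })); // { limited: true, retryAfterSeconds: 60 }
console.log(checkRateLimit(200)); // { limited: false, retryAfterSeconds: 0 }
```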
Before sending the HTTP request, you will need to generate a personal access token, which we will use in the HTTP API. To generate a token, follow these steps:
Models permission
<aside> ❗
GitHub will show you the token only once. Make sure to save it somewhere secure and do not share this token with anyone.
</aside>
Let’s send our first HTTP request to GitHub Models. We will be using the following endpoint:
https://models.github.ai/inference/chat/completions
<aside> 💡
The API description can be found at the following URL: https://docs.github.com/en/rest/models/inference
</aside>
curl -L \
  -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <YOUR-TOKEN>" \
  -H "X-GitHub-Api-Version: 2026-03-10" \
  -H "Content-Type: application/json" \
  https://models.github.ai/inference/chat/completions \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"What is the capital of France?"}]}'
<aside> ⌨️
Hands on: Use the code above to test out the LLM:
Replace <YOUR-TOKEN> with your newly generated personal access token.
Tip: You can also copy this code to VSCode, make the changes, and paste it into the terminal window.
</aside>
If everything works, you will be able to get a very detailed response with many different properties.
{
  "choices": [
    {
      ...
      "message": {
        "annotations": [],
        "content": "The capital of France is Paris.",
        "refusal": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1773920299,
  "id": "chatcmpl-DL5zfIZgpApSEWFaahtfAkn8OiZxk",
  "model": "gpt-4o-mini-2024-07-18",
  ...
}
<aside> ⌨️
Hands on: Look at your response. Can you find the answer from the LLM?
Hint: You can paste the response into a new VSCode window and run a formatter tool.
</aside>
In JavaScript, you can use fetch to send an HTTP request like the curl example above. However, OpenAI provides an npm package that makes it easier to send requests to LLMs via a standard API. Here's a simple example:
import OpenAI from "openai";
const openai = new OpenAI({
baseURL: "<https://models.github.ai/inference/>",
apiKey: `YOUR-TOKEN`
});
const response = await openai.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [{ role: "user", content: "Tell me a joke about programmers." }]
});
const responseContent = response.choices[0].message.content;
console.log(responseContent);
<aside> ⌨️
Hands on: Run the example above. Don’t forget to run npm install openai and replace YOUR-TOKEN with your personal GitHub token.
</aside>
To learn more about the openai package, visit the official GitHub repo.
<aside> 🎉
You've just taken your first steps towards empowering your application with LLMs! Think about the endless applications you can make with this powerful API.
</aside>
Returning plain text may not always be the best fit for an application. To solve this, you can ask the LLM to reply in JSON format, which you can then parse and read easily.
Example for a prompt:
Generate a delicious recipe for a cheese cake.
Reply with valid, parsable JSON with the following structure:
{
  "name": "string",
  "description": "string",
  "calories": number,
  "ingredients": [{ "name": "string", "amount": number, "unit": "string" }],
  "preparation_steps": ["string"]
}
do not return anything else besides the JSON
You will then get a well-structured JSON object that you can use in your application.
const prompt = `Generate a delicious recipe for a cheese cake. Reply in a ...`;
const response = await openai.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: prompt }]
});
const responseContent = response.choices[0].message.content;
const recipe = JSON.parse(responseContent);
console.log(recipe.ingredients.length); // 12
console.log(recipe.preparation_steps[0]); // Preheat the oven to 325°F (165°C).
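Keep in mind that an LLM is not guaranteed to return valid JSON every time, even when instructed to. A defensive wrapper around JSON.parse is worth considering; this is a sketch, and the fallback behaviour (returning null) is just one possible choice:

```javascript
// Sketch: parse LLM output defensively, since the model may occasionally
// wrap the JSON in extra text or return something unparsable.
function parseLlmJson(text) {
  try {
    return JSON.parse(text);
  } catch {
    // Fallback: try to extract the first {...} span from the text.
    const match = text.match(/\{[\s\S]*\}/);
    if (match) {
      try {
        return JSON.parse(match[0]);
      } catch {
        return null;
      }
    }
    return null;
  }
}

console.log(parseLlmJson('{"name": "Cheesecake"}')); // { name: 'Cheesecake' }
console.log(parseLlmJson('Sure! Here it is: {"name": "Cheesecake"}')); // { name: 'Cheesecake' }
console.log(parseLlmJson("not json at all")); // null
```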