Connecting GitHub LLM Models WebUI

I've tried countless LLM interfaces over the past weeks, and honestly, most of them left me wanting more. Either they were locked behind paywalls, limited to single users, or they'd hallucinate so badly I couldn't trust the output. Then I discovered Open WebUI, and it's been a game-changer.

What makes it stand out? First off, it's completely open source, no vendor lock-in, no mysterious pricing tiers that suddenly change. The stability is rock solid, I've had it running for days without a single crash. But here's what really sold me: it's genuinely multi-user. I can set it up once and my entire team can access it with proper authentication and separate conversation histories.

But, the feature that absolutely blew me away, though, is the ability to create custom Tools in plain Python. This isn't just some gadget, it changes how you work with LLMs!!!
Instead of letting the model guess at calculations, fetch outdated information, or make things up, you can write simple Python functions that give it real capabilities. Need to query a database? Write a Tool. Want to perform accurate calculations? Another Tool. It's like giving the LLM a proper toolkit instead of asking it to build a house with its bare hands.

The result? Better accuracy, fewer tokens wasted on trial-and-error, and way fewer hallucinations.

Getting Open WebUI up and running

The installation process is straightforward. The official guide covers multiple deployment methods depending on your setup.

Whether you're using Docker or a bare metal installation, the documentation walks you through it step by step. I won't rehash what they've already explained well, but I will show you something the official docs don't cover in detail: hooking it up to GitHub's free tier LLM models.

Connecting to GitHub Models (The Free Tier everyone forgets about)

Here's where things get interesting. GitHub quietly offers free access to several powerful LLM models through their API, and most people have no idea. Getting Open WebUI to work with these models takes a bit of configuration, but it's absolutely worth it.

The GitHub Tokens situation

First things first: you need the right kind of token. This tripped me up initially. You might think you can use your standard GitHub Personal Access Token (the ones that start with github_pat_), but those won't work here. You need a classic personal access token that starts with ghp_.

Why? The newer fine-grained tokens (github_pat_) are designed for repository-specific permissions and API operations, but they don't have the necessary scopes for accessing GitHub's AI inference endpoints. The classic tokens (ghp_) have broader access patterns that include the Models API. It's a quirk of how GitHub structured their authentication system as they rolled out different features.

To generate your token, head over to: https://github.com/settings/tokens

Click Generate new token and select Generate new token (classic). Give it a descriptive name like "Open WebUI Models Access" and set an expiration that makes sense for your security posture. For scopes, you'll want to enable appropriate permissions, the Models API access is typically covered under the basic token permissions.

Testing your connection

Before diving into Open WebUI configuration, let's make sure everything works from the command line. This saves you from troubleshooting blind later:

curl -s https://models.inference.ai.azure.com/models \
  -H "Authorization: Bearer YOUR_OWN_TOKEN"

Replace YOUR_OWN_TOKEN with your actual token. If everything's configured correctly, you'll get a JSON response listing all available models. It's a beautiful sight, a bunch of models at your fingertips, completely free.

Configuring Open WebUI

Now for the part that took me way too many attempts to figure out. Open WebUI needs to connect via an OpenAI-compatible API endpoint. Here's the configuration that actually works:

API Base URL: https://models.inference.ai.azure.com
API Key: Your ghp_ token from earlier

Go to Admin Panel under User Profile:

And select Settings > Connections > +:

Here's the gotcha that cost me an hour of my life… when you enter that URL, Open WebUI will automatically detect it as Azure OpenAI and set the Provider Type accordingly. Don't let it do this. You need to manually change the Provider Type dropdown to just OpenAI.

Why? Because even though GitHub uses Azure's infrastructure for their Models API, the authentication and endpoint structure follow OpenAI's API conventions, not Azure's. Azure OpenAI expects a different URL pattern with API versions and deployment names baked into the path.

GitHub's implementation is cleaner and more closely mirrors the standard OpenAI API format. If you leave it set to Azure OpenAI, the requests will fail because Open WebUI will try to append Azure-specific parameters that GitHub's endpoint doesn't expect.

You should definitely add a note about the specific URL, because this trips up almost everyone.

The "Why" Explained

I noticed that https://models.inference.ai.azure.com works, but the "official" GitHub endpoint (https://models.github.ai/inference) often fails in Open WebUI.

GitHub Models are actually hosted on Microsoft's Azure infrastructure.

The Azure Endpoint (models.inference.ai.azure.com): Think of this as a "Universal Adapter." It was built specifically to let tools that already speak "OpenAI" (like Open WebUI) connect easily. It is more permissive with the classic ghp_ tokens.
The GitHub Endpoint (models.github.ai): This is the "Native" front door. It is stricter. It usually requires those specific "Fine-Grained" tokens (github_pat_) and often demands specific API headers that generic tools like Open WebUI don't send by default.

Adding your Models

Once the connection is established, you can start adding Model IDs. But which ones? Here's how to get the complete list:

curl -s https://models.inference.ai.azure.com/models \
  -H "Authorization: Bearer YOUR_OWN_TOKEN" \
  | jq -r '.[].name'

This command pipes the JSON response through jq (you might need to install it first) and extracts just the model names. You'll see everything from GPT-4o to Llama models, Mistral models, and more.

NOTE: I've found that not every single model behaves perfectly. If you want a setup that just works without issues, I recommend adding these specific Model IDs first:

gpt-4o-mini
gpt-4o
Meta-Llama-3.1-8B-Instruct
Mistral-Nemo

Copy these IDs, paste them into your Open WebUI configuration, and you are ready to go.

Creating Model Presets: your own custom "Meta-Models"

Here's where Open WebUI really starts to shine. Once you have your base models configured, you can create what its called "meta-models" customized model presets that are purpose-built for specific tasks. Think of them as saving your entire workflow configuration so you don't have to set it up every single time.

When you create a model preset in Open WebUI, you're essentially building a specialized version of your LLM that's pre-configured and ready to go.

This is way more powerful than it sounds. Instead of starting from scratch every conversation and typing the same instructions over and over, you craft the perfect setup once and reuse it infinitely.

What Goes Into a Model Preset?

When you create a model preset, you're packaging together several key components:

Base Model: This is the underlying engine that powers everything. It could be gpt-4o from your GitHub Models connection, llama3:8b running locally via Ollama, or any other model you have configured. The base model provides the raw intelligence, but the preset is where you shape how that intelligence behaves.
System Prompt: This is the personality and instruction set baked into the model. Instead of telling the model "You are a senior security engineer who…." at the start of every conversation, you set it once in the preset. From that point on, whenever you use this preset, the model already knows its role. I have presets for different writing styles, technical depths, and even humor levels.
Knowledge Bases: This is where RAG (Retrieval Augmented Generation) comes into play. You can attach specific document collections to a preset so the model only answers based on that curated knowledge. I have a preset connected to my health documents, another linked to a collection of research papers, and one that only references specific treatments. The model won't hallucinate answers, it'll only work with the documents you've given it.
Tools & Functions: Here's where it gets really practical. You can bind specific capabilities directly to a preset. Maybe you want one preset that always has web search enabled for research tasks, and another that has access to your custom Python tools for data analysis but explicitly can't search the web. You're creating purpose-built assistants with exactly the capabilities they need and nothing more.
Advanced Parameters: Want a creative writing assistant? Lock the Temperature high for more variety. Need consistent, deterministic outputs for code generation? Pin the Temperature low and set a specific Seed. You can tune Top-P, frequency penalties, and all the other knobs that usually require digging through settings menus. Set them once in the preset, forget about them.

Why this matters in Practice

The real magic happens when you build up a library of these presets. I've got about a dozen that I rotate through depending on what I'm working on, some examples:

The interface makes all of this surprisingly intuitive. You're not editing config files or writing YAML. It's a proper UI that guides you through each option. And once you've created a preset, it shows up in your model selector alongside your base models. To anyone else using your Open WebUI instance, it looks like just another model option, they don't need to know you've spent time crafting the perfect setup underneath.

Each one is instantly available. Click, start typing, and you're working with an assistant that's already configured exactly how you need it. No more "act as a..." prompts. No more forgetting which tools you wanted enabled. No more inconsistent outputs because you forgot to set the temperature.

This is the kind of feature that seems minor until you actually use it, and then you wonder how you ever lived without it. It transforms Open WebUI from "yet another chat interface" into a genuinely productive tool that adapts to how you actually work!

Conclusion

What you end up with is incredibly powerful: a stable, multi-user interface with custom Python tools, backed by free-tier models that are actually quite capable.
The custom tools mean you're not wasting tokens or dealing with hallucinations for things that should be deterministic. The GitHub models give you variety and capability without cost. And Open WebUI ties it all together in a package you actually control.

I've been running this setup for weeks now, and it's become an essential part of my workflow. Whether I'm prototyping ideas, analyzing data, or just having a conversation to think through a problem, this combination delivers consistently.

If you're tired of juggling different LLM interfaces or paying through the nose for basic access, give this setup a shot. The initial configuration takes maybe thirty minutes, and what you get in return is a genuinely useful tool that you'll actually want to use every day.

Now, you know! 🚀

The hidden Gem of AI interfaces: Open WebUI

Getting Open WebUI up and running