Running GPT-OSS Locally: What OpenAI Just Made Possible (And What It Didn't)

OpenAI released GPT‑OSS under an open licence. Here's what that really means, how I ran it on a Mac mini, and where you might start experimenting too.

OpenAI just released a pair of open-weight models, GPT‑OSS‑20B and GPT‑OSS‑120B, under the Apache 2.0 licence. It’s a noteworthy shift. For the first time since GPT‑2, OpenAI has put out models that you can download, run locally, fine-tune, and even commercialise. That deserves a closer look.

This post builds on earlier stories where I experimented with running LLaMA models on a Mac mini, explored the fine line between open source and open access, and tested local models on both desktop and mobile devices. GPT‑OSS fits neatly into that thread and marks a pragmatic new step.

What is GPT‑OSS?

GPT‑OSS comes in two versions:

  • GPT‑OSS‑20B: a Mixture of Experts (MoE) model in which only a fraction of the weights activates per input
  • GPT‑OSS‑120B: a much larger MoE model

Both are trained to support long contexts, tool use, and complex instructions. The weights are publicly released, the models run locally, and the licence is permissive: Apache 2.0.

But OpenAI stops short of calling it open source. And that’s intentional.

Why this is a big deal

This is the first time OpenAI has:

  • Released weights under an OSI-approved licence
  • Enabled full local use with no platform lock-in
  • Signalled support for decentralised deployment, without the gatekeeping of the GPT‑4/ChatGPT API stack

It also positions OpenAI within the growing field of labs contributing openly to the ecosystem, rather than just guarding its commercial moat.

What’s open, what’s not

Here’s a simplified breakdown:

Component              GPT‑OSS   Truly Open Source?
Model weights          ✅        ✅
Inference code         ✅        ✅
Training data          ❌        ✅ (ideal)
Training method        ❌        ✅ (ideal)
Licence (Apache 2.0)   ✅        ✅

You can run the models, adapt them, and use them commercially, but you can’t yet reproduce them or fully inspect how they were trained.

Running GPT‑OSS on a Mac mini

I used Ollama to run GPT‑OSS‑20B on my Mac mini, the same machine I previously used for LLaMA.

GPT‑OSS‑20B running on my Mac mini through Ollama.

Installation was seamless:

ollama run gpt-oss:20b

Performance was serviceable rather than impressive. GPT‑OSS‑20B runs acceptably on Apple Silicon, and the quality is comparable to other 20B-class sparse models. It's usable for writing assistance, coding, and lightweight local inference.
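
Once the model is pulled, Ollama serves it over a local HTTP API, which is how I drive it from scripts rather than the terminal. Here's a minimal sketch, assuming Ollama's default port (11434) and the gpt-oss:20b tag; the prompt is just a placeholder.

import json
import urllib.request

# Ollama's local chat endpoint (default port); nothing leaves the machine.
URL = "http://localhost:11434/api/chat"

payload = {
    "model": "gpt-oss:20b",  # the tag pulled via `ollama run gpt-oss:20b`
    "messages": [
        {"role": "user", "content": "Summarise the Apache 2.0 licence in two sentences."}
    ],
    "stream": False,  # return one complete response instead of chunks
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["message"]["content"])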

Llama 3.2 on a Mac
I tested Meta’s Llama 3.2 LLM on my Mac mini, setting it up via Docker. It’s fast, private, and generates code, but lacks memory and multimodal features like ChatGPT.

What didn’t work (yet)

I attempted to run it on an iPhone, out of curiosity. As expected, it was too much, both in model size and in the current state of mobile tooling. For now, even quantised versions aren’t viable on mobile without careful pruning or architectural compression.
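
To put a rough number on "too much": a back-of-envelope estimate, assuming roughly 21B total parameters and an aggressive ~4-bit quantisation (both figures are assumptions for illustration, not measurements), already lands well above the memory an iPhone app can claim.

# Back-of-envelope memory estimate for GPT-OSS-20B (all figures approximate).
total_params = 21e9      # assumed ~21B total parameters (MoE, not all active per token)
bits_per_weight = 4.25   # assumed ~4-bit quantisation, including overhead

weight_bytes = total_params * bits_per_weight / 8
print(f"Weights alone: ~{weight_bytes / 1e9:.1f} GB")  # roughly 11 GB

# Current iPhones ship with around 6-8 GB of RAM, and iOS grants an app only
# part of it, so even the quantised weights don't fit without a smaller model.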

Still, this fits a pattern. We're getting closer to viable mobile LLMs, especially with Apple, Google, and Microsoft all showing movement in this space.

Running a Local LLM on Your iPhone
I explored how far mobile AI has come by running LLMs directly on my iPhone. No cloud, no upload. Here’s what I learned from testing Haplo AI.

What you can do with it

  • Run a private assistant with no server callouts
  • Fine-tune the model for your own domain or language
  • Benchmark its outputs against GPT-3.5 or Mixtral
  • Use it in apps without sending data to OpenAI (see the sketch after this list)
  • Build tools, integrations, or even commercial products

All of this is legally allowed under Apache 2.0.
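
The app-integration point is more practical than it sounds, because Ollama also exposes an OpenAI-compatible endpoint, so existing tooling can be pointed at the local model with a one-line change. A minimal sketch, assuming the official openai Python package (v1+) and Ollama running on its default port:

from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

response = client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Draft a polite out-of-office reply."}],
)

print(response.choices[0].message.content)

The same pattern covers benchmarking: run an identical prompt set against the local model and a hosted baseline, then compare the outputs side by side.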

A few cautions

While the model is powerful and free to use:

  • There’s no visibility into training data
  • You are still bound by OpenAI’s use-case policies
  • Model card and system card documentation is limited

If you're deploying this in a safety-critical or regulated context, keep that in mind. I’ve written previously about the role of system cards in communicating model limitations — and GPT‑OSS offers only partial transparency.

Model Cards, System Cards and What They’re Quietly Becoming
What are AI model cards, and why are they becoming the documents regulators will turn to first? I read a few and it taught me more than I expected.

Final thoughts

This release makes local LLMs more accessible than ever, and it comes from a source that until now kept tight control. It’s not fully open source, but it is open enough to experiment, build, and even go to market.

OpenAI is making room in the ecosystem it once stood apart from. That’s a good thing.

