The AI Continuity Problem

AI is load-bearing in my company. That creates a new kind of business risk: what happens when access breaks, gets priced out, or gets cut off?

Whisper runs on a Mac mini in my office. It transcribes conversations locally, no cloud, no API call, no dependency on a provider staying available or affordable. It is not the most capable transcription tool I have access to. It is the one I control.

That distinction has started to matter more.

Load-bearing

AI is no longer optional in how I work. It sits inside research, writing, client work, automation. Removing it would not be an inconvenience; it would be a business interruption. That is a useful capability to have built, and also a new kind of exposure.

The failure modes are not exotic. A geopolitical rupture that places European businesses on the wrong side of US export controls. A provider that decides your sector, use case, or geography carries too much compliance risk. An API pricing change that breaks your unit economics overnight. A network outage on a day when the work cannot wait. None of these require imagination. Some are already happening at the edges.

The minimum viable hedge

My response so far is not a full architecture. It is a posture.

The first line is supplier diversity. I use multiple models across multiple providers. No single point of failure, no single commercial relationship that can strand me. This is the cheapest form of continuity planning: just don't consolidate everything onto one platform.
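That posture can be sketched in a few lines. This is a minimal illustration of provider fallback, not real SDK code; the provider functions here are hypothetical stand-ins for whatever clients you actually use:

```python
from typing import Callable

Provider = tuple[str, Callable[[str], str]]

def complete_with_fallback(prompt: str, providers: list[Provider]) -> str:
    """Try each provider in order; return the first successful response."""
    failures = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # timeouts, rate limits, revoked keys
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(failures))

# Hypothetical stand-ins for real API clients, for illustration only.
def primary(prompt: str) -> str:
    raise TimeoutError("rate limited")

def backup(prompt: str) -> str:
    return "ok: " + prompt

print(complete_with_fallback("summarise meeting notes",
                             [("primary", primary), ("backup", backup)]))
# prints "ok: summarise meeting notes"
```

The point is not the code; it is that the second commercial relationship already exists before the first one fails.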

The second line is selective local capability. Whisper for transcription is the working example. A Mac mini running a local LLM for specific workflows is the experiment in progress. The goal is not to replicate cloud AI locally; that is neither feasible nor worth the investment for a small operation. The goal is to keep the door open: to maintain enough local capability that critical workflows can survive a cloud disruption, even in degraded form.
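For transcription, the local path is short. A setup sketch using the open-source openai-whisper CLI (the file name and model size here are illustrative; the model downloads once on first use, then everything runs offline):

```shell
# One-time setup: install Whisper locally
pip install -U openai-whisper

# Transcribe entirely on-device; no audio leaves the machine
whisper interview.m4a --model base --output_format txt
```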


The constraint is real. Time spent building local infrastructure is time not spent on client work. The investment has to be proportional, which means choosing carefully which workflows justify it. Transcription earns it: the volume is high, the sensitivity is real, and Whisper works well enough to be a genuine alternative.

Continuity as strategy

Most small companies and solo practitioners haven't made this decision consciously. AI crept in as a convenience and became structural before anyone asked what happens if it stops. That is not a criticism; it is the normal pattern with any infrastructure shift. Electricity, cloud storage, SaaS: each time, dependency arrived before continuity planning did.

⚠️
The risk isn't that AI becomes unavailable everywhere. It's that your specific access, through a specific provider at a specific price point, becomes unavailable. Diversification and partial local capability are the practical responses.

The question for any AI-dependent operation is not whether to use cloud models. It is whether you have thought through what you would do without them, and whether the answer is better than "stop working." A local Whisper instance and a second API account is not a complete answer. It is a start, and it is more than most have.

