Models get deprecated, prices change, and better options ship monthly. This chapter shows how to build agents that survive vendor churn — and switch models in an afternoon, not a quarter.
Why portability is a feature, not paranoia
In the eighteen months to mid-2026 the 'best model for the job' changed several times in every price band, two major frameworks were folded into successors, and providers repeatedly adjusted pricing and deprecated model versions on short notice. Teams whose agents were welded to one SDK re-platformed under deadline pressure; teams with a thin abstraction layer re-ran their evals against the new model and switched a config value. For organisations in the Gulf there is a second driver: data-residency and procurement rules can dictate which providers — or which regions of a provider — you may use, and those rules change too. Portability does not mean using every vendor at once. It means keeping the cost of changing your mind low. Three layers deliver that: a gateway, your own interfaces, and disciplined prompts.
Layer 1 — the model gateway
A gateway gives every model the same API shape, so your application code never knows which provider answered. Two mature options dominate. LiteLLM is an open-source proxy you self-host: one OpenAI-format endpoint in front of 100+ providers, with retries, fallback chains, spend caps, and per-key usage logs. OpenRouter is the hosted equivalent — one API key, one bill, automatic failover across providers. Self-host the gateway when data must stay in-country; use the hosted version when speed of setup matters more.
Your agent one client, one format Gateway LiteLLM / OpenRouter retries · fallbacks · spend caps · usage logs OpenAI GPT family Anthropic Claude family Google Gemini family Local Ollama / vLLM
Figure 7.1 — One client, one format. The gateway owns provider differences, retries and spend caps.
Layer 2 — own your interfaces
Frameworks are the layer most likely to churn under you (Chapter 3). The hexagonal answer: define the four or five operations your agents actually need — complete(), embed(), search(), remember(), act() — as interfaces you own, and implement them with thin adapters per framework or vendor. Your domain logic imports only your interfaces. Swapping LangGraph for a vendor SDK then touches adapters, not business rules.
stable Your domain logic workflows, business rules, policies — never imports a vendor SDK Your agent interfaces (ports) complete() / embed() / search() / remember() — contracts you own Adapters thin wrappers: LangGraph node, CrewAI tool, vendor SDK client Gateway LiteLLM (self-hosted) or OpenRouter (hosted) — one API shape, many providers Providers & runtimes OpenAI · Anthropic · Google · Mistral · Ollama / vLLM on your hardware volatile
Figure 7.2 — The portable stack. Volatility increases as you go down; your code lives at the top.
Layer 3 — prompts and the 12-factor mindset
The 12-Factor Agents principles — a widely shared adaptation of Heroku's 12-factor app for agentic software — capture the remaining discipline well. The ones that matter most for portability:
- Own your prompts — prompts are versioned source files in your repo, not strings buried in a framework or a vendor dashboard. Diffs, reviews, and rollbacks apply.
- Own your context window — you decide what enters the context each step — don't let a framework's default memory silently define your product.
- Tools are structured outputs — treat tool use as 'model emits JSON, your code executes it'. That contract is identical across every provider.
- Be a stateless reducer — each step maps (state, event) to new state. State lives in your store, so any provider — or any worker — can resume the run.
- Avoid one-vendor cleverness — provider-exclusive features are fine, but wrap them behind your interface and keep a documented fallback.
The model-swap checklist
You are genuinely portable when all six boxes tick:
- An eval suite (Chapter 11) that defines 'good' independent of any model, runnable against a candidate in under an hour.
- Prompts in version control, with per-model variant files where genuinely needed.
- A gateway or interface so the swap is a config change, not a refactor.
- Per-provider token accounting — same conversation, different tokenizers, different bills.
- Pinned model versions in production ('claude-sonnet-4-6', not 'latest'), upgraded deliberately.
- A documented fallback chain the gateway executes when a provider degrades or rate-limits.
Practice the swap before you need it
Run your evals against two providers from day one — even if you only deploy one. The second provider keeps your design honest and turns every future migration into a rehearsed move.