Platform Independence & Portability | AI Agent Engineering Handbook

Models get deprecated, prices change, and better options ship monthly. This chapter shows how to build agents that survive vendor churn, and switch models in an afternoon, not a quarter.

Why portability is a feature, not paranoia

In the eighteen months to mid-2026 the 'best model for the job' changed several times in every price band, two major frameworks were folded into successors, and providers repeatedly adjusted pricing and deprecated model versions on short notice. Teams whose agents were welded to one SDK re-platformed under deadline pressure; teams with a thin abstraction layer re-ran their evals against the new model and switched a config value. For organisations in the Gulf there is a second driver: data-residency and procurement rules can dictate which providers, or which regions of a provider, you may use, and those rules change too. Portability does not mean using every vendor at once. It means keeping the cost of changing your mind low. Three layers deliver that: a gateway, your own interfaces, and disciplined prompts.

Layer 1, the model gateway

A gateway gives every model the same API shape, so your application code never knows which provider answered. Two mature options dominate. LiteLLM is an open-source proxy you self-host: one OpenAI-format endpoint in front of 100+ providers, with retries, fallback chains, spend caps, and per-key usage logs. OpenRouter is the hosted equivalent, one API key, one bill, automatic failover across providers. Self-host the gateway when data must stay in-country; use the hosted version when speed of setup matters more.

Your agent
one client, one format
Gateway
LiteLLM / OpenRouter
retries · fallbacks · spend caps · usage logs
OpenAI
GPT family
Anthropic
Claude family
Google
Gemini family
Local
Ollama / vLLM

Figure 7.1. One client, one format. The gateway owns provider differences, retries and spend caps.

Layer 2, own your interfaces

Frameworks are the layer most likely to churn under you (Chapter 3). The hexagonal answer: define the four or five operations your agents actually need, complete(), embed(), search(), remember(), act(), as interfaces you own, and implement them with thin adapters per framework or vendor. Your domain logic imports only your interfaces. Swapping LangGraph for a vendor SDK then touches adapters, not business rules.

stable
Your domain logic
workflows, business rules, policies — never imports a vendor SDK
Your agent interfaces (ports)
complete() / embed() / search() / remember() — contracts you own
Adapters
thin wrappers: LangGraph node, CrewAI tool, vendor SDK client
Gateway
LiteLLM (self-hosted) or OpenRouter (hosted) — one API shape, many providers
Providers & runtimes
OpenAI · Anthropic · Google · Mistral · Ollama / vLLM on your hardware
volatile

Figure 7.2. The portable stack. Volatility increases as you go down; your code lives at the top.

Layer 3, prompts and the 12-factor mindset

The 12-Factor Agents principles, a widely shared adaptation of Heroku's 12-factor app for agentic software, capture the remaining discipline well. The ones that matter most for portability:

Own your prompts, prompts are versioned source files in your repo, not strings buried in a framework or a vendor dashboard. Diffs, reviews, and rollbacks apply.
Own your context window, you decide what enters the context each step, don't let a framework's default memory silently define your product.
Tools are structured outputs, treat tool use as 'model emits JSON, your code executes it'. That contract is identical across every provider.
Be a stateless reducer, each step maps (state, event) to new state. State lives in your store, so any provider, or any worker, can resume the run.
Avoid one-vendor cleverness, provider-exclusive features are fine, but wrap them behind your interface and keep a documented fallback.

The model-swap checklist

You are genuinely portable when all six boxes tick:

An eval suite (Chapter 11) that defines 'good' independent of any model, runnable against a candidate in under an hour.
Prompts in version control, with per-model variant files where genuinely needed.
A gateway or interface so the swap is a config change, not a refactor.
Per-provider token accounting, same conversation, different tokenizers, different bills.
Pinned model versions in production ('claude-sonnet-4-6', not 'latest'), upgraded deliberately.
A documented fallback chain the gateway executes when a provider degrades or rate-limits.

Practice the swap before you need it

Run your evals against two providers from day one — even if you only deploy one. The
second provider keeps your design honest and turns every future migration into a
rehearsed move.

← Orchestration: Single Agents to Multi-Agent Systems Offline-First & Local Agents →