An agent that is 95% right per step is 60% right after ten steps. This chapter is about closing that gap — and making sure the failures that remain are cheap, contained, and recoverable.
The compounding-error problem
Reliability in agents is multiplicative. At 95% per-step accuracy, a ten-step run succeeds about 60% of the time (0.95^10 ≈ 0.60); at twenty steps it drops toward a coin flip. Production teams attack this from both sides: raise per-step accuracy (better tools, tighter prompts, structured outputs) and cut the number of unchecked steps (checkpoints, validations, early exits). Design for the math, not against it.
0.95^10 ≈ 60%
40%+
ten 95%-reliable steps, compounded agentic projects Gartner expects cancelled by 2027 basic probability Gartner, 2025
Scaling the boring, proven way
Agents scale like any other workload once you make them stateless: workers pull runs from a queue, every step reads and writes state in a store (the durable-execution pattern from Chapter 6), and any worker can resume any run. From there the standard toolkit applies — horizontal autoscaling, rate-limit-aware backpressure, and circuit breakers per provider. Three agent-specific additions matter:
- Budgets on everything — max steps, max tokens, max wall-clock, max spend per run. A runaway loop should hit a wall in seconds, not show up on an invoice.
- Idempotent tools — checkpoint IDs double as deduplication keys, so a retried step cannot send two refunds or two emails.
- Graceful degradation — when a provider or tool fails, the agent should fall back — smaller model, cached answer, or a clean handoff to a human — rather than erroring out.
Security: prompt injection and the lethal trifecta
The defining security problem of agents is prompt injection: instructions hidden in content the agent reads — a web page, an email, a PDF, a tool result — that the model treats as commands. The highest-risk shape is what security researcher Simon Willison calls the lethal trifecta: one agent that combines access to private data, exposure to untrusted content, and
the ability to communicate externally. With all three, a poisoned input can exfiltrate whatever the agent can read. No reliable model-level fix exists as of 2026, so the answer is architectural: break the trifecta (does the email-reading agent really need outbound web access?), and layer defenses so no single failure is fatal.
Inputs sanitise, tag untrusted content, strip secrets Model & prompt system-prompt hygiene, spotlighting untrusted spans Tools least privilege, allowlists, sandboxed execution, read-only by default Outputs schema validation, moderation, no raw HTML/SQL pass-through Humans & limits approval gates for irreversible acts; token, time and spend budgets
Figure 10.1 — Defense in depth: every layer assumes the one above it can fail.
- Treat all retrieved content as data, never instructions — tag or 'spotlight' untrusted spans so the model can tell them apart.
- Least-privilege tools: read-only by default, allowlisted domains and tables, sandboxed code execution.
- Schema-validate every tool call and every output; reject rather than repair on policy violations.
- Human approval gates on irreversible or high-value actions — payments, deletions, external sends.
- Log every step with inputs and outputs; an agent you cannot audit is an agent you cannot trust.
The capability budget
Write down, per agent: what it may read, what it may do, what it may spend, and who approves the exceptions. If a capability is not on the list, the agent does not get it. Most production incidents trace back to capabilities nobody remembered granting.