Appendices

Chapter B

Glossary of Agent Engineering Terms

Working definitions as used in this book.

  • A2A (Agent2Agent) — open protocol for discovery and delegation between independent agents; Agent Cards describe capabilities.
  • Agent — an LLM that directs its own tool use in a loop toward a goal, within budgets and policies.
  • Agentic RAG — retrieval where the agent decides what to fetch, when, and whether to search again.
  • Cascade / routing — sending each task to the cheapest model likely to succeed, escalating on low confidence.
  • Checkpoint — persisted run state allowing pause, resume, retry, and human approval gates.
  • Context engineering — deciding what enters the model's window at each step: compaction, scratchpads, just-in-time (JIT) retrieval.
  • Context window — the model's working set per call: finite, priced per token, and not memory.
  • Durable execution — running agents as resumable state machines so crashes and waits don't lose work.
  • Eval (evaluation) — a repeatable test of agent behaviour: unit, trajectory, outcome, or LLM-as-judge.
  • Function / tool calling — the model emitting structured arguments for code your system executes.
  • Guardrails — input/output validation, policy checks, and budgets wrapped around model behaviour.
  • HITL (human-in-the-loop) — a person approves, samples, or receives escalations from the agent.
  • Idempotency — designing actions so a retried step cannot apply twice (no double refunds).
  • Lethal trifecta — private data + untrusted content + external communication in one agent; the worst case for prompt injection.
  • LLM-as-judge — using a model to score outputs against a rubric; calibrate against human labels.
  • MCP (Model Context Protocol) — open standard connecting agents to tools, resources, and prompts through a client-server architecture.
  • Memory (agent) — engineered long-term store (working, episodic, semantic, procedural) with write policies.
  • Multi-agent system — several agents coordinating via supervisor, pipeline, network, or hierarchy topologies.
  • Orchestration — the control layer sequencing steps, agents, tools, and approvals.
  • Prompt caching — provider-side reuse of an already-processed prompt prefix; cached reads are billed at a fraction of the normal input price.
  • Prompt injection — instructions hidden in content the agent reads, which the model then treats as commands.
  • Quantization — compressing model weights to lower precision (e.g. 4-bit) to cut memory use and speed up inference, at a small quality cost.
  • ReAct — the reason-act-observe loop pattern underlying most single-agent designs.
  • Semantic cache — answering near-duplicate requests from stored responses without a model call.
  • Trace / trajectory — the recorded sequence of model calls, tool calls, and results for one run.
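
Several of these terms are easier to hold onto when seen together in code. The following is a minimal sketch, not a reference implementation: it ties a ReAct-style loop, a step budget (one kind of guardrail), an idempotent tool, and a trace into one small program. Every name in it (`RefundTool`, `scripted_model`, `run_agent`, the `order-17` key) is hypothetical and chosen for illustration, and the "model" is a scripted stand-in rather than a real LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class RefundTool:
    """Hypothetical tool that remembers the idempotency keys it has applied."""
    applied: dict = field(default_factory=dict)

    def __call__(self, key: str, amount: int) -> str:
        if key in self.applied:
            # Retried step: return the stored result instead of applying twice.
            return self.applied[key]
        result = f"refunded {amount}"
        self.applied[key] = result
        return result


def scripted_model(observations: list) -> dict:
    """Stand-in for an LLM: emits one tool call, then a final answer."""
    if not observations:
        return {"tool": "refund", "args": {"key": "order-17", "amount": 20}}
    return {"final": observations[-1]}


def run_agent(model, tools: dict, max_steps: int = 5):
    """Reason-act-observe loop with a step budget; returns (answer, trace)."""
    observations, trace = [], []
    for _ in range(max_steps):
        decision = model(observations)                        # reason
        if "final" in decision:
            trace.append(("final", decision["final"]))
            return decision["final"], trace
        result = tools[decision["tool"]](**decision["args"])  # act
        observations.append(result)                           # observe
        trace.append((decision["tool"], decision["args"], result))
    raise RuntimeError("step budget exhausted")


refund = RefundTool()
answer, trace = run_agent(scripted_model, {"refund": refund})

# Simulated retry of the same step: the key is already recorded,
# so no second refund is applied.
refund(key="order-17", amount=20)
```

Here `trace` holds one tool-call entry and one final entry, which is the raw material a trajectory eval would inspect. A durable-execution setup would additionally persist `observations` and `trace` as a checkpoint after each step so that a crash resumes the loop rather than restarting it; that persistence is left out of this sketch.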