Part II: The Toolkit

Chapter 03

The Framework Landscape, 2026

A field guide to twelve production AI agent frameworks, including LangGraph, CrewAI, vendor SDKs, Pydantic AI, LlamaIndex, Haystack, and Microsoft Agent Framework.

From two serious options to twelve production-grade frameworks in eighteen months. What each one is actually for, how to choose, and how to avoid marrying one.

How we got here

Through 2024 the choice was effectively LangChain or roll-your-own. By mid-2026 the field looks completely different: industry surveys such as Uvik's 2026 comparison count a dozen production-viable Python frameworks, and the three big model vendors each shipped a first-party agent SDK within weeks of one another. Microsoft consolidated AutoGen and Semantic Kernel into a unified Agent Framework, moving classic AutoGen into maintenance (the community continues it as AG2). Meanwhile TypeScript builders got Mastra and the vendor SDKs' JS ports, and low-code platforms (n8n, Dify, Copilot Studio) made simple agents a configuration exercise.

Two findings from the field matter more than any ranking.

First: there is no winner. Across one consultancy's twelve 2025-26 client engagements, no single framework appeared more than four times; the right answer tracked workflow complexity, vendor commitment, and appetite for abstraction.

Second: composition is normal. Teams routinely run a LangGraph orchestration spine with CrewAI-style role agents inside it, or a vendor SDK agent that calls tools shared with everything else over MCP.

The twelve, in one table

| Framework | Core abstraction | Sweet spot | Watch out for |
| --- | --- | --- | --- |
| LangGraph | Graph / state machine with checkpoints | Complex routing, approvals, durable long-running flows; the production default for many teams | Steeper learning curve; graph thinking is mandatory |
| LangChain | Chains + integration library | Rapid prototyping atop the largest integration catalog | Abstraction churn; many teams graduate to LangGraph |
| CrewAI | Role-based crews (role, goal, backstory) | Content pipelines, research-write-review, role-shaped business processes; A2A delegation added in 2026 | Role metaphor strains on highly dynamic tasks |
| OpenAI Agents SDK | Lightweight agents + handoffs + guardrails | Fast builds inside the OpenAI ecosystem; clean tracing | Vendor-centric; portability needs discipline |
| Claude Agent SDK | The Claude Code harness as a library: files, terminal, computer use, sub-agents | Coding agents and desk-work automation with strong tool ergonomics; MCP-native | Anthropic-centric by design |
| Google ADK | Multi-agent hierarchies; native A2A with auto Agent Cards | Cross-vendor agent interop, Google Cloud estates | Heavier; assumes Google tooling |
| Pydantic AI | Typed agents, validated structured outputs | Teams that want compile-time-ish safety, testing, and clean dependency injection | Smaller ecosystem than the giants |
| smolagents | Minimal code-acting agents (~1K LOC core) | Hugging Face stack, research, learning the loop; agents that write code as their action format | Code-execution security needs sandboxing |
| Agent Framework (MS) | Unified successor to Semantic Kernel + AutoGen | .NET / Azure enterprises, compliance-heavy estates | Newest of the set; migration from SK/AutoGen ongoing |
| AG2 (AutoGen fork) | Conversational multi-agent chat | Research-style agent dialogues, code-execution loops | Classic AutoGen itself is in maintenance |
| LlamaIndex | Data-centric agents over indexes | Document workflows, agentic RAG, knowledge assistants | Less suited to general orchestration |
| Haystack | Composable pipelines (Deepset) | Production search + RAG with agent steps; strong eval tooling | Pipeline mindset, not free-form autonomy |

Honourable mentions: Mastra (TypeScript-first, batteries included), DSPy (programmatic prompt optimization rather than an agent runtime), and the low-code tier (n8n, Dify, Microsoft Copilot Studio), which is genuinely sufficient for linear, low-risk internal automations.

A decision guide

1. Complex branching, audits, pause/resume? Yes: LangGraph. No: next question.
2. Process maps to roles (research, write, review)? Yes: CrewAI. No: next question.
3. Committed to one model vendor, want speed? Yes: Vendor SDK (OpenAI / Claude / ADK). No: next question.
4. Type-safety and testability first? Yes: Pydantic AI. No: next question.
5. Microsoft / .NET enterprise estate? Yes: Agent Framework (MS). No: next question.
6. Document- and data-centric agents? Yes: LlamaIndex / Haystack. No: none of the above.

None of the above: start with a minimal loop (smolagents or ~100 lines of your own) and graduate only when you feel the ceiling.

Figure 3.1 — A pragmatic selection ladder. First 'yes' wins; composition across answers is legitimate.
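
For readers who prefer code to diagrams, the ladder in Figure 3.1 can be sketched as an ordered list of yes/no questions. This is a minimal illustration, not a tool from any of the frameworks: the `Needs` fields, `recommend`, and `all_matches` are names invented here, and the ordering simply mirrors the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical yes/no answers to the six questions in Figure 3.1."""
    complex_branching: bool = False      # branching, audits, pause/resume
    role_shaped_process: bool = False    # research, write, review
    single_vendor_speed: bool = False    # committed to one model vendor
    type_safety_first: bool = False      # type-safety and testability
    microsoft_estate: bool = False       # Microsoft / .NET enterprise
    document_centric: bool = False       # document- and data-centric agents


# Ladder order matters: the first question answered "yes" wins.
LADDER = [
    ("complex_branching", "LangGraph"),
    ("role_shaped_process", "CrewAI"),
    ("single_vendor_speed", "Vendor SDK (OpenAI / Claude / ADK)"),
    ("type_safety_first", "Pydantic AI"),
    ("microsoft_estate", "Agent Framework (MS)"),
    ("document_centric", "LlamaIndex / Haystack"),
]
FALLBACK = "Minimal loop (smolagents or ~100 lines of your own)"


def recommend(needs: Needs) -> str:
    """Return the first rung whose question is answered 'yes'."""
    for field_name, framework in LADDER:
        if getattr(needs, field_name):
            return framework
    return FALLBACK


def all_matches(needs: Needs) -> list[str]:
    """Return every rung answered 'yes': candidates for composition."""
    return [fw for field_name, fw in LADDER if getattr(needs, field_name)]


if __name__ == "__main__":
    needs = Needs(complex_branching=True, role_shaped_process=True)
    print(recommend(needs))
    print(all_matches(needs))
```

`recommend` gives the single starting point, while `all_matches` surfaces the other rungs that also apply, which is where composition across answers comes in.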

The lock-in question

Frameworks are the layer most likely to churn under you — abstractions get deprecated, pricing and licensing shift, a better fit appears. The teams that switch painlessly all did the same thing: they kept their own thin interfaces for 'agent', 'tool', and 'memory', and treated the framework as an implementation detail behind them. That hexagonal discipline costs a few hundred lines up front and buys you the right to change your mind. Chapter 7 turns this into a full portability playbook.
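
As a rough sketch of what those thin interfaces might look like, the Python below defines three small protocols and one stand-in adapter. Every name here (`Tool`, `Memory`, `Agent`, `ListMemory`, `UppercaseTool`, `StubAgent`, `handle_request`) is hypothetical and chosen for illustration; a real adapter would wrap a framework object inside `run` instead of calling a tool directly.

```python
from typing import Any, Protocol


class Tool(Protocol):
    """Port: anything the agent can call by name."""
    name: str

    def run(self, **kwargs: Any) -> str: ...


class Memory(Protocol):
    """Port: conversation or task state, independent of any framework."""

    def append(self, role: str, content: str) -> None: ...

    def history(self) -> list[tuple[str, str]]: ...


class Agent(Protocol):
    """Port: the only surface application code is allowed to depend on."""

    def run(self, task: str) -> str: ...


class ListMemory:
    """Simplest possible Memory: an in-process list."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str]] = []

    def append(self, role: str, content: str) -> None:
        self._items.append((role, content))

    def history(self) -> list[tuple[str, str]]:
        return list(self._items)


class UppercaseTool:
    """Toy tool so the example runs without any external service."""
    name = "uppercase"

    def run(self, **kwargs: Any) -> str:
        return str(kwargs["text"]).upper()


class StubAgent:
    """Stand-in adapter. A real one would delegate to a framework here."""

    def __init__(self, tools: list[Tool], memory: Memory) -> None:
        self._tools = {tool.name: tool for tool in tools}
        self._memory = memory

    def run(self, task: str) -> str:
        self._memory.append("user", task)
        answer = self._tools["uppercase"].run(text=task)
        self._memory.append("assistant", answer)
        return answer


def handle_request(agent: Agent, task: str) -> str:
    """Application code: sees only the Agent port, never a framework type."""
    return agent.run(task)


if __name__ == "__main__":
    memory = ListMemory()
    agent = StubAgent([UppercaseTool()], memory)
    print(handle_request(agent, "summarise the report"))
    print(memory.history())
```

Because `handle_request` is typed against `Agent` alone, swapping frameworks means writing a new adapter class and leaving the calling code untouched.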

Selection method that works

Pick by elimination, not attraction. List your hard constraints (vendor commitments, language, compliance, team skills, durability needs) and strike every framework that fails any of them. Then take whatever survives and prototype the riskiest slice of your real workload in two days before committing. A framework that demos well on toy tasks can still fight you on yours.
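
The elimination step can be sketched in a few lines of Python. The candidates below are deliberately named `framework_a` to `framework_c`, and their attribute values are made up for illustration; they are not claims about any real framework. The `Candidate` fields and the `eliminate` helper are likewise assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical fact sheet for one framework under evaluation."""
    name: str
    languages: frozenset[str]
    vendor_neutral: bool
    durable_execution: bool


# Placeholder data: fill in from your own evaluation, not from this sketch.
CANDIDATES = [
    Candidate("framework_a", frozenset({"python"}), True, True),
    Candidate("framework_b", frozenset({"python", "typescript"}), False, False),
    Candidate("framework_c", frozenset({"dotnet"}), True, True),
]

Constraint = Callable[[Candidate], bool]


def eliminate(
    candidates: list[Candidate],
    constraints: dict[str, Constraint],
) -> tuple[list[str], dict[str, list[str]]]:
    """Strike any candidate that fails at least one hard constraint.

    Returns the surviving names plus, for each struck candidate,
    the labels of the constraints it failed.
    """
    survivors: list[str] = []
    struck: dict[str, list[str]] = {}
    for candidate in candidates:
        failed = [label for label, check in constraints.items()
                  if not check(candidate)]
        if failed:
            struck[candidate.name] = failed
        else:
            survivors.append(candidate.name)
    return survivors, struck


if __name__ == "__main__":
    hard_constraints: dict[str, Constraint] = {
        "team writes Python": lambda c: "python" in c.languages,
        "no single-vendor dependency": lambda c: c.vendor_neutral,
        "must survive restarts": lambda c: c.durable_execution,
    }
    survivors, struck = eliminate(CANDIDATES, hard_constraints)
    print("prototype next:", survivors)
    for name, reasons in struck.items():
        print(f"struck {name}: {', '.join(reasons)}")
```

Recording why each candidate was struck keeps the decision auditable and makes it cheap to revisit when a constraint changes.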