Agentic Workflows

Systems in which an LLM drives a multi-step process autonomously: making decisions, calling tools, observing results, and continuing until a goal is achieved. This is distinct from single-turn completions, where the model answers once and stops.

Anthropic's canonical reference on this topic is "Building Effective Agents" (December 19, 2024), which describes five composable workflow patterns plus autonomous agents, and recommends preferring simple, composable patterns over complex frameworks.

Core Components

Agent loop: The model runs in a loop: reason → act → observe → reason. Each cycle can involve tool calls, code execution, file operations, or API calls. The loop continues until the task is complete or the agent hits a stopping condition.
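
A minimal sketch of the reason → act → observe cycle, assuming a model that returns either a tool call or a final answer. The names fake_model, TOOLS and run_agent are hypothetical stand-ins for illustration, not part of any real SDK; the step budget is one possible stopping condition.

```python
from typing import Callable, Optional

# Hypothetical tool table: one toy tool so the loop has something to call.
TOOLS: dict[str, Callable] = {"add": lambda a, b: a + b}

def fake_model(history: list[str]) -> dict:
    """Stand-in for a model call: picks the next action from the history."""
    if not any(entry.startswith("observe:") for entry in history):
        return {"type": "tool", "name": "add", "args": (2, 3)}
    # Once a result has been observed, finish with the latest observation.
    return {"type": "finish", "answer": history[-1].split(":", 1)[1]}

def run_agent(model: Callable[[list[str]], dict], goal: str,
              max_steps: int = 10) -> Optional[str]:
    history = [f"goal:{goal}"]
    for _ in range(max_steps):
        action = model(history)                            # reason
        if action["type"] == "finish":
            return action["answer"]
        result = TOOLS[action["name"]](*action["args"])    # act
        history.append(f"observe:{result}")                # observe
    return None  # step budget exhausted without a final answer

print(run_agent(fake_model, "add 2 and 3"))
```

In a real agent the history would be the message list sent to the model, and the step budget would sit alongside the other stopping conditions described below.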

Tools: Functions the agent can call. Common tools in coding agents: read_file, write_file, run_command, search, web_fetch. The Model Context Protocol (mcp) standardizes how tools are exposed to the model.
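
One way to expose tools is a registry keyed by name plus a dispatcher that executes whatever call the model emits. The sketch below assumes a call format of {"name": ..., "args": {...}}; that format, the tool decorator and dispatch are illustrative assumptions and are not the MCP wire format.

```python
import json

TOOLS = {}

def tool(fn):
    """Register a function under its own name so the agent can call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def read_file(path: str) -> str:
    with open(path, encoding="utf-8") as f:
        return f.read()

@tool
def write_file(path: str, content: str) -> str:
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"wrote {len(content)} characters"

def dispatch(call_json: str) -> str:
    """Run a tool call given as JSON and return the result as text."""
    call = json.loads(call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"error: unknown tool {call['name']}"
    try:
        return str(fn(**call["args"]))
    except Exception as exc:
        # Report the failure to the model instead of crashing the loop.
        return f"error: {exc}"
```

Returning errors as text lets the model see the failure as an observation and decide how to recover.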

Memory: What the agent knows about the current task. Short-term: the current context window. Long-term: files, databases, or external stores the agent reads/writes persistently.
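
The short-term / long-term split can be sketched as a bounded in-memory window plus a file the agent reads and writes across runs. The Memory class, its method names and the JSON file layout are assumptions made for this example.

```python
import json
from collections import deque
from pathlib import Path

class Memory:
    def __init__(self, store_path: str, window: int = 4):
        # Short-term: a bounded window; the oldest entries fall off,
        # loosely mimicking a context window filling up.
        self.short_term = deque(maxlen=window)
        # Long-term: a JSON file that survives between runs.
        self.store = Path(store_path)

    def observe(self, message: str) -> None:
        self.short_term.append(message)

    def remember(self, key: str, value: str) -> None:
        notes = self.recall_all()
        notes[key] = value
        self.store.write_text(json.dumps(notes), encoding="utf-8")

    def recall_all(self) -> dict:
        if not self.store.exists():
            return {}
        return json.loads(self.store.read_text(encoding="utf-8"))
```

A fresh Memory object pointed at the same file sees everything stored with remember, while its short-term window starts empty.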

Stopping conditions: When does the agent stop? When it declares success, when it hits an error it can't recover from, when it runs out of context, or when a hook/human says stop.
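
The four stopping conditions above can be checked in one place at the top of each loop iteration. This is a sketch; RunState, its fields and the priority order (external stop first, declared success last) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunState:
    declared_success: bool = False
    fatal_error: Optional[str] = None
    tokens_used: int = 0
    stop_requested: bool = False  # set by a hook or a human

def stop_reason(state: RunState, token_budget: int) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if state.stop_requested:
        return "stopped by hook or human"
    if state.fatal_error is not None:
        return f"unrecoverable error: {state.fatal_error}"
    if state.tokens_used >= token_budget:
        return "context budget exhausted"
    if state.declared_success:
        return "agent declared success"
    return None
```

Checking the external stop first means a human or hook can always override an agent that believes it has succeeded.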

Patterns

Anthropic's "Building Effective Agents" describes six patterns, five workflows followed by autonomous agents:

Prompt Chaining: Breaking a task into sequential steps, where each model call uses the output of the previous one. Useful for tasks with a clear pipeline structure.
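
As a sketch, a chain is a fold over a list of steps, each consuming the previous output. The functions outline, draft and polish are hypothetical placeholders for separate model calls.

```python
from typing import Callable

def chain(steps: list[Callable[[str], str]], text: str) -> str:
    """Feed each step's output into the next step."""
    for step in steps:
        text = step(text)
    return text

# Placeholder steps; each would be its own prompt in a real pipeline.
def outline(topic: str) -> str:
    return f"outline({topic})"

def draft(outline_text: str) -> str:
    return f"draft({outline_text})"

def polish(draft_text: str) -> str:
    return f"polish({draft_text})"

print(chain([outline, draft, polish], "agents"))
```

Because each step is an ordinary function, a validation check can be slotted between two steps without changing the rest of the chain.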

Routing: A classifier step directs inputs to different specialized sub-workflows. Keeps each path focused and avoids forcing a single model to handle unrelated task types.
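
Routing reduces to a classifier plus a lookup table of handlers. In this sketch the keyword-based classify function stands in for a classifier model call, and the route names and handlers are invented for illustration.

```python
def classify(message: str) -> str:
    """Stand-in for a classifier model call (keyword rules are an assumption)."""
    text = message.lower()
    if "refund" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"

# Each handler would be a specialized prompt or sub-workflow in practice.
HANDLERS = {
    "billing": lambda m: f"[billing workflow] {m}",
    "technical": lambda m: f"[technical workflow] {m}",
    "general": lambda m: f"[general workflow] {m}",
}

def route(message: str) -> str:
    return HANDLERS[classify(message)](message)
```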

Parallelization: Multiple model calls run concurrently — either the same task split across chunks, or different perspectives on the same input aggregated afterward.
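
A sketch of the chunk-splitting variant using a thread pool from the standard library. summarize_chunk is a hypothetical stand-in for a model call; threads suit this case because real model calls are I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_chunk(chunk: str) -> str:
    """Stand-in for a model call: 'summarize' by keeping the first word."""
    return chunk.split()[0]

def summarize_parallel(chunks: list[str], max_workers: int = 4) -> str:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, even if calls
        # finish out of order.
        partials = list(pool.map(summarize_chunk, chunks))
    # Aggregation step: join the partial results.
    return " | ".join(partials)
```

The multiple-perspectives variant has the same shape: map several different prompts over one input, then aggregate or vote on the answers.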

Orchestrator-Workers: One model plans and delegates; multiple models (or model instances) execute the subtasks. Suited to large or diverse tasks whose subtasks can't be fixed in advance and must be decided per input. See multi-agent-setup.
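
The difference from plain parallelization is that the subtasks are produced at run time by the orchestrator. In this sketch, plan and worker are hypothetical stand-ins for model calls, and splitting on " and " is only a toy planning rule.

```python
def plan(task: str) -> list[str]:
    """Stand-in orchestrator call: decide the subtasks for this input."""
    return [part.strip() for part in task.split(" and ")]

def worker(subtask: str) -> str:
    """Stand-in worker call: execute one subtask."""
    return f"done: {subtask}"

def orchestrate(task: str) -> dict[str, str]:
    subtasks = plan(task)  # the subtask list is not known in advance
    # Delegate each subtask, then collect results for synthesis.
    return {subtask: worker(subtask) for subtask in subtasks}

print(orchestrate("write tests and update docs"))
```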

Evaluator-Optimizer: A generator model produces output; a separate evaluator model scores it and provides feedback; the generator revises. The loop iterates until a quality threshold is met.
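
A sketch of the generate, evaluate, revise loop with a round limit so it cannot spin forever. The evaluate and revise functions are toy stand-ins for the evaluator and generator model calls; the scoring rules are invented for the example.

```python
def evaluate(text: str) -> tuple[float, str]:
    """Stand-in evaluator: return (score, feedback)."""
    if not text[:1].isupper():
        return 0.0, "capitalize the first letter"
    if not text.endswith("."):
        return 0.5, "end with a period"
    return 1.0, ""

def revise(text: str, feedback: str) -> str:
    """Stand-in generator revision: apply the evaluator's feedback."""
    if feedback == "capitalize the first letter":
        return text[:1].upper() + text[1:]
    if feedback == "end with a period":
        return text + "."
    return text

def refine(draft: str, threshold: float = 1.0,
           max_rounds: int = 5) -> tuple[str, int]:
    """Return the final draft and the number of revisions made."""
    for round_number in range(max_rounds):
        score, feedback = evaluate(draft)
        if score >= threshold:
            return draft, round_number
        draft = revise(draft, feedback)
    return draft, max_rounds  # gave up: threshold not reached in time

print(refine("agents loop"))
```

The max_rounds cap matters in practice: without it, an evaluator that is never satisfied keeps the loop running indefinitely.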

Autonomous Agents: The model operates in a loop with tool access and few or no human checkpoints, choosing each next step from the results of the previous one. Core use case for overnight-runs.

Anthropic's design philosophy: prefer simple, composable patterns over complex frameworks.

Additional Patterns in Practice

Single agent: One model in a loop. claude-code default mode. Works for most tasks.

Supervisor: Human in the loop at defined checkpoints. Agent runs autonomously but pauses for approval at key decision points. Safer for high-stakes tasks.
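
A supervisor checkpoint can be sketched as an approval callback consulted before any action on a risky list. The RISKY set, the action names and run_with_checkpoints are assumptions for illustration.

```python
from typing import Callable

# Hypothetical set of actions that require sign-off before running.
RISKY = {"delete_file", "deploy"}

def run_with_checkpoints(actions: list[str],
                         approve: Callable[[str], bool]) -> list[str]:
    log = []
    for action in actions:
        # Pause at the checkpoint: only risky actions need approval.
        if action in RISKY and not approve(action):
            log.append(f"skipped {action}")
            continue
        log.append(f"ran {action}")
    return log

# A deny-all approver; an interactive one could wrap input() instead.
print(run_with_checkpoints(["read_file", "deploy"], lambda action: False))
```

Passing the approver in as a function keeps the loop testable and lets the same code run supervised or unattended.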

Failure Modes

  • Compounding errors: A small mistake early leads to bigger mistakes later as the agent builds on a wrong assumption. Long sessions need checkpointing.
  • Context loss: As context fills up, earlier information gets compressed or lost. claude-code's compaction helps but isn't perfect.
  • Tool failures: External APIs fail, file permissions change, tests run out of memory. The agent needs to handle these gracefully.
  • Hallucinated success: The agent convinces itself the task is done when it isn't. Verification steps help — always run the tests.
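
Two of these failure modes have simple mechanical guards, sketched below: retrying a flaky tool a bounded number of times, and accepting "done" only when an external check command exits with status 0. The function names and the choice of check command are assumptions; subprocess.run and its returncode are standard library.

```python
import subprocess
import sys
import time

def with_retries(fn, attempts: int = 3, delay: float = 0.0):
    """Call fn until it succeeds or the attempt budget runs out."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"tool failed after {attempts} attempts") from last_error

def verified_done(claimed_done: bool, check_cmd: list[str],
                  timeout: float = 60) -> bool:
    """Trust the agent's 'done' only if the check command also passes."""
    if not claimed_done:
        return False
    try:
        result = subprocess.run(check_cmd, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False  # a check that cannot run counts as a failed check
    return result.returncode == 0

# Example check: a trivially passing command run with the current interpreter.
print(verified_done(True, [sys.executable, "-c", "raise SystemExit(0)"]))
```

In a coding agent the check command would typically be the project's test runner.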

Practical Design Principles

  1. Give the agent a clear, verifiable success condition (a passing test, a specific file existing, an API returning 200)
  2. Prefer reversible actions over irreversible ones — commit before major changes, don't delete without backup
  3. Hook into the agent loop for observability — log tool calls, send notifications on stop
  4. Start with supervised runs; automate after you've seen the failure modes
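
Principle 3 can be sketched as a wrapper that records every tool call and a stop hook that sends a notification. The Hooks class and its methods are hypothetical; notify is any callable that delivers a message (print, a chat webhook client, and so on).

```python
import time
from typing import Any, Callable

class Hooks:
    def __init__(self, notify: Callable[[str], None]):
        self.events: list[dict] = []
        self.notify = notify

    def logged(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Wrap a tool so every call is recorded, including failures."""
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            status = "error"  # overwritten only if the call succeeds
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                self.events.append({
                    "tool": name,
                    "status": status,
                    "seconds": time.monotonic() - start,
                })
        return wrapper

    def on_stop(self, reason: str) -> None:
        self.notify(f"agent stopped: {reason} after {len(self.events)} tool calls")

hooks = Hooks(print)
add = hooks.logged("add", lambda a, b: a + b)
add(2, 3)
hooks.on_stop("success")
```

Logging in a finally block means failed tool calls show up in the record too, which is where most of the useful debugging information is.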

Related

claude-code · multi-agent-setup · overnight-runs · mcp · extended-thinking · exe-dev

Sources