Tool Use

3 min · concept

A structured mechanism for language models to call external functions or APIs during inference. The model does not execute code itself; it emits a structured call, the host application executes it, and the result is fed back into the conversation. The loop repeats until the model produces a final response without calling a tool.

How It Works

The caller provides a tools array in the API request, each tool described by a JSON schema (name, description, input parameters).
The model decides whether to call a tool. If yes, it returns a response with stop_reason: "tool_use" and one or more tool_use content blocks containing the tool name and input arguments.
The caller executes the tool and sends back a user message containing a tool_result block, matched to the original call by its tool_use_id.
The model continues from there, potentially calling another tool or producing a final answer.

This loop is the foundation of agentic-workflows.
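The steps above can be sketched as a small driver loop. This is a minimal illustration, not an official client: `run_tool_loop`, `fake_model`, the `max_turns` guard, and the `toolu_demo` id are all hypothetical names introduced here, and `fake_model` is a scripted stand-in for a real API call so the example runs offline. Responses are modeled as plain dicts with the `stop_reason` and `content` fields described above.

```python
from typing import Any, Callable

Message = dict[str, Any]


def run_tool_loop(
    create_message: Callable[[list[Message]], Message],
    execute_tool: Callable[[str, dict[str, Any]], str],
    user_prompt: str,
    max_turns: int = 10,  # hypothetical safety cap, not an API parameter
) -> Message:
    """Repeat call -> execute -> return until the model stops requesting tools."""
    messages: list[Message] = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        response = create_message(messages)
        # Append the assistant turn so the next request carries full context.
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] != "tool_use":
            return response  # final answer: no tool was requested
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": execute_tool(block["name"], block["input"]),
            }
            for block in response["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"no final answer after {max_turns} turns")


def fake_model(messages: list[Message]) -> Message:
    """Scripted stand-in for the API: requests weather once, then answers."""
    last = messages[-1]["content"]
    if isinstance(last, str):  # first turn: plain user prompt
        return {
            "stop_reason": "tool_use",
            "content": [{
                "type": "tool_use",
                "id": "toolu_demo",
                "name": "get_weather",
                "input": {"location": "San Francisco, CA"},
            }],
        }
    # Second turn: the last message holds the tool_result blocks.
    return {
        "stop_reason": "end_turn",
        "content": [{"type": "text", "text": "It is " + last[0]["content"]}],
    }


final = run_tool_loop(fake_model, lambda name, args: "72°F, sunny", "Weather in SF?")
print(final["content"][0]["text"])
```

In real use, `create_message` would wrap the actual API request, with the tools array attached. The turn cap is one way to keep a misbehaving loop from running indefinitely.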

Anthropic API Format

tools = [
    {
        "name": "get_weather",
        "description": "Returns current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
]

# Content block in the model's response when it decides to call the tool:
# {
#   "type": "tool_use",
#   "id": "toolu_01...",
#   "name": "get_weather",
#   "input": {"location": "San Francisco, CA"}
# }

# Caller sends back:
# {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_01...", "content": "72°F, sunny"}]}
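On the caller's side, turning a tool_use block into the matching tool_result message might look like the sketch below. The `HANDLERS` registry, the stubbed `get_weather` function, and `tool_result_message` are hypothetical helpers, not part of any SDK. The `is_error` flag is used here to report a failed call back to the model rather than raising.

```python
from typing import Any, Callable


def get_weather(location: str) -> str:
    """Hypothetical stub; a real handler would query a weather service."""
    return "72°F, sunny"


# Maps tool names from the tools array to local Python callables.
HANDLERS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def tool_result_message(block: dict[str, Any]) -> dict[str, Any]:
    """Run the handler named in a tool_use block and wrap its output as a user turn."""
    result: dict[str, Any] = {"type": "tool_result", "tool_use_id": block["id"]}
    try:
        handler = HANDLERS[block["name"]]
        result["content"] = handler(**block["input"])
    except Exception as exc:
        # Surface the failure to the model instead of crashing the loop.
        result["content"] = f"{type(exc).__name__}: {exc}"
        result["is_error"] = True
    return {"role": "user", "content": [result]}


message = tool_result_message({
    "type": "tool_use",
    "id": "toolu_demo",  # placeholder id for illustration
    "name": "get_weather",
    "input": {"location": "San Francisco, CA"},
})
print(message)
```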

Client vs. Server Tools

Client tools — defined by the developer, executed in the caller's application. Claude responds with tool_use blocks; caller handles execution.

Server tools — provided by Anthropic (e.g., web_search, code_execution). Execution happens on Anthropic infrastructure; results appear in the response without the caller writing any execution logic.

Parallel Tool Calls

Claude can emit multiple tool_use blocks in a single response, allowing parallel calls in one turn; OpenAI models offer an analogous capability under their own tool-call format. The caller can execute the calls concurrently and must return a tool_result for each one, together in a single user message, before the model continues. This reduces round-trips in tasks that require multiple independent lookups.
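A caller might fan the calls out with a thread pool, as in this sketch. `run_tools_concurrently` and the two stub handlers are hypothetical. Threads suit I/O-bound lookups, and `pool.map` preserves input order, so each result lines up with its originating block.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_tools_concurrently(
    content: list[dict[str, Any]],
    handlers: dict[str, Callable[..., str]],
) -> dict[str, Any]:
    """Execute every tool_use block in one assistant turn; return one user message."""
    calls = [block for block in content if block["type"] == "tool_use"]
    if not calls:
        raise ValueError("response contains no tool_use blocks")
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        # map() yields results in the same order as `calls`.
        outputs = list(pool.map(lambda b: handlers[b["name"]](**b["input"]), calls))
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": block["id"], "content": output}
            for block, output in zip(calls, outputs)
        ],
    }


# Hypothetical handlers and a hand-written assistant turn for illustration.
demo_handlers = {
    "get_weather": lambda location: f"weather for {location}",
    "get_time": lambda timezone: f"time in {timezone}",
}
demo_content = [
    {"type": "text", "text": "Looking both up."},
    {"type": "tool_use", "id": "call_a", "name": "get_weather", "input": {"location": "Paris"}},
    {"type": "tool_use", "id": "call_b", "name": "get_time", "input": {"timezone": "UTC"}},
]
reply = run_tools_concurrently(demo_content, demo_handlers)
print(reply)
```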

Strict Mode

Adding strict: true to a tool definition constrains the inputs the model generates for that tool to match its JSON schema exactly. This is useful in production, where it guarantees parseable, schema-compliant tool calls.
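The sketch below shows the flag on a tool definition, plus a defensive check a caller might run when strict mode is not in use. `check_tool_input` is a hypothetical helper that only handles flat schemas with simple `type` keywords; it is not a full JSON Schema validator.

```python
from typing import Any

weather_tool = {
    "name": "get_weather",
    "description": "Returns current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

# Same definition with the strict flag added, as described above.
strict_weather_tool = {**weather_tool, "strict": True}

# Python types accepted for each JSON Schema primitive type name.
JSON_TYPES = {
    "string": str, "number": (int, float), "integer": int,
    "boolean": bool, "object": dict, "array": list,
}


def check_tool_input(tool: dict[str, Any], tool_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    schema = tool["input_schema"]
    properties = schema.get("properties", {})
    problems = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in tool_input
    ]
    for name, value in tool_input.items():
        if name not in properties:
            problems.append(f"unexpected field: {name}")
            continue
        declared = properties[name].get("type")
        expected = JSON_TYPES.get(declared) if isinstance(declared, str) else None
        if expected is None:
            continue  # unsupported or missing type keyword: skip the check
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_stray_bool = isinstance(value, bool) and declared != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            problems.append(f"{name}: expected {declared}, got {type(value).__name__}")
    return problems


print(check_tool_input(weather_tool, {"location": "Paris"}))
print(check_tool_input(weather_tool, {"city": "Paris"}))
```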

Tool Choice

The tool_choice parameter controls when the model calls tools:

{"type": "auto"} — model decides (default)
{"type": "any"} — model must call at least one tool
{"type": "tool", "name": "..."} — forces a specific tool
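A small helper can keep these three shapes from being mistyped at call sites. `make_tool_choice` is a hypothetical convenience function, not an SDK call; it only builds the dicts listed above.

```python
from typing import Optional


def make_tool_choice(mode: str = "auto", name: Optional[str] = None) -> dict[str, str]:
    """Build a tool_choice value for one of the three modes listed above."""
    if mode in ("auto", "any"):
        if name is not None:
            raise ValueError(f"mode {mode!r} does not take a tool name")
        return {"type": mode}
    if mode == "tool":
        if not name:
            raise ValueError("mode 'tool' requires the name of the tool to force")
        return {"type": "tool", "name": name}
    raise ValueError(f"unknown tool_choice mode: {mode!r}")


print(make_tool_choice())                      # model decides
print(make_tool_choice("any"))                 # must call some tool
print(make_tool_choice("tool", "get_weather")) # must call get_weather
```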

Connection to MCP

mcp (Model Context Protocol) standardizes how tool definitions are shared across frameworks, hosts, and clients. Rather than defining tools inline in every request, an MCP server exposes a set of tools that any MCP-compatible client can discover and use. Tool use is the underlying mechanism; MCP is the standardization layer on top.
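As an illustration of that layering, the sketch below maps a tool description in the shape an MCP server lists (`name`, optional `description`, `inputSchema`) onto the inline format shown earlier (`input_schema`). The MCP field names reflect the protocol's tool listing as understood here and should be checked against the specification. `mcp_tool_to_api_tool` and the sample tool are hypothetical.

```python
from typing import Any


def mcp_tool_to_api_tool(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    """Convert one MCP-style tool description to the inline tools-array format."""
    return {
        "name": mcp_tool["name"],
        # Description is treated as optional on the MCP side.
        "description": mcp_tool.get("description", ""),
        # Same JSON Schema payload, different key spelling.
        "input_schema": mcp_tool["inputSchema"],
    }


# Hand-written sample standing in for one entry of a server's tool listing.
listed_tool = {
    "name": "get_weather",
    "description": "Returns current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

api_tool = mcp_tool_to_api_tool(listed_tool)
print(api_tool)
```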

Agentic Loops

Tool use + re-prompting = agentic-workflows. Each tool call is one step in a multi-turn loop. With extended-thinking, the model can reason about which tools to call and in what order before committing to the first action, reducing cascading errors in long tasks.

Cost

Tool definitions (names, descriptions, schemas) count as input tokens. tool_use and tool_result blocks also count. On Anthropic's API, client-side tools carry no additional per-call charge beyond the token cost; server-side tools like web search incur usage-based fees on top of token costs.
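The arithmetic can be sketched as follows. Every price in the example is a made-up placeholder, not an Anthropic rate, and `estimate_request_cost` is a hypothetical helper. The input token count is assumed to already include tool definitions and any tool_use and tool_result blocks.

```python
def estimate_request_cost(
    input_tokens: int,
    output_tokens: int,
    usd_per_mtok_in: float,
    usd_per_mtok_out: float,
    server_tool_calls: int = 0,
    usd_per_server_call: float = 0.0,
) -> float:
    """Token cost plus any usage-based fees for server-side tool calls."""
    token_cost = (
        input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out
    ) / 1_000_000
    return token_cost + server_tool_calls * usd_per_server_call


# Client tools only: tokens are the whole bill (placeholder prices).
client_only = estimate_request_cost(2_000, 500, usd_per_mtok_in=1.0, usd_per_mtok_out=4.0)

# Same request plus three server-side tool calls at a placeholder fee.
with_server_tools = estimate_request_cost(
    2_000, 500, usd_per_mtok_in=1.0, usd_per_mtok_out=4.0,
    server_tool_calls=3, usd_per_server_call=0.01,
)
print(f"{client_only:.4f} {with_server_tools:.4f}")
```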

Related

mcp · agentic-workflows · extended-thinking · prompt-caching
