Tool Use is the capability by which a large language model produces a structured, machine-readable request to invoke an external function or API, receives the result, and incorporates that result into its ongoing generation. The model does not execute anything itself; it emits a JSON payload naming a tool and its arguments, and the surrounding application—the harness—runs the actual code, database query, or HTTP call. Results are fed back into the context window as a new message, and the model continues, possibly issuing further calls. This closes the gap between a model's frozen training data and live systems: current prices, a customer record, a calculator, a code interpreter. Tool Use is the primitive that turns a text predictor into something that can act. Repeat it in a loop—call, observe, decide, call again—and you have the mechanical core of every agent, from a single retrieval step to a multi-turn coding assistant that reads files, runs tests, and edits source.
How it works
The developer supplies tool definitions—name, description, and a JSON Schema for the arguments—alongside the prompt. When the model decides a tool is needed, it stops generating prose and instead outputs a structured call matching that schema, which the API surfaces as a distinct response type rather than free text. The application parses the call, executes the corresponding code, and returns the output as a tool-result message appended to the conversation. The model reads that result and either answers the user or issues another call. The schema constrains the arguments, but the model still chooses which tool and when.
Why it matters for AI engineers
Every tool call is a round trip: added latency, added tokens, and a new failure surface where the model can hallucinate arguments or pick the wrong tool. Costs compound in agentic loops, since each observation is re-sent through the context window on the next turn—prompt caching and tight tool outputs matter. Reliability hinges on validating arguments before execution and handling tool errors gracefully rather than trusting the model's output. Security is acute: a tool that reads untrusted content can carry prompt-injection payloads that hijack subsequent calls, so sandboxing and least-privilege tool scopes are not optional. Fewer, well-described tools generally outperform a sprawling catalog.
Tool Use vs. alternatives
| Concept | What it is | Scope |
|---|---|---|
| Tool Use | Model emitting calls and consuming results | The capability/pattern |
| Function Calling | The API mechanism that surfaces the call | The plumbing |
| Agentic Loop | Repeated tool use with a stopping condition | The control flow |
| MCP | Standard protocol for exposing tools | The integration layer |
Related terms
Definitions are the start. Ask the Research Desk for a cited, multi-source brief on Tool Use — real sources, verified claims, delivered in minutes.
Ask the Research Desk →