Understanding Agents
In Section 03 you ran your first tool loop. Now you go one level deeper: understanding the architectural patterns behind every real-world agent, how an agent perceives and acts on its environment, how to make reasoning visible through explicit traces, and how to decide whether a task actually needs an agent at all. The lab builds a ReAct agent with step-by-step reasoning you can read and debug.
Four Patterns — One Spectrum of Complexity
Not all agents are equal. The architecture you choose determines how the agent reasons, how much it costs to run, and what failure modes you inherit. There are four patterns in wide production use as of 2024–2026, ranging from a simple reactive loop to agents that self-critique and revise their own plans.
How an Agent Experiences Its Environment
Every agent operates within a perception-action cycle — a concept borrowed from cognitive science and robotics, applied to LLM-based systems. The agent does not "see" the world directly; it perceives a representation of the world through its context window, decides what to do, acts via tools, and receives a new observation that updates its representation.
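The cycle can be sketched in a dozen lines. Here `decide` and `act` are stubs standing in for the model call and the tool executors; this is a minimal sketch of the control flow, not a real agent:

```python
# Minimal sketch of the perception-action cycle. `decide` is a stub
# standing in for the LLM call; `act` stands in for tool execution.

def decide(perception: str) -> dict:
    # Stub policy: act once, then answer. A real agent calls the model here.
    if "OBSERVED:" in perception:
        return {"type": "answer", "content": "done: " + perception.splitlines()[-1]}
    return {"type": "tool", "name": "lookup", "input": "ReAct"}

def act(action: dict) -> str:
    # Stub environment: a real agent dispatches to a tool executor here.
    return f"{action['name']} returned a result for {action['input']}"

def run_cycle(goal: str, max_steps: int = 5) -> str:
    context = [f"GOAL: {goal}"]                 # the agent's representation of the world
    for _ in range(max_steps):
        perception = "\n".join(context)          # perceive: only what is in the context window
        action = decide(perception)              # decide: choose the next action
        if action["type"] == "answer":
            return action["content"]             # terminal action: answer the user
        observation = act(action)                # act: touch the environment via a tool
        context.append(f"OBSERVED: {observation}")  # the observation updates the representation
    return "step budget exhausted"
```

The key property to notice: the agent only ever sees `perception`, the string assembled from its accumulated context, never the environment itself.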
Five Components Every Agent Has
Whether you write a raw-SDK agent in 80 lines or use a framework like LangGraph, every production agent is composed of the same five functional components. Understanding each one lets you reason about agent behavior independently of which library you use.
| COMPONENT | WHAT IT DOES | IMPLEMENTED AS |
|---|---|---|
| System Prompt | Defines the agent's role, capabilities, constraints, output format, and available tools. The agent's "personality and rules." | The system parameter in the API call |
| Tool Registry | The list of tools available to the agent — their names, descriptions, and input schemas. The agent can only call what is registered. | The tools list in the API call |
| Message History | The accumulated conversation — user messages, assistant reasoning, tool calls, and tool results. This is the agent's working memory. | The messages list, grown each iteration |
| Loop Controller | The application code that calls the API, inspects stop_reason, routes tool calls to executors, appends results, and decides when to stop. | Your Python while / for loop |
| Tool Executors | The functions that actually run when a tool is called — hitting APIs, running code, reading files. They return a result string that goes back into the message history. | Your Python functions, dispatched by tool name |
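The table maps onto code almost one-to-one. Below is a minimal sketch with a stub `call_model` in place of `client.messages.create`, and a simplified message shape (the real Anthropic API wraps tool results inside a user message rather than using a `"tool"` role):

```python
# Compact skeleton labeling all five components. The model call is a stub;
# the message shape is simplified for illustration.

SYSTEM = "You are a research agent."                      # 1. System prompt

TOOLS = [{                                                # 2. Tool registry
    "name": "count_words",
    "description": "Counts words in a text string.",
    "input_schema": {"type": "object",
                     "properties": {"text": {"type": "string"}},
                     "required": ["text"]},
}]

EXECUTORS = {                                             # 5. Tool executors
    "count_words": lambda inp: f"Word count: {len(inp['text'].split())}",
}

def call_model(messages):
    # Stub: requests one tool call, then answers. Replace with
    # client.messages.create(model=..., system=SYSTEM, tools=TOOLS, messages=messages).
    if any(m["role"] == "tool" for m in messages):
        return {"stop_reason": "end_turn", "content": "Done."}
    return {"stop_reason": "tool_use",
            "tool": {"name": "count_words", "input": {"text": "one two three"}}}

def run(user_message: str, max_iterations: int = 5) -> str:
    messages = [{"role": "user", "content": user_message}]  # 3. Message history
    for _ in range(max_iterations):                         # 4. Loop controller
        response = call_model(messages)
        if response["stop_reason"] == "end_turn":
            return response["content"]
        tool = response["tool"]
        result = EXECUTORS[tool["name"]](tool["input"])     # dispatch by tool name
        messages.append({"role": "tool", "content": result})
    return "max iterations reached"
```

Swapping the stub for a real API call, and the simplified messages for real content blocks, turns this skeleton into the lab code later in this section.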
The Decision Framework
Agents are powerful but expensive — in tokens, latency, and error surface. Anthropic's official guidance states plainly: "use the simplest solution that works." An agent loop where a single prompt would suffice is over-engineering, not good architecture. Use this decision tree before choosing an agent.
Use an agent when:
✓ Task requires dynamic tool selection
✓ Output of one step determines the next
✓ Recovery from tool failures is needed
✓ Task may need to ask for clarification

Prefer a simpler workflow (a single prompt or a fixed chain) when:
✗ The pipeline is always the same N steps
✗ Latency is critical and you can't afford multiple API calls
✗ The task is purely generative (writing, summarizing)
✗ You don't have a way to verify tool outputs
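When the ✗ side wins, reach for a fixed pipeline instead of a loop. A sketch of the "always the same N steps" case, with a hypothetical `call_llm` helper standing in for one model call:

```python
# Fixed two-step pipeline: the same steps run in the same order every time,
# so there is nothing for an agent to decide. `call_llm` is a hypothetical
# helper; replace its body with a real single model call.

def call_llm(prompt: str) -> str:
    # Stub for illustration only.
    return f"[model output for: {prompt[:30]}...]"

def summarize_then_translate(document: str) -> str:
    # Step 1 always runs first, step 2 always second. No dynamic routing,
    # no tool selection, no recovery logic: a plain function chain is
    # cheaper, faster, and easier to test than an agent loop here.
    summary = call_llm(f"Summarize this document:\n{document}")
    return call_llm(f"Translate to French:\n{summary}")
```

Each step's prompt and output can be unit-tested in isolation, which is exactly what an agent loop makes hard.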
You Cannot Debug What You Cannot See
An agent that fails silently is worse than no agent at all. A core practice in production agent development is structured tracing — logging every iteration's inputs, tool calls, tool results, and stop reasons in a format that can be replayed and inspected. Without it, debugging a multi-step failure is nearly impossible.
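A minimal version of structured tracing is an append-only JSONL log with one record per iteration. The field names here are illustrative; the point is that every record is machine-readable, so a failed run can be replayed step by step:

```python
# Minimal structured tracer: append one JSON line per loop iteration.
# Field names are an assumption, not a standard schema.
import json
import time

def log_iteration(path: str, iteration: int, stop_reason: str,
                  tool_calls: list, tool_results: list) -> None:
    record = {
        "ts": time.time(),              # when this iteration finished
        "iteration": iteration,
        "stop_reason": stop_reason,
        "tool_calls": tool_calls,       # e.g. [{"name": ..., "input": ...}]
        "tool_results": tool_results,   # the plain strings fed back to the model
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Call it once at the bottom of each loop iteration; afterwards, `jq` or a ten-line script can answer "which step first went wrong?" without rerunning the agent.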
Verified References
Every claim in this section is grounded in one of these sources.
| Source | Type | Covers | Recency |
|---|---|---|---|
| Yao et al. — ReAct | Academic paper | ReAct pattern, Thought/Action/Observation trace, ICLR 2023 | Oct 2022 / ICLR 2023 |
| Shinn et al. — Reflexion | Academic paper | Reflection pattern, self-critique, verbal reinforcement | 2023 |
| Lilian Weng — LLM Powered Autonomous Agents | Blog / Survey | Architecture patterns, perception-action, agent taxonomy | June 2023 |
| Anthropic — Tool Use & Agent Docs | Official docs | Tool loop mechanics, stop_reason, agent design guidance | Maintained 2024–2026 |
| LangGraph Documentation | Official docs | Graph-based agent architecture, state management | Maintained 2024–2026 |
Build a ReAct Agent with Explicit Reasoning Traces
You will build a research agent that uses the ReAct pattern explicitly — printing Thought, Action, and Observation at every step so you can watch the agent reason in real time. It has two tools: a simulated web search and a word counter. The goal is to understand the loop deeply enough that you can trace any agent failure back to a specific step.
If you already have the agent-lab directory from Section 03, just create a new file in it. Otherwise, set up fresh.
```shell
mkdir -p agent-lab && cd agent-lab
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install anthropic python-dotenv
echo "ANTHROPIC_API_KEY=your_key_here" > .env
```
Get your API key from console.anthropic.com → API Keys.
Create react_agent.py. The system prompt explicitly instructs the model to reason step by step before every action — this is what makes it a ReAct agent rather than a simple reactive loop.
```python
import os
import anthropic
from dotenv import load_dotenv

load_dotenv()
client = anthropic.Anthropic()

# The system prompt enforces ReAct: think before every action
SYSTEM = """You are a research agent that follows the ReAct pattern strictly.
Before every tool call, write a Thought explaining your reasoning.
After receiving a tool result, write an Observation summarising what you learned.
Keep thinking and acting until you have enough information to give a complete answer.
When you are done, write your final answer clearly."""

# Two tools: simulated search and a word counter
TOOLS = [
    {
        "name": "web_search",
        "description": (
            "Search the web for information on a topic. "
            "Returns a short summary of the top result. "
            "Use this when you need factual information you don't know."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "count_words",
        "description": "Counts the number of words in a given text string.",
        "input_schema": {
            "type": "object",
            "properties": {
                "text": {
                    "type": "string",
                    "description": "The text to count words in"
                }
            },
            "required": ["text"]
        }
    }
]
```
The web search is simulated with a static lookup — in a real agent you would call a search API like Brave or Serper. The word counter is exact Python logic. Notice the pattern: both return a plain string to pass back to the model.
```python
# Simulated search DB -- swap for a real search API in production
SEARCH_DB = {
    "ReAct paper": (
        "ReAct: Synergizing Reasoning and Acting in Language Models. "
        "Yao et al., arXiv:2210.03629, published at ICLR 2023. "
        "Interleaves chain-of-thought reasoning with action steps."
    ),
    "LangGraph": (
        "LangGraph is a library for building stateful, multi-actor applications with LLMs. "
        "Built by LangChain. Uses a graph of nodes and edges to model agent workflows. "
        "Supports cycles, branching, and human-in-the-loop patterns."
    ),
    "Anthropic": (
        "Anthropic is an AI safety company founded in 2021. "
        "Creators of the Claude model family. Developed Constitutional AI (CAI). "
        "Focused on AI safety research and interpretability."
    ),
}

def web_search(query: str) -> str:
    for key, result in SEARCH_DB.items():
        if key.lower() in query.lower():
            return result
    return f'No results found for "{query}". Try a different query.'

def count_words(text: str) -> str:
    count = len(text.split())
    return f"Word count: {count}"

def execute_tool(name: str, tool_input: dict) -> str:
    if name == "web_search":
        return web_search(tool_input["query"])
    if name == "count_words":
        return count_words(tool_input["text"])
    return f"Unknown tool: {name}"
```
This loop is similar to Section 03's, but it prints the full reasoning trace — every Thought block and every tool call — so you can see the agent's reasoning in real time. Study how the token budget accumulates.
```python
def run_react_agent(user_message: str, max_iterations: int = 8) -> str:
    messages = [{"role": "user", "content": user_message}]
    total_input_tokens = 0

    print(f"\n{'='*60}")
    print(f"USER: {user_message}")
    print(f"{'='*60}")

    for iteration in range(max_iterations):
        print(f"\n[iteration {iteration + 1}]")

        response = client.messages.create(
            model="claude-opus-4-6",  # check docs.anthropic.com for current models
            max_tokens=1024,
            system=SYSTEM,
            tools=TOOLS,
            messages=messages
        )
        total_input_tokens += response.usage.input_tokens
        print(f"  stop_reason  : {response.stop_reason}")
        print(f"  input_tokens : {response.usage.input_tokens} (total: {total_input_tokens})")

        # Print the full reasoning trace from this iteration
        for block in response.content:
            if hasattr(block, "text") and block.text.strip():
                print(f"\n  THOUGHT/TEXT:\n  {block.text.strip()}")
            elif block.type == "tool_use":
                print(f"\n  ACTION: {block.name}({block.input})")

        # Append assistant turn to history
        messages.append({"role": "assistant", "content": response.content})

        # Done
        if response.stop_reason == "end_turn":
            for block in response.content:
                if hasattr(block, "text"):
                    print(f"\n{'='*60}")
                    print(f"FINAL ANSWER:\n{block.text}")
                    print(f"{'='*60}")
                    print(f"Total input tokens used: {total_input_tokens}")
                    return block.text
            return "(end_turn with no text)"

        # Execute tool calls, collect observations
        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    print(f"  OBSERVATION: {result}")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
            messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached."

if __name__ == "__main__":
    run_react_agent(
        "Search for the ReAct paper, then count how many words "
        "are in the result you find."
    )
```
```shell
python react_agent.py
```

You should see a trace like the one below. Exact wording and token counts will vary by model and run; the simulated search result is 22 words, so the counter reports 22.

```
============================================================
USER: Search for the ReAct paper, then count how many words are in the result you find.
============================================================

[iteration 1]
  stop_reason  : tool_use
  input_tokens : 612 (total: 612)

  THOUGHT/TEXT:
  I'll search for the ReAct paper first.

  ACTION: web_search({'query': 'ReAct paper'})
  OBSERVATION: ReAct: Synergizing Reasoning and Acting in Language Models. Yao et al., arXiv:2210.03629, published at ICLR 2023. Interleaves chain-of-thought reasoning with action steps.

[iteration 2]
  stop_reason  : tool_use
  input_tokens : 743 (total: 1355)

  THOUGHT/TEXT:
  I found the ReAct paper. Now I'll count the words in that result.

  ACTION: count_words({'text': 'ReAct: Synergizing Reasoning ...'})
  OBSERVATION: Word count: 22

[iteration 3]
  stop_reason  : end_turn
  input_tokens : 821 (total: 2176)

  THOUGHT/TEXT:
  I have all the information needed to answer.

============================================================
FINAL ANSWER:
The ReAct paper (Yao et al., arXiv:2210.03629, ICLR 2023) describes a pattern
that interleaves chain-of-thought reasoning with action steps. The search result
summary contains 22 words.
============================================================
Total input tokens used: 2176
```
Add a get_current_date tool that returns today's date. Then run the agent with a query the search DB doesn't have — observe how it handles the "No results found" observation and either retries with a different query or gracefully reports the failure. This is how you discover your agent's error recovery behavior before production.
Add the tool definition to TOOLS:

```python
{
    "name": "get_current_date",
    "description": "Returns today's date in ISO 8601 format (YYYY-MM-DD).",
    "input_schema": {"type": "object", "properties": {}}
}
```
Then add the executor branch:

```python
from datetime import date

# in execute_tool():
if name == "get_current_date":
    return str(date.today())
```
```python
run_react_agent("What is today's date, and search for 'quantum computing'.")
```
Watch how the agent handles the "No results found" response — does it give up, retry with a different query, or tell the user it couldn't find anything? This reveals whether your system prompt provides adequate guidance for failure recovery.
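If the agent gives up too easily, one fix is to spell out recovery behavior in the system prompt. The wording below is illustrative, not official Anthropic guidance:

```python
# Illustrative recovery guidance to append to the SYSTEM prompt in
# react_agent.py. The exact wording is an assumption; tune it against
# your own failure traces.

RECOVERY_GUIDANCE = """
If a tool returns an error or "No results found":
1. Write a Thought analysing why the call probably failed.
2. Retry at most twice with a reworded query.
3. If it still fails, tell the user plainly what you could not find.
Never invent facts to cover for a failed tool call."""

# In react_agent.py:
# SYSTEM = SYSTEM + RECOVERY_GUIDANCE
```

Rerun the failing query after adding this and compare the traces; the difference between "gives up" and "retries twice, then reports honestly" is usually a few lines of prompt, not new code.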