How the Claude Code team designs agent tools

(Part of my Today I Learned series. Short posts on things that made me think.)

When Claude Code shipped, the team gave their agent a todo list tool to keep it on track. It worked perfectly. Then models got smarter and the tool became the constraint. System reminders "made Claude think that it had to stick to the list instead of modifying it." The tool built to help the agent plan was now preventing it from planning. So they ripped it out and replaced it with something designed for agents to coordinate with each other, not babysit themselves.

But the deeper insight isn't about swapping one tool for another. When the team wanted to expand what Claude could do, they often didn't add new tools at all. They used progressive disclosure: skill files that reference other files, letting the agent recursively discover its own context across several layers. More capability, zero new tools added. The CodeAct paper, co-authored by researchers at UIUC and Apple, points the same way: letting the agent act by writing executable code, instead of calling tools through JSON or text formats, raised success rates by up to 20% on complex tasks. Fewer, more expressive tools keep beating a long menu of narrow ones.
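To make progressive disclosure concrete, here is a minimal sketch of the idea, not how Claude Code actually implements skills. Everything in it is an assumption for illustration: the file names, the `[file.md]` reference convention, and the `read_file`, `references`, and `layers` helpers are all made up. The point is that one generic file-reading tool is enough for an agent to reach deeper context, one layer at a time.

```python
import re

# Hypothetical in-memory "skills" directory; names and contents are invented.
SKILL_FILES = {
    "SKILL.md": "Deploy skill. For rollbacks see [rollback.md]. For config see [config.md].",
    "rollback.md": "Rollback steps. Database caveats are in [db-notes.md].",
    "config.md": "Config keys and defaults.",
    "db-notes.md": "Run migrations in reverse order.",
}

# Assumed convention: a file points at another file as [name.md].
REF_PATTERN = re.compile(r"\[([\w.-]+\.md)\]")


def read_file(path: str) -> str:
    """The single generic tool the agent already has."""
    return SKILL_FILES[path]


def references(text: str) -> list[str]:
    """Return the file names a piece of text points at, in order."""
    return REF_PATTERN.findall(text)


def layers(entry: str, max_depth: int = 3) -> list[list[str]]:
    """Breadth-first walk: which files become discoverable at each layer."""
    seen = {entry}
    frontier = [entry]
    result = [frontier]
    for _ in range(max_depth):
        next_frontier = []
        for path in frontier:
            for ref in references(read_file(path)):
                if ref not in seen and ref in SKILL_FILES:
                    seen.add(ref)
                    next_frontier.append(ref)
        if not next_frontier:
            break
        result.append(next_frontier)
        frontier = next_frontier
    return result


if __name__ == "__main__":
    for depth, files in enumerate(layers("SKILL.md")):
        print(depth, files)
```

Only `SKILL.md` needs to sit in the agent's context up front. A real agent would not walk the whole tree like `layers` does; it would follow a reference only when the task calls for it. The walk just shows how much is reachable without adding a single new tool.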

Most agent builders reach for more tools when something breaks. The Claude Code team's experience suggests the opposite. Your tooling has to evolve with the model, and sometimes evolution means subtraction. Designing an agent's action space, as Thariq puts it, "is as much an art as it is a science." The only reliable method is reading the model's outputs and watching where it struggles. You have to learn to see like an agent. Anthropic's own Building Effective Agents guide reinforces this: the most successful implementations use simple, composable patterns rather than complex frameworks.

👉🏽 https://x.com/trq212/status/2027463795355095314
