
Redesigning Statamic MCP: Fewer Tools, Better AI Reasoning

Version 1 exposed 140+ individual tools. Version 2 consolidates them into 11 router-based tools. Here is why fewer tools mean better AI performance.

· 7 min read · Sylvester Damgaard

When I released Statamic MCP v1, it had 140+ individual tools. One for creating an entry, one for updating, one for deleting, one for listing, one for getting a single entry, and the same pattern repeated for blueprints, terms, globals, structures, assets, users, and forms. Each tool had its own parameter schema, description, and validation logic.

It worked. AI assistants could manage Statamic content. But something was off. Claude and Cursor would sometimes pick the wrong tool, confuse update_entry with patch_entry_fields, or fail to discover that a tool existed at all. The more tools I added, the worse the problem got.

The tool selection problem

LLMs choose tools by reading descriptions and matching them to the user's intent. When you have 10 tools, this works well. The model reads all 10 descriptions, picks the best match, and moves on. When you have 140+ tools, the model is reading thousands of tokens of tool descriptions before it even starts thinking about the task.

This has two effects. First, the model spends context window capacity on tool descriptions instead of reasoning. Second, the probability of selecting the correct tool decreases as the number of similar tools increases. create_entry, make_entry, store_entry, and add_entry might all sound plausible to an LLM that's pattern-matching on natural language.
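Those two effects can be made concrete with some back-of-the-envelope arithmetic. This is a hedged sketch: the per-description token count below is an illustrative assumption, not a measured value from the addon.

```python
# Rough model of the fixed context cost of tool descriptions.
# tokens_per_description is an assumed average, not a measured figure.

def schema_tokens(num_tools: int, tokens_per_description: int = 60) -> int:
    """Approximate tokens spent on tool schemas in every request."""
    return num_tools * tokens_per_description

v1 = schema_tokens(140)  # one tool per CRUD verb per domain
v2 = schema_tokens(11)   # one router tool per domain

print(f"v1 schema overhead: ~{v1} tokens per request")
print(f"v2 schema overhead: ~{v2} tokens per request")
print(f"saved per round trip: ~{v1 - v2} tokens")
```

Even with a modest average description length, the v1 catalog costs thousands of tokens before the model reads a single word of the user's request.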

The router architecture

Version 2 consolidates everything into 11 domain routers: statamic-entries, statamic-blueprints, statamic-terms, statamic-globals, statamic-structures, statamic-assets, statamic-users, statamic-system, statamic-content-facade, statamic-system-discover, and statamic-system-schema.

Instead of calling create_entry(collection: 'blog', data: {...}), the AI calls entries(action: 'create', collection: 'blog', data: {...}). The action is a parameter, not a separate tool.

This means the LLM only needs to understand 11 tools instead of 140+. Each tool description is short and domain-focused. The model picks the right domain router almost every time, and then specifying the action within that domain is straightforward.
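The router pattern itself is simple to sketch. The addon is PHP, so this Python version is only illustrative: the class and handler names are hypothetical, but the shape — one tool per domain, with the action dispatched from a parameter — matches the call style described above.

```python
# Minimal sketch of a domain router: one tool, action as a parameter.
# EntriesRouter and its handlers are illustrative names, not the addon's API.

class EntriesRouter:
    def __init__(self):
        self._actions = {}

    def action(self, name):
        """Decorator that registers a handler for an action name."""
        def register(fn):
            self._actions[name] = fn
            return fn
        return register

    def __call__(self, action: str, **params):
        handler = self._actions.get(action)
        if handler is None:
            # One error path covers every unknown verb in the domain.
            raise ValueError(
                f"unknown action '{action}'; valid: {sorted(self._actions)}"
            )
        return handler(**params)

entries = EntriesRouter()

@entries.action("create")
def create(collection: str, data: dict):
    return {"created_in": collection, **data}

# entries(action='create', ...) replaces the old create_entry(...) tool.
result = entries(action="create", collection="blog", data={"title": "Hello"})
print(result)
```

Adding a new action extends one tool's enum of verbs instead of adding another entry to the tool catalog, which is why the schema overhead stays flat as features grow.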

Measuring the improvement

I tested both versions against a suite of 50 common Statamic operations using Claude Sonnet. The results were clear:

  • Tool selection accuracy: 78% (v1) vs 96% (v2)

  • Average tokens per operation: 2,400 (v1) vs 1,100 (v2)

  • First-attempt success rate: 65% (v1) vs 89% (v2)

The token reduction alone is significant. Every MCP request includes the full tool schema, and sending 11 tool descriptions instead of 140+ saves over a thousand tokens per round trip.

Intent-based discovery

The system-discover router is new in v2. Instead of the AI guessing which tool handles a particular operation, it can ask: "I want to change the fields on a blueprint. What should I use?" The discover router returns the exact tool, action, and parameter schema needed.

This is particularly useful for operations that span multiple domains. Reordering a navigation tree involves the structures router, but an AI might first look at entries or blueprints. The discovery router shortcuts this by mapping intents to tools.
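The shape of that intent-to-tool mapping can be sketched as follows. The matching logic below (keyword overlap) and the mapping table are my own illustrative assumptions; the post doesn't document how system-discover actually matches intents, only what it returns.

```python
# Hedged sketch of intent-based discovery: map free-text intent to a
# (router, action) pair by keyword overlap. The keyword sets and actions
# here are illustrative, not the addon's real mapping.

INTENT_MAP = [
    ({"blueprint", "fields"}, ("statamic-blueprints", "update")),
    ({"navigation", "reorder", "tree"}, ("statamic-structures", "reorder")),
    ({"entry", "create"}, ("statamic-entries", "create")),
]

def discover(intent: str):
    words = set(intent.lower().split())
    best = max(INTENT_MAP, key=lambda item: len(item[0] & words))
    keywords, (tool, action) = best
    if not keywords & words:
        return None  # no mapping matched this intent
    return {"tool": tool, "action": action}

print(discover("I want to change the fields on a blueprint"))
# e.g. {'tool': 'statamic-blueprints', 'action': 'update'}
```

The payoff is in the cross-domain cases: "reorder the navigation tree" resolves directly to the structures router, so the model never wanders through entries or blueprints first.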

Migration from v1

All individual v1 tool names (statamic.entries.create, etc.) are replaced by the router format in v2. Legacy authentication methods (Basic Auth and base64 Bearer tokens) still work but log deprecation warnings.
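The translation from v1 names to v2 calls is mechanical, since the dotted v1 names already encode the domain and action. A sketch of that mapping, assuming every v1 name follows the statamic.domain.action pattern shown above:

```python
# Sketch of the v1 -> v2 call translation. Assumes the dotted
# statamic.<domain>.<action> naming from the post holds for all v1 tools.

def translate(v1_tool: str):
    """Turn 'statamic.entries.create' into ('statamic-entries', 'create')."""
    _, domain, action = v1_tool.split(".")
    return f"statamic-{domain}", action

tool, action = translate("statamic.entries.create")
print(tool, action)  # statamic-entries create
```

In other words, a v1 call like statamic.entries.create becomes a call to the statamic-entries router with action: 'create' and otherwise unchanged parameters.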

The upgrade path is straightforward: update the package, re-publish the config, and regenerate your AI client configuration from the CP dashboard. Token usage drops immediately, and tool selection accuracy improves without any changes to your prompts or workflows. See the v2 release post for the full changelog.