MCP fragmentation, context efficiency, and the rise of curated skills

  • ai
  • api
  • architecture
  • composable
  • devops
  • performance
  • buzzwords

We're in the middle of something that feels uncomfortably familiar. The Model Context Protocol (MCP) promised to be the universal interface for connecting AI models to external tools and data sources. A year into its existence, we're watching the pattern repeat: different vendors building different approaches to solve the same problems, each claiming theirs is the "right" way forward.

I've spent months building production-grade MCP implementations: server and client side, SSE streaming and STDIO transports, experiments with tool counts and context window consumption. I've created highly contextualized MCPs that dynamically reshape their tool descriptions based on connected data. I've built testing frameworks to compare implementations across time, tokens, and context usage. What I've learned: we're fragmenting fast, and nobody seems worried enough about it.

The context window problem

Let's start with the elephant in the room: MCP tools are context-hungry. Every tool you expose comes with descriptions, schemas, parameters, and examples. Add ten tools and you've consumed thousands of tokens before the model even starts reasoning. Add fifty tools and you're facing a choice: either dramatically increase your context window costs or start making hard decisions about which capabilities to expose.
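
To make that overhead concrete, here's a rough back-of-the-envelope sketch. The tool definitions and the characters-per-token heuristic are illustrative assumptions, not output from a real MCP server, but the shape of the problem is the same: the catalog gets paid for on every request, before the model reasons at all.

```python
# Rough estimate of how much context a tool catalog consumes up front.
# Both the tool definitions and the ~4-characters-per-token heuristic
# are illustrative assumptions, not real MCP server output.
import json

def estimate_tokens(text: str) -> int:
    return len(text) // 4  # crude heuristic for English/JSON payloads

tools = [
    {
        "name": f"tool_{i}",
        "description": "Fetches, filters, and formats records from an upstream API. "
                       "Supports pagination, field selection, and locale fallbacks.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search expression"},
                "limit": {"type": "integer", "description": "Max results to return"},
                "locale": {"type": "string", "description": "Locale code, e.g. en-us"},
            },
            "required": ["query"],
        },
    }
    for i in range(50)
]

catalog_tokens = estimate_tokens(json.dumps(tools))
print(f"{len(tools)} tools ~ {catalog_tokens} tokens before the first user message")
```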

I've built interfaces where users authenticate via OAuth and select only the tools they need, creating lean MCP instances optimized for specific tasks. The performance difference is dramatic. A focused toolset with five carefully chosen tools outperforms a kitchen-sink approach with fifty tools every single time: faster responses, lower costs, better accuracy.
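
The lean-instance idea is simple to sketch. Assuming the Python MCP SDK's FastMCP server, with placeholder tool bodies and a placeholder selection list, the point is just that unselected tools never contribute a schema to the context window:

```python
# Sketch of a "lean" MCP instance: after the user authenticates and picks
# tools, only those handlers are registered, so only their schemas ever
# reach the model. Tool bodies and the selection list are placeholders.
from mcp.server.fastmcp import FastMCP

def search_entries(query: str) -> str:
    """Search content entries matching a query."""
    return f"results for {query!r}"  # placeholder implementation

def publish_entry(uid: str) -> str:
    """Publish a single entry by UID."""
    return f"published {uid}"  # placeholder implementation

def delete_entry(uid: str) -> str:
    """Delete a single entry by UID."""
    return f"deleted {uid}"  # placeholder implementation

CATALOG = {fn.__name__: fn for fn in (search_entries, publish_entry, delete_entry)}

def build_lean_server(selected: list[str]) -> FastMCP:
    server = FastMCP("lean-mcp")
    for name in selected:
        server.add_tool(CATALOG[name])  # unselected tools cost zero tokens
    return server

if __name__ == "__main__":
    # A focused instance instead of the full kitchen-sink catalog.
    build_lean_server(["search_entries", "publish_entry"]).run()
```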

But here's the thing: this optimization shouldn't be the developer's job. We're solving a protocol design problem with application-layer workarounds.

Claude Skills vs MCP: efficiency through curation

This is where Claude Skills start to look interesting. Instead of exposing every possible MCP tool and letting the model sort it out, Skills provide a curated, context-efficient abstraction layer. The skill documentation lives in a separate, structured space. The model loads only what it needs when it needs it, rather than carrying the full tool catalog in every single request.
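
My mental model of why that works is sketched below: keep only a one-line summary per skill resident, and pull the full instructions from disk when a skill is actually used. The SKILL.md layout (frontmatter with a name and description, followed by a markdown body) follows the published Skills format; the loader itself is just an illustration, not Anthropic's implementation.

```python
# Illustrative loader for the curation idea behind Skills: summaries stay
# resident, full instructions load on demand. Not Anthropic's code.
from pathlib import Path

def read_frontmatter(path: Path) -> dict:
    """Parse simple key: value frontmatter (name, description) from SKILL.md."""
    meta = {}
    lines = path.read_text().splitlines()
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

class SkillIndex:
    def __init__(self, root: Path):
        self.root = root
        # Only name + description per skill is carried in every request.
        self.summaries = {
            p.parent.name: read_frontmatter(p) for p in root.glob("*/SKILL.md")
        }

    def load(self, skill_name: str) -> str:
        # The full instruction body enters the context only when invoked.
        return (self.root / skill_name / "SKILL.md").read_text()
```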

From a cost and performance perspective, this matters enormously. Cheaper models, the ones we'd actually want to use at scale, struggle with complex tool selection across large catalogs. They burn reasoning tokens trying to figure out which tools to use. A well-structured Skill can guide a less capable model to success with dramatically fewer tokens.

Is this the right long-term approach? I honestly don't know. But it's solving a real problem that raw MCP implementations struggle with.

The fragmentation we're not addressing

It gets messier. We're not just seeing Anthropic's Skills vs MCP. We're seeing different vendors (Anthropic, OpenAI, Google, and others) each building their own tool-calling patterns, their own optimization strategies, their own "better ways" to connect models to external capabilities.

Sound familiar? It should. We've been here before with every significant protocol in tech history. Remember the early days of GraphQL implementations? REST "standards"? Microservices patterns? The initial fragmentation phase where everyone builds their own interpretation before painful consolidation happens.

The difference is that this time, the stakes are higher. Model costs, context window limitations, and inference speed directly impact whether AI applications are economically viable at scale. Getting this wrong is expensive.

Making determinism work

One pattern I've found genuinely useful: building MCP tools that wrap deterministic automation platforms. I've created MCP servers that expose Contentstack Automate workflows as tools. Instead of the model trying to chain together primitive operations, it triggers pre-built, tested automations that handle complex multi-step processes reliably.

This hybrid approach (LLM reasoning for high-level decisions, deterministic automations for critical operations) feels more production-ready than pure model-driven workflows. The model doesn't need to reason through every step of publishing content across environments. It just needs to understand when to trigger the "publish content" automation.
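
Here's a minimal sketch of what that looks like as an MCP server, using the Python SDK's FastMCP. The Automate trigger URL and payload shape are hypothetical placeholders, not a real Contentstack Automate endpoint; the point is that the model sees one coarse-grained tool rather than a dozen primitives.

```python
# Hybrid pattern sketch: the model gets a single coarse-grained tool that
# fires a pre-built, tested automation over HTTP. The trigger URL and
# payload are placeholders, not a real Contentstack Automate endpoint.
import httpx
from mcp.server.fastmcp import FastMCP

AUTOMATION_TRIGGER_URL = "https://example.invalid/automate/publish-content"

mcp = FastMCP("automation-bridge")

@mcp.tool()
def publish_content(entry_uid: str, environment: str) -> str:
    """Trigger the pre-built 'publish content' automation.

    The model only decides when to publish; the multi-step flow
    (references, assets, locales) runs deterministically in the workflow.
    """
    response = httpx.post(
        AUTOMATION_TRIGGER_URL,
        json={"entry_uid": entry_uid, "environment": environment},
        timeout=30,
    )
    response.raise_for_status()
    return f"Publish automation accepted for {entry_uid} -> {environment}"

if __name__ == "__main__":
    mcp.run()
```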

This pattern reduces token consumption, improves reliability, and makes debugging actually possible. But again, this is an application-layer solution to protocol-layer problems.

Complexity without payoff

What frustrates me most isn't the technical challenges; those are solvable. It's the unnecessary complexity we're asking developers to manage. Building production MCP implementations requires understanding streaming protocols, managing stateful connections, optimizing tool catalogs, implementing smart caching, and constantly tuning for context efficiency.

These are all symptoms of an immature protocol trying to solve too many problems at once.

We need cheaper models to successfully handle tool-based tasks without burning excessive reasoning tokens. We need context efficiency to make AI applications economically sustainable. We need interoperability so tools built for one vendor work with others. Right now, we're getting none of these by default.

Where do we go from here?

Honestly? I don't know, and I don't think anyone else does either. We could see MCP standardization happen through vendor collaboration. We could see context window breakthroughs that make current optimization efforts obsolete. We could see Claude Skills or similar abstractions become the pragmatic winner simply because they work better in practice. We could see an entirely new protocol emerge that learns from MCP's limitations.

What I do know is that we're in a challenging state. The current fragmentation benefits nobody except perhaps vendors trying to build moats around their ecosystems. Developers are stuck building integration layers and workarounds. Users are paying more for slower, less reliable experiences.

We've collectively decided that connecting AI models to tools is important enough to invest serious engineering effort. Now we need to decide whether we're building toward a coherent ecosystem or just repeating the standards wars of the past decade.

The smart money isn't on predicting which approach wins. It's on building systems flexible enough to adapt when the consolidation inevitably happens.

Frequently asked questions

  • What is the main problem with current MCP implementations?

    The core issue is that MCP tools are extremely context-hungry. Each tool adds descriptions, schemas, parameters, and examples that must fit into the model’s context window. Large catalogs quickly consume thousands of tokens before any reasoning happens, driving up costs and latency while degrading accuracy. Developers are forced to manually curate toolsets, optimize catalogs, and build complex infrastructure to keep context usage under control—work that should ideally be handled at the protocol or platform level, not by every individual application team.

  • How do Claude Skills help compared to raw MCP tool exposure?

    Claude Skills introduce a curated abstraction layer over raw tools. Instead of sending the entire tool catalog in every request, Skills keep structured documentation and behavior in a separate space and load only what is needed when it is needed. This reduces context usage, improves performance, and lets cheaper models succeed with fewer reasoning tokens. In effect, Skills act as a context-efficient interface that guides the model, making tool selection easier and less expensive than dealing with a large, flat MCP tool inventory on every call.

  • Why combine MCP with deterministic automation platforms like Contentstack Automate?

    Wrapping deterministic automation platforms in MCP tools lets models trigger robust, pre-tested workflows instead of reasoning through every low-level step. The LLM focuses on high-level decisions—such as when to publish content—while the automation platform executes the complex, multi-step process reliably. This hybrid pattern reduces token consumption, increases reliability, and makes debugging far easier than fully model-driven workflows. It is a pragmatic way to make production systems more stable while current MCP designs and vendor-specific tool patterns remain immature and fragmented.