Designing for Multiple AI Consumers
MCP lets any AI tool access your data — Bring Your Own LLM. The value: your product reaches every model and every client without rebuilding for each. The trade-off: when you give up control of the LLM, data quality and tool design become your only control surfaces. What MCP enables, what changes, and what compensates.
What MCP Enables
Model Context Protocol is a standard for connecting AI tools to external data. An MCP server exposes data and capabilities through tools and resources. Any AI client that speaks MCP — Claude Desktop, Claude Code, Cursor, and eventually others — can connect to the server and access that data through its own model.
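A minimal sketch of what such a server looks like, using the official TypeScript MCP SDK. The server name, tool, and persona data here are illustrative, not Mosaic's actual implementation:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// One server, one illustrative tool. Any MCP client can connect and let its
// own model decide when to call it.
const server = new McpServer({ name: "mosaic-demo", version: "0.1.0" });

server.tool(
  "get_persona_context",
  { personaId: z.string() },
  async ({ personaId }) => ({
    content: [{ type: "text", text: `Persona context for ${personaId} goes here.` }],
  })
);

// Serve over stdio; the connecting client brings its own LLM.
await server.connect(new StdioServerTransport());
```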
This is what we call BYOLLM — Bring Your Own LLM. Instead of only experiencing Mosaic through our web app (where we control the model, the prompts, and the presentation), visitors can connect their own AI tool and interact with persona data through whatever model they prefer. Your model, your conversation, our data.
The value proposition is reach without rebuilding. A single MCP server makes your data accessible to every AI client that implements the protocol. You don't build a Claude integration, then a Cursor integration, then a ChatGPT integration. You build one data layer and expose it via MCP. The clients do the rest.
But this creates a fundamental trade-off.
The Trade-Off
Mosaic's web app controls the full pipeline. We choose the model. We write the system prompt. We set the temperature. We control how the agent presents persona data — the voice, the tone, the format. When something's off, we adjust.
Via MCP, all of that goes to the visitor. They choose the model. Their client sets the parameters. Their agent decides how to present the data. We control what we share, not how it's interpreted.
This isn't a minor distinction. In the web app, the system prompt says "You are Andrew Henry. Respond in first person." The model must follow. Via MCP, the same instruction arrives as tool output — content in the conversation, not a privileged system prompt. A capable model will likely follow it. A less capable one might not. A misconfigured client might truncate it.
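To make the contrast concrete, here is a hypothetical sketch of the two delivery paths. Neither snippet is Mosaic's real prompt or payload:

```typescript
// Web app: the instruction lives in a privileged system prompt we control.
const systemPrompt = "You are Andrew Henry. Respond in first person.";

// MCP: the closest equivalent is ordinary tool output. The visitor's model
// sees it as content in the conversation and is free to weigh it as it likes.
const toolResult = {
  content: [
    { type: "text", text: "Voice guidance: respond as Andrew Henry, in first person." },
    { type: "text", text: "I built a distributed system that handled 10M QPS..." },
  ],
};
```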
What you lose: per-turn prompt modification, guaranteed response format, unified voice across interactions, mid-conversation auth upgrades. What you gain: heterogeneous consumption across every AI client, user autonomy over their tools, reduced infrastructure complexity, and a natural path toward agent-to-agent interaction.
When you give up control of the LLM, data quality and tool design become your only control surfaces.
Client Behavioural Variance
The theory of MCP is a uniform protocol. The reality is a matrix of capabilities.
Claude Web calls tools reliably but ignores resources entirely — even when they're registered and well-described. Cursor is the inverse: resources work, tools don't. ChatGPT's connection failed silently during tool discovery, likely due to a transport incompatibility. Claude Code requires an explicit command to trigger OAuth rather than connecting automatically.
This creates practical consequences. When Claude Web asked about a persona, it tried guessing IDs — drew-hank, drewhank, drew_hank — rather than reading the resource that listed available personas. It didn't read the resource because Claude Web doesn't use MCP resources. The fix wasn't better documentation. It was making the persona ID parameter optional, defaulting to the authenticated user. Remove the friction that the client creates.
Cursor's agent, faced with a similar task, tried diving into the local codebase to query the database directly. When nudged toward MCP, it read the personas resource successfully and self-corrected: "I can't call MCP tools from here... I should have used the Mosaic resource first." The agent understood its own limitations and adapted.
You can't design for a uniform client. You design for the variance — support both tools and resources, make each path independently functional, and don't assume the agent will read anything proactively.
Data Quality as Compensation
When you can't control the model, you must design the data so that any reasonable model produces good output.
In the web app, if the agent's tone is wrong, you adjust the system prompt. Via MCP, the tone is encoded in the data itself. First-person content — "I built a distributed system that handled 10M QPS" — naturally encourages first-person responses from any model. Preferences that say "emphasise practical experience over credentials" and "prefer concrete examples to abstractions" signal style without requiring a system prompt.
Content structure becomes implicit guidance. Bullet-pointed skills produce bullet-pointed responses. Narrative paragraphs produce storytelling. Clear headings make sections citable. Dense, unstructured blocks get lost. How the owner writes their content — the structure, the tone, the detail level — matters as much as what preferences they set. Every model that reads the data receives these signals.
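As a hypothetical illustration of how those signals travel in the data itself (the field names and the second experience entry are invented for this example):

```typescript
// Tone and style are carried by the content, not by a system prompt we no
// longer control. Any model that reads this payload picks up the same cues.
const personaContext = {
  voice: "first person",
  preferences: [
    "emphasise practical experience over credentials",
    "prefer concrete examples to abstractions",
  ],
  experience: [
    // First-person, concrete, self-contained bullets nudge any reasonable
    // model toward first-person, example-led answers.
    "I built a distributed system that handled 10M QPS.",
    "I led the migration from a monolith to event-driven services.",
  ],
};
```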
This is where content engineering (Article 2) becomes immediately practical. The quality of AI-mediated interactions depends not just on the model, but on the data it's working with. Content engineering matters more when you have less control — and it's harder to test, because you're at the mercy of how each client's model interprets your data.
Error Messages as Agent Instructions
In MCP, error messages aren't just for humans. They're instructions for the agent.
When an agent calls a tool and gets "Permission denied," it tells the user "permission denied" and stops. Dead end. When it gets "Permission denied: owner authentication required. This tool is for owners to preview their own persona. Use get_persona_context to look up other personas by handle or name" — the agent extracts the blocker, the explanation, and the alternative path. It creates a helpful workaround without being told explicitly.
We observed this directly. Given an actionable error message, Claude's thinking trace showed: "The user isn't authenticated as an owner. I should let them know they need to authenticate, or I can look it up by name or handle." The agent synthesised a recovery path from the error text alone.
The design principle: every error message should state the action required, suggest alternative tools when available, and explain the scope model. Keep it concise — one or two sentences with clear next steps. Different LLMs will present the message differently, but if the source text is clear and prescriptive, any reasonable model will convey the right information. The error message is your only lever.
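Continuing the server sketch from earlier, an actionable permission error might look like this; the owner check itself is an illustrative stub:

```typescript
// Hypothetical owner-only tool. The error text carries the blocker, the
// explanation, and the alternative path, because it is all the agent gets.
server.tool("preview_own_persona", {}, async () => {
  const isOwner = false; // illustrative: derive this from the connection's auth context
  if (!isOwner) {
    return {
      isError: true,
      content: [{
        type: "text",
        text:
          "Permission denied: owner authentication required. This tool is for " +
          "owners to preview their own persona. Use get_persona_context to look " +
          "up other personas by handle or name.",
      }],
    };
  }
  return { content: [{ type: "text", text: "Owner preview goes here." }] };
});
```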
Error messages in MCP aren't diagnostics — they're instructions. The agent's only context is what you give it.
Tool Design as a Lever
The most impactful tool design decision we made was small: making the persona ID parameter optional, defaulting to the authenticated user.
Before the change, every tool call required the agent to know and provide the persona ID. Claude Web couldn't read the resource listing personas, so it guessed. After the change, the agent calls get_persona_context() with no parameters and gets the right data. The friction vanishes.
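In code, the change is a single optional parameter with a sensible default. A sketch revising the tool from the first example; the helper functions are stand-ins for Mosaic's real data layer and auth context:

```typescript
// Illustrative stubs for the data layer and the authenticated connection.
const loadPersona = async (id: string) => `Persona context for ${id} goes here.`;
const authenticatedPersonaId = () => "drew-hank";

// personaId is optional; when omitted, default to the persona tied to the
// authenticated connection, so the common case needs zero arguments.
server.tool(
  "get_persona_context",
  { personaId: z.string().optional() },
  async ({ personaId }) => {
    const id = personaId ?? authenticatedPersonaId();
    return { content: [{ type: "text", text: await loadPersona(id) }] };
  }
);
```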
This generalises. "How much context does the agent need to take the first action?" is as important a design question as "What can the agent do?" The easier the first action, the more likely agents will use the tool. Zero required parameters for the common case. Obvious tool descriptions that signal when to use them. Don't rely on documentation — make behaviour self-evident from the tool interface itself.
The same principle applies to the tools-versus-resources duality. Some clients use tools, others use resources. Expose the same data via both. The directory of personas is available as a resource (for Cursor) and as a tool (for Claude). Each path works independently. The agent accesses data through whichever mechanism its client supports.
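A sketch of that duality, again with an illustrative data source shared between the two registrations:

```typescript
// One data source, two access paths: a resource for clients like Cursor that
// read resources, and a tool for clients like Claude that only call tools.
const listPersonas = async () =>
  JSON.stringify([{ id: "drew-hank", name: "Andrew Henry" }]);

server.resource("personas", "mosaic://personas", async (uri) => ({
  contents: [{ uri: uri.href, mimeType: "application/json", text: await listPersonas() }],
}));

server.tool("list_personas", {}, async () => ({
  content: [{ type: "text", text: await listPersonas() }],
}));
```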
Protocol Constraints Shape Product Decisions
MCP authentication happens at connection time only. There's no standard way to upgrade permissions mid-conversation. If an owner connects with read-only access and later wants to edit, they must disconnect, reconfigure, and reconnect — losing conversation context.
This isn't a UX problem you can design away. It's a protocol constraint. The web app handles this seamlessly — an inline auth prompt, the conversation continues. MCP can't do that yet.
The mitigation is to nudge upfront: a clear authorisation form that explains what read-only means and prompts owners to enter credentials before connecting. Informative error messages when someone hits a permission wall. Accept that some friction is baked into the protocol and design around it rather than fighting it.
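A sketch of the permission wall for a write tool, continuing the same server; the scope value is an illustrative stand-in for whatever was granted at connection time:

```typescript
// Illustrative stand-in for the scope fixed when the client connected.
const connectionScope = (): "read-only" | "owner" => "read-only";

server.tool(
  "update_persona",
  { section: z.string(), text: z.string() },
  async ({ section, text }) => {
    if (connectionScope() !== "owner") {
      return {
        isError: true,
        content: [{
          type: "text",
          text:
            "This connection was authorised read-only. Editing requires owner " +
            "access: disconnect, reconnect with owner credentials, then retry.",
        }],
      };
    }
    return { content: [{ type: "text", text: `Updated ${section}: ${text}` }] };
  }
);
```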
Future MCP features — mid-session elicitation, custom UI rendering — may solve this. The protocol is young. But today, product decisions are shaped by protocol constraints, and the honest approach is to acknowledge them rather than pretend they don't exist.
What BYOLLM Actually Gives You
The trade-offs are real, but so is the upside. A single data layer reaches every AI client without per-client infrastructure. Users trust the platform because their AI tool is theirs — no vendor lock-in to a specific model. Your product survives model obsolescence because you're a data provider, not an AI proxy.
The architecture also creates a natural path toward agent-to-agent interaction. Mosaic's worker already reasons autonomously — it has content, preferences, voice, judgment. What's missing isn't intelligence; it's infrastructure: discovery, machine-to-machine auth, structured protocols for multi-turn exchange. MCP is agent-to-tool today. Agent-to-agent is the next step, and the data layer built for BYOLLM is its foundation.
The lesson that runs through all of this: when you give up control of the model, you gain leverage through data and design. Content engineering, tool design, error messages, metadata — these are your control surfaces. Make them excellent, and any reasonable model will produce good output from your data.