Remote Subagents and the Protocol That Changed Everything: Lessons from Building Gossiper

Chris
1/12/2025

When we started building Gossiper, we believed the future of AI would be about connecting intelligent agents to everything. Users wanted their AI to read their Shopify orders, draft emails, manage their CRM, and automate their workflows—all in one place. The problem was obvious. The solution was not.
Like many teams before us, we built custom integrations. One for Shopify. One for Google Calendar. One for Notion. Each required its own authentication flow, its own data transformation layer, its own maintenance burden. By the time we had fifteen integrations, we had fifteen different codebases, fifteen different failure modes, and a team that spent more time maintaining plumbing than building intelligence.
Then we discovered MCP.
The Monolithic Integration Trap
The standard approach to AI-tool integration follows a predictable pattern: your agent calls an API, gets a response, and stuffs it into context. Simple enough for one tool. Utterly untenable at scale.
Our early architecture looked like this: a central orchestrator managing direct connections to each external service. Every new integration meant new authentication handling, new rate limiting logic, new error recovery patterns. The orchestrator grew into a 12,000-line behemoth that knew too much about too many things.
Worse, our agents couldn't specialize. A single model had to understand Shopify's inventory semantics, Google's calendar quirks, and Slack's threading model—all simultaneously. We were asking one mind to be an expert in everything. The context window wasn't the bottleneck; cognitive load was.
We rebuilt our integration layer three times trying to solve this. Each iteration was marginally better, but we were optimizing a fundamentally wrong architecture.
The Subagent Revelation
The insight came from observing how human organizations scale: through delegation. A CEO doesn't personally manage payroll, customer support, and product development. They delegate to specialists who own their domains completely.
What if our AI could do the same?
We started experimenting with spawning specialized subagents—smaller, focused models that owned specific domains. A "Shopify Agent" that understood nothing but e-commerce. A "Calendar Agent" that lived and breathed scheduling. The orchestrator's job simplified dramatically: route requests to the right specialist, aggregate responses.
This worked beautifully in our development environment. Then we tried to deploy it.
The subagent pattern introduced a new problem: communication. How does the orchestrator talk to a Shopify subagent? If they're running in the same process, simple function calls suffice. But what about when the Shopify subagent runs on dedicated infrastructure optimized for high-throughput order processing? What about when a customer wants to host their own subagents behind their firewall for compliance reasons?
We needed a protocol.
Enter MCP: The USB-C of Agent Communication
The Model Context Protocol emerged from Anthropic's work on Claude, and when we first read the specification, the resonance was immediate. MCP isn't trying to be clever. It's trying to be obvious—and that's precisely its power.
At its core, MCP defines a simple contract: a host (our orchestrator) connects to servers (our subagents), which expose tools, resources, and prompts. The transport layer is pluggable—stdio for local processes, HTTP with SSE for remote services, WebSocket for bidirectional streaming. The protocol handles capability negotiation, lifecycle management, and structured error propagation.
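To make that contract concrete, here is a minimal sketch of the JSON-RPC 2.0 exchange MCP builds on. The field names follow the spec's initialize handshake; the server identity and handler logic are invented for illustration:

```python
# Sketch of MCP's JSON-RPC 2.0 framing. Field names follow the spec's
# initialize exchange; the server name and handler are illustrative.

def make_request(req_id, method, params):
    """Build a JSON-RPC 2.0 request envelope."""
    return {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}

def handle_initialize(params):
    """Server side of capability negotiation: confirm the protocol version
    and advertise which primitives (tools, resources, prompts) we expose."""
    return {
        "protocolVersion": params["protocolVersion"],
        "capabilities": {"tools": {}, "resources": {}, "prompts": {}},
        "serverInfo": {"name": "shopify-subagent", "version": "1.0.0"},
    }

init = make_request(1, "initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "gossiper-orchestrator", "version": "1.0.0"},
})
reply = handle_initialize(init["params"])
```

Everything else in the protocol (tools/list, tools/call, resources/read) rides on the same envelope, which is what makes transports pluggable.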
What matters isn't the technical elegance (though it exists). What matters is what it enables architecturally.
Before MCP, adding a new subagent meant:
- Defining a bespoke communication interface
- Implementing serialization logic for that interface
- Building authentication between orchestrator and subagent
- Creating monitoring for the custom protocol
- Documenting the interface for future developers
After MCP, adding a new subagent means:
- Implementing the MCP server interface
- Done
We went from 2-3 engineering days per integration to 2-3 hours. Not because MCP is magic, but because standardization compounds.
Lessons from Production
Running MCP-based subagents in production for the past year has taught us several non-obvious lessons.
1. Treat Tool Schemas as API Contracts
MCP tool definitions use JSON Schema, and it's tempting to treat them casually. Don't. These schemas are the interface contract between your orchestrator and subagents. We version our tool schemas independently of our code, and breaking changes go through the same review process as public API changes.
A typical tool definition in our Shopify subagent:
```json
{
  "name": "get_order_details",
  "description": "Retrieve complete order information including line items, customer data, and fulfillment status",
  "inputSchema": {
    "type": "object",
    "properties": {
      "order_id": { "type": "string", "pattern": "^[0-9]+$" },
      "include_customer": { "type": "boolean", "default": true },
      "include_fulfillments": { "type": "boolean", "default": true }
    },
    "required": ["order_id"]
  }
}
```
That pattern constraint on order_id catches malformed requests before they hit Shopify's API. The defaults on boolean flags establish sensible behavior without requiring the orchestrator to micromanage every call. Small details, but they eliminate entire categories of runtime errors.
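A hand-rolled sketch of the two behaviors that paragraph describes, the pattern check and default filling. A production server would delegate this to a full JSON Schema validator; the helper below is ours, for illustration only:

```python
import re

# Minimal illustration of the schema details above: enforce the order_id
# pattern constraint and fill the boolean defaults. A real server would
# use a proper JSON Schema validator instead of this hand-rolled check.

SCHEMA = {
    "properties": {
        "order_id": {"type": "string", "pattern": "^[0-9]+$"},
        "include_customer": {"type": "boolean", "default": True},
        "include_fulfillments": {"type": "boolean", "default": True},
    },
    "required": ["order_id"],
}

def validate_and_fill(args):
    for name in SCHEMA["required"]:
        if name not in args:
            raise ValueError(f"missing required argument: {name}")
    out = dict(args)
    for name, spec in SCHEMA["properties"].items():
        if name not in out and "default" in spec:
            out[name] = spec["default"]  # sensible defaults, no micromanaging
        if name in out and "pattern" in spec and not re.fullmatch(spec["pattern"], out[name]):
            raise ValueError(f"malformed {name}: {out[name]!r}")
    return out

filled = validate_and_fill({"order_id": "4521"})
```

A malformed `order_id` is rejected before any Shopify call happens, and the defaults land without the orchestrator specifying them.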
2. Resources Are Underrated
Most MCP implementations focus on tools—callable functions that do things. But MCP also defines resources: readable data with URIs. We initially ignored resources entirely. That was a mistake.
Resources enable a fundamentally different interaction pattern. Instead of the orchestrator asking "what are the recent orders?" and receiving a blob of JSON, it can browse shopify://orders/recent as a navigable resource. The subagent can expose semantically meaningful URIs that the orchestrator understands without knowing Shopify's API structure.
Our Shopify subagent now exposes:
- shopify://orders/{id} — individual order details
- shopify://orders/recent — last 50 orders
- shopify://products/low-stock — inventory alerts
- shopify://customers/{id}/history — customer order history
The orchestrator treats these as a filesystem it can explore. When a user asks "show me orders from last week that haven't shipped," the orchestrator can navigate the resource hierarchy rather than constructing complex tool calls.
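The navigation described above can be sketched as a small URI-template router. The shopify:// URIs come from the list; the matching logic is our own illustration, not part of the protocol:

```python
import re

# Sketch of URI-template routing for MCP resources. Handlers here return
# placeholder strings; real ones would return resource contents.

ROUTES = [
    ("shopify://orders/recent",          lambda m: "last 50 orders"),
    ("shopify://orders/{id}",            lambda m: f"order {m['id']}"),
    ("shopify://products/low-stock",     lambda m: "inventory alerts"),
    ("shopify://customers/{id}/history", lambda m: f"history for customer {m['id']}"),
]

def template_to_regex(template):
    """Turn {name} placeholders into named capture groups."""
    parts = re.split(r"(\{\w+\})", template)
    pattern = "".join(
        f"(?P<{p[1:-1]}>[^/]+)" if p.startswith("{") else re.escape(p)
        for p in parts
    )
    return re.compile(pattern)

def read_resource(uri):
    # Literal routes are listed before templated ones, so exact URIs win.
    for template, handler in ROUTES:
        m = template_to_regex(template).fullmatch(uri)
        if m:
            return handler(m.groupdict())
    raise LookupError(f"no resource matches {uri!r}")

order = read_resource("shopify://orders/4521")
```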
3. Prompt Templates Are Agent Personalities
MCP servers can expose prompts—parameterized templates that guide how the host should frame requests. We use these to encode domain expertise that would otherwise live in the orchestrator.
Our calendar subagent exposes a prompt called schedule_meeting:
```json
{
  "name": "schedule_meeting",
  "description": "Find optimal meeting time considering all participants",
  "arguments": [
    { "name": "participants", "description": "Email addresses", "required": true },
    { "name": "duration_minutes", "required": true },
    { "name": "preferred_time", "description": "morning/afternoon/evening" }
  ]
}
```
When the orchestrator needs to schedule a meeting, it doesn't craft a generic request. It invokes this prompt, and the subagent responds with a structured template that encodes years of calendar-wrangling wisdom: buffer time between meetings, timezone handling, preference for video call links, etc.
The prompt template acts as the subagent's personality—its opinionated stance on how calendar operations should work. The orchestrator remains domain-agnostic while benefiting from specialized expertise.
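As a sketch of what "encoded expertise" means in practice, here is one way the schedule_meeting prompt might expand. The template text and helper are invented; only the argument list mirrors the definition above:

```python
# Illustrative expansion of the schedule_meeting prompt: the subagent's
# calendar opinions (buffers, timezones, video links) live in the
# template, not in the orchestrator. Template wording is invented.

PROMPT_ARGS = [
    {"name": "participants", "required": True},
    {"name": "duration_minutes", "required": True},
    {"name": "preferred_time", "required": False},
]

TEMPLATE = (
    "Find a {duration_minutes}-minute slot for {participants}. "
    "Prefer {preferred_time}; leave 10-minute buffers between meetings, "
    "normalize times to each participant's timezone, and attach a video link."
)

def render_prompt(args):
    for spec in PROMPT_ARGS:
        if spec["required"] and spec["name"] not in args:
            raise ValueError(f"missing argument: {spec['name']}")
    filled = dict(args)
    filled.setdefault("preferred_time", "any time of day")
    return TEMPLATE.format(**filled)

rendered = render_prompt({"participants": "a@x.com, b@y.com", "duration_minutes": 30})
```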
4. The OAuth Dance is Worth It
Authentication was our biggest implementation headache. MCP supports pluggable authentication, and the specification recommends OAuth 2.1 for remote servers. We initially tried to shortcut this with API keys. Within weeks, we had a key rotation nightmare across dozens of customer deployments.
We bit the bullet and implemented proper OAuth with device flow for headless environments. The initial investment was substantial—roughly three weeks of engineering time including the authorization server. But the payoff has been immense:
- Customers can revoke subagent access without touching their primary credentials
- Scopes limit what each subagent can do (our Shopify agent gets read_products and read_orders, never write_products)
- Token refresh happens automatically; no more manual key rotation
If you're building for production, do OAuth from day one. You'll thank yourself.
5. Sampling Enables Agent Composition
Perhaps MCP's most subtle feature is sampling: the ability for a server to request LLM completions from the host. This inverts the typical control flow. Instead of the orchestrator always driving, subagents can ask the orchestrator's model for help.
We use this for what we call "clarification loops." When our Shopify subagent receives an ambiguous request—"refund the customer's last order"—it can sample from the orchestrator's model to generate a clarifying question: "I found two recent orders for this customer. Should I refund order #4521 ($127.50, 3 days ago) or order #4519 ($89.00, last week)?"
The subagent remains focused on Shopify domain logic while delegating natural language generation to the host's model. This separation keeps subagent implementations simple and deterministic.
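A sketch of that clarification loop, with the sampling request stubbed out. The order data and callback are invented; the point is the control flow: deterministic domain logic, with language generation delegated to the host:

```python
# Sketch of a sampling-driven clarification loop. The subagent resolves
# Shopify logic deterministically and only asks the host's model (here a
# stub callback) to phrase the question when the request is ambiguous.

ORDERS = [
    {"id": "4521", "total": 127.50, "age_days": 3},
    {"id": "4519", "total": 89.00, "age_days": 7},
]

def stub_sample(prompt):
    """Stands in for an MCP sampling request to the host's LLM."""
    return f"[model-generated question based on: {prompt}]"

def refund_last_order(orders, sample):
    recent = [o for o in orders if o["age_days"] <= 7]
    if len(recent) == 1:
        return {"action": "refund", "order_id": recent[0]["id"]}
    # Ambiguous: delegate question-phrasing to the host's model.
    summary = "; ".join(
        f"#{o['id']} (${o['total']:.2f}, {o['age_days']}d ago)" for o in recent
    )
    return {"action": "clarify", "question": sample(f"Ask which order to refund: {summary}")}

result = refund_last_order(ORDERS, stub_sample)
```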
The Architecture That Emerged
After a year of iteration, our architecture has settled into a pattern we call the "constellation model."
The orchestrator (our core Gossiper service) maintains connections to a fleet of MCP servers. Some run as local processes for low-latency operations. Others run on dedicated infrastructure—our Shopify subagent processes millions of webhooks per day and needs its own scaling story. A few run in customer environments behind corporate firewalls.
Tool routing happens through capability matching. When a user request arrives, the orchestrator examines the registered tools across all connected servers and selects the appropriate subagent. This selection itself uses the model—we found hand-coded routing rules too brittle for the variety of user requests.
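The registry lookup underneath that routing can be sketched simply. Our production selection uses the model itself, as noted; this exact-match lookup is a simplified stand-in, and the registry contents are invented:

```python
# Sketch of tool routing via a capability registry: each connected MCP
# server advertises its tools, and requests are dispatched by lookup.

REGISTRY = {
    "shopify": ["get_order_details", "refund_order", "list_low_stock"],
    "calendar": ["create_event", "find_free_slot"],
}

def route(tool_name):
    """Return the server whose advertised tools include tool_name."""
    for server, tools in REGISTRY.items():
        if tool_name in tools:
            return server
    raise LookupError(f"no connected server exposes {tool_name!r}")

target = route("create_event")
```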
Context management follows a "need to know" principle. The orchestrator maintains the full conversation history, but subagents receive only the context relevant to their operation. A Shopify query includes recent order IDs mentioned in conversation, but not unrelated calendar details. This reduces token usage in subagent calls by 60-70% while improving response relevance.
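A sketch of that filtering for the Shopify case, using an order-ID heuristic we invented for illustration (production relevance scoring is richer than a single regex):

```python
import re

# "Need to know" context filtering: only conversation turns that mention
# a domain-relevant identifier are forwarded to the Shopify subagent.

HISTORY = [
    "user: reschedule my 3pm call tomorrow",
    "user: did order 4521 ship yet?",
    "assistant: order 4521 is still in fulfillment",
    "user: what about my dentist appointment?",
]

def filter_for_shopify(history):
    order_id = re.compile(r"\border \d+\b")
    return [turn for turn in history if order_id.search(turn)]

relevant = filter_for_shopify(HISTORY)
```

The calendar chatter never reaches the Shopify subagent, which is where the token savings come from.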
Error handling propagates structured errors through MCP's error response format. When a subagent fails—Shopify API timeout, rate limit, authentication expiry—the error includes enough context for the orchestrator to either retry, fall back, or explain the failure to the user gracefully.
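In code, the retry-or-fall-back decision looks roughly like this. The error kinds and policy table are invented; only the JSON-RPC-style error shape (code, message, data) follows the protocol's conventions:

```python
# Sketch of structured error handling: a JSON-RPC-style error object
# carries enough context for the orchestrator to pick retry, reauth,
# or a graceful explanation to the user.

RETRYABLE = {"rate_limited", "upstream_timeout"}

def decide(error):
    kind = error["data"]["kind"]
    if kind in RETRYABLE:
        return ("retry", error["data"].get("retry_after_s", 1))
    if kind == "auth_expired":
        return ("reauth", None)
    return ("explain_to_user", error["message"])

err = {
    "code": -32000,
    "message": "Shopify API rate limit exceeded",
    "data": {"kind": "rate_limited", "retry_after_s": 2},
}
action, detail = decide(err)
```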
What We Got Wrong
Not everything was a success story. Some lessons came from painful mistakes.
We over-indexed on tool count. Early versions of our Shopify subagent exposed 47 tools—one for every Shopify API endpoint we could wrap. The orchestrator couldn't effectively choose between them. We consolidated to 12 high-level tools with richer parameter schemas. Fewer, smarter tools beat many simple ones.
We underestimated cold start. MCP connections have handshake overhead. Our first remote subagents took 2-3 seconds to respond to initial requests while capability negotiation completed. We implemented connection pooling and keep-alive, reducing cold starts to under 200ms. For latency-sensitive applications, this matters enormously.
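The pooling fix amounts to paying the handshake once per server and reusing the live session afterward. A simulated sketch, with the handshake function and session objects as stand-ins:

```python
# Sketch of connection pooling for MCP sessions: capability negotiation
# runs once per server (cold start), then the session is reused (warm).

class SessionPool:
    def __init__(self, handshake):
        self._handshake = handshake   # expensive: full MCP handshake
        self._sessions = {}

    def get(self, server):
        if server not in self._sessions:          # cold-start path
            self._sessions[server] = self._handshake(server)
        return self._sessions[server]             # warm path, no handshake

calls = []

def fake_handshake(server):
    calls.append(server)
    return {"server": server, "ready": True}

pool = SessionPool(fake_handshake)
first = pool.get("shopify")
second = pool.get("shopify")   # reuses the session, no second handshake
pool.get("calendar")
```

A keep-alive loop (not shown) keeps pooled sessions from going stale between user requests.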
We trusted tool descriptions too much. The model interprets tool descriptions to decide when to use them. Subtly misleading descriptions caused bizarre routing decisions. Our calendar subagent's create_event tool originally described itself as "schedule a meeting or event." Users asking to "set a reminder" would trigger it, creating calendar events instead of reminders. Precision in descriptions is worth obsessing over.
The Road Ahead
MCP is still young. The specification continues to evolve, and the ecosystem is growing rapidly. We're watching several developments closely.
Streaming tool responses would enable subagents to return partial results as they process. Imagine querying "summarize my last 100 orders" and seeing results stream in as each batch processes.
Standardized observability would give us better visibility into subagent behavior without instrumenting each implementation individually.
Federated discovery would let subagents advertise themselves in a registry, enabling dynamic capability expansion.
But even in its current form, MCP has fundamentally changed how we think about AI architecture. The question is no longer "how do we connect our agent to everything?" It's "what specialists should we deploy, and how should they collaborate?"
The protocol solved our integration problem. The subagent pattern solved our cognitive load problem. Together, they've given us an architecture that scales—not just technically, but conceptually.
We started with an AI that tried to know everything. We ended with a team of specialists that know their domains deeply. MCP was the language that let them talk.