You tuned a great prompt for a code refactor in Cursor 3. The next day you reached for it in Claude Code and it was gone, so you rewrote it from memory. A week later you found yourself doing the same thing in Windsurf, and the prompt drifted again. Three forks of the same prompt now live in three places, all subtly different, none reviewed against the others.
The protocol-level fix for this exists, and it is more mature than most developers realize. Model Context Protocol crossed 97 million installs in March 2026 [1] and the official 2026 roadmap declared enterprise readiness a top priority [2]. AWS, Google, and Cloudflare ship MCP integrations. Claude Code, Cursor 3, and Windsurf are all MCP clients out of the box.
This guide is the practical version. We are not going to talk about strategy or governance. We are going to build a small MCP server that exposes a prompt, wire it to all three IDEs, and look at how to handle versioning when the prompt evolves. There is a section on what MCP does not solve, and a short pointer to where Keep My Prompts fits if you do not want to build the missing pieces yourself. Code in Python, config snippets for each tool, no fluff.
1. MCP in 60 Seconds
MCP is an open protocol that lets any LLM client talk to any external server with a fixed set of primitives. Three of them matter for this guide:
Tools: executable functions the agent can call (the verbs).
Resources: read-only data the agent can fetch (the nouns).
Prompts: reusable templates and workflows the client can surface to the user or the model.
The architecture is a host that orchestrates a session, clients that connect to servers, and servers that expose capabilities [3]. Two transports are widely deployed: stdio for local processes and Streamable HTTP for remote servers. Both are part of the spec.
The interesting primitive for prompt libraries is the third one. Most MCP coverage focuses on Tools because that is where agentic workflows happen. But Prompts is the primitive that turns your prompt library into a protocol-level object that any compatible client can discover and use without code changes.
2. The Prompts Primitive
Two methods do all the work: prompts/list and prompts/get.
prompts/list returns the catalog. Each prompt has a name, a human-readable description, and an optional list of arguments the client can fill in:
{
  "prompts": [
    {
      "name": "refactor-component",
      "description": "Refactor a React component for performance and readability",
      "arguments": [
        { "name": "file_path", "description": "Path to the component file", "required": true },
        { "name": "framework_version", "description": "React version", "required": false }
      ]
    }
  ]
}
prompts/get returns the actual prompt content given a name and arguments:
{
  "description": "Refactor a React component for performance and readability",
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Review the React component at src/components/Dashboard.tsx (React 19). Identify rendering bottlenecks, propose memoization where it actually helps, and rewrite the component preserving its public API. Return a unified diff."
      }
    }
  ]
}
The client (Claude Code, Cursor 3, Windsurf, or a custom agent) discovers the prompt through prompts/list, asks the user to fill in arguments, then calls prompts/get and feeds the resulting messages to the model. The same JSON-RPC pair works across every MCP client.
This is the key insight. Once a prompt lives behind MCP, you stop hardcoding system prompts in your application layer [4]. Different agents discover and use the same template without you having to maintain three copies.
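Here is what that flow looks like from a custom client, sketched with the official Python SDK. It spawns the server we build in the next section over stdio; the file name and argument values are illustrative.

```python
# Minimal MCP client sketch: discover the catalog, then resolve one prompt.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    server = StdioServerParameters(command="python", args=["prompt_server.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # prompts/list: discover what the server offers
            catalog = await session.list_prompts()
            print([p.name for p in catalog.prompts])
            # prompts/get: resolve a template with arguments filled in
            result = await session.get_prompt(
                "refactor-component",
                arguments={"file_path": "src/components/Dashboard.tsx"},
            )
            for message in result.messages:
                print(message.role, message.content)

asyncio.run(main())
```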
3. Building the Server
The official Python SDK's FastMCP wrapper turns the server into a handful of decorated functions. Here is a complete stdio server exposing two prompts:
from mcp.server.fastmcp import FastMCP

app = FastMCP("kmp-prompt-server")

# Explicit names keep the catalog hyphenated; FastMCP would otherwise
# default to the Python function name (refactor_component).
@app.prompt(name="refactor-component")
def refactor_component(file_path: str, framework_version: str = "19") -> str:
    """Refactor a React component for performance and readability."""
    return (
        f"Review the React component at {file_path} (React {framework_version}). "
        "Identify rendering bottlenecks, propose memoization where it actually "
        "helps, and rewrite the component preserving its public API. "
        "Return a unified diff."
    )

@app.prompt(name="write-test")
def write_test(file_path: str, framework: str = "vitest") -> str:
    """Generate unit tests for a target file."""
    return (
        f"Read {file_path} and write {framework} unit tests covering the public "
        "exports. Use existing project conventions if visible. Mock external "
        "calls. Aim for one happy-path and one edge-case test per export."
    )

if __name__ == "__main__":
    app.run()  # stdio transport by default
Run it:
python prompt_server.py
That is the full server. The @app.prompt() decorator handles the JSON-RPC boilerplate and turns the function signature into the arguments schema the client receives. A docstring becomes the description. The return value is wrapped into the messages array.
Two prompts is enough to demonstrate the pattern. In practice, a real prompt server has 20 to 100 entries covering refactors, test generation, code review, debugging, ADR drafting, commit message templates, and whatever else your team reaches for repeatedly.
[Figure: a single prompt library exposed via an MCP server, fanning out to Claude Code, Cursor 3, Windsurf, and custom agents]
4. Wiring It to Your IDEs
Each MCP-aware IDE has its own config location, but the shape is the same: command, args, optional environment.
Claude Code uses a CLI helper:
claude mcp add kmp-prompts -- python /Users/you/prompt_server.py
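Cursor and Windsurf take the same definition as JSON. As of this writing, Cursor reads ~/.cursor/mcp.json (or a per-project .cursor/mcp.json) and Windsurf reads ~/.codeium/windsurf/mcp_config.json; the paths can shift between versions, so check your tool's docs. The block itself is identical in both:

```json
{
  "mcpServers": {
    "kmp-prompts": {
      "command": "python",
      "args": ["/Users/you/prompt_server.py"]
    }
  }
}
```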
After restart, all three IDEs surface refactor-component and write-test as discoverable prompts in their respective UIs. Cursor 3 lists them in the agent panel, Claude Code makes them available via slash commands, Windsurf picks them up in Cascade. The client decides the surface, the server stays the same.
Note that the config schema is stable enough to copy across tools, but the lookup paths are not unified yet. The 2026 roadmap calls this out as a "configuration portability" gap [2]. For now, paste the same block into each tool's config.
5. Versioning When the Prompt Evolves
A live prompt rarely stays unchanged. The interesting part of MCP-based prompt libraries is the strategy for handling change without breaking clients that pinned to an older behavior.
Pattern 1: versioned names in the catalog. Clients pin explicitly to refactor-component@v3 or float on refactor-component (the latest). Easy to reason about, slightly noisy if many versions live side by side.
Pattern 2: version as a prompt argument. This keeps the catalog clean. The client sees one entry called refactor-component and selects a version through arguments. Works well when you have an underlying registry that can answer "give me the latest version of X scoring at least 75". A sketch of the shape:
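```python
# Sketch of pattern 2, assuming an in-process dict as the version store.
# PROMPT_VERSIONS and LATEST stand in for whatever storage you actually use.
from mcp.server.fastmcp import FastMCP

app = FastMCP("kmp-prompt-server")

PROMPT_VERSIONS = {
    "v2": "Review the React component at {file_path} and rewrite it for readability.",
    "v3": (
        "Review the React component at {file_path}. Identify rendering "
        "bottlenecks, propose memoization where it helps, and return a unified diff."
    ),
}
LATEST = "v3"

@app.prompt(name="refactor-component")
def refactor_component(file_path: str, version: str = "latest") -> str:
    """Refactor a React component. Pin with version=v2/v3 or float on latest."""
    key = LATEST if version == "latest" else version
    return PROMPT_VERSIONS[key].format(file_path=file_path)
```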
Pattern 3: server-side registry resolution.
The MCP server is a thin layer in front of a registry that owns versioning, scoring, and rollback. The client never sees version numbers. The server picks the latest prompt that has passed quality gates and serves it. If the registry rolls back, the next prompts/get call automatically returns the previous version. Clients keep working.
This is the pattern most production teams converge on, because it lets the prompt library team move forward (test, score, promote) without touching the client config. The MCP server is dumb, the registry is smart.
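A hedged sketch of that thin layer; the registry URL, query parameters, and response shape are all hypothetical stand-ins for whatever your registry actually exposes:

```python
# Sketch of pattern 3: the MCP server defers all versioning to a registry.
import httpx
from mcp.server.fastmcp import FastMCP

app = FastMCP("kmp-prompt-server")
REGISTRY_URL = "https://registry.internal/api/prompts"  # hypothetical endpoint

@app.prompt(name="refactor-component")
def refactor_component(file_path: str) -> str:
    """Serve the latest registry version that passed the quality gate."""
    resp = httpx.get(
        f"{REGISTRY_URL}/refactor-component",
        params={"status": "passed-quality-gate", "version": "latest"},
        timeout=5.0,
    )
    resp.raise_for_status()
    template = resp.json()["text"]  # hypothetical response field
    return template.format(file_path=file_path)
```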
[Figure: anatomy of an MCP prompt response: name, description, arguments, messages, with metadata for version, score, and last-updated overlaid]
6. Streamable HTTP for Multi-User Setups
Stdio is great for one developer on one machine. Two cases push you to the HTTP transport.
Centralized prompt server for a team. Everyone on the team reaches the same MCP server URL instead of running a copy locally. New prompts ship to all clients without anyone running git pull or restarting an IDE.
Cloud-hosted agents. A custom agent running on a server cannot exec a local Python process. It can hit an HTTP endpoint.
The minimal HTTP server with FastMCP:
from mcp.server.fastmcp import FastMCP

app = FastMCP("kmp-prompt-server", host="0.0.0.0", port=8765)

@app.prompt(name="refactor-component")
def refactor_component(file_path: str) -> str:
    """Refactor a React component for performance and readability."""
    return f"Review the component at {file_path}..."

if __name__ == "__main__":
    app.run(transport="streamable-http")
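For the cloud-agent case, the client side is symmetric. A sketch with the Python SDK, assuming the server above and FastMCP's default /mcp mount path:

```python
# Connecting from a cloud-hosted agent over Streamable HTTP.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    url = "http://localhost:8765/mcp"  # swap in your deployed host
    async with streamablehttp_client(url) as (read, write, _get_session_id):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.get_prompt(
                "refactor-component",
                arguments={"file_path": "src/components/Dashboard.tsx"},
            )
            print(result.messages[0].content)

asyncio.run(main())
```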
Stateful sessions are the rough edge here. The 2026 roadmap explicitly calls out that "stateful sessions fight with load balancers, horizontal scaling requires workarounds" [2], and the upcoming stateless Streamable HTTP work is meant to fix this. Until then, run a single instance, or put a session-affinity proxy in front if you need more than one, and you are fine for any team under 100 seats.
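If you do scale past one instance before the stateless work lands, IP-hash affinity at the proxy is usually enough. A minimal nginx sketch; the upstream addresses and hostname are assumptions:

```nginx
# Pin each client to one backend so the stateful MCP session survives.
upstream mcp_prompts {
    ip_hash;                      # affinity by client IP
    server 10.0.0.11:8765;
    server 10.0.0.12:8765;
}

server {
    listen 443 ssl;
    server_name prompts.internal;

    location /mcp {
        proxy_pass http://mcp_prompts;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_buffering off;      # streamed responses must not be buffered
        proxy_read_timeout 300s;  # keep long-lived streams open
    }
}
```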
7. What MCP Does Not Solve
Once you have a prompt server running, it becomes obvious what MCP is and is not. The protocol covers transport. It does not cover the things you actually need to operate a prompt library.
Discovery beyond the catalog. prompts/list tells you the names. It does not tell you which prompts are good, which are deprecated, which scored 90 and which scored 50. Quality is your problem.
Versioning UX. You can implement any of the three patterns above, but MCP itself has no opinion on rollback, side-by-side comparison, or "which version is current".
Quality gating. A bad prompt exposed via MCP is just as discoverable as a good one. If you do not want to expose a prompt that scores below a threshold, you have to enforce that before the prompt enters the server.
Audit and observability. Who used refactor-component@v3 last week? With which arguments? Through which client? The 2026 roadmap names structured observability as a gap [2]. Today, you log it yourself.
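Logging it yourself can be as small as a decorator between @app.prompt() and the function body. A sketch; the log fields are assumptions, MCP standardizes none of this, and client identity is not visible at this layer without transport-level logging:

```python
# Do-it-yourself audit trail for prompt usage.
import functools
import json
import logging
import time

from mcp.server.fastmcp import FastMCP

app = FastMCP("kmp-prompt-server")
audit_log = logging.getLogger("prompt-audit")
logging.basicConfig(level=logging.INFO)

def audited(fn):
    @functools.wraps(fn)  # preserves the signature FastMCP introspects
    def wrapper(*args, **kwargs):
        audit_log.info(json.dumps({
            "prompt": fn.__name__,
            "arguments": kwargs,  # prompt arguments arrive as keyword args here
            "timestamp": time.time(),
        }))
        return fn(*args, **kwargs)
    return wrapper

@app.prompt(name="refactor-component")
@audited
def refactor_component(file_path: str) -> str:
    """Refactor a React component for performance and readability."""
    return f"Review the component at {file_path}..."
```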
These are not MCP failures. They are prompt-library concerns. The protocol moves bytes between client and server. What lives on the server side is up to you.
8. Keep My Prompts + MCP: A Worked Example
This is where Keep My Prompts fits, and the boundary is clean. Keep My Prompts holds the prompts as canonical, versioned, scored objects. The MCP server is a thin layer in front of Keep My Prompts that picks the right version and serves it.
A typical setup:
Prompts live in Keep My Prompts, each one versioned and scored on six criteria (specificity, context, structure, constraints, role, output format).
The Promptimizer rewrites weak prompts; the quality gate rejects variants that do not score higher than the original.
The MCP server resolves a prompt name to "latest version that passed the quality gate", reads it from Keep My Prompts, and returns it through prompts/get.
Claude Code, Cursor 3, Windsurf, and any custom agent all read the same library through the same protocol.
The split is what makes this work. MCP solves the "how do clients reach the prompt" half. Keep My Prompts solves the "is the prompt any good, has it changed, who else is using it" half. Either alone is incomplete.
For solo developers this is overkill until you have more than a handful of prompts. For small teams it is the moment the multi-IDE pain stops being a tax. Free to start, no credit card required.
9. The Signal
Three things are worth saying clearly.
First, MCP is not "enterprise-ready" in the strict sense yet. The 2026 roadmap explicitly names four open gaps: audit trails, SSO-integrated auth, gateway behavior, and configuration portability [2]. If you work somewhere that needs SSO and SIEM integration before anything ships, you are early.
Second, MCP is absolutely production-ready for solo developers and small teams. The Python SDK is stable, the three biggest IDE clients ship with first-class MCP support, and the prompts primitive is documented and predictable. The drift problem this article opened with is solved by writing roughly 30 lines of Python.
Third, the wrappers you build now compound. The MCP server you wrote today keeps working as the spec adds async tasks, scoped auth, and stateless Streamable HTTP. The clients pick up new capabilities as they implement them. Your prompt server does not need to change.
The teams shipping fastest in 2026 are the ones who treat their prompt library as a protocol-level asset, not a copy-paste artifact. Thirty lines of Python and a config block in three IDEs is a small price for the prompt library you keep.
Keep My Prompts lets you keep a single, scored, versioned prompt library that you can front with an MCP server in minutes. Free to start, no credit card required. For more on multi-IDE prompt strategy, see our guide on Cursor 3 vs Claude Code vs Windsurf.