DEV Community

OnlineProxy
Building Your First MCP Server with Python and Vibe Coding

We have reached a plateau in how we interact with Large Language Models (LLMs). Most developers are still treating AI like a very smart librarian—asking questions and waiting for text-based answers. But the paradigm is shifting. We are moving from Retrieval (RAG) to Agency (Action).

This shift is powered by the Model Context Protocol (MCP). If you are relying solely on pasting context into a chat window or dealing with static RAG pipelines, you are working with one hand tied behind your back. An MCP server acts as the neurological bridge between an LLM (like Claude) and your local filesystem, databases, or external APIs. It turns the AI from a chatbot into an operator.

But here is the friction point: building these servers feels like boilerplate purgatory. It doesn't have to be. By leveraging modern tooling, specifically the Python SDK's FastMCP class and "Vibe Coding" techniques in Cursor, you can architect a robust server in minutes, not hours.

This article dissects the construction of a Python-based MCP server. We will not just look at the code; we will examine the architectural decisions, the debugging workflows, and the implementation of Tools, Resources, and Prompts that transform a simple script into a production-ready interface.

Why build a server when you can just prompt?

This is the first question every senior engineer asks. Why add the overhead of a server protocol? The answer lies in determinism and dynamic context.

When you paste a file into a chat, that context is static. It degrades. An MCP server, however, establishes a live, bidirectional pipe. It allows the LLM to:

  1. Execute Tools: Perform mathematical calculations or API calls with strict type safety.
  2. Read Resources: Access live documentation, logs, or database rows that change in real-time.
  3. Utilize Prompts: Access standardized, reusable templates that enforce best practices across your team.

We aren't just building a script; we are building an extension of the model's brain that lives on your local machine.

The "Vibe Coding" Stack: A Context-First Architecture

It is not 1999. We do not write boilerplate code line-by-line anymore. The most efficient way to build an MCP server is through Vibe Coding—a methodology where you act as the architect and let an LLM-integrated IDE (like Cursor) handle the implementation details.

The secret to success here is Context Injection. Before writing a single line of Python, your first step is gathering documentation.

The Documentation Indexing Strategy
To get high-quality code generation, you must feed the specific SDK documentation into your IDE's context window.

  1. The Master Context: Locate the llms.txt or comprehensive Markdown documentation for the MCP Python SDK.
  2. The Specifics: Copy the README.md from the relevant repositories (e.g., FastMCP or the source repository you are emulating).
  3. The Project Structure: Create a dedicated folder. Index these documents into Cursor using the @Docs feature.

By doing this, you prevent hallucinations about outdated API methods. You are essentially fine-tuning the coding agent on the specific constraints of the protocol before it writes a single character.

What actually goes inside an MCP Server?

An MCP server is composed of three distinct primitives. Understanding when to use which is the hallmark of a senior developer. We will use FastMCP to implement these, as it abstracts away the low-level JSON-RPC message passing.

1. Tools: The Hands of the Model
Tools are executable functions. In our implementation, we focus on a calculator server, but the logic applies to any API.

A tool is defined by a decorator. Crucially, strictly typing your arguments and providing a verbose docstring is mandatory. The LLM uses the docstring to understand when to call the tool.

```python
from mcp.server.fastmcp import FastMCP
import math

# Initialize the server
mcp = FastMCP("Calculator Server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together. Use this for basic summation."""
    return a + b

@mcp.tool()
def advanced_power(base: float, exponent: float) -> float:
    """Calculate the power of a base number. Handles float inputs."""
    return math.pow(base, exponent)
```

Insight: Notice the docstrings. "Use this for basic summation." This isn't for other developers; it is prompt engineering aimed at the model itself.

2. Resources: The Eyes of the Model
Resources are read-only data pipes. While Tools perform actions, Resources provide context. A common use case is exposing local documentation (like an SDK guide) so the model can reference it without you pasting it into the chat.

The architecture here involves defining a URI scheme (e.g., file://) and mapping it to local paths.

```python
@mcp.resource("file://docs/typescript_sdk")
def get_typescript_sdk_docs() -> str:
    """Read the local TypeScript SDK documentation."""
    with open("./docs/typescript-sdk.md", "r") as f:
        return f.read()
```

When the client (Claude Desktop) connects, it sees this resource available. If you ask, "How do I use the TypeScript SDK?", the model knows to pull this resource automatically.

3. Prompts: The Playbook
Prompts are the most underutilized aspect of MCP. They are reusable templates with dynamic variables. Instead of typing a 500-word system prompt to analyze a meeting transcript every time, you bake it into the server.

A robust prompt template includes:

  • Role Definition: "You are an executive assistant..."
  • Dynamic Slots: {{ date }}, {{ transcript }}.
  • Structured Output directives.

In FastMCP, managing prompts means pointing to these template files and handling the injection of arguments.
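A minimal sketch of such a template as a plain function; in a FastMCP server it would carry the `@mcp.prompt()` decorator so clients can list and fetch it over the protocol. The function name and wording here are illustrative, not part of the SDK:

```python
# In a FastMCP server, decorate this with @mcp.prompt() to register it.
def summarize_meeting(date: str, transcript: str) -> str:
    """Reusable meeting-summary prompt with dynamic slots."""
    return (
        "You are an executive assistant.\n"                              # role definition
        f"Summarize the meeting held on {date}.\n"                       # dynamic slot
        "Return a bullet list of decisions, owners, and deadlines.\n\n"  # structured output
        f"Transcript:\n{transcript}"                                     # dynamic slot
    )
```

Because the template is an ordinary typed function, the client sees the argument names (`date`, `transcript`) as fillable slots, and you never retype the 500-word preamble.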

The Inspector Paradox: Debugging Without the Client

The biggest pitfall in MCP development is the "Restart Loop."

The Anti-Pattern:

  1. Write code.
  2. Update claude_desktop_config.json.
  3. Restart Claude Desktop.
  4. Wait for it to load.
  5. Test.
  6. Fail.
  7. Repeat.

The Pro-Pattern:
Use the MCP Inspector.

The MCP Inspector is a browser-based tool that proxies your server. It allows you to test Tools, Resources, and Prompts in isolation, independent of the main Claude client.

To launch it, run the Inspector (distributed as an npm package) against your server command; uv (the Rust-based Python package manager, highly recommended for its speed) runs the server itself:

```bash
npx @modelcontextprotocol/inspector uv run server.py
```

This command acts as a wrapper. It spins up your server and provides a web interface (usually on localhost:5173).

Critical Debugging Insight:
The Inspector defaults to the stdio transport. Remember that over stdio, stdout is the protocol channel: a stray print() will corrupt the JSON-RPC stream, so route diagnostics to stderr. If your server is crashing silently, the Inspector will capture the error output that Claude Desktop might swallow or obscure. Always validate your "Add" tool or your "Meeting Summary" prompt in the Inspector before ever opening the main client.

Configuring the Bridge: The claude_desktop_config.json

Once your server passes the Inspector test, you must bridge it to the client. This is done via a JSON configuration file.

For Python environments, absolute paths and virtual environment management are non-negotiable. You cannot simply rely on the system python. You must point to the binary inside your .venv.

Example Configuration:

```json
{
  "mcpServers": {
    "calculator-server": {
      "command": "uv",
      "args": [
        "--directory",
        "C:\\Users\\Dev\\Desktop\\mcp-server",
        "run",
        "server.py"
      ],
      "env": {
        "PYTHONUNBUFFERED": "1"
      }
    }
  }
}
```

Note: Using uv as the command handles virtual environment resolution automatically, which is cleaner than pointing to a specific python executable.

Advanced Transport: Stdio vs. SSE vs. HTTP

By default, we use stdio (Standard Input/Output). The client spawns the server process and communicates via text streams in the terminal. This is secure, fast, and perfect for local development.

However, if you want to deploy this server to a virtual machine (VM) or expose it to teammates, stdio fails. You need Streamable HTTP or SSE (Server-Sent Events).

The modern standard is Streamable HTTP, where the client talks to a single HTTP endpoint and the server can upgrade a response into an SSE stream for asynchronous message pushing.

To implement this in Python, the change is trivial in code but significant in architecture:

```python
if __name__ == "__main__":
    # Explicitly select the SSE transport; the default is "stdio"
    mcp.run(transport="sse")
```

Warning: When debugging SSE/HTTP with the Inspector, the connection URL changes. You aren't just connecting to localhost:8000. You usually need to append the endpoint, e.g., http://localhost:8000/sse. Without the specific endpoint path, the handshake will fail.

Step-by-Step Guide: The Checklist

If you are building your first server, follow this strictly to avoid dependency hell.

  1. Environment Setup:
  • Install uv (the modern successor to pip/poetry).
  • Create a directory and run uv init.
  • Create a virtual environment: uv venv.
  2. Context Loading (Vibe Coding):
  • Open Cursor.
  • Add the FastMCP documentation and strict type references to the chat context.
  • Use a prompt: "Create an MCP server with a calculator tool and a markdown resource reader using FastMCP."
  3. Implementation:
  • Check generated code for @mcp.tool decorators.
  • Ensure strict typing (int, str) is used in function signatures.
  • Verify prompt templates are loaded from external files, not hardcoded strings (for maintainability).
  4. Verification:
  • Run npx @modelcontextprotocol/inspector uv run server.py.
  • Test: Can you add numbers? Can you read the documentation resource?
  5. Integration:
  • Locate your Claude config (AppData/Roaming on Windows or Application Support on macOS).
  • Add the server entry pointing to your project directory.
  • Restart Claude.

Handling Edge Cases and "Hallucinations"

Even with Vibe Coding, LLMs struggle with the nuances of new protocols like MCP.

The "List vs. Get" Trap:
A common hallucination in generated code is implementing separate functions for list_prompts and get_prompt. In FastMCP, defining a prompt generally provides the listing capability automatically via its metadata. If your server throws "Schema validation" errors, check whether the LLM tried to reinvent the wheel by manually constructing JSON-RPC responses instead of using the SDK's high-level abstractions.

The Pathing Nightmare:
When adding Resources (files), Python's relative paths (./docs/info.md) are relative to where the command runs, not where the script lives. Always use os.path.dirname(__file__) to resolve absolute paths for your resources, or your server will crash when launched by Claude Desktop from a different working directory.
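A minimal sketch of that fix, anchoring paths to the script's own location via os.path.dirname(__file__) (BASE_DIR and read_doc are illustrative names, not SDK API):

```python
import os

# Resolve paths relative to this script file, not the process working
# directory, so the server still finds its files when Claude Desktop
# launches it from a different location.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

def read_doc(name: str) -> str:
    """Read a documentation file that lives next to the server script."""
    with open(os.path.join(BASE_DIR, "docs", name), "r", encoding="utf-8") as f:
        return f.read()
```

With this in place, a resource handler can call read_doc("info.md") and behave identically whether you launch the server from the project root, the Inspector, or Claude Desktop.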

Final Thoughts

Building an MCP server is the dividing line between being a user of AI and being an architect of AI workflows.

When you successfully deploy a server that combines a calculator, a documentation reader, and dynamic meeting summary prompts, you realize something profound: The model is no longer the product. The system—the integration of your local data, your custom tools, and the model's reasoning—is the product.

You have moved from asking the fridge what you can cook, to stocking the fridge yourself. Don't settle for the tools OpenAI or Anthropic give you. Build the tools you need.

Start with a simple calculator. End with an operating system for your work.
