Building an MCP Server from Scratch: A Step-by-Step 2026 Guide to AI Tool Calling

An LLM without "hands and feet" cannot reach databases, APIs, or local notes. The Model Context Protocol (MCP) standardizes a JSON-RPC bridge between an AI client and external capabilities; in 2026 the ecosystem counts 13,000+ servers, with OpenAI, Google, and Microsoft on board. The pain point: Function Calling means vendor lock-in, so switching models means rewriting the integration. The outcome: after this guide you will be able to build, debug, and deploy a production-ready MCP Server on your own, with throughput and Unified Memory in mind. Structure: protocol → environment → Hello World → Tools/Resources/Prompts → remote HTTP → debugging → Docker → knowledge base → ecosystem.

1. What is MCP? Protocol first, code second

1.1 Background

Three generations of tool calling: Function Calling (proprietary to OpenAI) → Plugins (ChatGPT, fading out) → MCP (open standard). Anthropic open-sourced MCP in November 2024; the motivation was the N×M explosion of integrations, where each of N clients needs its own adapter for each of M tools. MCP standardizes the wire protocol between AI and tools, so one Server works with Cursor, Claude Desktop, and VS Code. In 2026, governance sits with the AAIF (Linux Foundation).

1.2 Protocol architecture

┌────────────────────┐         ┌─────────────────────┐
│   MCP Client       │ ◄─────► │   MCP Server        │
│  (Claude / Cursor) │  JSON   │  (your code)        │
│                    │  -RPC   │                     │
└────────────────────┘         └─────────────────────┘
                                        │
                          ┌─────────────┼─────────────┐
                          ▼             ▼             ▼
                       Tools       Resources      Prompts
                    (tool call)   (read data)   (prompt templates)

Client: the AI host (Claude Desktop, Cursor, a custom agent). Server: your capability provider. Three core primitives: Tools are callable functions (search, calc, SQL); Resources are readable data (files, config, URLs); Prompts are templates with parameter injection.

1.3 Wire protocol

Basis: JSON-RPC 2.0. Transports: stdio (local subprocess, zero network overhead, latency <5 ms) and Streamable HTTP (introduced in spec revision 2025-03-26 as the replacement for HTTP+SSE; suited to remote and multi-client use). Lifecycle: initialize handshake with capability negotiation → notifications/initialized → discovery (tools/list) and request/response → shutdown. ⚠️ With stdio, never write non-protocol output to stdout, or the client's JSON-RPC parsing will fail; send logs to stderr.

1.4 MCP vs. alternatives

Dimension	MCP	OpenAI Function Calling	LangChain Tools
Standardization	Open protocol	Vendor lock-in	Framework-bound
Transport	stdio / Streamable HTTP	HTTP	HTTP
Cross-model	Yes	No	Partial
Resources/Prompts	Native	No	No
Ecosystem	13,000+ servers (2026)	Mature	Mature

2. Setting up the dev environment

2.1 Choosing a language

Python (entry point): the official SDK mcp with FastMCP decorators, minimal boilerplate. TypeScript (frontend/fullstack): @modelcontextprotocol/sdk with Zod, 150M+ npm downloads. This guide uses Python, with TypeScript shown for reference.

2.2 Setup

# Python
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install "mcp[cli]"

# TypeScript (reference)
npm init -y
npm install @modelcontextprotocol/sdk zod

2.3 Project structure

my-mcp-server/
├── server.py            # Entry point
├── tools/               # Tool modules
│   ├── calculator.py
│   └── web_search.py
├── resources/           # Resource modules
│   └── file_reader.py
├── prompts/             # Prompt templates
│   └── templates.py
├── tests/
│   └── test_tools.py
├── pyproject.toml
└── README.md

2.4 Debug toolchain

1) MCP Inspector: npx @modelcontextprotocol/inspector python server.py, UI at localhost:6274. 2) Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS). 3) Cursor: Settings → MCP → mcpServers. Deep dive: the MCP protocol breakdown.

3. Your first MCP Server: Hello World

3.1 Minimal Server

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-first-server")

@mcp.tool()
def say_hello(name: str) -> str:
    """Greet the user by name."""
    return f"Hello, {name}! This is your first MCP tool."

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport

3.2 Running and verifying

python server.py   # starts the stdio server; it waits silently for JSON-RPC on stdin
# or via MCP Inspector
npx @modelcontextprotocol/inspector python server.py

3.3 Connecting Cursor / Claude Desktop

// claude_desktop_config.json or Cursor MCP config (drop this line when saving: JSON has no comments)
{
  "mcpServers": {
    "my-first-server": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"]
    }
  }
}

⚠️ Use absolute paths only, for both the Python interpreter and the script. After restarting the client, say_hello should appear in the tool context.
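One way to avoid hand-typed relative paths is to generate the entry. The sketch below is an assumption-level helper (build_mcp_config is a hypothetical name) that prints an mcpServers block with absolute paths; run it from inside the activated virtual environment so that sys.executable points at the venv interpreter.

```python
import json
import os
import sys


def build_mcp_config(server_name: str, script: str, python: str = sys.executable) -> dict:
    """Return an mcpServers entry that uses absolute paths only."""
    # abspath rather than realpath: following the venv's python symlink
    # would point at the base interpreter and bypass the virtual environment.
    return {
        "mcpServers": {
            server_name: {
                "command": os.path.abspath(python),
                "args": [os.path.abspath(script)],
            }
        }
    }


config = build_mcp_config("my-first-server", "server.py")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into the client config file and restart the client.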

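4. Tools: functions the AI can invoke

4.1 Basic structure

The function signature is the schema source: type hints plus the docstring become the JSON Schema the LLM sees. Naming: snake_case and semantic (web_search rather than ws). Error handling: return structured error strings rather than letting raw exceptions escape. FastMCP reports an exception raised inside a tool as an error result rather than crashing the process, but an unhandled exception message gives the LLM little to act on.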
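The idea of "signature as schema source" can be illustrated without the SDK. The following is a simplified sketch of the concept, not FastMCP's actual implementation: describe_tool is a hypothetical helper that maps a few primitive type hints to JSON Schema types and treats parameters without defaults as required.

```python
import inspect
from typing import get_type_hints

# Minimal Python-type -> JSON Schema type mapping (illustrative, not exhaustive)
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn) -> dict:
    """Derive a tool descriptor from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing hints fall back to "string" in this sketch
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> the LLM must supply it
        else:
            properties[name]["default"] = param.default
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def web_search(query: str, max_results: int = 5) -> str:
    """Search the web and return matching results."""
    return ""


schema = describe_tool(web_search)
```

The takeaway is practical: a missing type hint or an empty docstring degrades what the model sees, so both deserve the same care as the function body.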

4.2 Input types

from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="Search query")
    max_results: int = Field(default=5, description="Maximum number of results")
    language: str = Field(default="ru", description="Language of the results")

# `mcp` is the FastMCP instance created in section 3.1
@mcp.tool()
def web_search(params: SearchInput) -> list[dict]:
    """Search the web and return a list of results."""
    # Placeholder result; call a real search API here
    return [{"title": "Example", "url": "https://example.com"}]

4.3 Production tools: 5 patterns

Tool	Purpose	Implementation
Calculator	Math evaluation	Parse with `ast` and allowlist operators; never raw `eval`
File I/O	Local read/write	Root allowlist, path traversal checks
HTTP	External API	`httpx` with a 30s timeout
DB query	Read-only SQL	Parameterized queries, no DDL
Time	Time zone conversion	`zoneinfo` (stdlib)
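As an example of the first pattern, here is a minimal sketch of a calculator that walks the parsed AST and allows only four arithmetic operators. The function name calculate and the 200-character cap are assumptions chosen for illustration; exponentiation is deliberately left out because inputs like 9**9**9 can exhaust CPU and memory.

```python
import ast
import operator

# Only these operators are permitted; anything else is rejected
_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST):
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    # type(...) check (not isinstance) so that booleans are rejected too
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def calculate(expression: str) -> str:
    """Evaluate basic arithmetic; return the result or an error string."""
    if len(expression) > 200:  # arbitrary cap to bound parse depth and cost
        return "error: expression too long"
    try:
        return str(_evaluate(ast.parse(expression, mode="eval")))
    except ZeroDivisionError:
        return "error: division by zero"
    except (ValueError, SyntaxError) as exc:
        return f"error: {exc}"
```

Function calls, attribute access, and names never reach evaluation, so an input such as __import__('os') comes back as an error string instead of executing.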

4.4 Async tools

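import httpx

@mcp.tool()
async def fetch_url(url: str) -> str:
    """Fetch a URL and return up to 10,000 characters of the response body."""
    try:
        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.get(url)
            response.raise_for_status()  # treat 4xx/5xx responses as errors
    except httpx.HTTPError as exc:
        return f"error: {type(exc).__name__}: {exc}"
    return response.text[:10000]  # truncate to protect the token budget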

4.5 Error handling best practices

1) Structured errors: {"error": "...", "code": "TIMEOUT"}. 2) Timeouts: a hard 30s limit on all I/O. 3) Auth: enforce allowlists at the Tool layer; do not rely on the LLM's discipline.
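The first two practices combine naturally in one wrapper. This is a sketch under assumptions: run_with_limit and the INTERNAL code are hypothetical names, and slow_lookup stands in for a real upstream call.

```python
import asyncio
import json


async def run_with_limit(coro, timeout_s: float = 30.0) -> str:
    """Await a tool coroutine; always return a JSON string, never raise."""
    try:
        result = await asyncio.wait_for(coro, timeout=timeout_s)
        return json.dumps({"ok": True, "result": result})
    except asyncio.TimeoutError:
        return json.dumps({"error": f"no result within {timeout_s}s", "code": "TIMEOUT"})
    except Exception as exc:  # last-resort guard at the tool boundary
        return json.dumps({"error": str(exc), "code": "INTERNAL"})


async def slow_lookup() -> str:
    await asyncio.sleep(1.0)  # stands in for a slow upstream call
    return "done"


# A deliberately tiny limit so the timeout path is exercised
outcome = json.loads(asyncio.run(run_with_limit(slow_lookup(), timeout_s=0.05)))
```

A machine-readable code field lets the model decide whether to retry, narrow the query, or report the failure to the user.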

5. Resources: dynamic content for the AI

5.1 Resource vs Tool

A Resource is a data provider (mostly read-only); a Tool is an action executor. URI schemes: file://, http://, custom://.

5.2 Static vs dynamic

import json

@mcp.resource("config://app-settings")
def get_app_settings() -> str:
    """App config"""
    return json.dumps({"version": "1.0", "env": "production"})

@mcp.resource("user://{user_id}/profile")
def get_user_profile(user_id: str) -> str:
    """User profile by ID"""
    return json.dumps({"user_id": user_id, "name": "Demo"})

5.3 Resource types

Text (text/plain, application/json), binary (images/PDF → base64), streaming (real-time feeds через Streamable HTTP).
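The text versus binary distinction shows up in the shape of the returned contents: text goes in a text field, binary data in a base64-encoded blob field. The helper names below are illustrative assumptions and the URIs are made up; the field names follow the resource contents shape in the MCP specification.

```python
import base64


def text_contents(uri: str, text: str, mime_type: str = "text/plain") -> dict:
    """Text resource: the payload travels as-is in the 'text' field."""
    return {"uri": uri, "mimeType": mime_type, "text": text}


def blob_contents(uri: str, data: bytes, mime_type: str) -> dict:
    """Binary resource: the payload is base64-encoded into the 'blob' field."""
    return {"uri": uri, "mimeType": mime_type, "blob": base64.b64encode(data).decode("ascii")}


png_signature = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of any PNG file
image = blob_contents("file:///tmp/logo.png", png_signature, "image/png")  # hypothetical URI
readme = text_contents("file:///tmp/README.md", "# Notes")                # hypothetical URI
```

Base64 inflates payloads by about a third, so large binaries are usually better served by reference than inlined.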

5.4 Case: filesystem resource server

List dir, read file, optional resources/subscribe. Production: strict root allowlist — см. mcp-server-filesystem design.
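A strict root allowlist can be sketched with pathlib: resolve the requested path (which also follows symlinks) and refuse anything that lands outside the root. The names resolve_inside, read_note, and PathNotAllowed are hypothetical, and the temporary directory only serves the demonstration.

```python
import tempfile
from pathlib import Path


class PathNotAllowed(Exception):
    """Raised when a requested path escapes the allowed root."""


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`; reject traversal and absolute paths."""
    root = root.resolve()
    candidate = (root / requested).resolve()  # collapses '..' and follows symlinks
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PathNotAllowed(f"{requested!r} is outside the allowed root")
    return candidate


def read_note(root: Path, requested: str, max_bytes: int = 100_000) -> str:
    """Read at most max_bytes from a file inside the allowed root."""
    path = resolve_inside(root, requested)
    return path.read_bytes()[:max_bytes].decode("utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "notes.md").write_text("hello", encoding="utf-8")
    inside = read_note(demo_root, "notes.md")
    try:
        read_note(demo_root, "../outside.txt")
        escaped = True
    except PathNotAllowed:
        escaped = False
```

Checking the resolved path, rather than scanning the input string for "..", also covers absolute paths and symlinks that point out of the root.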

6. Prompts: reusable prompt templates

6.1 MCP Prompt primitive

Predefined prompt fragments + dynamic params — team consistency. Complements Cursor Agent Skills: MCP Prompt = protocol layer, Skill = runbook.

6.2 Template creation

from mcp.server.fastmcp.prompts import base

@mcp.prompt()
def code_review_prompt(language: str, code: str) -> list[base.Message]:
    """Code review template"""
    # base.UserMessage wraps the string in a text content block with role="user"
    return [
        base.UserMessage(
            f"""Review the following {language} code for:
1. Quality and readability
2. Bugs and security issues
3. Performance optimization

```{language}
{code}
```"""
        )
    ]

6.3 Multi-turn prompts

Templates that combine user and assistant roles, for example an interview simulation or a debug assistant that asks clarifying questions.
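A multi-turn template is just an ordered list of role-tagged messages. The sketch below uses plain dicts that mirror the prompt message shape (role plus a text content block) so it runs without the SDK; debug_assistant_prompt and its wording are illustrative assumptions.

```python
def debug_assistant_prompt(error: str) -> list[dict]:
    """Seed a conversation in which the assistant asks before it fixes."""

    def text(role: str, body: str) -> dict:
        # Prompt messages carry one of two roles: "user" or "assistant"
        return {"role": role, "content": {"type": "text", "text": body}}

    return [
        text("user", f"I'm seeing this error:\n{error}"),
        text("assistant", "Before I suggest a fix: what did you change last, "
                          "and can you reproduce the error reliably?"),
        text("user", "Ask one clarifying question at a time, then propose a fix."),
    ]


messages = debug_assistant_prompt("KeyError: 'user_id'")
```

The pre-filled assistant turn steers the model toward asking questions first, which a single user message cannot express as reliably.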

7. Advanced: HTTP transport (remote MCP Server)

7.1 stdio vs Streamable HTTP

Characteristic	stdio	Streamable HTTP
Deployment	Local process	Remote server
Latency	<5 ms (zero network)	50–200 ms (network-bound)
Multi-client	No	Yes
Use case	Local tools, IDE	SaaS, teams, 24/7 services

⚠️ HTTP+SSE has been deprecated since spec revision 2025-03-26; use Streamable HTTP for all new projects.

7.2 HTTP transport implementation

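from mcp.server.fastmcp import FastMCP

# In the official SDK, host and port are server settings passed to the constructor, not to run()
mcp = FastMCP("remote-server", host="127.0.0.1", port=8000)

if __name__ == "__main__":
    mcp.run(transport="streamable-http")  # serves the MCP endpoint at /mcp by default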

Production: run under uvicorn/gunicorn behind a reverse proxy. Serverless (Cloud Run/Lambda): set stateless_http=True, because in-memory session state is lost on cold start.

7.3 Auth и security

Bearer tokens, API key middleware, a CORS allowlist, and rate limiting (100 req/min/IP). In development, bind to 127.0.0.1 and never expose 0.0.0.0 without auth. 2026: 30+ MCP CVEs, including a CVSS 9.6 RCE in mcp-remote.

8. Debug и testing

8.1 MCP Inspector

UI workflow: list Tools → invoke manually → inspect raw JSON-RPC → simulate timeouts and errors. Calling a tool directly in Inspector skips the LLM round trip, which makes debugging iterations roughly 10× faster.

8.2 Unit tests

import sys

import pytest
from mcp.client.session import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client

# Requires pytest-asyncio; assumes server.py registers a `calculate` tool
@pytest.mark.asyncio
async def test_calculator_tool():
    # sys.executable keeps the server subprocess in the same virtual environment
    server_params = StdioServerParameters(
        command=sys.executable, args=["server.py"]
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("calculate", {"expression": "2 + 2"})
            assert "4" in result.content[0].text

8.3 Troubleshooting matrix

Error	Cause	Fix
Tool missing from AI context	Wrong path in config	Use absolute paths in the client config
JSON serialization failure	Unsupported return type	Convert to str/dict
Timeout disconnect	Slow tool execution	Async + 30s limit
Permission denied	Path restricted	Configure the allowlist

9. Production deploy

9.1 Docker containerization

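FROM python:3.12-slim
WORKDIR /app
# Assumes requirements.txt lists the server's dependencies, e.g. mcp[cli]
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
# Inside the container the server must bind 0.0.0.0 (not 127.0.0.1) to be reachable on the published port
CMD ["python", "server.py"]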

9.2 Cloud deploy

Railway / Render: one-click, ~$5–20/mo. AWS Lambda / Cloud Run: serverless, pay-per-invoke. Self-hosted VPS: Nginx + Let's Encrypt + systemd/launchd keepalive.

9.3 Observability

Structured logs (JSON Lines), Prometheus metrics (mcp_tool_calls_total), Sentry alerts, /health endpoint. P99 latency alert threshold: 5s.
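A minimal sketch of the first two items, using only the standard library: a JSON Lines formatter that writes to stderr (stdout stays reserved for JSON-RPC under stdio) and an in-process counter standing in for the Prometheus metric. JsonLineFormatter, record_tool_call, and the field names are illustrative assumptions.

```python
import json
import logging
import sys
import time
from collections import Counter

# In-process stand-in for a Prometheus counter such as mcp_tool_calls_total
tool_calls_total = Counter()


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": round(record.created, 3),
            "level": record.levelname,
            "msg": record.getMessage(),
        }
        payload.update(getattr(record, "fields", {}))  # structured extras, if any
        return json.dumps(payload, ensure_ascii=False)


handler = logging.StreamHandler(sys.stderr)  # never stdout on the stdio transport
handler.setFormatter(JsonLineFormatter())
log = logging.getLogger("mcp_server")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False


def record_tool_call(tool: str, started: float, ok: bool) -> float:
    """Count the call, log it with its latency, and return the latency in ms."""
    latency_ms = (time.perf_counter() - started) * 1000
    tool_calls_total[(tool, "ok" if ok else "error")] += 1
    log.info("tool_call", extra={"fields": {"tool": tool, "ok": ok,
                                            "latency_ms": round(latency_ms, 2)}})
    return latency_ms
```

Call record_tool_call(name, started, ok) in a finally block around each tool body, with started taken from time.perf_counter() before the work begins.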

9.4 Versioning

Declare MCP protocol version at handshake; backward-compatible tool upgrades; capabilities negotiation.
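The version rule at handshake is simple enough to sketch: the server echoes the client's requested version if it supports it, and otherwise answers with the latest version it does support, leaving the client to decide whether to continue. The function names are hypothetical; the version strings are published MCP spec revisions.

```python
SUPPORTED_VERSIONS = ["2024-11-05", "2025-03-26", "2025-06-18"]  # oldest -> newest


def negotiate_version(requested: str, supported: list[str] = SUPPORTED_VERSIONS) -> str:
    """Server side: echo the requested version if supported, else offer the latest."""
    if requested in supported:
        return requested
    return supported[-1]


def client_accepts(server_version: str, client_supported: list[str]) -> bool:
    """Client side: disconnect if the server's answer is a version it cannot speak."""
    return server_version in client_supported
```

An older client asking for 2024-11-05 keeps working against this server, which is what makes backward-compatible upgrades possible.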

10. Case study: personal knowledge base MCP Server

10.1 Requirements

The AI searches local Markdown notes with semantic retrieval and can create or update notes. Example Cursor query: "What did I write about MCP last week?"

10.2 Tech stack

Component	Choice	Rationale
Vector DB	ChromaDB / Qdrant	Local, zero-ops
Embedding	text-embedding-3-small	1536-dim, low cost
File watcher	watchfiles	Auto re-index on change

10.3 Core implementation

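Four modules: index_notes, semantic_search, write_note, and the resource notes://{path}. Indexing ~1,000 notes takes 2–5 min on an M4 Pro. With a locally run embedding model, throughput is limited by Unified Memory bandwidth rather than CPU clock; with a hosted model such as text-embedding-3-small, API round trips dominate instead.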
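The index and search modules can be sketched end to end with a toy embedding. Everything here is a stand-in under stated assumptions: embed is a bag-of-words counter rather than a real embedding model, the in-memory list replaces ChromaDB or Qdrant, and NoteIndex is a hypothetical class. Only the flow (chunk, embed, rank by cosine similarity) carries over to the real stack.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. Swap in a real model in practice."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class NoteIndex:
    def __init__(self) -> None:
        self._chunks: list[tuple[str, str, Counter]] = []  # (path, text, vector)

    def index_note(self, path: str, text: str) -> None:
        """(Re)index one note, split into paragraph chunks."""
        self._chunks = [c for c in self._chunks if c[0] != path]  # drop stale chunks
        for chunk in filter(None, (part.strip() for part in text.split("\n\n"))):
            self._chunks.append((path, chunk, embed(chunk)))

    def semantic_search(self, query: str, top_k: int = 3) -> list[dict]:
        """Return the top_k chunks ranked by cosine similarity to the query."""
        q = embed(query)
        scored = sorted(
            ((cosine(q, vec), path, text) for path, text, vec in self._chunks),
            key=lambda item: item[0],
            reverse=True,
        )
        return [{"path": p, "text": t, "score": round(s, 3)}
                for s, p, t in scored[:top_k] if s > 0]


index = NoteIndex()
index.index_note("mcp.md", "Deploy the MCP server with Docker.\n\n"
                           "Streamable HTTP is the remote transport.")
index.index_note("cooking.md", "Simmer the tomato sauce for twenty minutes.")
hits = index.semantic_search("MCP deploy")
```

Only the matching chunk is returned to the model, which is how the rest of the note library stays out of the context window.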

10.4 Demo flow

Cursor input: "What do my notes say about MCP deploy over the past week?" → the Agent calls semantic_search(query="MCP deploy", days=7) → 3 chunks (similarity 0.82–0.91) → the LLM synthesizes an answer with citations. The full note library stays outside the context window, so no tokens are wasted on it.

11. MCP ecosystem и outlook

11.1 Recommended servers

mcp-server-filesystem, mcp-server-github, mcp-server-brave-search, mcp-server-postgres, mcp-server-slack. Official registry: 13 000+ Server.

11.2 Trends 2026

Full support from the big four (Anthropic, OpenAI, Google, Microsoft); MCP Marketplace; enterprise OAuth 2.1 roadmap; a complement to Google A2A (MCP = vertical tool layer, A2A = horizontal agent orchestration).

11.3 Next steps

① Spec на modelcontextprotocol.io; ② publish first public MCP Server; ③ MCP + Agent combos; ④ contribute to Python/TS SDK.

12. Five-step launch checklist

Step 1 — FastMCP Hello World, Inspector verify. Step 2 — 3 business Tools + 1 Resource. Step 3 — Cursor/Claude Desktop (absolute paths). Step 4 — Streamable HTTP remote. Step 5 — Docker + monitoring + security audit.

13. Reference metrics

Metric	Value
MCP Server ecosystem (2026)	13 000+
18-month growth	7.8× (1 200 → 9 400+)
TS SDK downloads	150M+
MCP CVEs 2026	30+
Recommended Tool I/O timeout	30s
Knowledge base index (1000 notes)	2–5 min (M4 Pro)

14. Deep case: local stdio → remote Mac compute node

An AI engineer on a MacBook Air runs Cursor plus 5 stdio MCP Servers (filesystem, postgres, brave-search, knowledge base, browser tool). Of 24GB Unified Memory, 19GB is resident; the machine thermal-throttles, and closing the lid drops every connection. Migration: the laptop keeps only the Cursor host; the 5 Servers move to a remote Mac mini (64GB Unified Memory) over Streamable HTTP with launchd keepalive, and the client connects to url: http://node.macgpu.local:8000/mcp. Tool call P99 drops from 180ms (local, throttled) to 95ms (remote, no CPU/GPU throttling), with Unified Memory headroom for concurrent tool invocations.

A Windows or Linux VPS can run an MCP Server, but for parallel graphics/multimedia plus AI toolchains (Xcode, ComfyUI, Final Cut), macOS with Metal throughput remains the best fit. Local stdio is for development and verification; 24/7 production belongs on a remote Apple Silicon node, with Unified Memory serving concurrent Tool calls and the laptop acting only as the orchestration layer.

For stable rented hosting of MCP Server clusters and an Agent Gateway: a MACGPU remote Mac node offers 24/7 uptime, a pre-configured HTTP reverse proxy, and Unified Memory that is not clogged by a stdio subprocess pool, taking you from "runs" to "runs at full throughput".