MCP 2026: Model Context Protocol as the HTTP of the AI Era, a Technical Breakdown

Before 2024, AI tool integration was chaos: ChatGPT Plugins, OpenAI Function Calling, Claude Tool Use, and IDE plugins all had incompatible contracts. Swapping the model meant rewriting the adapters. Model Context Protocol (MCP) is an open-source specification from Anthropic (November 2024) built on JSON-RPC 2.0: a single wire format between Host/Client and Server. 2026: OpenAI, Google, and Microsoft are in the ecosystem; 10 000+ MCP Servers; governance sits with AAIF (Linux Foundation). Protocol-level thesis: REST answers can-invoke; MCP answers discover-select-invoke inside the agent loop. This breakdown covers: N×M combinatorics, the three-layer stack, wire-level diff vs REST, timeline, SSE/session affinity, security surface, 5-step deploy, throughput on Apple Silicon.

1. N×M combinatorics: why integration blows up

An LLM without tools is an inference-only box. N models × M backends = N×M adapter implementations. LangChain/CrewAI/OpenClaw do not share tool schemas. Vendor lock-in: a CRM adapter written for Claude does not port to GPT without a rewrite. The analogy is the pre-USB-C world: proprietary connectors everywhere. MCP target state: a write-once server that N hosts consume via tools/list.

Stack layer	Failure mode without MCP
CRM × 3 models	3 parallel adapters, ~40 eng-days/year maintenance
IDE agent	Different code paths: fs, DB, HTTP
Agent frameworks	Non-portable tool definitions
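
The combinatorics above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one adapter per (model, backend) pair in the point-to-point case and one client config per host plus one server per backend in the MCP case; the function names are illustrative only.

```python
def adapter_count(models: int, backends: int) -> int:
    """Point-to-point integration: one adapter per (model, backend) pair."""
    return models * backends


def mcp_component_count(models: int, backends: int) -> int:
    """Shared protocol: one client config per host plus one server per backend."""
    return models + backends


for n, m in [(3, 1), (3, 3), (5, 10)]:
    print(
        f"{n} models x {m} backends: "
        f"{adapter_count(n, m)} adapters vs "
        f"{mcp_component_count(n, m)} MCP components"
    )
```

With a single backend the two counts are close (3 vs 4); the gap opens once both N and M grow, which is where the multiplicative term dominates.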

2. Wire stack: Host / Client / Server

Transport primitives: STDIO (local subprocess, zero network hop) and HTTP+SSE (remote; client requests over HTTP POST, server push over SSE). Payload: JSON-RPC 2.0. Discovery: tools/list. Data plane: tools/call, resources/read. Server-initiated messages are the key difference from stateless REST.

┌─────────────────────────────────┐
│  Host (Cursor, Claude Desktop)   │
│  ┌───────────────────────────┐  │
│  │  MCP Client (1:1 per Server)│  │
│  └───────────────────────────┘  │
└─────────────────────────────────┘
             ↕ JSON-RPC 2.0
┌─────────────────────────────────┐
│  MCP Server (Tools/Resources)    │
└─────────────────────────────────┘
             ↕
┌─────────────────────────────────┐
│  Backend (DB, API, filesystem)   │
└─────────────────────────────────┘

Transport	Latency profile	Scaling constraint
STDIO	<1 ms IPC, cold start ~300–500 ms	Single machine, process-per-server
HTTP+SSE	Network RTT + keepalive	Session affinity required for SSE stickiness

{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": { "sql": "SELECT * FROM users LIMIT 10" }
  },
  "id": 1
}
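
A client has to build such requests and match each response to its request by `id`. The sketch below shows one way to do that with the standard library only; `make_request` and `parse_response` are hypothetical helper names, and the server reply is made up for illustration.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # monotonically increasing request ids


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def parse_response(raw: str, expected_id: int) -> dict:
    """Return the result of a response, or raise if it is an error or a mismatch."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("id") != expected_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in msg:
        raise RuntimeError(f"{msg['error']['code']}: {msg['error']['message']}")
    return msg["result"]


call = make_request(
    "tools/call",
    {"name": "query_database", "arguments": {"sql": "SELECT * FROM users LIMIT 10"}},
)
print(json.dumps(call))

# Hypothetical server reply, for illustration only.
reply = json.dumps(
    {
        "jsonrpc": "2.0",
        "id": call["id"],
        "result": {"content": [{"type": "text", "text": "10 rows"}]},
    }
)
print(parse_response(reply, call["id"]))
```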

3. MCP vs REST: protocol-level diff

Property	REST	MCP
Discovery	Static OpenAPI, human-curated for LLM	Runtime `tools/list`
State	Stateless request/response	Stateful session context
Schema	External docs	JSON Schema in-band
Direction	Client → Server only	Bidirectional (Server push)

REST = invoke gate. MCP = agent-compatible discovery + selection + invoke: a protocol layer for agent throughput, not a CRUD wrapper.
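
The discover-select-invoke sequence can be sketched against an in-memory stand-in. Everything here is an assumption for illustration: `FakeServer` is not an MCP SDK class, there is no transport, and the word-overlap `select_tool` merely stands in for the choice a model would make from the tool descriptions.

```python
from typing import Any, Callable, Dict, List


class FakeServer:
    """In-memory stand-in for an MCP server (hypothetical, no real transport)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, description: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = {"description": description, "fn": fn}

    def list_tools(self) -> List[Dict[str, str]]:
        return [{"name": n, "description": t["description"]} for n, t in self._tools.items()]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any:
        return self._tools[name]["fn"](**arguments)


def select_tool(tools: List[Dict[str, str]], goal: str) -> str:
    """Stand-in for the model: pick the tool whose description overlaps the goal most."""
    goal_words = set(goal.lower().split())
    best = max(tools, key=lambda t: len(goal_words & set(t["description"].lower().split())))
    return best["name"]


server = FakeServer()
server.register(
    "query_database",
    "run a read-only sql query against the database",
    lambda sql: f"rows for: {sql}",
)
server.register(
    "read_file",
    "read a file from the filesystem",
    lambda path: f"contents of {path}",
)

tools = server.list_tools()                                  # discover
chosen = select_tool(tools, "query the users database")      # select
result = server.call_tool(chosen, {"sql": "SELECT * FROM users LIMIT 10"})  # invoke
print(chosen, "->", result)
```

The point of the sketch is the order of operations: the tool registry is read at runtime, so adding a tool on the server changes what the agent can select without any client-side code change.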

4. Ecosystem timeline 2026

2024-11  Anthropic open-source MCP spec
2025     Cursor, Zed, Continue native MCP
2026-Q1  OpenAI adoption (January)
2026-Q1  Google Gemini MCP (February)
2026-Q2  Microsoft support complete
2026-Q2  Governance → Linux Foundation AAIF

Network effect: +1 Server → available to all compatible Clients. 10 000+ servers (2026). Enterprise integration cost: down 38–55 %. Startup entry barrier: down 62 %. Custom integrators: demand for bespoke adapter work down ~43 %.

5. Limits: OAuth, SSE affinity, attack surface

OAuth 2.0/2.1 on 2026 roadmap — not fully standardized yet. No global MCP registry (DNS-equivalent gap). SSE behind load balancer → sticky sessions or connection routing; stateless HTTP scale patterns don't apply 1:1. Security scan 2026: ~1 000 exposed unauthenticated MCP servers; indirect prompt injection via tool return path documented. A2A (Google): horizontal agent mesh; MCP: vertical model↔tool — complementary layers in agent internet stack.
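
One way to get the session affinity mentioned above is to route on a hash of the session identifier, so that every request of a session reaches the node holding its SSE stream. The sketch assumes the balancer can read a session ID from each request; the backend names are hypothetical.

```python
import hashlib
from typing import List


def pick_backend(session_id: str, backends: List[str]) -> str:
    """Map a session ID to one backend deterministically (hash-based stickiness)."""
    if not backends:
        raise ValueError("no backends configured")
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical node names, for illustration only.
nodes = ["mcp-node-a:8080", "mcp-node-b:8080", "mcp-node-c:8080"]

first = pick_backend("session-1234", nodes)
again = pick_backend("session-1234", nodes)
print(first, first == again)
```

Plain modulo hashing remaps most sessions whenever the node list changes; consistent hashing or an explicit session table would limit that, at the cost of more moving parts.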

6. Five-step deploy runbook (Mac)

Step 1 — Host + bundled servers

Cursor/Claude Desktop: enable filesystem, GitHub, Postgres. STDIO on Mac — zero network overhead.

Step 2 — Validate tools/list

Agent must hydrate its tool registry at runtime from tools/list, not from tool definitions hardcoded into the prompt.
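
A quick structural check of a tools/list result might look like the sketch below. It assumes the result carries a `tools` array whose entries have a `name` and an `inputSchema` of JSON Schema type `object`; `validate_tools_list` is a hypothetical helper, not part of any SDK.

```python
from typing import List


def validate_tools_list(result: dict) -> List[str]:
    """Return the problems found in a tools/list result; an empty list means it looks usable."""
    tools = result.get("tools")
    if not isinstance(tools, list) or not tools:
        return ["result.tools is missing or empty"]

    problems: List[str] = []
    seen = set()
    for i, tool in enumerate(tools):
        if not isinstance(tool, dict):
            problems.append(f"tool #{i}: not an object")
            continue
        name = tool.get("name")
        if not isinstance(name, str) or not name:
            problems.append(f"tool #{i}: missing name")
            continue
        if name in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(name)
        schema = tool.get("inputSchema")
        if not isinstance(schema, dict) or schema.get("type") != "object":
            problems.append(f"{name}: inputSchema must be a JSON Schema object")
    return problems


sample = {
    "tools": [
        {"name": "query_database", "inputSchema": {"type": "object", "properties": {}}},
        {"name": "read_file"},  # deliberately broken: no inputSchema
    ]
}
print(validate_tools_list(sample))
```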

Step 3 — Custom MCP Server

TS/Python SDK. One server → Cursor + Claude + VS Code + Gemini.
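
To show the shape of what a server does, here is a standard-library sketch of the two core methods over newline-delimited JSON. It is not the official SDK and skips the initialization handshake and everything else a real server needs; the `echo` tool, `handle`, and `serve` are hypothetical names. In practice the TS/Python SDK would replace all of this.

```python
import io
import json

# Hypothetical tool table: one trivial tool for illustration.
TOOLS = {
    "echo": {
        "description": "Return the input text unchanged.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "handler": lambda args: args["text"],
    }
}


def _error(req_id, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC request to tools/list or tools/call."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        tools = [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": tools}}
    if method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req_id, -32602, "unknown tool")  # JSON-RPC "Invalid params"
        text = tool["handler"](params.get("arguments", {}))
        return {"jsonrpc": "2.0", "id": req_id, "result": {"content": [{"type": "text", "text": str(text)}]}}
    return _error(req_id, -32601, "Method not found")


def serve(inp, out) -> None:
    """Read one JSON message per line from inp, write one response per line to out."""
    for line in inp:
        line = line.strip()
        if line:
            out.write(json.dumps(handle(json.loads(line))) + "\n")


# Demo with in-memory streams instead of real stdin/stdout.
requests = io.StringIO('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}\n')
responses = io.StringIO()
serve(requests, responses)
print(responses.getvalue())
```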

Step 4 — Transport selection

Dev: STDIO. Prod 24/7: HTTP+SSE on a remote node with launchd keepalive.
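
A launchd job definition for the keepalive part can be generated with Python's `plistlib`. The keys used are standard launchd keys; the label, binary path, port flag, and log path are placeholders that would need to match the actual server.

```python
import plistlib
from typing import List


def launchd_plist(label: str, program_args: List[str], log_path: str) -> bytes:
    """Build an XML launchd job that starts at load and is restarted if it exits."""
    job = {
        "Label": label,
        "ProgramArguments": program_args,
        "RunAtLoad": True,
        "KeepAlive": True,
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(job)  # XML format by default


# Hypothetical label, binary and flags, for illustration only.
xml = launchd_plist(
    "com.example.mcp-server",
    ["/usr/local/bin/my-mcp-server", "--port", "8080"],
    "/tmp/mcp-server.log",
)
print(xml.decode("utf-8"))
```

The output would typically be saved under `~/Library/LaunchAgents/` (per-user) or `/Library/LaunchDaemons/` (system-wide) and loaded with `launchctl`.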

Step 5 — Security hardening

RBAC at server layer, exposure audit, OAuth 2.1 track. Cross-ref: OpenClaw MCP runbook, Cursor Agent Skills — MCP = protocol, Skill = procedure.
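
RBAC at the server layer can be as simple as a role-to-tool allowlist checked before dispatch and applied to the tools/list output as well. The roles, tool names, and helper functions below are hypothetical; how the caller's role is established (for example from an OAuth token) is outside this sketch.

```python
from typing import Dict, List, Set

# Hypothetical policy: which role may call which tool.
ROLE_TOOLS: Dict[str, Set[str]] = {
    "analyst": {"query_database"},
    "admin": {"query_database", "write_file"},
}


class Forbidden(Exception):
    """Raised when a role is not allowed to call a tool."""


def authorize(role: str, tool_name: str) -> None:
    """Raise Forbidden unless the role is allowed to call the tool (default deny)."""
    if tool_name not in ROLE_TOOLS.get(role, set()):
        raise Forbidden(f"role {role!r} may not call {tool_name!r}")


def visible_tools(role: str, all_tools: List[dict]) -> List[dict]:
    """Filter a tools/list payload so a role only discovers what it may call."""
    allowed = ROLE_TOOLS.get(role, set())
    return [t for t in all_tools if t["name"] in allowed]


catalog = [{"name": "query_database"}, {"name": "write_file"}]
print([t["name"] for t in visible_tools("analyst", catalog)])

try:
    authorize("analyst", "write_file")
except Forbidden as exc:
    print("denied:", exc)
```

Filtering discovery and enforcing at call time are both needed: hiding a tool from tools/list does not stop a client from naming it directly in tools/call.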

7. Throughput metrics (citable)

Metric	Value
MCP Server count (2026)	10 000+
Integration cost reduction	38–55 %
Startup entry barrier	−62 %
Exposed unauthenticated servers	~1 000
Compatible hosts	Cursor, Claude, VS Code, Gemini

8. Case: B2B SaaS adapter collapse

«B2B SaaS, 2025: three CRM adapters (Claude, GPT, internal agent), 120 eng-days/year. Q1 2026: a single MCP Server (Postgres + REST) and three Host configs. N×M=9 links → 1 server + 3 clients. Annual maintenance: 28 eng-days (−77 %). Model swap = config field change; tool layer zero diff.»
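
The −77 % figure follows from the two maintenance numbers, assuming both are measured over the same period:

```python
def reduction_pct(before: float, after: float) -> float:
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


# 120 eng-days before, 28 eng-days after, same time basis assumed.
print(f"{reduction_pct(120, 28):.1f} %")  # prints 76.7 %
```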

9. Local MCP vs remote Mac node: Unified Memory pressure

Cursor + N STDIO MCP servers + local MLX inference on 16–32 GB M-series → Unified Memory saturation, swap thrashing, GPU/ANE contention. STDIO isolates each server in its own process, but running 24/7 on a MacBook means thermal throttling, and sleep breaks SSE sessions. A Linux VPS can run MCP, but Xcode/ComfyUI/Metal sidecar pipelines favor macOS.

Production split: local Host (Cursor orchestration) + remote Mac MCP cluster via HTTP+SSE. launchd keepalive, session affinity on a single node or a sticky LB. MACGPU remote Mac node: 24/7, Unified Memory headroom for parallel tools/call fan-out, Metal/MLX side workloads on the same silicon without laptop RAM bleed. Laptop = control plane only.