MCP 2026
HTTP_
AI_ERA.
До 2024 интеграция AI-tools — хаос: ChatGPT Plugins, OpenAI Function Calling, Claude Tool Use, IDE-плагины с несовместимыми контрактами. Смена модели = переписать адаптеры. Model Context Protocol (MCP) — open-source спецификация Anthropic (ноябрь 2024) на JSON-RPC 2.0: единый wire format между Host/Client и Server. 2026: OpenAI, Google, Microsoft в экосистеме; 10 000+ MCP Server; governance у AAIF (Linux Foundation). Тезис протокольного уровня: REST отвечает на can-invoke; MCP — на discover-select-invoke в agent loop. Разбор: N×M combinatorics — трёхслойный stack — wire-level diff vs REST — timeline — SSE/session affinity — security surface — 5-step deploy — throughput на Apple Silicon.
1. N×M combinatorics: почему интеграция взрывается
LLM без tools — inference-only box. N models × M backends = N×M adapter implementations. LangChain/CrewAI/OpenClaw не шарят tool schemas. Vendor lock-in: CRM-adapter под Claude не портируется на GPT без rewrite. Аналогия pre-USB-C: proprietary connectors everywhere. MCP target state: write-once server, N hosts consume via tools/list.
| Stack layer | Failure mode без MCP |
|---|---|
| CRM × 3 models | 3 parallel adapters, ~40 eng-days/year maintenance |
| IDE agent | Разные code paths: fs, DB, HTTP |
| Agent frameworks | Non-portable tool definitions |
2. Wire stack: Host / Client / Server
Transport primitives: STDIO (local subprocess, zero network hop) и HTTP+SSE (remote, bidirectional push). Payload: JSON-RPC 2.0. Discovery: tools/list. Data plane: tools/call, resources/read. Server-initiated messages — отличие от stateless REST.
| Transport | Latency profile | Scaling constraint |
|---|---|---|
| STDIO | <1 ms IPC, cold start ~300–500 ms | Single machine, process-per-server |
| HTTP+SSE | Network RTT + keepalive | Session affinity required for SSE stickiness |
3. MCP vs REST: protocol-level diff
| Property | REST | MCP |
|---|---|---|
| Discovery | Static OpenAPI, human-curated for LLM | Runtime tools/list |
| State | Stateless request/response | Stateful session context |
| Schema | External docs | JSON Schema in-band |
| Direction | Client → Server only | Bidirectional (Server push) |
REST = invoke gate. MCP = agent-compatible discovery + selection + invoke — это protocol layer для agent throughput, не CRUD wrapper.
4. Ecosystem timeline 2026
Network effect: +1 Server → all compatible Clients. 10 000+ servers (2026). Enterprise integration cost: −38–55 %. Startup barrier: −62 %. Custom integrators: demand drop ~43 % on bespoke adapter work.
5. Limits: OAuth, SSE affinity, attack surface
OAuth 2.0/2.1 on 2026 roadmap — not fully standardized yet. No global MCP registry (DNS-equivalent gap). SSE behind load balancer → sticky sessions or connection routing; stateless HTTP scale patterns don't apply 1:1. Security scan 2026: ~1 000 exposed unauthenticated MCP servers; indirect prompt injection via tool return path documented. A2A (Google): horizontal agent mesh; MCP: vertical model↔tool — complementary layers in agent internet stack.
6. Five-step deploy runbook (Mac)
Step 1 — Host + bundled servers
Cursor/Claude Desktop: enable filesystem, GitHub, Postgres. STDIO on Mac — zero network overhead.
Step 2 — Validate tools/list
Agent must hydrate tool registry at runtime, not from hardcoded prompt injection.
Step 3 — Custom MCP Server
TS/Python SDK. One server → Cursor + Claude + VS Code + Gemini.
Step 4 — Transport selection
Dev: STDIO. Prod 7×24: HTTP+SSE on remote node with launchd keepalive.
Step 5 — Security hardening
RBAC at server layer, exposure audit, OAuth 2.1 track. Cross-ref: OpenClaw MCP runbook, Cursor Agent Skills — MCP = protocol, Skill = procedure.
7. Throughput metrics (citable)
| Metric | Value |
|---|---|
| MCP Server count (2026) | 10 000+ |
| Integration cost reduction | 38–55 % |
| Startup entry barrier | −62 % |
| Exposed unauthorized servers | ~1 000 |
| Compatible hosts | Cursor, Claude, VS Code, Gemini |
8. Case: B2B SaaS adapter collapse
«B2B SaaS, 2025: три CRM-adapter (Claude, GPT, internal agent) — 120 eng-days/year. Q1 2026: single MCP Server (Postgres + REST), три Host configs. N×M=9 links → 1 server + 3 clients. Quarterly maintenance: 28 eng-days (−77 %). Model swap = config field change, tool layer zero diff.»
9. Local MCP vs remote Mac node: Unified Memory pressure
Cursor + N STDIO MCP servers + local MLX inference on 16–32 GB M-series → Unified Memory saturation, swap thrashing, GPU/ANE contention. STDIO isolates per-server process, but 7×24 on MacBook = thermal throttle + sleep breaking SSE sessions. Linux VPS runs MCP, but Xcode/ComfyUI/Metal sidecar pipelines favor macOS.
Production split: local Host (Cursor orchestration) + remote Mac MCP cluster via HTTP+SSE. launchd keepalive, session affinity on single node or sticky LB. MACGPU remote Mac node: 7×24, Unified Memory headroom for parallel tool/call fan-out, Metal/MLX side workloads on same silicon without laptop RAM bleed. Laptop = control plane only.