2026 OpenClaw MCP & Skills Gateway Token Runbook

OpenClaw chat works, but MCP servers and custom Skills misbehave: huge tool schemas, stale snapshots after restart, or binaries missing under launchd. This runbook covers Skills directory precedence, Gateway reload checks, token budgeting for tools, and daemon PATH pitfalls, and closes with a remote Mac hosting appendix. See also: Gateway onboarding, Docker production, error troubleshooting.

Developer toolchain

1. Pain points: MCP and Skills are not two extra files

(1) Opaque load order: if precedence among workspace /skills, user directories, and bundled Skills is unclear, you will edit disk yet the agent still reads an older copy.
(2) Gateway memory snapshots: some builds cache skill metadata in RAM; restarting the process incompletely or relying on hot reload can leave conversations behaving as if nothing changed.
(3) MCP schema volume: wiring several MCP servers at once can inject enormous tool descriptions before the user prompt arrives, crowding 16k–32k windows and increasing refusal or hallucination risk.
(4) Daemon environment drift: launchd jobs inherit a minimal PATH, so nvm, fnm, or volta shims that work in an interactive shell are invisible to the Gateway. On a remote Mac running 24/7, that gap is harder to spot because you are not constantly opening a terminal beside the service.

2. Skills directory precedence

Upstream packaging may rename paths between releases. The table below encodes the usual intuition: higher rows should override lower rows. Always reconcile against the documentation shipped with your exact build and print the resolved search path from logs when debugging.

Layer (high to low) | Typical use | What to verify
Workspace /skills or project-local skills | Team workflows, repo-specific automation | Gateway cwd matches the workspace you think is mounted
User-level skills directory | Personal experiments and shortcuts | YAML frontmatter indentation errors silently skip the entire file in many parsers
Bundled defaults | Baseline capabilities from the distribution | Read release notes after upgrades; defaults can change behavior without touching your files
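
The precedence chain can be probed directly from a shell. A minimal sketch, assuming skills live at `<dir>/<name>/SKILL.md` (the layout and directory names are placeholders; confirm the real search path from your Gateway logs):

```shell
# resolve_skill NAME DIR...
# Return the first match for a skill name across the precedence chain.
# Callers pass directories highest-precedence first, mirroring the table.
resolve_skill() {
  name="$1"
  shift
  for dir in "$@"; do
    if [ -f "$dir/$name/SKILL.md" ]; then
      echo "$dir/$name/SKILL.md"   # first hit wins; lower layers are shadowed
      return 0
    fi
  done
  echo "not found" >&2
  return 1
}
```

Because the first hit wins, a stale lower-layer copy is never reported while a workspace copy exists, which is exactly the "edit disk, agent reads older copy" failure mode.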

3. MCP onboarding: smallest closed loop and token budget

Connect exactly one MCP server first and validate health checks, authentication, and a minimal tool call end to end. Only then add the next server. After each addition, estimate how many tokens the serialized tool list consumes relative to your model window—either by inspecting debug logs or by diffing prompt templates. When system plus tools approaches the ceiling, prefer narrowing the tool surface (per-channel profiles), deferring tools to sub-agent sessions, or moving to a larger-context model. Stacking servers without a budget is how teams end up “debugging model quality” when the real issue is context starvation.
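
A crude budget check can be scripted before blaming the model. This is a sketch: the 4-characters-per-token divisor is a rough heuristic, not a tokenizer, and the serialized tool-list file is whatever your debug logs or prompt dump produce, not an OpenClaw-defined artifact:

```shell
# estimate_tool_tokens FILE [WINDOW]
# Rough token estimate for a serialized tool list relative to a context
# window; WINDOW defaults to a 16k-class model.
estimate_tool_tokens() {
  file="$1"
  window="${2:-16000}"
  chars=$(wc -c < "$file")
  tokens=$((chars / 4))              # heuristic: ~4 chars per token
  pct=$((tokens * 100 / window))
  echo "${tokens} tokens (~${pct}% of ${window}-token window)"
}
```

Run it after each server addition; a connector that jumps the percentage by double digits is a candidate for a per-channel profile or a sub-agent session.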

4. Five-step verification checklist

Step 1: Capture the Gateway log line that states the workspace root and the directories scanned for Skills; a mismatch here invalidates every local edit you make.
Step 2: Change a visible string inside a single SKILL file, restart the Gateway fully, and confirm the agent references the new text; this isolates cache issues from authoring mistakes.
Step 3: Disable all MCP servers except one, record baseline tool-token usage, then re-enable servers one by one to see which connector spikes the budget.
Step 4: Use launchctl print or equivalent to dump the environment for the supervised job; compare PATH and auth-related variables against your interactive shell.
Step 5: Force an OAuth or API token to expire in staging and observe whether failures are loud or silent; silent failures in production look like “tool not found” and waste hours.
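
The PATH comparison in Step 4 can be partly automated. A portable sketch that reports interactive-shell PATH entries missing from the daemon's PATH; feed it the PATH value you dumped with launchctl print (the function itself only compares the two strings):

```shell
# diff_path INTERACTIVE_PATH DAEMON_PATH
# Print every entry in the interactive PATH that the daemon cannot see;
# nvm/fnm/volta shim directories typically show up here.
diff_path() {
  interactive="$1"
  daemon="$2"
  echo "$interactive" | tr ':' '\n' | while read -r dir; do
    case ":$daemon:" in
      *":$dir:"*) ;;                           # visible to the daemon too
      *) echo "missing from daemon: $dir" ;;
    esac
  done
}
```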

# Example health probe; adjust host/port to your install
curl -sS http://127.0.0.1:18789/health 2>/dev/null || echo "adjust-endpoint"

5. Symptom matrix after Gateway restarts

Symptom | Likely cause | Action
SKILL edits appear ignored | A higher-precedence directory still wins | Print the skill search path; grep for a unique marker across trees
Tool counts fluctuate across restarts | MCP handshake failures are swallowed | Raise the log level; enable one MCP server at a time
Random “tool missing” errors | Expired OAuth or refresh failing under the daemon | Re-authenticate; ensure refresh jobs run in the same environment as the Gateway
Context fills with no obvious user text | Stacked large schemas | Prune MCP servers, split sessions, or swap models
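
The "grep for a unique marker" tactic works well as a small helper. A sketch with placeholder trees; pass the real directories your Gateway log reports:

```shell
# find_skill_copies MARKER DIR...
# List every file under the given skill trees containing the marker string,
# revealing duplicate copies across precedence layers.
find_skill_copies() {
  marker="$1"
  shift
  grep -rl "$marker" "$@" 2>/dev/null
}
```

If this prints more than one path, the precedence table in section 2 tells you which copy the Gateway is actually reading.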

Reference figures (operations-oriented, not vendor marketing):

  • On 16k-class models, unfiltered MCP tool descriptions commonly exceed 10k tokens in aggregate—plan architecture before blaming the base model.
  • For always-on hosts, pin absolute interpreter paths inside the plist instead of relying on login shells.
  • Each new public channel (IM bot, webhook, email bridge) should trigger a fresh review of which tools that audience can invoke.
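
The pinned-path rule can be enforced with a trivial guard in deploy scripts. A sketch; extract the program string from your plist's ProgramArguments however you prefer (for example, PlistBuddy on macOS), then:

```shell
# check_program_path PROGRAM
# Flag supervised-job programs that rely on PATH lookup instead of an
# absolute interpreter path pinned in the plist.
check_program_path() {
  case "$1" in
    /*) echo "ok: absolute path" ;;
    *)  echo "warn: '$1' depends on the daemon PATH" ;;
  esac
}
```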

6. Deep dive: tool governance is production operations

MCP turns one-off integrations into composable platforms, but it also expands both context usage and authorization surface area. Unlike traditional microservices, the LLM may reschedule tools every turn; if policy filters lag behind schema registration, you pay twice—once in tokens, once in risk. macOS on Apple Silicon is an excellent quiet host for Gateways, yet unified memory still reflects concurrent channels and long sessions; this is not the same load profile as a single chat tab because tool definitions sit on the system side of the prompt for extended periods.

Keeping high-churn development on a laptop while placing production Gateway plus MCP on a dedicated machine is a deliberate blast-radius trade. The laptop remains ideal for fast iteration and local debugging, but it couples personal applications, sleep policies, and variable uplink with services your teammates rely on.

When stable uplink, predictable PATH, and uninterrupted supervision become the bottleneck, colocating OpenClaw on MACGPU remote Mac nodes preserves the macOS and Metal toolchain you already use while isolating process hygiene from daily-driver noise. Usage-based billing supports canary traffic before you commit to fixed capacity. A laptop-first setup can remain valid for experiments; for graphics-heavy or always-on AI workflows, a rented remote Mac typically yields fewer moving parts and clearer SLAs—so the move is a technical placement decision, not marketing hype.