1. Pain Points: Subagents Are Not “More Chats”
- Identity: the container UID differs from the owner of host-edited files, so access to openclaw.json fails with EACCES. The Gateway may look healthy while spawns die.
- Tool contract: the minimal/messaging profiles omit the runtime/fs tool groups. A "Tool not found" error in Telegram is often policy, not the model.
- Thinking: scheduled agents with thinking enabled may yield empty strings to announce, which looks like "no reply."
- Fallback persistence: some builds write the fallback back to config, so the primary never returns without manual rollback and session cleanup.
2. Symptom Matrix
| Signal | First Suspect | Minimal Check |
|---|---|---|
| EACCES in spawn logs | File owner/permissions | UID inside container vs bind mount mode |
| Tools greyed / errors | Narrow tools.profile | Temporarily switch to coding/full—does it vanish? |
| Cron “OK” but channel silent | thinking or empty announce | thinking off for that job; inspect announce payload size |
| Primary “stuck” on fallback | Config rewrite | Diff openclaw.json vs last known good |
3. Five Steps: Reproduce, Then Harden
1) Freeze the path: same channel, same router, same model.
2) Log the effective identity: uid/gid and the real config path at spawn entry.
3) List tool groups explicitly, so that full does not hide bugs.
4) Split the config for scheduled jobs: thinking off, fixed announce template.
5) Cold-start on remote/Docker: restart the Gateway; the first spawn must pass.
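Step 2 (logging the effective identity and the real config path) can be sketched in Python. This is a minimal illustration, not part of OpenClaw: the function name `describe_access` and the default path `openclaw.json` are assumptions, and the calls to `os.getuid` and `os.getgid` assume a POSIX host such as Linux in Docker or macOS.

```python
import os
import stat
import sys


def describe_access(config_path: str) -> dict:
    """Report who the process runs as and whether it can use config_path."""
    real_path = os.path.realpath(config_path)  # resolve symlinks in the path
    info = {
        "uid": os.getuid(),
        "gid": os.getgid(),
        "config_path": real_path,
        "exists": os.path.exists(real_path),
    }
    if info["exists"]:
        st = os.stat(real_path)
        info.update(
            owner_uid=st.st_uid,
            owner_gid=st.st_gid,
            mode=stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
            readable=os.access(real_path, os.R_OK),
            writable=os.access(real_path, os.W_OK),
            owner_matches=(st.st_uid == os.getuid()),
        )
    return info


if __name__ == "__main__":
    # The default path is an assumption; pass the real location as an argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "openclaw.json"
    print(describe_access(path))
```

Running the same script on the host and inside the container makes a UID mismatch visible: `owner_matches` is false and `writable` is typically false in the environment that hits EACCES.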
4. Reference Checks
- Mount the config with ownership matching the process user, or chown it in the entrypoint. Avoid a file owned by root on the host while the container process runs as UID 1000.
- For unattended jobs set thinking to off and compare empty-output rate before/after.
- Keep a last known good openclaw.json copy before failover tests.
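Comparing the live config against the last known good copy can be done with a small key-by-key diff. The sketch below assumes both files parse as strict JSON; if your config uses comments or another superset of JSON, swap in a suitable parser. The helper names `flatten`, `config_drift`, and `drift_between_files` are illustrative and are not part of OpenClaw.

```python
import json

ABSENT = "<absent>"  # marker for a key that exists on only one side


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a.b.c": value}; lists stay as whole values."""
    if not isinstance(obj, dict):
        return {prefix: obj}
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def config_drift(known_good: dict, current: dict) -> dict:
    """Return {path: (known_good_value, current_value)} for every difference."""
    old, new = flatten(known_good), flatten(current)
    drift = {}
    for path in sorted(old.keys() | new.keys()):
        before, after = old.get(path, ABSENT), new.get(path, ABSENT)
        if before != after:
            drift[path] = (before, after)
    return drift


def drift_between_files(known_good_path: str, current_path: str) -> dict:
    """Load two JSON files and diff them."""
    with open(known_good_path) as good, open(current_path) as cur:
        return config_drift(json.load(good), json.load(cur))
```

An empty result after a failover test suggests the config was not rewritten. Any entry under a model or fallback key is the first place to look when the primary appears stuck.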
5. When to Move OpenClaw to a Remote Mac
| Signal | Move |
|---|---|
| Heavy graphics + multi-spawn on one laptop | Gateway remote; local thin client |
| Need stable public ingress 24/7 | Dedicated node + monitoring |
| Team shares Gateway and skills | Separate machine for quota/audit |
6. FAQ
Q: sessions_spawn vs MCP? They are not opposed. Watch the total tool schema tokens and the reload order.

Q: Works locally, fails remote? Usually the cause is path, permission, or env, not the model.

Q: Leave profile on full? Only for diagnosis; ship the minimal viable tools.
When debugging, log three things: spawn exit code and latency; the tool list delta between the interactive session and the child; and the announce payload length. If the length stays zero while the spawn succeeds, suspect thinking or templates. If you need more than three manual spawn or permission fixes per week, freeze the environment (entrypoint, launchd, or a dedicated node).
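The logging advice above maps onto a small triage routine. The sketch below is illustrative only: `SpawnRecord` and its field names are hypothetical and do not reflect any OpenClaw log schema. You would fill them in from whatever your own logs capture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpawnRecord:
    """One logged spawn. Field names are hypothetical, not an OpenClaw schema."""
    exit_code: int
    latency_s: float
    parent_tools: List[str]  # tools visible in the interactive session
    child_tools: List[str]   # tools visible to the spawned child
    announce_len: int        # length of the announce payload


def triage(rec: SpawnRecord) -> List[str]:
    """Map logged signals to the first suspect named in the symptom matrix."""
    findings = []
    if rec.exit_code != 0:
        findings.append("spawn failed: check identity, permissions, config path")
    missing = sorted(set(rec.parent_tools) - set(rec.child_tools))
    if missing:
        findings.append("child lacks tools: " + ", ".join(missing))
    if rec.exit_code == 0 and rec.announce_len == 0:
        findings.append("empty announce after successful spawn: suspect thinking or templates")
    return findings
```

A record with a zero exit code, matching tool lists, and a non-empty announce returns an empty list, which is the baseline to aim for after a cold start.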
7. Analysis: Subagents Are Ops, Not Prompts
Spawn-heavy setups fail like microservices with config drift. Remote Mac nodes add fixed UIDs, fixed paths, and a restart policy, which is the same reason teams isolate CI.
Laptops have three problems: sleep breaks gateways, paths differ per user, and graphics and inference contend for memory. Remote Apple Silicon Macs fit the gateway plus subagent workload, and hourly rental is enough to validate the topology. For heavy Gateway/spawn use, MACGPU remote Mac rental beats endless chown and prompt tweaks.