1. Pain Decomposition: Copying the Repo ≠ Copying Runtime
(1) State directory vs workspace: teams often keep skills, prompts, and a checked-in openclaw.json inside a repo, while sessions, channel bindings, and Gateway caches live under the user profile; migrating only the git repo restores CLI commands but yields ghost sessions or empty channel state on the destination. (2) Scattered secrets and OAuth refresh tokens: Slack, Discord, and enterprise IM stacks may store tokens both in plaintext config and in encrypted runtime stores; backing up only one layer forces re-authorization, and if you do not stop the old Gateway first, two endpoints can race for the same bot session. (3) launchd vs interactive shell: launchd does not inherit exports from .zshrc; if API keys are migrated only in a terminal profile while the plist stays empty, the result is a “manual foreground works, scheduled daemon misses variables” failure. (4) Remote Mac user boundaries: the account used for Screen Sharing or SSH must match the account running Gateway; mixing root with a login user splits file ownership and Keychain access and multiplies debugging time.
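The ownership split described in point (4) is easy to miss by eye. A minimal sketch of a check, assuming the state tree is a plain directory such as ~/.openclaw; the function name `foreign_owned` is hypothetical and not part of any OpenClaw tooling:

```python
import os
from pathlib import Path


def foreign_owned(root: Path, expected_uid: int) -> list[Path]:
    """Return every path under root whose owner is not expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # lstat() inspects the link itself, so a dangling symlink does not raise
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched
```

Run as the account that owns the Gateway, `foreign_owned(Path.home() / ".openclaw", os.getuid())` should return an empty list; any entry points at a file written by root or by another login user.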
2. Matrix: What Belongs in the Cold Migration Bundle (2026)
| Object | Typical path / meaning | Include in bundle? |
|---|---|---|
| User state tree | ~/.openclaw (exact subtree names may shift by release; follow upstream docs) | Yes: channel bindings, caches, local secret material |
| Workspace / project | Skills, scripts, openclaw.json if versioned | Yes: must share the same snapshot timestamp as state to avoid schema skew |
| launchd job | ~/Library/LaunchAgents/ plist for Gateway | Yes: captures runtime user, working directory, env key names |
| Docker volumes | Named volumes or bind mounts in compose | Conditional: snapshot only when aligned with your Docker production layout |
| macOS Keychain entries | Some channels stash refresh tokens in Keychain | Careful: prefer vendor re-auth flows over ad-hoc exports that break compliance |
3. Five Steps: Freeze → Pack → Verify → Cold Start → Re-Pair
Step 1 — Freeze writers: stop Gateway and related launchd jobs (or the compose stack) on the old machine so no background session keeps mutating SQLite files or locks; packing while processes write yields random corruption after restore. Document the exact stop order in your internal wiki: some teams also pause cron-style hooks or CI webhooks that trigger agent runs, because those jobs can reopen state files moments after you thought everything was quiet. If you use multiple compose projects, stop them in dependency order (database sidecars first only when your runbook says so) and capture docker ps output as evidence of a clean floor.
Step 2 — Pair snapshots: capture ~/.openclaw and the workspace at the same instant, including lockfiles; for Docker, export the volume or image digest alongside so all three timestamps align. When your workspace lives on a case-sensitive volume or a symlink farm, resolve real paths before archiving so the destination does not accidentally recreate two trees that differ only by capitalization. If you rely on pnpm or npm caches for deterministic builds, note whether those caches are part of the migration bundle or will be rebuilt — either choice is valid, but mixing “old state + freshly installed deps” without recording the decision invites heisenbugs during channel smoke tests.
Step 3 — Verify and scrub: checksum the archive before unpacking; search configs for old hostnames, absolute paths, and LAN-only URLs that must become reachable from the new or remote Mac. Replace hard-coded references to USB serial devices, local model weights, or Samba mounts with the new equivalents, and re-run a quick grep for IP literals — residential DHCP changes are a frequent reason webhooks silently point into the void. If you migrate from a laptop to a colocated Mac, revisit MTU, VPN split tunnels, and DNS search domains; agents rarely care until a provider’s API suddenly resolves differently.
Step 4 — Cold start Gateway: boot with a minimal surface (loopback admin port first), run openclaw doctor or the project ladder, and confirm log directories, permissions, and listeners match the plist contract. Keep the first channel disabled until doctor passes; premature OAuth redirects while half the env vars are missing create confusing partial state in vendor consoles. Capture baseline CPU and memory after cold start so you can compare against the old host and catch accidental debug flags left enabled.
Step 5 — Channel re-pairing: follow each vendor’s OAuth or webhook registration — most platforms forbid two active bots with the same credentials; retire the old instance completely before bringing the new one online to avoid duplicate deliveries and signature conflicts. Where possible, rotate app secrets during migration even if not strictly required: it gives you a crisp audit trail and prevents forgotten staging tokens from calling production URLs. After re-pairing, run scripted send/receive probes per channel instead of ad-hoc manual tests so on-call can replay the same script after the next upgrade.
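Steps 2 and 3 (pack, then checksum before unpacking) can be sketched with the Python standard library. This is an illustration under assumptions, not an OpenClaw feature: the function names are hypothetical, and each source is stored under its final directory name, so two sources sharing a name would collide.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pack(sources: list[Path], archive: Path) -> str:
    """Write all sources into one gzip tarball and return its SHA-256."""
    with tarfile.open(archive, "w:gz") as tar:
        for src in sources:
            real = src.resolve()  # resolve symlinked roots to their real path
            tar.add(real, arcname=real.name)
    return sha256_of(archive)


def verify(archive: Path, expected: str) -> bool:
    """Compare the archive's current digest with the one recorded at pack time."""
    return sha256_of(archive) == expected
```

Record the digest returned by `pack` next to the archive on the old host, then call `verify` on the destination before unpacking anything. Packing state and workspace in one call keeps them in a single archive, which supports the paired-snapshot rule, provided the writers were already stopped in Step 1.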
NUMBERS / THRESHOLDS (CITEABLE)
① Snapshot trio: ~/.openclaw + workspace + plist (or compose file + volume manifest). Missing one is an incomplete migration.
② Environment: launchd plist exports must explicitly match what you rely on in shells — at minimum API keys, proxy, and timezone-related vars.
③ Channel cutover: leave a cooling window between stopping the old bot and enabling the new one; when a vendor documents no propagation time, assume the change rolls out gradually over minutes, not seconds.
④ Remote Mac: SSH user equals Gateway user; do not mix UIDs between interactive Screen Sharing and unattended launchd.
⑤ Rollback: keep a read-only archive of the old host for at least one release cycle before deleting writable copies.
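Rule ① can be turned into a small pre-flight check. A minimal sketch, assuming you label bundle parts with the hypothetical names used below (they are illustrative, not an OpenClaw convention):

```python
def missing_bundle_parts(included: set[str]) -> list[str]:
    """List what the bundle still lacks under the snapshot-trio rule."""
    # State tree and workspace are always required
    missing = [part for part in ("state", "workspace") if part not in included]
    # The third leg is either the launchd plist or the Docker pair
    has_launchd = "plist" in included
    has_docker = {"compose_file", "volume_manifest"} <= included
    if not (has_launchd or has_docker):
        missing.append("plist (or compose_file + volume_manifest)")
    return missing
```

An empty result means the trio is complete; anything else names the part to capture before the old host is touched.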
4. Parameters: Why “Process Up” Still Means Silent Channels
The most misleading failure mode after migration is a healthy process list with passing probes, yet inbound messages never arrive. First, check bind addresses: moving from 127.0.0.1 to a LAN IP without updating the reverse proxy or firewall leaves the Gateway talking to itself while the public URL still points at a stale IP. Second, webhook URLs: cloud consoles must reference the new tunnel hostname or static egress — forgetting to update Slack or Feishu callbacks is a silent drop. Third, clock and TLS: large NTP skew on a colocated Mac can break signed requests; corporate TLS inspection with replaced certificates also fails quietly at the TLS layer. Fourth, tools.profile: if the old host whitelisted absolute binary paths, the new PATH breaks tool calls without obvious errors — walk the sessions_spawn runbook and widen or rewrite allowlists deliberately.
When you still cannot explain the silence, collect four artifacts before opening a vendor ticket: a sanitized (secrets-redacted) copy of the plist, the exact public URL vendors call, a tcpdump or proxy trace showing whether TLS handshakes complete, and the Gateway log slice covering one full inbound event ID. That bundle usually separates “we never received HTTP” from “we received but rejected signature,” which determines whether you fix networking or reissue tokens. Keep a short internal glossary that maps each channel’s dashboard field names to your config keys; onboarding documents love synonyms, and migrations amplify the confusion.
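The bind-address and stale-IP checks in this section can be partly automated by scanning config text for IPv4 literals and classifying them. A minimal sketch using only the standard library; the function name is hypothetical, and the regex will also match four-part version strings, so treat hits as candidates for review, not verdicts:

```python
import ipaddress
import re

# Candidate IPv4 literals; invalid ones (e.g. 999.1.1.1) are filtered below
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def classify_ip_literals(text: str) -> dict[str, str]:
    """Map each valid IPv4 literal in text to wildcard/loopback/private/public."""
    found = {}
    for literal in IPV4.findall(text):
        try:
            addr = ipaddress.ip_address(literal)
        except ValueError:
            continue  # not a real address
        if addr.is_unspecified:
            kind = "wildcard"   # 0.0.0.0, binds every interface
        elif addr.is_loopback:
            kind = "loopback"   # reachable only from the same host
        elif addr.is_private:
            kind = "private"    # LAN-only, likely stale after a move
        else:
            kind = "public"
        found[literal] = kind
    return found
```

A loopback entry behind a public webhook URL, or a private address carried over from the old LAN, is the kind of result worth checking against the reverse proxy and firewall before blaming the channel vendor.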
5. Decision Table: Stay on the Laptop vs Remote Mac Gateway
| Axis | Local laptop | Remote dedicated Mac |
|---|---|---|
| Availability | Lid close, sleep, travel | Cleaner 24/7 boundary for channel workloads |
| Permissions | Easy GUI debugging | Requires headless discipline — avoid blocking prompts |
| Networking | Residential IP churn | Stable egress or tunnel domains help webhooks |
| Cost | Hidden depreciation | Transparent rental fits SLA-minded teams |
6. FAQ: The Questions We Hear Every Migration
Q: Can I sync only the workspace? Fine for pure CLI experiments, but strongly discouraged whenever channels or session continuity matter; you will burn more hours than the tarball size saves.
Q: Docker to bare metal? Treat it as stack change plus migration; pick one target shape and document it, or you double the risk surface.
Q: Plaintext API keys inside plist? They work but violate hygiene; prefer locked-down env files referenced by plist or a managed secret backend.
Q: Do I need Screen Sharing on a remote Mac? Use SSH for routine work; open VNC only for first-time OAuth or GUI-only tools, and log the change ticket.
Q: Wipe the old laptop immediately? Run parallel observation for a full business cycle before destructive wipes.
Q: Should I upgrade OpenClaw during the same change window? Prefer like-for-like version first, validate channels, then schedule the upgrade as a second ticket with its own rollback path; combining migration and major upgrades turns every incident into a two-dimensional mystery.
Q: How do I handle MDM-locked Macs? Coordinate with device management so plist installs, helper tools, and network extensions remain signed and allowed; otherwise Gateway may start yet lack network entitlements.
Q: What about multi-region teams? Document which legal environment owns tokens and logs; copying state across regions without review can violate data residency expectations even when technically possible.
7. Deep Dive: Migration Is Moving State and Identity Together
OpenClaw-style agents in 2026 are less about “does it install” and more about which processes read and write state, and how external identities map to internal sessions. Moving only code without state is like relocating an office but leaving the safe behind: the doors open, but workflows cannot continue. Repository-level openclaw.json usually encodes intent and tool policy, while the profile directory stores credentials already negotiated with third parties; both must move in the same logical transaction, or you end up with a Gateway where doctor reports green while every channel stays mute.
The split between launchd and interactive shells is the second classic trap. Developers export vendor API keys in .zshrc while forgetting plist EnvironmentVariables, so the daemonized Gateway never sees them and logs only vague 401s. The opposite failure — stuffing dozens of variables into plist without fixing ownership on log/state paths — lets the process start yet prevents log rotation until disks fill. Keep a single cross-check table covering plist, shell profile, and CI scripts for every sensitive variable name.
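One way to start that cross-check table is to diff the variable names exported in a shell profile against the `EnvironmentVariables` dictionary of the launchd plist. A minimal sketch with the standard `plistlib` module; it only understands simple `export NAME=value` lines, and the function name is hypothetical:

```python
import plistlib
import re

# Matches uncommented "export NAME=" at the start of a line
EXPORT = re.compile(r"^[ \t]*export[ \t]+([A-Za-z_][A-Za-z0-9_]*)=", re.MULTILINE)


def missing_from_plist(shell_profile: str, plist_bytes: bytes) -> list[str]:
    """Return names exported in the shell profile but absent from the plist."""
    exported = set(EXPORT.findall(shell_profile))
    plist = plistlib.loads(plist_bytes)
    # EnvironmentVariables is the launchd.plist key holding the job's environment
    declared = set(plist.get("EnvironmentVariables", {}))
    return sorted(exported - declared)
```

Feed it the text of .zshrc and the bytes of the Gateway plist; every name it returns is a variable the foreground shell has and the daemon does not. It compares names only, never values, so the output is safe to paste into a ticket.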
Channel re-pairing is not busywork; it enforces single-endpoint assumptions. Dual-active bots duplicate deliveries, scramble ordering, and may trip vendor abuse systems. A serious runbook lists old instance retirement (process, listener, DNS, cloud console) before new instance promotion with a gray window. Enterprise WeChat, Feishu, Slack, and Discord each impose different cooldown and IP allowlist semantics — one template rarely fits all tenants.
On a rented remote Mac, separate interactive GUI sessions from unattended services: the former suit occasional OAuth popups; the latter should be restartable, auditable, and aligned with Apple Silicon headless best practices. MACGPU-style nodes exist so Gateway-class workloads stay on a predictable topology while laptops return to creative work instead of acting as accidental servers fighting thermal throttling and lid-close disconnects — the same lesson as picking npm vs Docker: define truth first, then wire channels and SLAs.
Operational maturity shows up in how you rehearse failure: schedule a quarterly restore drill that unpacks last quarter’s tarball onto a scratch Mac, runs doctor, and replays webhook probes — not because you expect disaster weekly, but because migration runbooks rot faster than application code. When incidents do happen, the teams that kept snapshots paired and documented stop orders recover in hours; those that “just copied the repo” spend days diffing invisible state. Treat channel silence after migration as a networking and identity problem first, a model problem second: misconfigured egress masquerades as “AI stuck thinking” more often than vendors admit.
Finally, align observability: ship structured logs off the Mac early (or at minimum rotate and ship archives), attach the OpenClaw version and git SHA of your skills repo to every alert, and tag incidents with “post-migration” until you clear a calendar-based burn-in window. These habits cost almost nothing during calm weeks but pay rent during the first night a remote Mac reboots after a security patch. The migration story is therefore not only about files — it is about restoring trust in a system whose external contracts (tokens, URLs, certificates) must line up exactly once on the new metal.
8. Close: Fast Moves Still Need Full Steps
(1) Limits: state layout can change across major versions — record the OpenClaw semver beside every archive; compliance may forbid cloning Keychain wholesale, so budget explicit re-auth windows. (2) Why remote Mac wins: fixed user, fixed egress, fixed hardware class reduces long-tail incidents from “laptop as server.” (3) MACGPU: if you want to rehearse Gateway hosting on a rented Apple-silicon node before locking topology, start from the homepage plans and help entry points. Treat every migration as a rehearsal for the next one: tighter runbooks compound over time.