OpenClaw 2026.4.x Upgrades When Exec Is Fully Denied and Sandbox Defaults Flip: exec-approvals.json vs tools.exec, Container Cleanup, and the openclaw logs Ladder (Remote Mac Gateway)

// Pain: after upgrading to 2026.4.x, cron jobs and tools look "fine" yet produce no side effects, and the logs show exec denied / allowlist miss. The root causes stack: exec-approvals.json, tools.exec tiers, and sandbox defaults drift into a dual truth. Takeaway: a symptom matrix, a five-step recovery runbook, citeable thresholds, and a remote Mac Gateway section on aligning launchd with logs. Outline: pain | matrix | steps | thresholds | intersections | FAQ | analysis | close. Related guides: upgrade/auth v2, Task Brain, Docker WS, install audit, SSH/VNC, plans.

1. Pain split: when exec becomes “all denied”, the model is usually fine; the defaults moved

(1) Silent cron / sub-agent failures: releases in the 2026.4.x family tightened tool execution and sandbox defaults, so older configs with missing fields or permissive placeholders now fall through to deny. The UI can look healthy while jobs stop producing side effects.

(2) Dual truth, exec-approvals.json vs tools.exec in openclaw.json: treat them as two windows on the same gate, not as unrelated files. Changing only one is a classic source of “it fixed itself yesterday”.

(3) Sandbox registry drift: upgrades may recreate container names. If containers.json disagrees with what Docker actually has, you get recreate loops or stale profiles.

(4) Remote Mac Gateway: when the launchd environment differs from your SSH shell, openclaw logs may show denials that do not match the JSON the Gateway process actually loaded.

2. Symptom → hypothesis matrix (layer before you edit)

Signal / log keyword	Likely cause	First move
`exec denied` / `allowlist miss`	`tools.exec` tier + approvals + sandbox defaults intersect on deny	Read-only `openclaw doctor`; compare `tools.exec.security` / `ask` with `exec-approvals.json`
Cron “fires” but no effect	Command blocked inside sandbox or output dropped	Correlate `openclaw logs` with the Gateway unit; probe with `whoami` / `date`
Docker name conflict / cannot rm	Registry vs real containers	Follow backup-first cleanup for containers and `~/.openclaw/sandbox/containers.json` per release notes
CLI vs Gateway mismatch	Dual env sources / multiple config search paths	Use the Docker WS + token alignment checklist
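
The first row of the matrix can be checked mechanically before anything is edited. What follows is a minimal Python sketch, not an OpenClaw tool. It assumes that openclaw.json exposes tools.exec.security and tools.exec.ask as nested keys, and that exec-approvals.json carries comparable values under a top-level defaults object. The defaults key, the function name find_exec_drift, and the sample values are assumptions for illustration; confirm the schema for your release first.

```python
from typing import Any

# Assumed layout (verify against your release):
#   openclaw.json        -> tools.exec.security / tools.exec.ask
#   exec-approvals.json  -> defaults.security   / defaults.ask
FIELDS = ("security", "ask")


def find_exec_drift(config: dict[str, Any], approvals: dict[str, Any]) -> list[str]:
    """Return one line per field where the two files disagree or stay implicit."""
    tools_exec = config.get("tools", {}).get("exec", {})
    defaults = approvals.get("defaults", {})
    drift = []
    for field in FIELDS:
        in_config = tools_exec.get(field)    # None: left to an implicit default
        in_approvals = defaults.get(field)
        if in_config is None or in_approvals is None:
            drift.append(
                f"{field}: implicit (openclaw.json={in_config!r}, "
                f"exec-approvals.json={in_approvals!r})"
            )
        elif in_config != in_approvals:
            drift.append(
                f"{field}: openclaw.json={in_config!r} != "
                f"exec-approvals.json={in_approvals!r}"
            )
    return drift


if __name__ == "__main__":
    # Illustrative values only; load the real files with json.load() in practice.
    config = {"tools": {"exec": {"security": "allowlist"}}}
    approvals = {"defaults": {"security": "deny", "ask": "on-miss"}}
    for line in find_exec_drift(config, approvals):
        print(line)
```

An empty result means the two files agree on both fields; any line in the output is a reason to stop and reconcile before touching routing or cron.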

3. Five-step recovery runbook

(1) Snapshot: archive ~/.openclaw, the workspace, openclaw.json, exec-approvals.json, and the sandbox inventory.
(2) Freeze truth sources: list every OPENCLAW_* variable from launchd, Docker, and the shell; mark which set the running Gateway actually uses (openclaw gateway status plus the process env).
(3) Align tools.exec: set tools.exec.security and tools.exec.ask explicitly in openclaw.json to match policy, rather than relying on implicit defaults.
(4) Align exec-approvals.json: validate minimal profiles for the single-user and multi-agent paths; file a ticket for every change and include rollback text.
(5) Sandbox + logs gate: dry-run any destructive cleanup, then run openclaw channels probe followed by layered openclaw logs. Make no routing changes until the probe passes.

# Recommended order (remote Mac too)
# openclaw doctor
# openclaw config get tools.exec
# openclaw gateway status
# openclaw channels probe
# openclaw logs --follow   # reproduce a minimal exec probe in another window
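
The "freeze truth sources" step comes down to diffing OPENCLAW_* variables between the launchd job and the shell you are typing in. A minimal Python sketch under stated assumptions: the Gateway runs from a launchd job plist that sets variables through the standard EnvironmentVariables key, and you pass in the shell environment yourself (for example os.environ). The plist content, the label, and the helper names are hypothetical.

```python
import plistlib
from typing import Optional

# Hypothetical launchd job plist; in practice read the real file as bytes.
PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0"><dict>
<key>Label</key><string>example.gateway</string>
<key>EnvironmentVariables</key><dict>
<key>OPENCLAW_STATE_DIR</key><string>/Users/svc/.openclaw</string>
</dict>
</dict></plist>"""


def launchd_env(plist_bytes: bytes) -> dict[str, str]:
    """Read the EnvironmentVariables dict from a launchd job plist."""
    job = plistlib.loads(plist_bytes)
    return dict(job.get("EnvironmentVariables", {}))


def openclaw_vars(env: dict[str, str]) -> dict[str, str]:
    """Keep only the OPENCLAW_* variables."""
    return {k: v for k, v in env.items() if k.startswith("OPENCLAW_")}


def env_disagreements(
    unit_env: dict[str, str], shell_env: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map each differing OPENCLAW_* name to (unit value, shell value)."""
    unit, shell = openclaw_vars(unit_env), openclaw_vars(shell_env)
    return {
        name: (unit.get(name), shell.get(name))
        for name in sorted(unit.keys() | shell.keys())
        if unit.get(name) != shell.get(name)
    }


if __name__ == "__main__":
    shell = {"OPENCLAW_STATE_DIR": "/Users/me/.openclaw", "HOME": "/Users/me"}
    for name, (unit_value, shell_value) in env_disagreements(launchd_env(PLIST), shell).items():
        print(f"{name}: launchd={unit_value!r} shell={shell_value!r}")
```

Any line printed here means the edits you make from the SSH shell may land in a different state directory than the one the Gateway reads.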

4. Citeable thresholds

If more than one source can change the effective tools.exec tier (JSON + undocumented env + CI injectors), block production until converged to a single truth.
Within 24 hours of an upgrade, run both a minimal exec probe and a single cron tick; keep ≥3 log lines each or mark the rollout unverified.
On remote Mac hosts, if the Gateway unit disagrees with your SSH shell on OPENCLAW_STATE_DIR (or equivalent), fix supervision first—otherwise you edit approvals in the wrong directory.
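
The first two thresholds are simple enough to encode as a gate in a rollout script. This is a sketch only: the function name rollout_status, the (timestamp, text) evidence format, and the status strings are assumptions, and collecting the log lines is left to whatever tooling you already use.

```python
from datetime import datetime, timedelta

MIN_LINES = 3                  # at least 3 log lines per probe
WINDOW = timedelta(hours=24)   # evidence must fall within 24 hours of the upgrade


def rollout_status(
    upgraded_at: datetime,
    exec_probe_lines: list[tuple[datetime, str]],
    cron_tick_lines: list[tuple[datetime, str]],
    tier_sources: list[str],
) -> str:
    """Apply the thresholds: single truth source first, then log evidence."""
    if len(tier_sources) > 1:
        return "blocked: more than one source sets the tools.exec tier"
    deadline = upgraded_at + WINDOW

    def enough(lines: list[tuple[datetime, str]]) -> bool:
        in_window = [ts for ts, _ in lines if upgraded_at <= ts <= deadline]
        return len(in_window) >= MIN_LINES

    if enough(exec_probe_lines) and enough(cron_tick_lines):
        return "verified"
    return "unverified"


if __name__ == "__main__":
    t0 = datetime(2026, 4, 10, 9, 0)
    evidence = [(t0 + timedelta(minutes=i), "probe ok") for i in range(1, 4)]
    print(rollout_status(t0, evidence, evidence, ["openclaw.json"]))
```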

5. How this intersects upgrade, Task Brain, and Docker guides

Q: I already followed the upgrade / auth v2 checklist. Why is exec still denied? That guide focuses on directory moves and device auth; exec handling in 2026.4.x adds sandbox defaults and approvals validation on top, so start with the matrix in section 2.

Q: Do I still need to triage exec after Task Brain? Yes. The Task Brain rollout covers the control plane and skills policy; commands still traverse tools/exec and emit different log tokens, so triage the two separately.

Q: Must I rebuild Docker? Not always—start with Docker WS + token parity, then decide if the sandbox sidecar must be recreated.

6. FAQ: rollback, fleets, least privilege

Q: Can I globally set ask: off? The risk on a single-user homelab is not the risk in multi-tenant production. If you must relax it temporarily, time-box the change with an automatic revert, and do not commit it to the repo as a permanent setting.
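
One way to make "time-boxed with an automatic revert" concrete is to record the original value and an expiry next to the change, and let a scheduled job decide which value should currently be in effect. The sketch below is an assumption-heavy illustration: AskOverride, relax_ask, and desired_ask are hypothetical names, the record is meant to live outside openclaw.json, and writing the result back to tools.exec.ask is left to your own tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AskOverride:
    original: str         # value to restore once the window closes
    relaxed: str          # temporary value, e.g. "off"
    expires_at: datetime  # hard deadline for the relaxation


def relax_ask(current: str, relaxed: str, now: datetime, ttl: timedelta) -> AskOverride:
    """Record a temporary relaxation together with its rollback value."""
    return AskOverride(original=current, relaxed=relaxed, expires_at=now + ttl)


def desired_ask(override: AskOverride, now: datetime) -> str:
    """Return the value that should be in effect at `now`."""
    return override.relaxed if now < override.expires_at else override.original


if __name__ == "__main__":
    start = datetime(2026, 4, 10, 12, 0)
    override = relax_ask("always", "off", start, timedelta(hours=2))
    print(desired_ask(override, start + timedelta(hours=1)))  # inside the window
    print(desired_ask(override, start + timedelta(hours=3)))  # after expiry
```

Keeping the expiry in data rather than in someone's memory is the point: the revert happens even if the person who relaxed the setting has moved on.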

Q: Does sandbox cleanup lose state? It can drop local artifacts and warm caches. Back up the workspace and the registry JSON first, then sequence the cleanup as Gateway down → remove containers → prune registry rows to reduce races.
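
Before removing anything, it helps to compute what the cleanup would touch. This dry-run sketch compares the registry with the container names Docker reports, for example from docker ps -a --format '{{.Names}}'. The registry layout (an entries list whose items carry containerName) and the sbx- name prefix are assumptions for illustration; inspect your own containers.json before trusting the result.

```python
from typing import Any


def parse_docker_names(output: str) -> list[str]:
    """Split the one-name-per-line output of `docker ps -a --format '{{.Names}}'`."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def plan_cleanup(
    registry: dict[str, Any], docker_names: list[str], prefix: str
) -> dict[str, list[str]]:
    """Dry run: report drift between registry rows and real containers.

    Only containers whose name starts with `prefix` are considered, so that
    unrelated containers on the host are never proposed for removal.
    """
    registered = {entry["containerName"] for entry in registry.get("entries", [])}
    present = {name for name in docker_names if name.startswith(prefix)}
    return {
        "stale_registry_rows": sorted(registered - present),
        "unregistered_containers": sorted(present - registered),
    }


if __name__ == "__main__":
    # Hypothetical data; load containers.json with json.load() in practice.
    registry = {"entries": [{"containerName": "sbx-a"}, {"containerName": "sbx-b"}]}
    names = parse_docker_names("sbx-b\nsbx-c\npostgres\n")
    print(plan_cleanup(registry, names, prefix="sbx-"))
```

The function only reports; acting on the plan still follows the order in the answer above, with the Gateway stopped first.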

Q: Should I re-run the install audit? Yes: cross-check install.sh and the security audit so that widening exec does not accidentally widen listeners.

7. Analysis: “painless upgrades” must include exec preflight

Continuity for agents rests on sessions, memory, and executable tools. The industry bias in 2026 is safer defaults; exec and sandbox knobs will keep changing. If you only smoke-test channels and model routing, Monday morning complaints that “the assistant got dumb” are often silent tool denial, not reasoning regressions.

For always-on remote Mac Gateways, exec issues amplify with macOS updates and toolchain paths: missing binaries inside the sandbox look like random denials. A tiny exec health probe in launchd is an order of magnitude cheaper than an hour-long log archaeology session.
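
A health probe of this kind can start very small. The sketch below only checks that the probe binaries resolve on the PATH of whichever process runs it, for instance a launchd job. It does not exercise the sandbox or the approvals gate, so treat it as a first rung rather than a full exec probe. The function name and the binary list are illustrative assumptions.

```python
import os
import shutil
import sys
from typing import Optional


def missing_binaries(names: list[str], path: Optional[str] = None) -> list[str]:
    """Return the names that do not resolve to an executable on `path`.

    With path=None, shutil.which falls back to the current process PATH,
    which is what matters when this runs under launchd instead of a shell.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_binaries(["whoami", "date"])
    if missing:
        print("missing on PATH:", ", ".join(missing), file=sys.stderr)
    else:
        print("probe binaries resolve on PATH:", os.environ.get("PATH", ""))
```

Running the same script from the SSH shell and from the launchd job, then comparing the two outputs, separates "binary not found" from a genuine policy denial.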

Culturally, treat exec and approvals edits like schema migrations—reviewed, scripted rollback, two-person rule—otherwise you get the classic “one-line config change stopped every cron” incident whose hidden cost dwarfs renting a staging remote Mac.

8. Close: even after local recovery, isolate production Gateway

(1) Limits: exec couples to versioned defaults; dual truth and registry drift create long tails; mixing laptop and server homes makes path assumptions brittle.

(2) Why remote Mac helps: isolate staging vs dev; fixed topology and unified launchd units simplify probes and rollback.

(3) MACGPU fit: rent a remote Mac for upgrade rehearsal and exec probes instead of experimenting on production laptops.
