2026 GEMINI
CLI_TRUST_
CRISIS_
MAC.
В июне 2025 Google выпустил Gemini CLI под Apache 2.0; за год — 100 000+ stars и 6 000+ merged PR. 19 мая 2026 на Google I/O: с 18 июня 2026 Free, Google AI Pro/Ultra и Individuals перестают обслуживаться через Gemini CLI — миграция в closed-source Antigravity (agy), free quota с ~1 000 до 20 вызовов/день (~98 %). На Mac это ~/.gemini/, Agent Skills и terminal pipeline упрётся в auth wall за дни. Вывод по throughput: open license ≠ open service; до cutoff — paid API Key, альтернативные CLI, Metal/MLX offload на remote Mac с измеримым tok/s и p95 latency.
1. Pain points: bait-and-switch в цифрах
(1) Community пишет код, vendor владеет runtime. Репо Apache 2.0, OAuth для физлиц умирает 18.06. Andrea Alberti (27-commit PR в день policy): «работаем бесплатно на код только для enterprise?» (2) «Техническая необходимость» в двух весах. Enterprise Standard/Enterprise держат Gemini CLI + Antigravity; individuals — только миграция. (3) Antigravity и throughput. Closed Go binary; Reddit: Pro лимит после 6–7 prompts — не тянет длинные agent loops. (4) Mac blast radius. Связка с полное руководство Cursor Agent Skills и OpenClaw → перепрошивка auth/base URL за две недели.
Для Metal/MLX: unified memory на M2 Pro 32 GB при Qwen3 32B 4-bit @ 32K уже даёт peak ~22 GB — добавьте Gemini CLI stream + ComfyUI и вы получите thermal throttle раньше, чем исчерпаете «1000 calls/day». Remote Mac M4 Max 128 GB с mlx_lm.server снимает decode с ноутбука; API Key оставляет open-source клиент живым после cutoff.
2. Timeline 2025–2026
| Дата | Событие | Метрика |
|---|---|---|
| 2025-06 | Apache 2.0 launch | open contributions |
| 2025-06 – 2026-05 | рост | 100k+ stars, 6k+ PR |
| 2026-05-19 | Google I/O | Antigravity; лимиты Gemini CLI |
| 2026-05-23 | backlash | Discussion #27274 |
| 2026-05-29 | Linux Foundation | isitopen.ai |
| 2026-06-18 | cutoff | OAuth path stop |
3. Кто теряет доступ
| Сегмент | После 18.06 | Mac escape |
|---|---|---|
| Google AI Free | ❌ blocked | свой key / Antigravity |
| Pro / Ultra | ❌ blocked | то же |
| Code Assist Individuals | ❌ blocked | IDE paths |
| Code Assist GitHub personal | ❌ then stop | enterprise exempt |
| Standard / Enterprise | ✅ unchanged | Gemini CLI OK |
| Paid Gemini / Enterprise API Key | ✅ Gemini CLI | primary escape |
Christine Hall (FOSS Force): «Google не менял лицензию — отключил инфраструктуру.» Fork без model API = hollow shell.
4. Gemini CLI vs Antigravity — decision matrix
| Измерение | Gemini CLI | Antigravity |
|---|---|---|
| License | Apache 2.0 | Closed |
| Free quota/day | ~1000 | ~20 |
| Multi-model / ACP | зрелая экосистема | gaps |
| Project memory | Markdown context | early missing |
| Enterprise parallel | да | coexist |
| Agent loop throughput | выше на free tier | упирается в 20/day |
5. Пятиступенчатый runbook Mac (до 18.06)
Шаг 1 — Auth inventory
gemini auth status, ~/.gemini/. OAuth умирает; API Key держит open client.
Шаг 2 — API Key + budget cap
AI Studio / Cloud paid key; логируйте $/1M tokens; сравните Claude Code и матрица OpenRouter Mac.
Шаг 3 — Backup configs
~/.gemini/, .agents/skills/, MCP — не удалять до валидации Antigravity import.
Шаг 4 — Parallel CLI
Claude Code, Codex, Cursor models; MLX/Ollama offline baseline — 5 канонических prompt, error rate <1%.
Шаг 5 — Infrastructure register
license, auth owner, quota owner, fork viability — isitopen.ai.
6. Throughput / cost table
| Path | Daily cap | $/M tokens | Metal/remote |
|---|---|---|---|
| OAuth (ends) | 1000→0 | subscription | — |
| Antigravity free | ~20 | $0 | binary only |
| Gemini API Key | your cap | metered | client stays |
| OpenRouter | dashboard | multi-vendor | CI routing |
| MLX local | RAM-bound | power | Metal decode |
| Remote Mac 128GB | queue depth | rental | batch tok/s |
7. Open source vs open service + Metal reality
Code on GitHub, brain on vendor API. Partners (Dynatrace, Elastic, Figma, Shopify, Stripe) тоже мигрируют. Hard numbers: 100k+ stars; 6k+ PRs; quota 1000→20; discussion comment 31 downvotes.
На Mac измеряйте: TTFT, decode tok/s, unified memory peak, thermal sustained watts. Bucket A (Qwen3 Coder 30B 4-bit local) — для diff preview; Bucket C (billion-class) — только API или remote 128 GB node.
8. Metal / MLX / remote routing matrix
| Workload | Local MLX | API | Remote Mac |
|---|---|---|---|
| Offline diff preview | ✅ 30B 4-bit | opt | ✅ 128GB |
| Heavy refactor agent | ⚠️ thermal | ✅ key | ✅ nightly batch |
| OpenClaw 7×24 | ❌ laptop | ✅ caps | ✅ launchd rack |
| ComfyUI + Agent parallel | ⚠️ UM pressure | — | ✅ split GPU queue |
9. Case study: dual-stack week
«Indie dev, MacBook Pro, 3 SaaS repos: D1 Cloud API key cap $15/day; D2 heavy refactors → Claude Code; D3 CI gemini → OpenRouter; D4 remote Mac Ollama+MLX offline diff, measured decode ~45 tok/s 4-bit; D6 Antigravity only GCP scaffold. 17.06 OAuth dead — 0h terminal downtime, ~9h migration; dropped Antigravity D5 (20 calls/day ceiling).»
Official migration ≠ only path. Sustainable: API Key + multi-CLI + remote Metal throughput.
10. Acceptance, FAQ, KPIs
Acceptance: auth type □ | API key before 6/18 □ | backup □ | alt CLI □ | wiki □ | daily cap □ | routing doc □ | MLX smoke □
Q: Enterprise license personal? Rare; API key usual.
Q: Repo deleted? Not announced; OAuth closes.
Q: Skills? Standard unchanged; swap call endpoint.
KPIs: ① 100k+ stars ② 6k+ PRs ③ 98% quota drop ④ ~9h migration ⑤ 20 Antigravity calls/day free.
11. Terminal OAuth → Mac compute node
Long agent sessions saturate unified memory before CLI branding. Windows/Linux mount API keys fine; Xcode, FCP, ComfyUI, launchd OpenClaw still favor macOS. Cursor on laptop; overnight regression + gateway on remote Mac — avoid «OAuth died + thermal throttle» same week.
Policy won't revert fast; dependency graph can. Community guards code; rented Apple Silicon guards 24/7 throughput. Stable remote Agent hosts: MACGPU nodes — light CLI local, heavy queue in rack.
12. Industry shift: stars ≠ sustainable throughput
Gemini CLI is a data point in a larger pattern: OSS repos as talent and marketing funnels, margin on closed runtime and enterprise contracts. For Mac teams, architecture reviews must separate four layers — source license, authentication, model hosting, quota policy. After 18.06., only layer 1 stays stable for individuals if you move layer 2 to paid API keys.
Compare to traditional devtools: formulas remain usable when SaaS ends. Here the client is swappable; billing identity is the lock-in. Measure resilience: can you switch auth owner within 24 h at same error rate and p95 latency? Mature target: <4 h documented failover drill before cutoff.
Partners (Dynatrace, Elastic, Figma, Shopify, Stripe) migrate in parallel — confirms multi-CLI / multi-provider is median architecture in 2026, not exception. Single-vendor OAuth-only paths are concentration risk entries in vendor registers.
13. Metal / MLX deep dive: measurable offload
On M2 Pro 32 GB, Qwen3 Coder 30B 4-bit @ 32K often peaks ~22 GB unified memory — add Gemini stream + ComfyUI and you throttle before quota matters. Split workloads: laptop runs Cursor orchestration; remote M4 Max 128 GB runs mlx_lm.server or macMLX with OpenAI-compatible /v1 over SSH tunnel.
Log every baseline: TTFT (p50/p95), decode tok/s, UM peak GB, sustained power. Acceptance gates for MLX path: pass@1 ≥90 % of current default on 30 canonical tasks; 24 h mixed load error rate <1 %; weekly cost within 110 % of prior API chain at comparable p95. Fail any gate → rollback routing documented in wiki.
API Key path keeps Apache client alive post-cutoff; MLX path reduces egress and variable $/M tokens for diff preview and lint-heavy loops. Use матрица OpenRouter Mac for second-hop provider failover when Gemini rate-limits — primary → fallback chain in Agent config, same pattern as OpenClaw 429 runbooks.
14. Extended FAQ & SRE checklist
Success metrics post-migration? (G1) CLI error rate <1 % / 24 h mixed load; (G2) weekly token spend ±10 % budget; (G3) auth failover drill <4 h. Install Antigravity? Sandbox only — 20 calls/day kills agent loop throughput. GitHub Actions? Rotate to GEMINI_API_KEY, remove OAuth secrets, dry-run before 18.06.
launchd / OpenClaw: night gateways on remote Mac with stable thermals, not sleeping MacBook. MLX vs API: MLX for offline diff and high-frequency short tasks; API for heavy reasoning. Report to leads: median tokens/successful agent run; % runs hitting daily cap (target 0); active CLI paths (target ≥2); days since last failover drill (target <30).
Controversy won't revert quickly; your dependency graph can. Community guards code; rented Apple Silicon guards 24/7 tok/s. MACGPU nodes for rack queue — laptop stays light CLI + Skills.
15. Antigravity traps & 17.06. OAuth expiry (field notes)
Three recurring failures in Discussion #27274: (A) delete ~/.gemini/ before validated import — lose Skills/MCP IDs. (B) Antigravity free for CI — 20 calls/day cannot finish one nightly agent retry loop. (C) Pro/Ultra users expect unlimited OAuth — consumer path ends; Cloud billing with API key continues.
| Risk | Early signal | Countermeasure |
|---|---|---|
| Auth wall 18.06. | 401 after UTC midnight | Test API key by 17.06. evening |
| Quota shock | <10 useful prompts on agy | Primary = key + OpenRouter |
| Memory gap | No Markdown project memory | Archive context in Skills |
| UM throttle | Fan spike + decode drop | MLX on remote 128 GB node |
Google sunset history (Reader,+, Stadia) is a prior for risk scoring, not a prediction. Combine isitopen.ai with internal vendor score: contract term, lock-in depth, failover path count. With полное руководство Cursor Agent Skills, decouple endpoints from secrets — agentskills.io stays vendor-neutral. Treat 18.06. as infra incident: change freeze window, failover drill one week prior, OAuth read-only eve, post-mortem 19.06. on token $, error %, MLX tok/s — then Antigravity as secondary only.
SRE note: log gemini auth status output daily until cutoff; alert on OAuth expiry T-7/T-1; keep mlx_lm.server health on remote node separate from cloud API monitors. Throughput target after migration: maintain ≥95 % of pre-cutoff successful agent runs/week with ≤110 % cost — if not, re-open routing matrix.
16. Primary sources & pre-cutoff validation matrix
Traceability: Google I/O 2026 policy post, GitHub Discussion #27274, Linux Foundation isitopen.ai (29 May 2026 summit), FOSS Force (Christine Hall infrastructure quote). Internal wiki links: полное руководство Cursor Agent Skills, матрица OpenRouter Mac, MACGPU remote node docs for 7×24 gateway load.
| Canonical prompt | Pass criteria | Log fields |
|---|---|---|
| Lint fix | clean build | tokens, p95 TTFT |
| Multi-file refactor | tests green | tok/s decode, UM GB |
| Test generation | coverage +2 % | error code |
| Docs sync | diff acceptable | $/run |
| Security scan | no false block | provider id |
Only CLI paths with <1 % error rate over 24 h mixed load graduate to «production» before 18.06.; others stay «experimental». Run the matrix on API key path, Claude Code, OpenRouter hop, and MLX local — four rows minimum in your routing spreadsheet. Post-cutoff: weekly review of tok/s on remote node vs $/M on cloud; rebalance when thermal logs show sustained >85 °C on laptop during parallel ComfyUI + Agent.
Small teams (≤5): paid Gemini key + daily cap, OpenRouter fallback, local MLX smoke, documented remote Mac for nightly — four wiki lines, monthly review. Larger teams add vendor scorecard (license, auth, quota, fork, subprocessors) and calendarize 18.06. drill like any infra change. Metal lesson: unified memory is the real quota — plan UM before cloud quota; offload batches to 128 GB remote when M2/M3 peaks exceed 80 % UM during agent+ComfyUI overlap.
17. Benchmark discipline after cutoff
Every Sunday until Q3 2026: rerun the five canonical prompts on API key path, record p50/p95 TTFT, decode tok/s, UM peak, and $/run. Compare against pre-cutoff baseline stored in git-tracked CSV. If cloud p95 TTFT regresses >15 % while MLX remote holds steady, shift batch queues to SSH-tunneled /v1 on MACGPU node. If API costs exceed 110 % budget two weeks running, promote OpenRouter fallback in openclaw.json and demote Antigravity to sandbox-only. Throughput engineering beats policy outrage — measure first, tweet later.
Target after migration: ≥95 % successful agent runs per week at ≤110 % pre-cutoff cost. Miss the target → reopen the Metal/MLX vs API matrix, re-run SSH tunnel health checks, and re-benchmark mlx_lm.server on the remote 128 GB node before the next sprint. Policy threads on GitHub will not restore OAuth; your routing spreadsheet will. Log fan RPM and decode tok/s alongside token bills — thermal throttling is the silent quota on Apple Silicon.