gstack

Commit Graph

Author	SHA1	Message	Date
Garry Tan	a861c00cfa	v1.58.3.0 feat: gbrowser anti-detection Layer C stealth (#2047 ) * feat: Layer C stealth — chrome., Notification, per-install hardware, toString Proxy (gbrowser T1+T3+D6) Three additions stacked into the existing applyStealth() init script to close the visible automation tells that today push GBrowser users into Google's /sorry/index captcha and similar: T1 — Strip Playwright's automation default args: --enable-automation (kills "Chrome is being controlled" infobar) --disable-popup-blocking, --disable-component-update, --disable-default-apps (Patchright's list — each is a documented tell) Now centralized in STEALTH_IGNORE_DEFAULT_ARGS export, used by BOTH launchHeaded() and handoff() (the headless → headed re-launch path). D6 — Drop "GStackBrowser" UA branding suffix: Real Chrome's UA ends `Safari/537.36`, not `Safari/537.36 GStackBrowser`. The branded suffix was a high-entropy classifier for any vendor that grep'd UA for known automation/test-browser strings. Branding still lives in the wrapper .app name + Dock icon + tray — does not need to leak via the UA string for the product to be "GBrowser." Resolves the "looks like Chrome but identifies as GStackBrowser" contradiction codex review #18 flagged. T3 — Layer C init-script additions in stealth.ts: 1. Function.prototype.toString Proxy (must run first). Wraps every patched getter / function in a WeakSet so they report `function NAME() { [native code] }` at every recursion depth, defeating the depth-3+ integrity check (fn.toString.toString.toString().includes('[native code]')). 2. window.chrome.runtime / chrome.app / chrome.csi / chrome.loadTimes restoration with full enum shape (OnInstalledReason, PlatformArch, PlatformOs, etc.) + method bodies. Real Chrome ships these; their absence is universally checked. Vendor research (gbrowser plan deep-dive on Cloudflare + DataDome) confirmed both vendors probe this shape directly. 3. Notification.permission aligned to 'default'. The existing inline addInitScript already spoofs permissions.query({name:'notifications'}) to return 'prompt' — Notification.permission being 'denied' while Permissions returns 'prompt' is a cross-source inconsistency that detectors flag specifically. 4. Per-install hardware values via GSTACK_HW_CONCURRENCY / GSTACK_DEVICE_MEMORY env vars (set by gbd's host_profile.go from system_profiler + sysctl). Reporting real host values within the Chrome shape avoids the cross-user GBrowser fingerprint cluster that hardcoded defaults would create. Codex review #10 flagged hardcoding as creating contradictions across Apple Silicon / Intel / UA-CH architecture. 5. Selenium 25-global cleanup + PhantomJS + NightmareJS + Watir + Playwright (__pwInitScripts, __playwright__binding__) static-name deletion. The inline block continues to handle the dynamic cdc_/__webdriver/__selenium/__driver prefixes. D7 (codex correction) kept: still do NOT fake navigator.plugins or navigator.languages. Synthesizing those triggers MORE consistency flags from modern fingerprinters than letting Chromium surface them natively. Test coverage: - 15 new tests in stealth-layer-c.test.ts covering: launch-flag exports, script structure, toString-Proxy installs first, every spoof present, hardware values interpolated from input (not hardcoded), Selenium global cleanup spot-check, no GStackBrowser leak in stealth payload, backwards-compat exports preserved. - All 8 existing stealth-webdriver tests still pass. - All 2 existing browser-manager-unit tests still pass. For GBrowser specifically: this is the gstack-side half of Phase 1 / T1 + T3 + D6 in the anti-detection plan. The gbrowser repo's submodule pointer bump will land alongside this. feat: buildGStackLaunchArgs — Pack 1 cmdline-switch construction for gbrowser New stealth.ts export that turns the GSTACK_* env vars (already populated by gbrowser's gbd from host_profile.go) into the --gstack-* cmdline switches the Pack 1 Chromium patches read at WebGL getParameter, NavigatorUA::userAgentData, NavigatorConcurrentHardware::hardwareConcurrency, and NavigatorDeviceMemory::deviceMemory time. Wired into all three launchArgs sites: launch() (headless), launchHeaded() (real product path), and handoff() (headless → headed re-launch). Mapping: GSTACK_GPU_VENDOR → --gstack-gpu-vendor GSTACK_GPU_RENDERER → --gstack-gpu-renderer GSTACK_PLATFORM → --gstack-ua-platform (with mapping: MacARM/MacIntel → macOS, Win32 → Windows, Linux x86_64 → Linux) GSTACK_GPU_CHIPSET → --gstack-ua-model GSTACK_HW_CONCURRENCY → --gstack-hw-concurrency GSTACK_DEVICE_MEMORY → --gstack-device-memory Each switch is emitted only when its env var is non-empty — empty values fall through to the patch's "no override" path, which returns the real Chromium native value. Safe to ship on Chromium builds without the Pack 1 patches applied (zero behavior change). The patches themselves live in the gbrowser repo at chromium/patches/ {webgl-vendor-spoof,ua-client-hints-stealth,worker-navigator-stealth}.patch. Both halves (gstack arg construction + gbrowser C++ patches) must land + Chromium rebuild before the spoof reaches the WebGL/UA-CH/ hardware accessors. Currently dormant until then. Tests (browse/test/stealth-layer-c.test.ts): 7 new buildGStackLaunchArgs cases — empty env, all-populated, partial, platform mapping (MacARM/MacIntel/Win32/Linux), unrecognized platform fallthrough, vendor-with-spaces escape-safety. All 32 stealth/browser-manager tests pass. For GBrowser specifically: gstack-side half of the Pack 1 flag plumbing. gbrowser repo will bump the submodule pointer to this commit, then re-run bun run test/anti-bot/evidence-run.ts to verify creepjs's "33% headless" score drops after Pack 1 + Chromium rebuild. * feat: buildGStackLaunchArgs adds --gstack-suppress-prepare-stack-trace Pack 2 / B11 flag plumbing for the new error-preparestacktrace-stealth.patch in gbrowser/chromium/patches/. Always emit --gstack-suppress-prepare-stack-trace unless the caller explicitly sets GSTACK_CDP_STEALTH=off in the environment. Off by default in patch behavior (no-op without the C++ patch), so this is safe on stock Playwright Chromium too. Closes the Cloudflare canary trick where a page sets Error.prepareStackTrace and watches for it to fire during CDP serialization of a logged Error object. Tests: All 33 stealth/browser-manager tests pass. New cases: - GSTACK_CDP_STEALTH=off disables suppression - empty env still emits the always-on flag (count=1) - all-populated env now emits 7 flags (was 6) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): enable Chromium sandbox on headed launchPersistentContext Mirrors v1.40.0.1 from main lineage (PR #1617). Cherry-picked onto gbrowser-anti-detection so the GBrowser submodule can consume the fix without waiting for main to merge. Playwright auto-adds --no-sandbox whenever chromiumSandbox !== true (playwright-core/lib/server/chromium/chromium.js:291-292). The headless chromium.launch() site set the option; the two headed sites (launchHeaded() and handoff()) did not. Every headed launch on macOS and Linux showed Chromium's yellow "unsupported command-line flag: --no-sandbox" infobar. shouldEnableChromiumSandbox() centralizes the Win32 / CI / CONTAINER / root heuristic that previously lived only in the headless path's explicit --no-sandbox push at :225. All three launch sites now use the helper, and six unit tests pin the policy across darwin, linux, win32, CI, CONTAINER, and root. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * v1.40.0.2 fix(browse): Cmd+Q on managed Chromium stops triggering supervisor respawn Three browser.on('disconnected') handlers in browse/src/browser-manager.ts (launch, launchHeaded, handoff) each exited with a non-zero code on every disconnect, regardless of cause. Process supervisors that consume our exit code (gbrowser's gbd HealthMonitor in cmd/gbd/health.go) treated user Cmd+Q identical to a Chromium crash and respawned with exponential backoff, so the visible browser kept reappearing after the user closed it. Add resolveDisconnectCause(browser) that reads the underlying ChildProcess exitCode + signalCode (waiting up to 1s for the exit event if the disconnected event fired first). Exit code 0 + no signal = clean user quit; anything else = crash, signal-kill, or OOM. Wire the resolver into all three disconnect handlers: - launch() (headless): clean → exit 0, crash → exit 1 (was always 1) - launchHeaded() (headed): clean → exit 0, crash → exit 2 (was always 2) onDisconnect() cleanup callback still runs in both cases. - handoff() (re-launch): same as launch() via the helper. Preserve the per-path crash codes (1 vs 2) so any supervisor that differentiated headed vs headless crashes keeps working. Seven new unit tests in browse-manager-unit.test.ts cover the resolver across already-exited, signal-killed (SIGSEGV / SIGKILL), async exits, and null-browser inputs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): apply stealth on every launch path + share automation-artifact cleanup handoff() built cmdline args but never called applyStealth, so a handed-off browser had no JS stealth (no webdriver mask, no chrome.* shape, no toString proxy). And the cdc_/Permissions cleanup shim lived inline in launchHeaded() only, so headless launch() reported Notification.permission='default' without the matching permissions.query='prompt' answer — the exact cross-source inconsistency the shim exists to prevent. Move the cleanup into AUTOMATION_ARTIFACT_CLEANUP_SCRIPT inside applyStealth so all three launch paths (launch, launchHeaded, handoff) get identical stealth, and call applyStealth(newContext) in handoff() before restoreState() navigates. A static tripwire in browser-manager-unit.test.ts fails CI if any launch path drops the applyStealth call again. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(browse): make --gstack-suppress-prepare-stack-trace opt-in, not default-on buildGStackLaunchArgs() pushed the flag unless GSTACK_CDP_STEALTH=off, i.e. on-by-default — contradicting its own comment ("off by default, only for gbrowser builds"). The switch is read by a C++ patch that only exists in gbrowser; on stock Playwright Chromium it is an unknown switch. Flip to opt-in: emit only when GSTACK_CDP_STEALTH is on/1/true. gbd opts in by exporting GSTACK_CDP_STEALTH=on; stock installs leave it unset so the flag never reaches a Chromium that wouldn't understand it. Comment now matches code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(browse): correct stale stealth comments The file-level stealth.ts docstring claimed "we DON'T fake navigator.plugins" while the same file now ships EXTENDED_STEALTH_SCRIPT, which does fake plugins when GSTACK_STEALTH=extended. Clarify that Layer C (the always-on default) doesn't fake plugins and the opt-in extended mode does, as the documented "actively lies, may break sites" escape hatch. Also fix the launch()/launchHeaded() comments that said "mask navigator.webdriver only" — applyStealth (Layer C) also restores window.chrome., aligns Notification.permission, and sets per-install hardware. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> test(browse): runtime + extended-mode coverage for the stealth blend The stealth tests were all static string-shape assertions; nothing executed the script in a real page. Add real-Chromium runtime checks via applyStealth + page.evaluate: - Layer C runtime: window.chrome.* rich shape, Notification.permission='default' paired with permissions.query notifications='prompt' (guards the shim now running on every path), and patched getters reporting [native code]. - Per-install hardware: navigator.hardwareConcurrency/deviceMemory reflect the GSTACK_* env profile. - Extended-mode blend: navigator.plugins is faked when GSTACK_STEALTH=extended, Layer C still wins window.chrome.runtime, and navigator.webdriver stays false (own-prop getter survives extended's prototype delete). - Persistent-context (launchHeaded/handoff) parity now uses a page created AFTER applyStealth — the old test checked pages()[0], which predates the init script, so webdriver was false only via the launch arg, not Layer C. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(browse): handoff() + launchHeaded() spread the shared STEALTH_LAUNCH_ARGS handoff() built its launch args from only ['--hide-crash-restore-bubble', ...buildGStackLaunchArgs()], omitting STEALTH_LAUNCH_ARGS — so a handed-off browser kept the --disable-blink-features=AutomationControlled tell that launch() and launchHeaded() strip. launchHeaded() also hardcoded the flag as a literal. Both now spread the shared constant, so the AutomationControlled flag lives in one place across all three launch paths. Tripwires: STEALTH_LAUNCH_ARGS spread into >= 3 sites (no inline literal) and STEALTH_IGNORE_DEFAULT_ARGS wired into both persistent-context paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(browse): drop dead HostProfile.platform, export test internals HostProfile.platform was set by readHostProfile but never read by buildStealthScript — the platform spoof is owned by the UA-CH cmdline switch in buildGStackLaunchArgs (which reads GSTACK_PLATFORM directly). Remove the dead field. Export readHostProfile and AUTOMATION_ARTIFACT_CLEANUP_SCRIPT so their clamp/shape invariants can be unit-tested. Correct the stale "25 Selenium globals" count comment and note the extended cdc_ scan is redundant-but-retained for standalone use. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(browse): cover readHostProfile clamp, toString depth-3, chrome.* calls Pre-landing review coverage gaps: - readHostProfile clamps 0/negative/NaN/missing env to 8 (a deviceMemory=0 or NaN would be a glaring bot tell) — now asserted. - toString proxy survives the depth-3 recursion trick (fn.toString.toString.toString().includes('[native code]')), the headline claim that was only tested at depth-1. - chrome.csi() and chrome.loadTimes() are invoked (not just typeof-checked) and runtime.connect() throws the native-shaped "No matching signature" error. - AUTOMATION_ARTIFACT_CLEANUP_SCRIPT static shape (cdc_/__webdriver strip + notifications->prompt) as a hermetic backup for the live-Chromium pairing test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(browse): recreateContext() re-applies stealth (closes 4th un-stealth path) useragent and viewport --scale route through recreateContext(), which rebuilds the BrowserContext via newContext() — a fresh context with no init scripts. It never called applyStealth, so a routine useragent/viewport-scale command silently dropped webdriver masking, window.chrome.* shape, hardware spoof, and the cdc/Permissions cleanup on every restored page. Caught by the cross-model adversarial review (Codex) after the Claude pass and eng review missed it. Both the main and fallback paths now call applyStealth before any page is created. The launch-path tripwire is raised to >= 4 sites and now asserts the recreateContext() body specifically, so the regression class can't recur. Also documents the load-bearing trust assumption on buildGStackLaunchArgs / readHostProfile (GSTACK_* must be gbd-sourced, never page/remote data — the injection-safety argument depends on it) and the notifications-permission spoof tradeoff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v1.58.3.0) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs: sync browser stealth docs to Layer C (v1.58.3.0) BROWSER.md "Stealth scope" still described the default as navigator.webdriver masking only; Layer C is now the always-on default across all four context-creation paths. Update the stealth-scope prose, the "What GStack Browser means" blurb (stock-Chrome UA, no GStackBrowser suffix, captchas can still get through at the CDP layer), the stealth.ts source-map line, and the env-vars table (GSTACK_STEALTH, GSTACK_CDP_STEALTH, GSTACK_GPU_, GSTACK_PLATFORM, GSTACK_HW_CONCURRENCY/GSTACK_DEVICE_MEMORY + the explicit --gstack- switches and ignoreDefaultArgs stripping). Correct the stale "narrows to navigator.webdriver masking only" premise on the open CDP-patch TODO (the TODO itself stays open — the CDP-protocol layer is still unaddressed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-18 10:45:05 -07:00
Garry Tan	66f3a180d3	v1.43.2.0 fix wave: post-Daegu paper-cut — 18 fixes, 28 bisect commits (#1642 ) * fix(gbrain-sync): --full produces an empty code index on first run of a new repo `gbrain reindex-code` only RE-EMBEDS pages that already exist; it never walks the filesystem. On a freshly-registered source (0 pages), a --full run that called reindex-code alone found nothing ("No code pages to reindex"), finished in ~1s, and left the code index permanently empty while still reporting OK. Fix: --full now runs `sync --strategy code` FIRST to create pages via the file walk, then runs `reindex-code` to honor the documented "full walk + reindex" contract for both fresh and populated sources. Contributed by @jetsetterfl via #1584. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(gbrain-local-status): classifier falsely reports broken-db inside repos with their own DATABASE_URL The freshClassify probe ran `gbrain sources list --json` with the inherited process env. When the probe ran from inside a repo with its own .env (an app DATABASE_URL on a different port), Bun autoloaded the project's .env, gbrain connected to the wrong database, and the classifier reported broken-db on otherwise-healthy brains. Fix: route the probe env through `buildGbrainEnv` from lib/gbrain-exec, the same helper the sync orchestrator uses. DATABASE_URL is seeded from ~/.gbrain/config.json so the result is cwd-independent. The 60s cache can no longer propagate a poisoned negative to clean directories. Contributed by @jetsetterfl via #1583. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(retro): stale-base + bad-today-anchor pre-flight guard (#1624) /retro silently produced confidently-wrong output when "today" drifted (model session-context error) or when origin/<default> was materially behind the actual remote — git log --since returned zero or near-zero commits and the narrative was fabricated from nothing. Adds Step 0.5 with four ordered pre-check branches before any window analysis: A. No 'origin' remote → skip with "base freshness not verified" note B. Detached HEAD → skip with "base freshness not verified" note C. `git fetch origin <default>` fails (offline) → warn, proceed against last-known origin/<default> D. Fetch succeeded → compare today vs latest origin/<default> commit; if gap > window-days, BLOCK with explicit citation of latest-commit date. Skip paths still proceed to Step 1, but the disclosure is carried into the retro narrative ("offline run, window not freshness-verified") so the output is never silently confidently-wrong. Atomic .tmpl + gen:skill-docs regen commit (T-Codex-3 pattern). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(retro): regression for #1624 stale-base pre-flight guard 13 static-invariant tests pinning the four ordered pre-check branches in retro/SKILL.md.tmpl:Step 0.5: A. no-remote skip — must check origin presence + set verdict B. detached-HEAD skip — must gate behind prior verdict (ordering) C. fetch-fail warn — must match `if !` or `\|\|` shape, gate by verdict D. stale-base BLOCK — must read latest-commit ISO date, cite remediation Plus a disclosure-survives-to-narrative invariant: skip-path verdicts must be named in prose so the retro output carries the cited reason rather than silently misreporting. Failing build if Step 0.5 is removed, branches re-ordered (no-remote no longer wins), or the BLOCK message stops citing today/latest-commit/remediation path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(gbrain-sync): configurable timeouts + resume from gbrain checkpoint (#1611) The memory and code stages hardcoded a 35-min spawn timeout. On brains with ~2000+ staged files, /sync-gbrain --full reliably SIGTERM'd the child at exactly 35 minutes with exit 143. gbrain left ~/.gbrain/import-checkpoint.json pointing at the staging dir, but gstack-memory-ingest's SIGTERM handler unconditionally cleaned the dir up — so the next run found a checkpoint pointing at nothing and restaged from scratch, repeating the SIGTERM forever. Three changes: 1. Configurable timeouts via env (bounds 60_000ms - 86_400_000ms, default 2_100_000ms = 35min unchanged): GSTACK_SYNC_MEMORY_TIMEOUT_MS GSTACK_SYNC_CODE_TIMEOUT_MS Out-of-range or non-numeric values warn and fall back to the default. 2. SIGTERM in gstack-memory-ingest no longer always cleans up the staging dir. If gbrain has written ~/.gbrain/import-checkpoint.json pointing at the active staging dir, the dir is PRESERVED for next-run resume. Otherwise (no checkpoint pointing here, crash before gbrain ever touched it) it's cleaned up as before. 3. Next /sync-gbrain run detects gbrain's checkpoint via decideResume() in gstack-gbrain-sync.ts: - no checkpoint → fresh ingest pass - checkpoint + staging ok → set GSTACK_INGEST_RESUME_DIR; child reuses staging dir and skips writeStaged; gbrain import resumes from processedIndex+1 - checkpoint + staging gone → warn "previous checkpoint stale (staging dir gone), restaging from scratch" and proceed Reuses gbrain's own checkpoint as the source of truth (D1 — no double-store state). Detect-then-fallback semantics per C1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(gbrain-sync): regression for #1611 timeouts + resume 19 tests across three surfaces: - resolveStageTimeoutMs (10 tests): undefined/empty → default; non-numeric, zero, negative, below-floor, above-ceiling → warn + default; at-floor, at-ceiling, valid mid-range → accepted as-is. - decideResume (6 tests): no checkpoint, corrupt JSON, checkpoint + staging ok, checkpoint + staging missing, checkpoint with no dir, checkpoint with empty dir. - SIGTERM staging preservation (3 static invariants): memory-ingest signal handler must check stagingDirIsCheckpointed BEFORE cleanup; preserve branch must come before cleanup branch (ordering); orchestrator must pass GSTACK_INGEST_RESUME_DIR to the grandchild on resume. Also threads process.env.HOME through readGbrainCheckpoint and stagingDirIsCheckpointed so tests can redirect home. os.homedir() caches at process start and ignores later mutation, so the env override is the only reliable test injection point. Failing build if the timeout bounds are removed, the resume detection short-circuits incorrectly, or the SIGTERM handler regresses to unconditional cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(review): pre-emit verification gate kills Django-shape FP class (#1539) External user filed 4/8 false positives on a /review run against a Django + DRF + PostgreSQL repo (Sprint 2.5). Every FP class was the same shape: "resolvable in <5 minutes by viewing the actual code or running a simple grep" — fields that don't exist on the model, dict.get()-might-be-None on a form that returns {}-initialized cleaned_data, standard ORM save behavior called out as data loss. Extends the Confidence Calibration resolver (consumed by review, cso, plan-eng-review, ship) with a Pre-emit verification gate: Every finding MUST quote the specific code line that motivates it (file:line + verbatim text). If the reviewer cannot produce the quote, the finding is unverified — its confidence is forced to 4-5 so the existing "Suppress from main report" rule fires automatically. The finding still goes to the appendix for calibration audit, but the user does not see it in the critical-pass output. Reuses the existing suppression mechanism — no new code path. The FP classes the gate kills are enumerated in the resolver text so reviewers see the named patterns. Framework-meta nudge included for Django Meta, Rails associations, SQLAlchemy relationships, TypeORM decorators, Sequelize init, Prisma generated client — the reviewer must quote the meta-construct that generates the symbol, not just grep for the literal name. Deeper framework-aware ORM verification (model introspection, migration-history- aware checks) is deliberately deferred to a future wave per T-Codex-2. Atomic .tmpl-equivalent (resolver) edit + gen:skill-docs regen commit per T-Codex-3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(review): regression for #1539 pre-emit verification gate 12 tests pinning the gate behavior: - Resolver emits the gate header + #1539 reference - Gate requires quoting file:line + verbatim text - Unverified findings forced to confidence 4-5 (auto-suppress via existing <7-rule, no new mechanism) - Framework-meta nudge names Django, Rails, SQLAlchemy, TypeORM, Sequelize, Prisma - Deferred design doc reference present (1539-framework-aware-review.md) - Four named FP classes from #1539 enumerated: * field doesn't exist on model * dict.get() might be None * save() might lose fields * update_fields might miss X - All four downstream SKILL.md consumers (review, cso, plan-eng-review, ship) carry the gate text after gen:skill-docs - Existing confidence 9-10 'Show normally' + 3-4 'Suppress' rows unchanged (regression on existing behavior) Failing build if the gate is removed, the suppression mechanism is re-invented separately, the framework-meta nudge drops a framework, or gen:skill-docs stops propagating the gate to consumers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(config): expose explain_level default * fix(benchmark): parse positional prompt after flags * fix(artifacts): reject malformed remote paths * fix(learnings): preserve current entries in cross-project search * fix(setup): register root gstack slash alias * fix(memory): probe gitleaks without shell builtin * fix(gbrain-lib): pin LC_ALL=C in varname validator (macOS locale guard) In many macOS shells the default locale (e.g. en_US.UTF-8) makes bash glob brackets like `[A-Z]` match lowercase letters too, so the existing `case "$name" in [A-Z_][A-Z0-9_])` branch lets names like `lower-case` through validation. The function then trips `printf -v "$varname"` and `export "$varname"` with `not a valid identifier` errors that surface mid-prompt, which is exactly what the validator was supposed to prevent. Pinning `LC_ALL=C` inside the function gives ASCII-only bracket semantics on both macOS and Linux, matching the documented `[A-Z_][A-Z0-9_]` contract. Declared `local` so it doesn't leak to the calling shell — `gstack-gbrain-lib.sh` is documented as a sourced helper, so a bare assignment would mutate the caller's locale for the rest of the process (silently affecting downstream `sort`, `tr`, locale-aware globs in the same shell, etc.). The existing regression test `test/gbrain-lib-verify.test.ts:'rejects invalid var names'` already covers the macOS repro shape (passes `lower-case` and expects the validator to reject + emit `invalid var name`). On Linux CI the test silently passed because `LC_ALL=C` is the typical default; on macOS dev boxes it fails. Verified: - `bun test test/gbrain-lib-verify.test.ts`: 22 pass, 0 fail (on macOS). - `_gstack_gbrain_validate_varname lower-case; echo $?` → 2. - `_gstack_gbrain_validate_varname FOO_BAR; echo $?` → 0. - Caller's LC_ALL preserved across calls (confirmed via sourced bash). * fix(land-and-deploy): detect merged PR after gh failure After `gh pr merge` exits non-zero, the PR may already be MERGED server-side (concurrent merge landed, or local cleanup phase failed AFTER the merge succeeded). Calling `gh pr merge` a second time then errors with a confusing "already merged" — and worse, the deploy workflow never runs because we stopped on the first failure. Adds a Post-failure PR-state check (§4a-postfail) that runs after ANY non-zero exit from `gh pr merge`: - state == MERGED → record MERGE_PATH=direct, OFFER (don't force) stale-worktree cleanup on the base branch with uncommitted-work guard, proceed to §4a CI watch - state == OPEN → check autoMergeRequest; if non-null treat as merge-queue wait; if null surface both errors and STOP - state == CLOSED → STOP Hard invariant: never retry `gh pr merge` after a non-zero exit. Server state is authoritative. Re-authored from PR #1620 into land-and-deploy/SKILL.md.tmpl (the source of truth) instead of the generated SKILL.md, so the next gen:skill-docs run preserves the change. Original diff by @davidfoy via #1620. Related: cli/cli#3442, cli/cli#13380. Contributed by @davidfoy via #1620. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: detect PgBouncer transaction-mode pooler and set GBRAIN_PREPARE=true (#1435) When gbrain connects through a PgBouncer transaction-mode pooler (port 6543), it auto-disables prepared statements. This breaks `gbrain search` silently — the /sync-gbrain capability check fails and the GBrain Search Guidance block never gets written to CLAUDE.md. Three-layer fix: 1. lib/gbrain-exec.ts — `buildGbrainEnv()` now detects port 6543 in the effective DATABASE_URL and sets `GBRAIN_PREPARE=true` in the env passed to every gbrain spawn. This is the single chokepoint — all gstack gbrain invocations inherit the fix. Caller can opt out with `GBRAIN_PREPARE=false`. 2. sync-gbrain/SKILL.md{,.tmpl} — capability check now exports `GBRAIN_PREPARE=true` explicitly and retries search up to 3x with 1s delay for async index propagation under connection pooling. 3. bin/gstack-gbrain-detect — surfaces `gbrain_pooler_mode` field ("transaction" \| "session" \| null) in the preamble probe JSON so /setup-gbrain and /sync-gbrain can advise users about pooler state. Closes #1435 Built with [ClosedLoop.AI](https://closedloop.ai) \| [GitHub](https://github.com/closedloop-ai/claude-plugins) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(supabase-provision): rewrite transaction/6543 -> session/5432 for new projects - Single-object pooler API responses default to transaction-mode at 6543, but the shared pooler tenant on new projects only listens on session/5432 - Add a `pool_mode == transaction && db_port == 6543` rewrite + stderr note - Escape hatch via `GSTACK_SUPABASE_TRUST_API_PORT=1` for forward-compat - 5 new tests covering rewrite, no-op shapes, env opt-out, array path Fixes #1301. * fix(browse): GSTACK_CHROMIUM_NO_SANDBOX opt-out for Ubuntu/AppArmor (#1562) Ubuntu/AppArmor configurations often block unprivileged Chromium sandboxing for headless agent sessions even for normal users — /qa hangs without --no-sandbox. The kernel policy denies the unprivileged user namespaces Chromium needs. Adds GSTACK_CHROMIUM_NO_SANDBOX=1 as an explicit user override that forces the sandbox off without changing the default for everyone else. Re-authored from PR #1562 onto v1.42.2.0's shouldEnableChromiumSandbox() helper — purely additive, preserves the headed-launch sandbox-on-by-default behavior that v1.42.2.0 shipped to kill the --no-sandbox yellow infobar. Three new regression tests cover: - linux + override=1 → false (the named use case) - darwin + override=1 → false (env wins on any platform) - override=0 → does NOT trigger (must be exactly "1") Original diff by @techcenter68 via #1562. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(browse): mirror isCustomChromium() guard in headless launch() When BROWSE_EXTENSIONS_DIR is set alongside GSTACK_CHROMIUM_PATH pointing at a baked-extension build (GBrowser / GStack Browser), the headless launch() path was unconditionally adding --disable-extensions-except / --load-extension. This causes the same ServiceWorkerState::SetWorkerId DCHECK crash that launchHeaded() already guards against via isCustomChromium(). Mirror the existing guard: skip --load-extension flags when isCustomChromium() returns true; always push the off-screen window geometry args. * fix(browse): daemonize macOS/Linux server via setsid() `Bun.spawn().unref()` only releases the child from Bun's event loop — it does NOT call setsid(). The spawned bun server inherits the spawning shell's process session. When the CLI runs inside a session-managed shell that exits shortly after the CLI returns (Claude Code's per-command Bash sandbox, Conductor, OpenClaw, CI step runners), the session leader's exit sends SIGHUP to every PID in the session — killing the bun server and its Chromium grandchildren within seconds of a successful `connect`. Setting `BROWSE_PARENT_PID=0` (already done by the `connect` command and pair-agent) disables the parent-process watchdog but does NOT save the server here: SIGHUP from session teardown still reaps it. Replace the macOS/Linux `Bun.spawn().unref()` with Node's `child_process.spawn({ detached: true })`, which calls setsid() and gives the server its own session leader role (PPID=1, STAT=Ss). This mirrors the Windows path's rationale (PR #191 by @fqueiro) — same root cause, different OS surface. Verified on macOS in Conductor: pre-fix the server dies ~10–15s after connect across separate Bash invocations; post-fix the same PID stays alive (PPID=1, SESS=0, STAT=Ss) and responds to `status`/`goto`/ `snapshot` across many separate shell calls. The `proc?.stderr` startup-error branch is removed since both platforms now spawn with `stdio: 'ignore'`; both fall through to the on-disk `browse-startup-error.log` written by `server.ts`'s start().catch. * fix(design): bump image-gen timeout to 240s + pin gpt-image-2 The design binary calls /v1/responses (gpt-4o + image_generation tool, quality:high, 1536x1024) but aborted the request after a hardcoded 120s. That class of request consistently takes ~140-160s end-to-end, so every generate/variants/evolve/iterate call aborted before the image returned. In /design-shotgun this cascades: Step 3c launches N parallel agents, each calling `$D generate`, each aborts at 120s and retries, all fail, the comparison board never opens — the skill appears to hang indefinitely. Reproduced the exact API call with a longer budget: HTTP 200, valid image, 143.5s. A real /design-shotgun run after the patch generated 3 variants in parallel at 150.0s / 161.0s / 152.1s, all exit 0 — note the 161s case, which a naive 150s bump would still have failed. - Bump AbortController timeout 120_000 -> 240_000 in generate.ts, variants.ts, evolve.ts, iterate.ts (both call sites) - Pin the image_generation tool to model "gpt-image-2" design/test/variants-retry-after.test.ts: 5 pass, 0 fail. The feedback-roundtrip.test.ts failures are a pre-existing browse-module breakage (session.clearLoadedHtml undefined), unrelated to this change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test: fill coverage gaps for PRs #1606, #1612, #1620 Three cherry-picked PRs in this wave landed without unit-test coverage for the specific invariant they protect: #1606 (@andrey-esipov) — LC_ALL=C pin in _gstack_gbrain_validate_varname 8 tests by sourcing bin/gstack-gbrain-lib.sh and calling the validator directly. Asserts uppercase/digit/underscore accepted, lowercase REJECTED (the macOS-locale regression case), mixed-case rejected, LC_ALL=C scoping is local (doesn't leak to caller). #1612 (@bharat2913) — setsid daemonize via Node child_process.spawn 4 static-invariant tests on browse/src/cli.ts. The actual setsid syscall is hard to assert without a real spawn, so we pin the source shape: nodeSpawn imported from child_process; non-Windows branch uses nodeSpawn(...) with detached:true and .unref(); comment documents setsid/SIGHUP root cause; Bun.spawn() is NOT used on macOS/Linux. #1620 (@davidfoy, re-authored into .tmpl per A3) — §4a-postfail 12 static invariants on land-and-deploy/SKILL.md.tmpl + generated SKILL.md. Pins all three state branches (MERGED/OPEN/CLOSED), the authoritative state query, the merge-SHA capture, non-destructive worktree cleanup with uncommitted-work guard, autoMergeRequest probe on OPEN, hard "never retry gh pr merge" rule, and atomic regen propagation. Failing build if any of the three invariants regresses. Note: gbrain-lib-validate-varname.test.ts also surfaces a pre-existing glob-pattern overpermissiveness (hyphens + dots accepted) — not in #1606's scope; documented inline as a separate cleanup target. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(learnings): align injection-prevention tests with PR #1619 tagged-line shape PR #1619 (preserve current entries in cross-project search) refactored gstack-learnings-search to tag rows inline (`current\t<json>` vs `cross\t<json>`) instead of filtering inside the bun block via process.env.GSTACK_SEARCH_SLUG. The bun block no longer reads SLUG or CROSS env vars — it parses the per-line tag and sets a per-entry _crossProject flag. The pre-existing test/learnings-injection.test.ts still asserted on the old SLUG + CROSS env var shape. Updates: - Remove the SLUG env var assertion (no longer set on bash command line) - Remove the bun-block CROSS env var assertion (block reads the tag now, not the env) - Add a new positive assertion that the bun block parses the tag (sourceTag \| tabIndex \| crossProject) - Keep the shell-interpolation safety assertion unchanged — that's independent of the SLUG refactor The CROSS env var is still SET on the bash command line (it controls whether the cross-project find runs at all), but the bun child no longer reads it. The existing "env vars set on bash command line" test continues to pin that. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(fixtures): regenerate ship-SKILL.md golden baselines ship/SKILL.md consumes the Confidence Calibration resolver via the preamble pipeline. This wave's #1539 pre-emit verification gate extends the resolver text, which propagated to ship/SKILL.md via gen:skill-docs. The golden fixtures in test/fixtures/golden/ matched the pre-#1539 shape and failed the host-config regression check. Refreshes claude-ship-SKILL.md, codex-ship-SKILL.md, and factory-ship-SKILL.md to match the current generated output. Matches the Daegu wave's bisect commit 23 ("test(fixtures): regenerate ship-SKILL.md golden baselines"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(gbrain-detect): include gbrain_pooler_mode in schema regression (PR #1591) PR #1591 (PgBouncer transaction-mode detection, @mikeangstadt) added gbrain_pooler_mode to the gstack-gbrain-detect JSON output but did not update the schema regression check in test/gstack-gbrain-detect-mcp-mode.test.ts. Adding the key in alphabetical order matching the rest of the schema array. Downstream sync-gbrain ignores unknown keys, so this is forward-compat. Without this, the test fails with a diff: + "gbrain_pooler_mode" because keys is the actual set returned and the expected array was pre-#1591. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(release): v1.43.0.0 — post-Daegu paper-cut wave Bumps VERSION 1.42.2.0 → 1.43.0.0 (MINOR per scale-aware bump rules: new env-var surface GSTACK_SYNC__TIMEOUT_MS + GSTACK_CHROMIUM_NO_SANDBOX, behavior expansion in browse/src/browser-manager.ts headless launch, three skill-template prompt changes affecting /retro, /review, /sync-gbrain). CHANGELOG entry leads with what stopped happening: /retro stops fabricating retros against stale bases, /sync-gbrain stops SIGTERM-looping 35-min restarts on big brains, /review stops shipping framework FPs the reviewer never grep'd. 18 fixes total — 15 community PRs + 3 self-filed silent-failure issues (#1624, #1611, #1539) — in one bundled PR with 26 bisect commits and 7 new regression test files. Every wave-touched test file passes in isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore(release): bump v1.43.0.0 → v1.43.2.0 for queue collision CI check-version-stale flagged v1.43.0.0 already claimed by PR #1574 (garrytan/colombo-v3). PR #1639 (garrytan/muscat-v3) claims v1.43.1.0. Next available MINOR slot is v1.43.2.0. Bump VERSION + package.json + CHANGELOG entry header. No behavior changes — purely re-versioning to clear the queue collision. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com> Co-authored-by: Andrey Esipov <andrey.esipov@outlook.com> Co-authored-by: David Foy <davidfoy@users.noreply.github.com> Co-authored-by: mikeangstadt <mike.angstadt@closedloop.ai> Co-authored-by: 0xDevNinja <manmit0x@gmail.com> Co-authored-by: techcenter68 <techcenter68@users.noreply.github.com> Co-authored-by: shohu <shohu33@gmail.com> Co-authored-by: Bharat <bharat@theysaid.io> Co-authored-by: Matteo Hertel <info@matteohertel.com>	2026-05-21 21:21:07 -07:00
Garry Tan	029356e1f0	v1.42.2.0 fix wave: browse launch hardening (2 bug fixes + headed exit-code wiring) (#1629 ) * v1.42.1.1 fix wave: browse launch hardening (2 bug fixes + headed exit-code wiring) Bundles two browse launch-path bug fixes plus the missing exit-code wiring that made the second fix actually work end-to-end. PR #1617 — Chromium sandbox policy at all 3 launch sites - shouldEnableChromiumSandbox() centralizes the Win32 / CI / CONTAINER / root heuristic that previously lived only in the headless launch path. - launch(), launchHeaded() / launchPersistentContext(), and handoff() now share the policy so Playwright stops auto-adding --no-sandbox on every headed launch and the yellow "unsupported command-line flag" infobar disappears on macOS and Linux dev. PR #1626 — clean Cmd+Q stops triggering supervisor respawn - resolveDisconnectCause(browser) reads the underlying Chromium ChildProcess exitCode + signalCode (with a 1s wait for an async exit event) to distinguish clean user-quit from crash. - handleChromiumDisconnect(browser) dispatches the headless launch() disconnect path: clean → exit(0), crash → exit(1). - launchHeaded() disconnect handler resolves cause inline and computes exitCode = 0 (clean) \| 2 (crash) before forwarding to onDisconnect. - handoff() disconnect handler uses the same shared helper. Codex-caught propagation fix (this commit, not in either source PR) - BrowserManager.onDisconnect signature widened to accept an exitCode argument. Without this, launchHeaded's locally-computed exit code was dropped before reaching server.ts. - browse/src/server.ts:688 — onDisconnect callback now forwards the resolved code: (code) => activeShutdown?.(code ?? 2). The ?? 2 preserves legacy crash semantics for callers that invoke onDisconnect without an explicit code. Tests - browse/test/browser-manager-unit.test.ts goes from 2 → 17 tests. - 6 new tests pin shouldEnableChromiumSandbox across darwin / linux / win32 / CI / CONTAINER / root. - 7 new tests pin resolveDisconnectCause across already-exited, async-exit, SIGSEGV, SIGKILL, and null-browser. - 2 new tests (this commit) pin the onDisconnect(exitCode) propagation contract including the exact server.ts forwarding callback shape so a refactor that drops the forward fails CI before the user-visible respawn bug returns. Refs PRs #1617, #1626; companion gbrowser PR #23. * chore: bump version v1.42.1.1 → v1.42.2.0 User-requested rebump (claims v1.42.2.0 slot on the queue). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 19:30:08 -07:00
Garry Tan	7665adf4fe	feat: headed mode + sidebar agent + Chrome extension (v0.12.0) (#517 ) * feat: CDP connect — control real Chrome/Comet via Playwright Add `connectCDP()` to BrowserManager: connects to a running browser via Chrome DevTools Protocol. All existing browse commands work unchanged through Playwright's abstraction layer. - chrome-launcher.ts: browser discovery, CDP probe, auto-relaunch with rollback - browser-manager.ts: connectCDP(), mode guards (close/closeTab/recreateContext/handoff), auto-reconnect on browser restart, getRefMap() for extension API - server.ts: CDP branch in start(), /health gains mode field, /refs endpoint, idle timer only resets on /command (not passive endpoints) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: browse connect/disconnect/focus CLI commands - connect: pre-server command that discovers browser, starts server in CDP mode - disconnect: drops CDP connection, restarts in headless mode - focus: brings browser window to foreground via osascript (macOS) - status: now shows Mode: cdp \| launched \| headed - startServer() accepts extra env vars for CDP URL/port passthrough Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: CDP-aware skill templates — skip cookie import in real browser mode Skills now check `$B status` for CDP mode and skip: - /qa: cookie import prompt, user-agent override, headless workarounds - /design-review: cookie import for authenticated pages - /setup-browser-cookies: returns "not needed" in CDP mode Regenerated SKILL.md files from updated templates. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: activity streaming — SSE endpoint for Chrome extension Side Panel Real-time browse command feed via Server-Sent Events: - activity.ts: ActivityEntry type, CircularBuffer (capacity 1000), privacy filtering (redacts passwords, auth tokens, sensitive URL params), cursor-based gap detection, async subscriber notification - server.ts: /activity/stream SSE, /activity/history REST, handleCommand instrumented with command_start/command_end events - 18 unit tests for filterArgs privacy, emitActivity, subscribe lifecycle Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: Chrome extension Side Panel + Conductor API proposal Chrome extension (Manifest V3, sideload): - Side Panel with live activity feed, @ref overlays, dark terminal aesthetic - Background worker: health polling, SSE relay, ref fetching - Popup: port config, connection status, side panel launcher - Content script: floating ref panel with @ref badges Conductor API proposal (docs/designs/CONDUCTOR_SESSION_API.md): - SSE endpoint for full Claude Code session mirroring in Side Panel - Discovery via HTTP endpoint (not filesystem — extensions can't read files) TODOS.md: add $B watch, multi-agent tabs, cross-platform CDP, Web Store publishing. Mark CDP mode as shipped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor runtime, skip osascript quit for sandboxed apps macOS App Management blocks Electron apps (Conductor) from quitting other apps via osascript. Now detects the runtime environment: - terminal/claude-code/codex: can manage apps freely - conductor: prints manual restart instructions + polls for 60s detectRuntime() checks env vars and parent process. When Chrome needs restart but we can't quit it, prints step-by-step instructions and waits for the user to restart Chrome with --remote-debugging-port. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: detect Conductor via actual env vars (CONDUCTOR_WORKSPACE_NAME) Previous detection checked CONDUCTOR_WORKSPACE_ID which doesn't exist. Conductor sets CONDUCTOR_WORKSPACE_NAME, CONDUCTOR_BIN_DIR, CONDUCTOR_PORT, and __CFBundleIdentifier=com.conductor.app. Check these FIRST because Conductor sessions also have ANTHROPIC_API_KEY (which was matching claude-code). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: connection status pill — floating indicator when gstack controls Chrome Small pill in bottom-right corner of every page: "● gstack · 3 refs" Shows when connected via CDP, fades to 30% opacity after 3s, full on hover. Disappears entirely when disconnected. Background worker now notifies content scripts on connect/disconnect state changes so the pill appears/disappears without polling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome requires --user-data-dir for remote debugging Chrome refuses --remote-debugging-port without an explicit --user-data-dir. Add userDataDir to BrowserBinary registry (macOS Application Support paths) and pass it in both auto-launch and manual restart instructions. Fix double-quoting in CLI manual restart instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: Chrome must be fully quit before launching with --remote-debugging-port Chrome refuses to enable CDP on its default profile when another instance is running (even with explicit --user-data-dir). The only reliable path: fully quit Chrome first, then relaunch with the flag. Updated instructions to emphasize this clearly with verification step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: bin/chrome-cdp — quit Chrome and relaunch with CDP in one command Quits Chrome gracefully, waits for full exit, relaunches with --remote-debugging-port, polls until CDP is ready. Usage: chrome-cdp [port] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use Playwright channel:chrome instead of broken connectOverCDP Playwright's connectOverCDP hangs with Chrome 146 due to CDP protocol version mismatch. Switch to channel:'chrome' which uses Playwright's native pipe protocol to launch the system Chrome binary directly. This is simpler and more reliable: - No CDP port discovery needed - No --remote-debugging-port or --user-data-dir hassles - $B connect just works — launches real Chrome headed window - All Playwright APIs (snapshot, click, fill) work unchanged bin/chrome-cdp updated with symlinked profile approach (kept for manual CDP use cases, but $B connect no longer needs it). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: green border + gstack label on controlled Chrome window Injects a 2px green border and small "gstack" label on every page loaded in the controlled Chrome window via context.addInitScript(). Users can instantly tell which Chrome window Claude controls. Also fixes close() for channel:chrome mode (uses browser.close() not browser.disconnect() which doesn't exist). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: cleanup chrome-launcher runtime detection, remove puppeteer-core dep Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style(design): redesign controlled Chrome indicator Replace crude green border + label with polished indicator: - 2px shimmer gradient at top edge (green→cyan→green, 3s loop) - Floating pill bottom-right with frosted glass bg, fades to 25% opacity after 4s so it doesn't compete with page content - prefers-reduced-motion disables shimmer animation - Much more subtle — looks like a developer tool, not broken CSS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: document real browser mode + Chrome extension in BROWSER.md and README.md BROWSER.md: new sections for connect/disconnect/focus commands, Chrome extension Side Panel install, CDP-aware skills, activity streaming. Updated command reference table, key components, env vars, source map. README.md: updated /browse description, added "Real browser mode" to What's New section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: step-by-step Chrome extension install guide in BROWSER.md Replace terse bullet points with numbered walkthrough covering: developer mode toggle, load unpacked, macOS file picker tip (Cmd+Shift+G), pin extension, configure port, open side panel. Added troubleshooting section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Cmd+Shift+. tip for hidden folders in macOS file picker macOS hides folders starting with . by default. Added both shortcuts: Cmd+Shift+G (paste path directly) and Cmd+Shift+. (show hidden files). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: integrate hidden folder tips into the install flow naturally Move Cmd+Shift+G and Cmd+Shift+. tips inline with the file picker step instead of as a separate tip block after it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: auto-load Chrome extension when $B connect launches Chrome Extension auto-loads via --load-extension flag — no manual chrome://extensions install needed. findExtensionPath() checks repo root, global install, and dev paths. Also adds bin/gstack-extension helper for manual install in regular Chrome, and rewrites BROWSER.md install docs with auto-load as primary path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: /connect-chrome skill — one command to launch Chrome with Side Panel New skill that runs $B connect, verifies the connection, guides the user to open the Side Panel, and demos the live activity feed. Extension auto-loads via --load-extension so no manual chrome://extensions install needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use launchPersistentContext for Chrome extension loading Playwright's chromium.launch() silently ignores --load-extension. Switch to launchPersistentContext with ignoreDefaultArgs to remove --disable-extensions flag. Use bundled Chromium (real Chrome blocks unpacked extensions). Fixed port 34567 for CDP mode so the extension auto-connects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sync extension to DESIGN.md — amber accent, zinc neutrals, grain texture Import design system from gstack-website. Update all extension colors: green (#4ade80) → amber (#F59E0B/#FBBF24), zinc gray neutrals, grain texture overlay. Regenerate icons as amber "G" monogram on dark background. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat with Claude Code — icon opens side panel directly Replace popup flyout with direct side panel open on icon click. Primary UI is now a chat interface that sends messages to Claude Code via file queue. Activity/Refs tabs moved behind a debug toggle in the footer. Command bar with history, auto-poll for responses, amber design system. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent — Claude-powered chat backend via file queue Add /sidebar-command, /sidebar-response, and /sidebar-chat endpoints to the browse server. sidebar-agent.ts watches the command queue file, spawns claude -p with browse context for each message, and streams responses back to the sidebar chat. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove duplicate gstack pill overlay, hide crash restore bubble The addInitScript indicator and the extension's content script were both injecting bottom-right pills, causing duplicates. Remove the pill from addInitScript (extension handles it). Replace --restore-last-session with --hide-crash-restore-bubble to suppress the "Chromium didn't shut down correctly" dialog. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: state file authority — CDP server cannot be silently replaced Hardens the connect/disconnect lifecycle: - ensureServer() refuses to auto-start headless when CDP server is alive - $B connect does full cleanup: SIGTERM → 2s → SIGKILL, profile locks, state - shutdown() cleans Chromium SingletonLock/Socket/Cookie files - uncaughtException/unhandledRejection handlers do emergency cleanup This prevents the bug where a headless server overwrites the CDP server's state file, causing $B commands to hit the wrong browser. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar agent streaming events + session state management Enhance sidebar-agent.ts with: - Live streaming of claude -p events (tool_use, text, result) to sidebar - Session state file for BROWSE_STATE_FILE propagation to claude subprocess - Improved logging (stderr, exit codes, event types) - stdin.end() to prevent claude waiting for input - summarizeToolInput() with path shortening for compact sidebar display Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: sidebar chat UI — streaming events, agent status, reconnect retry Sidebar panel improvements: - Chat tab renders streaming agent events (tool_use, text, result) - Thinking dots animation while agent processes - Agent error display with styled error blocks - tryConnect() with 2s retry loop for initial connection - Debug tabs (Activity/Refs) hidden behind gear toggle - Clear chat button - Compact tool call display with path shortening Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: server-integrated sidebar agent with sessions and message queue Move the sidebar agent from a separate bun process into server.ts: - Agent spawns claude -p directly when messages arrive via /sidebar-command - In-memory chat buffer backed by per-session chat.jsonl on disk - Session manager: create, load, persist, list sessions - Message queue (cap 5) with agent status tracking (idle/processing/hung) - Stop/kill endpoints with queue dismiss support - /health now returns agent status + session info - All sidebar endpoints require Bearer auth - Agent killed on server shutdown - 120s timeout detects hung claude processes Eliminates: file-queue polling, separate sidebar-agent.ts process, stale auth tokens, state file conflicts between processes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: extension auth + token flow for server-integrated agent Update Chrome extension to use Bearer auth on all sidebar endpoints: - background.js captures auth token from /health, exposes via getToken msg - background.js sets openPanelOnActionClick for direct side panel access - sidepanel.js gets token from background, sends in all fetch headers - Health broadcasts include token so sidebar auto-authenticates - Removes popup from manifest — icon click opens side panel directly Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: self-healing sidebar — reconnect banner, state machine, copy button Sidebar UI now handles disconnection gracefully: - Connection state machine: connected → reconnecting → dead - Amber pulsing banner during reconnect (2s retry, 30 attempts) - Red "Server offline" banner with Reconnect + Copy /connect-chrome buttons - Green "Reconnected" toast that fades after 3s on successful reconnect - Copy button lets user paste /connect-chrome into any Claude Code session Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: crash handling — save session, kill agent, distinct exit codes Hardened shutdown/crash behavior: - Browser disconnect exits with code 2 (distinct from crash code 1) - emergencyCleanup kills agent subprocess and saves session state - Clean shutdown saves session before exit (chat history persists) - Clear user message on browser disconnect: "Run $B connect to reconnect" Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: worktree-per-session isolation for sidebar agent Each sidebar session gets an isolated git worktree so the agent's file operations don't conflict with the user's working directory: - createWorktree() creates detached HEAD worktree in ~/.gstack/worktrees/ - Falls back to main cwd for non-git repos or on creation failure - Handles collision cleanup from prior crashes - removeWorktree() cleans up on session switch and shutdown - worktreePath persisted in session.json Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(qa): ISSUE-001 — disconnect blocked by CDP guard in ensureServer $B disconnect was routed through ensureServer() which refused to start a headless server when a CDP state file existed. Disconnect is now handled before ensureServer() (like connect), with force-kill + cleanup fallback when the CDP server is unresponsive. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude binary path for daemon-spawned agent The browse server runs as a daemon and may not inherit the user's shell PATH. Add findClaudeBin() that checks ~/.local/bin/claude (standard install location), which claude, and common system paths. Shows a clear error in the sidebar chat if claude CLI is not found. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve claude symlinks + check Conductor bundled binary posix_spawn fails on symlinks in compiled bun binaries. Now: - Checks Conductor app's bundled binary first (not a symlink) - Scans ~/.local/share/claude/versions/ for direct versioned binaries - Uses fs.realpathSync() to resolve symlinks before spawning Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: compiled bun binary cannot posix_spawn — use external agent process Compiled bun binaries fail posix_spawn on ALL executables (even /bin/bash). The server now writes to an agent queue file, and a separate non-compiled bun process (sidebar-agent.ts) reads the queue, spawns claude, and POSTs events back via /sidebar-agent/event. Changes: - server.ts: spawnClaude writes to queue file instead of spawning directly - server.ts: new /sidebar-agent/event endpoint for agent → server relay - server.ts: fix result event field name (event.text vs event.result) - sidebar-agent.ts: rewritten to poll queue file, relay events via HTTP - cli.ts: $B connect auto-starts sidebar-agent as non-compiled bun process Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: loading spinner on sidebar open while connecting to server Shows an amber spinner with "Connecting..." when the sidebar first opens, replacing the empty state. After the first successful /sidebar-chat poll: - If chat history exists: renders it immediately - If no history: shows the welcome message Prevents the jarring empty-then-populated flash on sidebar open. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: zero-friction side panel — auto-open on install, pill is clickable Three changes to eliminate manual side panel setup: - Auto-open side panel on extension install/update (onInstalled listener) - gstack pill (bottom-right) is now clickable — opens the side panel - Pill has pointer-events: auto so clicks always register (was: none) User no longer needs to find the puzzle piece icon, pin the extension, or know the side panel exists. It opens automatically on first launch and can be re-opened by clicking the floating gstack pill. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: kill CDP naming, delete chrome-launcher.ts dead code The connectCDP() method and connectionMode: 'cdp' naming was a legacy artifact — real Chrome was tried but failed (silently blocks --load-extension), so the implementation already used Playwright's bundled Chromium via launchPersistentContext(). The naming was misleading. Changes: - Delete chrome-launcher.ts (361 LOC) — only import was in unreachable attemptReconnect() method - Delete dead attemptReconnect() and reconnecting field - Delete preExistingTabIds (was for protecting real Chrome tabs we never connect to) - Rename connectCDP() → launchHeaded() - Rename connectionMode: 'cdp' → 'headed' across all files - Replace BROWSE_CDP_URL/BROWSE_CDP_PORT env vars with BROWSE_HEADED=1 - Regenerate SKILL.md files for updated command descriptions - Move BrowserManager unit tests to browser-manager-unit.test.ts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: converge handoff into connect — extension loads on handoff Handoff now uses launchPersistentContext() with extension auto-loading, same as the connect/launchHeaded() path. This means when the agent gets stuck (2FA, CAPTCHA) and hands off to the user, the Chrome extension + side panel are available automatically. Before: handoff used chromium.launch() + newContext() — no extension After: handoff uses chromium.launchPersistentContext() — extension loads Also sets connectionMode to 'headed' and disables dialog auto-accept on handoff, matching connect behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: gate sidebar chat behind --chat flag $B connect (default): headed Chromium + extension with Activity + Refs tabs only. No separate agent spawned. Clean, no confusion. $B connect --chat: same + Chat tab with standalone claude -p agent. Shows experimental banner: "Standalone mode — this is a separate agent from your workspace." Implementation: - cli.ts: parse --chat, set BROWSE_SIDEBAR_CHAT env, conditionally spawn sidebar-agent - server.ts: gate /sidebar-* routes behind chatEnabled, return 403 when disabled, include chatEnabled in /health response - sidepanel.js: applyChatEnabled() hides/shows Chat tab + banner - background.js: forward chatEnabled from health response - sidepanel.html/css: experimental banner with amber styling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: file drop relay + $B inbox command Sidebar agent now writes structured messages to .context/sidebar-inbox/ when processing user input. The workspace agent can read these via $B inbox to see what the user reported from the browser. File drop format: .context/sidebar-inbox/{timestamp}-observation.json { type, timestamp, page: {url}, userMessage, sidebarSessionId } Atomic writes (tmp + rename) prevent partial reads. $B inbox --clear removes messages after display. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B watch — passive observation mode Claude enters read-only mode and captures periodic snapshots (every 5s) while the user browses. Mutation commands (click, fill, etc.) are blocked during watch. $B watch stop exits and returns a summary with the last snapshot. Requires headed mode ($B connect). This is the inverse of the scout pattern — the workspace agent watches through the browser instead of the sidebar relaying to it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add coverage for sidebar-agent, file-drop, and watch mode 33 new tests covering: - Sidebar agent queue parsing (valid/malformed/empty JSONL) - writeToInbox file drop (directory creation, atomic writes, JSON format) - Inbox command (display, sorting, --clear, malformed file handling) - Watch mode state machine (start/stop cycles, snapshots, duration) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: TODOS cleanup + Chrome vs Chromium exploration doc - Update TODOS.md: mark CDP mode, $B watch, sidebar scout as SHIPPED - Delete dead "cross-platform CDP browser discovery" TODO - Rename dependencies from "CDP connect" to "headed mode" - Add docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md memorializing the architecture exploration and decision to use Playwright Chromium Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add Conductor Chrome sidebar integration design doc Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sidebar-agent validates cwd before spawning claude The queue entry may reference a worktree that was cleaned up between sessions. Now falls back to process.cwd() if the path doesn't exist, preventing silent spawn failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: gen-skill-docs resolver merge + preamble tier gate + plan file discovery The local RESOLVERS record in gen-skill-docs.ts was shadowing the imported canonical resolvers, causing stale test coverage and preamble generators to be used instead of the authoritative versions in resolvers/. Changes: - Merge imported RESOLVERS with local overrides (spread + override pattern) - Fix preamble tier gate: tier 1 skills no longer get AskUserQuestion format - Make plan file discovery host-agnostic (search multiple plan dirs) - Add missing E2E tier entries for ship/review plan completion tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: ungate sidebar agent + raise timeout to 5 minutes (v0.12.0) Sidebar chat is now always available in headed mode — no --chat flag needed. Agent tasks get 5 minutes instead of 2, enabling multi-page workflows like navigating directories and filling forms across pages. Changes: - cli.ts: remove --chat flag, always set BROWSE_SIDEBAR_CHAT=1, always spawn agent - server.ts: remove chatEnabled gate (403 response), raise AGENT_TIMEOUT_MS to 300s - sidebar-agent.ts: raise child process timeout from 120s to 300s Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: headed mode + sidebar agent documentation (v0.12.0) - README: sidebar agent section, personal automation example (school parent portal), two auth paths (manual login + cookie import), DevTools MCP mention - BROWSER.md: sidebar agent section with usage, timeout, session isolation, authentication, and random delay documentation - connect-chrome template: add sidebar chat onboarding step - CHANGELOG: v0.12.0 entry covering headed mode, sidebar agent, extension - VERSION: bump to 0.12.0.0 - TODOS: Chrome DevTools MCP integration as P0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files Generated from updated templates + resolver merge. Key changes: - Tier 1 skills no longer include AskUserQuestion format section - Ship/review skills now include coverage gate with thresholds - Connect-chrome skill includes sidebar chat onboarding step - Plan file discovery uses host-agnostic paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate Codex connect-chrome skill Updated preamble with proactive prompt and sidebar chat onboarding step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: network idle, state persistence, iframe support, chain pipe format (v0.12.1.0) (#516) * feat: network idle detection + chain pipe format - Upgrade click/fill/select from domcontentloaded to networkidle wait (2s timeout, best-effort). Catches XHR/fetch triggered by interactions. - Add pipe-delimited format to chain as JSON fallback: $B chain 'goto url \| click @e5 \| snapshot -ic' - Add post-loop networkidle wait in chain when last command was a write. - Frame-aware: commands use target (getActiveFrameOrPage) for locator ops, page-only ops (goto/back/forward/reload) guard against frame context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: $B state save/load + $B frame — new browse commands - state save/load: persist cookies + URLs to .gstack/browse-states/{name}.json File perms 0o600, name sanitized to [a-zA-Z0-9_-]. V1 skips localStorage (breaks on load-before-navigate). Load replaces session via closeAllPages(). - frame: switch command context to iframe via CSS selector, @ref, --name, or --url. 'frame main' returns to main frame. Execution target abstraction (getActiveFrameOrPage) across read-commands, snapshot, and write-commands. - Frame context cleared on tab switch, navigation, resume, and handoff. - Snapshot shows [Context: iframe src="..."] header when in frame. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add tests for network idle, chain pipe format, state, and frame - Network idle: click on fetch button waits for XHR, static click is fast - Chain pipe: pipe-delimited commands, quoted args, JSON still works - State: save/load round-trip, name sanitization, missing state error - Frame: switch to iframe + back, snapshot context header, fill in frame, goto-in-frame guard, usage error New fixtures: network-idle.html (fetch + static buttons), iframe.html (srcdoc) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: review fixes — iframe ref scoping, detached frame recovery, state validation - snapshot.ts: ref locators, cursor-interactive scan, and cursor locator now use target (frame-aware) instead of page — fixes @ref clicking in iframes - browser-manager.ts: getActiveFrameOrPage auto-recovers from detached frames via isDetached() check - meta-commands.ts: state load resets activeFrame, elementHandle disposed after contentFrame(), state file schema validation (cookies + pages arrays), filter empty pipe segments in chain tokenizer - write-commands.ts: upload command uses target.locator() for frame support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: regenerate SKILL.md files + rebuild binary Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: bump version and changelog (v0.12.1.0) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 11:15:24 -06:00

4 Commits