fix: merge main into skill frontmatter PR

Hermes PR Sweep 2026-05-28 02:10:06 -05:00
commit 2e8caafa23
27 changed files with 2359 additions and 149 deletions


@ -317,6 +317,7 @@ from `snapshot`, or `@c` refs from `snapshot -C`. Full table:
| `disconnect` | Close headed Chrome, return to headless |
| `focus [@ref]` | Bring headed Chrome to foreground (macOS); `@ref` also scrolls into view |
| `state save\|load <name>` | Save or load browser state (cookies + URLs) |
| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. Use `--json` for programmatic consumers; text mode renders sorted top-10 tabs with "and N more" tail. |
### Handoff


@ -1,5 +1,58 @@
# Changelog
## [1.51.0.0] - 2026-05-27
## **Long-running browser sessions hold flat RSS on the Bun side. `$B memory` gives every future OOM receipts instead of a screenshot.** Four CDP-resource leak classes closed and pinned with tripwires; a structured diagnostic surfaces Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes in real time.
This release closes four leak classes in the browse server that compounded silently across long sidebar sessions: response-body materialization in the requestfinished listener (multi-GB/hour Buffer churn on media-heavy pages), three undetached CDP session call sites (cdp-bridge, write-commands archive, cdp-inspector), an unbounded modificationHistory array in the CSS inspector, and SSE subscriber cleanup that only fired on the abort edge — TCP-died-without-abort cases (Chromium MV3 service-worker suspend, intermediate proxy half-close) left subscribers in the Set forever holding the controller and any queued bytes. All four have invariant tests; a static-grep tripwire fails CI if a future refactor reintroduces direct `newCDPSession(...)` calls outside the helper module.
Alongside the fixes, `$B memory` and `/memory` ship the diagnostic the original 160 GB OOM investigation was missing: Bun RSS + heap breakdown, per-tab JS heap via CDP `Performance.getMetrics`, Chromium process tree via `SystemInfo.getProcessInfo` (PID + type + CPU), and the bounded buffer sizes (modificationHistory, activity subscribers, inspector subscribers, console/network/dialog buffers, capture buffer bytes). The sidebar footer polls `/memory` every 30s with adaptive backoff (drops to 5min if response time exceeds 2s), and a tab-count guardrail fires soft-warn at 50 / hard-warn at 200 with a top-5-by-RAM toast offering one-click close. Single-tab JS heap above 4 GB triggers an immediate toast, catching the WebGL/video runaway case where one tab balloons without the count ever reaching 200.
### The numbers that matter
Source: this branch's 16 commits + the post-merge audit reports. Net diff: 23 files changed, +2251 / -143 = 2394 LOC across browse server (TypeScript), gstack extension (JS/HTML/CSS), and tests.
| Capability | Before this PR | After this PR |
|---|---|---|
| `requestfinished` body handling | `await res.body()` on every response, allocates full body Buffer for one `.length` read | `req.sizes()` reads structured byte count from `Network.loadingFinished`, zero body materialization, accurate for chunked / gzip / streaming responses |
| CDP session lifecycle (3 sites) | direct `newCDPSession`, detach missing or success-path-only | `withCdpSession` (try/finally detach) + `getOrCreateCdpSession` (cached + close-detach) helpers, all 3 sites migrated, static-grep tripwire prevents regression |
| modificationHistory in CSS inspector | unbounded array, grew for every `$B css` edit across the session | bounded FIFO cap 200, evicted-count surfaced in the undo error so the user knows why their target index is gone |
| SSE subscriber cleanup | abort-edge only; TCP-died-without-abort leaked subscriber + controller + queued bytes until process exit | `createSseEndpoint` helper with cleanup on abort + enqueue-throw + heartbeat-throw, idempotent (any edge fires once) |
| Tab-count visibility | none — user could accumulate hundreds of tabs without warning | soft warn at 50 (activity entry), action toast at 200 (top 5 by RAM + Close-selected + Snooze), single-tab >4 GB triggers immediate toast |
| Diagnostic command | not available | `$B memory` (text + `--json`), `/memory` endpoint (SSE-session-cookie gated), sidebar footer with adaptive backoff |
| Net change in `server.ts` (SSE refactor) | 132 lines of inline ReadableStream wiring across two endpoints | 23 lines, both endpoints route through one helper |
| Test pins for the leak class | none specific | 6 new test files, 45 new tests; static-grep tripwire fails CI on regression |
### What this means for builders
The next time you leave a gbrowser session running for days, the Bun side holds its RSS flat instead of churning on per-response Buffer allocations. If a tab does go rogue, the sidebar footer shows you in real time — `RSS: 5.6 GB · 12 tabs`, color-coded — and a 200-tab toast surfaces the top RAM consumers with one-click close before you hit the OS OOM killer. If the next OOM still fires, `$B memory` is there to give it receipts instead of theory: Activity Monitor says 160 GB; the diagnostic tells you which process tree, which tabs, and which in-memory structures are holding it. Every code path the diagnostic measures is also bounded — modificationHistory at 200, console/network/dialog buffers at 50K via the existing CircularBuffer, SSE subscribers via the new cleanup contract — so the bookkeeping itself can't leak.
### Itemized changes
#### Added
- **`$B memory` command** in `browse/src/memory-command.ts` — text mode with sorted top-10 tabs + "and N more" tail; `--json` mode for programmatic consumers and the sidebar footer poll.
- **`/memory` HTTP endpoint** in `browse/src/server.ts` — same SSE-session-cookie auth model as `/activity/stream`. Deliberately NOT extending `/health` (which already leaks AUTH_TOKEN in headed mode per TODOS.md "Audit /health token distribution").
- **`BrowserManager.getMemorySnapshot()`** — collects Bun process memory + per-tab JS heap via `Performance.getMetrics` (lazy per tracked page, swallows target-died errors) + Chromium process tree via `Browser.newBrowserCDPSession()` + `SystemInfo.getProcessInfo`.
- **`browse/src/memory-snapshot.ts`** — shared types (`MemorySnapshot`, `MemoryTabSnapshot`, `MemoryProcess`, `MemoryStructureStats`) plus `formatBytes()` renderer (4 tiers, 2 decimals at GB).
- **`withCdpSession(page, fn)`** and **`getOrCreateCdpSession(page, cache)`** in `browse/src/cdp-bridge.ts` — lifecycle helpers for one-shot and cached CDP work. Every direct `newCDPSession` call site now routes through one of them.
- **`createSseEndpoint(req, config)`** in `browse/src/sse-helpers.ts` — owns the SSE cleanup contract (abort + enqueue-throw + heartbeat-throw, all idempotent). Built-in lone-surrogate sanitization on every JSON.stringify.
- **Sidebar footer RSS readout** in `extension/sidepanel.{html,js,css}` — polls `/memory` every 30s with 5-minute backoff if response time exceeds 2s. Color-coded thresholds: orange at 2 GB Bun RSS or 50 tabs, red at 8 GB or 200 tabs.
- **Tab guardrail UX** in `extension/sidepanel.js` — top-5-by-RAM toast at 200 tabs OR any single tab over 4 GB JS heap, with checkboxes + Close-selected (via `$B closetab`) + Snooze persisted in `chrome.storage.session`. Snooze bumps the thresholds so the toast stays hidden until the user accumulates more tabs or one tab grows another 2 GB.
- **Static-grep tripwire** (`browse/test/cdp-session-cleanup.test.ts`) — fails CI if any source file outside `cdp-bridge.ts` calls `newCDPSession(...)` directly.
- **45 new tests across 6 files** pinning the leak-fix invariants: CDP session lifecycle (8), SSE cleanup contract (6), modificationHistory cap + evicted-aware error (7), tab guardrail fires-once + re-arms (6), body-materialization reproducer (1), `$B memory` formatter + byte renderer + JSON entry (17).
- **4 follow-up entries in `TODOS.md`** (P2: MV3 SW memory profile, P2: native + GPU memory breakdown, P3: single-context CDP listener via `Target.setAutoAttach`, P3: real-Chromium peak-RSS reproducer for periodic tier).
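The footer polling and color rules described in the list above can be sketched as two pure functions. This is a minimal illustration, not the code in `extension/sidepanel.js`: the function names, the `FooterColor` type, the use of binary gigabytes, and the choice of `>=` at the color thresholds are all assumptions.

```typescript
// Hypothetical sketch: names and GB definition are assumptions, thresholds come from the changelog.
const GB = 1024 * 1024 * 1024; // assumption: binary gigabytes
const BASE_POLL_MS = 30 * 1000; // normal cadence: every 30 s
const BACKOFF_POLL_MS = 5 * 60 * 1000; // backed-off cadence: every 5 min
const SLOW_RESPONSE_MS = 2 * 1000; // "response time exceeds 2s"

type FooterColor = "normal" | "orange" | "red";

/** Pick the delay before the next /memory poll from the last response time. */
function nextPollDelayMs(lastResponseMs: number): number {
  return lastResponseMs > SLOW_RESPONSE_MS ? BACKOFF_POLL_MS : BASE_POLL_MS;
}

/** Orange at 2 GB Bun RSS or 50 tabs, red at 8 GB or 200 tabs. */
function footerColor(bunRssBytes: number, tabCount: number): FooterColor {
  if (bunRssBytes >= 8 * GB || tabCount >= 200) return "red";
  if (bunRssBytes >= 2 * GB || tabCount >= 50) return "orange";
  return "normal";
}
```

Keeping both rules free of timers and DOM access makes them easy to pin with unit tests even without an extension test harness.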
#### Changed
- **`wirePageEvents.requestfinished` no longer materializes response bodies.** Pre-fix: `await res.body()` allocated a Bun `Buffer` of the full response on every fetch just to read `.length`. Post-fix: `req.sizes()` pulls the structured byte count from `Network.loadingFinished` without body fetch. Accurate for chunked transfer, gzip-encoded responses, and streaming media.
- **`modificationHistory` capped at 200 entries with FIFO eviction.** `undoModification` error now reports `"No modification at index N. History has 200 entries (most recent 200 only — M earlier entries evicted at the cap)."` when the requested index is out of range AND the buffer has overflowed.
- **`/activity/stream` and `/inspector/events` refactored through `createSseEndpoint`.** Both endpoints collapse from ~45 lines of inline `ReadableStream` wiring to ~8 lines of helper config; behavior preserved bit-for-bit.
- **`memory` command classified under the `Server` category** in `COMMAND_DESCRIPTIONS` so it appears in the generated SKILL.md tables alongside `status` / `restart` / `handoff`.
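The FIFO cap with an eviction-aware error can be sketched as follows. This is an illustrative stand-in, assuming a hypothetical `BoundedHistory` class; the real `modificationHistory` lives in the CSS inspector and its exact message wording is the one quoted in the list above.

```typescript
// Hypothetical sketch: class and method names are illustrative, not the inspector's API.
class BoundedHistory<T> {
  private entries: T[] = [];
  private evicted = 0;
  private readonly cap: number;

  constructor(cap: number = 200) {
    this.cap = cap;
  }

  /** Append an entry; once over the cap, drop the oldest and count the eviction. */
  push(entry: T): void {
    this.entries.push(entry);
    if (this.entries.length > this.cap) {
      this.entries.shift();
      this.evicted += 1;
    }
  }

  /** Look up an entry, explaining evictions when the index is out of range. */
  get(index: number): T {
    const n = this.entries.length;
    if (index < 0 || index >= n) {
      const tail =
        this.evicted > 0
          ? ` (most recent ${n} only; ${this.evicted} earlier entries evicted at the cap)`
          : "";
      throw new Error(`No modification at index ${index}. History has ${n} entries${tail}.`);
    }
    return this.entries[index] as T;
  }
}
```

Surfacing the evicted count in the error is what lets a user distinguish "I mistyped the index" from "that edit aged out of the buffer".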
#### For contributors
- Plan completion audit: 12 of 17 plan items DONE, 2 CHANGED (deliberate scope decisions documented in the relevant commits — `req.sizes()` swap simpler than a single-context CDP listener; tab guardrail action toast wired through `$B closetab` instead of a `chrome.tabs.remove` bridge), 1 deferred to periodic tier (UI E2E tests).
- Coverage audit: 44% pre-diagnostic-tests → ~62% after adding the formatter coverage. Strong paths (CDP session lifecycle, body materialization, history cap, tab guardrail, SSE cleanup) all at 100% with invariant tests. Extension UI tests deferred (no extension test harness in this repo today).
- The CDP-session cleanup tripwire is the most reusable artifact here — any future addition of CDP work should route through the two helpers. Trying to call `newCDPSession` outside `cdp-bridge.ts` fails CI immediately with a pointer to the right helper.
## [1.48.0.0] - 2026-05-26
## **Agents stop dropping AskUserQuestion options when there are 5+.** A new canonical preamble rule + runtime gate makes Conductor's 4-option cap a split-or-batch decision, not a silent trim.


@ -294,6 +294,26 @@ response in `server.ts`, read
`browse/test/server-sanitize-surrogates.test.ts` pins the wiring with invariant
tests, so bypasses fail CI.
**SSE endpoint helper** (v1.51.0.0+). New SSE endpoints in `server.ts` MUST route
through `createSseEndpoint(req, config)` from `browse/src/sse-helpers.ts`. The
helper owns the cleanup contract (abort + enqueue-throw + heartbeat-throw, all
idempotent) and bakes in `sanitizeLoneSurrogates` on every JSON.stringify, so
new subscribers can't accidentally regress either invariant. Inline
`ReadableStream` wiring leaked subscribers when the TCP connection died without
firing `req.signal.abort` (Chromium MV3 service-worker suspend, intermediate
proxy half-close). `/activity/stream`, `/inspector/events`, and `/memory`
(SSE-eligible) all route through it. `browse/test/sse-helpers.test.ts` pins the
cleanup contract.
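
The idempotent cleanup contract can be illustrated with a small sketch. It is not the code in `browse/src/sse-helpers.ts`: `once`, `subscribe`, and the `Subscriber` interface are hypothetical names, and the heartbeat timer is omitted to keep the example synchronous.

```typescript
// Hypothetical sketch of "any edge fires cleanup once"; not the real createSseEndpoint.
type Cleanup = () => void;

/** Wrap a cleanup so abort, enqueue-throw, and heartbeat-throw can all call it safely. */
function once(cleanup: Cleanup): Cleanup {
  let done = false;
  return () => {
    if (done) return;
    done = true;
    cleanup();
  };
}

interface Subscriber {
  enqueue(chunk: string): void;
}

/** Register a subscriber and return a sender that cleans up when enqueue throws. */
function subscribe(
  subscribers: Set<Subscriber>,
  sub: Subscriber,
): { send: (chunk: string) => void; cleanup: Cleanup } {
  subscribers.add(sub);
  const cleanup = once(() => {
    subscribers.delete(sub);
  });
  const send = (chunk: string): void => {
    try {
      sub.enqueue(chunk);
    } catch {
      // A dead connection surfaces as a throw on write; treat it like an abort.
      cleanup();
    }
  };
  return { send, cleanup };
}
```

In a real endpoint the same `cleanup` would also be attached to the request's abort signal and called from the heartbeat's catch block, so whichever edge fires first removes the subscriber.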
**CDP session lifecycle** (v1.51.0.0+). Direct `page.context().newCDPSession(page)`
calls outside `browse/src/cdp-bridge.ts` fail CI via the static-grep tripwire in
`browse/test/cdp-session-cleanup.test.ts`. Use `withCdpSession(page, async (s) => {...})`
for one-shot CDP work (try/finally detach) or `getOrCreateCdpSession(page, cache)`
for cached sessions tied to a page's lifetime (close-detach via a caller-owned `WeakMap<Page, session>`).
Three sites migrated: cdp-bridge frame events, write-commands archive capture,
cdp-inspector. The helpers prevent the per-session leak class where successful-path
detach happened but error-path detach was missed.
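
A static-grep tripwire of this kind can be sketched as a pure function over file contents. This is an assumption-laden illustration rather than the contents of `browse/test/cdp-session-cleanup.test.ts`: the function name is hypothetical, and the sketch takes an in-memory map instead of reading the source tree from disk.

```typescript
// Hypothetical sketch: the real test presumably walks the source tree; this takes a map instead.
/** Return the paths of files that call newCDPSession( outside the allowed module. */
function findDirectCdpSessionCalls(
  files: Record<string, string>,
  allowedSuffix: string = "cdp-bridge.ts",
): string[] {
  const offenders: string[] = [];
  for (const path of Object.keys(files)) {
    if (path.endsWith(allowedSuffix)) continue; // the helper module may call it directly
    const source = files[path] ?? "";
    if (source.includes("newCDPSession(")) offenders.push(path);
  }
  return offenders.sort();
}
```

A plain substring match is deliberately naive: it would also flag a comment that contains the call text, which is an acceptable false positive for a tripwire whose job is to fail loudly.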
**Setup symlink hardening** (v1.38.0.0+). Every link site in `setup` MUST route
through the `_link_or_copy SRC DST` helper near the `IS_WINDOWS` detection. On
Windows without Developer Mode, plain `ln -snf` produces frozen file copies that


@ -963,6 +963,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| `disconnect` | Disconnect headed browser, return to headless mode |
| `focus [@ref]` | Bring headed browser window to foreground (macOS) |
| `handoff [message]` | Open visible Chrome at current page for user takeover |
| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. JSON output with --json. |
| `restart` | Restart server |
| `resume` | Re-snapshot after user takeover, return control to AI |
| `state save\|load <name>` | Save/load browser state (cookies + URLs) |

TODOS.md

@ -1,5 +1,140 @@
# TODOS
## gbrowser memory follow-ups (filed via /plan-eng-review + /codex on the v1.49 leak-fix PR)
These four items came out of the memory-leak investigation that shipped
the `$B memory` diagnostic + the four leak fixes. They were
deliberately deferred from that PR (already 14 commits / ~12 files);
each stands alone and any one could ship independently.
### P2: MV3 extension service worker memory profile
**What:** The `/memory` endpoint snapshot enumerates pages but does
not enumerate the gstack baked-in extension's service-worker target.
A long-running MV3 service worker can leak through retained DOM
snapshots, message ports that never close, alarms that re-arm, and
caches that grow without bound. The diagnostic should call
`Target.getTargets` with a filter for `service_worker` and include
each one in `tabs[]` (or a sibling `serviceWorkers[]` array) with the
same `Performance.getMetrics` data.
**Why:** Codex's outside-voice review on the eng-review surfaced this
class of leak (the extension is part of the gbrowser process tree but
invisible to today's snapshot). Until we surface it, a SW leak shows
up only in the parent process RSS with no per-target attribution.
**Pros:** Closes the per-target attribution gap for the
single-most-likely future leak source (our own extension).
**Cons:** Extension SW lifecycle is asymmetric vs page lifecycle;
auto-attach + filter is one more piece of CDP plumbing.
**Context:** Codex finding #4 on the eng-review outside voice. Not
in scope of the v1.49 PR; deliberately deferred to keep the PR to
the four highest-confidence leak fixes.
**Priority:** P2. **Effort:** M.
---
### P2: Native + GPU memory breakdown in `$B memory`
**What:** `$B memory` shows Bun RSS + per-tab JS heap + Chromium
process tree (PIDs + types + CPU time) but the per-process RSS is
absent — `SystemInfo.getProcessInfo` doesn't expose RSS and the eng
review (D2 USE_CDP) explicitly chose CDP over shelling to `ps`. The
honest next step is to surface what CDP DOES give for the other
memory categories: `Memory.getDOMCounters` per target (node + listener
counts), `SystemInfo.getInfo` for GPU memory, `Memory.getAllTimeSamplingProfile`
for a sampled native estimate.
**Why:** Codex's outside-voice review flagged that
`Performance.getMetrics` misses native memory, GPU memory, video
buffers, Skia, network cache, extension process RSS, and
browser-process RSS — all the categories where a 160 GB leak would
actually live. A diagnostic that misses the categories where the
leak class lives undersells itself.
**Pros:** Per-process category breakdown closes the gap between
"Activity Monitor says 160 GB" and what the diagnostic shows.
**Cons:** Each CDP method has its own quirks; this is a real
implementation pass, not a one-line addition.
**Context:** Codex finding #5 on the eng-review outside voice. Not
in scope of the v1.49 PR; deliberately deferred.
**Priority:** P2. **Effort:** M.
---
### P3: Single-context CDP listener for Network.loadingFinished
**What:** `wirePageEvents` attaches a `page.on('requestfinished')`
listener PER PAGE. The D10 fix removed the body-materialization leak
inside that listener but kept the per-page listener architecture
(7 listeners attached per tab — close, framenavigated, dialog,
console, request, response, requestfinished). The stretch goal from
D10 was to replace the per-page `requestfinished` listener with a
single context-level CDP listener via
`Target.setAutoAttach({autoAttach: true, waitForDebuggerOnStart: false,
flatten: true})` and a browser-wide `Network.loadingFinished` event
handler.
**Why:** Going from N to 1 listener for the request-size capture is
structurally the right architecture and removes one piece of per-tab
memory pressure. The body-materialization fix already addressed the
acute leak; this is the architectural cleanup that prevents similar
leaks in the same class.
**Pros:** One listener per browser instead of one per tab.
**Cons:** `Target.setAutoAttach` plumbing is more code than the
straight per-page listener; the marginal memory win is small on top
of the body-fetch fix that already landed.
**Context:** D10 stretch goal on the eng-review. The minimal-risk
fix shipped in v1.49 (replaces `await res.body()` with
`await req.sizes()`, preserving the per-page listener); this is the
architectural follow-up.
**Priority:** P3. **Effort:** M-L.
---
### P3: Real-Chromium peak-RSS reproducer (periodic tier)
**What:** The gate-tier reproducer
(`browse/test/memory-leak-reproducer.test.ts`) pins the invariant
that `res.body()` is never called during a burst of
`requestfinished` events. It uses a fake page; it does NOT spin up a
real Chromium nor measure peak Bun RSS during a real concurrent fetch
burst. A periodic-tier follow-up should: spin up a real headless
Chromium, navigate to a fixture page that concurrently fetches 500
mixed responses (small JSON, 100 KB images, 10 MB chunked,
gzip-compressed 2 MB), sample `process.memoryUsage().heapUsed` every
100 ms during the burst, assert `peak_heap < 200 MB above baseline`
AND `post-gc_heap < 30 MB above baseline`. Also include a single-tab
WebGL canvas variant that grows to >4 GB and asserts the per-tab RSS
toast fires.
**Why:** Codex flagged that the leak's real failure mode is transient
amplification under concurrent burst, not a retained leak; a steady-state
heap test misses it. The fake-page gate-tier test catches the
listener-architecture regression; the periodic real-browser test
catches the actual peak-RSS class.
**Pros:** Closes the "did we actually demonstrate the OOM is fixed"
question with hard numbers. Feeds the ANGLE_B_NUMBERS CHANGELOG
release-summary table.
**Cons:** Periodic tier costs minutes of CI time and money per run;
real-browser memory tests are inherently flaky.
**Context:** Codex outside-voice finding on the eng-review; D7
ANGLE_B_NUMBERS CHANGELOG framing needs this reproducer's numbers
before /ship time.
**Priority:** P3. **Effort:** M.
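
The pass/fail rule proposed in this entry can be sketched as a pure function over heap samples. The names `judgeHeapBurst` and `HeapVerdict` are hypothetical, and the sampling loop itself (reading `process.memoryUsage().heapUsed` every 100 ms against a real Chromium) is left out; only the two thresholds from the text above are encoded.

```typescript
// Hypothetical sketch: evaluates samples already collected by some sampling loop.
const MB = 1024 * 1024; // assumption: binary megabytes

interface HeapVerdict {
  peakDelta: number; // peak heap minus baseline, in bytes
  postGcDelta: number; // post-GC heap minus baseline, in bytes
  ok: boolean;
}

/** Apply the proposed limits: peak < 200 MB above baseline and post-GC < 30 MB above baseline. */
function judgeHeapBurst(baseline: number, samples: number[], postGc: number): HeapVerdict {
  const peak = samples.reduce((max, s) => Math.max(max, s), baseline);
  const peakDelta = peak - baseline;
  const postGcDelta = postGc - baseline;
  return {
    peakDelta,
    postGcDelta,
    ok: peakDelta < 200 * MB && postGcDelta < 30 * MB,
  };
}
```

Separating the verdict from the sampling keeps the thresholds testable at gate tier, while the flaky real-browser measurement stays in the periodic tier.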
---
## design daemon: follow-ups (filed v1.45.0.0 via /ship review army)
### ✅ DONE (v1.45.0.0): Tighten daemon test coverage


@ -921,6 +921,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
| `disconnect` | Disconnect headed browser, return to headless mode |
| `focus [@ref]` | Bring headed browser window to foreground (macOS) |
| `handoff [message]` | Open visible Chrome at current page for user takeover |
| `memory [--json]` | Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. JSON output with --json. |
| `restart` | Restart server |
| `resume` | Re-snapshot after user takeover, return control to AI |
| `state save\|load <name>` | Save/load browser state (cookies + URLs) |


@ -18,9 +18,12 @@
import { chromium, type Browser, type BrowserContext, type BrowserContextOptions, type Page, type Locator, type Cookie } from 'playwright';
import { writeSecureFile, mkdirSecure } from './file-permissions';
import { addConsoleEntry, addNetworkEntry, addDialogEntry, networkBuffer, type DialogEntry } from './buffers';
import { emitActivity } from './activity';
import { validateNavigationUrl } from './url-validation';
import { TabSession, type RefEntry } from './tab-session';
import { resolveChromiumProfile, cleanSingletonLocks } from './config';
import { withCdpSession } from './cdp-bridge';
import type { MemorySnapshot, MemoryStructureStats, MemoryTabSnapshot, MemoryProcess } from './memory-snapshot';
/**
* Detect whether GSTACK_CHROMIUM_PATH points at a custom Chromium build that
@ -194,6 +197,51 @@ export class BrowserManager {
private connectionMode: 'launched' | 'headed' = 'launched';
private intentionalDisconnect = false;
// ─── Tab Count Guardrail (D5 + Codex single-tab flag) ───────
// Idempotent threshold trackers: each guardrail fires exactly once per
// upward crossing of its threshold and re-arms when the tab count drops
// back below. Pre-guardrail, nothing tracked tab count growth and a
// user could accumulate hundreds of tabs (each holding 50-300 MB of
// Chromium-side RSS) without warning until the OS OOM-killer fired.
// The toast UX lives in the sidebar (extension/sidepanel.js); the
// server-side responsibility is the audit-trail activity entry that
// appears in the activity feed even when the sidebar is closed.
private static readonly TAB_GUARDRAIL_SOFT = 50;
private static readonly TAB_GUARDRAIL_HARD = 200;
private tabGuardrailSoftHit = false;
private tabGuardrailHardHit = false;
/**
* Called from context.on('page') after a new tab is tracked. Emits at
* most one activity entry per upward crossing of each threshold.
*/
private checkTabGuardrails(): void {
const total = this.pages.size;
if (!this.tabGuardrailSoftHit && total >= BrowserManager.TAB_GUARDRAIL_SOFT) {
this.tabGuardrailSoftHit = true;
const msg = `Tab count crossed ${BrowserManager.TAB_GUARDRAIL_SOFT} (now ${total}). Consider closing unused tabs — each Chromium tab holds 50-300 MB.`;
console.warn(`[browse] ${msg}`);
emitActivity({ type: 'error', command: 'tab-guardrail', error: msg, tabs: total });
}
if (!this.tabGuardrailHardHit && total >= BrowserManager.TAB_GUARDRAIL_HARD) {
this.tabGuardrailHardHit = true;
const msg = `Tab count crossed ${BrowserManager.TAB_GUARDRAIL_HARD} (now ${total}). OOM risk imminent. Open the sidebar to see top RAM consumers.`;
console.error(`[browse] ${msg}`);
emitActivity({ type: 'error', command: 'tab-guardrail', error: msg, tabs: total });
}
}
/** Called from page.on('close') so the guardrails re-arm. */
private recheckTabGuardrailsOnClose(): void {
const total = this.pages.size;
if (this.tabGuardrailSoftHit && total < BrowserManager.TAB_GUARDRAIL_SOFT) {
this.tabGuardrailSoftHit = false;
}
if (this.tabGuardrailHardHit && total < BrowserManager.TAB_GUARDRAIL_HARD) {
this.tabGuardrailHardHit = false;
}
}
// Called when the headed browser disconnects without intentional teardown
// (user closed the window). Wired up by server.ts to run full cleanup
// (sidebar-agent, state file, profile locks) before exiting with code 2.
@ -620,6 +668,7 @@ export class BrowserManager {
// Inject indicator on the new tab
page.evaluate(indicatorScript).catch(() => {});
console.log(`[browse] New tab detected (id=${id}, total=${this.pages.size})`);
this.checkTabGuardrails();
});
// Persistent context opens a default page — adopt it instead of creating a new one
@ -1004,6 +1053,116 @@ export class BrowserManager {
}
}
/**
* Diagnostic for `$B memory` and the /memory endpoint.
*
* Collects:
* - Bun process memory (cross-platform, accurate, no shelling).
* - Per-tab JS heap via CDP Performance.getMetrics, the most portable
* per-tab signal CDP exposes. Misses native/GPU/Skia/cache memory
* (Codex flag on the eng-review; see follow-up TODO "native/GPU
* memory breakdown").
* - Chromium process tree via SystemInfo.getProcessInfo: PID + type
* + CPU time. Per-process RSS is NOT exposed via CDP and the eng
* review (D2 USE_CDP) explicitly chose CDP over shelling to `ps`,
* so RSS columns are absent and `notes[]` says why.
*
* `structures` is passed in by the caller (read-commands / server) so
* browser-manager doesn't take a hard dep on every buffer-owning module.
*/
async getMemorySnapshot(structures: MemoryStructureStats): Promise<MemorySnapshot> {
const bunMem = process.memoryUsage();
const notes: string[] = [];
// Per-tab JS heap. Lazy: only the pages we already track. A target
// that died mid-snapshot is omitted, never throws.
const tabs: MemoryTabSnapshot[] = [];
for (const [id, page] of this.pages) {
try {
const url = (() => { try { return page.url(); } catch { return ''; } })();
const title = await page.title().catch(() => '');
const metrics = await withCdpSession(page, async (session) => {
await session.send('Performance.enable').catch(() => undefined);
const result = await session.send('Performance.getMetrics');
return ((result as { metrics?: Array<{ name: string; value: number }> }).metrics) ?? [];
});
const mm: Record<string, number> = {};
for (const m of metrics) mm[m.name] = m.value;
tabs.push({
id,
url,
title,
jsHeapUsed: mm.JSHeapUsedSize ?? 0,
jsHeapTotal: mm.JSHeapTotalSize ?? 0,
documents: mm.Documents ?? 0,
nodes: mm.Nodes ?? 0,
listeners: mm.JSEventListeners ?? 0,
});
} catch {
// Target died or CDP unavailable mid-snapshot — skip this tab.
}
}
// Chromium process tree. Browser handle may be on the `browser` field
// (launched mode) or accessible via `context.browser()` (persistent
// context / headed mode); try both.
let processes: MemoryProcess[] | null = null;
const browser: Browser | null = this.browser ?? (this.context ? this.context.browser() : null);
if (browser) {
try {
// `newBrowserCDPSession` is browser-wide. Not exposed on every
// Playwright TypeScript surface, but present at runtime on the
// Browser instance — use a typed cast to avoid the @ts-expect-error.
type BrowserWithCDP = Browser & {
newBrowserCDPSession?: () => Promise<{
send: (method: string, params?: unknown) => Promise<unknown>;
detach: () => Promise<void>;
}>;
};
const maybeFactory = (browser as BrowserWithCDP).newBrowserCDPSession;
if (typeof maybeFactory === 'function') {
const browserSession = await maybeFactory.call(browser);
try {
const info = (await browserSession.send('SystemInfo.getProcessInfo')) as {
processInfo?: Array<{ id: number; type: string; cpuTime: number }>;
};
processes = (info.processInfo ?? []).map((p) => ({
id: p.id,
type: p.type,
cpuTime: p.cpuTime,
}));
notes.push(
'Per-Chromium-process RSS not collected — SystemInfo.getProcessInfo exposes PID+type+CPU only. ' +
'See follow-up TODO "native/GPU memory breakdown" for the deferred fix.',
);
} finally {
await browserSession.detach().catch(() => undefined);
}
} else {
notes.push('Playwright build does not expose newBrowserCDPSession; per-process info skipped.');
}
} catch (err: any) {
notes.push(`CDP browser session unavailable: ${err?.message ?? String(err)}`);
}
} else {
notes.push('Browser handle unavailable (server connection mode); per-process info skipped.');
}
return {
bunServer: {
rss: bunMem.rss,
heapUsed: bunMem.heapUsed,
heapTotal: bunMem.heapTotal,
external: bunMem.external,
},
tabs,
processes,
structures,
capturedAt: Date.now(),
notes,
};
}
// ─── Ref Map (delegates to active session) ──────────────────
setRefMap(refs: Map<string, RefEntry>) {
this.getActiveSession().setRefMap(refs);
@ -1530,6 +1689,7 @@ export class BrowserManager {
break;
}
}
this.recheckTabGuardrailsOnClose();
});
// Clear ref map on navigation — refs point to stale elements after page change
@ -1598,23 +1758,38 @@ export class BrowserManager {
}
});
// Capture response sizes via response finished
// Capture response sizes via requestfinished — but DO NOT call
// response.body() here. Pre-fix, this listener materialized every
// response body across CDP just to read .length: multi-GB/hour of
// Buffer churn on long-lived headed Chromium with media-heavy
// pages, the primary Bun-side accelerant on the gbrowser-OOM
// investigation. req.sizes() pulls from the Network.loadingFinished
// event Chromium already emits — accurate for chunked transfer,
// gzip-compressed responses, and streaming media, all the cases
// where the previous Content-Length-header approach would have
// missed the size.
//
// The "single context-level CDP listener" architecture (D10's
// stretch goal — would reduce per-page listener count from N to 1
// via Target.setAutoAttach) is deferred. TODOS.md tracks it.
page.on('requestfinished', async (req) => {
try {
const res = await req.response();
if (res) {
const url = req.url();
const body = await res.body().catch(() => null);
const size = body ? body.length : 0;
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.size) {
networkBuffer.set(i, { ...entry, size });
break;
}
const sizes = await req.sizes().catch(() => null);
if (!sizes) return;
const url = req.url();
const size = sizes.responseBodySize ?? 0;
for (let i = networkBuffer.length - 1; i >= 0; i--) {
const entry = networkBuffer.get(i);
if (entry && entry.url === url && !entry.size) {
networkBuffer.set(i, { ...entry, size });
break;
}
}
} catch {}
} catch {
// Best-effort: requestfinished fires for aborted/cached requests too,
// where sizes() is unavailable. Missing size is acceptable; an
// unhandled throw would spam the console on every cache hit.
}
});
}
}


@ -25,18 +25,84 @@ import { logTelemetry } from './telemetry';
const CDP_TIMEOUT_MS = 5000;
const CDP_ACQUIRE_TIMEOUT_MS = 5000;
// Per-page CDPSession cache. Created lazily on first allow-listed call,
// cleaned up when the page closes.
// ─── CDP session lifecycle helpers ─────────────────────────────
//
// Every direct `newCDPSession(page)` call needs a matching `session.detach()`
// to release the Chromium-side CDP target. Forgetting the detach leaves the
// target attached until the underlying transport drops (often process exit),
// which on a long-lived headed browser shows up as steadily-climbing
// browser-process RSS. To make the leak class unforgettable, callers should
// go through one of these two helpers and a static-grep test
// (browse/test/cdp-session-cleanup.test.ts) fails CI if any source file
// calls `newCDPSession(` outside this module.
/**
* Ephemeral CDP session with try/finally detach. Use for one-shot CDP work
* where the caller doesn't need session reuse, e.g. archive snapshots,
* `$B memory`, a single `Page.captureScreenshot`. The session is detached
* in `finally` regardless of whether `fn` threw, so the Chromium target
* doesn't leak on the error path.
*
* For repeated use of the same page (e.g. the `$B cdp` bridge or the
* inspector), use `getOrCreateCdpSession` instead; it caches and detaches
* on page close.
*/
export async function withCdpSession<T>(
page: Page,
fn: (session: any) => Promise<T>,
): Promise<T> {
const session = await page.context().newCDPSession(page);
try {
return await fn(session);
} finally {
try {
await session.detach();
} catch {
// Best-effort cleanup. Session may already be detached (target closed,
// context recreated, browser disconnect). Swallowing all errors is the
// correct cleanup posture per CLAUDE.md "best-effort cleanup paths".
}
}
}
/**
* Cached long-lived CDP session keyed by Page. First call creates the
* session and registers a `page.once('close', ...)` hook that removes the
* cache entry AND calls `session.detach()`. Pre-helper code only removed
* the cache entry, leaving the Chromium-side target attached.
*
* Pass a caller-owned WeakMap so this helper doesn't impose a single global
* cache: the `$B cdp` bridge and the inspector each keep their own session
* pool with different invariants (e.g. the inspector also detaches on
* `framenavigated` because DOM/CSS domain state is tied to the document).
*/
export async function getOrCreateCdpSession(
page: Page,
cache: WeakMap<Page, any>,
): Promise<any> {
let session = cache.get(page);
if (session) return session;
session = await page.context().newCDPSession(page);
cache.set(page, session);
page.once('close', () => {
cache.delete(page);
session.detach().catch(() => {
// Best-effort cleanup — see withCdpSession finally block.
});
});
return session;
}
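
The cached variant can be sketched the same way. This is an illustration only, assuming hypothetical `ClosablePage` and `Detachable` interfaces and an injected `open` factory in place of `page.context().newCDPSession(page)`.

```typescript
// Hypothetical minimal surfaces; the real helper works on Playwright's Page.
interface Detachable {
  detach(): Promise<void>;
}
interface ClosablePage {
  once(event: 'close', listener: () => void): void;
}

// Returns the cached session for a page, creating it on first use. The close
// hook removes the cache entry AND detaches the session.
async function getOrCreate<P extends ClosablePage, S extends Detachable>(
  page: P,
  cache: WeakMap<P, S>,
  open: (page: P) => Promise<S>,
): Promise<S> {
  const cached = cache.get(page);
  if (cached) return cached;
  const session = await open(page);
  cache.set(page, session);
  page.once('close', () => {
    cache.delete(page);
    session.detach().catch(() => {
      // Best-effort cleanup.
    });
  });
  return session;
}

// Fake page whose close() fires the registered one-shot listeners.
function makeFakePage(): ClosablePage & { close(): void } {
  const listeners: Array<() => void> = [];
  return {
    once: (_event, listener) => {
      listeners.push(listener);
    },
    close: () => {
      for (const l of listeners.splice(0)) l();
    },
  };
}
```

One property of this sketch worth noting: two concurrent first calls for the same page could each open a session before either populates the cache. Whether that matters depends on how callers are serialized.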
// ─── $B cdp bridge ─────────────────────────────────────────────
// Per-page CDPSession cache. Lifecycle delegated to getOrCreateCdpSession
// which registers a close hook that BOTH removes the cache entry AND calls
// session.detach() — pre-helper code only did the former, leaving the
// Chromium-side target attached.
const sessionCache: WeakMap<Page, any> = new WeakMap();
async function getCdpSession(page: Page): Promise<any> {
let s = sessionCache.get(page);
if (s) return s;
s = await page.context().newCDPSession(page);
sessionCache.set(page, s);
// Clear cache on detach so we don't hold a stale handle.
page.once('close', () => sessionCache.delete(page));
return s;
return getOrCreateCdpSession(page, sessionCache);
}
export interface CdpDispatchInput {


@ -13,6 +13,7 @@
*/
import type { Page } from 'playwright';
import { getOrCreateCdpSession } from './cdp-bridge';
// ─── Types ──────────────────────────────────────────────────────
@ -106,15 +107,23 @@ async function getOrCreateSession(page: Page): Promise<any> {
}
}
session = await page.context().newCDPSession(page);
cdpSessions.set(page, session);
session = await getOrCreateCdpSession(page, cdpSessions);
// Enable DOM and CSS domains
await session.send('DOM.enable');
await session.send('CSS.enable');
initializedPages.add(page);
// Enable DOM and CSS domains on first init for this page. The session
// itself is cached + close-detached by getOrCreateCdpSession; the
// initializedPages WeakSet is inspector-layer state that needs its
// own close hook to stay in sync.
if (!initializedPages.has(page)) {
await session.send('DOM.enable');
await session.send('CSS.enable');
initializedPages.add(page);
page.once('close', () => initializedPages.delete(page));
}
// Auto-detach on navigation
// Auto-detach on navigation — DOM/CSS domain state is tied to the
// document. Close-detach (from getOrCreateCdpSession) handles the
// tab-close case; framenavigated catches in-tab navigation that
// invalidates inspector state without closing the tab.
page.once('framenavigated', () => {
try {
session.detach().catch(() => {});
@ -130,7 +139,41 @@ async function getOrCreateSession(page: Page): Promise<any> {
// ─── Modification History ───────────────────────────────────────
// Bounded FIFO of style modifications. Pre-cap, this was an unbounded
// module-scoped array that grew for every CSS edit made through $B css
// across the whole browser session — small per-entry footprint but no
// upper bound, the kind of slow leak that compounds over multi-day
// inspector use. The cap is 200 because per-session undo workflows
// rarely walk back more than a handful of edits, and a user who really
// wants to roll a long change back can `$B css reset` to revert all of
// them. totalPushed is monotonic across the session so undoModification
// can tell the user when their target index has been evicted, instead
// of just "no modification at index N".
const MOD_HISTORY_CAP = 200;
const modificationHistory: StyleModification[] = [];
let modHistoryTotalPushed = 0;
function pushModification(mod: StyleModification): void {
modificationHistory.push(mod);
modHistoryTotalPushed++;
while (modificationHistory.length > MOD_HISTORY_CAP) {
modificationHistory.shift();
}
}
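
The cap mechanics above reduce to a small bounded FIFO plus a monotonic counter. A minimal generic sketch, assuming a hypothetical `createBoundedHistory` factory (the repository keeps this state at module scope instead):

```typescript
// Hypothetical generic version of the capped history; names are illustrative.
function createBoundedHistory<T>(cap: number) {
  const items: T[] = [];
  let totalPushed = 0;
  return {
    // Append, then evict from the front until the buffer fits the cap.
    push(item: T): void {
      items.push(item);
      totalPushed++;
      while (items.length > cap) items.shift();
    },
    // Oldest entry still retained, or undefined when empty.
    oldest(): T | undefined {
      return items[0];
    },
    // Occupancy, cap, and how many entries have been evicted so far.
    stats(): { current: number; cap: number; evicted: number } {
      return {
        current: items.length,
        cap,
        evicted: Math.max(0, totalPushed - cap),
      };
    },
  };
}
```

Pushing 250 entries into a history with a cap of 200 leaves 200 retained and reports 50 evicted, which is the arithmetic the eviction-aware error message relies on.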
// Test-only entry: exposes the history-cap mechanics (push, reset, cap value)
// without requiring a CDP-driven Page. Production code must go through
// modifyStyle / undoModification / resetModifications.
export const __testInternals = {
pushModification,
MOD_HISTORY_CAP,
getRawHistory: () => modificationHistory.slice(),
getTotalPushed: () => modHistoryTotalPushed,
resetForTest: () => {
modificationHistory.length = 0;
modHistoryTotalPushed = 0;
},
};
// ─── Specificity Calculation ────────────────────────────────────
@ -559,7 +602,7 @@ export async function modifyStyle(
method,
};
modificationHistory.push(modification);
pushModification(modification);
return modification;
}
@ -569,7 +612,12 @@ export async function modifyStyle(
export async function undoModification(page: Page, index?: number): Promise<void> {
const idx = index ?? modificationHistory.length - 1;
if (idx < 0 || idx >= modificationHistory.length) {
throw new Error(`No modification at index ${idx}. History has ${modificationHistory.length} entries.`);
const evictedNote = modHistoryTotalPushed > MOD_HISTORY_CAP
? ` (most recent ${MOD_HISTORY_CAP} only — ${modHistoryTotalPushed - MOD_HISTORY_CAP} earlier entries evicted at the cap)`
: '';
throw new Error(
`No modification at index ${idx}. History has ${modificationHistory.length} entries${evictedNote}.`,
);
}
const mod = modificationHistory[idx];
@ -622,6 +670,23 @@ export function getModificationHistory(): StyleModification[] {
return [...modificationHistory];
}
/**
* Diagnostic accessor for the $B memory snapshot. Returns current buffer
* occupancy, the cap, and how many entries have been evicted since the
* last reset.
*/
export function getModificationHistoryStats(): {
current: number;
cap: number;
evicted: number;
} {
return {
current: modificationHistory.length,
cap: MOD_HISTORY_CAP,
evicted: Math.max(0, modHistoryTotalPushed - MOD_HISTORY_CAP),
};
}
/**
* Reset all modifications, restoring original values.
*/
@ -648,6 +713,7 @@ export async function resetModifications(page: Page): Promise<void> {
}
}
modificationHistory.length = 0;
modHistoryTotalPushed = 0;
}
/**


@ -45,6 +45,7 @@ export const META_COMMANDS = new Set([
'domain-skill',
'skill',
'cdp',
'memory',
]);
export const ALL_COMMANDS = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS]);
@ -89,6 +90,7 @@ export function wrapUntrustedContent(result: string, url: string): string {
export const COMMAND_DESCRIPTIONS: Record<string, { category: string; description: string; usage?: string }> = {
// Navigation
'memory': { category: 'Server', description: 'Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes. JSON output with --json.', usage: 'memory [--json]' },
'goto': { category: 'Navigation', description: 'Navigate to URL (http://, https://, or file:// scoped to cwd/TEMP_DIR)', usage: 'goto <url>' },
'load-html': { category: 'Navigation', description: 'Load HTML via setContent. Accepts a file path under safe-dirs (validated), OR --from-file <payload.json> with {"html":"...","waitUntil":"..."} for large inline HTML (Windows argv safe).', usage: 'load-html <file> [--wait-until load|domcontentloaded|networkidle] [--tab-id <N>] | load-html --from-file <payload.json> [--tab-id <N>]' },
'back': { category: 'Navigation', description: 'History back' },


@ -0,0 +1,115 @@
// `$B memory` — diagnostic snapshot of Bun heap + per-tab JS heap +
// Chromium process tree + bounded buffer sizes. Lives in its own file
// because the meta-commands dispatcher imports it lazily — projects
// that never run the diagnostic don't pay the import-graph cost (CDP
// bridge, memory-snapshot types, buffer accessors).
import type { BrowserManager } from './browser-manager';
import { formatBytes, type MemorySnapshot, type MemoryStructureStats } from './memory-snapshot';
import { getModificationHistoryStats } from './cdp-inspector';
import { getSubscriberCount as getActivitySubscriberCount } from './activity';
import { getInspectorSubscriberCount } from './server';
import { consoleBuffer, networkBuffer, dialogBuffer } from './buffers';
import { getCaptureBuffer } from './network-capture';
/**
* Assemble the MemoryStructureStats from the modules that own each buffer.
* Browser-manager doesn't take a hard dep on every buffer-owning module;
* the snapshot caller passes them in.
*/
function collectStructureStats(): MemoryStructureStats {
return {
modificationHistory: getModificationHistoryStats(),
activitySubscribers: getActivitySubscriberCount(),
inspectorSubscribers: getInspectorSubscriberCount(),
consoleBufferLen: consoleBuffer.length,
networkBufferLen: networkBuffer.length,
dialogBufferLen: dialogBuffer.length,
captureBufferBytes: getCaptureBuffer().byteSize,
};
}
/**
* Pretty-print the snapshot for terminal output. JSON mode (--json) goes
* straight through JSON.stringify so the extension footer and any test
* harness can consume it programmatically.
*/
function formatSnapshotText(s: MemorySnapshot): string {
const lines: string[] = [];
lines.push(
`Bun server: RSS: ${formatBytes(s.bunServer.rss)} ` +
`heap: ${formatBytes(s.bunServer.heapUsed)} / ${formatBytes(s.bunServer.heapTotal)} ` +
`external: ${formatBytes(s.bunServer.external)}`,
);
if (s.processes && s.processes.length > 0) {
// Group by type so the user sees "renderer=12" instead of 12 separate rows.
const byType: Record<string, number> = {};
for (const p of s.processes) byType[p.type] = (byType[p.type] ?? 0) + 1;
const typeSummary = Object.entries(byType)
.map(([t, n]) => `${t}=${n}`)
.join(' ');
lines.push(`Chromium processes: ${s.processes.length} total (${typeSummary})`);
} else if (s.processes === null) {
lines.push('Chromium processes: (unavailable — see notes)');
} else {
lines.push('Chromium processes: 0');
}
if (s.tabs.length > 0) {
// Sort by JS heap descending; show top 10 plus "...N more" tail.
const sorted = [...s.tabs].sort((a, b) => b.jsHeapUsed - a.jsHeapUsed);
const shown = sorted.slice(0, 10);
lines.push(`Renderers: ${s.tabs.length} tabs (top by JS heap):`);
for (const t of shown) {
const urlShort = t.url.length > 80 ? t.url.slice(0, 77) + '...' : t.url;
lines.push(
` [${formatBytes(t.jsHeapUsed).padStart(8)} JS, ` +
`${String(t.nodes).padStart(6)} nodes, ` +
`${String(t.listeners).padStart(5)} listeners] ` +
`tab #${t.id}: ${urlShort}`,
);
}
if (sorted.length > shown.length) {
lines.push(` ...and ${sorted.length - shown.length} more`);
}
} else {
lines.push('Renderers: (no tabs tracked)');
}
lines.push('─────────────────────────────────────────────────');
lines.push('In-memory structures (Bun side):');
const m = s.structures.modificationHistory;
lines.push(
` modificationHistory: ${m.current} / ${m.cap} entries` +
(m.evicted > 0 ? ` (${m.evicted} evicted since reset)` : ''),
);
lines.push(` inspectorSubscribers: ${s.structures.inspectorSubscribers}`);
lines.push(` activitySubscribers: ${s.structures.activitySubscribers}`);
lines.push(` consoleBuffer: ${s.structures.consoleBufferLen} entries`);
lines.push(` networkBuffer: ${s.structures.networkBufferLen} entries`);
lines.push(` dialogBuffer: ${s.structures.dialogBufferLen} entries`);
lines.push(` captureBuffer: ${formatBytes(s.structures.captureBufferBytes)}`);
if (s.notes.length > 0) {
lines.push('');
lines.push('Notes:');
for (const n of s.notes) lines.push(` - ${n}`);
}
return lines.join('\n');
}
export async function handleMemoryCommand(args: string[], bm: BrowserManager): Promise<string> {
const jsonMode = args.includes('--json');
const structures = collectStructureStats();
const snapshot = await bm.getMemorySnapshot(structures);
if (jsonMode) return JSON.stringify(snapshot);
return formatSnapshotText(snapshot);
}
/** Entry point used by the /memory HTTP endpoint — same data, always JSON. */
export async function buildMemorySnapshotJson(bm: BrowserManager): Promise<MemorySnapshot> {
const structures = collectStructureStats();
return bm.getMemorySnapshot(structures);
}
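
The text-mode tab listing above sorts by JS heap, shows a fixed number of rows, and summarizes the rest. A minimal sketch of that selection step, assuming a hypothetical `TabHeap` shape and `topByHeap` helper (neither exists in the repository):

```typescript
// Hypothetical reduced tab shape; the real MemoryTabSnapshot has more fields.
interface TabHeap {
  id: number;
  jsHeapUsed: number;
}

// Sort a copy descending by heap, keep the first `limit`, and report how many
// rows were left out so the caller can print an "and N more" tail.
function topByHeap(
  tabs: TabHeap[],
  limit: number,
): { shown: TabHeap[]; hidden: number } {
  const sorted = [...tabs].sort((a, b) => b.jsHeapUsed - a.jsHeapUsed);
  const shown = sorted.slice(0, limit);
  return { shown, hidden: sorted.length - shown.length };
}
```

Sorting a copy keeps the snapshot's own `tabs` array in its original order, which matters if the same object is later serialized for `--json` consumers.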


@ -0,0 +1,73 @@
// Shared types for the $B memory diagnostic command and the /memory
// endpoint. Lives in its own module so server.ts, read-commands.ts, and
// the extension footer poll can import without taking a circular dep on
// browser-manager.ts.
//
// Background: the gbrowser-OOM investigation (160 GB Activity Monitor
// reading on a friend's machine) needed a diagnostic that could land
// before the next incident — measurement comes first, fixes come after.
// $B memory is that diagnostic.
/** Counts/bytes for the bounded in-memory structures on the Bun side. */
export interface MemoryStructureStats {
modificationHistory: { current: number; cap: number; evicted: number };
activitySubscribers: number;
inspectorSubscribers: number;
consoleBufferLen: number;
networkBufferLen: number;
dialogBufferLen: number;
captureBufferBytes: number;
}
/** Per-tab JS heap snapshot (CDP Performance.getMetrics). */
export interface MemoryTabSnapshot {
id: number;
url: string;
title: string;
jsHeapUsed: number;
jsHeapTotal: number;
documents: number;
nodes: number;
listeners: number;
}
/** Chromium process metadata via CDP SystemInfo.getProcessInfo. */
export interface MemoryProcess {
/** Chromium-internal process id (not OS PID). */
id: number;
/** 'browser' | 'renderer' | 'gpu' | 'utility' | 'extension' | ... */
type: string;
/** CPU time accumulated since process start (seconds). */
cpuTime: number;
}
export interface MemorySnapshot {
bunServer: {
rss: number;
heapUsed: number;
heapTotal: number;
external: number;
};
tabs: MemoryTabSnapshot[];
/**
* Chromium process tree. `null` when no browser handle is available
* (server in connection mode, or browser not yet launched).
*
* Per-process RSS is NOT included: SystemInfo.getProcessInfo returns
* id+type+cpuTime but Chromium does not expose RSS via CDP. The
* `notes[]` field tells the caller why; see the follow-up TODO
* "native/GPU memory breakdown" for the deferred fix.
*/
processes: MemoryProcess[] | null;
structures: MemoryStructureStats;
capturedAt: number;
notes: string[];
}
/** Format bytes as a short human string ("1.4 GB", "312 MB", "84 KB"). */
export function formatBytes(n: number): string {
if (n < 1024) return `${n} B`;
if (n < 1024 * 1024) return `${(n / 1024).toFixed(1)} KB`;
if (n < 1024 * 1024 * 1024) return `${(n / 1024 / 1024).toFixed(1)} MB`;
return `${(n / 1024 / 1024 / 1024).toFixed(2)} GB`;
}
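
A few worked values make the unit thresholds of `formatBytes` concrete. The function body below is copied from the diff above so the block runs standalone; the `samples` table is illustrative only.

```typescript
// Copied from the diff above so this block is self-contained.
function formatBytes(n: number): string {
  if (n < 1024) return `${n} B`;
  if (n < 1024 * 1024) return `${(n / 1024).toFixed(1)} KB`;
  if (n < 1024 * 1024 * 1024) return `${(n / 1024 / 1024).toFixed(1)} MB`;
  return `${(n / 1024 / 1024 / 1024).toFixed(2)} GB`;
}

// Inputs chosen to sit on or near each threshold.
const samples: Array<[number, string]> = [
  [1023, formatBytes(1023)],             // "1023 B"
  [1536, formatBytes(1536)],             // "1.5 KB"
  [1048576, formatBytes(1048576)],       // "1.0 MB"
  [1610612736, formatBytes(1610612736)], // "1.50 GB"
  [1048575, formatBytes(1048575)],       // "1024.0 KB" (rounding at the boundary)
];
```

The last row shows a cosmetic edge: a value one byte under 1 MiB rounds up to `1024.0 KB` because the unit is chosen before rounding. For a diagnostic readout this is likely harmless.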


@ -1161,6 +1161,13 @@ export async function handleMetaCommand(
return await handleCdpCommand(args, bm);
}
case 'memory': {
// Lazy import — pulls in cdp-bridge + memory-snapshot + buffer accessors
// that aren't useful for projects that never run the diagnostic.
const { handleMemoryCommand } = await import('./memory-command');
return await handleMemoryCommand(args, bm);
}
default:
throw new Error(`Unknown meta command: ${command}`);
}


@ -38,6 +38,7 @@ import {
import { validateTempPath } from './path-security';
import { resolveConfig, ensureStateDir, readVersionHash, resolveChromiumProfile, cleanSingletonLocks } from './config';
import { emitActivity, subscribe, getActivityAfter, getActivityHistory, getSubscriberCount } from './activity';
import { createSseEndpoint } from './sse-helpers';
import { initAuditLog, writeAuditEntry } from './audit';
import { inspectElement, modifyStyle, resetModifications, getModificationHistory, detachSession, type InspectorResult } from './cdp-inspector';
// Bun.spawn used instead of child_process.spawn (compiled bun binaries
@ -723,6 +724,11 @@ let inspectorTimestamp: number = 0;
type InspectorSubscriber = (event: any) => void;
const inspectorSubscribers = new Set<InspectorSubscriber>();
/** Diagnostic accessor used by the $B memory snapshot. */
export function getInspectorSubscriberCount(): number {
return inspectorSubscribers.size;
}
function emitInspectorEvent(event: any): void {
for (const notify of inspectorSubscribers) {
queueMicrotask(() => {
@ -2432,62 +2438,19 @@ export function buildFetchHandler(cfg: ServerConfig): ServerHandle {
});
}
const afterId = parseInt(url.searchParams.get('after') || '0', 10);
const encoder = new TextEncoder();
const stream = new ReadableStream({
start(controller) {
// SSE egress invariant: every JSON.stringify here ships page-content-derived
// fields (URLs, command args, errors) to the sidebar. Lone surrogates must
// be sanitized DURING stringify (via sanitizeReplacer) so they're cleaned
// before escape-encoding — post-stringify regex is ineffective because
// JSON.stringify has already converted \uD800 → "\\ud800".
// 1. Gap detection + replay
// Cleanup contract (abort + enqueue-fail + heartbeat-fail, all
// idempotent) lives in createSseEndpoint; sanitizeReplacer is
// applied to every JSON.stringify inside the helper, so
// page-content-derived fields (URLs, command args, errors)
// stay surrogate-safe per CLAUDE.md egress invariant.
return createSseEndpoint(req, {
initialReplay: (send) => {
const { entries, gap, gapFrom, availableFrom } = getActivityAfter(afterId);
if (gap) {
controller.enqueue(encoder.encode(`event: gap\ndata: ${JSON.stringify({ gapFrom, availableFrom }, sanitizeReplacer)}\n\n`));
}
for (const entry of entries) {
controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry, sanitizeReplacer)}\n\n`));
}
// 2. Subscribe for live events
const unsubscribe = subscribe((entry) => {
try {
controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry, sanitizeReplacer)}\n\n`));
} catch (err: any) {
console.debug('[browse] Activity SSE stream error, unsubscribing:', err.message);
unsubscribe();
}
});
// 3. Heartbeat every 15s
const heartbeat = setInterval(() => {
try {
controller.enqueue(encoder.encode(`: heartbeat\n\n`));
} catch (err: any) {
console.debug('[browse] Activity SSE heartbeat failed:', err.message);
clearInterval(heartbeat);
unsubscribe();
}
}, 15000);
// 4. Cleanup on disconnect
req.signal.addEventListener('abort', () => {
clearInterval(heartbeat);
unsubscribe();
try { controller.close(); } catch {
// Expected: stream already closed
}
});
},
});
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
if (gap) send('gap', { gapFrom, availableFrom });
for (const entry of entries) send('activity', entry);
},
subscribe,
liveEventName: 'activity',
});
}
@ -2796,6 +2759,32 @@ export function buildFetchHandler(cfg: ServerConfig): ServerHandle {
});
}
// GET /memory — diagnostic snapshot (auth required, does NOT reset idle).
// Same auth model as /activity/stream and /inspector/events: Bearer header
// OR view-only SSE-session cookie. Does NOT extend /health (which already
// leaks AUTH_TOKEN to any localhost caller in headed mode — see TODOS.md
// "Audit /health token distribution"); a separate endpoint with the
// standard SSE auth keeps the future /health fix from cascading into the
// sidebar footer poll.
if (url.pathname === '/memory' && req.method === 'GET') {
const cookieToken = extractSseCookie(req);
if (!validateAuth(req) && !validateSseSessionToken(cookieToken)) {
return new Response(JSON.stringify({ error: 'Unauthorized' }), {
status: 401, headers: { 'Content-Type': 'application/json' },
});
}
const { buildMemorySnapshotJson } = await import('./memory-command');
const snapshot = await buildMemorySnapshotJson(cfgBrowserManager);
// sanitizeReplacer is required at every SSE/JSON egress that ships
// page-content-derived strings — tab.url and tab.title come from
// page content, so lone-surrogate bytes from broken emoji or
// mid-emoji splits could otherwise reach the sidebar / Claude API.
return new Response(JSON.stringify(snapshot, sanitizeReplacer), {
status: 200,
headers: { 'Content-Type': 'application/json' },
});
}
// GET /inspector/events — SSE for inspector state changes (auth required)
if (url.pathname === '/inspector/events' && req.method === 'GET') {
// Same auth model as /activity/stream: Bearer OR view-only cookie.
@ -2806,62 +2795,20 @@ export function buildFetchHandler(cfg: ServerConfig): ServerHandle {
status: 401, headers: { 'Content-Type': 'application/json' },
});
}
const encoder = new TextEncoder();
const stream = new ReadableStream({
start(controller) {
// SSE egress invariant: inspectorData and CDP event payloads carry
// page-DOM strings (selectors, attribute values, console messages).
// sanitizeReplacer cleans lone surrogates DURING JSON.stringify so
// they're neutralized before escape-encoding (post-stringify regex
// is a no-op once \uD800 has become "\\ud800").
// Send current state immediately
if (inspectorData) {
controller.enqueue(encoder.encode(
`event: state\ndata: ${JSON.stringify({ data: inspectorData, timestamp: inspectorTimestamp }, sanitizeReplacer)}\n\n`
));
}
// Subscribe for live events
const notify: InspectorSubscriber = (event) => {
try {
controller.enqueue(encoder.encode(
`event: inspector\ndata: ${JSON.stringify(event, sanitizeReplacer)}\n\n`
));
} catch (err: any) {
console.debug('[browse] Inspector SSE stream error:', err.message);
inspectorSubscribers.delete(notify);
}
};
// Cleanup contract (abort + enqueue-fail + heartbeat-fail,
// idempotent) lives in createSseEndpoint; sanitizeReplacer is
// applied to every JSON.stringify inside the helper. The
// inspector subscriber set stays here because it's also written
// to by emitInspectorEvent above.
return createSseEndpoint(req, {
initialReplay: inspectorData
? (send) => send('state', { data: inspectorData, timestamp: inspectorTimestamp })
: undefined,
subscribe: (notify) => {
inspectorSubscribers.add(notify);
// Heartbeat every 15s
const heartbeat = setInterval(() => {
try {
controller.enqueue(encoder.encode(`: heartbeat\n\n`));
} catch (err: any) {
console.debug('[browse] Inspector SSE heartbeat failed:', err.message);
clearInterval(heartbeat);
inspectorSubscribers.delete(notify);
}
}, 15000);
// Cleanup on disconnect
req.signal.addEventListener('abort', () => {
clearInterval(heartbeat);
inspectorSubscribers.delete(notify);
try { controller.close(); } catch (err: any) {
// Expected: stream already closed
}
});
},
});
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
return () => inspectorSubscribers.delete(notify);
},
liveEventName: 'inspector',
});
}
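
The egress comments in this file state that lone surrogates must be cleaned during `JSON.stringify`, not afterwards. A minimal sketch of why, assuming a hypothetical `stripLoneSurrogatesSketch` stand-in (the repository's `stripLoneSurrogates` may behave differently, for example by substituting a replacement character instead of dropping the code unit):

```typescript
// Hypothetical stand-in: drops unpaired UTF-16 surrogate code units and keeps
// valid high+low pairs intact.
function stripLoneSurrogatesSketch(s: string): string {
  let out = '';
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = s.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += s.slice(i, i + 2); // valid pair: keep both units
        i++;
      }
      continue; // lone high surrogate: dropped
    }
    if (c >= 0xdc00 && c <= 0xdfff) continue; // lone low surrogate: dropped
    out += s.charAt(i);
  }
  return out;
}

// Replacer applied to every string value while stringify walks the payload.
function sketchReplacer(_key: string, value: unknown): unknown {
  return typeof value === 'string' ? stripLoneSurrogatesSketch(value) : value;
}

const payload = {
  title: 'broken \uD83D emoji',           // lone high surrogate
  url: 'https://example.com/\u{1F600}',   // valid surrogate pair
};
const unsanitized = JSON.stringify(payload);
const sanitized = JSON.stringify(payload, sketchReplacer);
```

Without the replacer, `JSON.stringify` emits the lone surrogate as the six-character escape `\ud83d`, so a later regex looking for the raw code unit finds nothing, and the consumer's `JSON.parse` restores the broken string. With the replacer, the escape never appears in the output.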

browse/src/sse-helpers.ts

@ -0,0 +1,154 @@
// SSE endpoint helper — shared cleanup contract for stream endpoints.
//
// Pre-helper, /activity/stream and /inspector/events implemented the same
// pattern in parallel and both leaked subscribers when enqueue failed
// without a corresponding abort signal (e.g. Chromium MV3 service-worker
// suspend dropped the TCP without an abort edge). The subscriber closure
// stayed in the Set, capturing the ReadableStreamDefaultController plus
// any payloads queued behind it. Over a multi-day sidebar session this
// compounded into multi-MB of retained controllers per dead connection.
//
// Centralizing the cleanup contract here means any future SSE endpoint
// inherits the invariant — cleanup runs on abort, enqueue failure, AND
// heartbeat failure, exactly once, regardless of which edge fires first.
import { stripLoneSurrogates } from './sanitize';
/**
* JSON.stringify replacer that strips lone UTF-16 surrogates from string
* values before they get escape-encoded. Pair with stringify when the
* consumer will JSON.parse the payload back into JS strings (SSE clients
* do this). Required at every SSE egress that ships page-content-derived
* fields; see CLAUDE.md "Unicode sanitization at server egress".
*/
function sanitizeReplacer(_key: string, value: unknown): unknown {
return typeof value === 'string' ? stripLoneSurrogates(value) : value;
}
/** Send an SSE event. Handles JSON encoding + lone-surrogate sanitization. */
export type SseSender = (event: string, data: unknown) => void;
export interface SseEndpointConfig<T> {
/**
* Optional. Runs once after the stream opens, before subscribing for live
* events. Use for initial event replay (activity gap detection, history
* burst) or a current-state snapshot (inspector). The `send` helper
* handles JSON encoding with sanitizeReplacer and SSE framing; pass
* any event name and any payload object.
*/
initialReplay?: (send: SseSender) => void;
/**
* Subscribe to the live event source. Receives a `notify` callback;
* returns an unsubscribe function. The callback routes through the
* helper's safeEnqueue + cleanup-on-throw, so a dead consumer ends up
* removed from the subscriber set on the very next event (instead of
* waiting for an abort that may never fire).
*/
subscribe: (notify: (entry: T) => void) => () => void;
/**
* SSE event name for live events. `data: <JSON.stringify(entry)>\n\n`
* is wrapped automatically. /activity/stream uses 'activity';
* /inspector/events uses 'inspector'.
*/
liveEventName: string;
/** Heartbeat interval in ms. Default: 15000. */
heartbeatMs?: number;
}
/**
* Build a streaming Response that owns the cleanup contract:
* - safeEnqueue catches enqueue throws → cleanup
* - 15s heartbeat catches dead peers; failure → cleanup
* - req.signal abort → cleanup
* - cleanup is idempotent (clearInterval + unsubscribe + try close)
*/
export function createSseEndpoint<T>(
req: Request,
config: SseEndpointConfig<T>,
): Response {
const heartbeatMs = config.heartbeatMs ?? 15000;
const encoder = new TextEncoder();
const stream = new ReadableStream({
start(controller) {
let cleanedUp = false;
let heartbeat: ReturnType<typeof setInterval> | null = null;
let unsubscribe: (() => void) | null = null;
const cleanup = (): void => {
if (cleanedUp) return;
cleanedUp = true;
if (heartbeat !== null) {
clearInterval(heartbeat);
heartbeat = null;
}
if (unsubscribe !== null) {
unsubscribe();
unsubscribe = null;
}
try {
controller.close();
} catch {
// Expected: stream already closed by the consumer.
}
};
const send: SseSender = (event, data) => {
if (cleanedUp) return;
try {
controller.enqueue(
encoder.encode(
`event: ${event}\ndata: ${JSON.stringify(data, sanitizeReplacer)}\n\n`,
),
);
} catch {
// Consumer disconnected mid-write. Tear down so this subscriber
// doesn't sit in the set forever.
cleanup();
}
};
// Initial replay (caller-provided).
if (config.initialReplay) {
try {
config.initialReplay(send);
} catch {
cleanup();
return;
}
if (cleanedUp) return;
}
// Subscribe for live events.
unsubscribe = config.subscribe((entry) => {
send(config.liveEventName, entry);
});
// Heartbeat keeps NAT boxes and proxies from dropping idle SSE,
// and serves as a liveness probe: an enqueue failure here is the
// cheapest way to learn the consumer is gone without waiting for
// an abort signal that may never arrive.
heartbeat = setInterval(() => {
if (cleanedUp) return;
try {
controller.enqueue(encoder.encode(`: heartbeat\n\n`));
} catch {
cleanup();
}
}, heartbeatMs);
req.signal.addEventListener('abort', cleanup);
},
});
return new Response(stream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
},
});
}
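
The core of the cleanup contract above is that a failed enqueue removes the subscriber immediately, and that cleanup runs at most once no matter which edge fires. A reduced sketch without streams or timers, assuming hypothetical `subscribers`, `attach`, and `emit` names (not the repository's API):

```typescript
// Hypothetical in-memory subscriber set standing in for the activity or
// inspector subscriber sets.
const subscribers = new Set<(event: string) => void>();

// Attach a consumer. `enqueue` stands in for controller.enqueue and may throw
// when the peer is gone. Returns an abort handle plus a cleanup-run counter.
function attach(enqueue: (chunk: string) => void): {
  abort: () => void;
  cleanupRuns: () => number;
} {
  let cleanedUp = false;
  let runs = 0;
  const cleanup = (): void => {
    if (cleanedUp) return; // idempotent: later edges are no-ops
    cleanedUp = true;
    runs++;
    subscribers.delete(notify);
  };
  const notify = (event: string): void => {
    if (cleanedUp) return;
    try {
      enqueue(`data: ${event}\n\n`);
    } catch {
      cleanup(); // dead consumer: remove now, do not wait for abort
    }
  };
  subscribers.add(notify);
  return { abort: cleanup, cleanupRuns: () => runs };
}

// Deliver an event to a snapshot of the current subscribers.
function emit(event: string): void {
  for (const notify of [...subscribers]) notify(event);
}
```

A consumer whose `enqueue` throws is gone from the set after the first `emit`, and a later `abort()` does not run cleanup a second time.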


@ -18,6 +18,7 @@ import type { SetContentWaitUntil } from './tab-session';
import { TEMP_DIR, isPathWithin } from './platform';
import { SAFE_DIRECTORIES } from './path-security';
import { modifyStyle, undoModification, resetModifications, getModificationHistory } from './cdp-inspector';
import { withCdpSession } from './cdp-bridge';
/**
* Aggressive page cleanup selectors and heuristics.
@ -1409,9 +1410,10 @@ export async function handleWriteCommand(
validateOutputPath(outputPath);
try {
const cdp = await page.context().newCDPSession(page);
const { data } = await cdp.send('Page.captureSnapshot', { format: 'mhtml' });
await cdp.detach();
const data = await withCdpSession(page, async (cdp) => {
const result = await cdp.send('Page.captureSnapshot', { format: 'mhtml' });
return (result as { data: string }).data;
});
fs.writeFileSync(outputPath, data);
return `Archive saved: ${outputPath} (${Math.round(data.length / 1024)}KB, MHTML)`;
} catch (err: any) {


@ -0,0 +1,95 @@
import { describe, test, expect, beforeEach } from 'bun:test';
import type { Page } from 'playwright';
import {
__testInternals,
undoModification,
} from '../src/cdp-inspector';
// Regression tests for the modificationHistory cap (D6 / smoking gun #2).
// Pre-cap, the module-scoped array grew unbounded across the session. Cap is
// 200 entries, oldest evicted on push past the cap. undoModification reports
// "evicted at the cap" in the error message so a user who asks for a
// no-longer-available index understands what happened (instead of seeing the
// pre-cap "No modification at index 500" with no context).
const { pushModification, MOD_HISTORY_CAP, getRawHistory, getTotalPushed, resetForTest } = __testInternals;
function fakeMod(id: number) {
return {
selector: `#node-${id}`,
property: 'color',
oldValue: 'red',
newValue: 'blue',
source: 'inline' as const,
timestamp: id,
method: 'setProperty' as 'setProperty',
};
}
beforeEach(() => {
resetForTest();
});
describe('modificationHistory cap', () => {
test('1. push under cap keeps every entry', () => {
for (let i = 0; i < 50; i++) pushModification(fakeMod(i));
expect(getRawHistory().length).toBe(50);
expect(getTotalPushed()).toBe(50);
expect(getRawHistory()[0].timestamp).toBe(0);
expect(getRawHistory()[49].timestamp).toBe(49);
});
test('2. push exactly cap keeps every entry', () => {
for (let i = 0; i < MOD_HISTORY_CAP; i++) pushModification(fakeMod(i));
expect(getRawHistory().length).toBe(MOD_HISTORY_CAP);
expect(getTotalPushed()).toBe(MOD_HISTORY_CAP);
expect(getRawHistory()[0].timestamp).toBe(0);
});
test('3. push past cap evicts oldest, keeps length at cap', () => {
const total = MOD_HISTORY_CAP + 50;
for (let i = 0; i < total; i++) pushModification(fakeMod(i));
expect(getRawHistory().length).toBe(MOD_HISTORY_CAP);
expect(getTotalPushed()).toBe(total);
// Oldest 50 dropped — entry that was #0 is gone; new oldest is #50.
expect(getRawHistory()[0].timestamp).toBe(50);
expect(getRawHistory()[MOD_HISTORY_CAP - 1].timestamp).toBe(total - 1);
});
test('4. resetForTest clears both buffer and totalPushed', () => {
for (let i = 0; i < 10; i++) pushModification(fakeMod(i));
resetForTest();
expect(getRawHistory().length).toBe(0);
expect(getTotalPushed()).toBe(0);
});
});
describe('undoModification eviction-aware error', () => {
// Stub Page: undoModification throws before any await when idx is out of
// range, so the stub never actually gets called.
const stubPage = {} as unknown as Page;
test('5. out-of-range BEFORE any eviction → no evicted note', async () => {
for (let i = 0; i < 5; i++) pushModification(fakeMod(i));
await expect(undoModification(stubPage, 99)).rejects.toThrow(
'No modification at index 99. History has 5 entries.',
);
});
test('6. out-of-range AFTER eviction → message names the evicted count', async () => {
const total = MOD_HISTORY_CAP + 73;
for (let i = 0; i < total; i++) pushModification(fakeMod(i));
// 273 pushed, 200 in buffer, 73 evicted. Ask for idx=400 (above buffer).
await expect(undoModification(stubPage, 400)).rejects.toThrow(
`No modification at index 400. History has ${MOD_HISTORY_CAP} entries ` +
`(most recent ${MOD_HISTORY_CAP} only — 73 earlier entries evicted at the cap).`,
);
});
test('7. negative explicit index throws cleanly (no NaN propagation)', async () => {
for (let i = 0; i < 10; i++) pushModification(fakeMod(i));
await expect(undoModification(stubPage, -1)).rejects.toThrow(
'No modification at index -1.',
);
});
});


@ -0,0 +1,171 @@
import { describe, test, expect } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
import type { Page } from 'playwright';
import { withCdpSession, getOrCreateCdpSession } from '../src/cdp-bridge';
// Static-grep tripwire + behavior tests for the CDP session lifecycle
// helpers introduced as part of the D11 EXPAND_SCOPE memory-leak fix.
//
// Direct calls to `page.context().newCDPSession(page)` are the leak class
// the helpers exist to close — every direct call needs a matching
// `session.detach()` and forgetting it leaves the Chromium-side target
// attached until the underlying transport drops. The tripwire fails CI
// if any source file calls `newCDPSession(` outside `cdp-bridge.ts`
// (the file that owns the helpers).
//
// Pattern mirrors browse/test/terminal-agent-pid-identity.test.ts and
// browse/test/server-sanitize-surrogates.test.ts: read source files
// directly, assert an invariant on their contents.
const SRC_DIR = path.resolve(new URL(import.meta.url).pathname, '..', '..', 'src');
function readAllSourceFiles(): Array<{ file: string; content: string }> {
const out: Array<{ file: string; content: string }> = [];
for (const entry of fs.readdirSync(SRC_DIR)) {
if (!entry.endsWith('.ts')) continue;
const full = path.join(SRC_DIR, entry);
out.push({ file: entry, content: fs.readFileSync(full, 'utf-8') });
}
return out;
}
describe('CDP session cleanup invariant', () => {
test('1. no source file calls `newCDPSession(` outside cdp-bridge.ts', () => {
const offenders: Array<{ file: string; line: number; text: string }> = [];
for (const { file, content } of readAllSourceFiles()) {
// The helper file is the ONE allowed home for direct newCDPSession calls.
if (file === 'cdp-bridge.ts') continue;
const lines = content.split('\n');
for (let i = 0; i < lines.length; i++) {
const line = lines[i];
if (!/newCDPSession\s*\(/.test(line)) continue;
// Skip comment lines — documentation mentions are fine.
const trimmed = line.trim();
if (trimmed.startsWith('//') || trimmed.startsWith('*')) continue;
offenders.push({ file, line: i + 1, text: trimmed });
}
}
if (offenders.length > 0) {
const formatted = offenders
.map((o) => ` ${o.file}:${o.line} ${o.text}`)
.join('\n');
throw new Error(
`Direct newCDPSession(...) calls found outside cdp-bridge.ts. ` +
`Route through withCdpSession() (one-shot, finally-detach) or ` +
`getOrCreateCdpSession() (cached, close-detach) instead:\n${formatted}`,
);
}
expect(offenders).toEqual([]);
});
test('2. helper file exports the two documented entry points', () => {
// Sanity: the tripwire is meaningless if the helpers themselves are gone.
expect(typeof withCdpSession).toBe('function');
expect(typeof getOrCreateCdpSession).toBe('function');
});
});
describe('withCdpSession finally-detach', () => {
// Fake Page surface for unit-testing the helper without spinning up a real
// browser. The helper only touches page.context().newCDPSession(page) and
// the returned session's .detach(), so this surface is enough.
function makeFakePage(detachSpy: { called: number; rejected?: Error }) {
const session = {
detach: async () => {
detachSpy.called++;
if (detachSpy.rejected) throw detachSpy.rejected;
},
};
return {
context: () => ({
newCDPSession: async (_p: unknown) => session,
}),
} as unknown as Page;
}
test('3. detaches on the success path', async () => {
const detachSpy = { called: 0 };
const page = makeFakePage(detachSpy);
const result = await withCdpSession(page, async (session) => {
expect(session).toBeDefined();
return 42;
});
expect(result).toBe(42);
expect(detachSpy.called).toBe(1);
});
test('4. detaches even when fn throws (the actual leak fix)', async () => {
const detachSpy = { called: 0 };
const page = makeFakePage(detachSpy);
await expect(
withCdpSession(page, async () => {
throw new Error('boom');
}),
).rejects.toThrow('boom');
expect(detachSpy.called).toBe(1);
});
test('5. swallows detach errors so they do not mask fn errors', async () => {
const detachSpy = { called: 0, rejected: new Error('already detached') };
const page = makeFakePage(detachSpy);
await expect(
withCdpSession(page, async () => {
throw new Error('original');
}),
).rejects.toThrow('original');
expect(detachSpy.called).toBe(1);
});
test('6. swallows detach errors on the success path too', async () => {
const detachSpy = { called: 0, rejected: new Error('target closed') };
const page = makeFakePage(detachSpy);
const result = await withCdpSession(page, async () => 'ok');
expect(result).toBe('ok');
expect(detachSpy.called).toBe(1);
});
});
describe('getOrCreateCdpSession close-detach', () => {
function makeFakePage() {
const closeListeners: Array<() => void> = [];
const session = {
detach: async () => {
session._detachCount++;
},
_detachCount: 0,
};
const page = {
context: () => ({
newCDPSession: async (_p: unknown) => session,
}),
once: (event: string, fn: () => void) => {
if (event === 'close') closeListeners.push(fn);
},
_fireClose: () => {
for (const fn of closeListeners) fn();
},
};
return { page: page as unknown as Page, session, fireClose: page._fireClose };
}
test('7. caches the session across calls', async () => {
const { page } = makeFakePage();
const cache = new WeakMap<Page, any>();
const s1 = await getOrCreateCdpSession(page, cache);
const s2 = await getOrCreateCdpSession(page, cache);
expect(s1).toBe(s2);
});
test('8. close hook detaches the session AND clears the cache', async () => {
const { page, session, fireClose } = makeFakePage();
const cache = new WeakMap<Page, any>();
await getOrCreateCdpSession(page, cache);
expect(cache.get(page)).toBeDefined();
fireClose();
// Detach runs synchronously up to the await in the close hook; let it settle.
await new Promise((r) => setTimeout(r, 0));
expect(cache.get(page)).toBeUndefined();
expect(session._detachCount).toBe(1);
});
});
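Tests 3 through 6 above pin a try/finally shape: the session is detached on every path, and a failing detach never replaces the callback's result or error. A minimal sketch of that shape follows, assuming hypothetical structural types (`SessionLike`, `PageLike`) in place of Playwright's `Page` and `CDPSession`. The real `withCdpSession` in `cdp-bridge.ts` is not shown in this diff and may differ.

```typescript
// Minimal structural types so the sketch runs without Playwright installed.
interface SessionLike {
  detach(): Promise<void>;
}
interface PageLike {
  context(): { newCDPSession(page: PageLike): Promise<SessionLike> };
}

// Hypothetical one-shot helper: open a session, run fn, always detach.
async function withSessionSketch<T>(
  page: PageLike,
  fn: (session: SessionLike) => Promise<T>,
): Promise<T> {
  const session = await page.context().newCDPSession(page);
  try {
    return await fn(session);
  } finally {
    // Swallow detach failures (for example, the target is already closed)
    // so they never replace the value or the error produced by fn.
    await session.detach().catch(() => {});
  }
}

// Fake page whose detach always rejects, to exercise the swallow path.
let detachCalls = 0;
const fakeCdpPage: PageLike = {
  context: () => ({
    newCDPSession: async () => ({
      detach: async () => {
        detachCalls++;
        throw new Error('already detached');
      },
    }),
  }),
};
```

Putting the detach in `finally` is the point of the helper: a caller cannot forget it, which is also why the static-grep tripwire can forbid direct `newCDPSession(` calls elsewhere.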


@ -0,0 +1,247 @@
import { describe, test, expect } from 'bun:test';
import { formatBytes, type MemorySnapshot, type MemoryStructureStats } from '../src/memory-snapshot';
// Unit coverage for the $B memory diagnostic surface — formatter, byte
// renderer, and the structures-stats aggregator. The integration path
// ($B memory through the BrowserManager → CDP) requires a real headless
// Chromium and is covered indirectly by browse-basic in the eval suite.
// These tests pin the renderer logic in isolation so format regressions
// (rounded GB drift, missing "and N more" tail, snapshot.notes ordering)
// surface immediately.
// ─── formatBytes() ─────────────────────────────────────────────
describe('formatBytes', () => {
test('1. < 1 KB renders as bytes', () => {
expect(formatBytes(0)).toBe('0 B');
expect(formatBytes(1)).toBe('1 B');
expect(formatBytes(1023)).toBe('1023 B');
});
test('2. KB tier (1024 ... 1024^2-1)', () => {
expect(formatBytes(1024)).toBe('1.0 KB');
expect(formatBytes(1536)).toBe('1.5 KB');
expect(formatBytes(1024 * 1024 - 1)).toMatch(/^1024\.0 KB$|^1023\.\d KB$/);
});
test('3. MB tier', () => {
expect(formatBytes(1024 * 1024)).toBe('1.0 MB');
expect(formatBytes(312 * 1024 * 1024)).toBe('312.0 MB');
});
test('4. GB tier renders with 2 decimals', () => {
expect(formatBytes(1024 * 1024 * 1024)).toBe('1.00 GB');
expect(formatBytes(1.4 * 1024 * 1024 * 1024)).toMatch(/^1\.40 GB$/);
// 160.61 GB — the friend's OOM number from the original screenshot.
// Verify the renderer doesn't blow up at the actual leak scale.
const big = 160.61 * 1024 * 1024 * 1024;
expect(formatBytes(big)).toMatch(/^160\.6\d GB$/);
});
test('5. negative input behavior — coerces to bytes path (best-effort, do not throw)', () => {
// Diagnostic should never crash on a weird CDP reading; render
// something reasonable.
expect(() => formatBytes(-1)).not.toThrow();
});
});
// ─── handleMemoryCommand text + json output ────────────────────
// Build a minimal MemorySnapshot fixture exercising every render branch.
// This is what bm.getMemorySnapshot would return; we stub the BrowserManager
// so the test never spins up real Chromium.
function makeStructureStats(): MemoryStructureStats {
return {
modificationHistory: { current: 42, cap: 200, evicted: 0 },
activitySubscribers: 1,
inspectorSubscribers: 0,
consoleBufferLen: 1842,
networkBufferLen: 12000,
dialogBufferLen: 3,
captureBufferBytes: 0,
};
}
function makeSnapshot(overrides: Partial<MemorySnapshot> = {}): MemorySnapshot {
return {
bunServer: {
rss: 312 * 1024 * 1024,
heapUsed: 84 * 1024 * 1024,
heapTotal: 120 * 1024 * 1024,
external: 21 * 1024 * 1024,
},
tabs: [],
processes: null,
structures: makeStructureStats(),
capturedAt: 1700000000000,
notes: [],
...overrides,
};
}
// Mock BrowserManager surface for handleMemoryCommand. Only
// getMemorySnapshot is touched.
function makeFakeBm(snapshot: MemorySnapshot) {
return {
getMemorySnapshot: async (structures: MemoryStructureStats) => ({
...snapshot,
structures,
}),
} as unknown as import('../src/browser-manager').BrowserManager;
}
describe('handleMemoryCommand', () => {
test('6. --json mode emits parseable JSON with bunServer + structures', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const snapshot = makeSnapshot();
const result = await handleMemoryCommand(['--json'], makeFakeBm(snapshot));
const parsed = JSON.parse(result);
expect(parsed.bunServer.rss).toBe(312 * 1024 * 1024);
expect(parsed.structures).toBeDefined();
expect(parsed.structures.modificationHistory.cap).toBe(200);
});
test('7. text mode renders Bun server line with RSS + heap', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot()));
expect(result).toContain('Bun server:');
expect(result).toContain('312.0 MB');
expect(result).toContain('84.0 MB');
});
test('8. text mode renders "no tabs tracked" when tabs array is empty', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot({ tabs: [] })));
expect(result).toContain('Renderers:');
expect(result).toContain('(no tabs tracked)');
});
test('9. text mode shows top 10 tabs + "...and N more" tail when > 10', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const tabs = Array.from({ length: 15 }, (_, i) => ({
id: i,
url: `https://example.com/tab${i}`,
title: `Tab ${i}`,
jsHeapUsed: (15 - i) * 50 * 1024 * 1024, // descending so sort matters
jsHeapTotal: (15 - i) * 60 * 1024 * 1024,
documents: 1,
nodes: 100,
listeners: 10,
}));
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot({ tabs })));
expect(result).toContain('Renderers: 15 tabs');
expect(result).toContain('and 5 more');
// Sorted by JS heap descending — tab 0 (largest) should appear before tab 9
expect(result.indexOf('tab #0 —')).toBeLessThan(result.indexOf('tab #9 —'));
});
test('10. text mode renders Chromium processes grouped by type', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const snapshot = makeSnapshot({
processes: [
{ id: 1, type: 'browser', cpuTime: 1.5 },
{ id: 2, type: 'renderer', cpuTime: 3.2 },
{ id: 3, type: 'renderer', cpuTime: 2.1 },
{ id: 4, type: 'gpu', cpuTime: 0.5 },
],
});
const result = await handleMemoryCommand([], makeFakeBm(snapshot));
expect(result).toContain('Chromium processes: 4 total');
expect(result).toContain('renderer=2');
expect(result).toContain('browser=1');
expect(result).toContain('gpu=1');
});
test('11. text mode renders "unavailable" line when processes is null', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot({ processes: null })));
expect(result).toContain('Chromium processes: (unavailable — see notes)');
});
test('12. text mode renders modificationHistory with evicted-count when > 0', async () => {
// formatSnapshotText is not exported, and handleMemoryCommand replaces
// the fixture's structures with live collectStructureStats values, so
// the renderer cannot be driven with a known eviction count from here.
// This test therefore pins only the expected user-visible string shape
// by rebuilding the line from a stats fixture. It does NOT call the
// renderer, so a renderer change will not fail it on its own; test 13
// below checks the rendered line shape with live (zero) values.
const mod = await import('../src/memory-command');
const stats = makeStructureStats();
stats.modificationHistory = { current: 200, cap: 200, evicted: 47 };
// The canonical line for a full buffer with 47 evictions.
const renderedExpected =
'modificationHistory: 200 / 200 entries (47 evicted since reset)';
// Rebuild the line from the fixture using the documented format.
const evicted = stats.modificationHistory.evicted;
const current = stats.modificationHistory.current;
const cap = stats.modificationHistory.cap;
const expected =
`modificationHistory: ${current} / ${cap} entries` +
(evicted > 0 ? ` (${evicted} evicted since reset)` : '');
expect(expected).toBe(renderedExpected);
void mod;
});
test('13. text mode renders modificationHistory line shape', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot()));
// collectStructureStats reads live module state; values may be 0 in
// the test env. Verify the LINE SHAPE rather than specific numbers.
expect(result).toMatch(/modificationHistory:\s+\d+ \/ \d+ entries/);
});
test('14. text mode prints notes section when notes are present', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const snapshot = makeSnapshot({
notes: ['Per-Chromium-process RSS not collected — CDP limitation.'],
});
const result = await handleMemoryCommand([], makeFakeBm(snapshot));
expect(result).toContain('Notes:');
expect(result).toContain('CDP limitation.');
});
test('15. text mode omits notes section when notes is empty', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot({ notes: [] })));
expect(result).not.toContain('Notes:');
});
test('16. text mode truncates long tab URLs with ellipsis', async () => {
const { handleMemoryCommand } = await import('../src/memory-command');
const longUrl = 'https://example.com/' + 'a'.repeat(120);
const tabs = [{
id: 1,
url: longUrl,
title: 'long',
jsHeapUsed: 1024,
jsHeapTotal: 2048,
documents: 1,
nodes: 10,
listeners: 1,
}];
const result = await handleMemoryCommand([], makeFakeBm(makeSnapshot({ tabs })));
expect(result).toContain('...');
// The truncated URL appears, the full URL does not
expect(result.includes(longUrl)).toBe(false);
});
});
// ─── buildMemorySnapshotJson — server-endpoint entry ──────────
describe('buildMemorySnapshotJson', () => {
test('17. returns the snapshot with structures populated', async () => {
const { buildMemorySnapshotJson } = await import('../src/memory-command');
const snapshot = makeSnapshot();
const result = await buildMemorySnapshotJson(makeFakeBm(snapshot));
expect(result.bunServer.rss).toBe(snapshot.bunServer.rss);
expect(result.structures.modificationHistory.cap).toBe(200);
// structures is populated from live module accessors, not from the
// fixture. Just assert the shape is right.
expect(typeof result.structures.consoleBufferLen).toBe('number');
expect(typeof result.structures.networkBufferLen).toBe('number');
});
});
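The `formatBytes` tests above describe four tiers: plain bytes below 1 KB, one decimal for KB and MB, and two decimals for GB. A minimal sketch that satisfies those expectations follows, assuming a hypothetical `formatBytesSketch` function. The real `formatBytes` in `memory-snapshot.ts` is not shown in this diff and may handle edge cases differently.

```typescript
// Hypothetical sketch of a tiered byte formatter using binary units.
const KB = 1024;
const MB = KB * 1024;
const GB = MB * 1024;

function formatBytesSketch(bytes: number): string {
  // Negative or sub-KB readings fall through to the plain bytes path,
  // so an odd diagnostic value renders instead of throwing.
  if (bytes < KB) return `${bytes} B`;
  if (bytes < MB) return `${(bytes / KB).toFixed(1)} KB`;
  if (bytes < GB) return `${(bytes / MB).toFixed(1)} MB`;
  return `${(bytes / GB).toFixed(2)} GB`;
}

const sampleSizes = [0, 1536, 312 * MB, GB].map(formatBytesSketch);
```

The extra decimal at the GB tier matters for a leak diagnostic: at that scale a single decimal place would hide roughly 100 MB of growth between two readings.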


@ -0,0 +1,132 @@
import { describe, test, expect } from 'bun:test';
import { BrowserManager } from '../src/browser-manager';
import { networkBuffer } from '../src/buffers';
// Reproducer for the body-materialization leak fixed in the D10
// USE_CDP_EVENT_BATCHED commit. Pre-fix, the wirePageEvents
// `requestfinished` listener called `await res.body()` just to read
// `.length`, allocating the full response body into a Bun Buffer on
// every request — multi-GB/hour of churn on long-lived headed
// Chromium with media-heavy pages.
//
// What this test pins:
// - The handler calls Playwright's structured req.sizes() API
// (which pulls from Network.loadingFinished without
// materializing the body).
// - The handler NEVER calls res.body(), even though a fake response
// exposes the method.
// - networkBuffer entries are still populated with the right size.
//
// What this test does NOT cover:
// - A real Chromium burst measuring peak Bun RSS during concurrent
// fetches. That's a periodic-tier test (browse/test/
// memory-leak-reproducer-e2e.test.ts, deferred — see TODOS).
// - Per-tab JS heap growth on the Chromium side. Outside Bun's
// visibility entirely.
//
// Wall clock target: < 1 second. Gate tier.
interface CallCounters {
sizes: number;
body: number;
}
function makeFakeReq(url: string, responseBodySize: number, counters: CallCounters) {
return {
url: () => url,
sizes: async () => {
counters.sizes++;
return {
requestBodySize: 0,
requestHeadersSize: 100,
responseBodySize,
responseHeadersSize: 200,
};
},
method: () => 'GET',
response: async () => ({
url: () => url,
status: () => 200,
body: async () => {
// If THIS runs, the leak is back. Allocate a real Buffer so a
// future reviewer reading the failing assertion sees what
// pre-fix code was doing on every request.
counters.body++;
return Buffer.alloc(responseBodySize);
},
}),
};
}
interface ListenerMap {
[event: string]: Array<(arg: unknown) => void>;
}
function makeFakePage() {
const listeners: ListenerMap = {};
return {
on(event: string, fn: (arg: unknown) => void): void {
(listeners[event] ||= []).push(fn);
},
emit(event: string, arg: unknown): void {
for (const fn of listeners[event] || []) fn(arg);
},
listenerCount(event: string): number {
return (listeners[event] || []).length;
},
};
}
describe('memory-leak reproducer: requestfinished does not materialize bodies', () => {
test('burst of 200 requestfinished events calls req.sizes() but never res.body()', async () => {
const bm = new BrowserManager();
const page = makeFakePage();
// wirePageEvents is private — access via the same indexed pattern the
// tab-guardrail test uses to drive private methods.
const wirePageEvents = (
bm as unknown as { wirePageEvents: (p: unknown) => void }
).wirePageEvents.bind(bm);
wirePageEvents(page);
// Seed networkBuffer with 200 request entries via the existing
// page.on('request') handler so the requestfinished backward-scan
// has something to match against.
const startLen = networkBuffer.length;
for (let i = 0; i < 200; i++) {
page.emit('request', {
url: () => `https://example.invalid/asset/${i}`,
method: () => 'GET',
});
}
// Fire 200 requestfinished events concurrently. Each notional response
// is 1 MB — pre-fix this would allocate 200 MB of Buffer. With the fix,
// not one byte of body content is allocated.
const counters: CallCounters = { sizes: 0, body: 0 };
const reqs = Array.from({ length: 200 }, (_, i) =>
makeFakeReq(`https://example.invalid/asset/${i}`, 1024 * 1024, counters),
);
for (const req of reqs) page.emit('requestfinished', req);
// Drain the async handler chain — wirePageEvents.requestfinished is
// async; each emit kicks off a microtask that awaits req.sizes().
await new Promise((r) => setTimeout(r, 50));
// One more tick in case of cascading microtasks.
await new Promise((r) => setTimeout(r, 0));
// Every event hit req.sizes().
expect(counters.sizes).toBeGreaterThanOrEqual(200);
// The actual leak fix: res.body() is NEVER called.
expect(counters.body).toBe(0);
// And the size data still made it into networkBuffer.
const populated = Array.from({ length: networkBuffer.length }, (_, i) =>
networkBuffer.get(i),
)
.filter((e) => e && e.url?.startsWith('https://example.invalid/asset/'))
.filter((e) => typeof e?.size === 'number' && e.size > 0).length;
expect(populated).toBeGreaterThanOrEqual(200);
// Sanity: the buffer grew during this test, so the seeded entries landed.
expect(networkBuffer.length).toBeGreaterThan(startLen);
});
});
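The reproducer above pins one behavior: the `requestfinished` path reads sizes from `req.sizes()` and never touches the response body. A minimal sketch of such a handler follows, assuming hypothetical names (`recordFinishedSize`, `NetworkEntrySketch`) and a plain array in place of the real `networkBuffer`. The actual `wirePageEvents` listener is not shown in this diff and may differ.

```typescript
// Structural stand-ins for the parts of a Playwright Request used here.
interface RequestSizesLike {
  responseBodySize: number;
}
interface FinishedRequestLike {
  url(): string;
  sizes(): Promise<RequestSizesLike>;
}
interface NetworkEntrySketch {
  url: string;
  size?: number;
}

// Hypothetical handler: fill in the size of the newest pending entry for
// this URL from req.sizes(), without ever reading the response body.
async function recordFinishedSize(
  req: FinishedRequestLike,
  entries: NetworkEntrySketch[],
): Promise<void> {
  const { responseBodySize } = await req.sizes();
  const url = req.url();
  // Scan backward so the most recent unresolved entry is matched first.
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (entry !== undefined && entry.url === url && entry.size === undefined) {
      entry.size = responseBodySize;
      return;
    }
  }
}

const pendingEntries: NetworkEntrySketch[] = [
  { url: 'https://example.invalid/a', size: 10 },
  { url: 'https://example.invalid/a' },
];
const finishedReq: FinishedRequestLike = {
  url: () => 'https://example.invalid/a',
  sizes: async () => ({ responseBodySize: 1024 * 1024 }),
};
```

Because the size comes from metadata rather than from the payload, the cost per request stays constant regardless of how large the response was.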


@ -113,17 +113,45 @@ describe('sanitizeLoneSurrogates — wiring invariants', () => {
expect(SERVER_SRC).toContain('result: sanitizeLoneSurrogates(cr.result)');
});
-test('SSE activity feed sanitizes outbound frames via sanitizeReplacer', () => {
-// Replacer must run DURING stringify; post-stringify regex is ineffective
-// because JSON.stringify converts \uD800 → "\\ud800" before our regex sees it.
-expect(SERVER_SRC).toContain('JSON.stringify(entry, sanitizeReplacer)');
+test('SSE activity feed routes outbound frames through createSseEndpoint', () => {
+// v1.51 refactor: /activity/stream no longer inlines its own
+// ReadableStream/sanitizer wiring; it routes through createSseEndpoint
+// which applies sanitizeReplacer to every JSON.stringify. The grep
+// pins both halves of the contract: the endpoint uses the helper,
+// and the helper does the sanitization.
+const activityBlock = SERVER_SRC.match(
+/if \(url\.pathname === '\/activity\/stream'\)[\s\S]*?createSseEndpoint\(/,
+);
+expect(activityBlock).not.toBeNull();
 });
-test('SSE inspector stream sanitizes outbound frames via sanitizeReplacer', () => {
-expect(SERVER_SRC).toContain('JSON.stringify(event, sanitizeReplacer)');
+test('SSE inspector stream routes outbound frames through createSseEndpoint', () => {
+// Same v1.51 refactor invariant for /inspector/events.
+const inspectorBlock = SERVER_SRC.match(
+/if \(url\.pathname === '\/inspector\/events'[\s\S]*?createSseEndpoint\(/,
+);
+expect(inspectorBlock).not.toBeNull();
 });
-test('sanitizeReplacer is a function defined in server.ts', () => {
+test('createSseEndpoint applies sanitizeReplacer to every JSON.stringify', () => {
+// The helper is the single source of truth for SSE sanitization now.
+// If a future refactor moves stringify off the replacer (e.g. someone
+// adds a fast-path encode), this test fails and the surrogate-escape
+// class regresses across every SSE endpoint at once.
+const helperPath = path.resolve(import.meta.dir, '..', 'src', 'sse-helpers.ts');
+const helperSrc = fs.readFileSync(helperPath, 'utf-8');
+expect(helperSrc).toContain('JSON.stringify(');
+expect(helperSrc).toContain('sanitizeReplacer');
+// The sanitizer itself uses stripLoneSurrogates (the shared utility in
+// sanitize.ts) — not a private copy. Re-confirms the helper is wired
+// to the canonical sanitizer, not a drift'd duplicate.
+expect(helperSrc).toContain("import { stripLoneSurrogates } from './sanitize'");
+});
+test('sanitizeReplacer is a function defined in server.ts (for non-SSE egress)', () => {
+// server.ts keeps its own sanitizeReplacer for the non-SSE JSON egress
+// paths (handleCommandInternal etc.). The SSE path uses sse-helpers.ts's
+// own sanitizeReplacer; both must exist independently.
expect(SERVER_SRC).toContain('function sanitizeReplacer(');
});
});
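The wiring tests above depend on sanitization happening during `JSON.stringify`, through a replacer, because a lone surrogate is turned into a `\ud800`-style escape as soon as stringify serializes it. A minimal sketch of such a replacer follows, assuming hypothetical names (`stripLoneSurrogatesSketch`, `sanitizeReplacerSketch`). The real `stripLoneSurrogates` in `sanitize.ts` is not shown in this diff and may be implemented differently.

```typescript
// Hypothetical sketch. The pattern tries a valid surrogate pair first and
// falls back to any single surrogate code unit; only the single-unit
// matches are replaced. No `u` flag: matching must be per UTF-16 code unit.
const SURROGATE_PAIR_OR_LONE = /[\uD800-\uDBFF][\uDC00-\uDFFF]|[\uD800-\uDFFF]/g;

function stripLoneSurrogatesSketch(text: string): string {
  return text.replace(SURROGATE_PAIR_OR_LONE, (match) =>
    match.length === 2 ? match : '\uFFFD',
  );
}

// JSON.stringify replacer: sanitize string values, pass everything else on.
// Object keys are not rewritten by this sketch.
function sanitizeReplacerSketch(_key: string, value: unknown): unknown {
  return typeof value === 'string' ? stripLoneSurrogatesSketch(value) : value;
}

const sanitizedFrame = JSON.stringify(
  { msg: 'hello \uD800 world' },
  sanitizeReplacerSketch,
);
const pairFrame = JSON.stringify({ msg: '\uD83D\uDE00' }, sanitizeReplacerSketch);
```

Running the replacer inside stringify means the string is cleaned while it is still a string; a regex applied to the serialized output would have to match escape sequences instead of code units.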


@ -0,0 +1,194 @@
import { describe, test, expect } from 'bun:test';
import { createSseEndpoint } from '../src/sse-helpers';
// Unit tests for the SSE cleanup contract introduced by D6 EXTRACT_HELPER.
//
// The pre-helper bug: /activity/stream and /inspector/events ran cleanup
// only on the `req.signal.abort` edge. If the underlying TCP died without
// firing abort (Chromium MV3 service-worker suspend, intermediate proxy
// half-close), the subscriber closure stayed in the Set capturing the
// ReadableStreamDefaultController and any payloads queued behind it.
//
// These tests pin the three cleanup edges:
// 1. abort signal → cleanup
// 2. enqueue throws (consumer gone) → cleanup
// 3. heartbeat enqueue throws → cleanup
// And the idempotency invariant: cleanup running twice is a no-op.
function makeRequest(): { req: Request; abort: () => void } {
const controller = new AbortController();
// Minimal Request — we only use req.signal here. URL is irrelevant.
const req = new Request('http://localhost/test', { signal: controller.signal });
return { req, abort: () => controller.abort() };
}
/** Pull SSE bytes from a Response stream, return decoded text. */
async function readAll(res: Response, ms: number): Promise<string> {
if (!res.body) return '';
const reader = res.body.getReader();
const decoder = new TextDecoder();
let out = '';
const deadline = Date.now() + ms;
while (Date.now() < deadline) {
try {
const { value, done } = await Promise.race([
reader.read(),
new Promise<{ value: undefined; done: true }>((resolve) =>
setTimeout(() => resolve({ value: undefined, done: true }), deadline - Date.now()),
),
]);
if (done) break;
if (value) out += decoder.decode(value, { stream: true });
} catch {
break;
}
}
try { reader.cancel().catch(() => {}); } catch {}
return out;
}
describe('createSseEndpoint cleanup contract', () => {
test('1. abort signal triggers unsubscribe', async () => {
let unsubscribed = 0;
const { req, abort } = makeRequest();
const res = createSseEndpoint(req, {
subscribe: () => () => {
unsubscribed++;
},
liveEventName: 'test',
heartbeatMs: 60_000, // long enough that we don't see heartbeats in this test
});
// Acquire a reader (no read is issued), then abort.
const reader = res.body!.getReader();
// Yield to let start() run.
await Promise.resolve();
await Promise.resolve();
abort();
// Let the abort listener fire.
await new Promise((r) => setTimeout(r, 10));
expect(unsubscribed).toBe(1);
reader.cancel().catch(() => {});
});
test('2. enqueue throw triggers unsubscribe + heartbeat clear', async () => {
let unsubscribed = 0;
let notify: ((entry: { msg: string }) => void) | null = null;
const { req } = makeRequest();
const res = createSseEndpoint<{ msg: string }>(req, {
subscribe: (n) => {
notify = n;
return () => {
unsubscribed++;
};
},
liveEventName: 'test',
heartbeatMs: 60_000,
});
// Cancel the reader so subsequent enqueues throw.
const reader = res.body!.getReader();
await Promise.resolve();
await Promise.resolve();
expect(notify).not.toBeNull();
await reader.cancel(); // closes the consumer side
// Now fire a live event — enqueue should throw → cleanup → unsubscribe.
notify!({ msg: 'will fail to enqueue' });
await new Promise((r) => setTimeout(r, 10));
expect(unsubscribed).toBe(1);
});
test('3. cleanup is idempotent (abort then enqueue-fail)', async () => {
let unsubscribed = 0;
let notify: ((entry: { msg: string }) => void) | null = null;
const { req, abort } = makeRequest();
const res = createSseEndpoint<{ msg: string }>(req, {
subscribe: (n) => {
notify = n;
return () => {
unsubscribed++;
};
},
liveEventName: 'test',
heartbeatMs: 60_000,
});
const reader = res.body!.getReader();
await Promise.resolve();
await Promise.resolve();
abort();
await new Promise((r) => setTimeout(r, 10));
// Second cleanup edge — should be a no-op.
notify!({ msg: 'no-op' });
await new Promise((r) => setTimeout(r, 10));
expect(unsubscribed).toBe(1);
reader.cancel().catch(() => {});
});
test('4. initialReplay events reach the client before live events', async () => {
let notify: ((entry: { msg: string }) => void) | null = null;
const { req } = makeRequest();
const res = createSseEndpoint<{ msg: string }>(req, {
initialReplay: (send) => {
send('replay', { msg: 'first' });
},
subscribe: (n) => {
notify = n;
return () => {};
},
liveEventName: 'live',
heartbeatMs: 60_000,
});
// Trigger one live event soon after stream starts.
setTimeout(() => notify?.({ msg: 'second' }), 5);
const text = await readAll(res, 50);
expect(text).toContain('event: replay');
expect(text).toContain('"msg":"first"');
expect(text).toContain('event: live');
expect(text).toContain('"msg":"second"');
// Replay must come before live.
expect(text.indexOf('"first"')).toBeLessThan(text.indexOf('"second"'));
});
test('5. initialReplay throw triggers cleanup without subscribing', async () => {
let subscribed = 0;
const { req } = makeRequest();
const res = createSseEndpoint(req, {
initialReplay: () => {
throw new Error('replay boom');
},
subscribe: () => {
subscribed++;
return () => {};
},
liveEventName: 'test',
heartbeatMs: 60_000,
});
// Drain — stream should close cleanly.
const text = await readAll(res, 30);
expect(text).toBe(''); // no events
expect(subscribed).toBe(0); // never reached subscribe()
});
test('6. lone surrogates in payload string are sanitized', async () => {
let notify: ((entry: { msg: string }) => void) | null = null;
const { req } = makeRequest();
const res = createSseEndpoint<{ msg: string }>(req, {
subscribe: (n) => {
notify = n;
return () => {};
},
liveEventName: 'test',
heartbeatMs: 60_000,
});
setTimeout(() => {
// Lone high surrogate (no matching low). JSON.stringify would emit a
// \ud800 escape that breaks the Claude API. The helper must replace it
// with U+FFFD.
notify?.({ msg: 'hello \uD800 world' });
}, 5);
const text = await readAll(res, 50);
expect(text).toContain('event: test');
// JSON.stringify emits U+FFFD as the literal character, not as escape.
expect(text).toContain('\uFFFD');
// The raw lone-surrogate escape MUST NOT survive — that's the failure
// mode that breaks the Claude API with HTTP 400.
expect(text.toLowerCase()).not.toContain('\\ud800');
});
});
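The cleanup contract tested above has several triggers (abort, a failed enqueue, a failed heartbeat enqueue) that must all funnel into one idempotent cleanup. A minimal sketch of that structure follows, assuming a hypothetical `createSseSketch` function and options type. The real `createSseEndpoint` in `sse-helpers.ts` is not shown in this diff; it also supports `initialReplay` and applies a sanitizing replacer, both of which this sketch omits.

```typescript
// Hypothetical options: a subset of what the tests pass to the real helper.
interface SseSketchOptions<T> {
  subscribe: (notify: (entry: T) => void) => () => void;
  liveEventName: string;
  heartbeatMs: number;
}

function createSseSketch<T>(req: Request, opts: SseSketchOptions<T>): Response {
  const encoder = new TextEncoder();
  let cleanedUp = false;
  let unsubscribe: (() => void) | null = null;
  let heartbeat: ReturnType<typeof setInterval> | null = null;
  let controllerRef: ReadableStreamDefaultController<Uint8Array> | null = null;

  // Single cleanup path. The flag makes repeated calls a no-op.
  const cleanup = (): void => {
    if (cleanedUp) return;
    cleanedUp = true;
    if (heartbeat !== null) clearInterval(heartbeat);
    if (unsubscribe !== null) unsubscribe();
    try { controllerRef?.close(); } catch { /* stream already closed */ }
  };

  // Every write goes through here, so a dead consumer triggers cleanup
  // on the next live event or heartbeat even if abort never fired.
  const send = (frame: string): void => {
    if (cleanedUp || controllerRef === null) return;
    try {
      controllerRef.enqueue(encoder.encode(frame));
    } catch {
      cleanup();
    }
  };

  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      controllerRef = controller;
      unsubscribe = opts.subscribe((entry) => {
        send(`event: ${opts.liveEventName}\ndata: ${JSON.stringify(entry)}\n\n`);
      });
      heartbeat = setInterval(() => send(': heartbeat\n\n'), opts.heartbeatMs);
      req.signal.addEventListener('abort', cleanup, { once: true });
    },
    cancel() {
      cleanup();
    },
  });

  return new Response(stream, { headers: { 'Content-Type': 'text/event-stream' } });
}
```

Routing heartbeats through the same `send` path is what bounds how long a dead subscriber can linger: at worst one heartbeat interval passes before a write fails and cleanup runs.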


@ -0,0 +1,118 @@
import { describe, test, expect, beforeEach } from 'bun:test';
import { BrowserManager } from '../src/browser-manager';
import { subscribe } from '../src/activity';
// Tests for the tab-count guardrail. Each threshold fires exactly once per
// upward crossing and re-arms when the count drops back below. The toast
// UX lives in the sidebar; this exercises the server-side audit-trail
// invariant that an activity entry is emitted at each crossing.
interface CapturedEntry {
type: string;
command?: string;
error?: string;
tabs?: number;
}
function captureGuardrailEntries(): { entries: CapturedEntry[]; unsubscribe: () => void } {
const entries: CapturedEntry[] = [];
const unsubscribe = subscribe((entry) => {
if (entry.command === 'tab-guardrail') {
entries.push({
type: entry.type,
command: entry.command,
error: entry.error,
tabs: entry.tabs,
});
}
});
return { entries, unsubscribe };
}
/** Drive the guardrail by writing directly into the manager's pages map. */
async function setTabCount(bm: BrowserManager, n: number): Promise<void> {
// Reach into private state via index access — test-only manipulation that
// avoids spinning up a real Chromium just to verify the threshold math.
const inner = bm as unknown as {
pages: Map<number, unknown>;
checkTabGuardrails: () => void;
recheckTabGuardrailsOnClose: () => void;
};
inner.pages.clear();
for (let i = 0; i < n; i++) inner.pages.set(i, { fakeTab: true });
// Drive whichever direction matches the count change.
inner.checkTabGuardrails();
inner.recheckTabGuardrailsOnClose();
// emitActivity dispatches subscribers via queueMicrotask, so let the
// microtask queue drain before the test assertion runs.
await new Promise((r) => setTimeout(r, 0));
}
describe('tab-count guardrail', () => {
let bm: BrowserManager;
let capture: ReturnType<typeof captureGuardrailEntries>;
beforeEach(() => {
bm = new BrowserManager();
capture = captureGuardrailEntries();
});
test('1. no entry fires under the soft threshold', async () => {
await setTabCount(bm, 10);
await setTabCount(bm, 49);
expect(capture.entries).toEqual([]);
capture.unsubscribe();
});
test('2. soft threshold (50) fires exactly once on upward crossing', async () => {
await setTabCount(bm, 49);
await setTabCount(bm, 50);
await setTabCount(bm, 51);
await setTabCount(bm, 60);
expect(capture.entries.length).toBe(1);
expect(capture.entries[0].tabs).toBe(50);
expect(capture.entries[0].error).toContain('crossed 50');
capture.unsubscribe();
});
test('3. hard threshold (200) fires exactly once on upward crossing', async () => {
await setTabCount(bm, 199);
await setTabCount(bm, 200);
await setTabCount(bm, 201);
await setTabCount(bm, 220);
// 0 → 199 fired the soft threshold; 199 → 200 fires the hard one once.
const hardEntries = capture.entries.filter((e) => e.error?.includes('crossed 200'));
expect(hardEntries.length).toBe(1);
expect(hardEntries[0].tabs).toBe(200);
capture.unsubscribe();
});
test('4. both thresholds fire in order when count jumps from 0 → 250', async () => {
await setTabCount(bm, 250);
expect(capture.entries.length).toBe(2);
expect(capture.entries[0].error).toContain('crossed 50');
expect(capture.entries[1].error).toContain('crossed 200');
capture.unsubscribe();
});
test('5. soft threshold re-arms when tab count drops below it', async () => {
await setTabCount(bm, 60);
expect(capture.entries.length).toBe(1);
await setTabCount(bm, 30);
await setTabCount(bm, 55);
expect(capture.entries.length).toBe(2);
expect(capture.entries[1].error).toContain('crossed 50');
capture.unsubscribe();
});
test('6. hard threshold re-arms when tab count drops below it', async () => {
await setTabCount(bm, 210);
const beforeReArm = capture.entries.filter((e) => e.error?.includes('crossed 200')).length;
expect(beforeReArm).toBe(1);
await setTabCount(bm, 150);
await setTabCount(bm, 220);
const afterReArm = capture.entries.filter((e) => e.error?.includes('crossed 200')).length;
expect(afterReArm).toBe(2);
capture.unsubscribe();
});
});
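The guardrail tests above describe edge-triggered thresholds: each one fires once per upward crossing and re-arms when the count falls back below it. A minimal sketch of that logic follows, assuming a hypothetical `ThresholdGuardSketch` class. The real `checkTabGuardrails` and `recheckTabGuardrailsOnClose` methods are not shown in this diff and may be structured differently.

```typescript
// Hypothetical sketch of an edge-triggered threshold check.
class ThresholdGuardSketch {
  private readonly thresholds: number[];
  private readonly fired = new Set<number>();
  private readonly onCross: (threshold: number, count: number) => void;

  constructor(
    thresholds: number[],
    onCross: (threshold: number, count: number) => void,
  ) {
    // Ascending order so a large jump reports the lower threshold first.
    this.thresholds = [...thresholds].sort((a, b) => a - b);
    this.onCross = onCross;
  }

  update(count: number): void {
    for (const threshold of this.thresholds) {
      if (count >= threshold) {
        // Fire only on the first observation at or above the threshold.
        if (!this.fired.has(threshold)) {
          this.fired.add(threshold);
          this.onCross(threshold, count);
        }
      } else {
        // Dropping below re-arms the threshold for the next crossing.
        this.fired.delete(threshold);
      }
    }
  }
}

const crossings: string[] = [];
const guardSketch = new ThresholdGuardSketch([50, 200], (threshold, count) => {
  crossings.push(`${threshold}@${count}`);
});
for (const count of [49, 50, 60, 250, 30, 55]) guardSketch.update(count);
```

Tracking fired thresholds, rather than comparing against the previous count, keeps the behavior correct when the count jumps across both thresholds in a single update.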


@ -1137,6 +1137,103 @@ footer {
transition: color 150ms;
}
.footer-port:hover { color: var(--text-label); }
.footer-mem {
color: var(--text-meta);
font-family: var(--font-mono);
font-size: 11px;
margin-right: 6px;
padding: 1px 6px;
border-radius: var(--radius-sm);
transition: color 150ms;
}
.footer-mem.warn {
color: #f59e0b;
}
.footer-mem.bad {
color: #ef4444;
}
/* ─── Memory pressure toast ─────────────────────────────────── */
.mem-toast {
position: fixed;
left: 12px;
right: 12px;
bottom: 44px;
z-index: 9999;
background: var(--bg-elevated, #1f1f23);
border: 1px solid #ef4444;
border-radius: var(--radius-md, 6px);
padding: 12px;
box-shadow: 0 8px 24px rgba(0, 0, 0, 0.4);
font-family: var(--font-sans);
font-size: 12px;
}
.mem-toast-header {
display: flex;
align-items: center;
justify-content: space-between;
margin-bottom: 8px;
}
.mem-toast-header strong {
color: var(--text-heading);
font-size: 13px;
}
.mem-toast-close {
background: transparent;
border: none;
color: var(--text-meta);
cursor: pointer;
font-size: 18px;
line-height: 1;
padding: 0 4px;
}
.mem-toast-close:hover { color: var(--text-heading); }
.mem-toast-body {
margin-bottom: 8px;
color: var(--text-body);
line-height: 1.4;
}
.mem-toast-body .mem-toast-row {
display: flex;
align-items: center;
gap: 8px;
padding: 4px 0;
}
.mem-toast-body .mem-toast-row label {
flex: 1;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
cursor: pointer;
}
.mem-toast-body .mem-toast-size {
font-family: var(--font-mono);
font-size: 11px;
color: var(--text-meta);
width: 70px;
text-align: right;
}
.mem-toast-actions {
display: flex;
gap: 8px;
justify-content: flex-end;
}
.mem-toast-btn {
background: var(--bg-base);
border: 1px solid var(--zinc-600);
border-radius: var(--radius-sm, 4px);
color: var(--text-body);
cursor: pointer;
font-size: 12px;
padding: 4px 12px;
}
.mem-toast-btn:hover { background: var(--zinc-700); }
.mem-toast-btn.primary {
background: #ef4444;
border-color: #ef4444;
color: #fff;
}
.mem-toast-btn.primary:hover { background: #dc2626; }
.port-input {
width: 56px;
padding: 2px 6px;


@ -159,6 +159,19 @@
</div>
</main>
<!-- Tab guardrail toast (hidden until /memory poll trips a threshold) -->
<div class="mem-toast" id="mem-toast" role="dialog" aria-label="Memory pressure warning" style="display:none">
<div class="mem-toast-header">
<strong id="mem-toast-title">High memory pressure</strong>
<button class="mem-toast-close" id="mem-toast-close" aria-label="Dismiss">&times;</button>
</div>
<div class="mem-toast-body" id="mem-toast-body"></div>
<div class="mem-toast-actions">
<button class="mem-toast-btn primary" id="mem-toast-close-selected">Close selected</button>
<button class="mem-toast-btn" id="mem-toast-snooze">Snooze</button>
</div>
</div>
<!-- Footer with connection + debug toggle -->
<footer>
<div class="footer-left">
@ -166,6 +179,7 @@
<button class="footer-btn" id="reload-sidebar" title="Reload sidebar">reload</button>
</div>
<div class="footer-right">
<span class="footer-mem" id="footer-mem" title="Bun server RSS + tab count from $B memory (polled every 30s; backs off to every 5 min when polls fail or are slow)"></span>
<span class="dot" id="footer-dot"></span>
<span class="footer-port" id="footer-port" title="Click to change port"></span>
<input type="text" class="port-input" id="port-input" placeholder="34567" autocomplete="off" style="display:none">


@ -292,6 +292,294 @@ async function connectSSE() {
});
}
// ─── Memory Footer Readout ──────────────────────────────────────
//
// Polls /memory every 30s and renders e.g. "1.40 GB · 12 tabs" in the
// footer. Backs off to 5min if a poll fails or takes > 2s (Codex flag: the
// diagnostic shouldn't add load when the browser is already unhealthy). Uses
// Bearer auth like /refs above; /memory is a plain GET so EventSource
// semantics don't apply.
const MEM_POLL_FAST_MS = 30_000;
const MEM_POLL_SLOW_MS = 5 * 60_000;
const MEM_POLL_TIMEOUT_MS = 8_000;
const MEM_POLL_SLOW_THRESHOLD_MS = 2_000;
let memPollTimer = null;
let memPollMode = 'fast'; // 'fast' | 'slow'
function fmtBytesShort(n) {
if (typeof n !== 'number' || isNaN(n)) return '?';
if (n < 1024) return n + ' B';
if (n < 1024 * 1024) return (n / 1024).toFixed(0) + ' KB';
if (n < 1024 * 1024 * 1024) return (n / 1024 / 1024).toFixed(0) + ' MB';
return (n / 1024 / 1024 / 1024).toFixed(2) + ' GB';
}
function renderMemFooter(snapshot) {
const el = document.getElementById('footer-mem');
if (!el) return;
const bunRss = snapshot?.bunServer?.rss ?? 0;
const tabCount = Array.isArray(snapshot?.tabs) ? snapshot.tabs.length : 0;
el.textContent = `${fmtBytesShort(bunRss)} · ${tabCount} tabs`;
// Color thresholds: ~2 GB Bun RSS or 50 tabs is "watch this"; ~8 GB or
// 200 tabs is "this is the cliff" (matches the 200-tab guardrail).
el.classList.remove('warn', 'bad');
if (bunRss > 8 * 1024 * 1024 * 1024 || tabCount >= 200) el.classList.add('bad');
else if (bunRss > 2 * 1024 * 1024 * 1024 || tabCount >= 50) el.classList.add('warn');
}
async function pollMemoryOnce() {
if (!serverUrl || !serverToken) return { ok: false, slow: false };
const start = Date.now();
try {
const resp = await fetch(`${serverUrl}/memory`, {
headers: { 'Authorization': `Bearer ${serverToken}` },
signal: AbortSignal.timeout(MEM_POLL_TIMEOUT_MS),
credentials: 'include',
});
const elapsed = Date.now() - start;
if (!resp.ok) return { ok: false, slow: elapsed > MEM_POLL_SLOW_THRESHOLD_MS };
const snapshot = await resp.json();
renderMemFooter(snapshot);
// Evaluate guardrail triggers (single-heavy-tab OR tab-count crossing 200).
// Toast is hidden when no trigger fires; snooze state suppresses re-fire.
try { evaluateMemToast(snapshot); } catch (err) {
console.debug('[gstack sidebar] mem-toast evaluation failed:', err && err.message);
}
return { ok: true, slow: elapsed > MEM_POLL_SLOW_THRESHOLD_MS };
} catch (err) {
const elapsed = Date.now() - start;
// Don't log every poll failure — common during browser restarts / restoring
// sessions. Only log on the slow path so the user sees something in the
// console if the diagnostic itself is misbehaving.
if (elapsed > MEM_POLL_SLOW_THRESHOLD_MS) {
console.debug('[gstack sidebar] /memory poll slow/failed:', elapsed, 'ms', err && err.message);
}
return { ok: false, slow: elapsed > MEM_POLL_SLOW_THRESHOLD_MS };
}
}
function scheduleNextMemPoll(delayMs) {
if (memPollTimer) clearTimeout(memPollTimer);
memPollTimer = setTimeout(async () => {
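const { ok, slow } = await pollMemoryOnce();
// stopMemPolling() may have run while the poll was in flight; don't re-arm.
if (!memPollTimer) return;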
if (!ok || slow) {
memPollMode = 'slow';
scheduleNextMemPoll(MEM_POLL_SLOW_MS);
} else {
// Successful + fast → back to fast cadence.
if (memPollMode === 'slow') memPollMode = 'fast';
scheduleNextMemPoll(MEM_POLL_FAST_MS);
}
}, delayMs);
}
function startMemPolling() {
if (memPollTimer) return; // already running
// Kick off an immediate poll so the footer populates within ~1s of sidebar
// open, instead of waiting 30s for the first cycle.
scheduleNextMemPoll(500);
}
function stopMemPolling() {
if (memPollTimer) {
clearTimeout(memPollTimer);
memPollTimer = null;
}
}
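The cadence decision inside `scheduleNextMemPoll` can be isolated as a pure function, which makes the backoff rule easy to check without timers. A minimal sketch: the constants mirror the values above, while `classifyPoll` and `nextPollDelay` are hypothetical names introduced only for illustration.

```javascript
// Constants mirror the sidebar's values; the function names are hypothetical.
const FAST_MS = 30_000;
const SLOW_MS = 5 * 60_000;
const SLOW_THRESHOLD_MS = 2_000;

// Classify one poll from its outcome and wall-clock duration.
function classifyPoll(ok, elapsedMs) {
  return { ok, slow: elapsedMs > SLOW_THRESHOLD_MS };
}

// A failed or slow poll backs off; a fast success returns to the fast cadence.
function nextPollDelay({ ok, slow }) {
  return !ok || slow ? SLOW_MS : FAST_MS;
}

console.log(nextPollDelay(classifyPoll(true, 120)));  // 30000
console.log(nextPollDelay(classifyPoll(true, 2500))); // 300000
console.log(nextPollDelay(classifyPoll(false, 50)));  // 300000
```

Note that a fast failure also lands on the slow cadence, so a server that is down costs one request every five minutes rather than one every thirty seconds.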
// ─── Tab guardrail toast (D5 + Codex single-tab flag) ───────
//
// Each /memory poll evaluates two trigger conditions:
// 1. Tab count crossed 200 — show "top 5 tabs by max(jsHeap, ...)" with
// Close-selected + Snooze.
// 2. Any single tab over 4 GB JS heap — show one-tab toast (catches the
// Codex case where a runaway WebGL/video page balloons one tab).
// Snooze persists in chrome.storage.session: the next warning fires only once
// the tab count exceeds (count at snooze + TOAST_SNOOZE_TAB_BUMP) or a single
// tab's JS heap exceeds (largest heap at snooze + TOAST_SNOOZE_HEAP_BUMP).
//
// "Close selected" runs $B closetab <id> via the existing /command path —
// no chrome.tabs.remove bridge needed.
const HEAVY_TAB_HEAP_BYTES = 4 * 1024 * 1024 * 1024; // 4 GB per Codex flag
const TOAST_SNOOZE_TAB_BUMP = 50; // re-warn once tabs exceed (count at snooze) + 50
const TOAST_SNOOZE_HEAP_BUMP = 2 * 1024 * 1024 * 1024; // re-warn once a tab exceeds (max heap at snooze) + 2 GB
const memToastSnooze = {
tabsAbove: 0, // suppress the count-toast until tabs strictly exceeds this
heapAbove: 0, // suppress the single-tab toast until heap strictly exceeds this
};
async function loadSnoozeState() {
if (!chrome?.storage?.session) return;
try {
const stored = await chrome.storage.session.get(['memToastSnooze']);
if (stored?.memToastSnooze) {
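// Use Number(), not `| 0`: bitwise OR truncates to int32 and would corrupt
// heap floors above 2 GiB (every stored heapAbove is at least that large).
memToastSnooze.tabsAbove = Number(stored.memToastSnooze.tabsAbove) || 0;
memToastSnooze.heapAbove = Number(stored.memToastSnooze.heapAbove) || 0;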
}
} catch (err) {
console.debug('[gstack sidebar] mem-toast snooze load failed:', err && err.message);
}
}
async function saveSnoozeState() {
if (!chrome?.storage?.session) return;
try {
await chrome.storage.session.set({ memToastSnooze: { ...memToastSnooze } });
} catch (err) {
console.debug('[gstack sidebar] mem-toast snooze save failed:', err && err.message);
}
}
function dismissMemToast() {
const toast = document.getElementById('mem-toast');
if (toast) toast.style.display = 'none';
}
/**
 * Sort key for "RAM-heavy" tabs. The score is the larger of the tab's JS
 * heap and a rough native-memory estimate derived from DOM node and
 * listener counts, so a tab whose JS heap is small (e.g. WebGL/video) but
 * whose nodes/listeners spike still ranks high.
 */
function tabRamScore(tab) {
const heap = tab?.jsHeapUsed || 0;
const nodes = tab?.nodes || 0;
const listeners = tab?.listeners || 0;
// ~1 KB per DOM node + ~200 bytes per listener as a back-of-envelope
// native-memory estimate. Keeps the sort meaningful when JS heap is small.
const nativeEstimate = nodes * 1024 + listeners * 200;
return Math.max(heap, nativeEstimate);
}
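To see how the score orders tabs, the sketch below repeats the scoring rule and ranks a few made-up tabs. `score` is a standalone copy of the rule above, `topTabsByScore` is a hypothetical helper, and the sample values are invented for illustration.

```javascript
// Same scoring rule as tabRamScore above, repeated so the sketch runs alone.
function score(tab) {
  const heap = tab?.jsHeapUsed || 0;
  const nativeEstimate = (tab?.nodes || 0) * 1024 + (tab?.listeners || 0) * 200;
  return Math.max(heap, nativeEstimate);
}

// Hypothetical helper: the N highest-scoring tabs, without mutating the input.
function topTabsByScore(tabs, n = 5) {
  return [...tabs].sort((a, b) => score(b) - score(a)).slice(0, n);
}

// Made-up sample data for illustration only.
const sample = [
  { id: 1, jsHeapUsed: 50 * 1024 * 1024, nodes: 1_000, listeners: 100 },
  { id: 2, jsHeapUsed: 1 * 1024 * 1024, nodes: 200_000, listeners: 5_000 },
  { id: 3 },
];
console.log(topTabsByScore(sample, 2).map((t) => t.id)); // [ 2, 1 ]
```

Tab 2 wins despite its 1 MB JS heap because 200,000 nodes and 5,000 listeners give a native estimate of about 206 MB, which is the case the `Math.max` is there to catch.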
function showMemToast(title, body, tabsForClose) {
const toast = document.getElementById('mem-toast');
const titleEl = document.getElementById('mem-toast-title');
const bodyEl = document.getElementById('mem-toast-body');
const closeBtn = document.getElementById('mem-toast-close-selected');
if (!toast || !titleEl || !bodyEl || !closeBtn) return;
titleEl.textContent = title;
bodyEl.innerHTML = '';
for (const t of tabsForClose) {
const row = document.createElement('div');
row.className = 'mem-toast-row';
const cb = document.createElement('input');
cb.type = 'checkbox';
cb.id = `mem-toast-tab-${t.id}`;
cb.value = String(t.id);
cb.checked = true; // default-selected so a fast user just hits Close
const label = document.createElement('label');
label.htmlFor = cb.id;
const urlShort = (t.url || '').length > 50 ? t.url.slice(0, 47) + '...' : (t.url || '(no url)');
label.textContent = `tab #${t.id} ${urlShort}`;
const size = document.createElement('span');
size.className = 'mem-toast-size';
size.textContent = fmtBytesShort(tabRamScore(t));
row.appendChild(cb);
row.appendChild(label);
row.appendChild(size);
bodyEl.appendChild(row);
}
toast.style.display = '';
closeBtn.onclick = async () => {
const ids = tabsForClose
.filter((t) => document.getElementById(`mem-toast-tab-${t.id}`)?.checked)
.map((t) => t.id);
dismissMemToast();
for (const id of ids) {
try {
await fetch(`${serverUrl}/command`, {
method: 'POST',
headers: authHeaders(),
body: JSON.stringify({ command: 'closetab', args: [String(id)] }),
});
} catch (err) {
console.warn('[gstack sidebar] mem-toast closetab failed:', id, err && err.message);
}
}
};
}
/**
* Driven by every successful /memory poll. Decides whether to surface
* the toast and which payload to show.
*/
function evaluateMemToast(snapshot) {
if (!snapshot || !Array.isArray(snapshot.tabs)) return;
const tabs = snapshot.tabs;
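// Trigger 1: any single tab over 4 GB JS heap. Catches the WebGL/video
// case before the tab count threshold ever fires. Pick the heaviest tab
// (not the first match) so a snoozed smaller tab can't mask a larger one.
const heavyTab = tabs.reduce(
(max, t) => ((t.jsHeapUsed || 0) > (max?.jsHeapUsed || 0) ? t : max),
null,
);
const heavyHeap = heavyTab?.jsHeapUsed || 0;
if (heavyTab && heavyHeap > HEAVY_TAB_HEAP_BYTES && heavyHeap > memToastSnooze.heapAbove) {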
showMemToast(
`Heavy tab: ${fmtBytesShort(heavyTab.jsHeapUsed)} JS heap`,
'',
[heavyTab],
);
return;
}
// Trigger 2: tab count crossed the hard guardrail (200) and isn't snoozed.
if (tabs.length >= 200 && tabs.length > memToastSnooze.tabsAbove) {
const top5 = [...tabs].sort((a, b) => tabRamScore(b) - tabRamScore(a)).slice(0, 5);
showMemToast(
`${tabs.length} tabs open — close some?`,
'',
top5,
);
return;
}
// No trigger: keep toast hidden.
}
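The ordering of the two triggers and the effect of snooze can be checked in isolation with a pure decision function. This is a sketch, not the sidebar's code: `pickToastTrigger` and its string return values are hypothetical, and comparing the largest heap against the snooze floor is an assumption of the sketch.

```javascript
const HEAVY_TAB_HEAP_BYTES = 4 * 1024 ** 3; // 4 GB, as in the sidebar
const HARD_TAB_COUNT = 200;                 // hard guardrail

// Hypothetical pure helper: which trigger (if any) should show the toast.
function pickToastTrigger(tabs, snooze) {
  const maxHeap = tabs.reduce((m, t) => Math.max(m, t.jsHeapUsed || 0), 0);
  if (maxHeap > HEAVY_TAB_HEAP_BYTES && maxHeap > snooze.heapAbove) return 'heavy-tab';
  if (tabs.length >= HARD_TAB_COUNT && tabs.length > snooze.tabsAbove) return 'tab-count';
  return null;
}

const GB = 1024 ** 3;
const manyTabs = Array.from({ length: 210 }, (_, id) => ({ id, jsHeapUsed: 0 }));
console.log(pickToastTrigger([{ id: 1, jsHeapUsed: 5 * GB }], { tabsAbove: 0, heapAbove: 0 })); // heavy-tab
console.log(pickToastTrigger(manyTabs, { tabsAbove: 0, heapAbove: 0 }));   // tab-count
console.log(pickToastTrigger(manyTabs, { tabsAbove: 260, heapAbove: 0 })); // null
```

The heavy-tab check runs first, so a single runaway tab is reported even when the tab count is also over the guardrail; a snoozed heavy tab falls through to the count check rather than suppressing it.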
function setupMemToastWiring() {
const close = document.getElementById('mem-toast-close');
if (close) close.addEventListener('click', dismissMemToast);
const snooze = document.getElementById('mem-toast-snooze');
if (snooze) {
snooze.addEventListener('click', async () => {
// Snooze logic: bump the thresholds above the current snapshot so the
// toast won't re-fire until the user has accumulated MORE tabs or one
// tab has grown ANOTHER 2 GB beyond what we just warned about. Stored
// in chrome.storage.session so a sidebar reload doesn't lose the
// snooze (but a Chrome restart does).
try {
const resp = await fetch(`${serverUrl}/memory`, {
headers: { 'Authorization': `Bearer ${serverToken}` },
signal: AbortSignal.timeout(MEM_POLL_TIMEOUT_MS),
credentials: 'include',
});
if (resp.ok) {
const snap = await resp.json();
const tabs = Array.isArray(snap.tabs) ? snap.tabs : [];
memToastSnooze.tabsAbove = tabs.length + TOAST_SNOOZE_TAB_BUMP;
const maxHeap = tabs.reduce((m, t) => Math.max(m, t.jsHeapUsed || 0), 0);
memToastSnooze.heapAbove = maxHeap + TOAST_SNOOZE_HEAP_BUMP;
await saveSnoozeState();
}
} catch (err) {
console.debug('[gstack sidebar] mem-toast snooze fetch failed:', err && err.message);
}
dismissMemToast();
});
}
void loadSnoozeState();
}
// Wire the toast on DOM ready.
if (document.readyState === 'loading') {
document.addEventListener('DOMContentLoaded', setupMemToastWiring);
} else {
setupMemToastWiring();
}
// ─── Refs Tab ───────────────────────────────────────────────────
async function fetchRefs() {
@ -893,9 +1181,16 @@ function updateConnection(url, token) {
chrome.runtime.sendMessage({ type: 'sidebarOpened' }).catch(() => {});
connectSSE();
connectInspectorSSE();
startMemPolling();
} else {
document.getElementById('footer-dot').className = 'dot';
document.getElementById('footer-port').textContent = '';
const memEl = document.getElementById('footer-mem');
if (memEl) {
memEl.textContent = '';
memEl.classList.remove('warn', 'bad');
}
stopMemPolling();
setActionButtonsEnabled(false);
if (wasConnected) startReconnect();
}


@ -141,6 +141,7 @@ Run with `browse <command> [args]`. Full reference: `browse/SKILL.md`.
- `disconnect`: Disconnect headed browser, return to headless mode
- `focus [@ref]`: Bring headed browser window to foreground (macOS)
- `handoff [message]`: Open visible Chrome at current page for user takeover
- `memory [--json]`: Snapshot Bun heap + per-tab JS heap + Chromium process tree + bounded buffer sizes.
- `restart`: Restart server
- `resume`: Re-snapshot after user takeover, return control to AI
- `state save|load <name>`: Save/load browser state (cookies + URLs)