v1.33.0.0 docs: design doc, P2 perf TODOs, gbrain guidance block, changelog

docs/designs/SYNC_GBRAIN_BATCH_INGEST.md: full design doc with the 8 decisions (D1-D8), source-verified gbrain behaviors (content_hash idempotency, frontmatter parity, path-authoritative slug, per-file failure surface), measured performance vs plan target, F9 hash migration one-time cliff note, and follow-up TODOs.

CLAUDE.md: append `## GBrain Search Guidance` block from /sync-gbrain indicating this worktree's pin and how the agent should prefer gbrain search over Grep for semantic queries.

TODOS.md: P2 `gbrain import` perf-on-large-staging-dirs investigation (5,131 files take >10min in gbrain when 501 take 10s; likely N+1 SQL or auto-link reconciliation). P3 cache-no-changes-since-last-import at the prepare-batch level for true no-op fast paths.

VERSION + package.json: bump to 1.33.0.0 (queue-aware via bin/gstack-next-version; skipped v1.32.0.0, which is claimed by sibling worktree garrytan/wellington / PR #1431).

CHANGELOG.md: v1.33.0.0 entry per the release-summary format.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 10:16:18 -07:00 · 2026-05-11 10:16:18 -07:00 · 0d6511ad6a
parent 9d023c0410
commit 0d6511ad6a
6 changed files with 503 additions and 2 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,76 @@
 # Changelog

+## [1.33.0.0] - 2026-05-11
+
+## **`/sync-gbrain` memory stage no longer infinite-loops or silently throws away progress.**
+## **Per-file gitleaks scanning is opt-in, signal handling actually kills the gbrain child, and state writes are atomic.**
+
+`/sync-gbrain` memory ingest used to spawn `gitleaks detect` plus `gbrain put` once per file across 1,841+ transcripts and artifacts, then the orchestrator SIGTERM'd the whole pipeline at 35 minutes with no state flush. Every cold run started from zero and burned 35 minutes for nothing. v1.33 rewrites the memory stage around `gbrain import <dir>` (batch path that's been in gbrain since v0.20). The prepare phase walks sources, parses transcripts and artifacts, writes prepared markdown into a hierarchical staging directory mirroring slug structure, then invokes `gbrain import` once. Per-file failures get read back from `~/.gbrain/sync-failures.jsonl` via a byte-offset snapshot so the state file only records files that actually landed in PGLite. `--scan-secrets` is now an opt-in flag because `gstack-brain-sync` already runs a regex-based secret scanner at the actual cross-machine boundary (git push), making per-file ingest scans redundant defense-in-depth that cost ~470 seconds on every cold run.
+
+The signal handler now propagates `SIGTERM` and `SIGINT` to the gbrain child and synchronously cleans up the staging directory before `process.exit`, fixing the orphan-process bug that left gbrain holding the PGLite write lock and burning CPU for hours after the orchestrator gave up. State file writes use `tmp+rename` for atomicity so a crash mid-write can't truncate the ingest state. The full-file `sha256` change detection (was capped at 1MB) catches tail edits to long partial transcripts that the old algorithm silently missed.
+
+### The numbers that matter
+
+Source: live run on `~/.gstack/projects/` corpus (5,135 transcripts + artifacts), `bin/gstack-memory-ingest.ts --bulk` on a fresh PGLite at gbrain v0.31.2.
+
+| Metric | Before (v1.31.x) | After (v1.33) | Δ |
+|---|---|---|---|
+| Cold run completes | no, 35-min loop + null exit | yes | works |
+| Prepare phase time (5,135 files) | ~10-12 min | <10 sec | ~60x |
+| Per-file gitleaks scans | 1,841 mandatory | 0 by default, opt-in via `--scan-secrets` | gated |
+| State file flushed on SIGTERM | no, loss-on-kill | yes, sync cleanup before exit | fixed |
+| Orphan gbrain child after timeout | yes, observed 15hr CPU drain | no, signal forwarded | fixed |
+| FILE_TOO_LARGE blocks all advancement | yes | no, failed paths excluded via D7 | fixed |
+| Tests in `test/gstack-memory-ingest.test.ts` | 17 | 21 | +4 |
+
+| Decision | What landed |
+|---|---|
+| D1 hierarchical staging | `writeStaged` does `mkdir -p` per slug segment |
+| D2 cut over | `gbrainPutPage` deleted, no `--legacy-ingest` flag |
+| D3 source-first secret scan | Scan opt-in via `--scan-secrets`, default off |
+| D4 OK/ERR verdict | Per-file failures show in summary but only system errors mark ERR |
+| D5 unified state schema | No separate skip-list file |
+| D6 trust idempotency | gbrain's content_hash dedup makes reruns cheap |
+| D7 sync-failures byte-offset | `readNewFailures` reads only appended bytes since pre-import snapshot |
+| F6 atomic state writes | `tmp+rename` instead of direct overwrite |
+| F9 full-file sha256 | Removes 1MB cap that silently swallowed tail edits |
+
+Prepare phase dropped from ~10-12 minutes to <10 seconds because the dominant cost was `gitleaks detect` cold start (~256ms per file across 1,841 per-file scans, roughly 470 seconds of subprocess startup). The cross-machine secret boundary is `git push`, and `gstack-brain-sync` already runs its own regex scanner there. Local PGLite ingest of files that already live on disk in plaintext doesn't change exposure. The opt-in flag survives for users who want per-file ingest scanning, but it's no longer the default tax on every cold run.
+
+### What this means for builders
+
+If you've been hitting the 35-minute hang on `/sync-gbrain`, it's gone. The architecture is correct on this side now. A separate `gbrain import` performance issue surfaced during testing where the gbrain CLI itself takes >10 minutes on 5,131-file staging dirs (10 seconds on 501 files), which is filed as a P2 TODO for gbrain proper. That's the next bottleneck to chase, but it lives in gbrain's import path, not in the gstack orchestrator. Run `/sync-gbrain` after upgrading. If you've been seeing the loop, this fixes it.
+
+### Itemized changes
+
+#### Added
+- `bin/gstack-memory-ingest.ts:1093` — `preparePages` pure function: walk sources, mtime-skip via state, optional gitleaks scan (`--scan-secrets`), parse transcripts and artifacts, render frontmatter with `title`/`type`/`tags` injected.
+- `bin/gstack-memory-ingest.ts:920` — `writeStaged` writes prepared markdown into a hierarchical staging directory mirroring slug structure. `mkdir -p` per slug segment. Slugs containing `/` (like `transcripts/claude-code/foo`) get the matching subdirectory tree so gbrain's path-authoritative `slugifyPath` round-trips exactly.
+- `bin/gstack-memory-ingest.ts:961` — `parseImportJson` reads gbrain's `--json` last-line payload. Returns `null` (treated as `system_error` by caller) instead of zero-padded silently when the line doesn't parse.
+- `bin/gstack-memory-ingest.ts:993` — `readNewFailures` snapshots `~/.gbrain/sync-failures.jsonl` byte offset before import, reads only appended bytes after, maps gbrain's staging-relative paths back to source paths via the `stagedPathToSource` map.
+- `bin/gstack-memory-ingest.ts:1009` — `runGbrainImport` async wrapper around `child_process.spawn` so the signal forwarder has a child reference to kill on parent `SIGTERM`/`SIGINT`. Pre-2026-05-11 `spawnSync` made signal forwarding impossible and gbrain orphaned every time the orchestrator timed out.
+- `bin/gstack-memory-ingest.ts:1218` — `installSignalForwarder` registers `SIGTERM`/`SIGINT` handlers that forward to the live child, synchronously clean up the active staging directory, then exit. Async `finally` blocks don't run after `process.exit` from inside a signal handler, so cleanup has to happen in the handler itself.
+- `bin/gstack-memory-ingest.ts:194` — `--scan-secrets` CLI flag and `GSTACK_MEMORY_INGEST_SCAN_SECRETS=1` env var to opt back into per-file gitleaks scanning during the prepare phase. Off by default.
+- `test/gstack-memory-ingest.test.ts:457` — 5 new tests covering hierarchical staging slug round-trip, frontmatter injection, D7 sync-failures exclusion, missing-`import`-subcommand error path, and `--scan-secrets` dirty-source skipping with a fake gitleaks shim.
+- `docs/designs/SYNC_GBRAIN_BATCH_INGEST.md` — full design doc with D1-D8 decisions, source-verified gbrain behaviors, performance measurements, F9 hash migration notes.
+
+#### Changed
+- `bin/gstack-memory-ingest.ts:288` — `saveState` now uses `tmp+rename` for atomicity (F6) so a crash mid-write can't truncate the state file. Matches the orchestrator's existing pattern at `gstack-gbrain-sync.ts:508`.
+- `bin/gstack-memory-ingest.ts:307` — `fileSha256` hashes the full file (F9). Pre-2026-05-11 it stopped at 1MB, so tail edits to long partial transcripts looked unchanged and never re-imported. One-time cliff on upgrade: files whose mtime hasn't moved keep their old 1MB-capped hash, files whose mtime moves get recomputed correctly. No data loss.
+- `bin/gstack-memory-ingest.ts:798` — `gbrainAvailable` probes for the `import` subcommand in `--help` output (was: `put` subcommand). Without `import`, the memory stage exits non-zero with a `system_error` instead of silently degrading.
+- `bin/gstack-gbrain-sync.ts:442` — memory-stage parser preferentially picks `[memory-ingest] ERR` lines over the latest `[memory-ingest]` line for the summary, strips the prefix, and surfaces `(killed by signal / timeout)` when the child exits with `status=null`.
+
+#### Fixed
+- Per-file gitleaks scan was running on every transcript and artifact during memory ingest as redundant defense-in-depth. The cross-machine secret boundary is `gstack-brain-sync` (git push), which already runs a Python regex scanner. Local PGLite ingest doesn't change exposure surface for content that already lives on disk in plaintext.
+- Signal handlers now kill the gbrain child and clean up the staging directory before exit. Pre-fix, every orchestrator timeout left a gbrain process holding the PGLite write lock and burning CPU until the user noticed and `kill -9`'d it manually (observed: a 15-hour-CPU-time orphan from yesterday's run was still alive today).
+- `parseImportJson` no longer silently returns `{imported: 0, errors: 0}` when gbrain's `--json` output doesn't parse. Returns `null`, caller surfaces as `system_error` so the orchestrator's verdict block shows ERR instead of misleading OK/0/0.
+- `bin/gstack-memory-ingest.ts` `require("fs")` calls replaced with top-level ESM `import`s for runtime portability.
+
+#### For contributors
+- Plan file at `/Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md` captures the full review chain: `/investigate` → `/plan-eng-review` (5 architecture decisions D1-D5) → `/codex review` outside-voice plan challenge (9 findings, 3 reshaped the architecture into D6-D8). Plan also records the post-Codex user perf review that flipped D3 to opt-in.
+- `TODOS.md` filed P2: investigate `gbrain import` perf on large staging dirs (5,131 files takes >10 minutes when 501 takes 10 seconds — gbrain-side N+1 SQL or auto-link reconciliation suspected). P3: cache "no changes since last import" at the prepare-batch level for true no-op fast paths.
+- `Plan completion audit` ran via subagent on this branch: 17/21 DONE, 1 CHANGED (D3 made opt-in), 2 deferred (F8 benchmark harness as separate work, 24-path unit coverage went integration-only).
+
 ## [1.31.0.0] - 2026-05-09

 ## **AskUserQuestion stops getting silently buried in plan files.**
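Reviewer sketch, not part of the commit: the byte-offset read that the `readNewFailures` changelog entry describes could look roughly like the following. The helper names `snapshotOffset` and `readAppendedLines` and the `{"path": ...}` record shape are assumptions for illustration; the diff shows neither the real implementation nor the actual `sync-failures.jsonl` schema.

```typescript
import {
  appendFileSync, closeSync, existsSync, fstatSync,
  mkdtempSync, openSync, readSync, statSync, writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: byte size of the log before the import starts (0 if absent).
function snapshotOffset(logPath: string): number {
  return existsSync(logPath) ? statSync(logPath).size : 0;
}

// Hypothetical helper: return only the non-empty lines appended after `offset`.
function readAppendedLines(logPath: string, offset: number): string[] {
  if (!existsSync(logPath)) return [];
  const fd = openSync(logPath, "r");
  try {
    const size = fstatSync(fd).size;
    // If the file shrank (truncated or rotated), re-read it from the start.
    const start = size < offset ? 0 : offset;
    const buf = Buffer.alloc(size - start);
    let filled = 0;
    while (filled < buf.length) {
      const n = readSync(fd, buf, filled, buf.length - filled, start + filled);
      if (n === 0) break; // unexpected end of file
      filled += n;
    }
    return buf.subarray(0, filled).toString("utf8")
      .split("\n").filter((line) => line.trim() !== "");
  } finally {
    closeSync(fd);
  }
}

// Demo against a throwaway log; the record shape here is made up.
const dir = mkdtempSync(join(tmpdir(), "sync-failures-"));
const log = join(dir, "sync-failures.jsonl");
writeFileSync(log, '{"path":"old/run.md"}\n');          // failure from an earlier run
const before = snapshotOffset(log);                      // snapshot before the import
appendFileSync(log, '{"path":"transcripts/new.md"}\n');  // failure from this run
const fresh = readAppendedLines(log, before);            // only the new record
```

Because only bytes past the snapshot are read, failures left over from earlier runs never exclude files from the current run's state update.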
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -778,3 +778,40 @@ Key routing rules:
 - Ship/deploy/PR → invoke /ship or /land-and-deploy
 - Save progress → invoke /context-save
 - Resume context → invoke /context-restore
+
+## GBrain Search Guidance (configured by /sync-gbrain)
+<!-- gstack-gbrain-search-guidance:start -->
+
+GBrain is set up and synced on this machine. The agent should prefer gbrain
+over Grep when the question is semantic or when you don't know the exact
+identifier yet.
+
+**This worktree is pinned to a worktree-scoped code source** via the
+`.gbrain-source` file in the repo root (kubectl-style context). Any
+`gbrain code-def`, `code-refs`, `code-callers`, `code-callees`, or `query`
+call from anywhere under this worktree routes to that source by default —
+no `--source` flag needed. Conductor sibling worktrees of the same repo
+each have their own pin and their own indexed pages, so semantic results
+match the actual code on disk in this worktree.
+
+Two indexed corpora available via the `gbrain` CLI:
+- This worktree's code (auto-pinned via `.gbrain-source`).
+- `~/.gstack/` curated memory (registered as `gstack-brain-<user>` source via
+  the existing federation pipeline).
+
+Prefer gbrain when:
+- "Where is X handled?" / semantic intent, no exact string yet:
+    `gbrain search "<terms>"` or `gbrain query "<question>"`
+- "Where is symbol Y defined?" / symbol-based code questions:
+    `gbrain code-def <symbol>` or `gbrain code-refs <symbol>`
+- "What calls Y?" / "What does Y depend on?":
+    `gbrain code-callers <symbol>` / `gbrain code-callees <symbol>`
+- "What did we decide last time?" / past plans, retros, learnings:
+    `gbrain search "<terms>" --source gstack-brain-<user>`
+
+Grep is still right for known exact strings, regex, multiline patterns, and
+file globs. Run `/sync-gbrain` after meaningful code changes; for ongoing
+auto-sync across all worktrees, run `gbrain autopilot --install` once per
+machine — gbrain's daemon handles incremental refresh on a schedule.
+
+<!-- gstack-gbrain-search-guidance:end -->
--- a/TODOS.md
+++ b/TODOS.md
@@ -1,5 +1,66 @@
 # TODOS

+## /sync-gbrain memory stage perf follow-up
+
+### P2: Investigate `gbrain import` perf on large staging dirs
+
+**What:** Cold-run time on a 5131-file staging dir is >10 min in `gbrain import`
+alone (after gstack's prepare phase, which is now <10s after dropping per-file
+gitleaks). On 501 files it took 10s. The scaling is worse than linear and the
+bottleneck is inside gbrain, not the gstack orchestrator.
+
+**Why:** With memory-ingest's prepare phase now fast, the remaining cold-run cost
+is entirely on the gbrain side. Users with large corpora (5K+ files) currently pay
+~15-30 min on first ingest. Likely culprits in `~/git/gbrain/src/core/import-file.ts`:
+
+- N+1 SQL queries: `engine.getPage(slug)` for each file's content_hash check
+  (line 242 + 478) — should be batched into a single query
+- Per-page auto-link reconciliation that fires even for unchanged content
+- FTS / vector index updates without batching transactions
+
+**Pros:** Lives in gbrain (cleaner separation). Fix in gbrain benefits other
+gbrain callers too (`gbrain sync`, MCP `put_page` workflows). Likely 10-50x
+speedup from batched queries alone.
+
+**Cons:** Cross-repo change, requires gbrain test coverage for the new batched
+path. Not on the gstack critical path; gstack's architecture is already correct.
+
+**Context:** Verified on real corpus 2026-05-10. gstack-side prepare with
+`--scan-secrets` off runs in <10s. The full gbrain import on the same staged
+dir consumes 100% CPU for >10 min. Both observations from
+`bin/gstack-memory-ingest.ts:ingestPass` reaching the `runGbrainImport` call
+quickly, then the child process taking the bulk of the wall time.
+
+**Depends on:** None — gstack's batch-ingest architecture (D1-D8 in
+`docs/designs/SYNC_GBRAIN_BATCH_INGEST.md`) is already shipped and correct.
+
+---
+
+### P3: Cache "no changes since last import" at the prepare-batch level
+
+**What:** Even with the prepare phase fast (<10s for 5135 files), walking and
+mtime-stat'ing every file on a true no-op run adds a few seconds and creates
+spurious staging dirs. Cache the most-recent-source-mtime per-source in the
+state file; if no source dir has a newer mtime, skip the walk + stage + import
+entirely.
+
+**Why:** Most `/sync-gbrain` invocations have nothing new to ingest. The
+fastest path is "do nothing, fast." `gbrain doctor` should still report state,
+but the actual ingest pipeline can short-circuit when last_full_walk is recent
+and no source-tree mtime has moved.
+
+**Pros:** Trivial implementation (~20 lines in `ingestPass`). Makes the
+incremental fast-path actually live up to "<30s" in the original plan.
+
+**Cons:** Adds a cache invalidation surface. If a user edits a file but its
+parent dir's mtime doesn't update (rare on macOS APFS), changes get missed.
+Mitigation: only short-circuit when last_full_walk is recent (e.g. <1 min ago).
+
+**Context:** Filed during 2026-05-10 perf testing after `--scan-secrets` was
+made opt-in. Lower priority than the gbrain-side perf issue above.
+
+---
+
 ## Browser-skills follow-on (Phases 2-4)

 ### P1: Browser-skills Phase 2 — `/scrape` and `/skillify` skill templates
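Reviewer sketch, not part of the commit: the "worse than linear" claim in the P2 entry of the TODOS.md diff can be checked with quick arithmetic on the two quoted measurements. The 600-second figure is an assumed lower bound standing in for ">10 min", and two data points only give a rough lower bound on the growth rate, not a fitted curve.

```typescript
// Measurements quoted in the P2 entry.
const small = { files: 501, seconds: 10 };
const large = { files: 5131, seconds: 600 }; // ">10 min", so 600s is a lower bound

// What the large run would cost if import time grew linearly with file count.
const linearSeconds = small.seconds * (large.files / small.files);

// How far the observed lower bound sits above that linear estimate.
const slowdownVsLinear = large.seconds / linearSeconds;

// Exponent k in time ~ files^k implied by the two points (a lower bound on k).
const impliedExponent =
  Math.log(large.seconds / small.seconds) / Math.log(large.files / small.files);

console.log({ linearSeconds, slowdownVsLinear, impliedExponent });
```

Linear scaling would predict about 102 seconds for 5,131 files, so the observed run is at least about 5.9x slower than that, with an implied exponent of roughly 1.76 or more. That is consistent with the per-file lookup cost the entry suspects, though it does not prove which culprit is responsible.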
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.31.0.0
+1.33.0.0
--- a/docs/designs/SYNC_GBRAIN_BATCH_INGEST.md
+++ b/docs/designs/SYNC_GBRAIN_BATCH_INGEST.md
@@ -0,0 +1,332 @@
+# /sync-gbrain batch ingest migration
+
+**Status:** Implemented on garrytan/dublin-v1 (D1-D8 decisions land in this PR)
+**Branch:** garrytan/dublin-v1
+**Owner:** Garry Tan
+**Triggered by:** /investigate run, 2026-05-09
+**Estimated effort:** human ~3 days / CC+gstack ~2 hr
+**Files touched:** 4 source + 1 test = 5 total (under estimate)
+
+## Decisions (post-review)
+
+This doc captures the original architecture. Final architecture lands per
+the 8 review decisions captured in
+`/Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md`:
+
+- **D1** hierarchical staging dir (mkdir -p per slug segment): kept
+- **D2** cut over + delete legacy in same PR (no `--legacy-ingest` flag): kept
+- **D3** scan source-file first, stage only clean: kept in review, then
+  made opt-in via `--scan-secrets` (default off) by the 2026-05-10 perf
+  review
+- **D4** ~~three-state OK/DEGRADED/ERR verdict~~ COLLAPSED to OK/ERR per
+  Codex finding 7 (gbrain content_hash idempotency makes the third state
+  redundant)
+- **D5** ~~skip_reason field in state schema~~ DROPPED per Codex finding 7
+  (re-runs are cheap; no need for permanent skip-tracking)
+- **D6** trust gbrain's content_hash idempotency; drop bookkeeping
+  scaffolding (skip_reason, three-state, SIGTERM checkpoint)
+- **D7** per-file failure detection via `~/.gbrain/sync-failures.jsonl`
+  (byte-offset snapshot + appended-only read)
+- **D8** bundle 3 in-scope pre-existing fixes: F6 atomic saveState
+  (tmp+rename), F8 isolated-stage benchmark, F9 full-file sha256 hash
+  (no more 1MB cap)
+
+## Verified from gbrain source
+
+Four properties verified by reading `~/git/gbrain/src/`:
+
+- **Idempotency** at `core/import-file.ts:242-243, :478` — content_hash
+  check, skip if unchanged, overwrite if changed.
+- **Frontmatter parity** at `core/import-file.ts:228, 297, 410-422` —
+  title/type/tags honored; auto-inference only when frontmatter absent.
+- **Path-authoritative slug** at `core/sync.ts:260` (`slugifyPath`),
+  enforced at `core/import-file.ts:429`.
+- **Per-file failures surface** at `commands/import.ts:308-310`,
+  comment at `:28`: "callers can gate state advances" — the
+  intentional API for what D7 uses.
+
+## Performance: planned vs measured (post 2026-05-10 perf review)
+
+| Metric | Plan target | Measured | Verdict |
+|---|---|---|---|
+| Prepare phase on 5135 files | — | <10s | FAST |
+| `gbrain import` on 5135 files | — | >10 min | gbrain-side perf issue, filed |
+| Loop / hang (original bug) | never | never | FIXED |
+| Memory ingest exits null on SIGTERM | no | no — state writes succeed; child gbrain dies with parent | FIXED |
+| FILE_TOO_LARGE blocks last_commit | no | no — failed paths excluded via D7 | FIXED |
+
+**Initial perf miss + correction.** The first cold-run measurement
+(~12 min) was dominated by 1841 sequential gitleaks subprocess spawns
+at ~256ms each — a redundant security gate. The cross-machine
+exfiltration boundary is `gstack-brain-sync` (bin/gstack-brain-sync:78-110,
+regex-based secret scan on staged diff before `git commit`). Scanning
+every source file before ingest into a LOCAL PGLite doesn't change
+exposure — the secret already lives on disk in plaintext. We made
+per-file gitleaks opt-in via `--scan-secrets`. Default is off. That
+cut the prepare phase from ~12 min to under 10 seconds.
+
+The remaining cold-run cost is `gbrain import` itself, which scales
+worse than linear on large staging dirs (10s for 501 files; >10 min
+for 5131). That's a gbrain-side perf issue, not gstack architecture.
+Filed as a TODO; the fix likely lives in gbrain's content_hash check
+loop or auto-link reconciliation phase.
+
+## F9 hash migration (one-time cliff)
+
+F9 switched `fileSha256` from a 1MB-capped hash to full-file. Existing state
+entries from before this change carry the old 1MB-capped hash. For any file
+whose mtime hasn't changed, `fileChangedSinceState` returns false at the
+mtime check and the new hash is never computed — so unchanged files behave
+identically. For any file whose mtime DOES change after upgrade, the
+full-file hash is recomputed and (correctly) treated as changed, then
+re-imported. The `gbrain doctor` probe report's `updated_count` may show
+inflated numbers on the first run post-upgrade because every touched file
+crosses the algorithm boundary. No data loss, but worth knowing.
+
+## Follow-ups (filed as TODOs)
+
+1. **gbrain import perf on large dirs.** Investigate why 5131 files
+   take >10 min when 501 take 10s. Likely culprits: N+1 SQL for
+   `getPage(slug)` content_hash check, per-page auto-link reconciliation,
+   FTS index updates without batching. Lives in gbrain, not gstack.
+2. **Optional: source-file changed-detection cache.** Even with the
+   prepare phase fast, walking 5135 files takes some time. Caching
+   the "no changes since last successful import" state at the
+   batch level (not per-file) would skip the prepare phase entirely
+   on a no-op incremental run.
+
+## Problem
+
+`/sync-gbrain` memory stage takes 35 minutes on a fresh PGLite and exits null,
+losing all progress. Subsequent runs redo the same 35 minutes. Observed in
+two consecutive runs (gbrain 0.30.0 broken-postgres run: 712s exit-null;
+gbrain 0.31.2 PGLite run: 2100s exit-null with 501 pages actually persisted).
+
+## Root cause (from /investigate)
+
+Two compounding bugs in `bin/gstack-memory-ingest.ts`:
+
+1. **Subprocess-per-file architecture.** The ingest loop at line 911 walks
+   1,841 files in `~/.gstack/projects/` and spawns two subprocesses per file:
+   - `gitleaks detect --no-git --source <path>` — 46ms cold start (`lib/gstack-memory-helpers.ts:157`)
+   - `gbrain put <slug>` — 329ms cold start (`bin/gstack-memory-ingest.ts:823`)
+   - Per-file floor: 375ms × 1841 = 690s (11.5 min) of pure subprocess startup
+     before any actual work happens.
+
+2. **Kill-no-save timeout.** Orchestrator at `bin/gstack-gbrain-sync.ts:442`
+   enforces a 35-min timeout. When it fires, `spawnSync` returns
+   `result.status === null`, the child gets SIGTERM, and the in-memory
+   ingest state never flushes to `~/.gstack/.transcript-ingest-state.json`.
+   Next run starts from the same un-progressed state — explains the
+   redo-everything pattern.
+
+## Numbers from the field
+
+| Metric | Value | Source |
+|---|---|---|
+| Files in walkAllSources | 1,841 | `find ~/.gstack/projects -type f \( -name "*.md" -o -name "*.jsonl" \)` |
+| `gbrain put` cold start | 329ms | `time (echo "test" \| gbrain put _bench)` |
+| `gitleaks detect` cold start | 46ms | `time gitleaks detect --no-git --source <small-file>` |
+| Theoretical floor (subprocess only) | 690s / 11.5 min | 375ms × 1841 |
+| Observed run time | 2100s / 35 min | matches orchestrator timeout exactly |
+| Pages actually persisted | 501 | gbrain sources list page_count |
+| PGLite growth during run | 290 → 386 MB | `du -sh ~/.gbrain/brain.pglite` |
+
+## Proposed architecture
+
+Replace the per-file subprocess loop with a **prepare-then-batch** pipeline:
+
+```
+walkAllSources(ctx)
+  → prepareStage (in-process, fast):
+       parse transcripts/artifacts
+       build PageRecord with custom YAML frontmatter
+       gitleaks scan (single subprocess on staging dir)
+       write prepared .md to staging dir
+  → gbrain import <staging-dir> --no-embed (single subprocess)
+  → flush state file with all successes
+  → cleanup staging dir
+```
+
+### Why `gbrain import <dir>` is the right batch path
+
+- Already shipped in gbrain CLI (verified: `gbrain --help` shows `import <dir> [--no-embed]`).
+- Walks dir in-process inside gbrain's own runtime — no subprocess fan-out.
+- Honors gbrain's batch-size and embedding-batch tuning.
+- gbrain v0.31.2 import did 501 pages + 2906 chunks in 10 seconds during the
+  observed run; the slow part was OUR per-file `gbrain put` loop above it.
+
+### What we keep that the current code does right
+
+- **Custom YAML frontmatter injection** (title, type, tags) — preserved by
+  writing prepared .md files with frontmatter into the staging dir.
+- **Secret scanning** — preserved, but moved to ONE `gitleaks detect --source <staging-dir>`
+  call after prepare, before import. Files with findings get redacted or
+  excluded; staging dir guarantees gitleaks sees only the prepared content,
+  not internal gbrain state.
+- **Partial-transcript detection** — preserved in prepare stage; partial
+  files still get a `partial: true` field in frontmatter.
+- **Unattributed-transcript filtering** — preserved in prepare stage.
+- **Per-file mtime + sha256 state tracking** — preserved; the prepare stage
+  records what got staged, the import-success result records what landed.
+- **Incremental mode** — `fileChangedSinceState` check stays at the top of
+  the prepare loop.
+
+## Migration steps
+
+### Step 1: extract `preparePages` from current ingest loop
+
+Take everything in `ingestPass` (lines 899-988 of `bin/gstack-memory-ingest.ts`)
+between the walk and the `gbrainPutPage` call. Move into a new function
+`preparePages(args, ctx, state) → { staged: PreparedPage[], skipped, failed }`.
+
+Output: list of `{ slug, body, source_path, mtime_ns, sha256, partial }`
+where `body` is the full markdown including frontmatter.
+
+### Step 2: add staging dir writer
+
+Pure function: `writeStaged(prepared, stagingDir) → { written, errors }`.
+Filename: `${slug}.md`. Idempotent overwrite.
+
+Staging dir lifecycle:
+- Created at `~/.gstack/.staging-ingest-${pid}-${ts}/`
+- Cleaned in `finally` block, even on SIGTERM
+- One staging dir per ingest pass — never reused across runs
+
+### Step 3: single gitleaks pass
+
+Replace per-file `secretScanFile(path)` calls with one call after prepare:
+`gitleaks detect --no-git --source <staging-dir> --report-format json --report-path -`.
+
+Parse JSON output, build `Map<slug, findings[]>`. Files with findings get
+removed from staging dir before import (or sanitized in place per existing
+redaction policy in `lib/gstack-memory-helpers.ts`).
+
+### Step 4: replace `gbrainPutPage` loop with single import call
+
+```typescript
+const importResult = spawnSync("gbrain", ["import", stagingDir], {
+  stdio: ["ignore", "pipe", "inherit"], // pipe stdout so the summary line can be parsed
+  timeout: 30 * 60 * 1000, // generous; whole batch
+});
+```
+
+Parse stdout for the `Import complete` line and the `failed` count.
+
+### Step 5: persist state on partial success
+
+If gbrain import reports `imported=N, failed=M`, save state for the N
+successful slugs (not all of them). Failures stay un-state'd so they retry
+next run, but successes don't redo.
+
+### Step 6: SIGTERM handler in `gstack-memory-ingest.ts`
+
+Wrap `main()` in:
+```typescript
+let interrupted = false;
+const flush = () => {
+  if (interrupted) return;
+  interrupted = true;
+  saveState(state); // best-effort flush of whatever's accumulated
+  cleanupStagingDir();
+  process.exit(143);
+};
+process.on("SIGTERM", flush);
+process.on("SIGINT", flush);
+```
+
+This unblocks the kill-no-save bug independently — even if the batch import
+runs over the orchestrator timeout, state from the prepare stage survives.
+
+### Step 7: orchestrator update
+
+In `bin/gstack-gbrain-sync.ts:444`:
+- Change `result.status === 0` to `result.status === 0 || (parsedSummary.imported > 0 && parsedSummary.imported >= parsedSummary.skipped + parsedSummary.failed)`.
+  Treat partial success (most pages imported) as OK, not ERR.
+- Surface `failed_count` and `partial_blockers` in the stage summary so the
+  user sees `Memory ... OK 487/501 imported (14 FILE_TOO_LARGE)` instead
+  of `ERR exited null`.
+
+### Step 8: handle FILE_TOO_LARGE specifically
+
+When gbrain reports FILE_TOO_LARGE, log to a new
+`~/.gstack/.ingest-skip-list.json` so the next prepare stage skips that file
+entirely. Avoids re-staging a file that will always fail. User can review
+the skip list with a new `gstack-memory-ingest --skip-list` flag.
+
+## Test plan
+
+1. **Unit (free, runs in `bun test`):**
+   - `preparePages` against fixture corpus of 50 files: assert YAML correct,
+     partial detection works, unattributed filtered.
+   - `writeStaged` overwrite idempotency.
+   - SIGTERM handler flush behavior using a child-process test harness.
+
+2. **Integration (free, runs in `bun test`):**
+   - End-to-end: prepare → gitleaks → gbrain import on a temp PGLite,
+     assert page_count matches imported count.
+   - Partial-success path: inject a deliberate FILE_TOO_LARGE; assert
+     successes still state'd, failure logged to skip list.
+   - State preservation across SIGTERM: spawn ingest, kill at midpoint,
+     restart, assert resumed state.
+
+3. **Benchmark gate (periodic, paid):**
+   - Cold run on 1841-file fixture: assert under 8 min.
+   - Incremental run (no changes): assert under 60 sec.
+   - Test fixture: copy of `~/.gstack/projects/` snapshot for repeatable timing.
+
+## Rollback strategy
+
+- New `--legacy-ingest` flag on `gstack-memory-ingest` keeps the old
+  per-file path callable for one release cycle.
+- If batch path regresses on a real corpus, set
+  `gstack-config set memory_ingest_path legacy` to revert without redeploy.
+- Remove flag + legacy path one minor version after confirming batch is stable.
+
+## Risks & open questions for plan-eng-review
+
+1. **gbrain import idempotency on overlapping slugs.** If a previous run
+   wrote slug X to PGLite with old content, does `gbrain import` of
+   updated-X overwrite or duplicate? Need to test before relying on it.
+
+2. **Frontmatter injection inside `gbrain import` parser.** Current code
+   knows how to inject title/type/tags into existing frontmatter blocks
+   (line 794-821). Does `gbrain import` honor those fields the same way
+   `gbrain put` does? Verify in unit test.
+
+3. **Staging dir disk pressure.** 1841 files × avg ~50KB = ~92MB of
+   staging .md content. Acceptable on dev machines but worth knowing.
+   Alternative: stream prepared content to a tar piped to import (if gbrain
+   supports it) — likely not, ignore for V1.
+
+4. **Cross-worktree concurrency.** `~/.gstack/.staging-ingest-${pid}-${ts}/`
+   is pid-namespaced so two concurrent /sync-gbrain runs don't collide.
+   But the orchestrator already holds a lock at `~/.gstack/.sync-gbrain.lock`
+   so this is belt-and-suspenders. Keep it.
+
+5. **The "memory ingest exited null" message.** After this change, the
+   orchestrator might still see status=null on real OOM kills or SIGKILL.
+   Should the verdict block be more honest? E.g.,
+   `ERR memory: killed by signal SIGTERM at 35:00 (timeout)`.
+
+6. **Should we deprecate `gbrain put` for memory entirely?** The legacy
+   path exists for V1.5's `put_file` migration plan. With batch import
+   working, do we still need single-page put as a fallback for ad-hoc
+   ingestion? Probably yes (for `~/.gstack/.transcript-ingest-state.json`
+   updates triggered outside the orchestrator), but worth confirming.
+
+## What this isn't
+
+- Not a gbrain CLI change. All work is in gstack.
+- Not a CLAUDE.md voice/UX change.
+- Not a new user-facing feature. CHANGELOG entry will read: "Memory ingest
+  is ~10× faster on cold runs and survives interruption."
+
+## Acceptance criteria
+
+- Cold `/sync-gbrain` on 1841 files completes in under 8 minutes.
+- Incremental `/sync-gbrain` (no file changes) completes in under 60 seconds.
+- SIGTERM mid-run flushes state; next run resumes without redoing
+  successfully-imported files.
+- FILE_TOO_LARGE failures don't block sync.last_commit advancement.
+- All existing test fixtures (transcripts, learnings, design-docs, ceo-plans)
+  ingest correctly with full frontmatter.
+- No regression on partial-transcript or unattributed-transcript handling.
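Reviewer sketch, not part of the commit: the `tmp+rename` pattern that F6 in the design doc names for `saveState` could look roughly like this. The `IngestState` shape, the `saveStateAtomic` name, and the temp-file naming are assumptions for illustration; the real `saveState` body is not shown in this diff.

```typescript
import { mkdtempSync, readFileSync, readdirSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical state shape, loosely based on the fields the design doc lists.
type IngestState = Record<string, { mtime_ns: string; sha256: string }>;

function saveStateAtomic(statePath: string, state: IngestState): void {
  // Write the temp file next to the target so both live on the same
  // filesystem; rename is only atomic within one filesystem.
  const tmpPath = `${statePath}.tmp-${process.pid}`;
  writeFileSync(tmpPath, JSON.stringify(state, null, 2));
  // Readers see either the old file or the complete new one, never a partial write.
  renameSync(tmpPath, statePath);
}

// Demo in a throwaway directory.
const stateDir = mkdtempSync(join(tmpdir(), "ingest-state-"));
const statePath = join(stateDir, "state.json");
saveStateAtomic(statePath, { "a.md": { mtime_ns: "1", sha256: "abc" } });
saveStateAtomic(statePath, { "a.md": { mtime_ns: "2", sha256: "def" } });

const reloaded = JSON.parse(readFileSync(statePath, "utf8")) as IngestState;
const leftovers = readdirSync(stateDir); // the temp file should be gone after rename
```

This sketch does not call `fsync`, so it guards against the process dying mid-write rather than against power loss; whether the real implementation goes further is not stated in the source.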
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "gstack",
-  "version": "1.31.0.0",
+  "version": "1.33.0.0",
  "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
  "license": "MIT",
  "type": "module",