bin/gstack-memory-ingest.ts: rewrite memory ingest around `gbrain import <dir>`
batch path. Replaces per-file gbrainPutPage loop (~470s of subprocess startup
per cold run) with prepare-then-batch:
walkAllSources
-> preparePages: mtime-skip + optional gitleaks (--scan-secrets) + parse
-> writeStaged: mkdir -p per slug segment, hierarchical (D1)
-> snapshot ~/.gbrain/sync-failures.jsonl byte offset
-> runGbrainImport (async spawn) -> parseImportJson
-> readNewFailures: read appended bytes, map back to source paths (D7)
-> state.sessions[path] = {...} for files NOT in failed set
-> saveStateAtomic (F6) + cleanupStagingDir
Architecture decisions:
D1 hierarchical staging dir
D2 cut over, deleted gbrainPutPage entirely
D3 source-file gitleaks made opt-in via --scan-secrets (gstack-brain-sync
owns the cross-machine boundary; per-file scan was redundant ~470s tax)
D4 OK/ERR verdict (no DEGRADED tri-state)
D5 unified state schema (no separate skip-list)
D6 trust gbrain content_hash idempotency (no skip_reason bookkeeping)
D7 byte-offset snapshot of sync-failures.jsonl + per-source mapping
F6 saveState uses tmp+rename atomic write
F9 fileSha256 removes 1MB cap; full-file hash (no more silent tail-edit
misses on long partial transcripts)
Signal handling: installSignalForwarder propagates SIGTERM/SIGINT to the
gbrain child process AND synchronously cleans the staging dir before
process.exit. Pre-fix, orchestrator timeouts left gbrain processes
orphaned holding the PGLite write lock (observed: 15-hour-CPU-time
orphan still alive a day later).
parseImportJson returns null on unparseable output (treated as ERR by
caller) instead of silently zeroing through.
gbrainAvailable() probes for the `import` subcommand instead of `put`.
Plan + review chain at /Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>