gstack/browse
Garry Tan 63a56e6789
feat(security): add security-classifier.ts with TestSavantAI + Haiku
This module holds the ML classifier code that the compiled browse binary
cannot link (onnxruntime-node native dylib doesn't load from Bun compile's
temp extract dir — see CEO plan §"Pre-Impl Gate 1 Outcome"). It's imported
ONLY by sidebar-agent.ts, which runs as a non-compiled bun script.

Two layers:

L4 testsavant_content — TestSavantAI BERT-small ONNX classifier. First call
triggers a one-time 112MB model download to ~/.gstack/models/testsavant-small/
(files staged into the onnx/ layout transformers.js v4 expects). Classifies
page snapshots and tool outputs for indirect prompt injection + jailbreak
attempts. On benign-corpus dry-run: Wikipedia/HN/Reddit/tech-blog all score
SAFE 0.98+, attack text scores INJECTION 0.99+, Stack Overflow
instruction-writing now scores SAFE 0.98 on the shorter form (was 0.99
INJECTION on the longer form — instruction-density threshold). Ensemble
combiner downgrades single-layer high to WARN to cover this case.

L4b transcript_classifier — Claude Haiku reasoning-blind pre-tool-call scan.
Sees only {user_message, last 3 tool_calls}, never Claude's chain-of-thought
or tool results (those are how self-persuasion attacks leak). 2000ms hard
timeout. Fail-open on any subprocess failure so sidebar stays functional.
Gated by shouldRunTranscriptCheck() — only runs when another layer already
fired at >= LOG_ONLY, saving ~70% of Haiku spend.

Both layers degrade gracefully: load/spawn failures set status to 'degraded'
and return confidence=0. Shield icon reflects this via getClassifierStatus()
which security.ts's getStatus() composes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:03:36 +08:00
..
bin feat: multi-agent support — gstack works on Codex, Gemini CLI, and Cursor (v0.9.0) (#226) 2026-03-19 18:20:50 -07:00
scripts fix: ngrok Windows build + close CI error-swallowing gap (v0.18.0.1) (#1024) 2026-04-16 13:49:04 -07:00
src feat(security): add security-classifier.ts with TestSavantAI + Haiku 2026-04-19 19:03:36 +08:00
test test(security): make sidebar-agent destructure check regex-tolerant 2026-04-19 18:51:18 +08:00
PLAN-snapshot-dropdown-interactive.md fix: snapshot -i auto-detects dropdown/popover interactive elements (#845) 2026-04-05 22:57:45 -07:00
SKILL.md feat(browse): Puppeteer parity — load-html, screenshot --selector, viewport --scale, file:// (v1.1.0.0) (#1062) 2026-04-18 23:25:33 +08:00
SKILL.md.tmpl feat(browse): Puppeteer parity — load-html, screenshot --selector, viewport --scale, file:// (v1.1.0.0) (#1062) 2026-04-18 23:25:33 +08:00