gstack/test
Garry Tan 3df8ea8695 security: harden migration + context-save after adversarial review
Adversarial review (Claude + Codex, both high confidence) identified 6
critical production-harm findings in the /ship pre-landing pass.
All folded in.

Migration v1.0.1.0.sh hardening:
- Add explicit `[ -z "${HOME:-}" ]` guard. HOME="" survives set -u and
  expands paths to /.claude/skills/... which could hit absolute paths
  under root/containers/sudo-without-H.
- Add python3 fallback inside resolve_real() (was missing; broken
  symlinks silently defeated ownership check).
- Ownership-guard Shape 2 (~/.claude/skills/gstack/checkpoint/). Was
  unconditional rm -rf. Now: if symlink, check target resolves inside
  gstack; if regular dir, check realpath resolves inside gstack. A
  user's hand-edited customization or a symlink pointing outside gstack
  is preserved with a notice.
- Use `rm --` and `rm -r --` consistently to resist hostile basenames.
- Use `find -type f -not -name .DS_Store -not -name ._*` instead of
  `ls -A | grep`. macOS sidecars no longer mask a legit prefix-mode
  install. Strip sidecars explicitly before removing the dir.

context-save/SKILL.md.tmpl:
- Sanitize title in bash, not LLM prose. Allowlist [a-z0-9.-], cap 60
  chars, default to "untitled". Closes a prompt-injection surface where
  `/context-save $(rm -rf ~)` could propagate into subsequent commands.
- Collision-safe filename. If ${TIMESTAMP}-${SLUG}.md already exists
  (same-second double-save with same title), append a 4-char random
  suffix. The skill contract says "saved files are append-only" — this
  enforces it. Silent overwrite was a data-loss bug.

context-restore/SKILL.md.tmpl:
- Cap `find ... | sort -r` at 20 entries via `| head -20`. A user with
  10k+ saved files no longer blows the context window just to pick one.
  /context-save list still handles the full-history listing path.

test/skill-e2e-autoplan-dual-voice.test.ts:
- Filter transcript to tool_use / tool_result / assistant entries
  before matching, so prompt-text mentions of "plan-ceo-review" don't
  force the reachedPhase1 assertion to pass. Phase-1 assertion now
  requires completion markers ("Phase 1 complete", "Phase 2 started"),
  not mere name occurrence.
- claudeVoiceFired now requires JSON evidence of an Agent tool_use
  (name:"Agent" or subagent_type field), not the literal string
  "Agent(" which could appear anywhere.
- codexVoiceFired now requires a Bash tool_use with a `codex exec/review`
  command string, not prompt-text mentions.

All SKILL.md files regenerated. Golden fixtures updated. bun test: 0
failures across 80+ targeted tests and the full suite.

Review source: /ship Step 11 adversarial pass (claude subagent + codex
exec). Same findings independently surfaced by both reviewers — this is
cross-model high confidence.
2026-04-18 23:28:12 +08:00
..
fixtures merge: origin/main v1.0.0.0 into garrytan/fix-checkpoints 2026-04-18 17:24:03 +08:00
helpers merge: origin/main v1.0.0.0 into garrytan/fix-checkpoints 2026-04-18 17:24:03 +08:00
analytics.test.ts feat: safety hook skills + skill usage telemetry (v0.7.1) (#189) 2026-03-18 23:57:59 -05:00
audit-compliance.test.ts fix: security audit round 2 (v0.13.4.0) (#640) 2026-03-29 22:46:33 -06:00
builder-profile.test.ts feat: relationship closing — office-hours adapts to repeat users (v0.16.2.0) (#937) 2026-04-08 22:21:28 -10:00
codex-e2e.test.ts feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425) 2026-03-23 23:05:22 -07:00
codex-hardening.test.ts codex + Apple Silicon hardening wave (v0.18.4.0) (#1056) 2026-04-18 12:30:54 +08:00
diff-scope.test.ts feat: Review Army — parallel specialist reviewers for /review (v0.14.3.0) (#692) 2026-03-30 22:07:50 -06:00
explain-level-config.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
gemini-e2e.test.ts feat: Confusion Protocol, Hermes + GBrain hosts, brain-first resolver (v0.18.0.0) (#1005) 2026-04-16 10:41:38 -07:00
gen-skill-docs.test.ts codex + Apple Silicon hardening wave (v0.18.4.0) (#1056) 2026-04-18 12:30:54 +08:00
global-discover.test.ts fix: close redundant PRs + friendly error on all design commands (v0.15.8.1) (#817) 2026-04-05 02:02:06 -07:00
gstack-developer-profile.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
gstack-question-log.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
gstack-question-preference.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
hook-scripts.test.ts feat: safety hook skills + skill usage telemetry (v0.7.1) (#189) 2026-03-18 23:57:59 -05:00
host-config.test.ts community wave: 6 PRs + hardening (v0.18.1.0) (#1028) 2026-04-17 00:45:13 -07:00
jargon-list.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
learnings-injection.test.ts fix: community security wave — 8 PRs, 4 contributors (v0.15.13.0) (#847) 2026-04-06 00:47:04 -07:00
learnings.test.ts feat: GStack Learns — per-project self-learning infrastructure (v0.13.4.0) (#622) 2026-03-29 17:02:01 -06:00
migration-checkpoint-ownership.test.ts merge: origin/main v1.0.0.0 into garrytan/fix-checkpoints 2026-04-18 17:24:03 +08:00
openclaw-native-skills.test.ts community wave: 6 PRs + hardening (v0.18.1.0) (#1028) 2026-04-17 00:45:13 -07:00
plan-tune.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
readme-throughput.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
relink.test.ts fix: headed browser auto-shutdown + disconnect cleanup (v0.18.1.0) (#1025) 2026-04-16 15:39:44 -07:00
review-log.test.ts fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552) 2026-03-26 23:21:27 -06:00
setup-codesign.test.ts codex + Apple Silicon hardening wave (v0.18.4.0) (#1056) 2026-04-18 12:30:54 +08:00
skill-e2e-autoplan-dual-voice.test.ts security: harden migration + context-save after adversarial review 2026-04-18 23:28:12 +08:00
skill-e2e-bws.test.ts fix: cookie picker auth token leak (v0.15.17.0) (#904) 2026-04-08 10:10:13 -07:00
skill-e2e-cso.test.ts feat: /cso v2 — infrastructure-first security audit (v0.11.6.0) (#384) 2026-03-23 06:57:22 -07:00
skill-e2e-deploy.test.ts feat: /land-and-deploy first-run dry run + staging-first + trust ladder (v0.12.2.0) (#518) 2026-03-26 11:08:31 -07:00
skill-e2e-design.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-learnings.test.ts feat: recursive self-improvement — operational learning + full skill wiring (v0.13.8.0) (#647) 2026-03-31 23:08:22 -06:00
skill-e2e-plan-tune.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
skill-e2e-plan.test.ts test: E2E tests for plan review report and Codex offering (v0.11.15.0) (#449) 2026-03-24 07:30:24 -07:00
skill-e2e-qa-bugs.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-qa-workflow.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-review-army.test.ts feat: Review Army — parallel specialist reviewers for /review (v0.14.3.0) (#692) 2026-03-30 22:07:50 -06:00
skill-e2e-review.test.ts feat: Confusion Protocol, Hermes + GBrain hosts, brain-first resolver (v0.18.0.0) (#1005) 2026-04-16 10:41:38 -07:00
skill-e2e-session-intelligence.test.ts tests: split checkpoint-save-resume into context-save + context-restore E2Es 2026-04-18 16:42:52 +08:00
skill-e2e-sidebar.test.ts feat: declarative multi-host platform + OpenCode, Slate, Cursor, OpenClaw (v0.15.5.0) (#793) 2026-04-04 15:32:20 -07:00
skill-e2e-workflow.test.ts refactor: extract TabSession for per-tab state isolation (v0.15.16.0) (#873) 2026-04-07 00:23:36 -07:00
skill-e2e.test.ts feat: recursive self-improvement — operational learning + full skill wiring (v0.13.8.0) (#647) 2026-03-31 23:08:22 -06:00
skill-llm-eval.test.ts feat: voice directive for all skills (v0.12.3.0) (#520) 2026-03-26 17:31:53 -06:00
skill-parser.test.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00
skill-routing-e2e.test.ts feat: Confusion Protocol, Hermes + GBrain hosts, brain-first resolver (v0.18.0.0) (#1005) 2026-04-16 10:41:38 -07:00
skill-validation.test.ts feat: context rot defense for /ship — subagent isolation + clean step numbering (v0.18.1.0) (#1030) 2026-04-16 23:14:03 -07:00
team-mode.test.ts feat: Confusion Protocol, Hermes + GBrain hosts, brain-first resolver (v0.18.0.0) (#1005) 2026-04-16 10:41:38 -07:00
telemetry.test.ts feat: community wave — 7 fixes, relink, sidebar Write, discoverability (v0.13.5.0) (#641) 2026-03-29 21:43:36 -06:00
timeline.test.ts feat: Session Intelligence Layer — /checkpoint + /health + context recovery (v0.15.0.0) (#733) 2026-04-01 00:50:42 -06:00
touchfiles.test.ts feat: recursive self-improvement — operational learning + full skill wiring (v0.13.8.0) (#647) 2026-03-31 23:08:22 -06:00
uninstall.test.ts feat: community PRs — faster install, skill namespacing, uninstall, Codex fallback, Windows fix, Python patterns (v0.12.9.0) (#561) 2026-03-27 00:44:37 -06:00
upgrade-migration-v1.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
v0-dormancy.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00
worktree.test.ts feat: content security — 4-layer prompt injection defense for pair-agent (#815) 2026-04-06 14:41:06 -07:00
writing-style-resolver.test.ts feat: gstack v1 — simpler prompts + real LOC receipts (v1.0.0.0) (#1039) 2026-04-18 15:05:42 +08:00