Four exported functions in scripts/gen-skill-docs.ts handle every skill's
frontmatter rewrite at gen time but had zero unit tests. Both real bugs we
shipped (and fixed) on this branch lived in these functions:
- v1.45.0.0 design-consultation: when the first sentence exceeded 200 chars,
routing-prose extraction lost the entire tail, because it anchored on the
truncated lead, whose trailing "..." never substring-matched the original.
- v1.45.0.0 CI freshness: the root-skill key leaked the checkout directory
name ("seville-v3" vs "gstack"), and aggregate order followed
filesystem-iteration order.
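For reviewers, a minimal sketch of the first failure mode. This is an
illustration only, not the code in scripts/gen-skill-docs.ts: the helper
names (truncateLead, tailAfterLeadBuggy, tailAfterLeadFixed) and the way the
200-char limit is applied are assumptions made for the example.

```typescript
// Hypothetical reconstruction of the bug shape; all names are illustrative.
const MAX_LEAD = 200;

// Truncate an over-long first sentence and mark the cut with "...".
function truncateLead(sentence: string): string {
  return sentence.length > MAX_LEAD
    ? sentence.slice(0, MAX_LEAD - 3) + "..."
    : sentence;
}

// Buggy shape: look up the (possibly truncated) lead inside the original.
// A truncated lead ends in "...", which the original text does not contain,
// so indexOf returns -1 and the whole tail is dropped.
function tailAfterLeadBuggy(original: string, lead: string): string {
  const idx = original.indexOf(lead);
  return idx === -1 ? "" : original.slice(idx + lead.length).trim();
}

// Safer shape: anchor on the untruncated first sentence instead.
function tailAfterLeadFixed(original: string, firstSentence: string): string {
  return original.slice(firstSentence.length).trim();
}

const first = "A".repeat(250) + ".";
const original = first + " Use when asked to design a system.";
const lead = truncateLead(first);

console.log(JSON.stringify(tailAfterLeadBuggy(original, lead)));  // ""
console.log(JSON.stringify(tailAfterLeadFixed(original, first))); // "Use when asked to design a system."
```

The point of the sketch is only that any anchor derived from truncated text
cannot be searched for in the untruncated source.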
Both bug shapes are now regression-tested, alongside baseline coverage:
- splitCatalogDescription: 7 tests covering simple multi-line, >200-char
first sentence (design-consultation regression), voice-trigger
extraction, no-(gstack) handling, embedded periods (documents known
fallback), no-period fragments, and idempotency.
- buildTrimmedDescription: 3 tests.
- buildWhenToInvokeSection: 3 tests.
- applyCatalogTrim: 4 tests covering the standard rewrite, no-op for
already-short descriptions, the YAML-collision newline fix, and the
malformed-frontmatter null return.
- proactive-suggestions.json determinism: 3 tests asserting sorted keys,
root keyed as "gstack" (not the worktree directory), and no
timestamp/generated_at field that would flap CI freshness.
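The determinism properties in the last bullet can be sketched as below. This
is a sketch under assumptions, not the generator's actual code: the entry
shape, ROOT_KEY, and buildAggregate are hypothetical names, and the real
layout of proactive-suggestions.json may differ.

```typescript
// Hypothetical entry shape; the real generator's types are not shown here.
interface SkillEntry {
  dir: string;      // directory name as found on disk
  isRoot: boolean;  // true for the root skill
  text: string;     // suggestion text for the skill
}

// Fixed key for the root skill, so the checkout directory name never leaks.
const ROOT_KEY = "gstack";

function buildAggregate(entries: SkillEntry[]): string {
  const keyed: Array<[string, string]> = entries.map(
    (e): [string, string] => [e.isRoot ? ROOT_KEY : e.dir, e.text],
  );

  // Sort by key with a plain code-unit comparison, so output does not
  // depend on filesystem iteration order or on the machine's locale.
  keyed.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));

  // Non-numeric string keys keep insertion order when serialized.
  const out: Record<string, string> = {};
  for (const [key, text] of keyed) {
    out[key] = text;
  }

  // Deliberately no timestamp or generated_at field: the output is a pure
  // function of the inputs, so regenerating in CI produces identical bytes.
  return JSON.stringify(out, null, 2) + "\n";
}
```

Called with the same entries in any order, and from a checkout directory of
any name, this returns byte-identical JSON, which is the property the three
determinism tests assert.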
Test plan:
- bun test test/catalog-trim.test.ts: 20 pass, 0 fail
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>