test: refresh ship golden baselines + gbrain-detection union after carves

Two follow-ups the carve commits should have carried (caught by the full suite,
missed by targeted subsets):
- ship golden baselines (claude/codex/factory) regenerated: the preamble CJK
  trim (v1.58) changed ship's always-loaded AskUserQuestion block.
- gbrain-detection-override probes the office-hours skeleton+section union:
  GBRAIN_SAVE_RESULTS moved into sections/design-and-handoff.md when office-hours
  was carved, so the detection assertions now check both files.
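Why the second bullet needs a union rather than a single-file probe can be sketched as follows. This is a minimal illustration, not the test itself: the snapshot contents are hypothetical stand-ins for the generated files, and only the two paths and the `## Save Results to Brain` heading come from this commit.

```typescript
// Sketch only: the snapshot contents below are hypothetical stand-ins,
// not the real generated skill files.
type Snapshot = Map<string, string>;

const SKELETON = 'office-hours/SKILL.md';
const SECTION = 'office-hours/sections/design-and-handoff.md';

// Concatenate both probe files; a file missing from the snapshot counts as empty.
const probeUnion = (snap: Snapshot): string =>
  (snap.get(SKELETON) ?? '') + '\n' + (snap.get(SECTION) ?? '');

// Hypothetical post-carve snapshot: the save block lives in the section file.
const carved: Snapshot = new Map([
  [SKELETON, '# office-hours (skeleton placeholder)\n'],
  [SECTION, '## Save Results to Brain\n'],
]);

// Probing the skeleton alone no longer sees the block after the carve.
const skeletonOnly = carved.get(SKELETON) ?? '';
const skeletonHasBlock = skeletonOnly.includes('## Save Results to Brain');

// Probing the union finds it wherever it landed.
const unionHasBlock = probeUnion(carved).includes('## Save Results to Brain');

// With neither file present, the union is just the joining newline.
const emptyUnion = probeUnion(new Map());
```

Because the negative assertions (`not.toContain`) run against the same union, a suppressed block is checked in both files as well, so it cannot pass merely because it moved.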

Full `bun test` green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Garry Tan 2026-06-01 23:22:34 -07:00
parent dce715ae23
commit b0a6977c3f
GPG Key ID: C1F69E85C74EFE1D
4 changed files with 28 additions and 62 deletions


@@ -367,25 +367,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr
 **Full rule + worked examples + Hold/dependency semantics:** see
 `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4.
-**Non-ASCII characters — write directly, never \u-escape.** When any
-string field (question, option label, option description) contains
-Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
-the literal UTF-8 characters in the JSON string. **Never escape them
-as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
-and passes characters through unchanged. Manually escaping requires
-recalling each codepoint from training, which is unreliable for long
-CJK strings — the model regularly emits the wrong codepoint (e.g.
-writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
-actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
-The trigger is long, multi-line questions with hundreds of CJK
-characters: that is exactly when reflexive escaping kicks in and
-exactly when miscoding is most damaging. Long ≠ escape. Keep
-characters literal.
-Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
-Right: `"question": "請選擇管理工具"`
-Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+**Non-ASCII characters — write directly, never \u-escape.** When any string
+field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text,
+emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is
+UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`,
+`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see
+`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK.
 ### Self-check before emitting
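
The non-ASCII rule in the hunk above can be illustrated with the standard `JSON` APIs. This is a sketch under the assumption that the payload is serialized with `JSON.stringify`; it is not gstack code, and the variable names are made up for the example. It shows that literal CJK text survives serialization untouched, while a hand-written `\u` escape decodes to the intended character only if the codepoint was recalled correctly.

```typescript
// Sketch only: illustrates the rule with standard JSON APIs; it is not gstack code.
const label = '請選擇管理工具';

// JSON.stringify leaves CJK characters literal; it does not emit \uXXXX for them.
const encoded = JSON.stringify({ question: label });
const containsUnicodeEscape = /\\u[0-9a-fA-F]{4}/.test(encoded);

// Literal text round-trips unchanged.
const roundTripped: string = JSON.parse(encoded).question;

// A hand-written escape decodes correctly only when the codepoint is right.
const correctEscape: string = JSON.parse('"\\u7ba1"'); // 管 is U+7BA1
const wrongEscape: string = JSON.parse('"\\u3103"'); // some other codepoint, not 管
```

Nothing in the serializer flags the wrong escape: both strings parse cleanly, which is why the rule forbids manual escaping instead of relying on validation.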


@@ -353,25 +353,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr
 **Full rule + worked examples + Hold/dependency semantics:** see
 `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4.
-**Non-ASCII characters — write directly, never \u-escape.** When any
-string field (question, option label, option description) contains
-Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
-the literal UTF-8 characters in the JSON string. **Never escape them
-as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
-and passes characters through unchanged. Manually escaping requires
-recalling each codepoint from training, which is unreliable for long
-CJK strings — the model regularly emits the wrong codepoint (e.g.
-writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
-actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
-The trigger is long, multi-line questions with hundreds of CJK
-characters: that is exactly when reflexive escaping kicks in and
-exactly when miscoding is most damaging. Long ≠ escape. Keep
-characters literal.
-Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
-Right: `"question": "請選擇管理工具"`
-Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+**Non-ASCII characters — write directly, never \u-escape.** When any string
+field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text,
+emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is
+UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`,
+`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see
+`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK.
 ### Self-check before emitting


@@ -355,25 +355,12 @@ so split chains are never AUTO_DECIDE-eligible — the user's option set is sacr
 **Full rule + worked examples + Hold/dependency semantics:** see
 `docs/askuserquestion-split.md` in the gstack repo. Read on demand when N>4.
-**Non-ASCII characters — write directly, never \u-escape.** When any
-string field (question, option label, option description) contains
-Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text, emit
-the literal UTF-8 characters in the JSON string. **Never escape them
-as `\uXXXX`.** Claude Code's tool parameter pipe is UTF-8 native
-and passes characters through unchanged. Manually escaping requires
-recalling each codepoint from training, which is unreliable for long
-CJK strings — the model regularly emits the wrong codepoint (e.g.
-writes `\u3103` thinking it is 管 U+7BA1, but `\u3103` is
-actually ㄃, so the user sees `管理工具` rendered as `㄃3用箱`).
-The trigger is long, multi-line questions with hundreds of CJK
-characters: that is exactly when reflexive escaping kicks in and
-exactly when miscoding is most damaging. Long ≠ escape. Keep
-characters literal.
-Wrong: `"question": "請選擇\uXXXX\uXXXX\uXXXX\uXXXX"`
-Right: `"question": "請選擇管理工具"`
-Only JSON-mandatory escapes remain allowed: `\n`, `\t`, `\"`, `\\`.
+**Non-ASCII characters — write directly, never \u-escape.** When any string
+field contains Chinese (繁體/簡體), Japanese, Korean, or other non-ASCII text,
+emit the literal UTF-8 characters; never escape them as `\uXXXX` (the pipe is
+UTF-8 native, and manual escaping miscodes long CJK strings). Only `\n`,
+`\t`, `\"`, `\\` remain allowed. Full rationale + worked example: see
+`docs/askuserquestion-cjk.md`. Read on demand when a question contains CJK.
 ### Self-check before emitting


@@ -105,7 +105,12 @@ describe('gbrain detection override → gen-skill-docs', () => {
 // Single skill probe is enough to assert the override pipeline. The
 // resolver unit test (test/resolvers-gbrain-save-results.test.ts) covers
 // per-skill metadata correctness already.
-const PROBE_FILES = ['office-hours/SKILL.md'];
+// office-hours is carved (v2 plan T9): GBRAIN_CONTEXT_LOAD stays in the
+// skeleton, GBRAIN_SAVE_RESULTS moved into sections/design-and-handoff.md.
+// Probe the union so the detection override is asserted wherever the blocks land.
+const PROBE_FILES = ['office-hours/SKILL.md', 'office-hours/sections/design-and-handoff.md'];
+const probeUnion = (snap: Map<string, string>): string =>
+(snap.get('office-hours/SKILL.md') ?? '') + '\n' + (snap.get('office-hours/sections/design-and-handoff.md') ?? '');
 test('with detected:true, Claude-host SKILL.md gains brain-aware blocks', () => {
 const { tmpHome, cleanup } = makeFixture(
@@ -117,7 +122,7 @@ describe('gbrain detection override → gen-skill-docs', () => {
 tmpHome,
 files: PROBE_FILES,
 });
-const content = snap.get('office-hours/SKILL.md')!;
+const content = probeUnion(snap);
 // GBRAIN_SAVE_RESULTS un-suppressed → resolver output rendered.
 expect(content).toContain('## Save Results to Brain');
@@ -141,7 +146,7 @@ describe('gbrain detection override → gen-skill-docs', () => {
 tmpHome,
 files: PROBE_FILES,
 });
-const content = snap.get('office-hours/SKILL.md')!;
+const content = probeUnion(snap);
 // GBRAIN_SAVE_RESULTS suppressed → no rendered block, no gbrain put line.
 expect(content).not.toContain('gbrain put "office-hours/');
@@ -162,7 +167,7 @@ describe('gbrain detection override → gen-skill-docs', () => {
 tmpHome,
 files: PROBE_FILES,
 });
-const content = snap.get('office-hours/SKILL.md')!;
+const content = probeUnion(snap);
 expect(content).not.toContain('gbrain put "office-hours/');
 } finally {
 cleanup();
@@ -183,7 +188,7 @@ describe('gbrain detection override → gen-skill-docs', () => {
 tmpHome,
 files: PROBE_FILES,
 });
-const content = snap.get('office-hours/SKILL.md')!;
+const content = probeUnion(snap);
 expect(content).not.toContain('gbrain put "office-hours/');
 expect(content).not.toContain('## Save Results to Brain');
 } finally {