Design — i18n-e2e-english-verification
Overview
Purpose: This spec produces a deterministic, re-runnable verification pass that proves (or disproves) the MiroFish 5-step pipeline runs cleanly in English, and posts a structured report on issue #10 with a pass / gap / manual-pending status per checklist item.
Users: i18n maintainers reviewing the epic (#11), and any future verifier re-running the audit after subsequent merges. The deliverable is read by humans on GitHub (issue comment) and re-run by humans (or CI in a future iteration) to confirm parity.
Impact: No production code is modified. The repository gains one new directory tree (.kiro/specs/i18n-e2e-english-verification/) containing the spec, the audit scripts, and the captured outputs. One GitHub comment is posted on #10. Up to four follow-up issues are filed.
Goals
- Static-audit backend/app, frontend/src, and locales/en.json for CJK characters; classify every match.
- Verify EN / ZH locale catalogue parity and flag suspect untranslated entries.
- Verify LLM-prompt assets respect the requested locale.
- Document locale-propagation gaps across Flask → Task → OASIS subprocess → ReACT agent.
- Post a single canonical comment on issue #10 with per-checklist statuses.
- File follow-up issues for every gap (no inline fixes).
- Make the audit re-runnable by capturing artefacts under .kiro/specs/.../audit/<commit-sha>/.
Non-Goals
- Patching any gap discovered (R7.3: strictly verification).
- Performance / load testing.
- Adding new locales beyond EN / ZH.
- Building a permanent CI guard (filed as a follow-up issue, not implemented here).
- Live UI / Docker walkthrough, captured as manual-pending in this run's report.
Boundary Commitments
This Spec Owns
- The audit scripts and the captured audit outputs under .kiro/specs/i18n-e2e-english-verification/audit/.
- The gap-report.md artefact and the comment body posted on issue #10.
- The grouping rule for follow-up issues (one per category: UI strings, backend log strings, backend LLM-prompt labels, suggested CI guard).
- The pass / gap / manual-pending / review-needed classification scheme.
Out of Boundary
- Any modification of files under backend/app/, frontend/src/, or locales/.
- Fixing the gaps the audit discovers; those land in their own follow-up issues.
- Live UI walkthrough, Docker run, or LLM execution.
- A permanent CI check — filed as a separate follow-up issue.
Allowed Dependencies
- git (for git grep, capturing HEAD sha).
- gh CLI (for the comment + follow-up issues; with documented fallback when unavailable).
- python3 (for the catalogue parity diff).
- The repo working tree at HEAD of the working branch.
Revalidation Triggers
- Any merge to main that touches locales/, backend/app/, or frontend/src/ invalidates the captured audit; a re-run should produce a new audit/<commit-sha>/ directory.
- A change to issue #10's checklist body (e.g. a new sub-item) requires re-mapping in gap-report.md.
- A change to the four follow-up categories (e.g. project decides to file one issue per file) requires re-running the issue-filing script with new grouping.
Architecture
Existing Architecture Analysis
- The MiroFish backend is Flask + Python Task workers + an OASIS subprocess (per CLAUDE.md). i18n surfaces are: vue-i18n for the SPA, locales/*.json shared by both ends, a backend logger that resolves keys per locale, and inline LLM prompts in backend/app/services/*.py.
- The verification pass does not hook into any of these; it reads files only. No Flask blueprint, no Task model, no Neo4j query.
Architecture Pattern & Boundary Map
graph TB
Verifier[Verifier shell entrypoint]
Audit[audit_cjk.sh]
Parity[check_parity.py]
Classify[classify.py]
Report[render_report.py]
Comment[post_comment.sh]
FollowUp[file_followups.sh]
Repo[Working tree]
Captures[audit slash sha slash]
GH[GitHub via gh CLI]
Verifier --> Audit
Verifier --> Parity
Audit --> Classify
Parity --> Classify
Classify --> Report
Report --> Captures
Report --> Comment
Report --> FollowUp
Audit --> Repo
Parity --> Repo
Comment --> GH
FollowUp --> GH
Architecture Integration:
- Selected pattern: Linear pipeline of read-only scripts that each emit a single artefact, composed by a thin shell entrypoint. No mutable state outside audit/<sha>/.
- Domain boundaries: audit_cjk.sh owns the raw grep; check_parity.py owns the catalogue diff; classify.py owns the four-class labels; render_report.py owns the comment body; post_comment.sh and file_followups.sh own GitHub side effects.
- Existing patterns preserved: Shell + Python script pair (matches the project's existing setup/run style); no new test runner, no new linter.
- New components rationale: Each script is single-purpose so failures (e.g. gh permission issues) are isolated and the pipeline can resume from the failed step.
- Steering compliance: No production-code touch (R7.3); 4-space indent in any committed Python; double quotes; snake_case; Bash scripts exit with a non-zero status on any uncaught error.
Technology Stack
| Layer | Choice / Version | Role in Feature | Notes |
|---|---|---|---|
| CLI / Audit runner | Bash 5+, git grep -P (PCRE) | Run the canonical CJK audit | \x{...} ranges require PCRE; git grep -E will fail on this regex (verified). |
| Static checks | Python 3.11 (project minimum per CLAUDE.md) | Catalogue parity + classification + report rendering | Standard library only; no new deps. |
| GitHub integration | gh CLI | Post the comment, file follow-ups | Falls back to audit/<sha>/PENDING-* files when missing. |
| Output formats | Plain text + Markdown | Captures + comment body | No HTML, no JSON beyond gh's own. |
File Structure Plan
Directory Structure
.kiro/specs/i18n-e2e-english-verification/
├── spec.json
├── requirements.md
├── gap-analysis.md
├── research.md
├── design.md
├── tasks.md
├── HANDOFF.md # only if implementation hits the 3-cycle remediation cap
└── audit/
├── scripts/
│ ├── run_audit.sh # entrypoint - chains the steps below
│ ├── audit_cjk.sh # git grep PCRE + bucket counts
│ ├── check_parity.py # locales/en.json vs zh.json key + identical-value diff
│ ├── classify.py # apply 4-class labels to grep matches
│ ├── render_report.py # produce gap-report.md + comment-body.md
│ ├── post_comment.sh # gh issue comment 10 with comment-body.md (or PENDING-*)
│ └── file_followups.sh # gh issue create per category (or PENDING-*)
└── <commit-sha>/ # captured outputs of one verification run
├── cjk-grep.txt # raw `git grep -nP ...` output
├── cjk-grep-bucketed.txt # the same, partitioned by top-level path
├── parity.txt # en/zh diff summary
├── classified.csv # match-by-match label
├── gap-report.md # the canonical structured report
├── comment-body.md # the markdown posted to issue #10
├── PENDING-issue-10-comment.md # only if gh comment failed
└── PENDING-followups/ # only if gh issue create failed
├── 01-frontend-ui-strings.md
├── 02-backend-log-strings.md
├── 03-backend-prompt-labels.md
└── 04-permanent-ci-guard.md
Modified Files
- (None.) The spec explicitly forbids touching production source.
System Flows
sequenceDiagram
participant V as Verifier
participant Run as run_audit.sh
participant FS as Working tree
participant GH as GitHub
V->>Run: bash run_audit.sh
Run->>FS: git grep -nP, git rev-parse HEAD
FS-->>Run: cjk-grep.txt + sha
Run->>FS: read locales json
FS-->>Run: en/zh dicts
Run->>Run: classify
Run->>FS: write audit slash sha slash artefacts
Run->>GH: gh issue comment 10
alt gh succeeds
GH-->>Run: comment URL
Run->>GH: gh issue create x N follow-ups
GH-->>Run: issue URLs
else gh fails
Run->>FS: write PENDING markdown to audit slash sha slash
end
Run-->>V: exit 0 success or exit 2 PENDING
Key flow decisions:
- The audit always writes the captured artefacts to disk first (idempotent, re-runnable). The GitHub side effects are the last steps so any earlier failure leaves a complete capture for inspection.
- A non-zero gh exit shifts the pipeline to PENDING mode rather than failing the whole run; the script exits 2 to flag "audit ran but GitHub side-effects didn't apply".
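The PENDING fallback described above can be sketched in Python as follows. This is a minimal illustration only: the real steps are shell scripts, and `run_side_effect`, `EXIT_OK`, and `EXIT_PENDING` are hypothetical names. A failing `gh` is simulated with a child Python process so the sketch needs no network access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

EXIT_OK, EXIT_PENDING = 0, 2  # exit codes named in this design


def run_side_effect(cmd, body, pending_path):
    """Run one GitHub side effect; on failure keep the body as a PENDING file."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True)
        ok = done.returncode == 0
    except FileNotFoundError:  # the executable (e.g. gh) is not installed
        ok = False
    if not ok:
        pending_path.write_text(body, encoding="utf-8")
    return ok


with tempfile.TemporaryDirectory() as tmp:
    pending = Path(tmp) / "PENDING-issue-10-comment.md"
    # Stand-in for a gh call that exits non-zero.
    failed_ok = run_side_effect(
        [sys.executable, "-c", "raise SystemExit(1)"], "comment body", pending
    )
    pending_written = pending.exists()
    # Stand-in for gh being absent from PATH (hypothetical binary name).
    missing_ok = run_side_effect(["no-such-binary-for-this-sketch"], "comment body", pending)

exit_code = EXIT_OK if (failed_ok and missing_ok) else EXIT_PENDING
```

Because the captured artefacts are written before any side effect runs, a failure here leaves the audit directory complete and only changes the exit code.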
Requirements Traceability
| Requirement | Summary | Components | Interfaces / Artefacts | Flows |
|---|---|---|---|---|
| 1.1 | Run canonical git grep | audit_cjk.sh | cjk-grep.txt | Audit step |
| 1.2 | Classify each match | classify.py | classified.csv | Audit step |
| 1.3 | Record file:line + step tag for gap | classify.py | classified.csv (step column) | Audit step |
| 1.4 | No file modifications during audit | run_audit.sh | scripts are read-only | - |
| 1.5 | en.json CJK = always gap | classify.py | hard rule in classifier | Audit step |
| 2.1 | Enumerate keys recursively | check_parity.py | parity.txt | Audit step |
| 2.2 | Missing-key gaps recorded | check_parity.py | parity.txt (missing-key block) | Audit step |
| 2.3 | EN catalogue CJK = gap | check_parity.py | parity.txt (cjk-in-en block) | Audit step |
| 2.4 | EN/ZH identical = review-needed | check_parity.py | parity.txt (identical-value block) | Audit step |
| 2.5 | No catalogue edits | check_parity.py | read-only stdlib JSON load | - |
| 3.1 | Enumerate prompt files | classify.py (heuristic: known files list) | gap-report.md Section 3 | - |
| 3.2 | Confirm locale-aware or EN-only | classify.py | gap-report.md Section 3 | - |
| 3.3 | Hard-coded ZH directive = gap | classify.py | classified.csv (category=backend-prompt-label) | - |
| 3.4 | #3, #4, #5 prompts post-merge check | classify.py | gap-report.md Section 3 | - |
| 4.1 | Identify handoff boundaries | render_report.py | gap-report.md Section 4 | - |
| 4.2 | Confirm explicit or re-derived locale | render_report.py | gap-report.md Section 4 | - |
| 4.3 | Silent default = gap | classify.py | classified.csv (category=propagation) | - |
| 4.4 | Backend logger EN under EN | classify.py | classified.csv (category=backend-log) | - |
| 5.1 | Comment lists every checklist item | render_report.py | comment-body.md | Comment-post |
| 5.2 | Each gap includes file:line + follow-up link | render_report.py | comment-body.md | Comment-post |
| 5.3 | manual-pending items state repro steps | render_report.py | comment-body.md | Comment-post |
| 5.4 | Comment includes raw audit (or path) | render_report.py | comment-body.md (path reference) | Comment-post |
| 5.5 | Post via gh issue comment 10 | post_comment.sh | comment-body.md | Comment-post |
| 6.1 | ZH covers every EN key | check_parity.py | (already passes per gap-analysis) | - |
| 6.2 | Locale-aware prompts symmetric | render_report.py | gap-report.md Section 6 | - |
| 6.3 | EN-only ZH value = review-needed | check_parity.py | parity.txt (identical-value block) | - |
| 6.4 | ZH regression filed as gap | classify.py | classified.csv | - |
| 7.1 | File issue per gap | file_followups.sh | gh issue create | Follow-up |
| 7.2 | Group by category | file_followups.sh | one body per category in PENDING-followups/ | Follow-up |
| 7.3 | No production-code edits | run_audit.sh | only writes under .kiro/specs/.../ | - |
| 7.4 | Label follow-ups i18n | file_followups.sh | gh issue create --label i18n | Follow-up |
| 7.5 | Fallback inline list when no gh | file_followups.sh | PENDING-followups/*.md | Follow-up |
| 8.1 | Capture raw output | run_audit.sh | audit/<sha>/ directory | Audit step |
| 8.2 | Preserve previous run | run_audit.sh | <sha> subdirectory naming | Audit step |
| 8.3 | Record HEAD sha | run_audit.sh | git rev-parse HEAD | Audit step |
| 8.4 | Idempotent re-run | run_audit.sh | re-running on same sha overwrites that sha's dir | Audit step |
Components and Interfaces
| Component | Domain | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|---|---|---|---|---|---|
| run_audit.sh | Verification pipeline | Compose the audit and route artefacts | 1.4, 7.3, 8.1, 8.2, 8.3, 8.4 | git (P0), python3 (P0), gh (P1) | Batch |
| audit_cjk.sh | Static audit | Run git grep -nP and bucket | 1.1, 1.5 | git (P0) | Batch |
| check_parity.py | Catalogue diff | Diff en/zh + identical-value heuristic | 2.1, 2.2, 2.3, 2.4, 2.5, 6.1, 6.3 | python3 stdlib (P0) | Batch |
| classify.py | Classification | Apply the 4-class label per match | 1.2, 1.3, 1.5, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 6.4 | cjk-grep.txt (P0), parity.txt (P0) | Batch |
| render_report.py | Report assembly | Produce gap-report.md + comment-body.md | 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.2 | classified.csv (P0) | Batch |
| post_comment.sh | GitHub side-effect | Post the comment on #10 | 5.5 | gh (P0), comment-body.md (P0) | Service |
| file_followups.sh | GitHub side-effect | Open follow-up issues | 7.1, 7.2, 7.4, 7.5 | gh (P0), PENDING-followups/* (P0) | Service |
Verification pipeline
run_audit.sh
| Field | Detail |
|---|---|
| Intent | Single shell entrypoint that runs every step in order and persists artefacts under audit/<commit-sha>/ |
| Requirements | 1.4, 7.3, 8.1, 8.2, 8.3, 8.4 |
Responsibilities & Constraints
- Must NOT modify any file outside .kiro/specs/i18n-e2e-english-verification/.
- Must capture HEAD sha before any other step (so the artefact path is set).
- Must exit 0 on full success (audit + GitHub side effects) and 2 on PENDING (audit succeeded, side effects didn't).
- Must be safely re-runnable on the same sha (overwriting that sha's directory is acceptable).
Dependencies
- Inbound: invoked manually by the verifier (bash run_audit.sh). Criticality: P0.
- Outbound: audit_cjk.sh, check_parity.py, classify.py, render_report.py, post_comment.sh, file_followups.sh. Criticality: P0 each.
- External: git, python3, gh (P1; fallback supported).
Contracts: Service [ ] / API [ ] / Event [ ] / Batch [x] / State [ ]
Batch / Job Contract
- Trigger: manual bash .kiro/specs/i18n-e2e-english-verification/audit/scripts/run_audit.sh.
- Input / validation: working tree at any commit; detached or dirty trees are not rejected. The audit reads tracked files only via git grep, so untracked files are ignored deliberately. Uncommitted edits to tracked files are still searched, because git grep without a tree argument reads the working-tree copy.
- Output / destination: .kiro/specs/i18n-e2e-english-verification/audit/<commit-sha>/.
- Idempotency & recovery: Re-running on the same sha overwrites that sha's directory. PENDING outputs survive across runs until a gh-enabled run replaces them.
Implementation Notes
- Integration: invoked by humans only — no CI hookup in this spec.
- Validation: confirm gh auth status before attempting comment/issue posts; on failure, branch to PENDING.
- Risks: shell quoting around the PCRE pattern ([\x{4e00}-\x{9fff}]); pass it to git grep -P as a single-quoted argument.
audit_cjk.sh
| Field | Detail |
|---|---|
| Intent | Run the canonical PCRE grep + per-bucket counts |
| Requirements | 1.1, 1.5 |
Responsibilities & Constraints
- Output: cjk-grep.txt (raw git grep -nP lines) and cjk-grep-bucketed.txt (one section per top-level path: backend/app, frontend/src, locales/en.json).
- Excludes binary file matches (e.g. .jpeg false positives).
Dependencies
- Inbound: run_audit.sh (P0).
- External: git 2.x (P0; must support -P for PCRE).
Contracts: Batch [x]
Batch / Job Contract
- Trigger: invoked by run_audit.sh.
- Input / validation: receives the target output directory as argv[1]; aborts if missing.
- Output / destination: cjk-grep.txt, cjk-grep-bucketed.txt in <sha>/.
- Idempotency & recovery: deterministic (same tree → same output).
Implementation Notes
- Integration: pure read-only against git.
- Validation: git --version precondition; abort with a clear error if PCRE unsupported.
- Risks: ripgrep is NOT used (avoids a hard rg dependency); git grep -P works only when git itself was built with PCRE support, which is why the validation step checks for it.
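The bucketing step can be sketched in Python as follows. This is an illustration under stated assumptions, not the script itself (audit_cjk.sh is a shell script): `bucket_lines`, `BUCKETS`, and `CJK` are hypothetical names, the sample file paths are invented, and the regex mirrors the [\x{4e00}-\x{9fff}] range quoted in this design.

```python
import re
from collections import defaultdict

# Same code-point range as the PCRE pattern [\x{4e00}-\x{9fff}].
CJK = re.compile(r"[\u4e00-\u9fff]")

# Top-level paths named in this design.
BUCKETS = ("backend/app", "frontend/src", "locales/en.json")


def bucket_lines(grep_lines):
    """Group `path:line:text` rows (git grep -n format) by top-level bucket."""
    buckets = defaultdict(list)
    for row in grep_lines:
        path, _, rest = row.partition(":")
        if not CJK.search(rest):
            continue  # defensive: keep only rows that really contain CJK
        for prefix in BUCKETS:
            if path == prefix or path.startswith(prefix + "/"):
                buckets[prefix].append(row)
                break
    return dict(buckets)


# Invented sample rows; real input would come from cjk-grep.txt.
sample = [
    "backend/app/services/foo.py:12:logger.info('开始')",
    "frontend/src/views/Home.vue:7:<span>图谱</span>",
    "backend/app/utils/bar.py:3:# plain ASCII comment",
]
result = bucket_lines(sample)
```

Matching on `prefix + "/"` rather than a bare prefix avoids counting a sibling path such as backend/app_old under backend/app.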
check_parity.py
| Field | Detail |
|---|---|
| Intent | Compare locales/en.json and locales/zh.json: key parity, CJK in EN, identical-value heuristic |
| Requirements | 2.1, 2.2, 2.3, 2.4, 2.5, 6.1, 6.3 |
Responsibilities & Constraints
- Recursively flattens nested-dict keys with dotted paths.
- Reports three blocks: missing-keys, cjk-in-en, identical-values.
- Treats values as review-needed only if (a) en value == zh value, (b) value is non-empty, (c) value is more than two ASCII words.
Dependencies
- Inbound: run_audit.sh (P0).
- External: json from Python stdlib (P0).
Contracts: Batch [x]
Batch / Job Contract
- Trigger: invoked by run_audit.sh with the <sha> directory as argv[1].
- Input / validation: reads locales/en.json and locales/zh.json from cwd (must be invoked from repo root); fails fast on JSON parse error.
- Output / destination: parity.txt in <sha>/.
- Idempotency & recovery: pure function of catalogue contents.
Implementation Notes
- Integration: invoked from repo root so relative paths resolve.
- Validation: parse-on-load, both files must be objects.
- Risks: the "more than two ASCII words" heuristic may produce noise; review-needed is intentionally a soft label, not a gap.
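The three checks above can be sketched as follows. This is a minimal illustration, assuming the catalogues are already loaded as nested dicts; `flatten` and `parity` are hypothetical names, the sample catalogues are invented, and the word-count regex is one possible reading of "more than two ASCII words".

```python
import re

CJK = re.compile(r"[\u4e00-\u9fff]")


def flatten(node, prefix=""):
    """Flatten nested dicts into {dotted.key.path: leaf_value}."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def parity(en, zh):
    """Return the three report blocks as sorted lists."""
    en_flat, zh_flat = flatten(en), flatten(zh)
    missing = sorted(
        [f"en-only: {k}" for k in en_flat.keys() - zh_flat.keys()]
        + [f"zh-only: {k}" for k in zh_flat.keys() - en_flat.keys()]
    )
    cjk_in_en = sorted(
        k for k, v in en_flat.items() if isinstance(v, str) and CJK.search(v)
    )
    identical = sorted(
        k
        for k, v in en_flat.items()
        if isinstance(v, str)
        and v  # (b) non-empty
        and zh_flat.get(k) == v  # (a) en value == zh value
        and len(re.findall(r"[A-Za-z]+", v)) > 2  # (c) more than two ASCII words
    )
    return {"missing-keys": missing, "cjk-in-en": cjk_in_en, "identical-values": identical}


# Invented sample catalogues.
en = {"nav": {"home": "Home", "help": "Open the help page"}, "title": "报告"}
zh = {"nav": {"home": "首页", "help": "Open the help page"}}
report = parity(en, zh)
```

In this sample, title is both missing from ZH and contains CJK in EN, while nav.help is identical in both catalogues and long enough to be flagged for review.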
classify.py
| Field | Detail |
|---|---|
| Intent | Apply the 4-class label (deliberate / gap / non-applicable / review-needed) and a category tag per match |
| Requirements | 1.2, 1.3, 1.5, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 6.4 |
Responsibilities & Constraints
- Reads cjk-grep.txt and parity.txt; emits classified.csv with columns: file, line, match, class, category, pipeline_step.
- Categories (closed set): frontend-ui-string, frontend-regex-parser, backend-docstring, backend-comment, backend-log, backend-prompt-label, propagation, catalogue-parity, binary-false-positive.
- Pipeline-step tags (closed set): Graph Build, Env Setup, Simulation, Report, Interaction, Logs, UI, n/a.
- Classification rules:
  - locales/en.json CJK → always gap / catalogue-parity / n/a (R1.5).
  - File path under frontend/src/views/ or frontend/src/components/ AND match is inside a string literal (heuristic: enclosed in '…' / "…" / `…`) → gap / frontend-ui-string.
  - Match inside a text.match(/.../) call in a .vue file → gap / frontend-regex-parser (cause: backend emits CJK).
  - Backend .py file, line starts with # or appears inside a triple-quoted docstring → deliberate (blocked by #7) / backend-docstring (or backend-comment); counted but not filed as a fresh follow-up since #7 already covers it.
  - Backend .py file, line contains logger., log., or print( and CJK in a string literal → gap / backend-log / appropriate step tag.
  - Backend .py file in services/{ontology,oasis_profile,simulation_config,report_agent}_generator.py and CJK appears inside an LLM-prompt context label (heuristic: a string literal not preceded by #) → gap / backend-prompt-label.
  - Binary files (e.g. .jpeg git grep matches): non-applicable / binary-false-positive.
  - Anything else: review-needed (forces a human look).
Dependencies
- Inbound: audit_cjk.sh, check_parity.py (P0).
- External: csv from Python stdlib.
Contracts: Batch [x]
Batch / Job Contract
- Trigger: invoked by run_audit.sh after the two preceding steps.
- Input / validation: cjk-grep.txt and parity.txt must exist in <sha>/.
- Output / destination: classified.csv.
- Idempotency & recovery: deterministic (same inputs → same csv).
Implementation Notes
- Integration: classification rules are heuristics, not a parser; correctness is bounded by careful regexes and an explicit "fallthrough = review-needed" rule.
- Validation: every input row produces an output row (no silent drops); a count-equality assertion runs at the end.
- Risks: false negatives (e.g. a Chinese log string that doesn't contain logger. on the same line); the review-needed fallthrough catches these.
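A first-match-wins rule chain with a review-needed fallthrough might look like the sketch below. It covers only a subset of the rules (catalogue, frontend string literal, backend comment, backend log); `classify`, `STRING_LITERAL`, and `LOG_CALL` are hypothetical names, the regexes are illustrative heuristics, and returning `None` as the fallthrough category is an assumption, since this design does not name one.

```python
import re

# A quoted literal ('…', "…" or `…`) that contains at least one CJK character.
STRING_LITERAL = re.compile(r"""(['"`]).*?[\u4e00-\u9fff].*?\1""")
LOG_CALL = re.compile(r"\b(?:logger|log)\.|\bprint\(")


def classify(path, text):
    """Return (class, category) for one grep match; first matching rule wins."""
    if path == "locales/en.json":
        return ("gap", "catalogue-parity")  # hard rule (R1.5)
    if path.startswith(("frontend/src/views/", "frontend/src/components/")):
        if STRING_LITERAL.search(text):
            return ("gap", "frontend-ui-string")
    if path.startswith("backend/app/") and path.endswith(".py"):
        if text.lstrip().startswith("#"):
            return ("deliberate", "backend-comment")
        if LOG_CALL.search(text) and STRING_LITERAL.search(text):
            return ("gap", "backend-log")
    # Fallthrough: anything no rule claims is routed to a human.
    return ("review-needed", None)


# Invented sample inputs.
examples = [
    classify("locales/en.json", '"title": "报告"'),
    classify("backend/app/services/x.py", "    # 初始化"),
    classify("backend/app/services/x.py", "logger.info('开始')"),
    classify("frontend/src/views/Home.vue", "const t = '图谱'"),
    classify("README.md", "中文"),
]
```

Because every path ends in a `return`, each input row yields exactly one label, which is what the count-equality assertion relies on.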
render_report.py
| Field | Detail |
|---|---|
| Intent | Produce gap-report.md and comment-body.md |
| Requirements | 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.2 |
Responsibilities & Constraints
- gap-report.md: Sections: Overview, Section 1 (static audit), Section 2 (parity), Section 3 (prompt verification), Section 4 (propagation), Section 5 (issue-#10 checklist mapping), Section 6 (ZH regression), Section 7 (follow-up plan).
- comment-body.md: Markdown comment for issue #10; mirrors the issue's checklist with pass / gap / manual-pending for each line, plus a "How to re-run" footer.
- Reads classified.csv and the issue body (snapshot at .ticket/10.md).
Dependencies
- Inbound: classify.py (P0), .ticket/10.md (P0).
- External: Python stdlib only.
Contracts: Batch [x]
Batch / Job Contract
- Trigger: run_audit.sh after classify.py.
- Input / validation: classified.csv and .ticket/10.md must exist.
- Output / destination: gap-report.md, comment-body.md in <sha>/.
- Idempotency & recovery: deterministic.
Implementation Notes
- Integration: the comment body must include a Run on commit <sha> header so the comment is traceable.
- Validation: confirm every issue-body checkbox has been mapped (count check).
- Risks: rendering CJK characters in markdown. Python 3.11 takes its default text encoding from the locale, so output files should be opened with an explicit encoding="utf-8"; comment body is verified to round-trip via gh.
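The checkbox count check and the commit header could be combined as in the sketch below. The output layout, the `render_checklist` name, and the sample checklist items are all assumptions made for illustration; the design fixes only that every checkbox is mapped and that a Run on commit <sha> header is present.

```python
import re

# Matches GitHub task-list lines such as "- [ ] item" or "- [x] item".
CHECKBOX = re.compile(r"^\s*- \[[ xX]\] (.+)$")


def render_checklist(issue_body, statuses, sha):
    """Mirror the issue's checklist, one status per item, in original order."""
    items = [
        m.group(1) for line in issue_body.splitlines() if (m := CHECKBOX.match(line))
    ]
    unmapped = [item for item in items if item not in statuses]
    if unmapped:
        # Count check: refuse to render a comment that skips a checklist item.
        raise ValueError(f"unmapped checklist items: {unmapped}")
    lines = [f"Run on commit {sha}", ""]
    lines += [f"- {item}: {statuses[item]}" for item in items]
    return "\n".join(lines) + "\n"


# Invented issue snapshot and statuses.
issue = "Checklist\n- [ ] Step 1 runs in English\n- [x] Logs are English\n"
statuses = {"Step 1 runs in English": "pass", "Logs are English": "gap"}
body = render_checklist(issue, statuses, "abc123")
```

Raising on an unmapped item keeps a changed issue checklist from silently producing an incomplete comment.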
post_comment.sh
| Field | Detail |
|---|---|
| Intent | Post comment-body.md as a comment on issue #10 |
| Requirements | 5.5 |
Responsibilities & Constraints
- Runs gh issue comment 10 --repo salestech-group/MiroFish --body-file <sha>/comment-body.md.
- On non-zero exit, copies the body to <sha>/PENDING-issue-10-comment.md and exits non-zero.
Dependencies
- External: gh (P0; degrades to PENDING when missing).
Contracts: Service [x]
Service Interface
post_comment.sh <sha-dir>
precondition: <sha-dir>/comment-body.md exists
postcondition (success): comment posted; URL printed to stdout
postcondition (failure): <sha-dir>/PENDING-issue-10-comment.md present; exit code 2
Implementation Notes
- Integration: must be the second-to-last step (so failures don't block the issue-filing fallback).
- Validation: parses gh's URL output and writes it to <sha>/comment-url.txt on success.
- Risks: PR-time rate limits (unlikely for a single comment).
file_followups.sh
| Field | Detail |
|---|---|
| Intent | Open one follow-up issue per gap category |
| Requirements | 7.1, 7.2, 7.4, 7.5 |
Responsibilities & Constraints
- Iterates <sha>/PENDING-followups/*.md (which render_report.py always writes; the ones whose category had zero gaps stay empty placeholders).
- For each non-empty body, runs gh issue create --repo salestech-group/MiroFish --title <title> --body-file <body> --label i18n.
- On gh failure for any single category, leaves the corresponding PENDING-followups/<n>-*.md in place and exits non-zero at the end (after attempting all categories).
Dependencies
- External: gh (P0; degrades to PENDING).
Contracts: Service [x]
Service Interface
file_followups.sh <sha-dir>
precondition: <sha-dir>/PENDING-followups/*.md exist (possibly empty placeholders)
postcondition (success): all non-empty bodies posted; URLs appended to <sha-dir>/followup-urls.txt; bodies removed from PENDING-followups/
postcondition (partial): URLs in followup-urls.txt for the ones that posted; the rest stay in PENDING-followups/; exit code 2
Implementation Notes
- Integration: must be the last step.
- Validation: post-hoc count check (gh URLs + remaining PENDING bodies = total categories).
- Risks: a category that the spec already considers covered (e.g. backend docstrings → blocked by #7) is not re-filed; the spec's category list is closed and excludes that case.
Data Models
Domain Model
The audit operates on three logical concepts:
- Match: a single line of git grep output, (file, line, raw_text).
- Classification: (match, class ∈ {deliberate, gap, non-applicable, review-needed}, category ∈ closed-set, pipeline_step ∈ closed-set).
- Follow-up: (category, title, body, status ∈ {posted, pending}, url?).
Invariant: every Match produces exactly one Classification; every Classification with class == gap belongs to exactly one Follow-up category (which may aggregate multiple gaps).
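The first half of the invariant (one Classification per Match) can be expressed with frozen dataclasses, as in this sketch. The field name `cls` and the helper `check_invariant` are hypothetical; `cls` is used because `class` is a reserved word in Python.

```python
from dataclasses import dataclass

CLASSES = {"deliberate", "gap", "non-applicable", "review-needed"}


@dataclass(frozen=True)
class Match:
    file: str
    line: int
    raw_text: str


@dataclass(frozen=True)
class Classification:
    match: Match
    cls: str  # stands in for the "class" column
    category: str
    pipeline_step: str

    def __post_init__(self):
        if self.cls not in CLASSES:
            raise ValueError(f"unknown class: {self.cls}")


def check_invariant(matches, classifications):
    """True when every Match has exactly one Classification."""
    seen = [c.match for c in classifications]
    return len(seen) == len(set(seen)) and set(seen) == set(matches)


# Invented sample data.
m1 = Match("backend/app/a.py", 3, "logger.info('开始')")
m2 = Match("frontend/src/views/Home.vue", 7, "const t = '图谱'")
c1 = Classification(m1, "gap", "backend-log", "Logs")
complete = check_invariant([m1], [c1])
incomplete = check_invariant([m1, m2], [c1])
```

Frozen dataclasses are hashable, which is what lets the check compare sets of matches directly.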
Logical Data Model
classified.csv schema (CSV, UTF-8, header row):
| Column | Type | Notes |
|---|---|---|
| file | string | repo-relative path |
| line | int | 1-indexed |
| match | string | trimmed grep line |
| class | enum | deliberate / gap / non-applicable / review-needed |
| category | enum | closed set listed in classify.py rules |
| pipeline_step | enum | closed set listed in classify.py rules |
Natural key: (file, line).
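Writing and re-reading this schema with the standard csv module might look like the sketch below. `write_classified` and the sample row are hypothetical; an in-memory buffer stands in for the real classified.csv file.

```python
import csv
import io

COLUMNS = ["file", "line", "match", "class", "category", "pipeline_step"]


def write_classified(rows, handle):
    """Write the header row followed by one CSV row per classification."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Invented sample row; the match text contains a comma to exercise quoting.
rows = [
    {
        "file": "backend/app/a.py",
        "line": 12,
        "match": "logger.info('开始, 继续')",
        "class": "gap",
        "category": "backend-log",
        "pipeline_step": "Logs",
    }
]
buffer = io.StringIO()
write_classified(rows, buffer)

# Round-trip: values come back as strings, including the line number.
parsed = list(csv.DictReader(io.StringIO(buffer.getvalue())))
natural_keys = {(row["file"], row["line"]) for row in parsed}
```

Comparing `len(natural_keys)` with `len(parsed)` is a cheap way to confirm the (file, line) key stays unique after classification.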
parity.txt structure (text, three labelled blocks):
[missing-keys]
en-only: <key.path>
zh-only: <key.path>
[cjk-in-en]
<key.path>: <value snippet>
[identical-values]
<key.path>: <value> # review-needed if non-trivial English prose
Data Contracts & Integration
- comment-body.md must be valid GitHub-flavoured Markdown; checkbox lines preserve the issue's original ordering.
- Follow-up issue body must be valid GitHub-flavoured Markdown; first line is a one-sentence summary; subsequent sections are: ## Evidence (file:line list), ## Linked from (#10 + comment URL), ## Acceptance (a small checklist).
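The follow-up body contract can be sketched as a small renderer. `render_followup`, its parameters, and the sample values (including the placeholder URL) are hypothetical; only the summary line and the three section headings come from this design.

```python
def render_followup(summary, evidence, comment_url, acceptance):
    """Build a follow-up issue body: summary line, then the three sections."""
    lines = [summary, "", "## Evidence"]
    lines += [f"- {file}:{line}" for file, line in evidence]
    lines += ["", "## Linked from", f"- #10 ({comment_url})"]
    lines += ["", "## Acceptance"]
    lines += [f"- [ ] {item}" for item in acceptance]
    return "\n".join(lines) + "\n"


# Invented sample values; the URL is a placeholder, not a real comment link.
body = render_followup(
    summary="Backend log strings still emit CJK under the EN locale.",
    evidence=[("backend/app/a.py", 12), ("backend/app/b.py", 40)],
    comment_url="https://example.invalid/comment",
    acceptance=["No CJK in backend log strings when locale is EN"],
)
```

Keeping the section order fixed in one function makes the four category bodies uniform, whether they are posted through gh or left under PENDING-followups/.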
Error Handling
Error Strategy
- Read-only operations (steps 1–4): on any uncaught error (missing file, JSON parse error), the script aborts with a non-zero exit before any artefact is half-written. The orchestrator uses set -euo pipefail.
- GitHub side effects (steps 5–6): wrapped; failure routes to PENDING outputs and the orchestrator exits 2.
Error Categories and Responses
- User errors: invoked from wrong directory → fail fast with "must be run from repo root".
- System errors: git / python3 / gh missing → fail fast with "install "; gh auth status not OK → branch to PENDING.
- Business errors: classification produces 0 matches but cjk-grep.txt non-empty → assertion failure (count-equality bug).
Monitoring
- The orchestrator prints a one-line status per step.
- Final summary block to stdout: total matches, gaps, manual-pending, follow-ups posted vs PENDING.
Testing Strategy
- Unit tests: not introduced — the scripts are simple enough that a one-shot dry run on the live tree is the canonical validation.
- Integration test: a single bash run_audit.sh against the working tree; success criteria below.
- Validation checklist (run during implementation):
  - The audit produces a non-empty cjk-grep.txt.
  - parity.txt reports 0 missing keys (matches the live state at HEAD).
  - classified.csv row count == cjk-grep.txt line count.
  - gap-report.md and comment-body.md parse as valid markdown (manual eyeball; no toolchain required).
  - The classifier marks every locales/en.json CJK as gap (currently zero such matches, so this asserts the negative).
  - With gh available: a comment is posted on #10 and follow-up issues are created.
  - With gh simulated as absent (e.g. a PATH that still resolves git and python3 but not gh; PATH=/dev/null would hide those too): PENDING outputs appear under <sha>/.
Out of scope for testing
- The live UI walkthrough is manual-pending (R5.3) and not part of the test plan.
- Performance, scalability, security: nothing to test (read-only single-shot scripts).