Design — i18n-locale-parity-guard
Overview
This feature extends the project's PR-time i18n CI guard so that any pull request which introduces a key in only one of locales/en.json / locales/zh.json fails. It satisfies acceptance criterion #4 of epic #11 (locale-key parity) with a permanent automated check.
Purpose: Lock in locale-catalogue key parity as a permanent CI invariant so that AC #4 of epic #11 cannot regress as new strings are added.
Users: Project maintainers and PR authors. Maintainers gain a hard regression gate; PR authors gain a script they can run locally to confirm parity before pushing.
Impact: Adds a third check to the existing PR-time guard scripts/ci/i18n_cjk_guard.py. No production source under backend/app/, frontend/src/, or locales/ is modified by this spec.
Goals
- Fail any PR whose flattened-key set in locales/en.json differs from that of locales/zh.json.
- Print actionable failure lines (<file>:<line>: parity-<en|zh>-only: <dotted-key>) and a summary count.
- Compose with the existing CJK-clean and per-path-ratchet checks in a single CLI invocation, with a single exit code, no short-circuit.
- Run end-to-end in well under one second on the live catalogues; stdlib-only.
- Pass on main at the moment this spec ships (live catalogues are already parity-clean).
Non-Goals
- Re-implementing the manual audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/. The new check is the CI extract; the audit retains its own copy of check_parity.py.
- Cross-locale value-equality, identical-value heuristics, or ICU-placeholder-shape checks.
- Auto-creating missing keys, suggesting translations, or reformatting the catalogues.
- Modifying the locales/ schema, the vue-i18n runtime, or backend/app/utils/locale.py.
- Adding a new GitHub Actions workflow or workflow step.
Boundary Commitments
This Spec Owns
- The new parity-check helpers (_flatten_keys, _locate_key_line, _format_parity_finding, run_parity_check) and constants (ZH_JSON_REL_PATH) inside scripts/ci/i18n_cjk_guard.py.
- The new third block of run_check that invokes run_parity_check and integrates its result into the existing failed accumulator and success_summary collector.
- The pass/fail semantics of the locale-key parity check.
- New unit / integration tests under scripts/ci/tests/ covering the parity check and its composition.
Out of Boundary
- The audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py (independent, manual-only).
- The structure or format of the baseline file .kiro/specs/i18n-ci-guard/baseline.txt (parity is binary; no baseline needed).
- The workflow file .github/workflows/i18n-cjk-guard.yml (unchanged; same python scripts/ci/i18n_cjk_guard.py invocation already covers the new check).
- Any change to locales/en.json or locales/zh.json content.
- Open follow-up issues #7, #23, #25 (out-of-scope translation work).
Allowed Dependencies
- Python ≥3.11 standard library (json, os, pathlib, re, subprocess, sys, argparse, unittest).
- The existing helpers _flatten, _value_line_number, _truncate, the EN_JSON_REL_PATH constant, and the run_check / update_baseline functions in scripts/ci/i18n_cjk_guard.py.
- git (for the existing CJK-counting block, untouched here).
Revalidation Triggers
- Adding a third locale catalogue → parity becomes pairwise; design must be revisited.
- Changing the flatten contract (e.g. encoding non-dict containers like lists) → the parity check's "exact match with check_parity.py" clause must be re-asserted against the new contract.
- Splitting the guard into multiple CLI scripts → Requirement 3 ("one invocation") must be re-anchored.
Architecture
Existing Architecture Analysis
The guard is a single-file Python CLI: scripts/ci/i18n_cjk_guard.py (~393 lines, stdlib-only) invoked by one workflow step in .github/workflows/i18n-cjk-guard.yml. Its run_check(repo_root, baseline_path) -> int function is the orchestrator; today it composes two checks without short-circuit:
- scan_locale_cjk(en_json_path): fail when locales/en.json contains any CJK character.
- Per-path baseline ratchet: fail when count_path_cjk(repo_root, p) exceeds read_baseline(...)[p] for any p in ("backend/app", "frontend/src").
A failed: bool accumulator is set independently by each block; a success_summary: list[str] collects "OK …" lines that print only on full success. This design extends it with a third block.
The audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py already implements the algorithm we need (recursive flatten + symmetric difference). Its logic is the canonical reference for Requirement 1.1.
Architecture Pattern & Boundary Map
graph TB
Workflow[GitHub Actions step]
Main[main entry]
UpdateBaseline[update_baseline]
RunCheck[run_check orchestrator]
CjkClean[scan_locale_cjk]
Ratchet[count_path_cjk + read_baseline]
Parity[run_parity_check NEW]
EnJson[locales en.json]
ZhJson[locales zh.json]
BaselineFile[baseline.txt]
Workflow --> Main
Main -->|--update-baseline| UpdateBaseline
Main --> RunCheck
RunCheck --> CjkClean
RunCheck --> Ratchet
RunCheck --> Parity
CjkClean --> EnJson
Ratchet --> BaselineFile
Parity --> EnJson
Parity --> ZhJson
Architecture Integration:
- Selected pattern: Composed checks inside a single orchestrator (run_check). Each check is an independent function that returns a pass/fail signal and a list of human-readable lines; the orchestrator accumulates them.
- Domain/feature boundaries: Parity logic is internal to the guard module. It does not depend on the audit pipeline, the per-path ratchet, or the locale runtime.
- Existing patterns preserved: No-short-circuit composition, stderr-for-failure / stdout-for-success, lexicographic ordering for determinism, atomic-write / tmp-rename for any new persistence (none added here).
- New components rationale: run_parity_check is the only new orchestrator-level function; small private helpers (_flatten_keys, _locate_key_line, _format_parity_finding) keep run_parity_check's body short and individually testable.
- Steering compliance: Stdlib-only; explicit type hints (PEP 604 union syntax already in use in this module); single-responsibility helpers; module dependency direction unchanged (still no imports from backend/, frontend/, or locales/ runtime code).
Technology Stack
| Layer | Choice / Version | Role in Feature | Notes |
|---|---|---|---|
| Backend / Services | n/a | n/a | This is a CI tool; no backend or service code is touched. |
| Infrastructure / Runtime | Python 3.11 stdlib (json, pathlib, re, subprocess, sys, argparse); GitHub Actions ubuntu-latest; actions/checkout@v4; actions/setup-python@v5 | Runtime for the guard script and its new parity check. | Versions match the existing guard. No new dependencies; pyproject.toml and CI image unchanged. |
| Test Tooling | Python unittest (stdlib) | Drives parity check unit + integration tests. | Same framework as existing tests in scripts/ci/tests/test_i18n_cjk_guard.py. |
File Structure Plan
Directory Structure
scripts/
└── ci/
├── i18n_cjk_guard.py # Extended: adds parity helpers + third block in run_check
└── tests/
└── test_i18n_cjk_guard.py # Extended: adds ParityCheckTests + composition test
Modified Files
- scripts/ci/i18n_cjk_guard.py
  - Add module-level constant: ZH_JSON_REL_PATH = "locales/zh.json".
  - Add private helpers: _flatten_keys, _locate_key_line, _format_parity_finding.
  - Add public function: run_parity_check(repo_root: Path) -> ParityResult.
  - Add a new NamedTuple (or @dataclass(frozen=True, slots=True)) ParityResult with fields (passed: bool, failure_lines: list[str], success_summary: str | None).
  - Edit run_check: insert the parity block after the per-path-ratchet block, before the final if not failed: print(success_summary) block. Match the existing accumulator idiom.
  - Update the module docstring to list three checks.
- scripts/ci/tests/test_i18n_cjk_guard.py
  - Extend _make_full_repo (or add a sibling _make_full_repo_with_zh) to write a locales/zh.json alongside the existing locales/en.json. Keep the default ZH a parity-clean mirror of the EN fixture so existing tests do not need to change semantically.
  - Add new test class ParityCheckTests covering Requirements 1.1, 1.2, 1.3, 1.4, 1.5, 2.1, 2.2, 2.3, 2.5.
  - Add one composition test (Requirement 5.1.f) inside RunCheckEndToEndTests (or a new RunCheckCompositionTests class) that plants a CJK string and a parity divergence in the same repo and asserts both failure lines + exit 1.
  - Update existing RunCheckEndToEndTests.test_* to either commit a parity-clean locales/zh.json or assert the parity check now also runs but does not flip the test outcome.
Files Not Created
- No new source file is created. Option C (separate locale_parity.py helper module) was rejected in gap-analysis.md and research.md.
- No new workflow file. The existing .github/workflows/i18n-cjk-guard.yml is invoked unchanged.
Requirements Traceability
| Requirement | Summary | Components | Interfaces | Flows |
|---|---|---|---|---|
| 1.1 | Flatten EN/ZH into matching dotted-key sets | i18n_cjk_guard._flatten_keys (new), reuses _flatten | _flatten_keys(data: dict) -> set[str] | n/a |
| 1.2 | Pass on identical key sets, success line includes shared count | run_parity_check, run_check | ParityResult.success_summary | Run-Check Composition |
| 1.3 / 1.4 | Fail on en-only or zh-only keys | run_parity_check | ParityResult.passed, ParityResult.failure_lines | Run-Check Composition |
| 1.5 | Dict values are non-leaves; scalar values are leaves | _flatten_keys (no type narrowing) | n/a | n/a |
| 2.1 | <file>:<line>: parity-<side>-only: <key> lines | _format_parity_finding, _locate_key_line | _format_parity_finding(file, line, key, side) -> str | n/a |
| 2.2 | Line-1 fallback when key not located | _locate_key_line | _locate_key_line(text_lines, key) -> int (returns 1 on miss) | n/a |
| 2.3 | Final parity: en-only=N, zh-only=M summary | run_parity_check | Last entry of ParityResult.failure_lines on failure | n/a |
| 2.4 | All parity output to stderr | run_check integration block | print(..., file=sys.stderr) | Run-Check Composition |
| 2.5 | Lexicographic ordering | run_parity_check | sorted(...) over symmetric difference | n/a |
| 3.1 | All checks run, no short-circuit | run_check (existing accumulator pattern) | failed: bool accumulator | Run-Check Composition |
| 3.2 / 3.3 | Single exit code: 1 on any fail, 0 otherwise | run_check | Returns 1 if failed else 0 | Run-Check Composition |
| 3.4 / 3.5 | --update-baseline, --baseline, --repo-root flags unchanged | main, _build_parser | Existing argparse surface | n/a |
| 3.6 | Workflow file unchanged | .github/workflows/i18n-cjk-guard.yml | n/a (no edit) | n/a |
| 4.1 | Stdlib-only | i18n_cjk_guard imports | No new imports | n/a |
| 4.2 | Sub-second runtime | _flatten_keys is O(keys); set-diff is O(keys) | n/a | n/a |
| 4.3 | Deterministic output | All sorts lexicographic | n/a | n/a |
| 5.1 (a–f) | Tests for success, en-only, zh-only, both, scalar-leaf, composition | scripts/ci/tests/test_i18n_cjk_guard.py:ParityCheckTests + composition test | n/a | n/a |
| 5.2 / 5.3 / 5.4 | Match existing test style; isolated fixtures; clean run on parity-clean repo | Same test file | n/a | n/a |
| 6.1 | Guard passes on live catalogues at HEAD | Manual run at implementation time | python scripts/ci/i18n_cjk_guard.py exit 0 | n/a |
| 6.2 | If divergence found, document in tasks.md and fix | n/a (does not trigger; live parity holds) | n/a | n/a |
System Flows
Run-Check Composition
sequenceDiagram
participant CLI as main
participant Orch as run_check
participant CjkChk as scan_locale_cjk
participant RatChk as ratchet block
participant ParChk as run_parity_check
participant Out as stderr/stdout
CLI->>Orch: run_check repo baseline
Orch->>CjkChk: scan en.json
CjkChk-->>Orch: findings list
alt findings non-empty
Orch->>Out: stderr cjk-in-en lines
Note over Orch: failed = True
else
Note over Orch: success summary append
end
Orch->>RatChk: count + read baseline
RatChk-->>Orch: regressions list
alt regressions non-empty
Orch->>Out: stderr cjk-regression lines + refresh hint
Note over Orch: failed = True
else
Note over Orch: success summary append
end
Orch->>ParChk: run parity check
ParChk-->>Orch: ParityResult
alt parity failed
Orch->>Out: stderr parity lines + parity summary
Note over Orch: failed = True
else
Note over Orch: success summary append
end
alt failed false
Orch->>Out: stdout success lines
end
Orch-->>CLI: 1 if failed else 0
Key decisions:
- The parity block is appended last so its (potentially long) failure list is contiguous in the failure stream.
- The failed accumulator is shared with the prior two blocks; this is the only mechanism for cross-block signalling.
- The summary line parity: en-only=N, zh-only=M is appended to ParityResult.failure_lines (last entry) so the orchestrator can print all failure lines uniformly without a special-case branch.
Components and Interfaces
| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|---|---|---|---|---|---|
| _flatten_keys | Guard / helper | Return the dotted-key set of a parsed JSON catalogue, mirroring check_parity.py.flatten. | 1.1, 1.5 | _flatten (P0, existing) | Service |
| _locate_key_line | Guard / helper | Best-effort line-number resolution for a dotted key in raw JSON text, with line-1 fallback. | 2.1, 2.2 | none | Service |
| _format_parity_finding | Guard / helper | Format one failure line as <file>:<line>: parity-<side>-only: <key>. | 2.1 | none | Service |
| ParityResult | Guard / DTO | Carry parity-check outcome (passed flag, failure lines, success-summary line). | 1.2, 2.3, 2.5 | none | State |
| run_parity_check | Guard / orchestrator-leaf | Read both catalogues, compute symmetric difference, build ParityResult. | 1.1–1.5, 2.1–2.5 | _flatten_keys (P0), _locate_key_line (P0), _format_parity_finding (P0) | Service |
| run_check (modified) | Guard / orchestrator | Compose the three checks with a single failed accumulator and exit code. | 3.1–3.3 | All three checks (P0) | Service |
| ParityCheckTests (test) | Tests | Unit + integration coverage for parity. | 5.1 (a–f), 5.2–5.4 | run_parity_check, run_check (P0) | Service |
Guard / helper layer
_flatten_keys
| Field | Detail |
|---|---|
| Intent | Return the set of dotted-key paths of a parsed JSON object, mirroring check_parity.py.flatten. |
| Requirements | 1.1, 1.5 |
Responsibilities & Constraints
- Iterate via the existing _flatten(prefix, value, out) helper to guarantee identical path semantics.
- Descend only into dict. Any non-dict (string, number, bool, null, list) at a leaf produces a key.
- Return a set[str] so the parity caller can compute symmetric differences without re-deduplicating.
Dependencies
- Inbound: run_parity_check (P0).
- Outbound: _flatten (P0, existing private helper in same module).
Contracts: Service [x]
Service Interface
def _flatten_keys(data: dict[str, object]) -> set[str]:
...
- Preconditions: data is the result of json.loads over a catalogue file (i.e., a dict at the top level).
- Postconditions: every dotted path returned corresponds to a non-dict leaf in data. The set is unordered; callers must sort before formatting output (Requirement 2.5).
- Invariants: _flatten_keys({}) == set(). For any catalogue c, _flatten_keys(c) is identical to the set of keys produced by check_parity.py.flatten(c).
Implementation Notes
- Integration: One call site (run_parity_check).
- Validation: Unit-test against a hand-rolled fixture with mixed leaf types (string, number, bool, null) and at least three nesting levels (Requirement 5.1.e).
- Risks: None. Reuses the existing flatten primitive verbatim.
_locate_key_line
| Field | Detail |
|---|---|
| Intent | Best-effort line-number resolution for a dotted key in the raw JSON source text, with a deterministic line-1 fallback. |
| Requirements | 2.1, 2.2 |
Responsibilities & Constraints
- Accept the splitlines view of a JSON file (text_lines: list[str]) and a dotted key (dotted_key: str).
- Search for the leaf segment of the dotted key (after the last .) wrapped in JSON quotes, e.g. "missingKey". Return the 1-based line number of the first match.
- Fall back to 1 when no match is found (mirrors _value_line_number).
- Performance must remain linear in the number of lines.
Dependencies
- Inbound: run_parity_check (P0).
- Outbound: none.
Contracts: Service [x]
Service Interface
def _locate_key_line(text_lines: list[str], dotted_key: str) -> int:
...
- Preconditions: dotted_key non-empty; text_lines is the result of Path.read_text(...).splitlines().
- Postconditions: returns an integer ≥ 1.
- Invariants: When the leaf segment appears in text_lines wrapped in "...", the return is the (1-based) line number of the first occurrence. Otherwise the return is 1.
Implementation Notes
- Integration: One call site (run_parity_check).
- Validation: Unit-test the exact-match path, the multi-occurrence path (first match wins), and the not-found fallback.
- Risks: A leaf segment that also appears as part of another (unrelated) key or in a value text could yield a slightly misleading line number. Acceptable: the dotted key in the failure message is the source of truth; the line is a navigation aid. Documented in the docstring.
_format_parity_finding
| Field | Detail |
|---|---|
| Intent | Format a single parity-failure line in the canonical layout used by the guard. |
| Requirements | 2.1 |
Responsibilities & Constraints
- Produce strings of the exact form <file>:<line>: parity-en-only: <dotted-key> or <file>:<line>: parity-zh-only: <dotted-key>.
- Mirror the existing _format_locale_finding style (<file>:<line>: <category>: <payload>).
Dependencies
- Inbound: run_parity_check (P0).
- Outbound: none.
Contracts: Service [x]
Service Interface
def _format_parity_finding(file_rel_path: str, line_no: int, dotted_key: str, side: str) -> str:
...
- Preconditions: side in {"en-only", "zh-only"}; file_rel_path is one of EN_JSON_REL_PATH / ZH_JSON_REL_PATH; line_no >= 1.
- Postconditions: returns a single line with no embedded newline.
- Invariants: The category token in the line is exactly parity-en-only or parity-zh-only so log greps match deterministically.
Guard / DTO layer
ParityResult
| Field | Detail |
|---|---|
| Intent | Immutable carrier for parity-check outcome consumed by run_check. |
| Requirements | 1.2, 2.3, 2.5 |
Contracts: State [x]
State Management
- State model:
class ParityResult(NamedTuple):
    passed: bool
    failure_lines: list[str]  # already-formatted lines, including the trailing "parity: en-only=N, zh-only=M" summary on failure
    success_summary: str | None  # populated only when passed is True
- Persistence & consistency: in-memory only; constructed by run_parity_check and consumed by run_check.
- Concurrency strategy: n/a (single-process, single-call).
Guard / orchestrator-leaf
run_parity_check
| Field | Detail |
|---|---|
| Intent | Compute the locale-key parity outcome and produce a ParityResult. |
| Requirements | 1.1–1.5, 2.1–2.5 |
Responsibilities & Constraints
- Read both locales/en.json and locales/zh.json from repo_root.
- Flatten each via _flatten_keys and compute the symmetric difference.
- For each en-only key (sorted lexicographically): resolve its line via _locate_key_line over the EN catalogue's source-text lines, and emit a parity-en-only line via _format_parity_finding.
- For each zh-only key (sorted lexicographically, after en-only): resolve its line via _locate_key_line over the ZH catalogue's source-text lines, and emit a parity-zh-only line.
- On failure, append a final parity: en-only=N, zh-only=M summary line to failure_lines.
- On success, build the success summary OK locale-parity: <count> keys per side.
- If either catalogue file is missing, return a ParityResult(passed=False, failure_lines=[<single error line>], success_summary=None) and let run_check fold the error into the global failed flag.
Dependencies
- Inbound: run_check (P0).
- Outbound: _flatten_keys, _locate_key_line, _format_parity_finding (all P0).
Contracts: Service [x]
Service Interface
def run_parity_check(repo_root: Path) -> ParityResult:
...
- Preconditions: repo_root is a valid working-tree directory; locales/en.json and locales/zh.json are expected at the relative paths defined by EN_JSON_REL_PATH and ZH_JSON_REL_PATH.
- Postconditions: returns a ParityResult. When passed, failure_lines == [] and success_summary is non-None. When not passed because of key divergence, failure_lines is non-empty and ends with a parity: en-only=… summary line; when a catalogue is missing or malformed, failure_lines holds the single error line. In both failure cases success_summary is None.
- Invariants: Flattened-key-set computation matches check_parity.py.flatten exactly for any input. Output is deterministic across runs for identical inputs.
Implementation Notes
- Integration: Called once per run_check invocation. Skipped entirely in --update-baseline mode (covered by Requirement 3.4: update_baseline is invoked from main instead of run_check).
- Validation: Unit-test all required outcomes (Requirement 5.1 a–e); integration-test composition (5.1 f).
- Risks: A malformed JSON catalogue raises json.JSONDecodeError. The function should treat this the same as a missing file (return ParityResult(passed=False, …)), so the guard reports a clean failure rather than crashing CI with a Python traceback.
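The responsibilities above could fit together roughly as follows. This is a sketch under stated assumptions: _keys and _line_of are simplified stand-ins for _flatten_keys and _locate_key_line, and the parity-error wording of the single error line is invented, since the design only requires that one line be returned.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import NamedTuple

EN_JSON_REL_PATH = "locales/en.json"
ZH_JSON_REL_PATH = "locales/zh.json"


class ParityResult(NamedTuple):
    passed: bool
    failure_lines: list[str]
    success_summary: str | None


def _keys(value: object, prefix: str = "") -> set[str]:
    """Simplified stand-in for _flatten_keys: dotted paths of non-dict leaves."""
    if not isinstance(value, dict):
        return {prefix}
    found: set[str] = set()
    for key, child in value.items():
        found |= _keys(child, f"{prefix}.{key}" if prefix else key)
    return found


def _line_of(text_lines: list[str], dotted_key: str) -> int:
    """Simplified stand-in for _locate_key_line (line-1 fallback)."""
    needle = '"' + dotted_key.rsplit(".", 1)[-1] + '"'
    return next((n for n, line in enumerate(text_lines, 1) if needle in line), 1)


def run_parity_check(repo_root: Path) -> ParityResult:
    texts: dict[str, str] = {}
    keys: dict[str, set[str]] = {}
    for rel_path in (EN_JSON_REL_PATH, ZH_JSON_REL_PATH):
        try:
            texts[rel_path] = (repo_root / rel_path).read_text(encoding="utf-8")
            keys[rel_path] = _keys(json.loads(texts[rel_path]))
        except (OSError, json.JSONDecodeError) as exc:
            # Missing and malformed catalogues share one failure path.
            return ParityResult(False, [f"{rel_path}:1: parity-error: {exc}"], None)

    en_keys, zh_keys = keys[EN_JSON_REL_PATH], keys[ZH_JSON_REL_PATH]
    en_only = sorted(en_keys - zh_keys)  # the two halves of the symmetric difference
    zh_only = sorted(zh_keys - en_keys)
    if not en_only and not zh_only:
        return ParityResult(True, [], f"OK locale-parity: {len(en_keys)} keys per side")

    lines: list[str] = []
    for rel_path, side, missing in (
        (EN_JSON_REL_PATH, "en-only", en_only),
        (ZH_JSON_REL_PATH, "zh-only", zh_only),
    ):
        source_lines = texts[rel_path].splitlines()
        for key in missing:
            lines.append(f"{rel_path}:{_line_of(source_lines, key)}: parity-{side}: {key}")
    lines.append(f"parity: en-only={len(en_only)}, zh-only={len(zh_only)}")
    return ParityResult(False, lines, None)
```

Computing the two one-sided differences separately, rather than a single en_keys ^ zh_keys, keeps the en-first, zh-second ordering without a second classification pass.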
Guard / orchestrator (modified)
run_check (modification)
| Field | Detail |
|---|---|
| Intent | Compose all three checks (CJK-clean, per-path ratchet, parity) into one exit code. |
| Requirements | 3.1, 3.2, 3.3 |
Responsibilities & Constraints
- After the existing per-path-ratchet block (existing line ~258–293) and before the final if not failed block (existing line ~295–298), call run_parity_check(repo_root).
- If the result is not passed, set failed = True, print every entry of result.failure_lines to sys.stderr, one line per print(...) call.
- If passed, append result.success_summary to success_summary.
- Return 1 if failed else 0 (unchanged).
Dependencies
- Inbound: main (P0, via either standalone CLI or test invocation).
- Outbound: scan_locale_cjk, per-path ratchet helpers, run_parity_check (all P0).
Contracts: Service [x] / State [x]
Service Interface
Unchanged signature: def run_check(repo_root: Path, baseline_path: Path) -> int.
- Preconditions: unchanged.
- Postconditions: exit code reflects all three checks (was: two checks).
- Invariants: still no short-circuit between checks.
Implementation Notes
- Integration: One inserted block of ~10 lines in the existing function.
- Validation: Existing CLI smoke tests continue to pass; new RunCheckEndToEndTests cases assert correct fail/pass propagation when only the parity check fails, only an existing check fails, or both fail.
- Risks: A future maintainer could accidentally short-circuit by inserting an early return between blocks. Mitigated by the composition test (Requirement 5.1.f) which fails if any block is skipped.
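The inserted block might look roughly like the sketch below. fold_parity is a hypothetical helper that isolates the new block so it can run on its own; in the guard the same logic would sit inline in run_check. The first success line is placeholder wording, not the guard's actual text.

```python
from __future__ import annotations

import sys
from typing import NamedTuple


class ParityResult(NamedTuple):
    passed: bool
    failure_lines: list[str]
    success_summary: str | None


def fold_parity(result: ParityResult, failed: bool, success_summary: list[str]) -> bool:
    """Hypothetical helper: fold one ParityResult into the shared accumulator."""
    if not result.passed:
        for line in result.failure_lines:
            print(line, file=sys.stderr)  # one print call per failure line
        return True
    if result.success_summary is not None:
        success_summary.append(result.success_summary)
    return failed  # an earlier failure is never cleared


# Simulate the tail of run_check after the first two blocks have passed.
failed = False
success_summary = ["OK en.json is CJK-clean"]  # placeholder wording
failed = fold_parity(ParityResult(True, [], "OK locale-parity: 3 keys per side"), failed, success_summary)
if not failed:
    for line in success_summary:
        print(line)
exit_code = 1 if failed else 0
```

Because the helper returns the incoming failed value on success, a failure recorded by an earlier block survives the parity block, which is the no-short-circuit invariant in miniature.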
Tests
ParityCheckTests
| Field | Detail |
|---|---|
| Intent | Unit + integration coverage for the parity check, matching the style of existing RunCheckEndToEndTests. |
| Requirements | 5.1 (a–f), 5.2, 5.3, 5.4 |
Responsibilities & Constraints
- Use unittest, tempfile.TemporaryDirectory, and the existing _make_repo / _commit_file test helpers.
- Each test owns its own ephemeral repo. No reliance on the live locales/ content for negative paths (Requirement 5.3).
- Assertions check exit code AND substring presence of the failure category tokens (parity-en-only, parity-zh-only) AND that the summary line is the last failure line.
Dependencies
- Inbound: unittest.main.
- Outbound: i18n_cjk_guard.run_parity_check, i18n_cjk_guard.run_check (both P0).
Implementation Notes
- Test cases (one per Requirement 5.1 sub-bullet):
  - (a) test_passes_when_keys_match: both catalogues identical → run_parity_check returns passed=True; run_check returns 0.
  - (b) test_fails_on_en_only_key: en.json has an extra key → run_parity_check returns passed=False, failure includes parity-en-only, summary is parity: en-only=1, zh-only=0.
  - (c) test_fails_on_zh_only_key: symmetric of (b).
  - (d) test_fails_on_both_sided_divergence: failure list contains both parity-en-only and parity-zh-only lines, ordered en-first then zh, each lex-sorted within its group.
  - (e) test_passes_with_scalar_leaves_at_same_path: both catalogues have a scalar (e.g. null, 42, false) at the same dotted path → parity passes (Requirement 1.5).
  - (f) test_run_check_no_short_circuit: one repo plants both a CJK string in en.json and a parity-divergent key. Expect: exit 1; stderr contains both cjk-in-en and parity-en-only (or parity-zh-only); the per-path-ratchet success summary is suppressed (since failed).
- Risks: Test fixtures must use ensure_ascii=False JSON to match the live catalogue style.
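To illustrate the fixture risk noted above, a hypothetical helper might write catalogues like this. write_catalogue and the indent=2 layout are assumptions for the sketch; only ensure_ascii=False comes from this document.

```python
import json
import tempfile
from pathlib import Path


def write_catalogue(repo_root: Path, rel_path: str, data: dict) -> Path:
    """Hypothetical fixture helper: write one catalogue with CJK left unescaped."""
    target = repo_root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps CJK characters literal instead of \uXXXX escapes.
    target.write_text(json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    path = write_catalogue(Path(tmp), "locales/zh.json", {"nav": {"home": "首页"}})
    text = path.read_text(encoding="utf-8")

print("首页" in text)  # True
print("\\u" in text)   # False
```

With the default ensure_ascii=True the ZH fixture would contain only escape sequences, so it would no longer resemble the live catalogue it stands in for.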
Error Handling
Error Strategy
- Missing catalogue file → run_parity_check returns ParityResult(passed=False, failure_lines=[<missing-file-line>], success_summary=None). run_check flips failed, prints the line to stderr, returns 1.
- Malformed JSON → same path as missing catalogue. json.JSONDecodeError is caught inside run_parity_check; the line printed names the offending file and the parser's msg.
- Parity divergence (the expected unhappy path) → fail per Requirements 1.3 / 1.4 / 2.1–2.5.
- _locate_key_line cannot find the key → fall back to line 1 (Requirement 2.2). Not an error; the caller proceeds.
- No-short-circuit invariant → enforced by the orchestrator's accumulator pattern; covered by Requirement 5.1.f.
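The first two cases could share a small loader, sketched here. load_catalogue, the parity-error category, and the use of the parser's lineno for the line field are all assumptions; the design only states that the printed line names the file and the parser's msg.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_catalogue(repo_root: Path, rel_path: str) -> tuple[dict | None, str | None]:
    """Hypothetical loader: return (data, None) on success or (None, error_line)."""
    try:
        text = (repo_root / rel_path).read_text(encoding="utf-8")
        return json.loads(text), None
    except FileNotFoundError:
        return None, f"{rel_path}:1: parity-error: catalogue file not found"
    except json.JSONDecodeError as exc:
        # exc.msg is the parser's message; exc.lineno is its 1-based line.
        return None, f"{rel_path}:{exc.lineno}: parity-error: {exc.msg}"
```

Returning an error line instead of raising lets the caller build the failing ParityResult without a try/except of its own, so CI shows one readable line rather than a traceback.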
Monitoring
CI workflow logs (GitHub Actions) are the sole observability surface. Failure lines are designed to be greppable: the per-finding categories parity-en-only and parity-zh-only and the summary prefix parity: en-only= (followed by zh-only= on the same line) are stable tokens.
Testing Strategy
Unit Tests
- _flatten_keys: empty input, flat input, mixed-type leaves, three-level nesting, null and scalar leaves.
- _locate_key_line: exact match, multi-occurrence (first wins), not found (line-1 fallback).
- _format_parity_finding: en-only and zh-only sides, embedded special characters in key names (e.g. underscores, digits).
- ParityResult: pass-shape and fail-shape construction.
Integration Tests
- All six ParityCheckTests sub-cases listed above.
- The composition case (Requirement 5.1.f) inside RunCheckCompositionTests (or appended to RunCheckEndToEndTests).
- A regression run of the existing RunCheckEndToEndTests cases after extending _make_full_repo to write a default parity-clean locales/zh.json.
Performance / Load
- One sanity case: parity check on a synthetic 10 000-key catalogue completes in well under one second on the CI runner. Asserted by a time.perf_counter() budget of 1.0 s in the integration test.
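A sketch of what that budget assertion could look like. flatten_keys and synthetic_catalogue are hypothetical names, and the 100 × 100 nesting shape is an arbitrary way to reach 10 000 keys.

```python
import time


def flatten_keys(value: object, prefix: str = "") -> set[str]:
    """Hypothetical stand-in for _flatten_keys: dotted paths of non-dict leaves."""
    if not isinstance(value, dict):
        return {prefix}
    found: set[str] = set()
    for key, child in value.items():
        found |= flatten_keys(child, f"{prefix}.{key}" if prefix else key)
    return found


def synthetic_catalogue(groups: int = 100, keys_per_group: int = 100) -> dict:
    """Build a two-level catalogue with groups * keys_per_group leaves."""
    return {f"group{g}": {f"key{k}": "value" for k in range(keys_per_group)} for g in range(groups)}


en = synthetic_catalogue()
zh = synthetic_catalogue()

start = time.perf_counter()
en_keys = flatten_keys(en)
zh_keys = flatten_keys(zh)
divergent = en_keys ^ zh_keys  # symmetric difference; empty when in parity
elapsed = time.perf_counter() - start

assert len(en_keys) == 10_000
assert not divergent
assert elapsed < 1.0, f"parity check took {elapsed:.3f}s"
```

The timer brackets only the flatten and set-difference steps, so fixture construction does not count against the budget.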
Performance & Scalability
- Catalogue size: ~1000 keys today; growth bounded by the number of UI strings + log keys. Even at 10× the current size, _flatten + set-diff remains negligible (<100 ms).
- The CI workflow timeout is 1 minute (.github/workflows/i18n-cjk-guard.yml: timeout-minutes: 1); the new check adds at most tens of milliseconds.
Supporting References
- gap-analysis.md (this spec): implementation-approach options A/B/C with rationale.
- research.md (this spec): design decision records.
- .kiro/specs/i18n-ci-guard/design.md: prior CI guard's design doc (style and boundary precedents).
- .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py: reference parity algorithm.