MicroFish/.kiro/specs/i18n-locale-parity-guard/design.md

29 KiB
Raw Blame History

Design — i18n-locale-parity-guard

Overview

This feature extends the project's PR-time i18n CI guard so that any pull request which introduces a key in only one of locales/en.json / locales/zh.json fails. It satisfies acceptance criterion #4 of epic #11 (locale-key parity) with a permanent automated check.

Purpose: Lock in locale-catalogue key parity as a permanent CI invariant so that AC #4 of epic #11 cannot regress as new strings are added. Users: Project maintainers and PR authors. Maintainers gain a hard regression gate; PR authors gain a script they can run locally to confirm parity before pushing. Impact: Adds a third check to the existing PR-time guard scripts/ci/i18n_cjk_guard.py. No production source under backend/app/, frontend/src/, or locales/ is modified by this spec.

Goals

  • Fail any PR whose flattened-key set in locales/en.json differs from that of locales/zh.json.
  • Print actionable failure lines (<file>:<line>: parity-<en|zh>-only: <dotted-key>) and a summary count.
  • Compose with the existing CJK-clean and per-path-ratchet checks in a single CLI invocation, with a single exit code, no short-circuit.
  • Run end-to-end in well under one second on the live catalogues; stdlib-only.
  • Pass on main at the moment this spec ships (live catalogues are already parity-clean).

Non-Goals

  • Re-implementing the manual audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/. The new check is the CI extract; the audit retains its own copy of check_parity.py.
  • Cross-locale value-equality, identical-value heuristics, or ICU-placeholder-shape checks.
  • Auto-creating missing keys, suggesting translations, or reformatting the catalogues.
  • Modifying the locales/ schema, the vue-i18n runtime, or backend/app/utils/locale.py.
  • Adding a new GitHub Actions workflow or workflow step.

Boundary Commitments

This Spec Owns

  • The new parity-check helpers (_flatten_keys, _locate_key_line, _format_parity_finding, run_parity_check) and constants (ZH_JSON_REL_PATH) inside scripts/ci/i18n_cjk_guard.py.
  • The new third block of run_check that invokes run_parity_check and integrates its result into the existing failed accumulator and success_summary collector.
  • The pass/fail semantics of the locale-key parity check.
  • New unit / integration tests under scripts/ci/tests/ covering the parity check and its composition.

Out of Boundary

  • The audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py (independent, manual-only).
  • The structure or format of the baseline file .kiro/specs/i18n-ci-guard/baseline.txt (parity is binary; no baseline needed).
  • The workflow file .github/workflows/i18n-cjk-guard.yml (unchanged; same python scripts/ci/i18n_cjk_guard.py invocation already covers the new check).
  • Any change to locales/en.json or locales/zh.json content.
  • Open follow-up issues #7, #23, #25 (out-of-scope translation work).

Allowed Dependencies

  • Python ≥3.11 standard library (json, os, pathlib, re, subprocess, sys, argparse, unittest).
  • The existing helpers _flatten, _value_line_number, _truncate, the EN_JSON_REL_PATH constant, and the run_check/update_baseline functions in scripts/ci/i18n_cjk_guard.py.
  • git (for the existing CJK-counting block, untouched here).

Revalidation Triggers

  • Adding a third locale catalogue → parity becomes pairwise; design must be revisited.
  • Changing the flatten contract (e.g. encoding non-dict containers like lists) → the parity check's "exact match with check_parity.py" clause must be re-asserted against the new contract.
  • Splitting the guard into multiple CLI scripts → Requirement 3 ("one invocation") must be re-anchored.

Architecture

Existing Architecture Analysis

The guard is a single-file Python CLI: scripts/ci/i18n_cjk_guard.py (~393 lines, stdlib-only) invoked by one workflow step in .github/workflows/i18n-cjk-guard.yml. Its run_check(repo_root, baseline_path) -> int function is the orchestrator; today it composes two checks without short-circuit:

  1. scan_locale_cjk(en_json_path) — fail when locales/en.json contains any CJK character.
  2. Per-path baseline ratchet — fail when count_path_cjk(repo_root, p) exceeds read_baseline(...)[p] for any p in ("backend/app", "frontend/src").

A failed: bool accumulator is set independently by each block; a success_summary: list[str] collects "OK …" lines that print only on full success. This design extends it with a third block.

The audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py already implements the algorithm we need (recursive flatten + symmetric difference). Its logic is the canonical reference for Requirement 1.1.

Architecture Pattern & Boundary Map

graph TB
    Workflow[GitHub Actions step]
    Main[main entry]
    UpdateBaseline[update_baseline]
    RunCheck[run_check orchestrator]
    CjkClean[scan_locale_cjk]
    Ratchet[count_path_cjk + read_baseline]
    Parity[run_parity_check NEW]
    EnJson[locales en.json]
    ZhJson[locales zh.json]
    BaselineFile[baseline.txt]

    Workflow --> Main
    Main -->|--update-baseline| UpdateBaseline
    Main --> RunCheck
    RunCheck --> CjkClean
    RunCheck --> Ratchet
    RunCheck --> Parity
    CjkClean --> EnJson
    Ratchet --> BaselineFile
    Parity --> EnJson
    Parity --> ZhJson

Architecture Integration:

  • Selected pattern: Composed checks inside a single orchestrator (run_check). Each check is an independent function that returns a pass/fail signal and a list of human-readable lines; the orchestrator accumulates them.
  • Domain/feature boundaries: Parity logic is internal to the guard module. It does not depend on the audit pipeline, the per-path ratchet, or the locale runtime.
  • Existing patterns preserved: No-short-circuit composition, stderr-for-failure / stdout-for-success, lexicographic ordering for determinism, atomic-write / tmp-rename for any new persistence (none added here).
  • New components rationale: run_parity_check is the only new orchestrator-level function; small private helpers (_flatten_keys, _locate_key_line, _format_parity_finding) keep run_parity_check's body short and individually testable.
  • Steering compliance: Stdlib-only; explicit type hints (PEP 604 union syntax already in use in this module); single-responsibility helpers; module dependency direction unchanged (still no imports from backend/, frontend/, or locales/ runtime code).

Technology Stack

Layer Choice / Version Role in Feature Notes
Backend / Services n/a n/a This is a CI tool; no backend or service code is touched.
Infrastructure / Runtime Python 3.11 stdlib (json, pathlib, re, subprocess, sys, argparse); GitHub Actions ubuntu-latest; actions/checkout@v4; actions/setup-python@v5 Runtime for the guard script and its new parity check. Versions match the existing guard. No new dependencies; pyproject.toml and CI image unchanged.
Test Tooling Python unittest (stdlib) Drives parity check unit + integration tests. Same framework as existing tests in scripts/ci/tests/test_i18n_cjk_guard.py.

File Structure Plan

Directory Structure

scripts/
└── ci/
    ├── i18n_cjk_guard.py         # Extended: adds parity helpers + third block in run_check
    └── tests/
        └── test_i18n_cjk_guard.py # Extended: adds ParityCheckTests + composition test

Modified Files

  • scripts/ci/i18n_cjk_guard.py
    • Add module-level constants: ZH_JSON_REL_PATH = "locales/zh.json".
    • Add private helpers: _flatten_keys, _locate_key_line, _format_parity_finding.
    • Add public function: run_parity_check(repo_root: Path) -> ParityResult.
    • Add a new NamedTuple (or @dataclass(frozen=True, slots=True)) ParityResult with fields (passed: bool, failure_lines: list[str], success_summary: str | None).
    • Edit run_check: insert the parity block after the per-path-ratchet block, before the final if not failed: print(success_summary) block. Match the existing accumulator idiom.
    • Update the module docstring to list three checks.
  • scripts/ci/tests/test_i18n_cjk_guard.py
    • Extend _make_full_repo (or add a sibling _make_full_repo_with_zh) to write a locales/zh.json alongside the existing locales/en.json. Keep the default ZH a parity-clean mirror of the EN fixture so existing tests do not need to change semantically.
    • Add new test class ParityCheckTests covering Requirements 1.1, 1.2, 1.3, 1.4, 1.5, 2.1, 2.2, 2.3, 2.5.
    • Add one composition test (Requirement 5.1.f) inside RunCheckEndToEndTests (or a new RunCheckCompositionTests class) that plants a CJK string and a parity divergence in the same repo and asserts both failure lines + exit 1.
    • Update existing RunCheckEndToEndTests.test_* to either commit a parity-clean locales/zh.json or assert the parity check now also runs but does not flip the test outcome.

Files Not Created

  • No new source file is created. Option C (separate locale_parity.py helper module) was rejected in gap-analysis.md and research.md.
  • No new workflow file. The existing .github/workflows/i18n-cjk-guard.yml is invoked unchanged.

Requirements Traceability

Requirement Summary Components Interfaces Flows
1.1 Flatten EN/ZH into matching dotted-key sets i18n_cjk_guard._flatten_keys (new), reuses _flatten _flatten_keys(data: dict) -> set[str] n/a
1.2 Pass on identical key sets, success line includes shared count run_parity_check, run_check ParityResult.success_summary Run-Check Composition
1.3 / 1.4 Fail on en-only or zh-only keys run_parity_check ParityResult.passed, ParityResult.failure_lines Run-Check Composition
1.5 Dict leaves are non-leaves; scalar leaves are leaves _flatten_keys (no type narrowing) n/a n/a
2.1 <file>:<line>: parity-<side>-only: <key> lines _format_parity_finding, _locate_key_line _format_parity_finding(file, line, key, side) -> str n/a
2.2 Line-1 fallback when key not located _locate_key_line _locate_key_line(text_lines, key) -> int (returns 1 on miss) n/a
2.3 Final parity: en-only=N, zh-only=M summary run_parity_check Last entry of ParityResult.failure_lines on failure n/a
2.4 All parity output to stderr run_check integration block print(..., file=sys.stderr) Run-Check Composition
2.5 Lexicographic ordering run_parity_check sorted(...) over symmetric difference n/a
3.1 All checks run, no short-circuit run_check (existing accumulator pattern) failed: bool accumulator Run-Check Composition
3.2 / 3.3 Single exit code: 1 on any fail, 0 otherwise run_check Returns 1 if failed else 0 Run-Check Composition
3.4 / 3.5 --update-baseline, --baseline, --repo-root flags unchanged main, _build_parser Existing argparse surface n/a
3.6 Workflow file unchanged .github/workflows/i18n-cjk-guard.yml n/a (no edit) n/a
4.1 Stdlib-only i18n_cjk_guard imports No new imports n/a
4.2 Sub-second runtime _flatten_keys is O(keys); set-diff is O(keys) n/a n/a
4.3 Deterministic output All sorts lexicographic n/a n/a
5.1 (af) Tests for success, en-only, zh-only, both, scalar-leaf, composition scripts/ci/tests/test_i18n_cjk_guard.py:ParityCheckTests + composition test n/a n/a
5.2 / 5.3 / 5.4 Match existing test style; isolated fixtures; clean run on parity-clean repo Same test file n/a n/a
6.1 Guard passes on live catalogues at HEAD Manual run at implementation time python scripts/ci/i18n_cjk_guard.py exit 0 n/a
6.2 If divergence found, document in tasks.md and fix n/a (does not trigger; live parity holds) n/a n/a

System Flows

Run-Check Composition

sequenceDiagram
    participant CLI as main
    participant Orch as run_check
    participant CjkChk as scan_locale_cjk
    participant RatChk as ratchet block
    participant ParChk as run_parity_check
    participant Out as stderr/stdout

    CLI->>Orch: run_check repo baseline
    Orch->>CjkChk: scan en.json
    CjkChk-->>Orch: findings list
    alt findings non-empty
        Orch->>Out: stderr cjk-in-en lines
        Note over Orch: failed = True
    else
        Note over Orch: success summary append
    end
    Orch->>RatChk: count + read baseline
    RatChk-->>Orch: regressions list
    alt regressions non-empty
        Orch->>Out: stderr cjk-regression lines + refresh hint
        Note over Orch: failed = True
    else
        Note over Orch: success summary append
    end
    Orch->>ParChk: run parity check
    ParChk-->>Orch: ParityResult
    alt parity failed
        Orch->>Out: stderr parity lines + parity summary
        Note over Orch: failed = True
    else
        Note over Orch: success summary append
    end
    alt failed false
        Orch->>Out: stdout success lines
    end
    Orch-->>CLI: 1 if failed else 0

Key decisions:

  • The parity block is appended last so its (potentially long) failure list is contiguous in the failure stream.
  • The failed accumulator is shared with the prior two blocks; this is the only mechanism for cross-block signalling.
  • The summary line parity: en-only=N, zh-only=M is appended to ParityResult.failure_lines (last entry) so the orchestrator can print all failure lines uniformly without a special-case branch.

Components and Interfaces

Component Domain/Layer Intent Req Coverage Key Dependencies (P0/P1) Contracts
_flatten_keys Guard / helper Return the dotted-key set of a parsed JSON catalogue, mirroring check_parity.py.flatten. 1.1, 1.5 _flatten (P0, existing) Service
_locate_key_line Guard / helper Best-effort line-number resolution for a dotted key in raw JSON text, with line-1 fallback. 2.1, 2.2 none Service
_format_parity_finding Guard / helper Format one failure line as <file>:<line>: parity-<side>-only: <key>. 2.1 none Service
ParityResult Guard / DTO Carry parity-check outcome (passed flag, failure lines, success-summary line). 1.2, 2.3, 2.5 none State
run_parity_check Guard / orchestrator-leaf Read both catalogues, compute symmetric difference, build ParityResult. 1.11.5, 2.12.5 _flatten_keys (P0), _locate_key_line (P0), _format_parity_finding (P0) Service
run_check (modified) Guard / orchestrator Compose the three checks with a single failed accumulator and exit code. 3.13.3 All three checks (P0) Service
ParityCheckTests (test) Tests Unit + integration coverage for parity. 5.1 (af), 5.25.4 run_parity_check, run_check (P0) Service

Guard / helper layer

_flatten_keys

Field Detail
Intent Return the set of dotted-key paths of a parsed JSON object, mirroring check_parity.py.flatten.
Requirements 1.1, 1.5

Responsibilities & Constraints

  • Iterate via the existing _flatten(prefix, value, out) helper to guarantee identical path semantics.
  • Descend only into dict. Any non-dict (string, number, bool, null, list) at a leaf produces a key.
  • Return a set[str] so the parity caller can compute symmetric differences without re-deduplicating.

Dependencies

  • Inbound: run_parity_check (P0).
  • Outbound: _flatten (P0, existing private helper in same module).

Contracts: Service [x]

Service Interface
def _flatten_keys(data: dict[str, object]) -> set[str]:
    ...
  • Preconditions: data is the result of json.loads over a catalogue file (i.e., a dict at the top level).
  • Postconditions: every dotted path returned corresponds to a non-dict leaf in data. The set is unordered; callers must sort before formatting output (Requirement 2.5).
  • Invariants: _flatten_keys({}) == set(). For any catalogue c, _flatten_keys(c) is identical to the set of keys produced by check_parity.py.flatten(c).

Implementation Notes

  • Integration: One call site (run_parity_check).
  • Validation: Unit-test against a hand-rolled fixture with mixed leaf types (string, number, bool, null) and at least three nesting levels (Requirement 5.1.e).
  • Risks: None. Reuses the existing flatten primitive verbatim.

_locate_key_line

Field Detail
Intent Best-effort line-number resolution for a dotted key in the raw JSON source text, with a deterministic line-1 fallback.
Requirements 2.1, 2.2

Responsibilities & Constraints

  • Accept the splitlines view of a JSON file (text_lines: list[str]) and a dotted key (dotted_key: str).
  • Search for the leaf segment of the dotted key (after the last .) wrapped in JSON quotes, e.g. "missingKey". Return the 1-based line number of the first match.
  • Fall back to 1 when no match is found (mirrors _value_line_number).
  • Performance must remain linear in the number of lines.

Dependencies

  • Inbound: run_parity_check (P0).
  • Outbound: none.

Contracts: Service [x]

Service Interface
def _locate_key_line(text_lines: list[str], dotted_key: str) -> int:
    ...
  • Preconditions: dotted_key non-empty; text_lines is the result of Path.read_text(...).splitlines().
  • Postconditions: returns an integer ≥ 1.
  • Invariants: When the leaf segment appears in text_lines wrapped in "...", the return is the (1-based) line number of the first occurrence. Otherwise the return is 1.

Implementation Notes

  • Integration: One call site (run_parity_check).
  • Validation: Unit-test the exact-match path, the multi-occurrence path (first match wins), and the not-found fallback.
  • Risks: A leaf segment that also appears as part of another (unrelated) key or in a value text could yield a slightly misleading line number. Acceptable: the dotted key in the failure message is the source of truth; the line is a navigation aid. Documented in the docstring.

_format_parity_finding

Field Detail
Intent Format a single parity-failure line in the canonical layout used by the guard.
Requirements 2.1

Responsibilities & Constraints

  • Produce strings of the exact form <file>:<line>: parity-en-only: <dotted-key> or <file>:<line>: parity-zh-only: <dotted-key>.
  • Mirror the existing _format_locale_finding style (<file>:<line>: <category>: <payload>).

Dependencies

  • Inbound: run_parity_check (P0).
  • Outbound: none.

Contracts: Service [x]

Service Interface
def _format_parity_finding(file_rel_path: str, line_no: int, dotted_key: str, side: str) -> str:
    ...
  • Preconditions: side in {"en-only", "zh-only"}; file_rel_path is one of EN_JSON_REL_PATH / ZH_JSON_REL_PATH; line_no >= 1.
  • Postconditions: returns a single line with no embedded newline.
  • Invariants: The category token in the line is exactly parity-en-only or parity-zh-only so log greps match deterministically.

Guard / DTO layer

ParityResult

Field Detail
Intent Immutable carrier for parity-check outcome consumed by run_check.
Requirements 1.2, 2.3, 2.5

Contracts: State [x]

State Management
  • State model:
class ParityResult(NamedTuple):
    passed: bool
    failure_lines: list[str]  # already-formatted lines, including the trailing "parity: en-only=N, zh-only=M" summary on failure
    success_summary: str | None  # populated only when passed is True
  • Persistence & consistency: in-memory only; constructed by run_parity_check and consumed by run_check.
  • Concurrency strategy: n/a (single-process, single-call).

Guard / orchestrator-leaf

run_parity_check

Field Detail
Intent Compute the locale-key parity outcome and produce a ParityResult.
Requirements 1.11.5, 2.12.5

Responsibilities & Constraints

  • Read both locales/en.json and locales/zh.json from repo_root.
  • Flatten each via _flatten_keys and compute the symmetric difference.
  • For each en-only key (sorted lexicographically): resolve its line via _locate_key_line over the EN catalogue's source-text lines, and emit a parity-en-only line via _format_parity_finding.
  • For each zh-only key (sorted lexicographically, after en-only): resolve its line via _locate_key_line over the ZH catalogue's source-text lines, and emit a parity-zh-only line.
  • On failure, append a final parity: en-only=N, zh-only=M summary line to failure_lines.
  • On success, build the success summary OK locale-parity: <count> keys per side.
  • If either catalogue file is missing, return a ParityResult(passed=False, failure_lines=[<single error line>], success_summary=None) and let run_check fold the error into the global failed flag.

Dependencies

  • Inbound: run_check (P0).
  • Outbound: _flatten_keys, _locate_key_line, _format_parity_finding (all P0).

Contracts: Service [x]

Service Interface
def run_parity_check(repo_root: Path) -> ParityResult:
    ...
  • Preconditions: repo_root is a valid working-tree directory; locales/en.json and locales/zh.json are expected at the relative paths defined by EN_JSON_REL_PATH and ZH_JSON_REL_PATH.
  • Postconditions: returns a ParityResult. When passed, failure_lines == [] and success_summary is non-None. When not passed, failure_lines is non-empty and ends with a parity: en-only=… summary line; success_summary is None.
  • Invariants: Flattened-key-set computation matches check_parity.py.flatten byte-for-byte for any input. Output is deterministic across runs for identical inputs.

Implementation Notes

  • Integration: Called once per run_check invocation. Skipped entirely in --update-baseline mode (covered by Requirement 3.4 — update_baseline is invoked from main instead of run_check).
  • Validation: Unit-test all required outcomes (Requirement 5.1 ae); integration-test composition (5.1 f).
  • Risks: A malformed JSON catalogue raises json.JSONDecodeError. The function should treat this the same as a missing file (return ParityResult(passed=False, …)), so the guard reports a clean failure rather than crashing CI with a Python traceback.

Guard / orchestrator (modified)

run_check (modification)

Field Detail
Intent Compose all three checks (CJK-clean, per-path ratchet, parity) into one exit code.
Requirements 3.1, 3.2, 3.3

Responsibilities & Constraints

  • After the existing per-path-ratchet block (existing line ~258293) and before the final if not failed block (existing line ~295298), call run_parity_check(repo_root).
  • If the result is not passed, set failed = True, print every entry of result.failure_lines to sys.stderr, one line per print(...) call.
  • If passed, append result.success_summary to success_summary.
  • Return 1 if failed else 0 (unchanged).

Dependencies

  • Inbound: main (P0, via either standalone CLI or test invocation).
  • Outbound: scan_locale_cjk, per-path ratchet helpers, run_parity_check (all P0).

Contracts: Service [x] / State [x]

Service Interface

Unchanged signature: def run_check(repo_root: Path, baseline_path: Path) -> int.

  • Preconditions: unchanged.
  • Postconditions: exit code reflects all three checks (was: two checks).
  • Invariants: still no short-circuit between checks.

Implementation Notes

  • Integration: One inserted block of ~10 lines in the existing function.
  • Validation: Existing CLI smoke tests continue to pass; new RunCheckEndToEndTests cases assert correct fail/pass propagation when only the parity check fails, only an existing check fails, or both fail.
  • Risks: A future maintainer could accidentally short-circuit by inserting an early return between blocks. Mitigated by the composition test (Requirement 5.1.f) which fails if any block is skipped.

Tests

ParityCheckTests

Field Detail
Intent Unit + integration coverage for the parity check, matching the style of existing RunCheckEndToEndTests.
Requirements 5.1 (af), 5.2, 5.3, 5.4

Responsibilities & Constraints

  • Use unittest, tempfile.TemporaryDirectory, and the existing _make_repo / _commit_file test helpers.
  • Each test owns its own ephemeral repo. No reliance on the live locales/ content for negative paths (Requirement 5.3).
  • Assertions check exit code AND substring presence of the failure category tokens (parity-en-only, parity-zh-only) AND that the summary line is the last failure line.

Dependencies

  • Inbound: unittest.main.
  • Outbound: i18n_cjk_guard.run_parity_check, i18n_cjk_guard.run_check (both P0).

Implementation Notes

  • Test cases (one per Requirement 5.1 sub-bullet):
    • (a) test_passes_when_keys_match — both catalogues identical → run_parity_check returns passed=True; run_check returns 0.
    • (b) test_fails_on_en_only_keyen.json has an extra key → run_parity_check returns passed=False, failure includes parity-en-only, summary is parity: en-only=1, zh-only=0.
    • (c) test_fails_on_zh_only_key — symmetric of (b).
    • (d) test_fails_on_both_sided_divergence — failure list contains both parity-en-only and parity-zh-only lines, ordered en-first then zh, each lex-sorted within its group.
    • (e) test_passes_with_scalar_leaves_at_same_path — both catalogues have a scalar (e.g. null, 42, false) at the same dotted path → parity passes (Requirement 1.5).
    • (f) test_run_check_no_short_circuit — one repo plants both a CJK in en.json and a parity-divergent key. Expect: exit 1; stderr contains both cjk-in-en and parity-en-only (or parity-zh-only); the per-path-ratchet success summary is suppressed (since failed).
  • Risks: Test fixtures must use ensure_ascii=False JSON to match the live catalogue style.

Error Handling

Error Strategy

  • Missing catalogue filerun_parity_check returns ParityResult(passed=False, failure_lines=[<missing-file-line>], success_summary=None). run_check flips failed, prints the line to stderr, returns 1.
  • Malformed JSON → same path as missing catalogue. json.JSONDecodeError is caught inside run_parity_check; the line printed names the offending file and the parser's msg.
  • Parity divergence (the expected unhappy path) → fail per Requirements 1.3 / 1.4 / 2.12.5.
  • _locate_key_line cannot find the key → fall back to line 1 (Requirement 2.2). Not an error; the caller proceeds.
  • No-short-circuit invariant → enforced by the orchestrator's accumulator pattern; covered by Requirement 5.1.f.

Monitoring

CI workflow logs (GitHub Actions) are the sole observability surface. Failure lines are designed to be greppable: parity-en-only, parity-zh-only, parity: en-only=, parity: zh-only= are stable tokens.

Testing Strategy

Unit Tests

  • _flatten_keys: empty input, flat input, mixed-type leaves, three-level nesting, null and scalar leaves.
  • _locate_key_line: exact match, multi-occurrence (first wins), not found (line-1 fallback).
  • _format_parity_finding: en-only and zh-only sides, embedded special characters in key names (e.g. underscores, digits).
  • ParityResult: pass-shape and fail-shape construction.

Integration Tests

  • All six ParityCheckTests sub-cases listed above.
  • The composition case (Requirement 5.1.f) inside RunCheckCompositionTests (or appended to RunCheckEndToEndTests).
  • A regression of the existing RunCheckEndToEndTests cases after extending _make_full_repo to write a default parity-clean locales/zh.json.

Performance / Load

  • One sanity case: parity check on a synthetic 10 000-key catalogue completes in well under one second on the CI runner. Asserted by a time.perf_counter() budget of 1.0 s in the integration test.

Performance & Scalability

  • Catalogue size: ~1000 keys today; growth bounded by the number of UI strings + log keys. Even at 10× the current size, _flatten + set-diff remains negligible (<100 ms).
  • The CI workflow timeout is 1 minute (.github/workflows/i18n-cjk-guard.yml:timeout-minutes: 1); the new check adds at most tens of milliseconds.

Supporting References

  • gap-analysis.md (this spec) — implementation-approach options A/B/C with rationale.
  • research.md (this spec) — design decision records.
  • .kiro/specs/i18n-ci-guard/design.md — prior CI guard's design doc (style and boundary precedents).
  • .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py — reference parity algorithm.