# Design — i18n-locale-parity-guard
## Overview
This feature extends the project's PR-time i18n CI guard so that any pull request which introduces a key in only one of `locales/en.json` / `locales/zh.json` fails. It satisfies acceptance criterion #4 of epic #11 (locale-key parity) with a permanent automated check.
**Purpose**: Lock in locale-catalogue key parity as a permanent CI invariant so that AC #4 of epic #11 cannot regress as new strings are added.
**Users**: Project maintainers and PR authors. Maintainers gain a hard regression gate; PR authors gain a script they can run locally to confirm parity before pushing.
**Impact**: Adds a third check to the existing PR-time guard `scripts/ci/i18n_cjk_guard.py`. No production source under `backend/app/`, `frontend/src/`, or `locales/` is modified by this spec.
### Goals
- Fail any PR whose flattened-key set in `locales/en.json` differs from that of `locales/zh.json`.
- Print actionable failure lines (`<file>:<line>: parity-<en|zh>-only: <dotted-key>`) and a summary count.
- Compose with the existing CJK-clean and per-path-ratchet checks in a single CLI invocation, with a single exit code, no short-circuit.
- Run end-to-end in well under one second on the live catalogues; stdlib-only.
- Pass on `main` at the moment this spec ships (live catalogues are already parity-clean).
### Non-Goals
- Re-implementing the manual audit pipeline at `.kiro/specs/i18n-e2e-english-verification/audit/scripts/`. The new check is the CI extract; the audit retains its own copy of `check_parity.py`.
- Cross-locale value-equality, identical-value heuristics, or ICU-placeholder-shape checks.
- Auto-creating missing keys, suggesting translations, or reformatting the catalogues.
- Modifying the `locales/` schema, the `vue-i18n` runtime, or `backend/app/utils/locale.py`.
- Adding a new GitHub Actions workflow or workflow step.
## Boundary Commitments
### This Spec Owns
- The new parity-check helpers (`_flatten_keys`, `_locate_key_line`, `_format_parity_finding`, `run_parity_check`) and constants (`ZH_JSON_REL_PATH`) inside `scripts/ci/i18n_cjk_guard.py`.
- The new third block of `run_check` that invokes `run_parity_check` and integrates its result into the existing `failed` accumulator and `success_summary` collector.
- The pass/fail semantics of the locale-key parity check.
- New unit / integration tests under `scripts/ci/tests/` covering the parity check and its composition.
### Out of Boundary
- The audit pipeline at `.kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py` (independent, manual-only).
- The structure or format of the baseline file `.kiro/specs/i18n-ci-guard/baseline.txt` (parity is binary; no baseline needed).
- The workflow file `.github/workflows/i18n-cjk-guard.yml` (unchanged; same `python scripts/ci/i18n_cjk_guard.py` invocation already covers the new check).
- Any change to `locales/en.json` or `locales/zh.json` content.
- Open follow-up issues #7, #23, #25 (out-of-scope translation work).
### Allowed Dependencies
- Python ≥3.11 standard library (`json`, `os`, `pathlib`, `re`, `subprocess`, `sys`, `argparse`, `unittest`).
- The existing helpers `_flatten`, `_value_line_number`, `_truncate`, the `EN_JSON_REL_PATH` constant, and the `run_check`/`update_baseline` functions in `scripts/ci/i18n_cjk_guard.py`.
- `git` (for the existing CJK-counting block, untouched here).
### Revalidation Triggers
- Adding a third locale catalogue → parity becomes pairwise; design must be revisited.
- Changing the `flatten` contract (e.g. encoding non-dict containers like lists) → the parity check's "exact match with `check_parity.py`" clause must be re-asserted against the new contract.
- Splitting the guard into multiple CLI scripts → Requirement 3 ("one invocation") must be re-anchored.
## Architecture
### Existing Architecture Analysis
The guard is a single-file Python CLI: `scripts/ci/i18n_cjk_guard.py` (~393 lines, stdlib-only) invoked by one workflow step in `.github/workflows/i18n-cjk-guard.yml`. Its `run_check(repo_root, baseline_path) -> int` function is the orchestrator; today it composes two checks without short-circuit:
1. `scan_locale_cjk(en_json_path)` — fail when `locales/en.json` contains any CJK character.
2. Per-path baseline ratchet — fail when `count_path_cjk(repo_root, p)` exceeds `read_baseline(...)[p]` for any `p` in `("backend/app", "frontend/src")`.
A `failed: bool` accumulator is set independently by each block; a `success_summary: list[str]` collects "OK …" lines that print only on full success. This design extends it with a third block.
The audit pipeline at `.kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py` already implements the algorithm we need (recursive `flatten` + symmetric difference). Its logic is the canonical reference for Requirement 1.1.
### Architecture Pattern & Boundary Map
```mermaid
graph TB
Workflow[GitHub Actions step]
Main[main entry]
UpdateBaseline[update_baseline]
RunCheck[run_check orchestrator]
CjkClean[scan_locale_cjk]
Ratchet[count_path_cjk + read_baseline]
Parity[run_parity_check NEW]
EnJson[locales en.json]
ZhJson[locales zh.json]
BaselineFile[baseline.txt]
Workflow --> Main
Main -->|--update-baseline| UpdateBaseline
Main --> RunCheck
RunCheck --> CjkClean
RunCheck --> Ratchet
RunCheck --> Parity
CjkClean --> EnJson
Ratchet --> BaselineFile
Parity --> EnJson
Parity --> ZhJson
```
**Architecture Integration**:
- **Selected pattern**: Composed checks inside a single orchestrator (`run_check`). Each check is an independent function that returns a pass/fail signal and a list of human-readable lines; the orchestrator accumulates them.
- **Domain/feature boundaries**: Parity logic is internal to the guard module. It does not depend on the audit pipeline, the per-path ratchet, or the locale runtime.
- **Existing patterns preserved**: No-short-circuit composition, stderr-for-failure / stdout-for-success, lexicographic ordering for determinism, atomic-write / tmp-rename for any new persistence (none added here).
- **New components rationale**: `run_parity_check` is the only new orchestrator-level function; small private helpers (`_flatten_keys`, `_locate_key_line`, `_format_parity_finding`) keep `run_parity_check`'s body short and individually testable.
- **Steering compliance**: Stdlib-only; explicit type hints (PEP 604 union syntax already in use in this module); single-responsibility helpers; module dependency direction unchanged (still no imports from `backend/`, `frontend/`, or `locales/` runtime code).
### Technology Stack
| Layer | Choice / Version | Role in Feature | Notes |
|-------|------------------|-----------------|-------|
| Backend / Services | n/a | n/a | This is a CI tool; no backend or service code is touched. |
| Infrastructure / Runtime | Python 3.11 stdlib (`json`, `pathlib`, `re`, `subprocess`, `sys`, `argparse`); GitHub Actions `ubuntu-latest`; `actions/checkout@v4`; `actions/setup-python@v5` | Runtime for the guard script and its new parity check. | Versions match the existing guard. No new dependencies; `pyproject.toml` and CI image unchanged. |
| Test Tooling | Python `unittest` (stdlib) | Drives parity check unit + integration tests. | Same framework as existing tests in `scripts/ci/tests/test_i18n_cjk_guard.py`. |
## File Structure Plan
### Directory Structure
```
scripts/
└── ci/
    ├── i18n_cjk_guard.py           # Extended: adds parity helpers + third block in run_check
    └── tests/
        └── test_i18n_cjk_guard.py  # Extended: adds ParityCheckTests + composition test
```
### Modified Files
- `scripts/ci/i18n_cjk_guard.py`
  - Add module-level constants: `ZH_JSON_REL_PATH = "locales/zh.json"`.
  - Add private helpers: `_flatten_keys`, `_locate_key_line`, `_format_parity_finding`.
  - Add public function: `run_parity_check(repo_root: Path) -> ParityResult`.
  - Add a new `NamedTuple` (or `@dataclass(frozen=True, slots=True)`) `ParityResult` with fields `(passed: bool, failure_lines: list[str], success_summary: str | None)`.
  - Edit `run_check`: insert the parity block after the per-path-ratchet block, before the final `if not failed: print(success_summary)` block. Match the existing accumulator idiom.
  - Update the module docstring to list three checks.
- `scripts/ci/tests/test_i18n_cjk_guard.py`
  - Extend `_make_full_repo` (or add a sibling `_make_full_repo_with_zh`) to write a `locales/zh.json` alongside the existing `locales/en.json`. Keep the default ZH a parity-clean mirror of the EN fixture so existing tests do not need to change semantically.
  - Add new test class `ParityCheckTests` covering Requirements 1.1, 1.2, 1.3, 1.4, 1.5, 2.1, 2.2, 2.3, 2.5.
  - Add one composition test (Requirement 5.1.f) inside `RunCheckEndToEndTests` (or a new `RunCheckCompositionTests` class) that plants a CJK string and a parity divergence in the same repo and asserts both failure lines + exit 1.
  - Update existing `RunCheckEndToEndTests.test_*` to either commit a parity-clean `locales/zh.json` or assert the parity check now also runs but does not flip the test outcome.
### Files Not Created
- No new source file is created. Option C (separate `locale_parity.py` helper module) was rejected in `gap-analysis.md` and `research.md`.
- No new workflow file. The existing `.github/workflows/i18n-cjk-guard.yml` is invoked unchanged.
## Requirements Traceability
| Requirement | Summary | Components | Interfaces | Flows |
|-------------|---------|------------|------------|-------|
| 1.1 | Flatten EN/ZH into matching dotted-key sets | `i18n_cjk_guard._flatten_keys` (new), reuses `_flatten` | `_flatten_keys(data: dict) -> set[str]` | n/a |
| 1.2 | Pass on identical key sets, success line includes shared count | `run_parity_check`, `run_check` | `ParityResult.success_summary` | Run-Check Composition |
| 1.3 / 1.4 | Fail on en-only or zh-only keys | `run_parity_check` | `ParityResult.passed`, `ParityResult.failure_lines` | Run-Check Composition |
| 1.5 | Dict values are non-leaves (descended into); scalar values are leaves | `_flatten_keys` (no type narrowing) | n/a | n/a |
| 2.1 | `<file>:<line>: parity-<side>-only: <key>` lines | `_format_parity_finding`, `_locate_key_line` | `_format_parity_finding(file, line, key, side) -> str` | n/a |
| 2.2 | Line-1 fallback when key not located | `_locate_key_line` | `_locate_key_line(text_lines, key) -> int` (returns 1 on miss) | n/a |
| 2.3 | Final `parity: en-only=N, zh-only=M` summary | `run_parity_check` | Last entry of `ParityResult.failure_lines` on failure | n/a |
| 2.4 | All parity output to stderr | `run_check` integration block | `print(..., file=sys.stderr)` | Run-Check Composition |
| 2.5 | Lexicographic ordering | `run_parity_check` | `sorted(...)` over symmetric difference | n/a |
| 3.1 | All checks run, no short-circuit | `run_check` (existing accumulator pattern) | `failed: bool` accumulator | Run-Check Composition |
| 3.2 / 3.3 | Single exit code: 1 on any fail, 0 otherwise | `run_check` | Returns `1 if failed else 0` | Run-Check Composition |
| 3.4 / 3.5 | `--update-baseline`, `--baseline`, `--repo-root` flags unchanged | `main`, `_build_parser` | Existing argparse surface | n/a |
| 3.6 | Workflow file unchanged | `.github/workflows/i18n-cjk-guard.yml` | n/a (no edit) | n/a |
| 4.1 | Stdlib-only | `i18n_cjk_guard` imports | No new imports | n/a |
| 4.2 | Sub-second runtime | `_flatten_keys` is O(keys); set-diff is O(keys) | n/a | n/a |
| 4.3 | Deterministic output | All sorts lexicographic | n/a | n/a |
| 5.1 (a–f) | Tests for success, en-only, zh-only, both, scalar-leaf, composition | `scripts/ci/tests/test_i18n_cjk_guard.py:ParityCheckTests` + composition test | n/a | n/a |
| 5.2 / 5.3 / 5.4 | Match existing test style; isolated fixtures; clean run on parity-clean repo | Same test file | n/a | n/a |
| 6.1 | Guard passes on live catalogues at HEAD | Manual run at implementation time | `python scripts/ci/i18n_cjk_guard.py` exit 0 | n/a |
| 6.2 | If divergence found, document in tasks.md and fix | n/a (does not trigger; live parity holds) | n/a | n/a |
## System Flows
### Run-Check Composition
```mermaid
sequenceDiagram
participant CLI as main
participant Orch as run_check
participant CjkChk as scan_locale_cjk
participant RatChk as ratchet block
participant ParChk as run_parity_check
participant Out as stderr/stdout
CLI->>Orch: run_check repo baseline
Orch->>CjkChk: scan en.json
CjkChk-->>Orch: findings list
alt findings non-empty
Orch->>Out: stderr cjk-in-en lines
Note over Orch: failed = True
else
Note over Orch: success summary append
end
Orch->>RatChk: count + read baseline
RatChk-->>Orch: regressions list
alt regressions non-empty
Orch->>Out: stderr cjk-regression lines + refresh hint
Note over Orch: failed = True
else
Note over Orch: success summary append
end
Orch->>ParChk: run parity check
ParChk-->>Orch: ParityResult
alt parity failed
Orch->>Out: stderr parity lines + parity summary
Note over Orch: failed = True
else
Note over Orch: success summary append
end
alt failed false
Orch->>Out: stdout success lines
end
Orch-->>CLI: 1 if failed else 0
```
**Key decisions**:
- The parity block is appended last so its (potentially long) failure list is contiguous in the failure stream.
- The `failed` accumulator is shared with the prior two blocks; this is the only mechanism for cross-block signalling.
- The summary line `parity: en-only=N, zh-only=M` is appended to `ParityResult.failure_lines` (last entry) so the orchestrator can print all failure lines uniformly without a special-case branch.
## Components and Interfaces
| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|-----------|--------------|--------|--------------|--------------------------|-----------|
| `_flatten_keys` | Guard / helper | Return the dotted-key set of a parsed JSON catalogue, mirroring `check_parity.py.flatten`. | 1.1, 1.5 | `_flatten` (P0, existing) | Service |
| `_locate_key_line` | Guard / helper | Best-effort line-number resolution for a dotted key in raw JSON text, with line-1 fallback. | 2.1, 2.2 | none | Service |
| `_format_parity_finding` | Guard / helper | Format one failure line as `<file>:<line>: parity-<side>-only: <key>`. | 2.1 | none | Service |
| `ParityResult` | Guard / DTO | Carry parity-check outcome (passed flag, failure lines, success-summary line). | 1.2, 2.3, 2.5 | none | State |
| `run_parity_check` | Guard / orchestrator-leaf | Read both catalogues, compute symmetric difference, build `ParityResult`. | 1.1–1.5, 2.1–2.5 | `_flatten_keys` (P0), `_locate_key_line` (P0), `_format_parity_finding` (P0) | Service |
| `run_check` (modified) | Guard / orchestrator | Compose the three checks with a single `failed` accumulator and exit code. | 3.1–3.3 | All three checks (P0) | Service |
| `ParityCheckTests` (test) | Tests | Unit + integration coverage for parity. | 5.1 (a–f), 5.2–5.4 | `run_parity_check`, `run_check` (P0) | Service |
### Guard / helper layer
#### `_flatten_keys`
| Field | Detail |
|-------|--------|
| Intent | Return the set of dotted-key paths of a parsed JSON object, mirroring `check_parity.py.flatten`. |
| Requirements | 1.1, 1.5 |
**Responsibilities & Constraints**
- Iterate via the existing `_flatten(prefix, value, out)` helper to guarantee identical path semantics.
- Descend only into `dict`. Any non-dict (string, number, bool, null, list) at a leaf produces a key.
- Return a `set[str]` so the parity caller can compute symmetric differences without re-deduplicating.
**Dependencies**
- Inbound: `run_parity_check` (P0).
- Outbound: `_flatten` (P0, existing private helper in same module).
**Contracts**: Service [x]
##### Service Interface
```python
def _flatten_keys(data: dict[str, object]) -> set[str]:
    ...
```
- Preconditions: `data` is the result of `json.loads` over a catalogue file (i.e., a `dict` at the top level).
- Postconditions: every dotted path returned corresponds to a non-`dict` leaf in `data`. The set is unordered; callers must sort before formatting output (Requirement 2.5).
- Invariants: `_flatten_keys({}) == set()`. For any catalogue `c`, `_flatten_keys(c)` is identical to the set of keys produced by `check_parity.py.flatten(c)`.
**Implementation Notes**
- Integration: One call site (`run_parity_check`).
- Validation: Unit-test against a hand-rolled fixture with mixed leaf types (string, number, bool, null) and at least three nesting levels (Requirement 5.1.e).
- Risks: None. Reuses the existing flatten primitive verbatim.
#### `_locate_key_line`
| Field | Detail |
|-------|--------|
| Intent | Best-effort line-number resolution for a dotted key in the raw JSON source text, with a deterministic line-1 fallback. |
| Requirements | 2.1, 2.2 |
**Responsibilities & Constraints**
- Accept the splitlines view of a JSON file (`text_lines: list[str]`) and a dotted key (`dotted_key: str`).
- Search for the leaf segment of the dotted key (after the last `.`) wrapped in JSON quotes, e.g. `"missingKey"`. Return the 1-based line number of the first match.
- Fall back to `1` when no match is found (mirrors `_value_line_number`).
- Performance must remain linear in the number of lines.
**Dependencies**
- Inbound: `run_parity_check` (P0).
- Outbound: none.
**Contracts**: Service [x]
##### Service Interface
```python
def _locate_key_line(text_lines: list[str], dotted_key: str) -> int:
    ...
```
- Preconditions: `dotted_key` non-empty; `text_lines` is the result of `Path.read_text(...).splitlines()`.
- Postconditions: returns an integer ≥ 1.
- Invariants: When the leaf segment appears in `text_lines` wrapped in `"..."`, the return is the (1-based) line number of the first occurrence. Otherwise the return is `1`.
**Implementation Notes**
- Integration: One call site (`run_parity_check`).
- Validation: Unit-test the exact-match path, the multi-occurrence path (first match wins), and the not-found fallback.
- Risks: A leaf segment that also appears as part of another (unrelated) key or in a value text could yield a slightly misleading line number. Acceptable: the dotted key in the failure message is the source of truth; the line is a navigation aid. Documented in the docstring.
#### `_format_parity_finding`
| Field | Detail |
|-------|--------|
| Intent | Format a single parity-failure line in the canonical layout used by the guard. |
| Requirements | 2.1 |
**Responsibilities & Constraints**
- Produce strings of the exact form `<file>:<line>: parity-en-only: <dotted-key>` or `<file>:<line>: parity-zh-only: <dotted-key>`.
- Mirror the existing `_format_locale_finding` style (`<file>:<line>: <category>: <payload>`).
**Dependencies**
- Inbound: `run_parity_check` (P0).
- Outbound: none.
**Contracts**: Service [x]
##### Service Interface
```python
def _format_parity_finding(file_rel_path: str, line_no: int, dotted_key: str, side: str) -> str:
    ...
```
- Preconditions: `side in {"en-only", "zh-only"}`; `file_rel_path` is one of `EN_JSON_REL_PATH` / `ZH_JSON_REL_PATH`; `line_no >= 1`.
- Postconditions: returns a single line with no embedded newline.
- Invariants: The category token in the line is exactly `parity-en-only` or `parity-zh-only` so log greps match deterministically.
### Guard / DTO layer
#### `ParityResult`
| Field | Detail |
|-------|--------|
| Intent | Immutable carrier for parity-check outcome consumed by `run_check`. |
| Requirements | 1.2, 2.3, 2.5 |
**Contracts**: State [x]
##### State Management
- State model:
```python
class ParityResult(NamedTuple):
    passed: bool
    failure_lines: list[str]  # already-formatted lines, including the trailing "parity: en-only=N, zh-only=M" summary on failure
    success_summary: str | None  # populated only when passed is True
```
- Persistence & consistency: in-memory only; constructed by `run_parity_check` and consumed by `run_check`.
- Concurrency strategy: n/a (single-process, single-call).
### Guard / orchestrator-leaf
#### `run_parity_check`
| Field | Detail |
|-------|--------|
| Intent | Compute the locale-key parity outcome and produce a `ParityResult`. |
| Requirements | 1.1–1.5, 2.1–2.5 |
**Responsibilities & Constraints**
- Read both `locales/en.json` and `locales/zh.json` from `repo_root`.
- Flatten each via `_flatten_keys` and compute the symmetric difference.
- For each en-only key (sorted lexicographically): resolve its line via `_locate_key_line` over the EN catalogue's source-text lines, and emit a `parity-en-only` line via `_format_parity_finding`.
- For each zh-only key (sorted lexicographically, after en-only): resolve its line via `_locate_key_line` over the ZH catalogue's source-text lines, and emit a `parity-zh-only` line.
- On failure, append a final `parity: en-only=N, zh-only=M` summary line to `failure_lines`.
- On success, build the success summary `OK locale-parity: <count> keys per side`.
- If either catalogue file is missing, return a `ParityResult(passed=False, failure_lines=[<single error line>], success_summary=None)` and let `run_check` fold the error into the global `failed` flag.
**Dependencies**
- Inbound: `run_check` (P0).
- Outbound: `_flatten_keys`, `_locate_key_line`, `_format_parity_finding` (all P0).
**Contracts**: Service [x]
##### Service Interface
```python
def run_parity_check(repo_root: Path) -> ParityResult:
    ...
```
- Preconditions: `repo_root` is a valid working-tree directory; `locales/en.json` and `locales/zh.json` are expected at the relative paths defined by `EN_JSON_REL_PATH` and `ZH_JSON_REL_PATH`.
- Postconditions: returns a `ParityResult`. When `passed`, `failure_lines == []` and `success_summary` is non-`None`. When not `passed`, `success_summary` is `None` and `failure_lines` is non-empty; on key divergence it ends with a `parity: en-only=…` summary line, whereas a missing or malformed catalogue yields the single error line alone.
- Invariants: Flattened-key-set computation matches `check_parity.py.flatten` byte-for-byte for any input. Output is deterministic across runs for identical inputs.
**Implementation Notes**
- Integration: Called once per `run_check` invocation. Skipped entirely in `--update-baseline` mode (covered by Requirement 3.4 — `update_baseline` is invoked from `main` instead of `run_check`).
- Validation: Unit-test all required outcomes (Requirement 5.1 a–e); integration-test composition (5.1 f).
- Risks: A malformed JSON catalogue raises `json.JSONDecodeError`. The function should treat this the same as a missing file (return `ParityResult(passed=False, …)`), so the guard reports a clean failure rather than crashing CI with a Python traceback.
### Guard / orchestrator (modified)
#### `run_check` (modification)
| Field | Detail |
|-------|--------|
| Intent | Compose all three checks (CJK-clean, per-path ratchet, parity) into one exit code. |
| Requirements | 3.1, 3.2, 3.3 |
**Responsibilities & Constraints**
- After the existing per-path-ratchet block (existing line ~258–293) and before the final `if not failed` block (existing line ~295–298), call `run_parity_check(repo_root)`.
- If the result is not passed, set `failed = True`, print every entry of `result.failure_lines` to `sys.stderr`, one line per `print(...)` call.
- If passed, append `result.success_summary` to `success_summary`.
- Return `1 if failed else 0` (unchanged).
**Dependencies**
- Inbound: `main` (P0, via either standalone CLI or test invocation).
- Outbound: `scan_locale_cjk`, per-path ratchet helpers, `run_parity_check` (all P0).
**Contracts**: Service [x] / State [x]
##### Service Interface
Unchanged signature: `def run_check(repo_root: Path, baseline_path: Path) -> int`.
- Preconditions: unchanged.
- Postconditions: exit code reflects all three checks (was: two checks).
- Invariants: still no short-circuit between checks.
**Implementation Notes**
- Integration: One inserted block of ~10 lines in the existing function.
- Validation: Existing CLI smoke tests continue to pass; new `RunCheckEndToEndTests` cases assert correct fail/pass propagation when only the parity check fails, only an existing check fails, or both fail.
- Risks: A future maintainer could accidentally short-circuit by inserting an early `return` between blocks. Mitigated by the composition test (Requirement 5.1.f) which fails if any block is skipped.
### Tests
#### `ParityCheckTests`
| Field | Detail |
|-------|--------|
| Intent | Unit + integration coverage for the parity check, matching the style of existing `RunCheckEndToEndTests`. |
| Requirements | 5.1 (a–f), 5.2, 5.3, 5.4 |
**Responsibilities & Constraints**
- Use `unittest`, `tempfile.TemporaryDirectory`, and the existing `_make_repo` / `_commit_file` test helpers.
- Each test owns its own ephemeral repo. No reliance on the live `locales/` content for negative paths (Requirement 5.3).
- Assertions check exit code AND substring presence of the failure category tokens (`parity-en-only`, `parity-zh-only`) AND that the summary line is the last failure line.
**Dependencies**
- Inbound: `unittest.main`.
- Outbound: `i18n_cjk_guard.run_parity_check`, `i18n_cjk_guard.run_check` (both P0).
**Implementation Notes**
- Test cases (one per Requirement 5.1 sub-bullet):
- (a) `test_passes_when_keys_match` — both catalogues identical → `run_parity_check` returns `passed=True`; `run_check` returns 0.
- (b) `test_fails_on_en_only_key` — `en.json` has an extra key → `run_parity_check` returns `passed=False`, failure includes `parity-en-only`, summary is `parity: en-only=1, zh-only=0`.
- (c) `test_fails_on_zh_only_key` — symmetric of (b).
- (d) `test_fails_on_both_sided_divergence` — failure list contains both `parity-en-only` and `parity-zh-only` lines, ordered en-first then zh, each lex-sorted within its group.
- (e) `test_passes_with_scalar_leaves_at_same_path` — both catalogues have a scalar (e.g. `null`, `42`, `false`) at the same dotted path → parity passes (Requirement 1.5).
- (f) `test_run_check_no_short_circuit` — one repo plants both a CJK in `en.json` and a parity-divergent key. Expect: exit 1; stderr contains both `cjk-in-en` and `parity-en-only` (or `parity-zh-only`); the per-path-ratchet success summary is suppressed (since failed).
- Risks: Test fixtures must use `ensure_ascii=False` JSON to match the live catalogue style.
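
As an illustration of the fixture risk noted above, the following sketch writes a pair of catalogues with `ensure_ascii=False`. The `write_catalogue` helper is hypothetical (the existing tests use `_make_repo` / `_commit_file`, which are not reproduced here), and `indent=2` is an assumption about the live catalogue layout.

```python
import json
import tempfile
from pathlib import Path


def write_catalogue(repo_root: Path, rel_path: str, data: dict[str, object]) -> Path:
    """Hypothetical fixture helper: write a catalogue with literal CJK text."""
    path = repo_root / rel_path
    path.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps CJK characters literal instead of \uXXXX escapes.
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    en = write_catalogue(root, "locales/en.json", {"nav": {"home": "Home", "extra": "Extra"}})
    zh = write_catalogue(root, "locales/zh.json", {"nav": {"home": "首页"}})
    zh_text = zh.read_text(encoding="utf-8")
    en_keys = set(json.loads(en.read_text(encoding="utf-8"))["nav"])
    zh_keys = set(json.loads(zh_text)["nav"])

print("首页" in zh_text)          # True
print(sorted(en_keys - zh_keys))  # ['extra']
```

With the default `ensure_ascii=True`, the ZH fixture would contain `\u` escapes rather than literal characters, so it would no longer resemble the live `locales/zh.json`.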
## Error Handling
### Error Strategy
- **Missing catalogue file** → `run_parity_check` returns `ParityResult(passed=False, failure_lines=[<missing-file-line>], success_summary=None)`. `run_check` flips `failed`, prints the line to stderr, returns 1.
- **Malformed JSON** → same path as missing catalogue. `json.JSONDecodeError` is caught inside `run_parity_check`; the line printed names the offending file and the parser's `msg`.
- **Parity divergence** (the expected unhappy path) → fail per Requirements 1.3 / 1.4 / 2.1–2.5.
- **`_locate_key_line` cannot find the key** → fall back to line 1 (Requirement 2.2). Not an error; the caller proceeds.
- **No-short-circuit invariant** → enforced by the orchestrator's accumulator pattern; covered by Requirement 5.1.f.
### Monitoring
CI workflow logs (GitHub Actions) are the sole observability surface. Failure lines are designed to be greppable: `parity-en-only`, `parity-zh-only`, and the summary prefix `parity: en-only=` are stable tokens; the `zh-only=M` count follows on the same summary line.
## Testing Strategy
### Unit Tests
- `_flatten_keys`: empty input, flat input, mixed-type leaves, three-level nesting, `null` and scalar leaves.
- `_locate_key_line`: exact match, multi-occurrence (first wins), not found (line-1 fallback).
- `_format_parity_finding`: en-only and zh-only sides, embedded special characters in key names (e.g. underscores, digits).
- `ParityResult`: pass-shape and fail-shape construction.
### Integration Tests
- All six `ParityCheckTests` sub-cases listed above.
- The composition case (Requirement 5.1.f) inside `RunCheckCompositionTests` (or appended to `RunCheckEndToEndTests`).
- A regression of the existing `RunCheckEndToEndTests` cases after extending `_make_full_repo` to write a default parity-clean `locales/zh.json`.
### Performance / Load
- One sanity case: parity check on a synthetic 10 000-key catalogue completes in well under one second on the CI runner. Asserted by a `time.perf_counter()` budget of 1.0 s in the integration test.
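
A rough sketch of such a sanity case, assuming a two-level synthetic catalogue is representative enough. `flatten_keys` and `synthetic_catalogue` are hypothetical stand-ins rather than the guard's own helpers.

```python
import time


def flatten_keys(value: object, prefix: str = "") -> set[str]:
    """Stand-in flatten: dotted paths of all non-dict leaves."""
    if not isinstance(value, dict):
        return {prefix}
    found: set[str] = set()
    for key, child in value.items():
        found |= flatten_keys(child, f"{prefix}.{key}" if prefix else key)
    return found


def synthetic_catalogue(groups: int, per_group: int) -> dict[str, dict[str, str]]:
    """Build groups * per_group leaf keys under two nesting levels."""
    return {f"group{g}": {f"key{k}": "text" for k in range(per_group)} for g in range(groups)}


en = synthetic_catalogue(100, 100)  # 10 000 leaf keys
zh = synthetic_catalogue(100, 100)
del zh["group7"]["key3"]  # plant one en-only divergence

start = time.perf_counter()
en_keys, zh_keys = flatten_keys(en), flatten_keys(zh)
en_only = sorted(en_keys - zh_keys)
zh_only = sorted(zh_keys - en_keys)
elapsed = time.perf_counter() - start

assert elapsed < 1.0, f"parity diff took {elapsed:.3f}s"
print(len(en_keys), en_only, zh_only)
# 10000 ['group7.key3'] []
```

The 1.0 s budget is deliberately generous so that a slow or shared CI runner does not make the test flaky.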
## Performance & Scalability
- Catalogue size: ~1000 keys today; growth bounded by the number of UI strings + log keys. Even at 10× the current size, `_flatten` + set-diff remains negligible (<100 ms).
- The CI workflow timeout is 1 minute (`.github/workflows/i18n-cjk-guard.yml:timeout-minutes: 1`); the new check adds at most tens of milliseconds.
## Supporting References
- `gap-analysis.md` (this spec): implementation-approach options A/B/C with rationale.
- `research.md` (this spec): design decision records.
- `.kiro/specs/i18n-ci-guard/design.md`: prior CI guard's design doc (style and boundary precedents).
- `.kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py`: reference parity algorithm.