MicroFish/.kiro/specs/i18n-locale-parity-guard/gap-analysis.md

Gap Analysis — i18n-locale-parity-guard

Current State Investigation

Domain assets

| Asset | Path | Role |
|---|---|---|
| Existing PR-time guard | scripts/ci/i18n_cjk_guard.py (393 lines) | Runs (a) zero-CJK-in-en.json, (b) per-path CJK ratchet on backend/app + frontend/src. CLI: --update-baseline, --baseline, --repo-root. Stdlib-only. |
| Workflow | .github/workflows/i18n-cjk-guard.yml | pull_request trigger; single step python scripts/ci/i18n_cjk_guard.py. 1-minute timeout. Python 3.11. |
| Existing tests | scripts/ci/tests/test_i18n_cjk_guard.py (358 lines) | unittest, stdlib-only. Per-function test classes (ScanLocaleCjkTests, CountPathCjkTests, BaselineRoundTripTests, RunCheckEndToEndTests, UpdateBaselineTests, CliSmokeTests). Synthetic git repos via tempfile.TemporaryDirectory + git init. |
| Reference parity logic | .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py (128 lines) | Already implements flatten() (recursive dotted-key generator) and the EN/ZH symmetric-difference computation. Used only by the manual audit pipeline; not in CI. |
| Locale catalogues | locales/en.json, locales/zh.json | Two-space-indented JSON, ensure_ascii=False. 962 keys per side at HEAD; symmetric difference 0. Multi-level nesting (e.g. common.confirm, step1.upload.title, log.api.graph.startBuild). |
| Prior spec | .kiro/specs/i18n-ci-guard/{design.md,baseline.txt} | Documents the CJK-guard's design, format, and "scope ratchets only" rationale. The new check should compose, not replace. |

Conventions extracted

  • Module layout: One CLI script per check class; checks compose inside a run_check(...) orchestrator that returns 0/1.
  • Output discipline: Stderr for failures, stdout for success summaries. Each failure line is self-contained (<file>:<line>: <category>: <key/payload>). Refresh hints (when applicable) printed once at the end.
  • No-short-circuit composition: run_check evaluates every check before exiting (existing pattern at lines 230, 258, 271 in i18n_cjk_guard.py).
  • Stdlib-only, deterministic: existing module imports only argparse, json, os, re, subprocess, sys, pathlib. All sorts use lexicographic order.
  • Test-fixture isolation: Each test owns a tempfile.TemporaryDirectory() and writes its own JSON / source files. Negative-path tests never depend on the live locales/.
  • Atomic writes: write_baseline uses tmp-file + os.replace; if any new persistence is added, mirror that pattern.
  • JSON line-resolution helper: _value_line_number(text_lines, value) already implements the line-fallback semantics required by R2.2 (returns 1 when value not found). Reusable for parity reporting if we resolve by key name rather than by value.
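The output-discipline and no-short-circuit conventions above can be sketched as follows. This is a minimal illustration, not the guard's actual run_check: the name run_checks and the (failure_lines, summary) return shape are assumptions made for the example.

```python
import sys
from typing import Callable, List, Tuple

# Hypothetical check signature: each check returns (failure_lines, success_summary_line).
CheckResult = Tuple[List[str], str]


def run_checks(checks: List[Callable[[], CheckResult]]) -> int:
    """Run every check, report all failures, and return a single exit code."""
    failed = False
    success_summary: List[str] = []
    for check in checks:
        # No short-circuit: a failing check never prevents later checks from running.
        failure_lines, summary = check()
        if failure_lines:
            failed = True
            # Failures go to stderr, one self-contained line each, in lexicographic order.
            for line in sorted(failure_lines):
                print(line, file=sys.stderr)
        else:
            success_summary.append(summary)
    # Success summaries go to stdout.
    for line in success_summary:
        print(line)
    return 1 if failed else 0
```

Because every check toggles the same failed flag, a PR author sees all findings in one CI run instead of fixing them one at a time.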

Integration surfaces

  • The workflow file invokes the guard exactly once: python scripts/ci/i18n_cjk_guard.py. Anything done inside run_check is automatically picked up — no workflow change needed if we extend the existing script (R3.6).
  • --update-baseline short-circuits inside main() before run_check is called; the new parity check naturally won't run in that mode (R3.4).
  • The audit pipeline at .kiro/specs/i18n-e2e-english-verification/audit/scripts/check_parity.py is independent and stays untouched (R6's "spec for prior CI guard" boundary).
  • Baseline file format is single-purpose (CJK counts) and does not need to grow to accommodate parity (parity has no baseline — divergence is binary).
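A minimal sketch of the --update-baseline short-circuit described above, assuming an argparse-based main(). The function bodies are placeholders and the CALLS list exists only to make the control flow observable; the real update_baseline and run_check signatures may differ.

```python
import argparse
from typing import List, Optional

CALLS: List[str] = []  # illustration only: records which branch main() took


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="i18n guard (sketch)")
    parser.add_argument("--update-baseline", action="store_true")
    parser.add_argument("--baseline", default=None)
    parser.add_argument("--repo-root", default=None)
    return parser


def update_baseline(args: argparse.Namespace) -> int:
    CALLS.append("update_baseline")  # placeholder for rewriting the CJK baseline
    return 0


def run_check(args: argparse.Namespace) -> int:
    CALLS.append("run_check")  # placeholder for CJK checks plus the parity check
    return 0


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    if args.update_baseline:
        # Short-circuit: run_check (and any parity check inside it) is never entered.
        return update_baseline(args)
    return run_check(args)
```

With this shape, a parity check added inside run_check is skipped in baseline-refresh mode without any extra flag handling.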

Requirement-to-Asset Map

| # | Requirement | Existing asset(s) | Gap tag | Notes |
|---|---|---|---|---|
| 1.1 | Flatten EN/ZH into dotted keys matching check_parity.py | audit/scripts/check_parity.py:flatten (reference); existing _flatten in guard also flattens but only collects (key, value) pairs into a list | Constraint | Two _flatten flavours exist. Need ONE canonical function inside the guard module that mirrors check_parity.py.flatten (recursive, descends into dicts only, emits leaf scalars). The existing private _flatten(prefix, value, out) in the guard is already key-value-emitting and will work; the parity check just consumes its keys. |
| 1.2 | Pass when key sets identical, emit success summary with key count | success_summary list in run_check | Missing | Add a parity success line in the same idiom: "OK locale-parity: 962 keys per side". |
| 1.3 / 1.4 | Fail on en-only or zh-only keys | None (no parity check exists) | Missing | Compute symmetric difference. |
| 1.5 | Treat dict values as non-leaves; treat scalar leaves the same as string leaves for parity | _flatten already descends only into dicts and emits any non-dict as a leaf; scan_locale_cjk then narrows to strings, but parity should NOT narrow | Constraint | Use _flatten directly (no narrowing). |
| 2.1 | Print `<file>:<line>: <key>: <side>`, where `<side>` is en-only or zh-only | _value_line_number resolves a value's line; needs adaptation for keys | Missing | |
| 2.2 | Fall back to line 1 when location not found | _value_line_number already returns 1 in this case | Reuse | |
| 2.3 | Final summary `parity: en-only=<n>, zh-only=<m>` | None | Missing | One line, stderr. |
| 2.4 | All parity output to stderr | print(..., file=sys.stderr) pattern used everywhere | Reuse | |
| 2.5 | Lexicographic sort | Existing patterns use sorted(...) | Reuse | |
| 3.1 / 3.2 / 3.3 | Compose with existing checks; one exit code | run_check already composes (a) and (b) without short-circuit | Constraint | Insert (c) at the end of run_check, after the per-path block but before the final return. Each check toggles the same failed flag. |
| 3.4 | --update-baseline does not run parity | main() short-circuits to update_baseline() and never enters run_check | Reuse | Untouched. |
| 3.5 | --baseline and --repo-root semantics unchanged | _build_parser and _detect_repo_root | Reuse | Untouched. |
| 3.6 | Workflow file unchanged | .github/workflows/i18n-cjk-guard.yml | Reuse | No edit needed. |
| 4.1 | Stdlib-only | Existing module is stdlib-only | Reuse | json is the only library needed for ZH loading. |
| 4.2 | Sub-second runtime | ~1k keys; flatten + set diff is O(n) | Constraint | Trivially holds. |
| 4.3 | Deterministic output | All sorts lexicographic | Reuse | |
| 5.1-5.4 | Tests under scripts/ci/tests/ for success / en-only / zh-only / both / scalar-leaves / no-short-circuit | test_i18n_cjk_guard.py:RunCheckEndToEndTests is the integration class | Missing | Add either a new ParityCheckTests class or extend RunCheckEndToEndTests. Reuse _make_full_repo style; need a zh_json argument or a new helper that writes both locale files. |
| 6.1 | Guard passes on live catalogues at merge target | EN/ZH parity verified manually (962/962, 0 diff) | Reuse | Manual run after implementation. |
| 6.2 | Document any blocking divergence in tasks.md | n/a | Conditional | Only relevant if 6.1 fails, which it currently does not. |

Complexity signal

  • Algorithmic logic only: load two JSON files, recursive flatten, set diff, sort, format, print. No external integrations, no I/O contention, no perf concerns at the catalogue size.
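The flatten-and-diff pipeline named above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the code in check_parity.py or the guard: flatten_keys and parity_diff are hypothetical names, and the sample catalogues are invented for the example.

```python
import json
from typing import Any, Iterator, List, Tuple


def flatten_keys(data: Any, prefix: str = "") -> Iterator[str]:
    """Yield a dotted key for every non-dict leaf; descend into dicts only."""
    if isinstance(data, dict):
        for key, value in data.items():
            dotted = f"{prefix}.{key}" if prefix else key
            yield from flatten_keys(value, dotted)
    else:
        # Strings, numbers, booleans, null and lists all count as leaves.
        yield prefix


def parity_diff(en: dict, zh: dict) -> Tuple[List[str], List[str]]:
    """Return (en_only, zh_only) dotted keys, each sorted lexicographically."""
    en_keys = set(flatten_keys(en))
    zh_keys = set(flatten_keys(zh))
    return sorted(en_keys - zh_keys), sorted(zh_keys - en_keys)


# Invented sample data: one scalar leaf missing from ZH, one extra key in ZH.
en = json.loads('{"common": {"confirm": "Confirm", "count": 3}, '
                '"step1": {"upload": {"title": "Upload"}}}')
zh = json.loads('{"common": {"confirm": "确认"}, '
                '"step1": {"upload": {"title": "上传", "hint": "提示"}}}')

en_only, zh_only = parity_diff(en, zh)
print(en_only)  # ['common.count']
print(zh_only)  # ['step1.upload.hint']
```

Both set differences are linear in the number of keys, so at roughly 1k keys per side the cost is dominated by reading and parsing the two JSON files.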

Implementation Approach Options

Option A (extend scripts/ci/i18n_cjk_guard.py)

What changes:

  • Add private helpers to the existing module:
    • _flatten_keys(data) -> set[str] — wrapper over the existing _flatten that returns just the dotted-key set.
    • _locate_key_line(text_lines, dotted_key) -> int — substring scan for the leaf segment (after the last .) wrapped in JSON quotes; returns 1 on miss (mirrors _value_line_number's fallback).
    • _format_parity_finding(file_rel_path, line_no, dotted_key, side) -> str — single-line formatter.
  • Add a function run_parity_check(repo_root) -> tuple[bool, list[str], str] returning (passed, failure_lines, success_summary_line). Callable independently for tests.
  • In run_check, after the per-path baseline block and before the final return:
    • Call run_parity_check(repo_root).
    • If failed, set failed = True, print all failure lines + the parity: ... summary to stderr.
    • If passed, append the success line to success_summary.
  • Add a ZH_JSON_REL_PATH constant alongside EN_JSON_REL_PATH.
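As a rough sketch of the two reporting helpers listed above, assuming the failure-line layout `{file}:{line}: {category}: {key}` that mirrors the existing locale findings. The public names below stand in for the proposed private helpers; the exact format is still a design decision.

```python
from typing import List


def locate_key_line(text_lines: List[str], dotted_key: str) -> int:
    """Return the 1-based number of the first line containing the quoted leaf segment.

    Falls back to 1 when the segment is not found.
    """
    leaf = dotted_key.rsplit(".", 1)[-1]
    needle = f'"{leaf}"'
    for index, line in enumerate(text_lines, start=1):
        if needle in line:
            return index
    return 1


def format_parity_finding(file_rel_path: str, line_no: int, dotted_key: str, side: str) -> str:
    """Format one finding; side is expected to be 'en-only' or 'zh-only'."""
    return f"{file_rel_path}:{line_no}: parity-{side}: {dotted_key}"


# Invented two-space-indented sample catalogue.
sample = '{\n  "common": {\n    "confirm": "Confirm"\n  },\n  "step1": {\n    "title": "Upload"\n  }\n}'
lines = sample.splitlines()

line_no = locate_key_line(lines, "step1.title")
print(format_parity_finding("locales/en.json", line_no, "step1.title", "en-only"))
# locales/en.json:6: parity-en-only: step1.title
```

A substring scan on the leaf segment is deliberately simple: when the same leaf name appears under several parents, or a string value equals the quoted key, it reports the first match rather than the exact location. That trade-off is acceptable only if the line number is treated as a navigation hint and the dotted key as the authoritative identifier.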

Compatibility assessment:

  • All existing CLI flags, exit codes, and stdout/stderr patterns preserved.
  • No new top-level dependencies. json already imported.
  • The module grows to ~470 lines, comparable to similar single-purpose CLI scripts in the repo (oasis_profile_generator.py is much larger). Single-responsibility is preserved: the responsibility is "PR-time i18n catalogue health," and parity is a sub-instance of that.
  • Existing tests continue to pass unmodified (none of the changed functions break their contract).

Trade-offs:

  • Pro: Zero workflow churn, single CI job, single CLI surface.
  • Pro: Reuses _flatten, line-resolution fallback, sort/print idioms.
  • Pro: All checks fail/pass together, which is easier to read in CI logs.
  • Con: Module name (i18n_cjk_guard) is now slightly misleading: it also enforces parity, not just CJK presence. Mitigated by docstring update.

Option B — New parallel script scripts/ci/i18n_locale_parity_guard.py + new workflow step

What changes:

  • New script that implements the parity check standalone.
  • Either (i) add a second job to .github/workflows/i18n-cjk-guard.yml, or (ii) add a new workflow file i18n-locale-parity-guard.yml.
  • New test file scripts/ci/tests/test_i18n_locale_parity_guard.py.

Compatibility assessment:

  • Both scripts duplicate _flatten, line-resolution helper, JSON loader, repo-root detection, argparse boilerplate.
  • Two CI runs (or two steps) to read and ack on every PR.

Trade-offs:

  • Pro: Single-responsibility script per file (matches one literal reading of project conventions).
  • Con: Code duplication ~80 lines.
  • Con: Two CI surfaces; PR review fatigue.
  • Con: Violates the spirit of R3 ("compose with the existing checks"); composing across two scripts requires either && or two-job aggregation.

Option C — Hybrid: new helper module + extended guard

What changes:

  • New module scripts/ci/locale_parity.py exposing compute_parity_findings(en_path, zh_path) -> ParityResult.
  • The existing i18n_cjk_guard.py imports from it and integrates the call into run_check, identical to Option A's runtime behaviour.
  • Tests split: test_locale_parity.py covers the helper in isolation; test_i18n_cjk_guard.py gains one composition test.

Compatibility assessment:

  • Adds sibling-module imports inside scripts/ci/ (currently flat: the test bootstrap for scripts/ci/i18n_cjk_guard.py adds _GUARD_DIR to sys.path, which works for sibling modules without further config).
  • No workflow change.

Trade-offs:

  • Pro: Clean separation, more reusable helper.
  • Pro: Possible to import the helper from the audit pipeline later (collapsing the duplicate check_parity.py).
  • Con: More files for what is ~80 lines of new logic; over-engineering for current scope.
  • Con: Risks scope creep into "deduplicate check_parity.py," which is explicitly out of scope.

Effort & Risk

  • Effort: S (1-2 days). Existing module patterns are mature; the algorithmic logic is small and proven (check_parity.py); test scaffolding is already in place.
  • Risk: Low. Stdlib-only; no external integrations; no shared mutable state; deterministic algorithm; existing CI workflow unchanged; live catalogues already pass.

Recommendations for Design Phase

Preferred approach: Option A (extend scripts/ci/i18n_cjk_guard.py)

Rationale:

  1. The existing module's docstring already says "PR-time guard: fail when locales/en.json contains CJK or when backend/app + frontend/src CJK match counts exceed the committed baseline." Extending it to also fail on locale-key parity is the smallest possible delta that also reads naturally in the codebase.
  2. R3 ("composes with the existing CJK and per-path checks; one CLI; no workflow edit") is satisfied trivially.
  3. Reuses _flatten, line-fallback, sort/print idioms verbatim.
  4. The module name remains acceptable: "CJK Guard" is the canonical name of the i18n PR-time gate, and a docstring note will state that parity is the third covered check.

Key design decisions to settle in design.md

  • Function boundary: should run_parity_check live in the same module or in a small helper module? Suggest: same module, as a private function alongside count_path_cjk / scan_locale_cjk for symmetry.
  • Failure line format: exact string layout (file:line:key:side, ordering of the four pieces, separator characters). Suggest mirroring _format_locale_finding exactly: f"{file}:{line}: {category}: {key}" where category is parity-en-only or parity-zh-only.
  • Test fixture for RunCheckEndToEndTests: extend _make_full_repo to accept an optional zh_json parameter, or add a sibling helper. Suggest extending — keeps the integration test in one place and lets the existing tests opt out by passing zh_json=None (the helper writes a parity-clean default).
  • Whether to expose a --check=parity selector: Out of scope per R3.1 (no short-circuit, all-or-nothing).
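The fixture suggestion above (an optional zh_json argument with a parity-clean default) might look like the sketch below. make_locale_repo is a hypothetical stand-in for the extended _make_full_repo, whose real signature is not shown here; only the locale-writing part is illustrated.

```python
import json
import tempfile
import unittest
from pathlib import Path
from typing import Optional

DEFAULT_EN = {"common": {"confirm": "Confirm"}}  # invented sample catalogue


def make_locale_repo(root: Path, en_json: dict, zh_json: Optional[dict] = None) -> None:
    """Write locales/en.json and locales/zh.json under root.

    When zh_json is None, ZH mirrors EN so existing tests stay parity-clean.
    """
    if zh_json is None:
        zh_json = en_json
    locales = root / "locales"
    locales.mkdir(parents=True, exist_ok=True)
    for name, data in (("en.json", en_json), ("zh.json", zh_json)):
        text = json.dumps(data, indent=2, ensure_ascii=False) + "\n"
        (locales / name).write_text(text, encoding="utf-8")


class MakeLocaleRepoTests(unittest.TestCase):
    def _load(self, root: Path, name: str) -> dict:
        return json.loads((root / "locales" / name).read_text(encoding="utf-8"))

    def test_default_zh_is_parity_clean(self) -> None:
        # Each test owns its temporary directory; nothing reads the live locales/.
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            make_locale_repo(root, DEFAULT_EN)
            self.assertEqual(self._load(root, "en.json"), self._load(root, "zh.json"))

    def test_explicit_zh_is_written_verbatim(self) -> None:
        zh = {"common": {"confirm": "确认", "extra": "多余"}}
        with tempfile.TemporaryDirectory() as tmp:
            root = Path(tmp)
            make_locale_repo(root, DEFAULT_EN, zh_json=zh)
            self.assertEqual(self._load(root, "zh.json"), zh)
```

Run with `python -m unittest` from the test directory. Defaulting ZH to a copy of EN means existing end-to-end tests need no changes, while negative-path tests pass a divergent zh_json explicitly.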

Research items to carry forward

None. All required information is in the existing repo and the cited reference scripts. No external dependencies, no new tech, no perf research, no security implications.