MicroFish/.kiro/specs/i18n-e2e-english-verification/gap-analysis.md

8.1 KiB
Raw Blame History

Gap Analysis — i18n-e2e-english-verification

1. Current state investigation

Domain-relevant assets in the repo

Concern Location Notes
Locale catalogues locales/en.json, locales/zh.json, locales/languages.json Flat-namespaced JSON, loaded by vue-i18n and the backend logger.
Frontend i18n loader frontend/src/i18n/ Provides useI18n() to components.
Frontend UI surface frontend/src/views/, frontend/src/components/ Step15 components + Process.vue orchestrator.
Backend logger backend/app/utils/logger.py (per CLAUDE.md) Externalised log messages (#6 work).
Locale helpers backend/app/utils/ Per CLAUDE.md, locale propagation lives here.
Prompt assets that emit user-visible text backend/app/services/ontology_generator.py (#2, #3?), oasis_profile_generator.py (#3), simulation_config_generator.py (#4), report_agent.py (#5) Prompts are inline Python strings, not separate files.
Pipeline boundaries backend/app/api/*.py (Flask), services/simulation_runner.py + simulation_ipc.py (subprocess), services/report_agent.py (ReACT) Locale must propagate across all of these.

Project conventions surfaced

  • Task model used for any long-running operation (CLAUDE.md). Verification doesn't introduce one — it is a one-shot batch.
  • Reasoning-model output stripping convention exists, irrelevant here.
  • Per-project group_id isolation in Neo4j — verification queries should NOT touch Neo4j; we run a static audit only.
  • "Match the surrounding file's style" (no enforced formatter).

Live audit baseline (commit 9dcaecd)

git grep -nP "[\x{4e00}-\x{9fff}]" -- backend/app frontend/src locales/en.json | wc -l
→ 2918 lines across 36 files

Bucketed:

Bucket Files Lines Notes
locales/en.json 0 0 clean
frontend/src/views/Process.vue 1 65 hard-coded UI strings (template + JS literals), not i18n keys
frontend/src/components/Step{2,3,4,5}*.vue 4 ~50 (mostly Step4Report.vue regex parsers) depends-on-backend regex parsers + a few literals
backend/app/services/*.py 13 majority docstrings + comments + a few prompt assembly fragments + agent context labels (e.g. "事实信息:" in oasis_profile_generator.py)
backend/app/api/*.py 4 many docstrings + comments + log-message Chinese (build_logger.info(f"[{task_id}] 开始构建图谱...") etc)
backend/app/utils/*.py 7 many docstrings + comments + log strings (e.g. retry.py "函数 {func} 在 N 次重试后仍失败")
backend/app/models/*.py 3 docstrings docstrings only (probably)

Locale catalogue parity (Python check)

en keys: 953
zh keys: 953
symmetric diff: 0

→ R2 (parity) passes. ZH backfill (#8) closed the gap and en/zh are now lock-step.

Boundary review surface (R4)

  • backend/app/api/graph.py build_logger.info(f"[{task_id}] 开始构建图谱...") shows the backend logger is still emitting Chinese on the build path — this is exactly the kind of leak #6 was supposed to externalise.
  • backend/app/utils/retry.py logger.error(f"函数 {func.__name__} 在 {max_retries} 次重试后仍失败...") — same: log strings remain hard-coded Chinese.
  • ReACT/agent context labels in oasis_profile_generator.py ("事实信息:", "相关实体:") feed directly into the LLM prompt — these will bias the model toward Chinese output.

2. Requirements feasibility

Mapping requirements → existing assets

Req Need Existing asset Gap tag
R1 (static audit) run git grep and capture output git, ripgrep None — straightforward
R1.5 (en.json CJK check) inspect catalogue already at 0 hits None — passes
R2 (parity) enumerate keys recursively, diff small Python script None — already passes
R3 (prompt verification) read prompt strings in services/*.py inline Python strings Constraint — prompts are inline, not standalone files; verification must read source not assets
R4 (propagation) trace locale across Flask → Task → OASIS → ReACT source code review Research needed in design phase: where exactly is locale stored today? CLAUDE.md hints set_locale thread-local exists but path not yet read
R5 (post comment) gh issue comment 10 gh CLI None
R6 (ZH regression) confirm zh values are non-English small Python script None
R7 (file follow-ups) gh issue create gh CLI None
R8 (capture & idempotence) write under .kiro/specs/.../audit/ filesystem None

Complexity signals

  • Algorithmic: trivial — grep + count + diff.
  • Workflow: post a comment + open follow-up issues — one-shot.
  • External integrations: GitHub via gh. No DB, no Neo4j, no LLM calls.

Constraints from existing architecture

  • No code edits to backend/app/, frontend/src/, locales/ — the spec is verification-only. The change-set is confined to .kiro/specs/i18n-e2e-english-verification/ (audit captures, gap report, follow-up issue list) and any commit message / PR description.
  • Manual UI walkthrough is not feasible in a sandboxed CLI — must be marked manual-pending per R5.3.
  • Live docker-compose up likewise unavailable — same handling.

3. Implementation approach options

Option A — Pure shell + Python script kept under .kiro/specs/.../audit/

  • A single Bash + Python pipeline that emits audit/cjk-grep.txt, audit/parity.txt, audit/gap-report.md.
  • Posts the comment via gh and opens follow-ups via gh issue create.
  • Scripts are read-only against production source.

Simplest, no production-code touch. Easy to re-run. Scripts only relevant to this ticket — scoped to .kiro/specs/.../audit/scripts/, not promoted to a reusable tools/.

Option B — Build a reusable tools/i18n-audit/ checker

  • Create a permanent CLI under tools/ so future verifiers can re-run.
  • Integrates with CI (could become a check that fails when en.json contains CJK).

Adds a tool & directory the project doesn't have. Scope creep — the spec is for one verification pass, not a CI check. A reusable tool wants its own ticket; ramming it in here violates the "no inline fixes" rule.

Option C — Hybrid: ad-hoc script for this run, plus open a follow-up issue requesting the reusable CI check

  • Run the verification with disposable scripts (Option A) AND file a follow-up issue asking for the reusable CI check (Option B as a future ticket).

Keeps current ticket scoped. Captures the value of B without bloating this PR.

4. Out-of-scope items deferred

  • Any production code edits that would close gaps. R7 makes this explicit.
  • Live UI walkthrough / dynamic verification — captured as manual-pending in the report.

5. Effort & risk

  • Effort: S (1 day) — auditing scripts + report writing + issue filings.
  • Risk: Low — read-only operations, no architectural change, the failure mode (gh lacking permissions) is handled by R7.5 (fallback inline list).

6. Recommendations for design phase

  • Preferred approach: Option C (hybrid).
  • Key decisions to make in design:
    • Concrete script layout under .kiro/specs/i18n-e2e-english-verification/audit/.
    • Format of audit/gap-report.md (the artefact echoed into the issue comment).
    • Exact follow-up issue grouping rule (R7.2): one issue per pipeline step? per file? per category (UI / logs / prompts / docstrings)?
    • Reproducibility (R8.2): do we keep audit/<commit-sha>/ per run, or audit/latest/ + audit/previous/?
    • Whether the scripts are committed to the repo (they live under .kiro/specs/... — yes by default) or only the captured outputs.
  • Research items to carry forward:
    • Read backend/app/utils/ to confirm whether a locale helper / set_locale exists today (R4 detail).
    • Read backend/app/utils/logger.py to confirm where externalised log keys live and how the locale is selected at log time (R4 + Step-1 logs checklist item).
    • Confirm whether any services/*.py Chinese match is part of an LLM prompt vs a comment — only prompt matches block R3.