MicroFish/.kiro/specs/i18n-frontend-comments/gap-analysis.md

10 KiB
Raw Blame History

Gap Analysis — i18n-frontend-comments

1. Current State Investigation

Scope discovery (ground truth)

Ripgrep [\x{4e00}-\x{9fff}] over frontend/src/ returns 20 files, 902 occurrences:

File Hits
views/Process.vue 191
components/Step4Report.vue 176
components/HistoryDatabase.vue 124
components/GraphPanel.vue 84
components/Step2EnvSetup.vue 76
components/Step3Simulation.vue 52
views/Home.vue 43
components/Step5Interaction.vue 34
api/simulation.js 29
views/SimulationView.vue 22
views/SimulationRunView.vue 18
api/graph.js 10
api/index.js 8
api/report.js 8
views/InteractionView.vue 6
views/ReportView.vue 6
components/Step1GraphBuild.vue 5
App.vue 4
views/MainView.vue 4
store/pendingUpload.js 2

No .css files exist under frontend/src/; styles live inside Vue SFC <style> blocks.

Comment shapes encountered

Sampling representative files confirms three syntactic forms — all already English-syntax, only the natural-language content is Chinese:

  • JS line comments: // 创建axios实例, timeout: 300000, // 5分钟超时本体生成可能需要较长时间
  • JSDoc blocks in api/simulation.js: /** * 创建模拟 */, * @returns {Promise} 返回配置信息,包含元数据和配置内容
  • Vue template comments in views/Home.vue: <!-- 顶部导航栏 -->, <!-- 上半部分Hero 区域 -->

String literals containing Chinese (NOT comments)

A naive regex for Chinese inside quoted strings flags 8 files. Spot-checks reveal two distinct categories that the ticket body did not explicitly anticipate:

  • Developer-facing log strings — e.g. Step1GraphBuild.vue:216 console.error('缺少项目或图谱信息'). These print to the browser dev console and are not part of the i18n locale surface. Translating them does not change runtime behavior.
  • LLM prompt template strings — e.g. Step5Interaction.vue:725-727 \以下是我们之前的对话:\n${historyContext}\n\n现在我的新问题是${message}``. These are sent to a Chinese-tuned LLM (default Qwen). Translating them would change the model's input and could shift output behavior.

The ticket says "no UI string changes (those are already in locales/en.json)" and "Out of scope: Translating user-facing strings". Neither category above is user-facing UI text — locales/*.json already covers user-facing strings via vue-i18n. The ticket's acceptance criterion #1 (grep returns no files, or only files with deliberately-kept bilingual comments listed in PR) leaves room to retain the LLM prompt strings as documented exceptions.

Conventions to respect (from steering)

  • tech.md: 4-space indent, no enforced linter, "match the surrounding file's style". Existing files mix English and Chinese in comments/docstrings — preserve both unless asked. This ticket is the explicit ask.
  • structure.md: frontend/src/api/*.js services use Axios with 5-min timeout + exponential retry. The translation pass must not touch the retry/timeout logic.
  • dev-guidelines.md (project-level): "Don't comment the obvious — comment the why." JSDoc on all exported functions, classes, interfaces (so JSDoc blocks must be kept in JSDoc form when translating, not deleted as redundant).
  • commits.md: Conventional Commits, lowercase, imperative, max 72 chars, no Co-Authored-By: footer. Branch <type>/<ticket>-<desc> — ticket dictates docs/i18n-9-translate-frontend-comments.

Recent merged PRs in the same epic (#11):

  • feat/i18n-2-translate-ontology-generator-prompts → backend prompt translation, full content swap.
  • feat/i18n-4-translate-sim-config-prompts, feat/i18n-5-translate-report-agent-prompts → similar backend prompt swaps.
  • feat/i18n-6-externalize-backend-logs → moved log strings out of code into i18n keys.
  • fix/i18n-8-backfill-zh-json (current branch base) → backfilled missing zh translations.

Pattern: prior i18n work changed both content and infrastructure (locale-keying logs). This ticket explicitly does not — it is a documentation-only pass without re-keying anything.

2. Requirements ↔ Asset Map

Req Asset to change Gap tag Note
1.11.4 (translate comments incl. JSDoc) All 20 files listed above — (clear) Largely mechanical; respect SFC block boundaries (<script> vs <template> vs <style>).
1.5 (deliberately bilingual) LLM prompt strings in Step5Interaction.vue (and any others discovered) Constraint Keep Chinese, document in PR. Behavior-risk if translated.
2.x (drop redundant) Files with // 获取数据-style restate-the-code comments Apply per case during the pass; conservative when ambiguous.
3.x (TODO/FIXME ticket refs) Search frontend/src/ for TODO|FIXME Unknown No matches noted in spot checks; will sweep during implementation. If none found, requirement is satisfied vacuously.
4.x (no behavior change) Confirmed by npm run build exit 0 + manual smoke Vite build is the reference; keep all string-literal content (other than developer-log strings) untouched; identifiers and imports are off-limits.
5.x (PR hand-off) PR description, branch name, commit message Branch name from ticket: docs/i18n-9-translate-frontend-comments.

Discovered scope ambiguity → decision needed

Two boundary calls that the requirements should sharpen before design:

  • console.error / console.warn / console.log strings with Chinese content — translate (developer-facing, not in locales) or leave (string-literal change risks scope creep)? Recommended: translate, since they are dev-facing comments-by-other-means and the ticket's spirit is "English-readable code". This is a design decision to be encoded in the design doc, not a new requirement.
  • LLM prompt template strings — leave as-is and list in PR (per Req 1.5). This is the safer call: the LLM is Chinese-tuned by default and translating a system prompt is a behavior change.

Both decisions stay inside the requirements as currently written (specifically Req 1.5 + Req 4.4, which already excludes string literals from the translation pass except where developer-log strings are concerned). The design phase will document the rule explicitly.

3. Implementation Approach Options

Option A — Single-pass translation per file, no tooling

Approach: Open each of the 20 files, translate every Chinese comment in place, drop redundant ones, append (#9) to bare TODO/FIXME, leave Chinese string literals (LLM prompts) and translate console.* Chinese strings. Verify with rg [\x{4e00}-\x{9fff}] frontend/src/.

  • Lowest overhead, no new tools or scripts
  • Fits a one-shot doc-only PR
  • Maximally aligns with dev-guidelines.md "comment the why" — judgment per comment
  • ~900 occurrences spread across 20 files — most concentrated in 6 files (>50 hits each) which are large (Process.vue is 2067 lines, Step4Report.vue, HistoryDatabase.vue)
  • Manual judgment for redundant-vs-meaningful adds reviewer load

Option B — Automated translation script + manual pass

Approach: Write a Node/Python script that walks files, extracts Chinese comments, runs them through an LLM, and writes back. Then a manual pass on the diff.

  • Faster on long files
  • Adds a dependency (LLM call) and a scratch script, neither delivered
  • The translation needs judgment (drop vs translate per Req 2) — automation undercuts the "comment the why" rule
  • Risk of touching string literals or identifiers if regex is loose
  • Out of step with the steering "no enforced tooling without discussion" principle

Option C — File-by-file with task batching

Approach: Group the 20 files into work units by size: (a) high-touch (Process, Step4Report, HistoryDatabase, GraphPanel, Step2EnvSetup, Step3Simulation), (b) mid-touch (Home, Step5Interaction, simulation.js, SimulationView, SimulationRunView), (c) light (api/{graph,index,report}.js, the 48 hit views, App.vue, store/pendingUpload.js, Step1GraphBuild.vue). Implementation tasks mirror these groups. Verify after each group with the ripgrep check.

  • Same translation effort as A but with checkpointable progress (matches the project's task-tracking pattern from steering — "background tasks expose progress")
  • Reviewer can read the PR file-group-by-file-group instead of all-at-once
  • If the PR needs to land partial (rare), the light + mid groups still ship a valuable subset
  • A few extra task headings in tasks.md vs Option A's "do the thing"

4. Effort & Risk

  • Effort: S (12 days). Mechanical translation, plus judgment calls. ~900 occurrences but no architectural work.
  • Risk: Low. Doc-only change. The only real risks are (a) accidentally editing a string literal that affects the LLM prompt or a hardcoded user-visible string, and (b) deleting a comment whose intent the translator misread. Both are mitigated by Req 4.4 ("leave string literals unchanged") and Req 2.3 (conservative-when-ambiguous).

5. Recommendations for Design Phase

  • Preferred approach: Option C — file-grouped translation pass, no tooling, no script. It matches the project's manual-style ethos and the existing pipeline-aligned task structure, and produces a reviewable PR.
  • Encode in design:
    • The translation rule for each comment shape (//, /* */, JSDoc, <!-- -->).
    • The decision matrix for string literals: translate console.* Chinese strings; retain LLM prompt strings (in Step5Interaction.vue) and list them in the PR per Req 1.5.
    • The TODO/FIXME sweep approach (single ripgrep pass before the file loop).
    • The verification command and acceptance check sequence.
  • Research items carried forward: none — the codebase has been inspected enough to commit to Option C without further investigation.