# Design Document: `i18n-translate-backend-comments`

## Overview

**Purpose**: Translate Chinese-language docstrings and `#` comments across `backend/` Python files into English, so that English-speaking maintainers can read and review the codebase without translation overhead.

**Users**: Backend maintainers and code reviewers who do not read Chinese.

**Impact**: Improves developer ergonomics and review throughput. No runtime, behavior, or interface change. Adjacent i18n tickets (#2/#3/#4/#5/#6), which own the string-literal Chinese, remain unaffected.

### Goals

- Eliminate Chinese characters from docstrings and `#` comments under the in-scope paths.
- Preserve Google-style docstring shape and project formatting rules (4-space indent, ≤120 chars/line, double-quoted strings).
- Keep the diff comments-and-docstrings-only: no executable, string-literal, or symbol changes.

### Non-Goals

- Translating Chinese inside string literals (prompt templates, `logger.{info,warning,error}` arguments, API responses, error messages). These are owned by issues #2/#3/#4/#5/#6.
- Refactoring code, reformatting style, or renaming symbols.
- Introducing new tooling, linters, or CI rules.
- Translating `backend/tests/test_locale*.py` (Chinese there is intentional test data inside string literals; outside ticket scope).

## Boundary Commitments

### This Spec Owns

- Comment and docstring text under: `backend/app/__init__.py`, `backend/app/config.py`, `backend/app/api/`, `backend/app/models/`, `backend/app/services/`, `backend/app/utils/`, `backend/run.py`, `backend/scripts/`.
- The decision rule for distinguishing docstrings from value strings (first-statement rule).
- The Chinese→English Google-style docstring key map.
- The verification workflow (residual `grep`, `pytest`, diff sanity check).

### Out of Boundary

- All string-literal content, including triple-quoted strings used as values.
- Files under `backend/tests/`, `backend/.venv/`, and any non-Python file.
- Refactors, renames, formatting changes, or new dependencies.
- Front-end localization, locale JSON files, or i18n runtime behavior.

### Allowed Dependencies

- The repository's Python source (read + write for in-scope files only).
- The existing test suite (`backend/scripts/test_profile_format.py`) for verification.
- The existing `grep`-based residual scan for verification.

### Revalidation Triggers

- A new in-scope file added under the listed paths (would expand the file list).
- A change to `dev-guidelines.md` regarding docstring style (would change the key map or quote/indent rule).
- A merge of any adjacent i18n ticket (#2/#3/#4/#5/#6) that turns a string literal into a docstring or vice versa.

## Architecture

### Existing Architecture Analysis

This change touches only commentary; no architectural element of the backend is modified. The work spans the following packages:

- `backend/app/__init__.py`, `backend/app/config.py` (Flask app and configuration entrypoint).
- `backend/app/api/` (Flask blueprints).
- `backend/app/models/` (`Project`, `Task` models).
- `backend/app/services/` (graph builder, simulation runner, report agent, etc.).
- `backend/app/utils/` (LLM client, file parser, retry, logger, locale, paging).
- `backend/run.py` (process entrypoint).
- `backend/scripts/` (simulation runners, profile-format test).

### Architecture Pattern & Boundary Map

```mermaid
graph TB
    Discovery[Residual Grep Scan]
    Plan[Per-Package Plan]
    Translator[Translation Pass]
    Verify[Verification Gate]
    Commit[Per-Package Commit]
    PR[Single PR to main]
    Discovery --> Plan
    Plan --> Translator
    Translator --> Verify
    Verify -->|all checks pass| Commit
    Verify -->|any check fails| Translator
    Commit --> Plan
    Commit -->|all packages done| PR
```

**Architecture Integration**:

- Selected pattern: **Iterative pass per package** with a verification gate after each pass. Linear, deterministic, low-coordination.
- Domain/feature boundaries: One pass per backend package; commits are package-scoped to keep review chunks small.
- Existing patterns preserved: 4-space indent, double-quoted strings, Google-style docstrings, `snake_case`, project file layout.
- New components rationale: None (no new code, no new files).
- Steering compliance: Conforms to repo-level coding rules and the commits ruleset.

### Technology Stack

| Layer | Choice / Version | Role in Feature | Notes |
|-------|------------------|-----------------|-------|
| Backend / Services | Python ≥3.11 | Source language whose docstrings/comments are being translated | No version change; no dependency change |
| Tooling | `git`, `grep`, `pytest` (existing) | Discovery, verification, regression check | No new tools |

No frontend, data, messaging, or infrastructure layer is touched.

## File Structure Plan

### Directory Structure (no additions, no deletions)

```
backend/
├── app/
│   ├── __init__.py   # docstrings/comments only
│   ├── config.py     # docstrings/comments only
│   ├── api/          # all *.py: docstrings/comments only
│   ├── models/       # all *.py: docstrings/comments only
│   ├── services/     # all *.py: docstrings/comments only
│   └── utils/        # all *.py: docstrings/comments only
├── run.py            # docstrings/comments only
└── scripts/          # all *.py: docstrings/comments only
```

### Modified Files

The 37 in-scope files identified in `gap-analysis.md` are modified, on comment and docstring lines only. No other paths are touched.

## Translation Rules

These rules drive the translation pass and the verification gate. They are normative; the implementation must follow them exactly.

### Rule 1: Docstring vs Value String Disambiguation

A triple-quoted string is treated as a **docstring** (in scope) iff it is the first statement of a module, class, or function body. All other triple-quoted strings are **values** (out of scope) and must not be modified.
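
The first-statement rule can be checked mechanically. The following is a minimal sketch, not part of the spec's workflow, that uses the standard-library `ast` module to list the line spans of strings that qualify as docstrings under Rule 1. The helper name `docstring_spans` and the `SAMPLE` source are hypothetical and exist only for illustration.

```python
import ast

# Node types whose first statement may be a docstring.
DOCSTRING_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def docstring_spans(source: str) -> list[tuple[int, int]]:
    """Return (first_line, last_line) for every first-statement string in source."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, DOCSTRING_OWNERS) or not node.body:
            continue
        first = node.body[0]
        # A docstring is an expression statement whose value is a plain string constant.
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            spans.append((first.lineno, first.end_lineno))
    return sorted(spans)


# Hypothetical sample: two docstrings and one triple-quoted value.
SAMPLE = '''"""Module docstring."""

PROMPT = """Value string, out of scope."""


def run():
    """Function docstring."""
    return PROMPT
'''

print(docstring_spans(SAMPLE))  # [(1, 1), (7, 7)]
```

The `PROMPT` assignment on line 3 is triple-quoted but is not the first statement of its scope, so it is not reported and stays out of scope.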
### Rule 2: Translate Docstrings to English Google-style

- Translate Chinese narrative text to faithful English.
- Convert the following Chinese section keys to canonical English Google-style keys when present:

| Chinese key | English key |
| --- | --- |
| `参数:` | `Args:` |
| `返回:` | `Returns:` |
| `异常:` | `Raises:` |
| `产生:` / `生成:` | `Yields:` |
| `示例:` | `Examples:` |
| `注意:` / `备注:` | `Note:` |

- Preserve double-quoted triple-quoted form (`"""..."""`).
- Preserve indentation matching the surrounding scope.

### Rule 3: Translate Inline `#` Comments to English

- Translate the comment text to English.
- If the translated comment would merely restate the immediately following executable line (a redundant verb-phrase paraphrase), delete the comment.
- Preserve `TODO:` / `FIXME:` markers and any embedded ticket reference verbatim.
- Preserve trailing in-line comments on the same line as code (e.g. `PENDING = "pending"  # waiting`).

### Rule 4: Style Compliance

- Keep every translated line ≤120 characters.
- Do not introduce trailing whitespace.
- Preserve the original indentation of each comment/docstring.
- Use double quotes for any docstring rewritten.

### Rule 5: Preservation

- Do not modify any executable Python statement.
- Do not modify any string literal (single-, double-, triple-quoted, f-string, raw, byte) that is not a docstring under Rule 1. The single exception is the docstring being rewritten under Rule 2: quote-style normalization to triple double-quoted form (`"""..."""`) is permitted on the docstring only, since it is the artifact under translation.
- Do not rename any symbol.
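
Rule 5 can also be illustrated in code. The sketch below is an optional local sanity check, not a tool this spec introduces: it parses the source before and after translation, drops Rule 1 docstrings, and compares the remaining syntax trees. Comments never reach the AST, so two sources that differ only in comments and docstrings compare equal. The names `strip_docstrings`, `same_code`, and the sample sources are hypothetical.

```python
import ast

OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def strip_docstrings(tree: ast.AST) -> ast.AST:
    """Remove first-statement string expressions (Rule 1 docstrings) in place."""
    for node in ast.walk(tree):
        if not isinstance(node, OWNERS) or not node.body:
            continue
        first = node.body[0]
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            node.body = node.body[1:]
    return tree


def code_fingerprint(source: str) -> str:
    """Dump the AST without docstrings; line numbers are excluded by default."""
    return ast.dump(strip_docstrings(ast.parse(source)))


def same_code(before: str, after: str) -> bool:
    """True if the two sources differ only in comments and docstrings."""
    return code_fingerprint(before) == code_fingerprint(after)


# Hypothetical before/after pair for one function.
BEFORE = '''def total(xs):
    """计算总和。"""
    status = "待处理"  # 等待中
    return sum(xs), status
'''
GOOD = '''def total(xs):
    """Compute the sum."""
    status = "待处理"  # waiting
    return sum(xs), status
'''
# A bad edit that also "translates" the string literal, violating Rule 5.
BAD = GOOD.replace('"待处理"', '"pending"')

print(same_code(BEFORE, GOOD))  # True
print(same_code(BEFORE, BAD))   # False
```

Because the comparison ignores positions, reflowing a long docstring over more lines does not trigger a false alarm, while a changed string literal, renamed symbol, or edited statement does.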
## System Flows

### Per-package iteration

```mermaid
sequenceDiagram
    participant Dev as Translator
    participant Repo as Repo
    participant Tests as Test Suite
    Dev->>Repo: git checkout docs/i18n-7-translate-backend-comments
    loop For each package in [models, utils, services, api, scripts, root]
        Dev->>Repo: Translate docstrings/comments
        Dev->>Repo: git diff --stat (sanity check)
        Dev->>Tests: cd backend then uv run python -m pytest scripts/test_profile_format.py
        Tests-->>Dev: pass / fail
        Dev->>Repo: Re-run residual grep
        Repo-->>Dev: residual hits (string-literal only)
        Dev->>Repo: git commit -m "docs(i18n): translate chinese docstrings/comments in backend/"
    end
    Dev->>Repo: gh pr create -> single PR closing #7
```

## Requirements Traceability

| Requirement | Summary | Components | Interfaces | Flows |
|-------------|---------|------------|------------|-------|
| 1.1 | No Chinese in docstrings under in-scope paths | Translation Pass | Rule 1, Rule 2 | Per-package iteration |
| 1.2 | No Chinese in `#` comments under in-scope paths | Translation Pass | Rule 3 | Per-package iteration |
| 1.3 | Residual grep returns only string-literal Chinese | Verification Gate | Residual grep workflow | Per-package iteration |
| 1.4 | Google-style docstring shape preserved | Translation Pass | Rule 2 (key map) | n/a |
| 2.1 | No executable statement modified | Verification Gate | Rule 5 | Per-package iteration |
| 2.2 | No string literal modified | Verification Gate | Rule 1 (first-statement rule), Rule 5 | Per-package iteration |
| 2.3 | No symbol renamed | Verification Gate | Rule 5 | Per-package iteration |
| 2.4 | `pytest` passes | Verification Gate | Test suite invocation | Per-package iteration |
| 2.5 | Hunks touching code rejected | Verification Gate | `git diff --stat` review | Per-package iteration |
| 3.1 | Drop redundant comments | Translation Pass | Rule 3 | n/a |
| 3.2 | Translate the *why* faithfully | Translation Pass | Rule 3 | n/a |
| 3.3 | Preserve `TODO:`/`FIXME:` and ticket refs | Translation Pass | Rule 3 | n/a |
| 3.4 | No new comments introduced | Translation Pass | Rule 3 | n/a |
| 4.1 | ≤120 chars/line | Verification Gate | Rule 4 | n/a |
| 4.2 | No trailing whitespace | Verification Gate | Rule 4 | n/a |
| 4.3 | Preserve indentation | Translation Pass | Rule 4 | n/a |
| 4.4 | Double quotes on rewritten docstrings | Translation Pass | Rule 4 | n/a |
| 4.5 | Preserve 4-space indentation | Translation Pass | Rule 4 | n/a |
| 5.1 | Use grep for discovery | Verification Gate | Discovery scan | n/a |
| 5.2 | Re-run grep after each batch | Verification Gate | Residual grep workflow | Per-package iteration |
| 5.3 | Continue until non-string-literal residual cleared | Verification Gate | Rule 1 disambiguation | Per-package iteration |
| 5.4 | `git diff --stat` only in-scope paths | Verification Gate | Diff sanity check | Per-package iteration |
| 6.1 | Branch `docs/i18n-7-translate-backend-comments` | Tracking & Branching | `/done` skill | n/a |
| 6.2 | Reference issue #7 | Tracking & Branching | Commit/PR template | n/a |
| 6.3 | Conventional Commits `docs(i18n)` | Tracking & Branching | `.claude/rules/commits.md` | n/a |
| 6.4 | No unrelated changes | Verification Gate | Diff sanity check | n/a |

## Components and Interfaces

| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|-----------|--------------|--------|--------------|--------------------------|-----------|
| Translation Pass | Process | Apply Rules 1–5 to one package's `*.py` | 1.1, 1.2, 1.4, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 4.5 | None (manual + AI-assisted) | Process |
| Verification Gate | Process | Run residual grep, `pytest`, and diff sanity check after each package | 1.3, 2.1, 2.2, 2.3, 2.4, 2.5, 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.4 | `git`, `grep`, `pytest` (P0) | Process |
| Tracking & Branching | Process | Branching, commit messages, PR | 6.1, 6.2, 6.3 | `/done` skill, `gh` CLI (P0) | Process |

### Process

#### Translation Pass

| Field | Detail |
|-------|--------|
| Intent | Translate docstrings and `#` comments in one package without touching code or string literals |
| Requirements | 1.1, 1.2, 1.4, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 4.5 |

**Responsibilities & Constraints**

- Apply Rule 1 (first-statement disambiguation) before editing any triple-quoted string.
- Apply Rule 2 (key map) for any Chinese Google-style key encountered.
- Apply Rule 3 to inline comments; delete redundant ones.
- Operate on one package at a time; do not interleave packages.

**Dependencies**

- Inbound: Verification Gate (provides feedback if a previous batch failed).
- Outbound: Verification Gate (hands off post-pass).
- External: None.

**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]

**Implementation Notes**

- Integration: Operates directly on the working tree on branch `docs/i18n-7-translate-backend-comments`.
- Validation: After each file is rewritten, sanity-check that the diff for that file shows changes only on comment/docstring lines.
- Risks: Accidental edit to a string-literal triple-quoted value; mitigated by Rule 1 + diff review.

#### Verification Gate

| Field | Detail |
|-------|--------|
| Intent | Confirm a package's translation pass left runtime behavior intact |
| Requirements | 1.3, 2.1, 2.2, 2.3, 2.4, 2.5, 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.4 |

**Responsibilities & Constraints**

- Re-run `grep -rln '[一-鿿]' backend/ --include='*.py'` after each package and confirm residual hits are limited to string-literal Chinese owned by adjacent tickets.
- Run `uv run python -m pytest backend/scripts/test_profile_format.py` and confirm exit 0.
- Run `git diff --stat` and confirm only in-scope file paths are listed.
- Spot-check a sample of changed files to confirm only comment/docstring lines changed.

**Dependencies**

- Inbound: Translation Pass.
- Outbound: Tracking & Branching (commits) when all checks pass; loops back to Translation Pass otherwise.
- External: `git`, `grep`, `pytest` (P0, required for verification).

**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]

**Implementation Notes**

- Integration: Run from the repo root; no environment variables required beyond what `uv run` already provides.
- Validation: All four checks (grep / pytest / diff scope / spot diff) must pass before committing.
- Risks: A flaky `pytest` run unrelated to this change would block progress; mitigated by reading the failure and re-running once.

#### Tracking & Branching

| Field | Detail |
|-------|--------|
| Intent | Branch, commit, push, and open PR per project conventions |
| Requirements | 6.1, 6.2, 6.3 |

**Responsibilities & Constraints**

- Branch name: `docs/i18n-7-translate-backend-comments`.
- Commit messages follow Conventional Commits with type `docs` and scope `i18n` (e.g. `docs(i18n): translate chinese docstrings/comments in backend/services`).
- PR closes #7 and references the spec.

**Dependencies**

- Inbound: Verification Gate (only commits when all checks pass).
- External: `gh` CLI (P0), `/done` skill (P0).

**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]

**Implementation Notes**

- Integration: Use `/done` skill at the end to handle branch/push/PR uniformly.
- Validation: Confirm PR body references issue #7 with `Closes #7` and lists each commit.
- Risks: None.

## Error Handling

### Error Strategy

This is a build-time / source-edit task, so there is no runtime error path. Errors are caught by the Verification Gate.

### Error Categories and Responses

- **Translation slipped into a string literal**: caught by `git diff --stat` + spot diff. Response: revert that hunk, re-apply translation against the docstring/comment only.
- **Test suite fails after a pass**: caught by `pytest`. Response: read failure, identify which line was incorrectly modified (likely a string the translator misclassified as a docstring), revert that hunk, re-apply.
- **Residual grep returns non-string-literal Chinese**: caught by post-pass grep. Response: classify those hits as in-scope and translate them in the next sub-pass.
- **Line exceeds 120 chars after translation**: caught by spot diff. Response: reflow the comment/docstring without changing executable code.

### Monitoring

None. This is a one-shot change; no production observability is required.

## Testing Strategy

The repository's existing tests are the safety net. No new tests are added.

### Default sections

- **Unit Tests**: Not applicable; nothing executable changes.
- **Integration Tests**: `uv run python -m pytest backend/scripts/test_profile_format.py` must continue to pass after each commit.
- **E2E/UI Tests**: Not applicable.
- **Verification checks (per package commit)**:
  1. Residual `grep -rln '[一-鿿]' backend/ --include='*.py'` (run from repo root) returns only files whose remaining Chinese is in string literals owned by adjacent tickets.
  2. `cd backend && uv run python -m pytest scripts/test_profile_format.py` exits 0.
  3. `git diff --stat HEAD~..HEAD` shows only in-scope file paths.
  4. Spot diff on three random changed files confirms only comment/docstring lines changed.

## Supporting References (Optional)

- `gap-analysis.md`: full file enumeration and pattern survey.
- `research.md`: discovery log, alternatives, and decisions.
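
As an illustration of verification check 1 in the Testing Strategy, the sketch below shows one way a reviewer might classify residual hits, since `grep -l` lists files but does not say whether the remaining Chinese sits in a comment, a docstring, or a string literal. It is purely illustrative and assumes nothing about the repository beyond the character range already used in the grep pattern. The spec itself relies on `grep` plus manual review, and the helper name `in_scope_residuals` and the sample source are hypothetical.

```python
import ast
import io
import re
import tokenize

CJK = re.compile(r"[\u4e00-\u9fff]")  # same range as the grep pattern [一-鿿]
OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def in_scope_residuals(source: str) -> list[tuple[int, str]]:
    """List (line, kind) for Chinese left in comments or docstrings."""
    hits = []
    # Comments never reach the AST, so read them from the token stream.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and CJK.search(tok.string):
            hits.append((tok.start[0], "comment"))
    # Docstrings: first-statement strings of modules, classes, and functions.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, OWNERS):
            doc = ast.get_docstring(node, clean=False)
            if doc and CJK.search(doc):
                hits.append((node.body[0].lineno, "docstring"))
    return sorted(hits)


# Hypothetical file content: one untranslated docstring, one untranslated
# comment, and one string literal that belongs to an adjacent ticket.
SAMPLE = '''"""模块说明。"""

GREETING = "你好"  # 问候语


def run():
    """Run the job."""
    return GREETING
'''

print(in_scope_residuals(SAMPLE))  # [(1, 'docstring'), (3, 'comment')]
```

The string literal `"你好"` on line 3 is deliberately not reported: it is out of scope for this ticket, so an empty result means the file's only remaining Chinese is string-literal content.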