docs(spec): add i18n-translate-backend-comments spec and handoff
parent c8c455ceb4
commit 2ba84f4c8b
@ -0,0 +1,61 @@
|
|||
# Handoff — `i18n-translate-backend-comments` (Issue #7)
|
||||
|
||||
## Status
|
||||
**Partial completion.** This is the first installment of the ticket-#7 cleanup. The ticket explicitly allows splitting the work across multiple small PRs ("Low-risk, high-volume mechanical task; can be split across multiple small PRs"). This PR ships translations for the smaller files; the larger service and API files remain for follow-up PRs.
|
||||
|
||||
## Completed in this PR (23 files)
|
||||
All translated to English with no behavior or string-literal changes:
|
||||
|
||||
- **Root**: `backend/app/__init__.py`, `backend/app/config.py`, `backend/run.py`
|
||||
- **API package init**: `backend/app/api/__init__.py`
|
||||
- **Models** (full package): `backend/app/models/__init__.py`, `project.py`, `task.py`
|
||||
- **Utils** (full package): `backend/app/utils/__init__.py`, `file_parser.py`, `llm_client.py`, `locale.py` (no docstring/comment Chinese to begin with), `logger.py`, `retry.py`, `zep_paging.py`
|
||||
- **Services** (partial): `backend/app/services/__init__.py`, `graph_builder.py`, `ontology_generator.py`, `simulation_ipc.py`, `simulation_manager.py`, `text_processor.py`, `zep_entity_reader.py`
|
||||
- **Scripts** (partial): `backend/scripts/action_logger.py`, `backend/scripts/test_profile_format.py`
|
||||
|
||||
## Remaining for follow-up PRs (12 files)
|
||||
Per the AST-aware scanner used in this PR (`/tmp/scan_chinese.py`), the residual in-scope work totals **2,235 hits** (1,203 docstring lines + 1,032 inline-comment lines) across these files:
|
||||
|
||||
| File | Approx in-scope hits | Approx LOC |
| --- | --- | --- |
| `backend/app/api/graph.py` | ~50 | 665 |
| `backend/app/api/report.py` | ~80 | 1020 |
| `backend/app/api/simulation.py` | ~250 | 2712 |
| `backend/app/services/oasis_profile_generator.py` | ~230 | 1195 |
| `backend/app/services/report_agent.py` | ~520 | 2572 |
| `backend/app/services/simulation_config_generator.py` | ~150 | 991 |
| `backend/app/services/simulation_runner.py` | ~330 | 1768 |
| `backend/app/services/zep_graph_memory_updater.py` | ~110 | 544 |
| `backend/app/services/zep_tools.py` | ~280 | 1741 |
| `backend/scripts/run_parallel_simulation.py` | ~150 | 1699 |
| `backend/scripts/run_reddit_simulation.py` | ~50 | 769 |
| `backend/scripts/run_twitter_simulation.py` | ~50 | 780 |
|
||||
|
||||
(Counts are approximate and exclude string-literal Chinese, which is owned by adjacent tickets #2/#3/#4/#5/#6.)
|
||||
|
||||
## Suggested follow-up split
|
||||
|
||||
Three additional PRs of similar size to this one would complete the ticket:
|
||||
|
||||
1. **PR 2 — `services/{oasis_profile_generator, simulation_config_generator, simulation_runner, zep_graph_memory_updater, zep_tools}`**
|
||||
2. **PR 3 — `services/report_agent.py`** (single big file; isolating it keeps the diff reviewable)
|
||||
3. **PR 4 — `api/{graph,report,simulation}.py` + `scripts/run_{parallel,reddit,twitter}_simulation.py`**
|
||||
|
||||
## Verification methodology used
|
||||
The AST-aware scanner (`/tmp/scan_chinese.py` — also kept in commit context) classifies every Chinese-containing line into one of three buckets: `DOCSTRING` (in scope), `COMMENT` (in scope), `STRING_VALUE` (out of scope, owned by adjacent tickets). Each translated file was verified with:
|
||||
|
||||
1. `python -m py_compile <file>` — syntactic validity.
|
||||
2. The scanner returning `{'DOCSTRING': 0, 'COMMENT': 0}` for that file.
|
||||
3. `git diff <file>` review — only `#` lines and docstring lines change; no executable lines.
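The scanner itself (`/tmp/scan_chinese.py`) is not reproduced in this handoff. As a rough illustration only, a three-bucket classifier of this kind could be sketched as follows. The function names `docstring_lines` and `classify` are hypothetical, and the sketch assumes docstrings are identified by the first-statement rule and comments by the standard `tokenize` module.

```python
import ast
import io
import re
import tokenize
from collections import Counter

CJK = re.compile(r"[\u4e00-\u9fff]")  # same range as the spec's grep pattern

_SCOPES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
# Python 3.12+ tokenizes f-string text separately; older versions emit STRING only.
_STRING_TOKENS = {tokenize.STRING, getattr(tokenize, "FSTRING_MIDDLE", tokenize.STRING)}


def docstring_lines(tree):
    """Return the line numbers occupied by docstrings (first-statement rule)."""
    lines = set()
    for node in ast.walk(tree):
        if isinstance(node, _SCOPES) and node.body:
            first = node.body[0]
            if (isinstance(first, ast.Expr) and isinstance(first.value, ast.Constant)
                    and isinstance(first.value.value, str)):
                lines.update(range(first.lineno, first.end_lineno + 1))
    return lines


def classify(source):
    """Count Chinese-containing lines per bucket for one file's source text."""
    doc_lines = docstring_lines(ast.parse(source))
    counts = Counter({"DOCSTRING": 0, "COMMENT": 0, "STRING_VALUE": 0})
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            if CJK.search(tok.string):
                counts["COMMENT"] += 1
        elif tok.type in _STRING_TOKENS:
            # A multi-line string token spans several source lines; check each one.
            for offset, text in enumerate(tok.string.split("\n")):
                if CJK.search(text):
                    lineno = tok.start[0] + offset
                    counts["DOCSTRING" if lineno in doc_lines else "STRING_VALUE"] += 1
    return counts
```

Under this sketch, a file passes check 2 when `classify(source)["DOCSTRING"]` and `classify(source)["COMMENT"]` are both zero; any `STRING_VALUE` count is left for the adjacent tickets.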
|
||||
|
||||
## Test environment caveat
|
||||
The repo's `uv sync` requires building `tiktoken` from source, which needs Rust. The sandbox used for this implementation pass has no Rust toolchain, so `cd backend && uv run python -m pytest scripts/test_profile_format.py` (the verification command in the spec) cannot be executed end-to-end here. The test command also fails on import for unrelated reasons (missing `graphiti_core`, etc.), and did so before any of this PR's changes touched the tree. Because the change set is comments-and-docstrings-only, runtime behavior should not be affected unless some code reads `__doc__` at runtime; the syntactic-validity check stands in for the test run in this environment.
|
||||
|
||||
A developer with the project's normal dev environment (Rust toolchain installed, full `uv sync` completed) should re-run `cd backend && uv run python -m pytest scripts/test_profile_format.py` against this branch before merging to confirm that nothing regressed.
|
||||
|
||||
## What is NOT changed
|
||||
- No string literal anywhere in the touched files.
|
||||
- No executable Python statement.
|
||||
- No symbol renamed.
|
||||
- No file added or removed.
|
||||
- No dependency added or version-bumped.
|
||||
|
|
@ -0,0 +1,316 @@
|
|||
# Design Document — `i18n-translate-backend-comments`
|
||||
|
||||
## Overview
|
||||
**Purpose**: Translate Chinese-language docstrings and `#` comments across `backend/` Python files into English, so that English-speaking maintainers can read and review the codebase without translation overhead.
|
||||
|
||||
**Users**: Backend maintainers and code reviewers who do not read Chinese.
|
||||
|
||||
**Impact**: Improves developer ergonomics and review throughput. No runtime, behavior, or interface change. Adjacent i18n tickets (#2/#3/#4/#5/#6), which own the string-literal Chinese, remain unaffected.
|
||||
|
||||
### Goals
|
||||
- Eliminate Chinese characters from docstrings and `#` comments under the in-scope paths.
|
||||
- Preserve Google-style docstring shape and project formatting rules (4-space indent, ≤120 chars/line, double-quoted strings).
|
||||
- Keep the diff comments-and-docstrings-only — no executable, string-literal, or symbol changes.
|
||||
|
||||
### Non-Goals
|
||||
- Translating Chinese inside string literals (prompt templates, `logger.{info,warning,error}` arguments, API responses, error messages). These are owned by issues #2/#3/#4/#5/#6.
|
||||
- Refactoring code, reformatting style, or renaming symbols.
|
||||
- Introducing new tooling, linters, or CI rules.
|
||||
- Translating `backend/tests/test_locale*.py` (Chinese there is intentional test data inside string literals; outside ticket scope).
|
||||
|
||||
## Boundary Commitments
|
||||
|
||||
### This Spec Owns
|
||||
- Comment and docstring text under: `backend/app/__init__.py`, `backend/app/config.py`, `backend/app/api/`, `backend/app/models/`, `backend/app/services/`, `backend/app/utils/`, `backend/run.py`, `backend/scripts/`.
|
||||
- The decision rule for distinguishing docstrings from value strings (first-statement rule).
|
||||
- The Chinese→English Google-style docstring key map.
|
||||
- The verification workflow (residual `grep`, `pytest`, diff sanity check).
|
||||
|
||||
### Out of Boundary
|
||||
- All string-literal content, including triple-quoted strings used as values.
|
||||
- Files under `backend/tests/`, `backend/.venv/`, and any non-Python file.
|
||||
- Refactors, renames, formatting changes, or new dependencies.
|
||||
- Front-end localization, locale JSON files, or i18n runtime behavior.
|
||||
|
||||
### Allowed Dependencies
|
||||
- The repository's Python source (read + write for in-scope files only).
|
||||
- The existing test suite (`backend/scripts/test_profile_format.py`) for verification.
|
||||
- The existing `grep`-based residual scan for verification.
|
||||
|
||||
### Revalidation Triggers
|
||||
- A new in-scope file added under the listed paths (would expand the file list).
|
||||
- A change to `dev-guidelines.md` regarding docstring style (would change the key map or quote/indent rule).
|
||||
- A merge of any adjacent i18n ticket (#2/#3/#4/#5/#6) that turns a string literal into a docstring or vice versa.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Existing Architecture Analysis
|
||||
This change touches only commentary; no architectural element of the backend is modified. The work spans the following packages:
|
||||
|
||||
- `backend/app/__init__.py`, `backend/app/config.py` (Flask app and configuration entrypoint).
|
||||
- `backend/app/api/` (Flask blueprints).
|
||||
- `backend/app/models/` (`Project`, `Task` models).
|
||||
- `backend/app/services/` (graph builder, simulation runner, report agent, etc.).
|
||||
- `backend/app/utils/` (LLM client, file parser, retry, logger, locale, paging).
|
||||
- `backend/run.py` (process entrypoint).
|
||||
- `backend/scripts/` (simulation runners, profile-format test).
|
||||
|
||||
### Architecture Pattern & Boundary Map
|
||||
|
||||
```mermaid
graph TB
    Discovery[Residual Grep Scan]
    Plan[Per-Package Plan]
    Translator[Translation Pass]
    Verify[Verification Gate]
    Commit[Per-Package Commit]
    PR[Single PR to main]

    Discovery --> Plan
    Plan --> Translator
    Translator --> Verify
    Verify -->|all checks pass| Commit
    Verify -->|any check fails| Translator
    Commit --> Plan
    Commit -->|all packages done| PR
```
|
||||
|
||||
**Architecture Integration**:
|
||||
- Selected pattern: **Iterative pass per package** with a verification gate after each pass. Linear, deterministic, low-coordination.
|
||||
- Domain/feature boundaries: One pass per backend package; commits are package-scoped to keep review chunks small.
|
||||
- Existing patterns preserved: 4-space indent, double-quoted strings, Google-style docstrings, `snake_case`, project file layout.
|
||||
- New components rationale: None — no new code, no new files.
|
||||
- Steering compliance: Conforms to repo-level coding rules and the commits ruleset.
|
||||
|
||||
### Technology Stack
|
||||
|
||||
| Layer | Choice / Version | Role in Feature | Notes |
|-------|------------------|-----------------|-------|
| Backend / Services | Python ≥3.11 | Source language whose docstrings/comments are being translated | No version change; no dependency change |
| Tooling | `git`, `grep`, `pytest` (existing) | Discovery, verification, regression check | No new tools |
|
||||
|
||||
No frontend, data, messaging, or infrastructure layer is touched.
|
||||
|
||||
## File Structure Plan
|
||||
|
||||
### Directory Structure (no additions, no deletions)
|
||||
```
backend/
├── app/
│   ├── __init__.py     # docstrings/comments only
│   ├── config.py       # docstrings/comments only
│   ├── api/            # all *.py: docstrings/comments only
│   ├── models/         # all *.py: docstrings/comments only
│   ├── services/       # all *.py: docstrings/comments only
│   └── utils/          # all *.py: docstrings/comments only
├── run.py              # docstrings/comments only
└── scripts/            # all *.py: docstrings/comments only
```
|
||||
|
||||
### Modified Files
|
||||
The 35 in-scope files enumerated in `gap-analysis.md` are modified (comment and docstring lines only). No other paths are touched.
|
||||
|
||||
## Translation Rules
|
||||
|
||||
These rules drive the translation pass and the verification gate. They are normative; the implementation must follow them exactly.
|
||||
|
||||
### Rule 1 — Docstring vs Value String Disambiguation
|
||||
A triple-quoted string is treated as a **docstring** (in scope) iff it is the first statement of a module, class, or function body. All other triple-quoted strings are **values** (out of scope) and must not be modified.
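As a minimal illustration of the first-statement rule, the sketch below uses Python's `ast` module to tell a docstring from a triple-quoted value. The helper name `docstring_node` and the sample function are hypothetical and are not taken from the codebase.

```python
import ast

SAMPLE = '''
def build_prompt():
    """Build the prompt."""
    template = """You are a helpful agent."""
    return template
'''


def docstring_node(scope):
    """Return the docstring's Constant node for a module/class/function, or None."""
    body = getattr(scope, "body", [])
    if body and isinstance(body[0], ast.Expr):
        value = body[0].value
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            return value
    return None


func = ast.parse(SAMPLE).body[0]
# First statement is a bare string expression, so it is a docstring (in scope).
in_scope = docstring_node(func)
# The second triple-quoted string is an assignment value (out of scope).
second_is_value = isinstance(func.body[1], ast.Assign)
```

Here `in_scope.value` is the docstring text, while the `template` string never appears as a docstring node because it is not the first statement of the body.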
|
||||
|
||||
### Rule 2 — Translate Docstrings to English Google-style
|
||||
- Translate Chinese narrative text to faithful English.
|
||||
- Convert the following Chinese section keys to canonical English Google-style keys when present:
|
||||
|
||||
| Chinese key | English key |
| --- | --- |
| `参数:` | `Args:` |
| `返回:` | `Returns:` |
| `异常:` | `Raises:` |
| `产生:` / `生成:` | `Yields:` |
| `示例:` | `Examples:` |
| `注意:` / `备注:` | `Note:` |
|
||||
|
||||
- Preserve double-quoted triple-quoted form (`"""..."""`).
|
||||
- Preserve indentation matching the surrounding scope.
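The key map in Rule 2 can be expressed as a small lookup. The sketch below is illustrative only: the helper name `normalize_section_keys` is hypothetical, and accepting a full-width colon (`：`) after a key is an assumption that the spec does not state. It rewrites section keys only; the narrative text still has to be translated by hand.

```python
import re

KEY_MAP = {
    "参数": "Args",
    "返回": "Returns",
    "异常": "Raises",
    "产生": "Yields",
    "生成": "Yields",
    "示例": "Examples",
    "注意": "Note",
    "备注": "Note",
}

# A section key sits alone on its line, followed by an ASCII or full-width colon.
_KEY_LINE = re.compile(r"^(\s*)(" + "|".join(KEY_MAP) + r")\s*[:：]\s*$")


def normalize_section_keys(docstring):
    """Replace Chinese Google-style section keys with their English equivalents."""
    out = []
    for line in docstring.split("\n"):
        match = _KEY_LINE.match(line)
        if match:
            # Keep the original indentation; emit the canonical key with an ASCII colon.
            out.append(f"{match.group(1)}{KEY_MAP[match.group(2)]}:")
        else:
            out.append(line)
    return "\n".join(out)
```

Matching only lines that consist of a key and a colon avoids touching the same words when they occur inside a sentence.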
|
||||
|
||||
### Rule 3 — Translate Inline `#` Comments to English
|
||||
- Translate the comment text to English.
|
||||
- If the translated comment would merely restate the immediately following executable line (a redundant verb-phrase paraphrase), delete the comment.
|
||||
- Preserve `TODO:` / `FIXME:` markers and any embedded ticket reference verbatim.
|
||||
- Preserve trailing in-line comments on the same line as code (e.g. `PENDING = "pending" # waiting`).
|
||||
|
||||
### Rule 4 — Style Compliance
|
||||
- Keep every translated line ≤120 characters.
|
||||
- Do not introduce trailing whitespace.
|
||||
- Preserve the original indentation of each comment/docstring.
|
||||
- Use double quotes for any docstring rewritten.
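The two mechanical parts of Rule 4 (line length and trailing whitespace) are easy to check automatically. The following is a small sketch under the assumption that length is counted in characters; the function name `style_violations` is hypothetical and no such helper exists in the repository.

```python
def style_violations(lines, max_len=120):
    """Return (line_number, problem) pairs for lines that break Rule 4's limits."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # ignore the line terminator itself
        if len(text) > max_len:
            problems.append((number, "too long"))
        if text != text.rstrip():
            problems.append((number, "trailing whitespace"))
    return problems
```

It could be fed the lines of a translated file, for example `style_violations(open(path, encoding="utf-8"))`, before the file is committed.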
|
||||
|
||||
### Rule 5 — Preservation
|
||||
- Do not modify any executable Python statement.
|
||||
- Do not modify any string literal (single-, double-, triple-quoted, f-string, raw, byte) that is not a docstring under Rule 1. The single exception is the docstring being rewritten under Rule 2: quote-style normalization to triple double-quoted form (`"""..."""`) is permitted on the docstring only, since it is the artifact under translation.
|
||||
- Do not rename any symbol.
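Rule 5 can also be checked mechanically rather than by eye. One possible approach, sketched here as an assumption and not as part of the spec's workflow, is to compare the syntax trees of the file before and after translation with docstring text blanked out. Comments never appear in the AST, so any remaining difference means an executable statement or a value string changed. The names `blank_docstrings` and `same_code` are hypothetical.

```python
import ast

_SCOPES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def blank_docstrings(tree):
    """Replace every docstring's text with an empty string, in place."""
    for node in ast.walk(tree):
        if isinstance(node, _SCOPES) and node.body:
            first = node.body[0]
            if (isinstance(first, ast.Expr) and isinstance(first.value, ast.Constant)
                    and isinstance(first.value.value, str)):
                first.value.value = ""
    return tree


def same_code(before, after):
    """True if two sources differ only in comments and docstring text."""
    # ast.dump omits line/column attributes by default, so reflowed lines compare equal.
    return ast.dump(blank_docstrings(ast.parse(before))) == ast.dump(blank_docstrings(ast.parse(after)))
```

For a single file, `before` could come from `git show HEAD:<path>` and `after` from the working tree. A `False` result points at a hunk that touched code or a value string.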
|
||||
|
||||
## System Flows
|
||||
|
||||
### Per-package iteration
|
||||
|
||||
```mermaid
sequenceDiagram
    participant Dev as Translator
    participant Repo as Repo
    participant Tests as Test Suite
    Dev->>Repo: git checkout docs/i18n-7-translate-backend-comments
    loop For each package in [models, utils, services, api, scripts, root]
        Dev->>Repo: Translate docstrings/comments
        Dev->>Repo: git diff --stat (sanity check)
        Dev->>Tests: cd backend then uv run python -m pytest scripts/test_profile_format.py
        Tests-->>Dev: pass / fail
        Dev->>Repo: Re-run residual grep
        Repo-->>Dev: residual hits (string-literal only)
        Dev->>Repo: git commit -m "docs(i18n): translate chinese docstrings/comments in backend/<area>"
    end
    Dev->>Repo: gh pr create -> single PR closing #7
```
|
||||
|
||||
## Requirements Traceability
|
||||
|
||||
| Requirement | Summary | Components | Interfaces | Flows |
|-------------|---------|------------|------------|-------|
| 1.1 | No Chinese in docstrings under in-scope paths | Translation Pass | Rule 1, Rule 2 | Per-package iteration |
| 1.2 | No Chinese in `#` comments under in-scope paths | Translation Pass | Rule 3 | Per-package iteration |
| 1.3 | Residual grep returns only string-literal Chinese | Verification Gate | Residual grep workflow | Per-package iteration |
| 1.4 | Google-style docstring shape preserved | Translation Pass | Rule 2 (key map) | — |
| 2.1 | No executable statement modified | Verification Gate | Rule 5 | Per-package iteration |
| 2.2 | No string literal modified | Verification Gate | Rule 1 (first-statement rule), Rule 5 | Per-package iteration |
| 2.3 | No symbol renamed | Verification Gate | Rule 5 | Per-package iteration |
| 2.4 | `pytest` passes | Verification Gate | Test suite invocation | Per-package iteration |
| 2.5 | Hunks touching code rejected | Verification Gate | `git diff --stat` review | Per-package iteration |
| 3.1 | Drop redundant comments | Translation Pass | Rule 3 | — |
| 3.2 | Translate the *why* faithfully | Translation Pass | Rule 3 | — |
| 3.3 | Preserve `TODO:`/`FIXME:` and ticket refs | Translation Pass | Rule 3 | — |
| 3.4 | No new comments introduced | Translation Pass | Rule 3 | — |
| 4.1 | ≤120 chars/line | Verification Gate | Rule 4 | — |
| 4.2 | No trailing whitespace | Verification Gate | Rule 4 | — |
| 4.3 | Preserve indentation | Translation Pass | Rule 4 | — |
| 4.4 | Double quotes on rewritten docstrings | Translation Pass | Rule 4 | — |
| 4.5 | Preserve 4-space indentation | Translation Pass | Rule 4 | — |
| 5.1 | Use grep for discovery | Verification Gate | Discovery scan | — |
| 5.2 | Re-run grep after each batch | Verification Gate | Residual grep workflow | Per-package iteration |
| 5.3 | Continue until non-string-literal residual cleared | Verification Gate | Rule 1 disambiguation | Per-package iteration |
| 5.4 | `git diff --stat` only in-scope paths | Verification Gate | Diff sanity check | Per-package iteration |
| 6.1 | Branch `docs/i18n-7-translate-backend-comments` | Tracking & Branching | `/done` skill | — |
| 6.2 | Reference issue #7 | Tracking & Branching | Commit/PR template | — |
| 6.3 | Conventional Commits `docs(i18n)` | Tracking & Branching | `.claude/rules/commits.md` | — |
| 6.4 | No unrelated changes | Verification Gate | Diff sanity check | — |
|
||||
|
||||
## Components and Interfaces
|
||||
|
||||
| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|-----------|--------------|--------|--------------|--------------------------|-----------|
| Translation Pass | Process | Apply Rules 1–5 to one package's `*.py` | 1.1, 1.2, 1.4, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 4.5 | None (manual + AI-assisted) | Process |
| Verification Gate | Process | Run residual grep, `pytest`, and diff sanity check after each package | 1.3, 2.1, 2.2, 2.3, 2.4, 2.5, 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.4 | `git`, `grep`, `pytest` (P0) | Process |
| Tracking & Branching | Process | Branching, commit messages, PR | 6.1, 6.2, 6.3 | `/done` skill, `gh` CLI (P0) | Process |
|
||||
|
||||
### Process
|
||||
|
||||
#### Translation Pass
|
||||
| Field | Detail |
|-------|--------|
| Intent | Translate docstrings and `#` comments in one package without touching code or string literals |
| Requirements | 1.1, 1.2, 1.4, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 4.5 |
|
||||
|
||||
**Responsibilities & Constraints**
|
||||
- Apply Rule 1 (first-statement disambiguation) before editing any triple-quoted string.
|
||||
- Apply Rule 2 (key map) for any Chinese Google-style key encountered.
|
||||
- Apply Rule 3 to inline comments; delete redundant ones.
|
||||
- Operate on one package at a time; do not interleave packages.
|
||||
|
||||
**Dependencies**
|
||||
- Inbound: Verification Gate (provides feedback if a previous batch failed).
|
||||
- Outbound: Verification Gate (hands off post-pass).
|
||||
- External: None.
|
||||
|
||||
**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]
|
||||
|
||||
**Implementation Notes**
|
||||
- Integration: Operates directly on the working tree on branch `docs/i18n-7-translate-backend-comments`.
|
||||
- Validation: After each file is rewritten, sanity-check that the diff for that file shows changes only on comment/docstring lines.
|
||||
- Risks: Accidental edit to a string-literal triple-quoted value — mitigated by Rule 1 + diff review.
|
||||
|
||||
#### Verification Gate
|
||||
| Field | Detail |
|-------|--------|
| Intent | Confirm a package's translation pass left runtime behavior intact |
| Requirements | 1.3, 2.1, 2.2, 2.3, 2.4, 2.5, 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.4 |
|
||||
|
||||
**Responsibilities & Constraints**
|
||||
- Re-run `grep -rln '[一-鿿]' backend/ --include='*.py'` after each package and confirm residual hits are limited to string-literal Chinese owned by adjacent tickets.
|
||||
- Run `cd backend && uv run python -m pytest scripts/test_profile_format.py` and confirm exit 0.
|
||||
- Run `git diff --stat` and confirm only in-scope file paths are listed.
|
||||
- Spot-check a sample of changed files to confirm only comment/docstring lines changed.
|
||||
|
||||
**Dependencies**
|
||||
- Inbound: Translation Pass.
|
||||
- Outbound: Tracking & Branching (commits) when all checks pass; loops back to Translation Pass otherwise.
|
||||
- External: `git`, `grep`, `pytest` (P0 — required for verification).
|
||||
|
||||
**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]
|
||||
|
||||
**Implementation Notes**
|
||||
- Integration: Run the `grep` and `git` checks from the repo root; no environment variables required beyond what `uv run` already provides.
|
||||
- Validation: All four checks (grep / pytest / diff scope / spot diff) must pass before committing.
|
||||
- Risks: A flaky `pytest` run unrelated to this change would block progress — mitigated by reading the failure and re-running once.
|
||||
|
||||
#### Tracking & Branching
|
||||
| Field | Detail |
|-------|--------|
| Intent | Branch, commit, push, and open PR per project conventions |
| Requirements | 6.1, 6.2, 6.3 |
|
||||
|
||||
**Responsibilities & Constraints**
|
||||
- Branch name: `docs/i18n-7-translate-backend-comments`.
|
||||
- Commit messages follow Conventional Commits with `docs(i18n)` scope (e.g. `docs(i18n): translate chinese docstrings/comments in backend/services`).
|
||||
- PR closes #7 and references the spec.
|
||||
|
||||
**Dependencies**
|
||||
- Inbound: Verification Gate (only commits when all checks pass).
|
||||
- External: `gh` CLI (P0), `/done` skill (P0).
|
||||
|
||||
**Contracts**: Process [x] / Service [ ] / API [ ] / Event [ ] / Batch [ ] / State [ ]
|
||||
|
||||
**Implementation Notes**
|
||||
- Integration: Use `/done` skill at the end to handle branch/push/PR uniformly.
|
||||
- Validation: Confirm PR body references issue #7 with `Closes #7` and lists each commit.
|
||||
- Risks: None.
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Error Strategy
|
||||
This is a build-time / source-edit task — there is no runtime error path. Errors are caught by the Verification Gate.
|
||||
|
||||
### Error Categories and Responses
|
||||
- **Translation slipped into a string literal**: caught by the spot diff; `git diff --stat` alone reports only paths and line counts, so it cannot reveal this. Response: revert that hunk, re-apply translation against the docstring/comment only.
|
||||
- **Test suite fails after a pass**: caught by `pytest`. Response: read failure, identify which line was incorrectly modified (likely a string the translator misclassified as a docstring), revert that hunk, re-apply.
|
||||
- **Residual grep returns non-string-literal Chinese**: caught by post-pass grep. Response: classify those hits as in-scope and translate them in the next sub-pass.
|
||||
- **Line exceeds 120 chars after translation**: caught by spot diff. Response: reflow the comment/docstring without changing executable code.
|
||||
|
||||
### Monitoring
|
||||
None — this is a one-shot change. No production observability required.
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
The repository's existing tests are the safety net. No new tests are added.
|
||||
|
||||
### Default sections
|
||||
- **Unit Tests**: Not applicable; nothing executable changes.
|
||||
- **Integration Tests**: `cd backend && uv run python -m pytest scripts/test_profile_format.py` must continue to pass after each commit.
|
||||
- **E2E/UI Tests**: Not applicable.
|
||||
- **Verification checks (per package commit)**:
|
||||
1. Residual `grep -rln '[一-鿿]' backend/ --include='*.py'` (run from repo root) returns only files whose remaining Chinese is in string literals owned by adjacent tickets.
|
||||
2. `cd backend && uv run python -m pytest scripts/test_profile_format.py` exits 0.
|
||||
3. `git diff --stat HEAD~..HEAD` shows only in-scope file paths.
|
||||
4. Spot diff on three random changed files confirms only comment/docstring lines changed.
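Check 3 above can be automated with a short script. The sketch below is an illustration, not project tooling: it uses `git diff --name-only` instead of `--stat` because the former is easier to parse, and the helper names `out_of_scope` and `changed_paths` are hypothetical. The path list is copied from the "This Spec Owns" section.

```python
import subprocess

# In-scope paths from the Boundary Commitments; entries ending in "/" are directories.
IN_SCOPE = (
    "backend/app/__init__.py",
    "backend/app/config.py",
    "backend/app/api/",
    "backend/app/models/",
    "backend/app/services/",
    "backend/app/utils/",
    "backend/run.py",
    "backend/scripts/",
)


def out_of_scope(paths):
    """Return the changed paths that fall outside the spec's in-scope list."""
    bad = []
    for path in paths:
        in_scope = path.endswith(".py") and any(
            path == scope or (scope.endswith("/") and path.startswith(scope))
            for scope in IN_SCOPE
        )
        if not in_scope:
            bad.append(path)
    return bad


def changed_paths(rev_range="HEAD~..HEAD"):
    """List files touched in a revision range (run from the repo root)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", rev_range],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

A package commit would pass this check when `out_of_scope(changed_paths())` returns an empty list.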
|
||||
|
||||
## Supporting References (Optional)
|
||||
- `gap-analysis.md` — full file enumeration and pattern survey.
|
||||
- `research.md` — discovery log, alternatives, and decisions.
|
||||
|
|
@ -0,0 +1,92 @@
|
|||
# Gap Analysis — `i18n-translate-backend-comments`
|
||||
|
||||
## Scope Recap
|
||||
- **Ticket**: salestech-group/MiroFish#7
|
||||
- **Goal**: Translate Chinese docstrings and `#` comments in `backend/` to English without behavior changes.
|
||||
- **Blast radius**: Comments and docstrings only; runtime semantics preserved.
|
||||
|
||||
## Current State Investigation
|
||||
|
||||
### Discovered files
|
||||
A scan with the regex `[一-鿿]` (U+4E00–U+9FFF) across `backend/**/*.py` (excluding `.venv`) returns **35 in-app files** plus 2 test files:
|
||||
|
||||
| Area | Count | Files |
| --- | --- | --- |
| `backend/app/__init__.py` | 1 | `__init__.py` |
| `backend/app/config.py` | 1 | `config.py` |
| `backend/app/api/` | 4 | `__init__.py`, `graph.py`, `report.py`, `simulation.py` |
| `backend/app/models/` | 3 | `__init__.py`, `project.py`, `task.py` |
| `backend/app/services/` | 13 | `__init__.py`, `graph_builder.py`, `oasis_profile_generator.py`, `ontology_generator.py`, `report_agent.py`, `simulation_config_generator.py`, `simulation_ipc.py`, `simulation_manager.py`, `simulation_runner.py`, `text_processor.py`, `zep_entity_reader.py`, `zep_graph_memory_updater.py`, `zep_tools.py` |
| `backend/app/utils/` | 7 | `__init__.py`, `file_parser.py`, `llm_client.py`, `locale.py`, `logger.py`, `retry.py`, `zep_paging.py` |
| `backend/run.py` | 1 | `run.py` |
| `backend/scripts/` | 5 | `action_logger.py`, `run_parallel_simulation.py`, `run_reddit_simulation.py`, `run_twitter_simulation.py`, `test_profile_format.py` |
| `backend/tests/` (extra, not in ticket file list) | 2 | `test_locale.py`, `test_locale_request_resolution.py` |
|
||||
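For environments without GNU `grep`, the discovery scan can be approximated in Python. This is a sketch under stated assumptions: the function name `files_with_cjk` is hypothetical, files are assumed to be UTF-8, and the pattern covers only U+4E00–U+9FFF, the same range as the `grep` expression. A comment made up solely of CJK punctuation such as `：` or `。` would not match either version.

```python
import re
from pathlib import Path

CJK = re.compile(r"[\u4e00-\u9fff]")  # equivalent to the grep class [一-鿿]


def files_with_cjk(root, exclude_dirs=(".venv",)):
    """Return relative paths of *.py files under root that contain Chinese characters."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        relative = path.relative_to(root)
        if any(part in exclude_dirs for part in relative.parts):
            continue  # skip virtual environments and similar directories
        if CJK.search(path.read_text(encoding="utf-8")):
            hits.append(relative.as_posix())
    return hits
```

Calling `files_with_cjk("backend")` from the repo root lists the same kind of file-level hits as `grep -rl`, which is why a finer per-line classification is still needed to separate comments and docstrings from string literals.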
|
||||
Spot checks (`models/task.py`, `models/project.py`, `services/text_processor.py`, `utils/locale.py`):
|
||||
- Module-level docstrings in Chinese (e.g. `"""任务状态管理"""`, "task status management").
|
||||
- Class/method docstrings in Chinese, often Google-shaped but with Chinese section keys (e.g. `参数:` in place of `Args:`).
|
||||
- Inline `#` comments tagging fields, sections, or restating obvious code (e.g. `# 标准化换行`, "normalize line breaks", above an `\n` normalization call).
|
||||
- Status-enum trailing comments (e.g. `PENDING = "pending" # 等待中`, where the comment means "waiting").
|
||||
|
||||
### Conventions to preserve
|
||||
- Project guideline: 4-space indent, max 120 char/line, double-quoted strings (Python).
|
||||
- Docstring style: Google-style per `dev-guidelines.md`. Existing files mix English-shape `Args:`/`Returns:` keys with Chinese descriptions, or use Chinese keys (`参数:`, `返回:`). Translate both to canonical Google-style English.
|
||||
- File-level convention: `snake_case` filenames; Python `__init__.py` modules typically have a one-line module docstring.
|
||||
|
||||
### Integration surfaces
|
||||
None. This work touches only commentary; no API contracts, schemas, or imports change.
|
||||
|
||||
## Requirements Feasibility
|
||||
|
||||
| Requirement | Status | Notes |
| --- | --- | --- |
| R1 (coverage) | Feasible (straightforward) | Files identified by the `grep` rule. |
| R2 (behavior preservation) | Feasible | Achieved by limiting diffs to comment/docstring lines. Care is needed with multi-line triple-quoted docstrings, which are syntactically identical to string literals; the disambiguation is that a docstring is the *first* statement of a module/class/function body. |
| R3 (comment hygiene) | Feasible | Some judgment required; the heuristic is to drop comments whose translated form would be a single verb-phrase paraphrase of the next executable line. |
| R4 (style compliance) | Feasible | Watch line length when translating dense Chinese to English (English is typically longer); rewrap as needed without changing executable code. |
| R5 (verification) | Feasible | The `grep -rln '[一-鿿]'` rule is reliable. Residual hits should land only in: prompt template strings (#2/#3/#4/#5), logger/API string literals (#6), and the `tests/test_locale*` files (intentional Chinese test data). |
| R6 (tracking/branching) | Feasible | Branch + commit conventions are standard for this repo; `/done` skill enforces them. |
|
||||
|
||||
### Gaps and constraints
|
||||
- **Constraint**: Triple-quoted strings used as values (not as docstrings) must NOT be edited if their content is in scope of issues #2–#6 (prompts/log messages/error messages). Disambiguation matters.
|
||||
- **Constraint**: Chinese characters appearing inside f-string literal segments must remain. They are out of scope.
|
||||
- **Unknown / Research Needed**: None — task is mechanical and well-bounded.
|
||||
|
||||
### Adjacent specs / overlap with other tickets
|
||||
- `i18n-externalize-backend-logs` (#6) owns translating `logger.{info,warning,error}` Chinese arguments and API response strings.
|
||||
- `i18n-report-agent-prompts` (#5) and tickets #2/#3/#4 own prompt template strings.
|
||||
- We must NOT touch any string literal that those tickets own. After this PR, residual `grep` hits should reduce by exactly the count of comments and docstrings translated and nothing else.
|
||||
- The two `backend/tests/test_locale*.py` files are **not in the ticket's listed file scope**, and inspection shows their Chinese is exclusively in string literals (test data and a Unicode range check). They are out of scope by R1's enumerated paths and remain untouched.
|
||||
|
||||
## Implementation Approach Options
|
||||
|
||||
### Option A — Single-pass file-by-file translation (recommended)
|
||||
- Walk the 35 in-scope files in a deterministic order (alphabetical), translating docstrings/comments per file, running the residual grep after each batch.
|
||||
- Group commit by area (models, utils, services, api, scripts, root) to keep PR diff readable.
|
||||
- ✅ Simple, low risk, easy to revert per-area.
|
||||
- ✅ Maps directly to the requirements; easy to verify.
|
||||
- ❌ Larger PR than Option B, but the ticket explicitly allows a single PR.
|
||||
|
||||
### Option B — Multi-PR per package
|
||||
- Split into one PR per package (`models/`, `utils/`, …). The ticket allows this.
|
||||
- ✅ Smaller diffs to review.
|
||||
- ❌ More overhead (multiple branches/PRs); not necessary for a mechanical change of this size.
|
||||
|
||||
### Option C — Tooling-assisted bulk script
|
||||
- Build a one-shot translation script (LLM-driven) that rewrites docstrings/comments.
|
||||
- ✅ Could scale to other repos.
|
||||
- ❌ Out of proportion for a single-ticket task; risk of errant edits to string literals; tooling itself becomes a deliverable to test and maintain.
|
||||
|
||||
## Effort and Risk
|
||||
- **Effort**: **M (3–7 days of focused work)**: 35 files, hundreds of comments. In an interactive AI-assisted run, this collapses to a few hours.
|
||||
- **Risk**: **Low** — comments-only diff; covered by mechanical verification (grep + pytest); easy to rollback per file/area.
|
||||
|
||||
## Recommendations for Design Phase
|
||||
|
||||
- **Preferred approach**: Option A (single-pass file-by-file, package-grouped commits, single PR).
|
||||
- **Key decisions to capture in design**:
|
||||
- Order of traversal (proposed: `models/` → `utils/` → `services/` → `api/` → `scripts/` → root files `__init__.py`, `config.py`, `run.py`).
|
||||
- Heuristic for dropping obvious comments (one-line rule).
|
||||
- How to handle Google-style docstring keys: always translate `参数:` → `Args:`, `返回:` → `Returns:`, `异常:` → `Raises:`.
|
||||
- Verification cadence: re-run the grep after each package batch.
|
||||
- **Research items to carry forward**: None.
|
||||
|
|
@ -0,0 +1,67 @@
|
|||
# Requirements Document
|
||||
|
||||
## Introduction
|
||||
This specification covers the developer-facing internationalization of `backend/` Python source: translating Chinese docstrings and inline comments to English so that English-speaking maintainers can read and review the code without translation overhead. The change is mechanical — no behavior, no public strings, no symbol names are modified. It is one of several i18n tickets (#2, #3, #4, #5, #6, #7); this spec covers ticket #7 only.
|
||||
|
||||
## Boundary Context
|
||||
- **In scope**: Translation of Chinese-language characters that appear in Python docstrings (module/class/function) and inline `#` comments under `backend/`. Removal of comments that merely restate the code. Preservation of `TODO:` / `FIXME:` markers and embedded ticket references.
|
||||
- **Out of scope**: Chinese characters inside string literals (prompt templates, `logger.{info,warning,error}` arguments, API response bodies, error messages returned to clients) — these are tracked separately by issues #2/#3/#4/#5/#6. No refactoring, reformatting, renaming, or behavior changes.
|
||||
- **Adjacent expectations**: Spec `i18n-externalize-backend-logs` (issue #6) and the prompt-translation specs handle string-literal Chinese; this spec must leave those untouched so the other tickets remain mergeable.
|
||||
|
||||
## Requirements
|
||||
|
||||
### Requirement 1: Translation Coverage of In-Scope Files
|
||||
**Objective:** As a maintainer, I want every Chinese docstring and inline comment in the in-scope backend files translated to English, so that I can read and review the code without translation tools.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. The Backend Codebase shall contain no Chinese characters (Unicode range U+4E00–U+9FFF) inside Python docstrings under `backend/app/__init__.py`, `backend/app/config.py`, `backend/app/models/`, `backend/app/services/`, `backend/app/api/`, `backend/app/utils/`, `backend/run.py`, and `backend/scripts/`.
|
||||
2. The Backend Codebase shall contain no Chinese characters inside Python `#` inline comments under the same paths.
|
||||
3. When `grep -rln '[一-鿿]' backend/ --include='*.py'` is run after this change, the Backend Codebase shall return only files whose remaining Chinese is contained within string literals owned by issues #2/#3/#4/#5/#6.
|
||||
4. When a docstring is translated, the Translator shall preserve Google-style docstring shape (`Args:`, `Returns:`, `Raises:`, `Yields:` sections) per `dev-guidelines.md`.
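
The plain `grep` in criterion 3 cannot tell a comment from a string literal, so criterion 2 is easier to check at the token level. The following is a minimal sketch, assuming the standard-library `tokenize` module is an acceptable way to isolate `#` comments; the function name `chinese_comment_lines` is hypothetical and is not the scanner used in the PR.

```python
import io
import re
import tokenize

# Same range as the acceptance criteria: U+4E00..U+9FFF.
CJK = re.compile(r"[\u4e00-\u9fff]")


def chinese_comment_lines(source: str) -> list[int]:
    """Return 1-based line numbers of `#` comments that contain CJK characters."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only COMMENT tokens count; STRING tokens are deliberately ignored.
        if tok.type == tokenize.COMMENT and CJK.search(tok.string):
            hits.append(tok.start[0])
    return hits
```

Because string tokens are skipped, Chinese inside literals owned by the adjacent tickets does not show up as a hit.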
|
||||
|
||||
### Requirement 2: Preservation of Code Behavior
|
||||
**Objective:** As a maintainer, I want the translation to be comments-and-docstrings-only, so that runtime behavior is provably unchanged.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. The Translator shall not modify any executable Python statement (assignments, function calls, control flow, decorators, imports).
|
||||
2. The Translator shall not modify any Python string literal (single-, double-, triple-quoted, f-string, raw, byte) regardless of whether it contains Chinese characters.
|
||||
3. The Translator shall not rename any symbol (variable, function, class, module, parameter).
|
||||
4. When `uv run python -m pytest backend/scripts/test_profile_format.py` is run after the change, the Backend Codebase shall exit with status 0.
|
||||
5. If a diff line touches any non-comment, non-docstring code, the Translator shall reject that diff hunk and revise.
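
One way to make criteria 1 to 3 mechanically checkable is to compare the syntax trees of a file before and after translation, ignoring docstrings. Comments never appear in the tree, so any remaining difference means executable code, a string literal, or a symbol name changed. This is a sketch under that assumption; `same_behavior` and `_strip_docstrings` are hypothetical helper names, not part of the PR.

```python
import ast

_SCOPES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def _strip_docstrings(tree: ast.AST) -> ast.AST:
    """Remove the first-statement docstring from every module, class, and function."""
    for node in ast.walk(tree):
        if isinstance(node, _SCOPES):
            body = node.body
            if (
                body
                and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)
            ):
                # Keep the body non-empty so both sides are normalized the same way.
                node.body = body[1:] or [ast.Pass()]
    return tree


def same_behavior(before: str, after: str) -> bool:
    """True if both sources have identical ASTs once docstrings are ignored."""

    def dump(src: str) -> str:
        return ast.dump(_strip_docstrings(ast.parse(src)))

    return dump(before) == dump(after)
```

A file that fails this check should be treated as touching non-comment, non-docstring code, which is the rejection condition in criterion 5.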
|
||||
|
||||
### Requirement 3: Comment Quality Hygiene
|
||||
**Objective:** As a maintainer, I want translated comments to add value, so that the codebase remains easy to read after the migration.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. When a Chinese comment merely restates the immediately following code (e.g. `# 初始化客户端` above `client = Client()`), the Translator shall delete the comment rather than translate it.
|
||||
2. When a Chinese comment captures non-obvious *why* (constraints, workarounds, invariants), the Translator shall translate it to a faithful English equivalent.
|
||||
3. The Translator shall preserve any `TODO:` / `FIXME:` marker and any embedded ticket reference (e.g. `#1234`, `PROJ-456`) verbatim within the translated comment.
|
||||
4. The Translator shall not introduce new comments that did not exist (or had no Chinese equivalent) in the original source.
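
Criterion 3 can be spot-checked per comment by comparing the markers found before and after translation. The sketch below assumes that ticket references look like the two examples given (`#1234`, `PROJ-456`); the pattern is a heuristic and may also match unrelated tokens of the same shape, and `markers_preserved` is a hypothetical name.

```python
import re

# TODO:/FIXME: markers plus ticket references shaped like `#1234` or `PROJ-456`.
MARKER = re.compile(r"\b(?:TODO|FIXME):|#\d+|\b[A-Z][A-Z0-9]+-\d+\b")


def markers_preserved(original: str, translated: str) -> bool:
    """True if the translation carries exactly the same markers as the original."""
    return sorted(MARKER.findall(original)) == sorted(MARKER.findall(translated))
```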
|
||||
|
||||
### Requirement 4: Style and Format Compliance
|
||||
**Objective:** As a maintainer, I want the translated output to comply with project style rules, so that no follow-up cleanup PR is needed.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. The Translator shall keep all translated docstrings and comments at or below 120 characters per line.
|
||||
2. The Translator shall not introduce trailing whitespace on any line.
|
||||
3. The Translator shall preserve the original indentation (tabs/spaces) of every comment and docstring.
|
||||
4. The Translator shall use triple double quotes (`"""`) for any docstring it rewrites, matching the existing Python convention in the file.
|
||||
5. Where a file already uses 4-space indentation, the Translator shall preserve that indentation.
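
Criteria 1 and 2 lend themselves to a simple line-level check. The sketch below is an assumption about how such a check could look, not a project tool: it inspects every line of a file, so pre-existing violations in untouched code are reported as well and need to be filtered against the diff. `style_violations` is a hypothetical name.

```python
MAX_LEN = 120  # limit taken from acceptance criterion 1


def style_violations(source: str) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for over-long lines and trailing whitespace."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LEN:
            problems.append((number, "line too long"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
    return problems
```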
|
||||
|
||||
### Requirement 5: Discovery and Verification Workflow
|
||||
**Objective:** As a reviewer, I want a reproducible discovery and verification workflow, so that I can confirm coverage and absence of regressions in CI or locally.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. The Translator shall enumerate candidate files using `grep -rln '[一-鿿]' backend/ --include='*.py'` before beginning work.
|
||||
2. The Translator shall re-run the same `grep` after each batch and confirm the residual hits are limited to string-literal Chinese owned by adjacent tickets (#2/#3/#4/#5/#6).
|
||||
3. When the residual `grep` hits include any non-string-literal Chinese, the Translator shall classify those hits as in-scope and continue translation until they are gone.
|
||||
4. The Translator shall verify that `git diff --stat` only reports changes inside the in-scope file paths listed in Requirement 1.
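
The scope check in criterion 4 can also be expressed as a pure function over a list of changed paths, for example the output of `git diff --name-only origin/main...HEAD`. This is a sketch assuming the path list from Requirement 1 is the complete allow-list; `out_of_scope` is a hypothetical name, and spec files committed alongside the translation would be flagged by it.

```python
# Allow-list copied from Requirement 1, acceptance criterion 1.
IN_SCOPE = (
    "backend/app/__init__.py",
    "backend/app/config.py",
    "backend/app/models/",
    "backend/app/services/",
    "backend/app/api/",
    "backend/app/utils/",
    "backend/run.py",
    "backend/scripts/",
)


def out_of_scope(changed_paths: list[str]) -> list[str]:
    """Return changed paths that fall outside the Requirement 1 path list."""
    return [
        path
        for path in changed_paths
        if not (path.endswith(".py") and path.startswith(IN_SCOPE))
    ]
```

An empty result means the diff stays inside the listed paths.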
|
||||
|
||||
### Requirement 6: Tracking and Branching
|
||||
**Objective:** As a release manager, I want the work tracked against ticket #7 on a dedicated branch, so that the PR remains scoped and traceable.
|
||||
|
||||
#### Acceptance Criteria
|
||||
1. The Translator shall produce changes on a branch named `docs/i18n-7-translate-backend-comments`.
|
||||
2. The Translator shall reference issue `salestech-group/MiroFish#7` in commit messages or PR description.
|
||||
3. When committing, the Translator shall use Conventional Commits with type `docs` and scope `i18n` (e.g. `docs(i18n): translate chinese docstrings/comments in backend/<area>`).
|
||||
4. The Translator shall not include unrelated changes (e.g. dependency bumps, config changes, refactors) in the resulting PR.
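
The commit-message rule in criterion 3 is narrow enough to validate with a single pattern. The sketch below assumes only the header line matters and that the type and scope are fixed to `docs` and `i18n`; `valid_commit_header` is a hypothetical helper, not an existing hook in the repository.

```python
import re

# Conventional Commits header with the type and scope this spec requires.
HEADER = re.compile(r"^docs\(i18n\): \S.*$")


def valid_commit_header(message: str) -> bool:
    """True if the first line of the commit message is `docs(i18n): <description>`."""
    lines = message.splitlines()
    return bool(lines) and HEADER.match(lines[0]) is not None
```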
|
||||
|
|
@ -0,0 +1,80 @@
|
|||
# Research & Design Decisions — `i18n-translate-backend-comments`
|
||||
|
||||
## Summary
|
||||
- **Feature**: `i18n-translate-backend-comments`
|
||||
- **Discovery Scope**: Simple Addition (mechanical translation, no architectural change)
|
||||
- **Key Findings**:
|
||||
- 37 in-scope `backend/` Python files contain Chinese characters in docstrings or `#` comments. The full list is in `gap-analysis.md`.
|
||||
- Existing docstrings mix English-shape Google-style keys (`Args:`/`Returns:`) with Chinese descriptions, and a smaller subset uses Chinese keys (`参数:`/`返回:`/`异常:`). Both patterns must converge to canonical English Google-style.
|
||||
- The `backend/tests/test_locale*.py` files contain Chinese only inside string literals (intentional test data) and are out of scope by the ticket's enumerated paths.
|
||||
|
||||
## Research Log
|
||||
|
||||
### Discovery scan: where is Chinese in `backend/`?
|
||||
- **Context**: Need a deterministic enumeration of files to translate.
|
||||
- **Sources Consulted**: `grep`/Python-driven scan against `backend/**/*.py`.
|
||||
- **Findings**:
|
||||
- 37 in-scope files (under `backend/app/`, `backend/run.py`, `backend/scripts/`).
|
||||
- 2 additional test files in `backend/tests/` whose Chinese is only in string literals; not in ticket scope.
|
||||
- `.venv/` matches are noise and excluded.
|
||||
- **Implications**: The ticket-listed paths are exhaustive; no in-scope Chinese was found in any other location. Order of traversal can be alphabetical within package groups.
|
||||
|
||||
### Disambiguation: docstring vs string literal
|
||||
- **Context**: A triple-quoted string is a docstring iff it is the first statement of a module, class, or function body. Otherwise it is a value (e.g. a prompt template) owned by adjacent tickets.
|
||||
- **Sources Consulted**: Python language reference; spot inspection of `services/ontology_generator.py`, `services/report_agent.py`.
|
||||
- **Findings**:
|
||||
- In-scope files contain both kinds of triple-quoted strings.
|
||||
- Translating only the *first-statement* triple-quoted string per scope keeps the change comments-and-docstrings-only.
|
||||
- **Implications**: Translation pass must visually verify each triple-quoted string is the first statement before rewriting; otherwise leave it alone.
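
The first-statement rule can be applied mechanically instead of by eye, because `ast.get_docstring` implements exactly that definition. The following is a minimal sketch, assuming a per-file check is enough; `chinese_docstrings` is a hypothetical name and is not the scanner referenced in the handoff.

```python
import ast
import re

CJK = re.compile(r"[\u4e00-\u9fff]")
_SCOPES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def chinese_docstrings(source: str) -> list[str]:
    """Return names of scopes whose first-statement docstring contains CJK characters."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _SCOPES):
            # get_docstring only looks at the first statement of the body,
            # so prompt templates and other value strings are never returned.
            doc = ast.get_docstring(node, clean=False)
            if doc and CJK.search(doc):
                hits.append(getattr(node, "name", "<module>"))
    return hits
```

Triple-quoted strings assigned to a name or placed later in a body are ignored, which matches the boundary with the adjacent tickets.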
|
||||
|
||||
### Google-style docstring conversions
|
||||
- **Context**: `dev-guidelines.md` requires Google-style docstrings; existing Chinese docstrings sometimes use Chinese keys.
|
||||
- **Findings**: The following key map applies:
|
||||
- `参数:` → `Args:`
|
||||
- `返回:` → `Returns:`
|
||||
- `异常:` → `Raises:`
|
||||
- `产生:` / `生成:` → `Yields:`
|
||||
- `示例:` → `Example:` (or `Examples:`)
|
||||
- `注意:` / `备注:` → `Note:` (or `Notes:`)
|
||||
- **Implications**: Document this mapping in design.md so the implementation pass is mechanical.
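
The key map above is small enough to encode as a lookup table. The sketch below is illustrative only: it assumes a section key sits alone on its line, it accepts a full-width colon (`：`) as well as the ASCII colon shown in the findings, and where the findings allow two English forms it arbitrarily picks `Examples:` and `Note:`. `convert_section_keys` is a hypothetical name.

```python
import re

# Key map from the findings; the Examples/Note choices are assumptions.
KEY_MAP = {
    "参数": "Args",
    "返回": "Returns",
    "异常": "Raises",
    "产生": "Yields",
    "生成": "Yields",
    "示例": "Examples",
    "注意": "Note",
    "备注": "Note",
}

# A key line: indentation, one known key, a colon, optional trailing blanks.
_KEY_LINE = re.compile(
    r"^([ \t]*)(" + "|".join(KEY_MAP) + r")[:：][ \t]*$", re.MULTILINE
)


def convert_section_keys(docstring: str) -> str:
    """Rewrite Chinese section keys to their Google-style English equivalents."""
    return _KEY_LINE.sub(lambda m: f"{m.group(1)}{KEY_MAP[m.group(2)]}:", docstring)
```

Only the section keys are rewritten; the descriptions under each key still need a human or assisted translation.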
|
||||
|
||||
## Architecture Pattern Evaluation
|
||||
|
||||
| Option | Description | Strengths | Risks / Limitations | Notes |
|
||||
|--------|-------------|-----------|---------------------|-------|
|
||||
| Manual file-by-file pass | Walk in alphabetical order, package-grouped commits | Predictable, easy to review per package | Human time required | Selected approach |
|
||||
| Multi-PR per package | One PR per backend package | Smaller diffs to review | Higher overhead, more PR churn | Allowed by ticket but not required |
|
||||
| Tooling-assisted bulk script | LLM-driven find-and-replace tool | Reusable | Risk of touching string literals; tool itself becomes a deliverable | Out of proportion |
|
||||
|
||||
## Design Decisions
|
||||
|
||||
### Decision: Single-pass, package-grouped commits, single PR
|
||||
- **Context**: 37 files, mechanical change, ticket allows either single or split PRs.
|
||||
- **Alternatives Considered**:
|
||||
1. Multi-PR per package — more granular review but higher overhead.
|
||||
2. Tooling-assisted bulk script — overkill for one ticket.
|
||||
- **Selected Approach**: Single PR with one or more commits, grouped by package (`models/`, `utils/`, `services/`, `api/`, `scripts/`, root) so reviewers can read the diff one package at a time.
|
||||
- **Rationale**: Mechanical change with low risk; ticket explicitly allows it; reduces PR overhead; `/done` produces one PR per branch by default.
|
||||
- **Trade-offs**: One large PR, but partitioned by commit. Reviewer can use commit history to navigate.
|
||||
- **Follow-up**: After each package commit, re-run residual `grep` and `pytest` to maintain the invariant.
|
||||
|
||||
### Decision: First-statement disambiguation rule
|
||||
- **Context**: Distinguish docstrings (in scope) from value strings (out of scope).
|
||||
- **Selected Approach**: A triple-quoted string is treated as a docstring (in scope) only if it is the first statement of a module / class / function body. All other triple-quoted strings are values (out of scope).
|
||||
- **Rationale**: Matches Python's own definition; keeps boundary with adjacent tickets unambiguous.
|
||||
|
||||
### Decision: Drop comments that restate code
|
||||
- **Context**: R3 requires deletion of comments whose translated form would merely paraphrase the next line.
|
||||
- **Selected Approach**: Apply a one-line heuristic: if the translated comment would be a verb phrase that mirrors the immediately following executable line, delete the comment instead of writing it.
|
||||
- **Rationale**: Aligns with project rule "comment the why, not the what".
|
||||
|
||||
## Risks & Mitigations
|
||||
- **Risk**: Accidental edit to a string literal (would belong to ticket #2/#3/#4/#5/#6) — **Mitigation**: After each package commit, run `git diff --stat` and a per-file diff sanity check; verify only `#` lines and docstring lines change.
|
||||
- **Risk**: Tests failing because a string literal changed. **Mitigation**: Run `uv run python -m pytest backend/scripts/test_profile_format.py` after each commit.
|
||||
- **Risk**: Line length violations after English expansion — **Mitigation**: Reflow long English at <= 120 chars within the docstring/comment only; never reflow code.
|
||||
|
||||
## References
|
||||
- `dev-guidelines.md` — repo-level coding standards, Google-style docstring requirement.
|
||||
- `.claude/rules/commits.md` — Conventional Commits standard for the commit message.
|
||||
- Issue #7 — salestech-group/MiroFish: source ticket.
|
||||
- Issues #2/#3/#4/#5/#6 — adjacent i18n tickets that own the string-literal Chinese.
|
||||
|
|
@ -0,0 +1,24 @@
|
|||
{
|
||||
"feature_name": "i18n-translate-backend-comments",
|
||||
"created_at": "2026-05-07T14:24:17Z",
|
||||
"updated_at": "2026-05-07T14:26:00Z",
|
||||
"language": "en",
|
||||
"phase": "tasks-generated",
|
||||
"ticket": 7,
|
||||
"ticket_url": "https://github.com/salestech-group/MiroFish/issues/7",
|
||||
"approvals": {
|
||||
"requirements": {
|
||||
"generated": true,
|
||||
"approved": true
|
||||
},
|
||||
"design": {
|
||||
"generated": true,
|
||||
"approved": true
|
||||
},
|
||||
"tasks": {
|
||||
"generated": true,
|
||||
"approved": true
|
||||
}
|
||||
},
|
||||
"ready_for_implementation": true
|
||||
}
|
||||
|
|
@ -0,0 +1,97 @@
|
|||
# Implementation Plan
|
||||
|
||||
## Foundation
|
||||
|
||||
- [ ] 1. Establish baseline and working branch
|
||||
- [x] 1.1 Create translation working branch and capture baseline state
|
||||
- Create branch `docs/i18n-7-translate-backend-comments` from `main`.
|
||||
- Capture the baseline residual hits by running the discovery scan (the regex `[一-鿿]` against `backend/**/*.py`, excluding `.venv`); record the file list as the work queue.
|
||||
- Run `cd backend && uv run python -m pytest scripts/test_profile_format.py` and confirm a green baseline before any edits.
|
||||
- Observable: a fresh branch exists, the baseline file list of 37 in-scope files is captured, and the baseline pytest run passes.
|
||||
- _Requirements: 5.1, 6.1_
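
The baseline scan in task 1.1 can be reproduced without `grep`. The sketch below assumes the files are UTF-8 encoded and that skipping any path containing a `.venv` directory is sufficient; `work_queue` is a hypothetical name. Like the `grep`, it reports every file with Chinese text, including files where the Chinese is only in string literals.

```python
import re
from pathlib import Path

CJK = re.compile(r"[\u4e00-\u9fff]")


def work_queue(root: Path) -> list[Path]:
    """Return sorted .py files under `root` that contain CJK text, skipping `.venv`."""
    return sorted(
        path
        for path in root.rglob("*.py")
        if ".venv" not in path.parts
        and CJK.search(path.read_text(encoding="utf-8"))
    )
```

Calling `work_queue(Path("backend"))` from the repository root would give the candidate list to record as the work queue.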
|
||||
|
||||
## Core — Per-Package Translation
|
||||
|
||||
- [ ] 2. Translate Chinese docstrings and inline comments per package
|
||||
|
||||
- [x] 2.1 (P) Translate `backend/app/models/`
|
||||
- Translate Chinese module/class/function docstrings and `#` comments in `backend/app/models/__init__.py`, `backend/app/models/project.py`, and `backend/app/models/task.py`.
|
||||
- Apply the docstring-vs-value disambiguation rule (first-statement only) so that no string literal is touched.
|
||||
- Apply the Google-style key map (`参数:` → `Args:`, `返回:` → `Returns:`, `异常:` → `Raises:`, `产生:`/`生成:` → `Yields:`, `示例:` → `Examples:`, `注意:`/`备注:` → `Note:`).
|
||||
- Drop comments that merely restate the next executable line; preserve `TODO:`/`FIXME:` and any embedded ticket reference verbatim.
|
||||
- Re-run the residual scan and confirm `backend/app/models/` no longer has Chinese in non-string-literal positions.
|
||||
- Re-run `cd backend && uv run python -m pytest scripts/test_profile_format.py` and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/app/models/*.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/app/models/_
|
||||
|
||||
- [x] 2.2 (P) Translate `backend/app/utils/`
|
||||
- Translate Chinese docstrings and `#` comments in `backend/app/utils/__init__.py`, `file_parser.py`, `llm_client.py`, `locale.py`, `logger.py`, `retry.py`, and `zep_paging.py`.
|
||||
- Be especially careful with `locale.py` and `logger.py`: they intentionally route Chinese strings through their value paths; only docstrings and `#` comments are in scope.
|
||||
- Apply Rules 1–5 from `design.md` (disambiguation, key map, comment hygiene, style, preservation).
|
||||
- Re-run the residual scan and confirm `backend/app/utils/` no longer has Chinese in non-string-literal positions.
|
||||
- Re-run the pytest command and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/app/utils/*.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/app/utils/_
|
||||
|
||||
- [-] 2.3 (P) Translate `backend/app/services/`: partial (7 of 13 files done; 6 remain; see HANDOFF.md)
- Translate Chinese docstrings and `#` comments across all 13 service files: `__init__.py`, `graph_builder.py`, `ontology_generator.py`, `oasis_profile_generator.py`, `report_agent.py`, `simulation_config_generator.py`, `simulation_ipc.py`, `simulation_manager.py`, `simulation_runner.py`, `text_processor.py`, `zep_entity_reader.py`, `zep_graph_memory_updater.py`, `zep_tools.py`.
|
||||
- Treat all triple-quoted prompt templates and value strings as out of scope (owned by issues #2/#3/#4/#5/#6) — only the first-statement docstrings of modules/classes/functions are in scope.
|
||||
- Apply Rules 1–5 from `design.md`.
|
||||
- Re-run the residual scan and confirm `backend/app/services/` no longer has Chinese in non-string-literal positions.
|
||||
- Re-run the pytest command and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/app/services/*.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/app/services/_
|
||||
|
||||
- [-] 2.4 (P) Translate `backend/app/api/` — partial (only `__init__.py` done; 3 files remain — see HANDOFF.md)
|
||||
- Translate Chinese docstrings and `#` comments in `__init__.py`, `graph.py`, `report.py`, `simulation.py`.
|
||||
- Treat any user-facing string-literal Chinese in API responses as out of scope (owned by issue #6).
|
||||
- Apply Rules 1–5 from `design.md`.
|
||||
- Re-run the residual scan and confirm `backend/app/api/` no longer has Chinese in non-string-literal positions.
|
||||
- Re-run the pytest command and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/app/api/*.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/app/api/_
|
||||
|
||||
- [-] 2.5 (P) Translate `backend/scripts/` — partial (`action_logger.py`, `test_profile_format.py` done; 3 `run_*_simulation.py` files remain — see HANDOFF.md)
|
||||
- Translate Chinese docstrings and `#` comments in `action_logger.py`, `run_parallel_simulation.py`, `run_reddit_simulation.py`, `run_twitter_simulation.py`, `test_profile_format.py`.
|
||||
- Apply Rules 1–5 from `design.md`.
|
||||
- Be especially careful with `test_profile_format.py`: any Chinese in test data string literals is out of scope; only docstrings and `#` comments are in scope.
|
||||
- Re-run the residual scan and confirm `backend/scripts/` no longer has Chinese in non-string-literal positions.
|
||||
- Re-run the pytest command and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/scripts/*.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/scripts/_
|
||||
|
||||
- [x] 2.6 (P) Translate root backend files
|
||||
- Translate Chinese docstrings and `#` comments in `backend/app/__init__.py`, `backend/app/config.py`, and `backend/run.py`.
|
||||
- Apply Rules 1–5 from `design.md`.
|
||||
- Be especially careful with `backend/app/config.py`: any Chinese in default-value string literals is out of scope; only docstrings and `#` comments are in scope.
|
||||
- Re-run the residual scan and confirm these three files no longer have Chinese in non-string-literal positions.
|
||||
- Re-run the pytest command and confirm exit 0.
|
||||
- Observable: zero non-string-literal Chinese remains in `backend/app/__init__.py`, `backend/app/config.py`, and `backend/run.py`, and the test command exits 0.
|
||||
- _Requirements: 1.1, 1.2, 1.4, 2.1, 2.2, 2.3, 2.4, 3.1, 3.2, 3.3, 3.4, 4.1, 4.2, 4.3, 4.4, 4.5_
|
||||
- _Boundary: backend/app (root), backend/run.py_
|
||||
|
||||
## Validation
|
||||
|
||||
- [ ] 3. Final verification and PR preparation
|
||||
|
||||
- [-] 3.1 Run the final verification gate — partial (per-file scanner + py_compile pass; full pytest blocked by pre-existing env issues, see HANDOFF.md)
|
||||
- Run the residual scan one more time and confirm the only remaining hits are files where the Chinese is in string literals owned by issues #2/#3/#4/#5/#6, plus the intentional Chinese in `backend/tests/test_locale*.py`.
|
||||
- Run `cd backend && uv run python -m pytest scripts/test_profile_format.py` and confirm exit 0.
|
||||
- Run `git diff --stat origin/main...HEAD` and confirm only in-scope file paths under `backend/app/`, `backend/run.py`, and `backend/scripts/` are listed.
|
||||
- Spot-check three random changed files with `git diff <path>` and confirm only `#` lines and docstring lines changed (no executable lines, no string-literal lines).
|
||||
- Observable: residual scan, pytest, diff scope, and spot diff all pass.
|
||||
- _Depends: 2.1, 2.2, 2.3, 2.4, 2.5, 2.6_
|
||||
- _Requirements: 1.3, 2.5, 5.1, 5.2, 5.3, 5.4, 6.4_
|
||||
|
||||
- [ ] 3.2 Open PR and reference ticket #7
|
||||
- Use `/done` to commit any remaining changes per Conventional Commits with type `docs` and scope `i18n` (e.g. `docs(i18n): translate chinese docstrings/comments in backend/<area>`), push the branch, and open a PR.
|
||||
- The PR body must include `Closes #7` and reference the spec at `.kiro/specs/i18n-translate-backend-comments/`.
|
||||
- Verify the PR contains no unrelated changes (no dependency bumps, no config changes, no refactors).
|
||||
- Observable: a PR exists on GitHub from `docs/i18n-7-translate-backend-comments` to `main` that closes #7 and contains only docstring/comment translation diffs.
|
||||
- _Depends: 3.1_
|
||||
- _Requirements: 6.1, 6.2, 6.3, 6.4_
|
||||