# Design: i18n-e2e-english-verification

## Overview

**Purpose**: This spec produces a deterministic, re-runnable verification pass that proves (or disproves) that the MiroFish 5-step pipeline runs cleanly in English, and posts a structured report on issue #10 with a `pass` / `gap` / `manual-pending` status per checklist item.

**Users**: i18n maintainers reviewing the epic (#11), and any future verifier re-running the audit after subsequent merges. The deliverable is read by humans on GitHub (issue comment) and re-run by humans (or CI in a future iteration) to confirm parity.

**Impact**: No production code is modified. The repository gains one new directory tree (`.kiro/specs/i18n-e2e-english-verification/`) containing the spec, the audit scripts, and the captured outputs. One GitHub comment is posted on #10. Up to four follow-up issues are filed.

### Goals

- Static-audit `backend/app`, `frontend/src`, `locales/en.json` for CJK characters; classify every match.
- Verify EN / ZH locale catalogue parity and flag suspect untranslated entries.
- Verify LLM-prompt assets respect the requested locale.
- Document locale-propagation gaps across Flask → `Task` → OASIS subprocess → ReACT agent.
- Post a single canonical comment on issue #10 with per-checklist statuses.
- File follow-up issues for every gap (no inline fixes).
- Make the audit re-runnable by capturing artefacts under `.kiro/specs/.../audit/<commit-sha>/`.

### Non-Goals

- Patching any `gap` discovered (R7.3: strictly verification).
- Performance / load testing.
- Adding new locales beyond EN / ZH.
- Building a permanent CI guard (filed as a follow-up issue, not implemented here).
- Live UI / Docker walkthrough (captured as `manual-pending` in this run's report).

## Boundary Commitments

### This Spec Owns

- The audit scripts and the captured audit outputs under `.kiro/specs/i18n-e2e-english-verification/audit/`.
- The `gap-report.md` artefact and the comment body posted on issue #10.
- The grouping rule for follow-up issues (one per category: UI strings, backend log strings, backend LLM-prompt labels, suggested CI guard).
- The classification schemes: `pass` / `gap` / `manual-pending` for checklist items, and `deliberate` / `gap` / `non-applicable` / `review-needed` for individual matches.

### Out of Boundary

- Any modification of files under `backend/app/`, `frontend/src/`, or `locales/`.
- Fixing the gaps the audit discovers; those land in their own follow-up issues.
- Live UI walkthrough, Docker run, or LLM execution.
- A permanent CI check (filed as a separate follow-up issue).

### Allowed Dependencies

- `git` (for `git grep`, capturing HEAD sha).
- `gh` CLI (for the comment + follow-up issues; with documented fallback when unavailable).
- `python3` (for the catalogue parity diff).
- The repo working tree at HEAD of the working branch.

### Revalidation Triggers

- Any merge to `main` that touches `locales/`, `backend/app/`, or `frontend/src/` invalidates the captured audit; a re-run should produce a new `audit/<commit-sha>/` directory.
- A change to issue #10's checklist body (e.g. a new sub-item) requires re-mapping in `gap-report.md`.
- A change to the four follow-up categories (e.g. project decides to file one issue per file) requires re-running the issue-filing script with new grouping.

## Architecture

### Existing Architecture Analysis

- The MiroFish backend is Flask + Python `Task` workers + an OASIS subprocess (per CLAUDE.md). i18n surfaces are: `vue-i18n` for the SPA, `locales/*.json` shared by both ends, a backend logger that resolves keys per locale, and inline LLM prompts in `backend/app/services/*.py`.
- The verification pass does **not** hook into any of these; it reads files only. No Flask blueprint, no `Task` model, no Neo4j query.

### Architecture Pattern & Boundary Map

```mermaid
graph TB
    Verifier[Verifier shell entrypoint]
    Audit[audit_cjk.sh]
    Parity[check_parity.py]
    Classify[classify.py]
    Report[render_report.py]
    Comment[post_comment.sh]
    FollowUp[file_followups.sh]

    Repo[Working tree]
    Captures[audit slash sha slash]
    GH[GitHub via gh CLI]

    Verifier --> Audit
    Verifier --> Parity
    Audit --> Classify
    Parity --> Classify
    Classify --> Report
    Report --> Captures
    Report --> Comment
    Report --> FollowUp
    Audit --> Repo
    Parity --> Repo
    Comment --> GH
    FollowUp --> GH
```

**Architecture Integration**:

- **Selected pattern**: Linear pipeline of read-only scripts that each emit a single artefact, composed by a thin shell entrypoint. No mutable state outside `audit/<sha>/`.
- **Domain boundaries**: `audit_cjk.sh` owns the raw grep; `check_parity.py` owns the catalogue diff; `classify.py` owns the four-class labels; `render_report.py` owns the comment body; `post_comment.sh` and `file_followups.sh` own GitHub side effects.
- **Existing patterns preserved**: Shell + Python script pair (matches the project's existing `setup`/`run` style); no new test runner, no new linter.
- **New components rationale**: Each script is single-purpose so failures (e.g. `gh` permission issues) are isolated and the pipeline can resume from the failed step.
- **Steering compliance**: No production-code touch (R7.3); 4-space indent in any committed Python; double quotes; `snake_case`; Bash scripts exit with a non-zero status on any uncaught error.

### Technology Stack

| Layer | Choice / Version | Role in Feature | Notes |
|-------|------------------|-----------------|-------|
| CLI / Audit runner | Bash 5+, `git grep -P` (PCRE) | Run the canonical CJK audit | `\x{...}` ranges require PCRE; `git grep -E` will fail on this regex (verified). |
| Static checks | Python 3.11 (project minimum per CLAUDE.md) | Catalogue parity + classification + report rendering | Standard library only; no new deps. |
| GitHub integration | `gh` CLI | Post the comment, file follow-ups | Falls back to `audit/<sha>/PENDING-*` files when missing. |
| Output formats | Plain text + Markdown | Captures + comment body | No HTML, no JSON beyond `gh`'s own. |

## File Structure Plan

### Directory Structure

```
.kiro/specs/i18n-e2e-english-verification/
├── spec.json
├── requirements.md
├── gap-analysis.md
├── research.md
├── design.md
├── tasks.md
├── HANDOFF.md                            # only if implementation hits the 3-cycle remediation cap
└── audit/
    ├── scripts/
    │   ├── run_audit.sh                  # entrypoint - chains the steps below
    │   ├── audit_cjk.sh                  # git grep PCRE + bucket counts
    │   ├── check_parity.py               # locales/en.json vs zh.json key + identical-value diff
    │   ├── classify.py                   # apply 4-class labels to grep matches
    │   ├── render_report.py              # produce gap-report.md + comment-body.md
    │   ├── post_comment.sh               # gh issue comment 10 with comment-body.md (or PENDING-*)
    │   └── file_followups.sh             # gh issue create per category (or PENDING-*)
    └── <commit-sha>/                     # captured outputs of one verification run
        ├── cjk-grep.txt                  # raw `git grep -nP ...` output
        ├── cjk-grep-bucketed.txt         # the same, partitioned by top-level path
        ├── parity.txt                    # en/zh diff summary
        ├── classified.csv                # match-by-match label
        ├── gap-report.md                 # the canonical structured report
        ├── comment-body.md               # the markdown posted to issue #10
        ├── PENDING-issue-10-comment.md   # only if gh comment failed
        └── PENDING-followups/            # follow-up bodies; remain only if gh issue create failed
            ├── 01-frontend-ui-strings.md
            ├── 02-backend-log-strings.md
            ├── 03-backend-prompt-labels.md
            └── 04-permanent-ci-guard.md
```

### Modified Files

- *(None.)* The spec explicitly forbids touching production source.

## System Flows

```mermaid
sequenceDiagram
    participant V as Verifier
    participant Run as run_audit.sh
    participant FS as Working tree
    participant GH as GitHub

    V->>Run: bash run_audit.sh
    Run->>FS: git grep -nP, git rev-parse HEAD
    FS-->>Run: cjk-grep.txt + sha
    Run->>FS: read locales json
    FS-->>Run: en/zh dicts
    Run->>Run: classify
    Run->>FS: write audit slash sha slash artefacts
    Run->>GH: gh issue comment 10
    alt gh succeeds
        GH-->>Run: comment URL
        Run->>GH: gh issue create x N follow-ups
        GH-->>Run: issue URLs
    else gh fails
        Run->>FS: write PENDING markdown to audit slash sha slash
    end
    Run-->>V: exit 0 success or exit 2 PENDING
```

**Key flow decisions**:

- The audit always writes the captured artefacts to disk first (idempotent, re-runnable). The GitHub side effects are the *last* steps so any earlier failure leaves a complete capture for inspection.
- A non-zero `gh` exit shifts the pipeline to PENDING mode rather than failing the whole run; the script exits `2` to flag "audit ran but GitHub side-effects didn't apply".

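
The two decisions above can be sketched as a small control-flow model. This is only an illustration in Python (the real entrypoint is the Bash script `run_audit.sh`); the function name `run_pipeline` and the convention that a side effect returns `False` when it fell back to PENDING are assumptions made for the example.

```python
from typing import Callable, Iterable

EXIT_OK = 0       # audit and GitHub side effects both succeeded
EXIT_PENDING = 2  # audit succeeded, at least one side effect fell back to PENDING


def run_pipeline(
    capture_steps: Iterable[Callable[[], None]],
    side_effects: Iterable[Callable[[], bool]],
) -> int:
    """Run read-only capture steps first, then GitHub side effects.

    A capture step signals failure by raising, which aborts the run before
    any side effect is attempted. A side effect returns False when it had to
    write a PENDING file instead of reaching GitHub (hypothetical convention).
    """
    for step in capture_steps:
        step()  # any exception propagates: no side effects on a broken capture

    pending = False
    for effect in side_effects:
        # Attempt every side effect even if an earlier one fell back.
        if not effect():
            pending = True
    return EXIT_PENDING if pending else EXIT_OK
```

Keeping the side effects in a separate, later loop is what guarantees that a failed `gh` call can never leave the on-disk capture incomplete.
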
## Requirements Traceability

| Requirement | Summary | Components | Interfaces / Artefacts | Flows |
|-------------|---------|------------|------------------------|-------|
| 1.1 | Run canonical `git grep` | audit_cjk.sh | `cjk-grep.txt` | Audit step |
| 1.2 | Classify each match | classify.py | `classified.csv` | Audit step |
| 1.3 | Record file:line + step tag for `gap` | classify.py | `classified.csv` (`step` column) | Audit step |
| 1.4 | No file modifications during audit | run_audit.sh | scripts are read-only | - |
| 1.5 | `en.json` CJK = always `gap` | classify.py | hard rule in classifier | Audit step |
| 2.1 | Enumerate keys recursively | check_parity.py | `parity.txt` | Audit step |
| 2.2 | Missing-key gaps recorded | check_parity.py | `parity.txt` (missing-key block) | Audit step |
| 2.3 | EN catalogue CJK = `gap` | check_parity.py | `parity.txt` (cjk-in-en block) | Audit step |
| 2.4 | EN/ZH identical = `review-needed` | check_parity.py | `parity.txt` (identical-value block) | Audit step |
| 2.5 | No catalogue edits | check_parity.py | read-only stdlib JSON load | - |
| 3.1 | Enumerate prompt files | classify.py (heuristic: known files list) | `gap-report.md` Section 3 | - |
| 3.2 | Confirm locale-aware or EN-only | classify.py | `gap-report.md` Section 3 | - |
| 3.3 | Hard-coded ZH directive = `gap` | classify.py | `classified.csv` (`category=prompt-label`) | - |
| 3.4 | #3, #4, #5 prompts post-merge check | classify.py | `gap-report.md` Section 3 | - |
| 4.1 | Identify handoff boundaries | render_report.py | `gap-report.md` Section 4 | - |
| 4.2 | Confirm explicit or re-derived locale | render_report.py | `gap-report.md` Section 4 | - |
| 4.3 | Silent default = `gap` | classify.py | `classified.csv` (`category=propagation`) | - |
| 4.4 | Backend logger EN under EN | classify.py | `classified.csv` (`category=backend-log`) | - |
| 5.1 | Comment lists every checklist item | render_report.py | `comment-body.md` | Comment-post |
| 5.2 | Each `gap` includes file:line + follow-up link | render_report.py | `comment-body.md` | Comment-post |
| 5.3 | `manual-pending` items state repro steps | render_report.py | `comment-body.md` | Comment-post |
| 5.4 | Comment includes raw audit (or path) | render_report.py | `comment-body.md` (path reference) | Comment-post |
| 5.5 | Post via `gh issue comment 10` | post_comment.sh | `comment-body.md` | Comment-post |
| 6.1 | ZH covers every EN key | check_parity.py | (already passes per gap-analysis) | - |
| 6.2 | Locale-aware prompts symmetric | render_report.py | `gap-report.md` Section 6 | - |
| 6.3 | EN-only ZH value = `review-needed` | check_parity.py | `parity.txt` (identical-value block) | - |
| 6.4 | ZH regression filed as gap | classify.py | `classified.csv` | - |
| 7.1 | File issue per gap | file_followups.sh | `gh issue create` | Follow-up |
| 7.2 | Group by category | file_followups.sh | one body per category in `PENDING-followups/` | Follow-up |
| 7.3 | No production-code edits | run_audit.sh | only writes under `.kiro/specs/.../` | - |
| 7.4 | Label follow-ups `i18n` | file_followups.sh | `gh issue create --label i18n` | Follow-up |
| 7.5 | Fallback inline list when no `gh` | file_followups.sh | `PENDING-followups/*.md` | Follow-up |
| 8.1 | Capture raw output | run_audit.sh | `audit/<sha>/` directory | Audit step |
| 8.2 | Preserve previous run | run_audit.sh | `<sha>` subdirectory naming | Audit step |
| 8.3 | Record HEAD sha | run_audit.sh | `git rev-parse HEAD` | Audit step |
| 8.4 | Idempotent re-run | run_audit.sh | re-running on same sha overwrites that sha's dir | Audit step |

## Components and Interfaces

| Component | Domain | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|-----------|--------|--------|--------------|--------------------------|-----------|
| run_audit.sh | Verification pipeline | Compose the audit and route artefacts | 1.4, 7.3, 8.1, 8.2, 8.3, 8.4 | git (P0), python3 (P0), gh (P1) | Batch |
| audit_cjk.sh | Static audit | Run `git grep -nP` and bucket | 1.1, 1.5 | git (P0) | Batch |
| check_parity.py | Catalogue diff | Diff en/zh + identical-value heuristic | 2.1, 2.2, 2.3, 2.4, 2.5, 6.1, 6.3 | python3 stdlib (P0) | Batch |
| classify.py | Classification | Apply the 4-class label per match | 1.2, 1.3, 1.5, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 6.4 | cjk-grep.txt (P0), parity.txt (P0) | Batch |
| render_report.py | Report assembly | Produce gap-report.md + comment-body.md | 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.2 | classified.csv (P0) | Batch |
| post_comment.sh | GitHub side-effect | Post the comment on #10 | 5.5 | gh (P0), comment-body.md (P0) | Service |
| file_followups.sh | GitHub side-effect | Open follow-up issues | 7.1, 7.2, 7.4, 7.5 | gh (P0), PENDING-followups/* (P0) | Service |

### Verification pipeline

#### `run_audit.sh`

| Field | Detail |
|-------|--------|
| Intent | Single shell entrypoint that runs every step in order and persists artefacts under `audit/<commit-sha>/` |
| Requirements | 1.4, 7.3, 8.1, 8.2, 8.3, 8.4 |

**Responsibilities & Constraints**

- Must NOT modify any file outside `.kiro/specs/i18n-e2e-english-verification/`.
- Must capture HEAD sha before any other step (so the artefact path is set).
- Must exit `0` on full success (audit + GitHub side effects) and `2` on PENDING (audit succeeded, side effects didn't).
- Must be safely re-runnable on the same sha (overwriting that sha's directory is acceptable).

**Dependencies**

- Inbound: invoked manually by the verifier (`bash run_audit.sh`). Criticality: P0.
- Outbound: `audit_cjk.sh`, `check_parity.py`, `classify.py`, `render_report.py`, `post_comment.sh`, `file_followups.sh`. Criticality: P0 each.
- External: `git`, `python3`, `gh` (P1; fallback supported).

**Contracts**: Service [ ] / API [ ] / Event [ ] / Batch [x] / State [ ]

##### Batch / Job Contract

- **Trigger**: manual `bash .kiro/specs/i18n-e2e-english-verification/audit/scripts/run_audit.sh`.
- **Input / validation**: working tree at any commit. No clean-tree check is performed: `git grep` searches tracked files as they currently exist in the working tree, so untracked files are ignored, while uncommitted edits to tracked files are included. Run on a clean tree when the capture must correspond exactly to `<commit-sha>`.
- **Output / destination**: `.kiro/specs/i18n-e2e-english-verification/audit/<commit-sha>/`.
- **Idempotency & recovery**: Re-running on the same sha overwrites that sha's directory. PENDING outputs survive across runs until a `gh`-enabled run replaces them.

**Implementation Notes**

- Integration: invoked by humans only; no CI hookup in this spec.
- Validation: confirm `gh auth status` before attempting comment/issue posts; on failure, branch to PENDING.
- Risks: shell quoting around the PCRE pattern (`[\x{4e00}-\x{9fff}]`); pass it to `git grep -P` as a single-quoted argument.

#### `audit_cjk.sh`

| Field | Detail |
|-------|--------|
| Intent | Run the canonical PCRE grep + per-bucket counts |
| Requirements | 1.1, 1.5 |

**Responsibilities & Constraints**

- Output: `cjk-grep.txt` (raw `git grep -nP` lines) and `cjk-grep-bucketed.txt` (one section per top-level path: `backend/app`, `frontend/src`, `locales/en.json`).
- Excludes binary file matches (e.g. `.jpeg` false positives).

**Dependencies**

- Inbound: `run_audit.sh` (P0).
- External: `git` 2.x (P0; must support `-P` for PCRE).

**Contracts**: Batch [x]

##### Batch / Job Contract

- **Trigger**: invoked by `run_audit.sh`.
- **Input / validation**: receives the target output directory as argv[1]; aborts if missing.
- **Output / destination**: `cjk-grep.txt`, `cjk-grep-bucketed.txt` in `<sha>/`.
- **Idempotency & recovery**: deterministic: same tree → same output.

**Implementation Notes**

- Integration: pure read-only against `git`.
- Validation: `git --version` precondition; abort with a clear error if PCRE unsupported.
- Risks: ripgrep is NOT used (avoids a hard `rg` dependency); `git grep -P` works only when git was built with PCRE support, which the precondition check above must confirm.

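
As an illustration of the bucketing step, the sketch below parses `git grep -n` output lines (`path:lineno:text`) and groups them by the three audited paths. It is written in Python for readability even though `audit_cjk.sh` is a shell script; the names `Match`, `parse_grep_line`, and `bucket` are hypothetical, and it assumes repository paths contain no `:` character.

```python
import re
from collections import defaultdict
from typing import NamedTuple

# Same range as the PCRE pattern [\x{4e00}-\x{9fff}], in Python's re syntax.
CJK_RE = re.compile(r"[\u4e00-\u9fff]")

BUCKETS = ("backend/app", "frontend/src", "locales/en.json")


class Match(NamedTuple):
    file: str
    line: int
    text: str


def parse_grep_line(raw: str) -> Match:
    """Split one `git grep -n` line of the form `path:lineno:text`."""
    file, lineno, text = raw.split(":", 2)  # text may itself contain colons
    return Match(file, int(lineno), text)


def bucket(matches):
    """Group matches by the audited top-level path they fall under."""
    grouped = defaultdict(list)
    for m in matches:
        key = next(
            (b for b in BUCKETS if m.file == b or m.file.startswith(b + "/")),
            "other",  # hypothetical catch-all for paths outside the audit scope
        )
        grouped[key].append(m)
    return dict(grouped)
```

Splitting on at most two colons keeps the matched source text intact, which matters for JSON catalogue lines that contain `:` themselves.
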
#### `check_parity.py`

| Field | Detail |
|-------|--------|
| Intent | Compare `locales/en.json` and `locales/zh.json`: key parity, CJK in EN, identical-value heuristic |
| Requirements | 2.1, 2.2, 2.3, 2.4, 2.5, 6.1, 6.3 |

**Responsibilities & Constraints**

- Recursively flattens nested-dict keys with dotted paths.
- Reports three blocks: `missing-keys`, `cjk-in-en`, `identical-values`.
- Treats values as `review-needed` only if (a) en value == zh value, (b) value is non-empty, (c) value is more than two ASCII words.

**Dependencies**

- Inbound: `run_audit.sh` (P0).
- External: `json` from Python stdlib (P0).

**Contracts**: Batch [x]

##### Batch / Job Contract

- **Trigger**: invoked by `run_audit.sh` with the `<sha>` directory as argv[1].
- **Input / validation**: reads `locales/en.json` and `locales/zh.json` from cwd (must be invoked from repo root); fails fast on JSON parse error.
- **Output / destination**: `parity.txt` in `<sha>/`.
- **Idempotency & recovery**: pure function of catalogue contents.

**Implementation Notes**

- Integration: invoked from repo root so relative paths resolve.
- Validation: parse-on-load, both files must be objects.
- Risks: the "more than two ASCII words" heuristic may produce noise; `review-needed` is intentionally a soft label, not a `gap`.

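
A minimal sketch of the three checks, assuming both catalogues have already been loaded into dicts with `json.load`. The function names are illustrative, and condition (c) is interpreted here as "the value is pure ASCII and splits into more than two whitespace-separated words", which is one plausible reading of the heuristic rather than a confirmed rule.

```python
import re

CJK_RE = re.compile(r"[\u4e00-\u9fff]")


def flatten(node, prefix=""):
    """Flatten nested dicts into {dotted.key.path: leaf_value}."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def check_parity(en, zh):
    """Return the three report blocks as lists, keyed by block name."""
    en_flat, zh_flat = flatten(en), flatten(zh)
    missing = [f"en-only: {k}" for k in sorted(en_flat.keys() - zh_flat.keys())]
    missing += [f"zh-only: {k}" for k in sorted(zh_flat.keys() - en_flat.keys())]
    cjk_in_en = sorted(
        k for k, v in en_flat.items() if isinstance(v, str) and CJK_RE.search(v)
    )
    identical = sorted(
        k for k, v in en_flat.items()
        if isinstance(v, str)
        and v                                     # (b) non-empty
        and zh_flat.get(k) == v                   # (a) same value in both catalogues
        and v.isascii() and len(v.split()) > 2    # (c) more than two ASCII words
    )
    return {"missing-keys": missing, "cjk-in-en": cjk_in_en, "identical-values": identical}
```

Short shared values such as brand names or single-word labels fall below the word threshold, so they never reach the `identical-values` block.
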
#### `classify.py`

| Field | Detail |
|-------|--------|
| Intent | Apply the 4-class label (`deliberate` / `gap` / `non-applicable` / `review-needed`) and a category tag per match |
| Requirements | 1.2, 1.3, 1.5, 3.1, 3.2, 3.3, 3.4, 4.3, 4.4, 6.4 |

**Responsibilities & Constraints**

- Reads `cjk-grep.txt` and `parity.txt`; emits `classified.csv` with columns: `file`, `line`, `match`, `class`, `category`, `pipeline_step`.
- Categories (closed set): `frontend-ui-string`, `frontend-regex-parser`, `backend-docstring`, `backend-comment`, `backend-log`, `backend-prompt-label`, `propagation`, `catalogue-parity`, `binary-false-positive`.
- Pipeline-step tags (closed set): `Graph Build`, `Env Setup`, `Simulation`, `Report`, `Interaction`, `Logs`, `UI`, `n/a`.
- Classification rules:
  - `locales/en.json` CJK → always `gap` / `catalogue-parity` / `n/a` (R1.5).
  - File path under `frontend/src/views/` or `frontend/src/components/` AND match is inside a string literal (heuristic: enclosed in `'…'`/`"…"`/`` `…` ``) → `gap` / `frontend-ui-string`.
  - Match inside a `text.match(/.../)` call in a `.vue` file → `gap` / `frontend-regex-parser` (cause: backend emits CJK).
  - Backend `.py` file, line starts with `#` or appears inside a triple-quoted docstring → `deliberate` (blocked by #7) / `backend-comment` or `backend-docstring`; counted but not filed as a fresh follow-up since #7 already covers it.
  - Backend `.py` file, line contains `logger.`, `log.`, or `print(` and CJK in a string literal → `gap` / `backend-log` / appropriate step tag.
  - Backend `.py` file in `services/{ontology,oasis_profile,simulation_config,report_agent}_generator.py` and CJK appears inside an LLM-prompt context label (heuristic: a string literal not preceded by `#`) → `gap` / `backend-prompt-label`.
  - Binary files (e.g. `.jpeg` matches): `non-applicable` / `binary-false-positive`.
  - Anything else: `review-needed` (forces a human look).

**Dependencies**

- Inbound: `audit_cjk.sh`, `check_parity.py` (P0).
- External: `csv` from Python stdlib.

**Contracts**: Batch [x]

##### Batch / Job Contract

- **Trigger**: invoked by `run_audit.sh` after the two preceding steps.
- **Input / validation**: `cjk-grep.txt` and `parity.txt` must exist in `<sha>/`.
- **Output / destination**: `classified.csv`.
- **Idempotency & recovery**: deterministic: same inputs → same csv.

**Implementation Notes**

- Integration: classification rules are heuristics, not a parser; correctness is bounded by careful regexes and an explicit "fallthrough = `review-needed`" rule.
- Validation: every input row produces an output row (no silent drops); a count-equality assertion runs at the end.
- Risks: false negatives (e.g. a Chinese log string that doesn't contain `logger.` on the same line); the `review-needed` fallthrough catches these.

#### `render_report.py`

| Field | Detail |
|-------|--------|
| Intent | Produce `gap-report.md` and `comment-body.md` |
| Requirements | 4.1, 4.2, 5.1, 5.2, 5.3, 5.4, 6.2 |

**Responsibilities & Constraints**

- `gap-report.md` sections: Overview, Section 1 (static audit), Section 2 (parity), Section 3 (prompt verification), Section 4 (propagation), Section 5 (issue-#10 checklist mapping), Section 6 (ZH regression), Section 7 (follow-up plan).
- `comment-body.md`: Markdown comment for issue #10; mirrors the issue's checklist with `pass` / `gap` / `manual-pending` for each line, plus a "How to re-run" footer.
- Reads `classified.csv` and the issue body (snapshot at `.ticket/10.md`).

**Dependencies**

- Inbound: `classify.py` (P0), `.ticket/10.md` (P0).
- External: Python stdlib only.

**Contracts**: Batch [x]

##### Batch / Job Contract

- **Trigger**: `run_audit.sh` after `classify.py`.
- **Input / validation**: `classified.csv` and `.ticket/10.md` must exist.
- **Output / destination**: `gap-report.md`, `comment-body.md` in `<sha>/`.
- **Idempotency & recovery**: deterministic.

**Implementation Notes**

- Integration: the comment body must include a `Run on commit <sha>` header so the comment is traceable.
- Validation: confirm every issue-body checkbox has been mapped (count check).
- Risks: rendering CJK characters in markdown; output files should be opened with an explicit `encoding="utf-8"` because Python's default text encoding depends on the platform locale. The comment body is verified to round-trip via `gh`.

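
The checklist mirroring and its count check could be sketched as below. The checkbox regex, the `statuses` mapping keyed by item text, and the output line format are assumptions for the example; the actual comment layout is whatever `render_report.py` settles on.

```python
import re

# A GitHub task-list line: "- [ ] item" or "- [x] item" (also "*" bullets).
CHECKBOX_RE = re.compile(r"^\s*[-*] \[[ xX]\] (?P<item>.+)$")


def render_comment(issue_body: str, statuses: dict[str, str], sha: str) -> str:
    """Mirror the issue's checklist, one status per item, in original order."""
    items = [
        m.group("item").strip()
        for line in issue_body.splitlines()
        if (m := CHECKBOX_RE.match(line))
    ]
    unmapped = [item for item in items if item not in statuses]
    if unmapped:
        # Count check: every checkbox in the issue body must carry a status.
        raise ValueError(f"unmapped checklist items: {unmapped}")
    lines = [f"Run on commit {sha}", ""]
    lines += [f"- `{statuses[item]}`: {item}" for item in items]
    return "\n".join(lines) + "\n"
```

Raising on an unmapped item, instead of silently skipping it, turns a changed issue checklist (one of the revalidation triggers) into a visible failure.
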
#### `post_comment.sh`

| Field | Detail |
|-------|--------|
| Intent | Post `comment-body.md` as a comment on issue #10 |
| Requirements | 5.5 |

**Responsibilities & Constraints**

- `gh issue comment 10 --repo salestech-group/MiroFish --body-file <sha>/comment-body.md`.
- On non-zero exit, copies the body to `<sha>/PENDING-issue-10-comment.md` and exits `2`.

**Dependencies**

- External: `gh` (P0; degrades to PENDING when missing).

**Contracts**: Service [x]

##### Service Interface

```text
post_comment.sh <sha-dir>
  precondition: <sha-dir>/comment-body.md exists
  postcondition (success): comment posted; URL printed to stdout
  postcondition (failure): <sha-dir>/PENDING-issue-10-comment.md present; exit code 2
```

**Implementation Notes**

- Integration: must be the second-to-last step (so failures don't block the issue-filing fallback).
- Validation: parses `gh`'s URL output and writes it to `<sha>/comment-url.txt` on success.
- Risks: GitHub rate limits; unlikely for a single comment.

#### `file_followups.sh`

| Field | Detail |
|-------|--------|
| Intent | Open one follow-up issue per gap category |
| Requirements | 7.1, 7.2, 7.4, 7.5 |

**Responsibilities & Constraints**

- Iterates `<sha>/PENDING-followups/*.md` (which `render_report.py` always writes; the ones whose category had zero gaps stay empty placeholders).
- For each non-empty body, runs `gh issue create --repo salestech-group/MiroFish --title <title> --body-file <body> --label i18n`.
- On `gh` failure for any single category, leaves the corresponding `PENDING-followups/<n>-*.md` in place and exits non-zero at the end (after attempting all categories).

**Dependencies**

- External: `gh` (P0; degrades to PENDING).

**Contracts**: Service [x]

##### Service Interface

```text
file_followups.sh <sha-dir>
  precondition: <sha-dir>/PENDING-followups/*.md exist (possibly empty placeholders)
  postcondition (success): all non-empty bodies posted; URLs appended to <sha-dir>/followup-urls.txt; bodies removed from PENDING-followups/
  postcondition (partial): URLs in followup-urls.txt for the ones that posted; the rest stay in PENDING-followups/; exit code 2
```

**Implementation Notes**

- Integration: must be the last step.
- Validation: post-hoc count check (`gh` URLs + remaining PENDING bodies = total categories).
- Risks: a category that the spec already considers covered (e.g. backend docstrings → blocked by #7) is not re-filed; the spec's category list is closed and excludes that case.

## Data Models

### Domain Model

The audit operates on three logical concepts:

- **Match**: a single line of `git grep` output. `(file, line, raw_text)`.
- **Classification**: `(match, class ∈ {deliberate, gap, non-applicable, review-needed}, category ∈ closed-set, pipeline_step ∈ closed-set)`.
- **Follow-up**: `(category, title, body, status ∈ {posted, pending}, url?)`.

Invariant: every `Match` produces exactly one `Classification`; every `Classification` with `class == gap` belongs to exactly one `Follow-up` category (which may aggregate multiple gaps).

### Logical Data Model

**`classified.csv` schema** (CSV, UTF-8, header row):

| Column | Type | Notes |
|--------|------|-------|
| `file` | string | repo-relative path |
| `line` | int | 1-indexed |
| `match` | string | trimmed grep line |
| `class` | enum | `deliberate` / `gap` / `non-applicable` / `review-needed` |
| `category` | enum | closed set listed in classify.py rules |
| `pipeline_step` | enum | closed set listed in classify.py rules |

Natural key: `(file, line)`.

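
A hedged sketch of writing this schema with the standard `csv` module while enforcing the `(file, line)` natural key. The helper name `write_classified` is hypothetical; in real use the output handle would be opened with `open(path, "w", encoding="utf-8", newline="")`.

```python
import csv
import io

COLUMNS = ["file", "line", "match", "class", "category", "pipeline_step"]


def write_classified(rows: list[dict], out) -> int:
    """Write rows as CSV with a header row; reject duplicate (file, line) keys."""
    seen = set()
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        key = (row["file"], row["line"])
        if key in seen:
            raise ValueError(f"duplicate natural key: {key}")
        seen.add(key)
        writer.writerow(row)
    return len(seen)  # row count, usable for the count-equality check


# Usage with an in-memory buffer standing in for classified.csv.
buffer = io.StringIO()
written = write_classified(
    [{"file": "locales/en.json", "line": 3, "match": '"a": "确定"',
      "class": "gap", "category": "catalogue-parity", "pipeline_step": "n/a"}],
    buffer,
)
```

Returning the number of rows written gives the caller the figure it needs to compare against the line count of `cjk-grep.txt`.
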
**`parity.txt` structure** (text, three labelled blocks):

```
[missing-keys]
en-only: <key.path>
zh-only: <key.path>
[cjk-in-en]
<key.path>: <value snippet>
[identical-values]
<key.path>: <value>  # review-needed if non-trivial English prose
```

### Data Contracts & Integration

- **`comment-body.md`** must be valid GitHub-flavoured Markdown; checkbox lines preserve the issue's original ordering.
- **Follow-up issue body** must be valid GitHub-flavoured Markdown; first line is a one-sentence summary; subsequent sections are: `## Evidence` (file:line list), `## Linked from` (#10 + comment URL), `## Acceptance` (a small checklist).

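
The follow-up body contract can be illustrated with a small renderer. The function name, its parameters, and the exact bullet formatting are assumptions; only the first-line summary and the three section headings come from the contract above.

```python
def render_followup_body(
    summary: str,
    evidence: list[tuple[str, int]],
    comment_url: str,
    acceptance: list[str],
) -> str:
    """Assemble a follow-up issue body in the section order the contract lists."""
    lines = [summary, "", "## Evidence"]
    lines += [f"- `{file}:{line}`" for file, line in evidence]
    lines += ["", "## Linked from", f"- #10 ({comment_url})"]
    lines += ["", "## Acceptance"]
    lines += [f"- [ ] {item}" for item in acceptance]
    return "\n".join(lines) + "\n"


# Example values below are placeholders, not real audit findings.
body = render_followup_body(
    "Frontend views contain hard-coded CJK strings.",
    [("frontend/src/views/Home.vue", 42)],
    "https://example.invalid/comment",
    ["No CJK in frontend/src string literals"],
)
```

Because the comment URL only exists after `post_comment.sh` succeeds, a renderer like this would need a placeholder when the comment itself is still PENDING.
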
## Error Handling

### Error Strategy

- **Read-only operations** (steps 1–4): on any uncaught error (missing file, JSON parse error), the script aborts with a non-zero exit before any artefact is half-written. The orchestrator uses `set -euo pipefail`.
- **GitHub side effects** (steps 5–6): wrapped; failure routes to PENDING outputs and the orchestrator exits `2`.

### Error Categories and Responses

- **User errors**: invoked from wrong directory → fail fast with "must be run from repo root".
- **System errors**: `git` or `python3` missing → fail fast with "install <tool>"; `gh` missing or `gh auth status` not OK → branch to PENDING.
- **Business errors**: classification produces 0 matches but `cjk-grep.txt` non-empty → assertion failure (count-equality bug).

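
A preflight check along these lines could separate the hard failures from the PENDING fallback. It follows the Technology Stack table in treating a missing `gh` as a fallback case instead of a fatal error; the function name and the injectable `which` parameter are illustrative, and the `gh auth status` probe is deliberately left out of the sketch.

```python
import shutil

REQUIRED = ("git", "python3")  # missing: fail fast
OPTIONAL = ("gh",)             # missing: fall back to PENDING outputs


def preflight(which=shutil.which) -> str:
    """Return "github" or "pending"; exit with a message if a required tool is missing."""
    for tool in REQUIRED:
        if which(tool) is None:
            raise SystemExit(f"install {tool}")
    if any(which(tool) is None for tool in OPTIONAL):
        return "pending"
    return "github"
```

Passing `which` as a parameter lets the lookup be replaced with a stub, so the `gh`-absent branch can be exercised without touching the real `PATH`.
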
### Monitoring

- The orchestrator prints a one-line status per step.
- Final summary block to stdout: total matches, gaps, `manual-pending`, follow-ups posted vs PENDING.

## Testing Strategy

- **Unit tests**: not introduced; the scripts are simple enough that a one-shot dry run on the live tree is the canonical validation.
- **Integration test**: a single `bash run_audit.sh` against the working tree; success criteria below.
- **Validation checklist** (run during implementation):
  - The audit produces a non-empty `cjk-grep.txt`.
  - `parity.txt` reports 0 missing keys (matches the live state at HEAD).
  - `classified.csv` row count == `cjk-grep.txt` line count.
  - `gap-report.md` and `comment-body.md` parse as valid markdown (manual eyeball; no toolchain required).
  - The classifier marks every `locales/en.json` CJK as `gap` (currently zero such matches, so this asserts the negative).
  - With `gh` available: a comment is posted on #10 and follow-up issues are created.
  - With `gh` simulated as absent (e.g. a `PATH` that still resolves `git` and `python3` but not `gh`): PENDING outputs appear under `<sha>/`.

### Out of scope for testing

- The live UI walkthrough is `manual-pending` (R5.3) and not part of the test plan.
- Performance, scalability, security: nothing to test; these are read-only single-shot scripts.