MicroFish/.kiro/specs/i18n-simulation-config-gene.../research.md

# Research & Design Decisions — i18n-simulation-config-generator-prompts

## Summary

- **Feature**: `i18n-simulation-config-generator-prompts`
- **Discovery Scope**: Extension (in-place translation of three prompt blocks plus two helpers in a single file)
- **Key Findings**:
    - Two near-identical sister specs (#2, #3) have already established an in-place translation pattern. Following Option A (in-place) keeps reviewer continuity and matches commits `0806832` and `9d1d29b`.
    - The actual in-prompt Chinese-character footprint (~898 chars across six string literals + ~75 in two context-builder helpers ≈ 973) is ~3.6× the ticket's "~247 chars" estimate. The ticket undercounted by ignoring system prompts and the `_build_context`/`_summarize_entities` headings that flow into prompts via `{context_truncated}`.
    - Locale-switching contract (`Accept-Language` → `get_locale()` → `get_language_instruction()`) is unaffected by base-prompt translation; `zh` postfix is `请使用中文回答。`, all other supported locales already use English postfixes. No change to `backend/app/utils/locale.py` or `/locales/*.json` needed.

## Research Log

### Topic: Sister-spec implementation patterns (#2, #3)

- **Context**: Issue #4 explicitly references the same rationale as #5 ("translate prompts even though `get_language_instruction()` exists") and is one of a family of i18n issues. Sister specs already shipped — what pattern did they use?
- **Sources Consulted**: `git show 0806832` (issue #2, ontology_generator), `git show 9d1d29b` (issue #3, oasis_profile_generator), `.kiro/specs/i18n-ontology-generator-prompts/requirements.md`, `.kiro/specs/i18n-oasis-profile-generator-prompts/`.
- **Findings**:
    - Both sister specs translated in-place: edit prompt string literals, leave `get_language_instruction()` postfix call sites intact, leave logger/docstrings/comments alone (those are owned by #6/#7).
    - Both preserved the trailing English `IMPORTANT:` directives that lock identifier formats (`PascalCase`, `snake_case`, `UPPER_SNAKE_CASE`).
    - Both kept module/class docstrings in Chinese (out of scope per #7).
- **Implications**: Use the same pattern. No deviation, no new abstractions, no externalization.

### Topic: Locale resolution & non-`zh` postfix verification

- **Context**: R4 requires that locale switching to `zh` continues to produce Chinese output, and to other locales continues to produce locale-appropriate output. Verify that `get_language_instruction()` returns useful postfixes for all supported locales.
- **Sources Consulted**: `backend/app/utils/locale.py:66-69`, `locales/languages.json`.
- **Findings**:
    - `languages.json` has 7 locales: `zh`, `en`, `es`, `fr`, `pt`, `ru`, `de`. Each provides a `llmInstruction` postfix in its native language (e.g. `de` → `Bitte antworten Sie auf Deutsch.`).
    - `get_locale()` reads `Accept-Language` header in request context, or thread-local in background threads; falls back to `zh`. Confirms the existing fallback semantics.
- **Implications**: Translating the base prompts to English does not regress non-English support — every other locale already gets a native postfix that biases the model away from English. (Today, every other locale fights *against* a Chinese base prompt. After this change, `zh` is the only locale fighting the base.)

### Topic: OASIS subprocess JSON contract

- **Context**: R8 requires Step 3 parity. What does the OASIS subprocess actually consume from `SimulationParameters.to_dict()`?
- **Sources Consulted**: `backend/app/services/simulation_config_generator.py:176-197` (`SimulationParameters.to_dict`), grep for `simulation_ipc` consumers.
- **Findings**:
    - `SimulationParameters.to_dict()` returns a flat dict with `simulation_id`, `project_id`, `graph_id`, `simulation_requirement`, `time_config` (dict), `agent_configs` (list of dicts), `event_config` (dict with `initial_posts`, `scheduled_events`, `hot_topics`, `narrative_direction`), `twitter_config`, `reddit_config`, `llm_model`, `llm_base_url`, `generated_at`, `generation_reasoning`.
    - Field types and shapes are entirely structural — no language-conditioned parsing exists in the consumer side.
- **Implications**: Translation only changes the **content** of natural-language string fields; **shape** is untouched. Step 3 parity is a verification concern, not a design concern.

### Topic: Verification depth — fixture vs. live

- **Context**: R8 acceptance is "OASIS subprocess starts cleanly and runs at least one round." Sandboxed run unlikely to have the live infra (Neo4j, LLM key, OASIS workers).
- **Sources Consulted**: Sister-spec PRs (no live e2e captured in their commit messages or test additions); steering-tech.md (test posture: "coverage is intentionally minimal — don't add a heavy test harness without discussing scope").
- **Findings**:
    - Sister specs shipped without live e2e tests. The verification was static: zero-Chinese assertion in the touched strings, plus reviewer trust on the JSON shape preservation.
    - The repo intentionally avoids heavy test scaffolding.
- **Implications**: Use a fixture-based static verification:
    1. `python -m py_compile backend/app/services/simulation_config_generator.py` (compile pass).
    2. Regex `[一-鿿]` over the six prompt-string literals and the two context-builder f-strings — must yield zero matches.
    3. Construct stub entities and call `_build_context`, `_summarize_entities`, plus the three prompt-rendering paths (mocking the LLM client) — confirm every expected interpolation is present in the rendered prompt and no `KeyError` on missing variables.
- This avoids depending on live infra while still catching every regression mode the requirements care about.

## Architecture Pattern Evaluation

| Option | Description | Strengths | Risks / Limitations | Notes |
|--------|-------------|-----------|---------------------|-------|
| A — In-place translation | Edit string literals directly in `simulation_config_generator.py` | Matches sister-spec precedent (#2, #3); minimum diff; minimum risk | Translations baked in; harder to localize beyond `zh`/`en` later (but `get_language_instruction()` postfix already handles that) | **Selected** |
| B — Externalize prompts to `/locales/` | Move prompts into JSON locale files | Genuine multi-locale prompt support | Departs from sister-spec pattern; out-of-scope per ticket guardrails ("no refactoring"); requires templating choice; brittle for `json.dumps`-shaped interpolations | Rejected |
| C — Hybrid: in-place + indirection helper | Translate in-place, add a thin `_load_prompt()` indirection for future externalization | Sets future migration path | Premature abstraction; larger diff; no caller variance today | Rejected |

## Design Decisions

### Decision: Adopt Option A (in-place translation)

- **Context**: How to translate three Chinese prompt blocks plus two context-builder helpers without breaking locale switching, public API, or downstream OASIS consumption.
- **Alternatives Considered**:
    1. Option A — In-place translation (sister-spec pattern).
    2. Option B — Externalize to `/locales/`.
    3. Option C — Hybrid with a no-op indirection.
- **Selected Approach**: Option A. Translate the six prompt literals and the two context-builder helper bodies directly. Leave `get_language_instruction()` call sites and the trailing English `IMPORTANT:` directives intact (light wording polish allowed for grammatical flow but constraint semantics unchanged). Logger calls, docstrings, comments untouched.
- **Rationale**: Pattern-consistent with #2 and #3, minimum-risk text edit, preserves all behavioural contracts, and produces the smallest reviewable diff.
- **Trade-offs**: Future locales beyond `en`/`zh` continue to rely on the `get_language_instruction()` postfix to bias output — same as the current state, not a regression.
- **Follow-up**: Confirm via fixture that prompt rendering produces no `KeyError` and zero Chinese in the targeted regions; reviewer self-check on prompt wording quality.

### Decision: Translate context-builder section headings (R7)

- **Context**: `_build_context` and `_summarize_entities` emit Chinese section headings (`## 模拟需求`, `### {entity_type} ({n}个)`, etc.) that are interpolated into all three prompts via `{context_truncated}`. Leaving them Chinese re-introduces the bias the prompt translations are designed to remove.
- **Alternatives Considered**:
    1. Translate only the six prompt literals (literal interpretation of ticket scope).
    2. Translate prompts + context-builder headings (functional interpretation of ticket goal).
- **Selected Approach**: Option 2 — translate context-builder headings as part of this spec. The headings have no public surface; they are internal to prompt assembly.
- **Rationale**: Acceptance criterion 1 of issue #4 (no Chinese characters in any prompt string literal) is interpreted strictly only at the prompt-block level by the literal text of the ticket — but the spirit of the ticket (English output under `Accept-Language: en`) demands the helpers be translated too. Sister specs (#2, #3) made the same call for their analogous helpers.
- **Trade-offs**: Slightly larger diff than the literal ticket scope; offset by avoiding a follow-up "we missed this" issue.
- **Follow-up**: Note in the PR body that R7 expands literal ticket scope by ~75 chars across two helpers, with the rationale above.

### Decision: Translate the two default-path `reasoning` strings (R6)

- **Context**: `_get_default_time_config` emits `"使用默认中国人作息配置（每轮1小时）"` as a `reasoning` value, and the `_generate_event_config` exception path emits `"使用默认配置"`. These are static literals (not LLM output) and are joined into the user-visible `generation_reasoning`.
- **Alternatives Considered**:
    1. Leave both Chinese (literal scope per #6/#7 split).
    2. Translate both to locale-agnostic English.
    3. Wire both through `t('progress.*')` for full i18n.
- **Selected Approach**: Option 2 — translate to locale-agnostic English literals (`"Default circadian-pattern config (1h per round)"` and `"Used default config"`). They are not log lines (those are #6's domain) — they are user-facing string values returned in a JSON payload. Wiring them through `t()` is a refactor and likely belongs to a future broader i18n pass.
- **Rationale**: Avoids a forever-Chinese leak into a `generation_reasoning` joined with otherwise-English content. Locale-agnostic English is the lowest-overhead solution for a fallback-only path.
- **Trade-offs**: Under `zh`, these two strings appear in English in `generation_reasoning` only on the failure path. Acceptable: the failure path is rare and the rest of the joined `reasoning` already mixes label-translated and LLM-output content.
- **Follow-up**: None.

### Decision: Light wording polish of the `IMPORTANT:` directive lines

- **Context**: Lines 706 and 870 currently glue an English `IMPORTANT:` directive onto a Chinese system prompt with `f"{system_prompt}\n\n{get_language_instruction()}\nIMPORTANT: ..."`. After translating the system prompts to English, the `IMPORTANT:` directive can either remain verbatim or be merged into the system prompt for cleaner flow.
- **Alternatives Considered**:
    1. Keep the directive lines exactly as-is (preserves byte-for-byte behaviour).
    2. Lightly polish wording for grammatical flow with the now-English system prompt.
    3. Merge the directive into the system prompt body.
- **Selected Approach**: Option 1 with a small caveat — keep the directives verbatim, including the existing English wording. The constraint semantics (PascalCase `poster_type`; `stance` ∈ {`supportive`, `opposing`, `neutral`, `observer`}) MUST not change. If a reviewer requests Option 2, the wording polish can be applied with the constraint semantics held constant.
- **Rationale**: Minimum-diff principle. The directives already work; touching them adds risk for no functional gain.
- **Trade-offs**: Slight grammatical awkwardness from concatenating an English system prompt with another English directive. Cosmetic only.
- **Follow-up**: None unless a reviewer flags wording.

## Risks & Mitigations

- **Risk**: Reviewer interprets ticket scope strictly and rejects the context-builder translation (R7). **Mitigation**: Document the rationale explicitly in the PR body and reference the sister-spec precedent. If rejected, revert just the helper changes — they are isolated edits.
- **Risk**: LLM produces lower-quality output for `zh` locale because the base prompt is now English (model has to do more work to honour the `请使用中文回答。` postfix). **Mitigation**: Sister specs (#2, #3) shipped without observed regressions. If a regression is reported, the fix is to expand the postfix's locale instruction, not to revert this spec.
- **Risk**: Light wording polish of the `IMPORTANT:` directive accidentally drops the `'supportive'/'opposing'/'neutral'/'observer'` enum or the PascalCase requirement, causing the OASIS subprocess to reject `stance` or `poster_type` values. **Mitigation**: R3.7 and R2.6 explicitly forbid changing constraint semantics; the implementation will keep these directives as verbatim string literals.
- **Risk**: A new prompt-time interpolation is missed during translation, producing a `KeyError` at runtime. **Mitigation**: Fixture-based render check (Decision: verification depth) will catch this before commit.

## References

- Sister-spec implementations:
    - `git show 0806832` — `feat(i18n): translate ontology_generator prompts to english` (#2).
    - `git show 9d1d29b` — `feat(i18n): translate oasis_profile_generator prompts to english` (#3).
- Sister-spec planning artefacts: `.kiro/specs/i18n-ontology-generator-prompts/`, `.kiro/specs/i18n-oasis-profile-generator-prompts/`.
- Locale layer: `backend/app/utils/locale.py`, `locales/languages.json`, `locales/en.json`, `locales/zh.json`.
- Steering: `.kiro/steering/tech.md` (i18n notes; "no enforced linter or formatter; preserve mixed Chinese/English in comments unless asked"); `.kiro/steering/structure.md`.