backend/app baseline drops from 2792 to 307 after the comment/docstring
translation pass. Mark i18n-translate-backend-comments tasks complete in
the spec and update HANDOFF.md to record the second-installment scope.
Add the AST-aware scanner used during verification under the spec
directory so future audits can re-run it.
Translate the system prompt and the individual / group persona prompt
builders in backend/app/services/oasis_profile_generator.py from
Chinese to English. The base prompt language was biasing persona
prose (bio, persona, profession, interested_topics) toward Chinese
even under Accept-Language: en, despite the existing
get_language_instruction() postfix mechanism. Translating the base
prompts removes that bias.
All locale-steering call sites are preserved verbatim (the inline
{get_language_instruction()} in each builder, the system-prompt
assembly), so non-English locales continue to receive Chinese output
of equivalent quality. Locale-independent constraints stay English
inside the prompt: gender stays the literal "male"/"female" enum
for individuals and "other" for groups; age stays an integer (30
for institutional accounts). The two attrs_str / context_str fallback
defaults ("无", "无额外上下文") are translated to "None" /
"No additional context" so they compose with the English body.
The country-language hint country: 国家(使用中文,如"中国") is
dropped during translation; locale now decides the country language
via the postfix.
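
A minimal sketch of how an English base prompt composes with the locale postfix. Only get_language_instruction(), attrs_str, and context_str are named in this commit; the function body, the locale table, and build_persona_prompt below are hypothetical stand-ins, not the code in oasis_profile_generator.py.

```python
# Hypothetical stand-in: the real get_language_instruction() presumably
# reads the request locale instead of taking it as an argument.
_INSTRUCTIONS = {
    "en": "Respond in English.",
    "zh": "请使用中文回答。",
}


def get_language_instruction(locale: str = "en") -> str:
    """Return the language-steering sentence for a locale (default: English)."""
    return _INSTRUCTIONS.get(locale, _INSTRUCTIONS["en"])


def build_persona_prompt(entity_name: str,
                         attrs_str: str = "None",
                         context_str: str = "No additional context",
                         locale: str = "en") -> str:
    """English body plus a locale postfix that steers the output language."""
    return (
        f"Generate a social media persona for: {entity_name}\n"
        f"Attributes: {attrs_str}\n"
        f"Context: {context_str}\n"
        'gender must be the literal "male" or "female"; age must be an integer.\n'
        f"{get_language_instruction(locale)}"
    )
```

With an English body, the postfix is the only locale-dependent part of the prompt, so nothing in the base text pulls the model toward a particular language.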
Out of scope (untouched): logger calls (issue #6, already merged),
docstrings and comments (issue #7), the rule-based fallback
_generate_profile_rule_based, and the resilience helpers
_fix_truncated_json / _try_fix_json. No public API change, no new
dependencies, no edits outside the target file.
Closes #3
Adds a stdlib-only Python script and a new GitHub Actions workflow
that fail any pull request which reintroduces CJK characters into
locales/en.json or which raises the total CJK match count under
backend/app or frontend/src above a committed per-path baseline.
The guard captures the two highest-signal checks of the larger
i18n-e2e-english-verification audit so it can run on every PR with a
sub-second budget and without depending on that pipeline being on
main. The committed baseline lets the codebase ratchet down toward
English-only without blocking unrelated PRs on pre-existing CJK
content; refresh it intentionally via the documented flag.
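
The ratchet idea can be sketched as follows. This is an illustration under assumptions, not the committed script: the function names, the baseline shape (a path-to-count mapping), and the choice to count individual characters in the CJK Unified Ideographs range are all hypothetical.

```python
import re
from pathlib import Path

# CJK Unified Ideographs only; the real guard may cover more ranges.
CJK = re.compile(r"[\u4e00-\u9fff]")


def count_cjk(root: Path) -> int:
    """Count CJK characters in every readable UTF-8 file under root."""
    total = 0
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        total += len(CJK.findall(text))
    return total


def check(counts: dict[str, int], baseline: dict[str, int]) -> list[str]:
    """Return one failure message per path whose count exceeds its baseline."""
    return [
        f"{path}: {n} CJK matches > baseline {baseline.get(path, 0)}"
        for path, n in counts.items()
        if n > baseline.get(path, 0)
    ]
```

A path missing from the baseline is treated as having a baseline of zero, which is how a file such as locales/en.json would be held CJK-free while other paths are only prevented from growing.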
Closes #26
Replace the last hard-coded Chinese log/print strings in the Flask
graph API, OASIS profile generator, and retry utility with calls to
the existing t() helper, completing the backend i18n coverage started
by ticket #6 so EN-locale operators see English logs end to end.
Adds nine entries to locales/{en,zh}.json: log.graph_api.m027-m029,
log.profile_generator.m024-m025, and a new log.retry.m001-m004
sub-namespace for the retry utility.
Closes #24
Replace the silent placeholder-UUID fallback in
_GraphNamespace.add_batch with logger.exception(...) + raise so
embedder misconfiguration (404 unknown model, connection refused, etc.)
fails the surrounding graph-build Task with a visible error instead of
producing a Task that looks completed while the graph stays empty.
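
The log-then-raise pattern described above looks roughly like this. The sketch is generic: embed_batch and its embed argument are hypothetical, and the real _GraphNamespace.add_batch will differ in shape.

```python
import logging

logger = logging.getLogger("graph")


def embed_batch(embed, texts):
    """Embed texts, surfacing embedder failures instead of masking them.

    `embed` is any callable that returns one vector per input text.
    """
    try:
        return embed(texts)
    except Exception:
        # Log the full traceback, then re-raise so the caller's Task is
        # marked failed rather than completing with an empty graph.
        logger.exception("Embedding failed for batch of %d texts", len(texts))
        raise
```

The bare `raise` keeps the original exception type and traceback, so the surrounding Task can report the underlying cause (unknown model, refused connection) rather than a generic failure.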
Document the existing-but-undocumented Ollama embedder configuration
in .env.example, CLAUDE.md, README.md, and docker-compose.yml.
mxbai-embed-large is the recommended local model because its 1024-dim
output matches Graphiti's default EMBEDDING_DIM. Adds a curl smoke
test to verify embedder reachability before the first graph build.
No new env var or provider literal: Ollama is reached through the
existing openai-provider branch by setting EMBEDDING_BASE_URL,
EMBEDDING_API_KEY, and EMBEDDING_MODEL.
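
As an illustration of the three variables, the sketch below resolves embedder settings from an environment mapping. EmbedderSettings and resolve_embedder_settings are hypothetical names, and LLM_BASE_URL / LLM_API_KEY are assumed names for the chat LLM credentials that the embedder falls back to; check Config for the real ones.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbedderSettings:
    base_url: str
    api_key: str
    model: str


def resolve_embedder_settings(env=None) -> EmbedderSettings:
    """Read the EMBEDDING_* variables, falling back to chat LLM credentials.

    LLM_BASE_URL and LLM_API_KEY are assumed names; EMBEDDING_MODEL is
    required and raises KeyError when it is missing.
    """
    env = os.environ if env is None else env
    return EmbedderSettings(
        base_url=env.get("EMBEDDING_BASE_URL") or env.get("LLM_BASE_URL", ""),
        api_key=env.get("EMBEDDING_API_KEY") or env.get("LLM_API_KEY", ""),
        model=env["EMBEDDING_MODEL"],
    )


# Example values for a local Ollama instance, which serves an
# OpenAI-compatible API under /v1; the API key is a placeholder.
settings = resolve_embedder_settings({
    "EMBEDDING_BASE_URL": "http://localhost:11434/v1",
    "EMBEDDING_API_KEY": "ollama",
    "EMBEDDING_MODEL": "mxbai-embed-large",
})
```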
Closes #18
Replace the Chinese tagline on README.md and README-EN.md with the
existing English subtitle (collapsing the duplicate stack), and switch
the package.json and backend/pyproject.toml description fields to
English so the project's metadata surface no longer surprises
non-Chinese readers.
Rename nine Chinese-named static image files under static/image/ to
ASCII slugs (six screenshots, two video covers, the QQ-group image)
via git mv so rename history is preserved, and update every <img src>
in README.md, README-EN.md, and README-ZH.md to the new paths. The
Chinese body text of README-ZH.md is preserved by design.
A ripgrep scan for Chinese characters in README.md and README-EN.md
(excluding the language-switcher line) now returns zero matches,
satisfying the ticket's acceptance criteria.
Closes #12
Spec under .kiro/specs/i18n-e2e-english-verification/ defines a read-only
verification pipeline that classifies every CJK match in backend/app,
frontend/src, and locales/en.json into deliberate / gap / non-applicable /
review-needed, plus a four-class follow-up grouping (frontend UI strings,
backend log strings, backend prompt-label strings, permanent CI guard).
The captured baseline run at audit/9dcaecd2.../ shows 2916 matches: 237
gaps actionable in follow-up issues #23, #24, #25, and #26 (filed by this
run), 2299 deliberate (covered by issue #7), and 380 review-needed soft signals.
The verification report comment is posted on issue #10. Locale catalogues
are at full key parity (953/953) and locales/en.json is CJK-clean.
The spec is verification-only: production source under backend/app,
frontend/src, and locales is intentionally untouched. Live UI and
docker-compose walkthrough items in the issue checklist are reported as
manual-pending, with reproduction steps and a re-runnable audit script.
Closes #10
Translate Chinese developer comments in frontend/src/ to English so
non-Chinese-reading maintainers can understand intent without translation
tooling. Pure documentation cleanup with no runtime behavior changes.
Twenty files updated across views, components, api services, App.vue, and
pendingUpload.js. The region-eligibility matrix in
.kiro/specs/i18n-frontend-comments/design.md drives every edit:
- Translate `//`, `/* */`, JSDoc, and Vue `<!-- -->` template comments.
- Drop comments that merely restate the code per dev-guidelines.md.
- Translate console.error/warn/log argument strings (developer-facing).
- Append (#9) to the single Chinese-content TODO in views/Process.vue.
Five files retain documented Chinese string literals per requirements 1.5
and 4.4: hardcoded UI text and error fallbacks (Process.vue, Step3Simulation.vue),
backend-format regex patterns and i18n-keyed UI labels (Step4Report.vue),
backend stage-key matchers (Step2EnvSetup.vue), and LLM prompt templates
sent to a Chinese-tuned model (Step5Interaction.vue). Translating any of
these would either be out of scope (UI strings belong in /locales/*.json)
or would change runtime behavior.
Verification: `rg '[\x{4e00}-\x{9fff}]' frontend/src/` returns 5 documented
files; `npm run build` exits 0 with the same Vite output as before.
Closes #9
Translate nine user-facing English values (step3/step4/step5 waiting
states, step5 interactive tools and report-agent chat panel, graph
panel labels) plus the user-visible step 2 log line `log.prepareTaskId`
to natural Chinese, so Chinese-locale users no longer see English text
mixed into the Chinese UI.
`home.heroDescBrand` is intentionally left as the literal `MiroFish`
because it is a brand name. `log.prepareTaskId` was translated rather
than kept English because it is rendered into the in-UI log panel via
`Step2EnvSetup.vue:801` and every surrounding `log.*` value in zh.json
is already translated; the leading two-space indent, the `└─`
continuation glyph, and the `{taskId}` placeholder are preserved.
en.json and zh.json key sets remain identical
(`paths(scalars)` diff is empty); no frontend code is changed.
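
For reference, a rough Python equivalent of the jq `paths(scalars)` comparison. This is a sketch, not the command that was run: the function names are hypothetical, and unlike jq's `paths(scalars)` it also reports null and false leaves.

```python
def scalar_paths(node, prefix=()):
    """Yield the key path of every scalar leaf in a nested JSON value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from scalar_paths(value, prefix + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from scalar_paths(value, prefix + (index,))
    else:
        yield prefix


def parity_diff(en: dict, zh: dict):
    """Return (paths only in en, paths only in zh); both empty means parity."""
    en_paths, zh_paths = set(scalar_paths(en)), set(scalar_paths(zh))
    return (
        sorted(en_paths - zh_paths, key=str),
        sorted(zh_paths - en_paths, key=str),
    )
```

Feeding it `json.load` results for en.json and zh.json and checking that both lists are empty reproduces the parity claim without depending on jq.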
Closes #8
Extract every Chinese string inside backend logger.{info,warning,error,
debug,exception} calls and inside user-facing jsonify({"error|message":
...}) responses across the listed in-scope modules into
locales/{en,zh}.json under nested namespaces (log.<module>.*,
api.{error,message}.<scope>.*). Locale dictionaries stay structurally
identical; the existing flat frontend-facing keys at log.* / api.* are
left untouched. The locale helper (backend/app/utils/locale.py) now
emits a single deduplicated mirofish.locale warning per (locale, key)
pair when a translation is missing instead of silently returning the
raw key, so unknown keys are visible without crashing requests or
background tasks. A repo-root scripts/check_i18n_logs.py verifier
performs an AST-aware source scan for residual Chinese inside the
in-scope logger/jsonify calls and a recursive parity diff between
en.json and zh.json — both modes pass.
Why: backend logs and API errors previously emitted Chinese-only
strings, leaving English-speaking operators with unreadable log
aggregator output and API consumers with locale-mismatched error
messages. The t() helper and per-thread set_locale propagation already
existed; this change makes every backend caller route through them.
Closes #6
Translate every LLM-facing string literal in
backend/app/services/report_agent.py: the four tool-description
constants, the plan/section/chat system+user prompts, the ReAct loop
templates, the inline messages re-injected during the section and chat
loops, the _execute_tool error returns, and the plan_outline default and
fallback outline content. Preserve every {interpolation} token, the
literal final answer: trigger and <tool_call> XML tag, the four primary
tool names, and all three get_language_instruction() postfix call sites.
Also switch the unused-tools join separator from "、" to ", " so it
renders naturally inside the now-English ReAct templates.
This removes the Chinese language bias that the English postfix alone
could not overcome: under Accept-Language: en the report agent now
produces English-flavoured analytical reports and chat replies; under
Accept-Language: zh the postfix continues to steer the model into
Chinese with no semantic delta. Logger calls (#6) and docstrings or
comments (#7) are deliberately untouched.
Closes #5
Translate the three LLM prompt blocks plus the two prompt-feeding helpers
(_build_context, _summarize_entities) in
backend/app/services/simulation_config_generator.py from Chinese to English.
The Chinese base prompts were biasing the model toward Chinese structure
and lexical choice for content, narrative_direction, hot_topics, and
reasoning fields even when Accept-Language was en, because
get_language_instruction() only steers the response language as a
postfix.
Translation is in-place and preserves every functional contract: the
JSON output schema for all three prompts, every variable interpolation,
the per-entity-type heuristic ranges in the agent-config prompt, the
trailing English IMPORTANT directives that lock poster_type to
PascalCase and stance to {supportive,opposing,neutral,observer}, and
all three get_language_instruction() postfix call sites. The two
default-path reasoning literals are translated to locale-agnostic
English so generation_reasoning no longer mixes Chinese and English on
the failure path.
Logger calls, docstrings, and inline comments are intentionally left in
Chinese (out of scope; covered by issues #6 and #7). Public API,
dataclasses, class constants, and the SimulationParameters payload
shape are unchanged.
Closes #4
Translate the system prompt constant and the user-message template in
backend/app/services/ontology_generator.py from Chinese to English.
The Chinese base prompt was biasing the model toward Chinese structure
and word choice even when Accept-Language was en, leaving ontology
descriptions and analysis_summary fields Chinese-flavoured.
Translation is in-place and preserves every functional contract: the
JSON output schema, the entity-type and relationship-type taxonomies
verbatim, the reserved-attribute-name list, the count and length
constraints, and all f-string interpolations. The
get_language_instruction() postfix call site and the trailing English
identifier-format directive are unchanged, so zh and other locales
continue to receive locale-appropriate descriptions.
Logger calls, docstrings, and inline comments are intentionally left
in Chinese; they are owned by issues #6 and #7.
A small static guard script (backend/scripts/test_ontology_prompts_no_cjk.py)
AST-parses the module and asserts zero CJK in the system prompt and in
every string literal of _build_user_message except the docstring, so
the regression cannot reappear silently.
Closes #2
Adds a Neo4j service to docker-compose so `docker compose up -d` works
on a clean checkout, and unhardcodes Graphiti's LLM/embedder so the
documented default provider (Qwen via Dashscope) actually works.
- docker-compose: neo4j:5-community service with cypher-shell
healthcheck, named volumes, and `depends_on: service_healthy` on the
app container; in-Docker NEO4J_URI override leaves the host-mode
default untouched.
- Config: new GRAPHITI_LLM_PROVIDER (openai|gemini, default openai) plus
optional EMBEDDING_API_KEY / EMBEDDING_BASE_URL that fall back to the
chat LLM credentials.
- graphiti_adapter: provider switch inside the singleton factory with
lazy per-provider imports; Gemini path is preserved exactly. The
no-op `_GeminiReranker` becomes a provider-agnostic
`_PassthroughReranker`, still injected explicitly so Graphiti does
not fall back to its OpenAI-only default reranker.
- Drop the ignored `reranker=` kwarg from `_GraphNamespace.search` and
the misleading callers in `zep_tools.py` and
`oasis_profile_generator.py`.
- Refresh `.env.example` to mirror the README env section.
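
To illustrate the passthrough reranker from the list above: a no-op reranker only has to return the passages in their incoming order with scores that preserve that order. The class below is a standalone sketch; the rank() signature is an assumption modelled on a typical cross-encoder interface, and the real `_PassthroughReranker` presumably subclasses Graphiti's own base class, which is not imported here.

```python
import asyncio


class PassthroughReranker:
    """No-op reranker: keeps the incoming order and assigns descending scores."""

    async def rank(self, query: str, passages: list[str]) -> list[tuple[str, float]]:
        n = len(passages)
        # Scores fall from 1.0 so that sorting by score preserves input order.
        return [(p, 1.0 - i / n) for i, p in enumerate(passages)]


ranked = asyncio.run(PassthroughReranker().rank("query", ["first", "second"]))
```

Injecting such an object explicitly is what keeps a library from silently constructing a default reranker that needs credentials for a provider the deployment does not use.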
Spec, requirements, and design under
`.kiro/specs/graphiti-neo4j-finalize/`.
Closes #1
Add .kiro/steering/ as persistent project memory for CC-SDD / Kiro
workflows. Three core files (product, tech, structure) capture
purpose, stack, and organization patterns; three custom files
(database, api-standards, error-handling) pin the load-bearing
project-specific conventions:
- group_id isolation and the Graphiti adapter / event-loop singleton
- {success, data|error} envelope and the Task polling contract
- reasoning-model output stripping and the retry_with_backoff helper
Files focus on patterns and decisions, not catalogs, per the
steering-principles "golden rule".
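
As a reminder of what the {success, data|error} envelope looks like in practice, here is a small sketch. The helper names ok and fail are hypothetical; only the envelope keys come from the steering files.

```python
from typing import Any


def ok(data: Any) -> dict:
    """Success envelope: the payload travels under `data`."""
    return {"success": True, "data": data}


def fail(error: str) -> dict:
    """Failure envelope: a human-readable message travels under `error`."""
    return {"success": False, "error": error}
```

Keeping exactly one of `data` or `error` per response lets clients branch on `success` alone.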
Adopt CC-SDD (Kiro) as the project's spec-driven planning tool, with
plans persisted in .kiro/specs/ and a checkpoint after every task
(strictest cadence — no code without an approved plan).
CC-SDD install (via npx cc-sdd@latest --claude --lang en):
- .kiro/settings/rules/: EARS format, gap-analysis, design and
requirements review gates, design discovery, tasks generation,
steering principles, parallel-task analysis.
- .kiro/settings/templates/: specs (init, requirements, design, tasks,
research) and steering (product/tech/structure plus optional
api-standards/auth/db/deployment/error-handling/security/testing).
- .claude/commands/kiro/: 11 Kiro slash commands — spec-init,
spec-requirements, spec-design, spec-tasks, spec-impl, spec-status,
steering, steering-custom, validate-gap, validate-design,
validate-impl.
Local additions:
- .claude/commands/plan.md: /plan [task] wrapper that picks up the task
from $ARGUMENTS or a single .ticket/<n>.md snapshot, walks the Kiro
flow (steering -> spec-init -> spec-requirements -> validate-gap ->
spec-design -> validate-design -> spec-tasks) and stops for human
approval after each artefact. Refuses "just code it" requests.
- .claude/hooks/session_start.sh: extend to print active tickets
(.ticket/*.md) and open specs (.kiro/specs/*/) with phase from
spec.json, alongside the existing branch/state line.
Documentation: .claude/onboarding/step4_workflow/01_tool_decision.md
Adapt the Notion onboarding prompt — originally Jira/MCP-oriented — to
this project's actual issue tracker (GitHub at
salestech-group/MiroFish) using the gh CLI.
Slash commands (.claude/commands/):
- /ticket <number>: fetch a GitHub issue via gh, self-assign, try to
add an in-progress label (skips silently if the label doesn't exist),
and snapshot the issue (frontmatter + body) to .ticket/<n>.md so
later planning steps can read the description without refetching.
- /ticket-list: interactive overview of issues; asks for filters
(open/closed, status, assignee, milestone, labels) and runs a single
gh issue list with the answers, rendering a compact markdown table.
Workspace:
- .ticket/repo.md declares the target repo (GitHub equivalent of the
Jira "board.md" referenced in the prompt).
- .gitignore: ignore .ticket/* except repo.md and .gitkeep so cached
ticket markdowns stay local.
Settings:
- Allow-list gh issue view/list/edit/comment, gh repo view,
gh pr view/list, gh auth status to avoid permission prompts.
Documentation: .claude/onboarding/step3_planning/01_ticket_sync.md
Permissions:
- Allow npm run/test/install, uv run/sync, docker (compose), and the
common read-only/staging git commands so routine work doesn't trigger
permission prompts.
- Deny Read/Write/Edit on uploads/ and .codegraph/ (auto-generated and
user-data paths) in addition to the existing .env*/secrets/ blocks.
Hooks:
- SessionStart: print branch, ahead/behind vs upstream, and working-tree
state at session start so context is visible immediately.
- PreToolUse (Read|Write|Edit|Bash|NotebookEdit): defence-in-depth
guard that intercepts attempts to access .env / secrets/ paths (and
bash commands targeting them) with a friendly, logged refusal on top
of the permissions.deny rules.
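
The PreToolUse guard's decision logic might look like the sketch below. It is an illustration under assumptions, not the committed hook: it assumes the hook event is a JSON object carrying tool_input.file_path or tool_input.command, and that exit code 2 signals a blocked call. The names SENSITIVE, is_sensitive, and decide are hypothetical.

```python
import re
import sys

# Matches a `.env` or `.env.*` file, or a `secrets/` path segment, at the
# start of the string or after whitespace, a slash, a quote, or `=`.
SENSITIVE = re.compile(r"(^|[\s/'\"=])(\.env(\.[\w.-]+)?(?![\w-])|secrets/)")


def is_sensitive(target: str) -> bool:
    """True when a file path or shell command references .env* or secrets/."""
    return bool(SENSITIVE.search(target))


def decide(event: dict) -> int:
    """Return the hook's exit code for one tool-call event."""
    tool_input = event.get("tool_input", {})
    target = tool_input.get("file_path") or tool_input.get("command") or ""
    if is_sensitive(target):
        print(f"Blocked: {target!r} touches a protected path", file=sys.stderr)
        return 2  # assumed convention: exit code 2 blocks the tool call
    return 0
```

A hook script would read the event with json.load(sys.stdin) and pass the result of decide() to sys.exit(). The regex is deliberately conservative and will miss obfuscated shell commands, which is why it sits on top of the permissions.deny rules rather than replacing them.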
PostToolUse formatter is intentionally skipped — the project has no
configured formatter (per the Step 1 conventions decision).
The Stop hook (quality gate) will be configured in Step 6.
Documentation: .claude/onboarding/step2_setup/01_settings_analysis.md
Bring repo docs in line with the Graphiti+Neo4j migration and prepare
the codebase for Spec-Driven Development.
CLAUDE.md:
- Promote Neo4j + Graphiti to primary memory/graph layer; mark Zep
Cloud as deprecated / compat-only.
- Document the full env-var surface: NEO4J_*, EMBEDDING_MODEL, optional
LLM_BOOST_* block.
- Codify must-respect implementation rules (Task model for long ops,
reasoning-output stripping, simulation IPC, subprocess cleanup,
startup recovery, per-project group_id isolation, chat prefix
injection).
- Note i18n (vue-i18n + /locales/) and Docker prerequisite for dev.
README.md / README-EN.md / README-ZH.md:
- Resolve unresolved merge-conflict markers in README.md left over from
the feat/graphiti-neo4j-migration merge (file was broken Markdown).
- Lead with Docker as the recommended deployment path; keep source
install as a documented alternative.
- Replace Zep env vars with NEO4J_URI / NEO4J_USER / NEO4J_PASSWORD /
EMBEDDING_MODEL across all three READMEs.
- Add optional LLM_BOOST_* block with omit-if-unused note.
- Fix language-switcher links between the three READMEs.
.claude/onboarding/step1_codebase/:
- Document repo analysis, CLAUDE.md conventions decisions, and README
resolution choices.