Commit Graph

16 Commits

Author SHA1 Message Date
Dominik Seemann 6439e58eb5 Merge branch 'main' into docs/i18n-7-translate-backend-comments
# Conflicts:
#	backend/app/services/ontology_generator.py
2026-05-09 10:40:10 +00:00
Dominik Seemann e60a5a93d3 fix(i18n): externalize remaining chinese backend log strings
Replace the last hard-coded Chinese log/print strings in the Flask
graph API, OASIS profile generator, and retry utility with calls to
the existing t() helper, completing the backend i18n coverage started
by ticket #6 so EN-locale operators see English logs end to end.

Adds nine entries to locales/{en,zh}.json: log.graph_api.m027-m029,
log.profile_generator.m024-m025, and a new log.retry.m001-m004
sub-namespace for the retry utility.

Closes #24
2026-05-07 22:40:18 +00:00
Dominik Seemann e3f7defefc docs(i18n): translate chinese docstrings/comments in backend/app/{models,utils} and partial services 2026-05-07 14:44:08 +00:00
Dominik Seemann 74997fd088 feat(i18n): externalize chinese log and api response strings
Extract every Chinese string inside backend logger.{info,warning,error,
debug,exception} calls and inside user-facing jsonify({"error|message":
...}) responses across the listed in-scope modules into
locales/{en,zh}.json under nested namespaces (log.<module>.*,
api.{error,message}.<scope>.*). Locale dictionaries stay structurally
identical; the existing flat frontend-facing keys at log.* / api.* are
left untouched. The locale helper (backend/app/utils/locale.py) now
emits a single deduplicated mirofish.locale warning per (locale, key)
pair when a translation is missing instead of silently returning the
raw key, so unknown keys are visible without crashing requests or
background tasks. A repo-root scripts/check_i18n_logs.py verifier
performs an AST-aware source scan for residual Chinese inside the
in-scope logger/jsonify calls and a recursive parity diff between
en.json and zh.json — both modes pass.

Why: backend logs and API errors previously emitted Chinese-only
strings, leaving English-speaking operators with unreadable log
aggregator output and API consumers with locale-mismatched error
messages. The t() helper and per-thread set_locale propagation already
existed; this change makes every backend caller route through them.

Closes #6
2026-05-07 13:52:22 +00:00
stg 62648289d1 Merge remote-tracking branch 'abhiyadav2345/feat/graphiti-neo4j-migration' 2026-05-05 15:03:47 +02:00
ghostubborn f2404903d6 fix(i18n): validate Accept-Language header against registered locales 2026-04-02 14:20:15 +08:00
ghostubborn 7c07237544 fix(i18n): pass locale to background threads via thread-local storage
Background threads (graph building, simulation prep, report generation,
profile generation) now inherit the requesting user's locale preference.
Previously these fell back to 'zh' because Flask request context was
unavailable in spawned threads.
2026-04-01 16:55:51 +08:00
ghostubborn 0c18e1aeca feat(i18n): add backend translation utility with shared locale files 2026-04-01 15:22:14 +08:00
Abhishek Yadav 28827067c0 feat: migrate knowledge graph from Zep Cloud to Graphiti + local Neo4j
Replaces the paid, rate-limited Zep Cloud service with Graphiti (graphiti-core
0.11.6) backed by a local Neo4j instance — free, unlimited, and self-hosted.

Key changes:
- Add GraphitiAdapter: drop-in Zep-compatible wrapper around Graphiti with a
  persistent event-loop thread to avoid asyncio/Neo4j driver conflicts
- Switch LLM client to native GeminiClient + GeminiEmbedder (text-embedding-004
  fails on Gemini compat endpoint; use google-genai SDK directly)
- Add _GeminiReranker passthrough replacing OpenAIRerankerClient (which
  hardcodes gpt-4.1-nano and uses logprobs unsupported by Gemini)
- Fix Cypher queries: use s.uuid/t.uuid for edge source/target instead of
  r.source_node_uuid (null property in Graphiti's schema)
- Add ontology-based entity type classifier (_classify_entity_type) so nodes
  get colored by type in the D3 graph visualization instead of all being Entity
- Apply classifier in ZepEntityReader so filter_defined_entities finds entities
  (previously 0 personas loaded because all labels were ['Entity'])
- Add startup recovery: auto-mark graph_building projects as graph_completed
  on backend restart if Neo4j already has their data
- Add resume capability to graph build: skip already-processed episodes after
  a restart (断点续传)
- Add non-blocking graph data cache with background refresh in graph.py
- Add EMBEDDING_MODEL config (default: gemini-embedding-001 for Gemini users)
- Add CLAUDE.md with project architecture and dev commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-22 01:30:28 +05:30
666ghj 985f89f49a fix: resolve 500 error caused by <think> tags and markdown code fences in content field from reasoning models like MiniMax/GLM 2026-03-06 00:30:31 +08:00
666ghj da6548e96f feat(graph): implement pagination for fetching nodes and edges; add utility functions for streamlined data retrieval 2026-02-27 15:53:29 +08:00
666ghj 390c120fef fix(file_parser): handle non-UTF-8 encoded text files with automatic encoding detection 2026-01-22 18:28:37 +08:00
666ghj f46c1a9ec7 Add UTF-8 encoding support for Windows console in run.py and logger.py to prevent character encoding issues 2025-12-26 17:58:48 +08:00
666ghj 5f159f6d88 Enhance backend functionality with OASIS simulation features
- Updated README.md to include new simulation scripts and configuration details for OASIS, including API retry mechanisms and environment variable settings.
- Added simulation management and configuration generation services to streamline the simulation process across Twitter and Reddit platforms.
- Introduced new API routes for simulation-related operations, including entity retrieval and simulation status management.
- Implemented a robust retry mechanism for external API calls to improve system stability.
- Enhanced task management model to include detailed progress tracking.
- Added logging capabilities for action tracking during simulations.
- Included new scripts for running parallel simulations and testing profile formats.
2025-12-01 15:03:44 +08:00
666ghj e98da6b53e Enhance backend startup logging and API endpoint display
- Updated `run.py` to conditionally print startup information only in the reloader process to avoid duplicate logs in debug mode.
- Modified `__init__.py` to log startup and completion messages based on the reloader process condition.
- Added warnings suppression in `graph_builder.py` for Pydantic v2 regarding Field usage.
- Revised `ontology_generator.py` to enforce strict design guidelines for entity types and relationships, ensuring compliance with new requirements.
- Improved logging behavior in `logger.py` to prevent log propagation to the root logger, avoiding duplicate outputs.
2025-11-28 18:59:36 +08:00
666ghj 08f417f3b7 Introduce Project ID for context management, finalizing the stateful API pipeline from file submission to graph construction. 2025-11-28 17:21:08 +08:00