Commit Graph

11 Commits

Author SHA1 Message Date
rqd6f4g6zn-bit c2d533d933 fix: robust LLM JSON parsing + sanitize report sections + chat history dedup
Closes #624, #622, #601, #599, #577

## LLM JSON parsing (#624 / #622 / #601)
- New `_parse_llm_json()` in llm_client.py with 5-stage fallback:
  1. Strip markdown fences (existing)
  2. Strict json.loads (fast path)
  3. json.JSONDecoder.raw_decode (handles trailing prose after JSON)
  4. Balanced-brace extraction (leading prose + embedded JSON)
  5. Strip control chars + retry
- Replaces strict json.loads in chat_json() that was failing on any LLM
  appending text after the JSON (common with qwen-plus, ollama, gemma even
  with response_format=json_object).
- Logs which fallback was used so problematic LLMs are visible.
- 8 unit-test cases covering each strategy.

## Report section tool_call leak (#599)
- New `_sanitize_section_content()` in report_agent.py detects when a
  section's "final answer" is actually an unexecuted tool_call JSON
  (e.g. `{"name":"quick_search","parameters":{...}}`) and replaces it
  with a clear fallback message instead of writing the raw artifact to
  the report.
- Applied at all 3 places where final_answer is returned in
  write_section(): the Final Answer path, the no-prefix fallback, and
  the force-finalize path.

## Chat history duplicate user message (#577)
- In report_agent.py chat(), defensively dedupe chat_history:
  - Only keep {role, content} from history items
  - Skip entries that match the current message exactly
- This prevents LLM from seeing a duplicate trailing user message and
  echoing back the previous answer.
- Added debug log of constructed messages array for diagnostics.
2026-05-17 08:23:16 +02:00
ghostubborn f2404903d6 fix(i18n): validate Accept-Language header against registered locales 2026-04-02 14:20:15 +08:00
ghostubborn 7c07237544 fix(i18n): pass locale to background threads via thread-local storage
Background threads (graph building, simulation prep, report generation,
profile generation) now inherit the requesting user's locale preference.
Previously these fell back to 'zh' because Flask request context was
unavailable in spawned threads.
2026-04-01 16:55:51 +08:00
ghostubborn 0c18e1aeca feat(i18n): add backend translation utility with shared locale files 2026-04-01 15:22:14 +08:00
666ghj 985f89f49a fix: resolve 500 error caused by <think> tags and markdown code fences in content field from reasoning models like MiniMax/GLM 2026-03-06 00:30:31 +08:00
666ghj da6548e96f feat(graph): implement pagination for fetching nodes and edges; add utility functions for streamlined data retrieval 2026-02-27 15:53:29 +08:00
666ghj 390c120fef fix(file_parser): handle non-UTF-8 encoded text files with automatic encoding detection 2026-01-22 18:28:37 +08:00
666ghj f46c1a9ec7 Add UTF-8 encoding support for Windows console in run.py and logger.py to prevent character encoding issues 2025-12-26 17:58:48 +08:00
666ghj 5f159f6d88 Enhance backend functionality with OASIS simulation features
- Updated README.md to include new simulation scripts and configuration details for OASIS, including API retry mechanisms and environment variable settings.
- Added simulation management and configuration generation services to streamline the simulation process across Twitter and Reddit platforms.
- Introduced new API routes for simulation-related operations, including entity retrieval and simulation status management.
- Implemented a robust retry mechanism for external API calls to improve system stability.
- Enhanced task management model to include detailed progress tracking.
- Added logging capabilities for action tracking during simulations.
- Included new scripts for running parallel simulations and testing profile formats.
2025-12-01 15:03:44 +08:00
666ghj e98da6b53e Enhance backend startup logging and API endpoint display
- Updated `run.py` to conditionally print startup information only in the reloader process to avoid duplicate logs in debug mode.
- Modified `__init__.py` to log startup and completion messages based on the reloader process condition.
- Added warnings suppression in `graph_builder.py` for Pydantic v2 regarding Field usage.
- Revised `ontology_generator.py` to enforce strict design guidelines for entity types and relationships, ensuring compliance with new requirements.
- Improved logging behavior in `logger.py` to prevent log propagation to the root logger, avoiding duplicate outputs.
2025-11-28 18:59:36 +08:00
666ghj 08f417f3b7 Introduce Project ID for context management, finalizing the stateful API pipeline from file submission to graph construction. 2025-11-28 17:21:08 +08:00