6.2 KiB
6.2 KiB
Technology Stack
Architecture
A two-tier web app with a long-running background-task core:
- Frontend (Vue 3 + Vite) — Single-page UI orchestrating the 5-step workflow. Polls the backend for task progress; renders the knowledge graph with D3.
- Backend (Flask +
uv) — Stateless HTTP API on top of in-memoryProjectandTaskmodels. Heavy work (ontology extraction, graph build, profile generation, simulation, report) runs as background tasks tracked throughTaskand exposed via polling endpoints. - Knowledge graph — Neo4j is the durable store; Graphiti is the
write/read layer. All queries are scoped by per-project
group_id. - Simulation — CAMEL-OASIS executes in subprocesses; the Flask app
communicates with them only through
services/simulation_ipc.py.
The system favors process isolation for the simulator and in-memory state with restart recovery for project/task tracking, rather than a classic job queue + persistent DB.
Core Technologies
- Backend language: Python ≥3.11, ≤3.12
- Backend framework: Flask 3.0 + flask-cors
- Backend tooling:
uvfor dependency management - Frontend framework: Vue 3.5 + Vue Router 4 +
vue-i18n11 - Frontend tooling: Vite 7
- Graph DB: Neo4j 5.x (Community) via
bolt:// - Graph layer:
graphiti-core≥ 0.3 - Simulation:
camel-oasis0.2.5 +camel-ai0.2.78 - LLM access: OpenAI SDK against any OpenAI-compatible endpoint
Key Libraries
Only the libraries that shape how new code is written:
openai— Sole LLM client; new providers are integrated by changingLLM_BASE_URL/LLM_MODEL_NAME, not by adding a second SDK.graphiti-core— All graph reads/writes go through thegraphiti_adapter; do not call Neo4j drivers directly from feature code.camel-oasis/camel-ai— Pinned versions; upgrading either requires re-validating the simulation pipeline end-to-end.PyMuPDF,charset-normalizer,chardet— File ingestion; encoding detection is mandatory because seed material is frequently non-UTF-8 (notably mixed Chinese/English).pydanticv2 — Used for structured LLM output / validation.axios(frontend) — All API calls go throughsrc/api/*.jsservices with a 5-min timeout and exponential retry; components must not callfetch/axiosdirectly.d3v7 — Knowledge-graph visualization inGraphPanel.vue.
Development Standards
Type Safety
- Python: type hints where the surrounding file uses them. Don't retrofit hints into untyped modules just for consistency.
- Frontend: plain JavaScript, not TypeScript. Use JSDoc only when it improves clarity.
Code Quality
- No enforced linter or formatter in this repo by design. Match the surrounding file's style. Discuss with the user before introducing ESLint/Prettier/Ruff/Black.
- 4-space indentation everywhere.
- Python:
snake_case. Existing files mix English and Chinese in comments/docstrings — preserve both; do not translate one into the other unless asked.
Testing
- pytest is wired (
backend/scripts/test_profile_format.py) but coverage is intentionally minimal. Don't add a heavy test harness without discussing scope. - For UI changes, run
npm run devand exercise the feature in a browser; type-check/test passes do not prove feature correctness here.
Internationalization
- User-visible strings live in repo-root
/locales/*.json(en.json,zh.json,languages.json). Thefrontend/vite.config.jsaliases@localesto that root folder so the backend logger and frontend share the same keys. - Backend logger messages are part of the i18n surface — translate keys, not raw log lines, when adding new logs that surface to users.
Development Environment
Required Tools
| Tool | Version |
|---|---|
| Node.js | ≥18 |
| Python | ≥3.11, ≤3.12 |
uv |
latest |
| Neo4j | 5.x Community |
| Docker | optional |
Common Commands
# Setup (one-shot)
npm run setup:all
# Dev (backend on :5001, frontend on :3000 with /api proxy)
npm run dev
# Run individually
npm run backend
npm run frontend
# Build frontend
npm run build
# Backend tests
cd backend && uv run python -m pytest
# Full stack (incl. Neo4j)
docker compose up
Key Technical Decisions
- Neo4j + Graphiti replaces Zep Cloud. Several services still carry
the legacy
zep_*filename prefix (zep_tools.py,zep_entity_reader.py,zep_graph_memory_updater.py). New code must not depend on Zep Cloud. TheZEP_API_KEYenv var is kept (empty string is fine) only for backwards compatibility. - Per-project graph isolation via
group_id. Every Graphiti read or write must filter by the project'sgroup_id. There is no cross-project graph access. - Reasoning-model output stripping. Models like MiniMax and GLM emit
<think>blocks and markdown fences; outputs are stripped before JSON parsing (see commit985f89f). New LLM-output parsers must do the same. - Background tasks via
Taskmodel, not a queue. Anything taking more than a few seconds returns immediately and tracks progress on aTaskobject the frontend polls. There is no Celery/RQ/etc. - Startup recovery for stuck projects. On boot,
_recover_stuck_projectspromotes projects inGRAPH_BUILDINGtoGRAPH_COMPLETEDif Neo4j already has their nodes. New long-running task types should follow the same recovery pattern. - Subprocess cleanup is centralized.
SimulationRunner.register_cleanup()registers a shutdown hook so simulation subprocesses die with the app. Don't spawn subprocesses outside this path. - Configuration is a single Python file.
backend/app/config.pyholds LLM, Neo4j, embedding, chunking, OASIS, and ReportAgent settings. Prefer extending it over scattering env-var reads through the codebase. - Default simulation parameters. Max 10 rounds. Twitter actions:
CREATE_POST,LIKE_POST,REPOST,FOLLOW,QUOTE_POST,DO_NOTHING. Reddit additionally:CREATE_COMMENT,LIKE_COMMENT,DISLIKE_*,SEARCH_*,TREND,REFRESH,MUTE. Changes go inconfig.py, not per-call.
Document standards and patterns, not every dependency