Bug 1: chat_json() was passing response_format={'type': 'json_object'}
to the LLM, which enforces JSON grammar from token 0. Reasoning
models (Qwen3, DeepSeek-R1, etc.) generate <think>...</think> blocks
before JSON output, causing garbled results. The fix removes the
response_format parameter since the system prompt already requests
JSON output and the existing <think> cleanup handles any remaining
tags.
Bug 2: ontology_generator hardcoded max_tokens=4096, causing
truncation for models with larger context windows. Increased to
16384 to accommodate reasoning model outputs.
Fixes #642
|
||
|---|---|---|
| .. | ||
| app | ||
| scripts | ||
| pyproject.toml | ||
| requirements.txt | ||
| run.py | ||
| uv.lock | ||