Plan-tune cathedral follow-up. The 3/day distill cap was theatrical: at
~$0.01 per Haiku call, even a runaway loop firing every minute would cost
~$14/day, and free-text events are rare enough that the natural input
rate self-limits to 1-2 fires/day. Count caps don't protect against
runaway bugs (which fire 1000x/second, not 4 times/day) but DO punish
heavy users who'd legitimately distill multiple times during a busy week.
Removed: 3/day rate cap on bin/gstack-distill-free-text. --status output
swapped from "TODAY: N / 3" to "TODAY: N run(s), $X" so users see what
they're spending instead of how close they are to a meaningless count.
Loosened (caps that exist for real-runaway protection, not normal scope):
- EVALS_BUDGET_HARD_CAP_GATE $25 → $200/run
- EVALS_BUDGET_HARD_CAP_PERIODIC $70 → $500/run
- EVALS_BUDGET_HARD_CAP $30 → $300/run (umbrella fallback)
- GSTACK_SIZE_BUDGET_RATIO 1.05 → 1.50 per-skill ratio
- plan-review preamble byte budget 40K → 60K
Principle: caps exist to catch obvious bugs (infinite retry, model price
change, prompt blowup), not to gate legitimate scope growth. Set high
enough that real growth never trips them, only bug territory does.
Adjusted defaults are 4-8× historical worst case, leaving ample headroom
for the next 12 months of legitimate expansion.
Tests updated: distill-free-text removes the 3-test rate-cap describe
block in favor of "no rate cap" assertion that 10 runs/day pass. Other
budget tests still pass because they were never near the old ceilings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plan-tune cathedral T10. Reads auq-other free-text events from this
project's question-log.jsonl, calls Claude via the Anthropic SDK to extract
structured proposals (preference candidates, declared-profile nudges, memory
nuggets), writes them to distillation-proposals.json for the user to review
via /plan-tune (never autonomous — every apply requires explicit Y).
Subcommands:
gstack-distill-free-text # sync distill
gstack-distill-free-text --background # detach + return PID
gstack-distill-free-text --dry-run # emit prompt + events, no API call
gstack-distill-free-text --status # run history + cost-to-date
D7 rate cap: 3 distills per slug per day. Reads ~/.gstack/distill-cost.jsonl
for the count, exits with RATE_CAPPED when limit hit. Cost log lines tagged
by slug so sibling projects don't share the cap. Yesterday runs don't count.
D6 API auth: Anthropic SDK direct, fail-loud on missing ANTHROPIC_API_KEY
with explicit message that distill is a separate billing surface from the
interactive Claude Code session. Uses claude-haiku-4-5 for cost (~$0.001/
1k input, $0.005/1k output) — sufficient for structured extraction.
D14 execution context: --background spawns detached (nohup) so auto-trigger
during /ship doesn't add 30s of pause; results surface on next /plan-tune.
Source events get distilled_at:<ts> stamped on them after the run so they
don't re-propose on the next distill. Match by ts + question_id.
Cost-log line per run includes: slug, proposals_count, rejected_low_confidence,
input_tokens, output_tokens, cost_usd_est. /plan-tune stats reads this to
show "$X estimated, N runs this month" per Layer 4 surface.
10 unit tests cover --status, rate cap (3/day, yesterday-not-counted,
other-slug-not-counted), no-log/no-free-text paths, --dry-run, missing
API key, --background spawn. The actual SDK call is exercised by the T16
E2E test (uses real key, ~$0.001 per run).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>