gstack/test
Garry Tan e3384e325c
fix: destructure callJudge return value in LLM eval tests
callJudge<T> returns { result: T, meta } but three call sites were
accessing properties directly on the wrapper object instead of
destructuring result first. This caused "Expected and actual values
must be numbers or bigints" in all workflow judge tests (10 failures).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 15:51:41 -07:00
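The bug described in the commit message above can be illustrated with a minimal sketch. Only the name `callJudge<T>` and the `{ result: T, meta }` return shape come from the commit message. The stub body, the `WorkflowScore` fields, the `meta` contents, and the `demo` function are hypothetical and stand in for code not shown here. The real helper presumably calls a model and is asynchronous; the stub is synchronous to keep the example short.

```typescript
// Hypothetical shapes: only { result: T, meta } is confirmed by the commit message.
interface JudgeMeta {
  model: string; // assumed field, for illustration only
}

interface JudgeResponse<T> {
  result: T;
  meta: JudgeMeta;
}

// Assumed score shape; the real judge output fields are not shown in the source.
interface WorkflowScore {
  clarity: number;
  completeness: number;
}

// Stub standing in for the real callJudge: it wraps a canned value
// in the { result, meta } envelope instead of calling a model.
function callJudge<T>(canned: T): JudgeResponse<T> {
  return { result: canned, meta: { model: "stub" } };
}

function demo(): { buggy: unknown; fixed: number } {
  const canned: WorkflowScore = { clarity: 4, completeness: 5 };

  // Buggy pattern: the wrapper object is treated as if it were the score.
  // The property lives on wrapper.result, so this read yields undefined.
  const wrapper = callJudge(canned);
  const buggy: unknown = (wrapper as unknown as WorkflowScore).clarity;

  // Fixed pattern: destructure result first, then read the score fields.
  const { result } = callJudge(canned);
  const fixed = result.clarity;

  return { buggy, fixed };
}
```

Passing the `undefined` from the buggy pattern to a numeric matcher such as `toBeGreaterThanOrEqual` would fail on the argument type rather than on the score itself, which is consistent with the "must be numbers or bigints" error quoted in the commit message.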
fixtures feat: test coverage catalog — shared audit across plan/ship/review (v0.10.1.0) (#259) 2026-03-22 11:28:16 -07:00
helpers Merge branch 'main' into garrytan/team-supabase-store 2026-03-29 15:12:12 -07:00
analytics.test.ts feat: safety hook skills + skill usage telemetry (v0.7.1) (#189) 2026-03-18 23:57:59 -05:00
audit-compliance.test.ts fix: security audit compliance — credentials, telemetry, bun pin, untrusted warning (v0.12.12.0) (#574) 2026-03-27 12:06:58 -06:00
codex-e2e.test.ts feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425) 2026-03-23 23:05:22 -07:00
gemini-e2e.test.ts feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425) 2026-03-23 23:05:22 -07:00
gen-skill-docs.test.ts Merge branch 'main' into garrytan/team-supabase-store 2026-03-29 15:12:12 -07:00
global-discover.test.ts feat: /retro global — cross-project AI coding retrospective (v0.10.2.0) (#316) 2026-03-22 13:52:47 -07:00
hook-scripts.test.ts feat: safety hook skills + skill usage telemetry (v0.7.1) (#189) 2026-03-18 23:57:59 -05:00
lib-dashboard-queries.test.ts feat: add dashboard query functions — pure transforms for team analytics 2026-03-16 02:43:52 -05:00
lib-dashboard-ui.test.ts feat: add shared team dashboard, regression alerts, weekly digest edge functions 2026-03-16 02:44:47 -05:00
lib-edge-functions.test.ts test: add 24 tests for edge function pure functions 2026-03-16 19:12:40 -07:00
lib-eval-cache.test.ts feat: add SHA-based eval caching with EVAL_CACHE=0 bypass 2026-03-15 09:39:26 -05:00
lib-eval-cli.test.ts feat: add CLI leaderboard, refactor formatTeamSummary to use dashboard-queries 2026-03-16 02:44:12 -05:00
lib-eval-cost.test.ts feat: add eval format validation, tier selection, cost tracking 2026-03-15 09:39:18 -05:00
lib-eval-format.test.ts feat: add eval format validation, tier selection, cost tracking 2026-03-15 09:39:18 -05:00
lib-eval-tier.test.ts feat: add eval format validation, tier selection, cost tracking 2026-03-15 09:39:18 -05:00
lib-eval-trend.test.ts feat: add eval:trend CLI for per-test pass rate tracking 2026-03-15 16:47:41 -05:00
lib-llm-summarize.test.ts feat: add push-transcript CLI, show sessions, interactive setup, 36 tests 2026-03-16 00:15:26 -05:00
lib-sync-config.test.ts feat: hook eval-store sync, use shared utils, add 30 lib tests 2026-03-15 02:02:54 -05:00
lib-sync-show.test.ts feat: add push-transcript CLI, show sessions, interactive setup, 36 tests 2026-03-16 00:15:26 -05:00
lib-sync.test.ts feat: hook eval-store sync, use shared utils, add 30 lib tests 2026-03-15 02:02:54 -05:00
lib-team-admin.test.ts feat: add team admin CLI + migration 007 (settings, cooldowns, create_team RPC) 2026-03-16 02:44:24 -05:00
lib-transcript-sync.test.ts feat: add push-transcript CLI, show sessions, interactive setup, 36 tests 2026-03-16 00:15:26 -05:00
lib-util.test.ts feat: add listEvalFiles, loadEvalResults, formatTimestamp to lib/util.ts 2026-03-15 09:39:09 -05:00
review-log.test.ts fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552) 2026-03-26 23:21:27 -06:00
skill-e2e-bws.test.ts fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552) 2026-03-26 23:21:27 -06:00
skill-e2e-cso.test.ts feat: /cso v2 — infrastructure-first security audit (v0.11.6.0) (#384) 2026-03-23 06:57:22 -07:00
skill-e2e-deploy.test.ts feat: /land-and-deploy first-run dry run + staging-first + trust ladder (v0.12.2.0) (#518) 2026-03-26 11:08:31 -07:00
skill-e2e-design.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-plan.test.ts test: E2E tests for plan review report and Codex offering (v0.11.15.0) (#449) 2026-03-24 07:30:24 -07:00
skill-e2e-qa-bugs.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-qa-workflow.test.ts feat: CI evals on Ubicloud — 12 parallel runners + Docker image (v0.11.10.0) (#360) 2026-03-23 10:17:33 -07:00
skill-e2e-review.test.ts fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552) 2026-03-26 23:21:27 -06:00
skill-e2e-sidebar.test.ts fix: sidebar agent uses real tab URL instead of stale Playwright URL (v0.12.6.0) (#544) 2026-03-26 22:07:03 -06:00
skill-e2e-workflow.test.ts feat: 2-tier E2E test system — granular touchfiles + gate/periodic split (v0.11.16.0) (#450) 2026-03-24 15:24:00 -07:00
skill-e2e.test.ts feat: test coverage catalog — shared audit across plan/ship/review (v0.10.1.0) (#259) 2026-03-22 11:28:16 -07:00
skill-llm-eval.test.ts fix: destructure callJudge return value in LLM eval tests 2026-03-29 15:51:41 -07:00
skill-parser.test.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00
skill-routing-e2e.test.ts fix: community PRs + security hardening + E2E stability (v0.12.7.0) (#552) 2026-03-26 23:21:27 -06:00
skill-validation.test.ts Merge branch 'main' into garrytan/team-supabase-store 2026-03-29 15:12:12 -07:00
telemetry.test.ts fix: security audit remediation — 12 fixes, 20 tests (v0.13.1.0) (#595) 2026-03-28 08:35:24 -06:00
touchfiles.test.ts feat: 2-tier E2E test system — granular touchfiles + gate/periodic split (v0.11.16.0) (#450) 2026-03-24 15:24:00 -07:00
uninstall.test.ts feat: community PRs — faster install, skill namespacing, uninstall, Codex fallback, Windows fix, Python patterns (v0.12.9.0) (#561) 2026-03-27 00:44:37 -06:00
worktree.test.ts feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425) 2026-03-23 23:05:22 -07:00