Commit Graph

5 Commits

Author SHA1 Message Date
Garry Tan 14fc0866d9
v1.58.0.0 feat: diagram + multi-format document engine (mermaid, excalidraw, single-file HTML, DOCX) (#1990)
* docs(todos): P3 content-hash diagram render cache for make-pdf

Deferred from the diagram-engine eng review (Codex outside-voice D7):
repeat make-pdf runs re-render every fence; cache keyed on fence source +
bundle version once multi-diagram docs make it worth building.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(diagram-render): offline mermaid+excalidraw render bundle for browse

Single self-contained page (dist/diagram-render.html, 9.2MB, committed per
eng-review D2) exposing __renderMermaid / __mermaidToExcalidraw /
__excalidrawToSvg / __rasterize / __probeImage through browse load-html +
js --out. Render contract per D3: securityLevel strict, per-fence ids,
print-css font lock, htmlLabels off (canvas-taint-safe). Deterministic
build (same sha twice); drift test pins dist == BUILD_INFO == package.json
pins and rebuild-reproducibility when toolchain matches. Spike-proven
offline: flowchart + sequence SVG, editable .excalidraw scene, 300dpi PNG.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(diagram-render): __downscaleRaster for print-resolution image normalization

Data-URI rasters re-encode in their own format (JPEG stays JPEG at q0.9 —
PNG-encoding photos bloats them) at an explicit target pixel width. Used by
make-pdf's pre-pass for the 300dpi content-box ceiling (eng-review D4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(make-pdf): diagram pre-pass — mermaid/excalidraw fences render as vector SVG; local images inline as data URIs

```mermaid / ```excalidraw fences extract to placeholder tokens, render in
one diagram-render bundle tab per run (reset contract: bundle page reloads
after any render error), and substitute back as accessible <figure> blocks
with the raw source preserved in a comment. Render failures produce a loud
red diagnostic block, never silent raw code. render=false keeps a fence as
code; title="..." becomes the aria-label and caption.

Local images now actually render: page.setContent loads at about:blank
(tab-session.ts:194), so relative paths silently 404'd before. The pre-pass
resolves them against the markdown's directory, inlines as data URIs, probes
intrinsic dimensions from the bytes (pure-TS PNG/JPEG/GIF/WebP/SVG sniffing),
and downscales rasters wider than 2x the content box at 300dpi. Remote URLs
warn (offline posture, --allow-network exempts); missing files get a visible
placeholder; --strict hard-fails both for CI pipelines.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf): diagram pre-pass unit suite + e2e render gates

34 unit tests (fence extraction incl. nested/tilde/unclosed/render=false,
info-string parsing, slot substitution, diagnostic/figure escaping + SVG
script strip, byte-level dimension probing across 5 formats, content-box
math, image inlining incl. strict/remote/missing/data-URI paths). E2E gate
proves through the compiled binary: both fences render as vector text
(id-collision check), raw mermaid ships only via render=false, broken fence
yields the diagnostic block, and the relative fixture image rasterizes to
colored pixels (CRITICAL regression for the about:blank image fix).
--strict exits non-zero on a missing image.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(make-pdf): width directives + conservative auto-landscape via CSS named pages

`![a](x.png){width=full|<pct>|<dim>}` and `{page=landscape|portrait}`
suffixes translate to data-gstack-* attrs in render() (before the sanitizer,
which keeps data- attributes; unrecognized brace groups stay visible text).
Default width rule needs no code: intrinsic CSS-px capped at the content box,
never upscaled — figure img max-width owns it.

Auto-landscape promotes a block to `@page wide { size: <pagesize> landscape }`
only when aspect >= 1.8 AND intrinsic width > 2.5x the content box (~1600px on
letter) AND diagram provenance (rendered fences) or a whole-word alt token
(diagram|architecture|flowchart|chart|graph) for plain images. {page=...}
forces or vetoes; fence info strings accept page=... too. preferCSSPageSize
is passed to Chromium only when a promotion exists, so every other document
prints exactly as before. False negatives are cheap; false positives feel
broken (eng-review P4, Codex challenge accepted).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf): width-policy unit suite + landscape e2e gate with negative fixtures

24 unit tests weighted toward the false-positive guards: wide screenshot
without an alt hint stays portrait, sub-threshold and tall images stay
portrait, deterministic 1560/1561px boundary, whole-word alt matching
('photographic' must not match 'graph'), page=portrait veto beats every
heuristic, diagnostic blocks never promote. E2E gate asserts pdfinfo
per-page boxes through the compiled binary: exactly 3 of 5 fixture blocks
get landscape pages (alt-hinted image, directive-forced image, wide sequence
diagram) while the unhinted screenshot and the veto'd diagram stay portrait —
plus the --toc combo proving TOC and named-page landscape coexist.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(make-pdf): --to html|docx output formats

--to html writes the assembled self-contained document directly (no print
round-trip): inline vector diagrams, data-URI images, zero network
references, plus an @media screen layer for browser reading. --to docx is
the content-fidelity export (eng-review P8): html-to-docx@1.8.0 (exact pin;
pure JS, bun-compile-verified) maps headings/tables/code/lists; diagrams and
SVG images rasterize at 300dpi of the content-box width via the render tab;
diagnostic figures convert to plain p/pre so the converter can't silently
drop an error. --format keeps its page-size-alias meaning; --to is the
output format, and the CLI says so when confused.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf): format gate — html no-network-refs + docx zip content checks

HTML: zero src/href network refs, no script/link tags, inline SVG diagrams,
data-URI images, screen layer, diagnostic survives. DOCX: valid OOXML zip
(document.xml + Content_Types), >=2 PNG media (diagram raster + fixture
image), headings + render=false source + diagnostic text in document.xml,
no leaked mermaid source from rendered fences. Plus --to validation UX.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(diagram): /diagram skill — English in, editable diagram triplet out

New skill: agent authors mermaid from the user's description and renders the
triplet through the offline diagram-render bundle in the browse daemon —
.mmd source (the single source of truth), editable .excalidraw (opens at
excalidraw.com, round-trips back through re-render), and SVG + PNG. Flowcharts
convert to fully editable scenes; other mermaid types render with an explicit
upstream-converter limitation note. Never ships an unrendered source file;
offline is the contract (no CDN fallback). Inventory rows in AGENTS.md +
docs/skills.md; generated SKILL.md + llms.txt via gen:skill-docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(diagram): paid E2E pair — gate triplet contract + periodic authoring judge

diagram-triplet (gate, deterministic functional): a fresh claude -p agent
following the skill extract must emit a parseable triplet — graph LR/TD in
.mmd, excalidraw scene with >3 elements, SVG markup, PNG magic bytes.
Verified live: pass, $0.17, 58s. diagram-authoring-quality (periodic,
LLM-judged): faithfulness/labels/size rubric with a diagnostic-path cap,
floor 6/10. Verified live: pass at exactly 6 with substantive critique.
Touchfiles select both on diagram/** and lib/diagram-render/** changes;
tier split per E2E_TIERS rules (eng-review D5).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(diagram): register /diagram in the skill coverage matrix

Gate: triplet contract + structural floor; periodic: authoring-quality judge.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(make-pdf): typography scale-up, zero image truncation, landscape vertical centering

Dogfooding round on the repo README surfaced four output-quality bugs:

- Type was too small everywhere: body 11→12pt, h1 22→26pt, h2 15→18pt,
  cover title 32→56pt with poster spacing, cover meta 10→13pt, TOC 11→12pt
  with tighter leading, code 9.5→10.5pt, tables 10→11pt.
- Zero image truncation, ever: the max-width cap was figure-scoped, but
  markdown images render as <p><img> — a 1850px GitHub screenshot ran off
  the page edge. Global img { max-width: 100%; height: auto; } cap.
- hyphens: auto put real 'dif-\nferent' breaks into the PDF text layer the
  moment 12pt made lines wrap (combined-gate caught it). Clean copy-paste
  is the product contract; left-aligned rag doesn't need hyphenation →
  hyphens: manual.
- Promoted landscape blocks now vertically center. CSS flex/min-height
  centering fragments into phantom empty landscape pages in Chromium
  (bisected: min-height at ANY value; 3 promotions printed 5 pages), so
  image-policy computes an inline margin-top from each block's known
  aspect ratio against the landscape content box instead — fragmentation
  handles margins fine. .page-wide also drops its explicit break-before/
  after (the page-name change already breaks on both sides).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf): pin zero-truncation invariant, typography floor, centering math

Global img cap pinned as a regex invariant (the figure-scoped-cap regression
class); typography floor (12pt body, 56pt cover, 12pt TOC); .page-wide must
NOT carry min-height/flex (the phantom-landscape-page regression class);
centering margin math verified both ways (2400×1000 image → 1.38in,
2050×600 viewBox diagram → 1.93in, page-filling directive block → no margin).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: diagram + multi-format documentation across README, make-pdf skill, and how-to guide

README gains /make-pdf (Publisher) and /diagram (Diagram Maker) rows in the
sprint table. make-pdf's skill doc — the agent-facing contract — gains Core
patterns for mermaid/excalidraw fences (title/render=false/page= options),
the image policy ({width=}/{page=} directives, zero-truncation, conservative
auto-landscape), --to html|docx, and --strict, plus the --to vs --format
disambiguation in Common flags. New docs/howto-diagrams-and-formats.md is
the user-facing walkthrough: fences, directives, formats, /diagram triplet,
the mermaid racetrack trick, troubleshooting.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf): fill ship-audit coverage gaps — downscale, reset contract, excalidraw fence, WebP

Ship coverage audit found 9 gaps (85%); this fills the 2 HIGH + 3 MEDIUM and
most LOW. diagram-gate fixture gains a 4200px incompressible photo (the only
live coverage of __downscaleRaster AND the 64KB chunked jsViaBuffer eval
transport — asserted via the downscale stderr warning), an ```excalidraw
scene fence rendered through exportToSvg (vector labels + caption in
pdftotext, no leaked scene JSON), and the broken fence MOVED BETWEEN the two
mermaid fences so the second diagram rendering proves the D6.2 reset
contract end-to-end. New coverage-gaps.test.ts (16 tests): mock-tab reset
contract (exactly one reload, post-failure fence renders), excalidraw
fail-fast diagnostic without a bundle call, rasterize error fallbacks
(figure/tag kept, never silent), WebP VP8/VP8L/VP8X byte parsers,
landscapeContentBox a4/asymmetric margins, bare-token slot fallback,
resolveBundlePath env override + error shape, screenCss media scoping.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(make-pdf): pre-landing review wave — fence fidelity, injection hardening, Windows paths, transport rework

Review army (6 specialists + red team) findings, all fixed:

- Indented fences replay byte-for-byte and indented diagram fences are NOT
  extracted (red-team conf-9: the pre-pass reconstructed fences at column 0,
  splitting any list containing fenced code — every ordinary document).
- String.replace $-pattern injection killed at every seam: substituteSlots,
  mergeStyle, img/src rewrites all use function replacements (a diagram label
  containing $' duplicated the document tail).
- Big-expression transport reworked: browse `eval <file>` (one spawn, any
  size, Windows-safe) replaces the 64KB chunked window-buffer eval — fixes
  the per-chunk spawn cost, the char-vs-byte argv units, AND the Windows
  32,767-char command-line ceiling in one move.
- Staged-bundle trust: content verified by hash even when the file exists,
  and the rename-failure path re-hashes the survivor (sticky-bit /tmp EPERM
  would otherwise ride a pre-planted file past the check).
- Windows drive-letter img srcs (C:/x.png) reach the local-path branch
  instead of being swallowed as unknown URL schemes.
- DOCX rasterize-failure now embeds the decoded source as visible text —
  returning the figure made diagrams vanish silently (converter drops svg).
- Fence source preserved as base64 data-gstack-source attribute (the comment
  encoding corrupted every '-->' arrow); decodeFigureSource() round-trips.
- inlineLocalImages memoizes per path; file:// uses fileURLToPath; preview
  prints a divergence note for fences/local images; --to docx strips the
  watermark div and warns about print-only flags; TOC links resolve in
  html/docx (heading ids assigned); waitForExpression sleeps instead of
  busy-spinning; escapeHtml/svg-dims deduped to single definitions;
  typography stragglers (blockquote 12pt, footnotes 10pt, 42em screen
  measure); bundle BUILD_INFO gains srcSha256 for no-node_modules drift
  detection; MAX_TARGET_PX shared guard.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* ci: make-pdf gate covers the diagram-render bundle; bundle pinned to LF

make-pdf-gate.yml paths gain lib/diagram-render/** and the drift test (a
bundle-only PR previously skipped every render gate AND no CI lane ran the
drift check at all). .gitattributes pins dist html/json to LF so Windows
autocrlf can't break the hash-pinned bundle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(make-pdf)+feat(diagram): review-wave test pins + skill transport hardening

Tests: indented-fence byte-for-byte replay + no-extraction-in-lists,
drive-letter local-path routing, $-pattern slot immunity, base64 source
round-trip ('A --> B' exact), existing-style merge preservation, DOCX
rasterize-failure surfaces source, srcSha256 + font-stack drift guards,
landscape veto asserted as some-portrait/no-landscape (layout-order-proof),
judge rubric cap lowered to 5 so it actually fails, vacuous error-shape test
removed honestly, tmpdir cleanup.

/diagram skill: base64 transport (template literals corrupted backticks/${
in sources), content-addressed staging with hash verification, and --tab-id
pinned on every browse call so a concurrent /qa session can't be clobbered.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(make-pdf): out-of-tree image reads warn; --strict makes them fatal (D8.1)

Local CLI semantics stay (absolute paths and ../ still inline, like pandoc),
but never silently: an agent PDF-ing untrusted markdown can't quietly embed a
file from outside the input directory into a shareable document without a
visible warning, and --strict pipelines hard-fail. Two unit tests. Also:
TODOS.md gains the deferred e2e-harness dedup entry (D8.2).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix: pre-existing test failure in skill-e2e-bws operational-learning

Root cause was the fixture, not model behavior: gstack-learnings-log gained
an import of lib/jsonl-store.ts in the v1.57.5.0 injection-sanitization wave,
but the test copies only bin/ scripts into its sandbox — the inline bun
import failed and the script exited 1 before writing, on every run, on main
too (reproduced at a5833c41). Fixture now stages lib/jsonl-store.ts beside
bin/; verified deterministically (script exits 0, learning written) and via
the paid test (1 pass).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(make-pdf): adversarial-review wave — offline posture enforced, symlink-aware confinement, bounded reads

Codex adversarial + structured review findings:

- Remote images are now BLOCKED with a visible placeholder instead of
  warn-and-keep — leaving the tag meant Chromium fetched the URL at print
  time anyway, so the offline posture was a lie (tracking pixels and
  internal-URL probes ran without --allow-network).
- The out-of-tree read check compares REAL paths: a symlink inside the input
  dir pointing at ~/.ssh/... passed the string-prefix check, including under
  --strict. Ordered after the existence check (realpath of a missing file
  false-positives on macOS /var → /private/var).
- Image reads are bounded BEFORE reading: statSync first, non-regular files
  (fifo/device/dir) and >64MB files degrade to placeholders instead of
  hanging or exhausting memory; malformed percent-encoding (foo%zz.png)
  degrades to missing-image instead of crashing decodeURIComponent.
- browse shell-outs get a 120s timeout — a wedged daemon or hostile mermaid
  source fails the run instead of hanging it.
- TOC entries link to the heading's ACTUAL id (pre-id'd raw-HTML headings
  previously got dead #toc-N links); per-side margins compose into the CSS
  @page shorthand so a landscape promotion flipping preferCSSPageSize no
  longer silently reverts --margin-left/right to defaults (Codex P2).
- The image memo is a typed object — literal NUL-byte separators had made
  diagram-prepass.ts register as binary to text tooling.

Codex structured review GATE: PASS (no P1).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* chore: bump version and changelog (v1.58.0.0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: sync make-pdf image-policy docs with final shipped behavior (v1.58.0.0)

The docs wave (87594420) predated the final review-wave commits, so two
docs drifted from shipped behavior:

- make-pdf/SKILL.md.tmpl + generated SKILL.md: remote images are BLOCKED
  with a visible placeholder (not warned-and-kept); out-of-tree reads
  (including via symlink) warn and --strict makes them fatal; --strict
  also covers oversized (>64MB) and non-regular files; troubleshooting
  entry now names the actual "[remote image blocked]" symptom.
- docs/howto-diagrams-and-formats.md: same corrections in the image
  section, CI section, and troubleshooting.
- README.md: docs/howto-diagrams-and-formats.md added to the Docs table
  (was unreachable from any entry-point doc).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: apply Codex doc-review findings for v1.58.0.0

Cross-model doc review (Codex, read-only) checked the v1.58.0.0 docs
against the shipped code. Fixes:

- howto + make-pdf SKILL: diagram source is preserved base64 in a
  data-gstack-source attribute, not an HTML comment (-- in mermaid
  arrows would corrupt a comment); fences must start at column 0;
  fence options example gains page=portrait; --to html "zero network
  refs" qualified (--allow-network deliberately keeps remote tags).
- /diagram description, README + docs/skills.md rows: the hand-drawn
  aesthetic belongs to the .excalidraw artifact; rendered SVG/PNG use
  mermaid's clean neutral theme (lib/diagram-render entry.ts pins
  theme: "neutral").
- CHANGELOG v1.58.0.0 wording: --strict coverage lists all five fatal
  classes (missing/remote/out-of-tree/oversized/non-regular); fences
  are vector SVG in pdf+html, 300dpi PNG in docx; hand-drawn claim
  scoped to the .excalidraw file.
- lib/diagram-render/README: Page API table gains __downscaleRaster.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 15:38:53 -07:00
Garry Tan 62024d114c
v1.52.2.0 fix(make-pdf): render emoji instead of tofu (▯) on Linux (#1787)
* fix(make-pdf): emoji font fallback in print CSS

Emoji code points rendered as .notdef tofu (▯) because the body and
@top-center font stacks had no emoji family for Chromium to fall back to.
Add SANS_STACK / CJK_STACK / EMOJI_FAMILIES constants (one source of truth
per family list) and append the emoji families before the generic
sans-serif in the two stacks that can hold emoji. The @bottom-* boxes hold
counters / a fixed CONFIDENTIAL string, so they share SANS_STACK without
emoji. Non-emoji output is byte-identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(setup): auto-install color-emoji font on Linux

macOS and Windows ship a color-emoji font; most Linux distros/containers
ship none, so make-pdf emits tofu there. ensure_emoji_font() best-effort
installs fonts-noto-color-emoji (apt, with dnf/pacman/apk fallbacks) and
refreshes the fontconfig cache. Hardened: Linux-only guard, GSTACK_SKIP_FONTS
escape hatch, fc-match color=True detection (the broad fc-list query
false-matched LastResort), sudo -n so a password prompt fails fast instead
of hanging, DEBIAN_FRONTEND=noninteractive, timeout 30 on apt update, and
fc-cache under sudo. Warns instead of failing. After a fresh install,
refresh_browse_daemon_for_fonts() runs 'browse stop' so the next render
spawns a Chromium that sees the new font (font fallback is process-cached).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(make-pdf): emoji render gate (pdffonts + pixel proof)

pdftotext is a false oracle for emoji: Skia preserves the Unicode in the
text cluster even when the glyph drew as .notdef tofu, so extraction passes
on a broken render. The gate instead asserts (1) pdffonts shows an emoji
family embedded and (2) pdftoppm rasterizes the page to color (measured
~1650 saturated pixels vs ~0 for tofu). pdfimages is not used: macOS embeds
color emoji as Type 3 fonts, so it lists nothing even on a correct render.
Adds resolvePopplerTool() (DRY resolver, returns null for clean skips) and
a fixture exercising FE0F variation-selector emoji. Skips cleanly when
poppler tools or a color-emoji font are unavailable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(make-pdf): install emoji font + run emoji gate on Ubuntu

Install fonts-noto-color-emoji before Chromium launches on the Ubuntu leg
(macOS already ships Apple Color Emoji), refresh fontconfig, and log the
fc-match result. Run the whole make-pdf/test/e2e/ dir so the emoji gate runs
alongside the combined-features copy-paste gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* harden(make-pdf): emoji gate + font install per adversarial review

Codex adversarial pass on the implementation diff flagged five robustness
gaps, all fixed here:
- emoji-gate skipped green in CI when poppler/font prerequisites were absent,
  which could let the tofu regression ship behind a green build. Missing
  prerequisites are now a HARD FAILURE when process.env.CI is set; local dev
  still skips cleanly.
- execFileSync children (make-pdf, pdffonts, pdftoppm, fc-match) had no
  timeout; a wedged binary or hostile GSTACK_*_BIN override could hang the
  job past Bun's test timeout. Each child now has a 25s ceiling.
- PPM parser trusted header tokens blindly; malformed/variant output gave a
  silently-wrong count. Now validates magic/dimensions/maxval and pixel-buffer
  length, handles header comments, throws a hard diagnostic on mismatch.
- predictable /tmp paths were collision/symlink-prone; now mkdtempSync under
  /tmp (kept under /tmp for browse's validateOutputPath allowlist).
- only apt-get update was timeout-wrapped; dnf/pacman/apk installs and apt
  install can hang on locks/mirrors. All package installs now timeout-bound.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.52.2.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(make-pdf): document color-emoji font requirement + GSTACK_SKIP_FONTS

Extend the Linux font note to cover the color-emoji font that make-pdf
emoji rendering needs: setup auto-installs fonts-noto-color-emoji, the
print CSS falls back through Apple/Segoe/Noto emoji families, and
GSTACK_SKIP_FONTS=1 opts out. Edit the .tmpl and regenerate SKILL.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 18:06:19 -07:00
Garry Tan 49cc4ff9c9
v1.31.1.0 fix wave: 3 community PRs (careful BSD sed, codex Step 0 rename, make-pdf setup ordering) (#1413)
* fix(careful): BSD sed compatibility for safe exception detection on macOS

The sed regex in check-careful.sh uses \s+, which is a GNU sed
extension not supported by BSD sed (macOS default). On macOS, this
causes the RM_ARGS strip to fail silently, making rm -rf of safe
exceptions (node_modules, .next, dist, etc.) trigger the destructive
warning instead of being permitted as designed.

Fix: replace \s+ with POSIX [[:space:]]+, which works on both GNU sed
(Linux) and BSD sed (macOS).

The existing test/hook-scripts.test.ts already documented this
limitation via a detectSafeRmWorks() helper and a platform-conditional
assertion ("if GNU sed: expect undefined, else: expect ask"). Now that
the regex works on both platforms, this dead path is removed and the
safe-exception tests assert the same expectation on every OS.

Note: the grep regex in the same file also uses \s+, but BSD grep -E
on macOS does support \s (verified via bash -x trace), so only the
sed expression needs the fix.

Discovered while translating the careful skill for a Japanese
derivative project (uzustack). Reference:
https://github.com/uzumaki-inc/uzustack/commit/bc67c8d

* docs(codex): rename Step 0 to avoid collision with platform-detect prelude

The codex skill template had its own '## Step 0: Check codex binary'
heading (line 42), which after gen-skill-docs collided with the
platform-detection prelude '## Step 0: Detect platform and base branch'
(injected by scripts/resolvers/utility.ts). The generated codex/SKILL.md
ended up with two H2 headings labeled Step 0, which is ambiguous to an
agent reading the skill in order.

Renamed the local heading to Step 0.4, slotting it between the prelude
(Step 0) and the existing Step 0.5 / Step 0.6 sections. No renumbering
of downstream steps needed.

Closes #1388

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(codex): regenerate SKILL.md after Step 0 rename

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(make-pdf): move setup before preamble footer

* chore: bump version and changelog (v1.31.1.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: ToraDady <tac201k@gmail.com>
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com>
2026-05-10 06:57:24 -07:00
Garry Tan e23ff280a1
fix(v1.4.1.0): /make-pdf — page numbers, entity escape, Linux fonts (#1098)
* fix(make-pdf): single-source page numbers via CSS, honor --no-page-numbers end-to-end

Two page-number sources were stacking in every PDF: Chromium's native footer
and our @page @bottom-center CSS. The CLI flag --page-numbers/--no-page-numbers
also never reached the CSS layer, because RenderOptions didn't carry it.
Passing --footer-template likewise dropped the "custom footer replaces stock
footer" semantic.

- orchestrator.ts: browseClient.pdf() gets pageNumbers:false unconditionally.
  CSS is the single source of truth. Chromium native numbering always off.
- render.ts: RenderOptions gains pageNumbers + footerTemplate. render() computes
  showPageNumbers = pageNumbers !== false && !footerTemplate and passes to
  printCss(), preserving the prior footerTemplate-suppresses-stock semantic.
- print-css.ts: PrintCssOptions.pageNumbers wraps @bottom-center in a conditional
  matching the existing showConfidential pattern.
- types.ts: PreviewOptions.pageNumbers so preview path compiles and matches CLI.
- render.test.ts: 7 regression tests covering printCss({pageNumbers}) in
  isolation AND the full render() data flow incl. footerTemplate path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(make-pdf): decode HTML entities in titles and TOC to prevent double-escape

A markdown title like "# Herbert & Garry" rendered as "Herbert &amp;amp; Garry"
in <title>, cover block, and TOC entries. marked emits "&amp;" (correct HTML),
but extractFirstHeading and extractHeadings only stripTags — leaving the entity
intact. That string then flows through escapeHtml, producing the double-encode.

- render.ts: new decodeTextEntities helper, distinct from decodeTypographicEntities
  (which runs on in-pipeline HTML and intentionally preserves &amp;). Covers
  named entities (lt/gt/quot/apos/39/x27/amp) AND numeric (decimal + hex) so
  inputs like "&#169;" or "&#x2014;" don't create the same partial-fix bug.
  Amp-last ordering prevents double-decode on "&amp;lt;" et al.
- Apply in both extractFirstHeading and extractHeadings. extractHeadings feeds
  buildTocBlock → escapeHtml, so the TOC site had the same bug.
- render.test.ts: 8 tests covering the contract — parameterized across &, <, >,
  ©, — chars; single-escape in <title>/cover; TOC double-escape check; numeric
  entity decode; smartypants-interacts-with-quotes contract (no raw equality).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(make-pdf): Liberation Sans font fallback for Linux rendering

On Linux (Docker, CI, servers), neither Helvetica nor Arial exist. Our CSS
stacks were falling through to DejaVu Sans — wider letterforms that look like
Verdana, not the intended Helvetica/Faber look. Liberation Sans is the standard
metric-compatible Arial clone (SIL OFL 1.1, apt package fonts-liberation).

- print-css.ts: all four font stacks (body + @top-center + @bottom-center +
  @bottom-right CONFIDENTIAL) gain "Liberation Sans" between Helvetica and
  Arial. File-header docblock updated to reflect the new stack.
- .github/docker/Dockerfile.ci: explicit apt-get install fonts-liberation +
  fontconfig with retry, fc-cache -f, and a verify step that fails the build
  loud if the font disappears. Playwright's install-deps happens to pull this
  in today but the dep is implicit and could silently regress.
- SKILL.md.tmpl: one-sentence note pointing Linux users at fonts-liberation.
- SKILL.md: regenerated via bun run gen:skill-docs --host all (only make-pdf's
  generated file changed — verified clean diff scope).
- render.test.ts: 2 assertions — Liberation Sans in body stack AND in at least
  one @page margin-box rule (proves all four intended stacks got touched, not
  just one).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.4.1.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: anonymize test fixtures, drop VC-partner framing

- CHANGELOG + render.test.ts fixtures use "Faber & Faber" instead of a
  personal name. Same regression coverage (ampersand in <title>, cover,
  TOC, body), neutral subject.
- make-pdf/SKILL.md.tmpl description drops the "send to a VC partner, a
  book agent, a judge, or Rick Rubin's team" line. "Not a draft artifact
  — a finished artifact" stands on its own without the audience posturing.
- SKILL.md regenerated.

No functional changes. All 58 make-pdf tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:32:58 +08:00
Garry Tan d0782c4c4d
feat(v1.4.0.0): /make-pdf — markdown to publication-quality PDFs (#1086)
* feat(browse): full $B pdf flag contract + tab-scoped load-html/js/pdf

Grow $B pdf from a 2-line wrapper (hard-coded A4) into a real PDF engine
frontend so make-pdf can shell out to it without duplicating Playwright:

- pdf: --format, --width/--height, --margins, --margin-*, --header-template,
  --footer-template, --page-numbers, --tagged, --outline, --print-background,
  --prefer-css-page-size, --toc. Mutex rules enforced. --from-file <json>
  dodges Windows argv limits (8191 char CreateProcess cap).
- load-html: add --from-file <json> mode for large inline HTML. Size + magic
  byte checks still apply to the inline content, not the payload file path.
- newtab: add --json returning {"tabId":N,"url":...} for programmatic use.
- cli: extract --tab-id flag and route as body.tabId to the HTTP layer so
  parallel callers can target specific tabs without racing on the active
  tab (makes make-pdf's per-render tab isolation possible).
- --toc: non-fatal 3s wait for window.__pagedjsAfterFired. Paged.js ships
  later; v1 renders TOC statically via the markdown renderer.

Codex round 2 flagged these P0 issues during plan review. All resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(resolvers): add MAKE_PDF_SETUP + makePdfDir host paths

Skill templates can now embed {{MAKE_PDF_SETUP}} to resolve $P to the
make-pdf binary via the same discovery order as $B / $D: env override
(MAKE_PDF_BIN), local skill root, global install, or PATH.

Mirrors the pattern established by generateBrowseSetup() and
generateDesignSetup() in scripts/resolvers/design.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(make-pdf): new /make-pdf skill + orchestrator binary

Turn markdown into publication-quality PDFs. $P generate input.md out.pdf
produces a PDF with 1in margins, intelligent page breaks, page numbers,
running header, CONFIDENTIAL footer, and curly quotes/em dashes — all on
Helvetica so copy-paste extraction works ("S ai li ng" bug avoided).

Architecture (per Codex round 2):
  markdown → render.ts (marked + sanitize + smartypants) → orchestrator
    → $B newtab --json → $B load-html --tab-id → $B js (poll Paged.js)
    → $B pdf --tab-id → $B closetab

browseClient.ts shells out to the compiled browse CLI rather than
duplicating Playwright. --tab-id isolation per render means parallel
$P generate calls don't race on the active tab. try/finally tab cleanup
survives Paged.js timeouts, browser crashes, and output-path failures.

Features in v1:
  --cover              left-aligned cover page (eyebrow + title + hairline rule)
  --toc                clickable static TOC (Paged.js page numbers deferred)
  --watermark <text>   diagonal DRAFT/CONFIDENTIAL layer
  --no-chapter-breaks  opt out of H1-starts-new-page
  --page-numbers       "N of M" footer (default on)
  --tagged --outline   accessible PDF + bookmark outline (default on)
  --allow-network      opt in to external image loading (default off for privacy)
  --quiet --verbose    stderr control

Design decisions locked from the /plan-design-review pass:
  - Helvetica everywhere (Chromium emits single-word Tj operators for
    system fonts; bundled webfonts emit per-glyph and break extraction).
  - Left-aligned body, flush-left paragraphs, no text-indent, 12pt gap.
  - Cover shares 1in margins with body pages; no flexbox-center, no
    inset padding.
  - The reference HTMLs at .context/designs/*.html are the implementation
    source of truth for print-css.ts.

Tests (56 unit + 1 E2E combined-features gate):
  - smartypants: code/URL-safe, verified against 10 fixtures
  - sanitizer: strips <script>/<iframe>/on*/javascript: URLs
  - render: HTML assembly, CJK fallback, cover/TOC/chapter wrap
  - print-css: all @page rules, margin variants, watermark
  - pdftotext: normalize()+copyPasteGate() cross-OS tolerance
  - browseClient: binary resolution + typed error propagation
  - combined-features gate (P0): 2-chapter fixture with smartypants +
    hyphens + ligatures + bold/italic + inline code + lists + blockquote
    passes through PDF → pdftotext → expected.txt diff

Deferred to Phase 4 (future PR): Paged.js vendored for accurate TOC page
numbers, highlight.js for syntax highlighting, drop caps, pull quotes,
two-column, CMYK, watermark visual-diff acceptance.

Plan: .context/ceo-plans/2026-04-19-perfect-pdf-generator.md
References: .context/designs/make-pdf-*.html

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(build): wire make-pdf into build/test/setup/bin + add marked dep

- package.json: compile make-pdf/dist/pdf as part of bun run build; add
  "make-pdf" to bin entry; include make-pdf/test/ in the free test pass;
  add marked@18.0.2 as a dep (markdown parser, ~40KB).
- setup: add make-pdf/dist/pdf to the Apple Silicon codesign loop.
- .gitignore: add make-pdf/dist/ (matches browse/dist/ and design/dist/).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(make-pdf): matrix copy-paste gate on Ubuntu + macOS

Runs the combined-features P0 gate on pull requests that touch make-pdf/
or browse's PDF surface. Installs poppler (macOS) / poppler-utils (Ubuntu)
per OS. Windows deferred to tolerant mode (Xpdf / Poppler-Windows
extraction variance not yet calibrated against the normalized comparator —
Codex round 2 #18).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(skills): regenerate SKILL.md for make-pdf addition + browse pdf flags

bun run gen:skill-docs picks up:
  - the new /make-pdf skill (make-pdf/SKILL.md)
  - updated browse command descriptions for 'pdf', 'load-html', 'newtab'
    reflecting the new flag contract and --from-file mode

Source of truth stays the .tmpl files + COMMAND_DESCRIPTIONS;
these are regenerated artifacts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): repair stale test expectations + emit _EXPLAIN_LEVEL / _QUESTION_TUNING from preamble

Three pre-existing test failures on main were blocking /ship:

- test/skill-validation.test.ts "Step 3.4 test coverage audit" expected the
  literal strings "CODE PATH COVERAGE" and "USER FLOW COVERAGE" which were
  removed when the Step 7 coverage diagram was compressed. Updated assertions
  to check the stable `Code paths:` / `User flows:` labels that still ship.

- test/skill-validation.test.ts "ship step numbering" allowed-substeps list
  didn't include 15.0 (WIP squash) and 15.1 (bisectable commits) which were
  added for continuous checkpoint mode. Extended the allowlist.

- test/writing-style-resolver.test.ts and test/plan-tune.test.ts expected
  `_EXPLAIN_LEVEL` and `_QUESTION_TUNING` bash variables in the preamble but
  generate-preamble-bash.ts had been refactored and those lines were dropped.
  Without them, downstream skills can't read `explain_level` or
  `question_tuning` config at runtime — terse mode and /plan-tune features
  were silently broken.

Added the two bash echo blocks back to generatePreambleBash and refreshed
the golden-file fixtures to match. All three preamble-related golden
baselines (claude/codex/factory) are synchronized with the new output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.4.0.0)

New /make-pdf skill + $P binary.

Turn any markdown file into a publication-quality PDF. Default output is
a 1in-margin Helvetica letter with page numbers in the footer. `--cover`
adds a left-aligned cover page, `--toc` generates a clickable table of
contents, `--watermark DRAFT` overlays a diagonal watermark. Copy-paste
extraction from the PDF produces clean words, not "S a i l i n g"
spaced out letter by letter. CI gate (macOS + Ubuntu) runs a combined-
features fixture through pdftotext on every PR.

make-pdf shells out to browse rather than duplicating Playwright.
$B pdf grew into a real PDF engine with full flag contract (--format,
--margins, --header-template, --footer-template, --page-numbers,
--tagged, --outline, --toc, --tab-id, --from-file). $B load-html and
$B js gained --tab-id. $B newtab --json returns structured output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): rewrite v1.4.0.0 headline — positive voice, no VC framing

The original headline led with "a PDF you wouldn't be embarrassed to send
to a VC": double-negative voice and audience-too-narrow. /make-pdf works
for essays, letters, memos, reports, proposals, and briefs. Framing the
whole release around founders-to-investors misses the wider audience.

New headline: "Turn any markdown file into a PDF that looks finished."
New tagline: "This one reads like a real essay or a real letter."

Positive voice. Broader aperture. Same energy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:20:30 +08:00