gstack/make-pdf/test/fixtures/combined-gate.md

The Horizon

This is the combined-features fixture. Every feature is turned on simultaneously. The gate asserts that all of these paragraphs extract cleanly from the PDF with pdftotext.
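A minimal sketch of the kind of check such a gate might run, assuming the text has already been extracted (for example with `pdftotext file.pdf -`) and that paragraphs are compared after collapsing whitespace. The names `normalize` and `missing_paragraphs` are hypothetical and are not taken from the test suite.

```python
import re


def normalize(text: str) -> str:
    """Collapse every whitespace run (including line wraps) to a single space."""
    return re.sub(r"\s+", " ", text).strip()


def missing_paragraphs(extracted: str, expected: list[str]) -> list[str]:
    """Return the expected paragraphs that do not appear intact in the extracted text."""
    haystack = normalize(extracted)
    return [p for p in expected if normalize(p) not in haystack]


# A sentence wrapped across two lines by the extractor still matches.
sample_extracted = "None should fragment text\non copy-paste.\n"
sample_expected = ["None should fragment text on copy-paste."]
print(missing_paragraphs(sample_extracted, sample_expected))  # prints []
```

Returning the list of missing paragraphs, rather than a bare boolean, lets a failing gate report exactly which paragraph did not survive extraction.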

A paragraph with bold, italic, and inline code tokens — each of which gets a different HTML treatment. None should fragment text on copy-paste.

A paragraph with "curly quotes", 'single quotes', an em dash -- like this, and an ellipsis... All of these get smartypants transforms.
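For illustration, the substitutions involved can be sketched as below. This is a simplified stand-in and not the smartypants implementation the renderer actually uses; the function name `smarten` is hypothetical, and the real transform handles many more edge cases (nested quotes, apostrophes in years, and so on).

```python
import re


def smarten(text: str) -> str:
    """Apply a simplified subset of smartypants-style substitutions."""
    text = text.replace("...", "\u2026")  # three dots -> ellipsis character
    text = text.replace("--", "\u2014")   # double hyphen -> em dash character
    # "x" -> left and right curly double quotes
    text = re.sub(r'"([^"]*)"', "\u201c\\1\u201d", text)
    # 'x' -> curly single quotes; the lookarounds skip apostrophes inside words
    text = re.sub(r"(?<!\w)'([^']*)'(?!\w)", "\u2018\\1\u2019", text)
    return text
```

Because the PDF contains the transformed characters, an extraction check would need to compare against the output of the transform rather than the raw source text.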

A subsection heading

Lists must not break mid-item:

  • First list item with some words that keep it on one line.
  • Second list item with more words.
  • Third list item.

A blockquote from Van Dyke. Her diminished size is in me, not in her.

A second chapter

This content begins on a fresh page because the default chapter-breaks rule fires. Extraction must still find these paragraphs.
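One way to check the page break from extracted text is sketched below. It relies on pdftotext separating pages with a form feed character (unless `-nopgbrk` is passed) and assumes the page carries no running header ahead of the heading. The helper names `pages` and `starts_a_page` are hypothetical.

```python
def pages(extracted: str) -> list[str]:
    """Split pdftotext output into non-empty pages on the form feed character."""
    return [p for p in extracted.split("\f") if p.strip()]


def starts_a_page(extracted: str, heading: str) -> bool:
    """True if some page begins with the heading, ignoring leading whitespace."""
    return any(p.lstrip().startswith(heading) for p in pages(extracted))


# Two pages, each terminated by a form feed.
sample_pages = "The Horizon\n\nIntro text.\n\fA second chapter\n\nMore text.\n\f"
print(starts_a_page(sample_pages, "A second chapter"))  # prints True
```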

A final paragraph with enough words to trigger hyphenation across the line wrap boundary. Extraordinary words sometimes hyphenate. Interdisciplinary ones certainly do.
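If hyphenation does split a word, the extracted text may carry a soft hyphen or a hyphen followed by a line break, depending on how the renderer emits it. The sketch below shows one possible normalization before comparison. The name `dehyphenate` is hypothetical, and the rule is deliberately naive: it would also join a genuine compound such as "copy-paste" if the line happened to wrap at its hyphen.

```python
import re

SOFT_HYPHEN = "\u00ad"


def dehyphenate(text: str) -> str:
    """Remove soft hyphens and rejoin words split by a hyphen at a line end."""
    text = text.replace(SOFT_HYPHEN, "")
    # "Extra-\nordinary" -> "Extraordinary"
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)
```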