gstack/make-pdf/test/fixtures/combined-gate.expected.txt

The Horizon
This is the combined-features fixture. Every feature turned on simultaneously. The gate asserts that all of these paragraphs extract cleanly from the PDF with pdftotext.
A paragraph with bold, italic, and inline code tokens — each of which gets a different HTML treatment. None should fragment text on copy-paste.
A paragraph with “curly quotes”, single quotes, an em dash — like this, and an ellipsis… All three get smartypants transforms.
A subsection heading
First list item with some words that keep it on one line.
Second list item with more words.
Third list item.
A blockquote from Van Dyke. Her diminished size is in me, not in her.
A second chapter
This content begins on a fresh page because the default chapter-breaks rule fires. Extract must still find these paragraphs.
A final paragraph with enough words to trigger hyphenation across the line wrap boundary. Extraordinary words sometimes hyphenate. Interdisciplinary ones certainly do.