Releases: jgm/pandoc
pandoc 3.6
-
Add `mdoc` as input format (Evan Silberman). This change introduces a reader for mdoc, a roff-derived semantic markup language for manual pages. This reader has been developed almost exclusively against mandoc’s documentation and implementation of mdoc as a reference, and the real-world manual pages tested against are those from the OpenBSD base system. Of ~3500 manuals in mdoc format shipped with a fresh OpenBSD install, 17 cause the mdoc reader to exit with a parse error. Any further chasing of edge cases is deferred to future work.
-
New module: Text.Pandoc.Readers.Mdoc, exporting `readMdoc` [API change].
-
Issue warnings for duplicate YAML metadata keys (#10312).
-
Ensure that `--sandbox` affects `--embed-resources`. Previously it did not (contrary to what was implied by the manual), which means that an image with URL `/etc/passwd` would leak an encoded version of that file to HTML output with `--self-contained` or `--embed-resources`, even if `--sandbox` was used. Thanks to Samuel Mortenson for pointing out the issue.
-
Text.Pandoc.App.OutputSettings: add `sandbox'` function. This computes the sandboxed files from Opt and avoids code repetition.
-
Docx reader:
- Parse index references as empty spans with attributes (#10171). Attributes included are `entry`, and optionally `bold`, `italic`, `yomi`, `see`.
- Don’t create multiple paragraphs for title or subtitle (#10359). If there are multiple paragraphs with Title or Subtitle style, use only the first for metadata.
- Handle case where Zotero `itemData` has different id from the `citationItem` id. In this case we use the `citationItemId` in the bibliography as well, overriding the `referenceId` in the itemData (#10366).
-
LaTeX reader:
- Put parsed minipage in specially marked Div (#10266).
-
HTML reader:
- Parse footnotes defined by dpub-aria roles (#5294).
-
MediaWiki reader:
-
Typst reader:
-
Commonmark reader:
- `implicit_figures` should check for empty caption and not produce an implicit figure in this case (#10429).
-
RST reader:
- Use a new one-pass parsing strategy. Instead of having an initial pass where we collect reference definitions, we create links with target `##SUBST##something` or `##REF##something` or `##NOTE##something`, and resolve these in a pass over the parsed AST. This allows us to handle link references that are not at the top level (#10281).
- Ignore newlines in URL in explicit link (#10279).
- Handle block level substitutions.
- Support `:file:` on raw directive (#8584).
- Implement option lists (#10318).
- Avoid putting metadata in Para (#7766). Create MetaInlines when possible, just as with markdown input. MetaBlocks is still used when there are multiple paragraphs or non-paragraph content. This change also affects field lists.
- Fix linked substitutions (#6588). E.g. `|Python|_`.
- Support inline anchors (#9196).
- Explicit links define references (#5081). For example, ``Go to `g`_ `g <www.example.com>`_.`` should produce two links to www.example.com.
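The placeholder-and-resolve idea behind the one-pass RST strategy can be sketched roughly as follows. This is an illustration in Python, not pandoc's Haskell implementation: the `Link`, `Para`, and `BlockQuote` classes, the `resolve` function, and the `refs` dictionary are all hypothetical names, and only the `##REF##` prefix is taken from the entry above.

```python
from dataclasses import dataclass

REF_PREFIX = "##REF##"  # placeholder prefix named in the changelog entry


@dataclass
class Link:
    label: str
    target: str


@dataclass
class Para:
    inlines: list  # plain strings or Link objects


@dataclass
class BlockQuote:
    blocks: list  # nested blocks, so references need not be top-level


def resolve(node, refs):
    """Return a copy of `node` with ##REF## placeholders replaced.

    `refs` maps reference names to URLs; in this sketch it is assumed to
    have been collected during the same parsing pass that built the tree.
    """
    if isinstance(node, Link):
        if node.target.startswith(REF_PREFIX):
            name = node.target[len(REF_PREFIX):]
            # Assumption: unknown references keep their placeholder target.
            return Link(node.label, refs.get(name, node.target))
        return node
    if isinstance(node, Para):
        return Para([resolve(inline, refs) for inline in node.inlines])
    if isinstance(node, BlockQuote):
        return BlockQuote([resolve(block, refs) for block in node.blocks])
    return node  # plain text is left untouched


# A link whose definition was found later, inside a block quote.
document = [
    Para(["See the ", Link("docs", "##REF##docs")]),
    BlockQuote([Para(["docs is defined down here"])]),
]
refs = {"docs": "https://example.com/docs"}
resolved = [resolve(block, refs) for block in document]
```

Because the lookup happens after the whole tree exists, it does not matter where in the document the definition appeared, which is the property the entry attributes to the new strategy.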
-
EPUB writer:
- Use standardized filename for cover image instead of the original name (#10404). This avoids problems with e.g. filenames containing spaces.
-
Markdown writer:
- Issue INFO warning when not rendering table, e.g., when `raw_html` is disabled and the table can’t be fit into a supported markdown table format (#10407).
- Respect empty LineBlock lines in `plain` output (Evan Silberman). The plain writer behaved as a markdown variant with `Ext_line_blocks` turned off, and so empty lines in a line block would get eliminated.
-
LaTeX writer:
Ensure that beamer footnotes go on frame, not column (#5769).
-
HTML writer:
- Unwrap empty incremental divs (#10328, Albert Krewinkel). Divs are unwrapped if the only purpose of the div seems to be to control whether lists are presented incrementally on slides.
-
Typst writer:
- Make template sensitive to a `page-numbering` variable (#10370). This can be set to an empty string (or, in metadata, to false) for no page numbers.
- Make `smart` extension work (#10271). If `smart` is not enabled, a command in the default template will disable smartquote substitutions. When `smart` is enabled, render curly apostrophes as straight and escape straight apostrophes. When `smart` is disabled, render curly apostrophes as curly and don’t escape straight apostrophes. Similarly for quotes, em and en dashes. This should give more idiomatic typst output, with fewer unnecessary escapes.
-
ANSI writer:
- Respect empty LineBlock lines (Evan Silberman).
-
JATS writer:
- Correct spelling of suppress attribute (#10350, Andreas Deininger).
-
Typst template:
- Remove `definitions.typst` partial.
- Remove unnecessary definition of `endnote`.
- Incorporate the one remaining definition into `default.typst`.
- Use typst 0.12 code for two column layout (#10294, Luis Rivera).
- Note: the new templates presuppose typst 0.12; if you try to use an earlier version of typst, an error will be raised.
-
LaTeX/Beamer template:
- Split `fonts.latex` partial into two parts: `fonts.latex` and `font-settings.latex`.
- In beamer template, load beamer theme between `fonts.latex` and `font-settings.latex`. This allows a theme (such as metropolis) to set its own default font, while still allowing the user to override it. This fixes a regression in pandoc 3.5 (#10297).
- Note: Users who have custom templates based on pandoc 3.5 templates will need to add `font-settings.latex()` after `fonts.latex()` in the latex template. In a beamer template, the beamer theme-setting code needs to be moved between these two partials.
-
ConTeXt template: Ensure that font names don’t wrap (#10305).
-
`epub.css`: remove background-color (#10264, Suraj Patil). With this greyish background color, epubs look bad on a Kindle (#10263).
-
Text.Pandoc.ImageSize: add WebP support (Evan Silberman, #10397). Add `Webp` constructor on ImageType [API change].
-
Text.Pandoc.Readers.Roff and a new unexported module Text.Pandoc.Readers.Roff.Escape: parameterize Roff escaping (Evan Silberman) [API change]. This allows code to be reused between the mdoc and man readers, despite the differing Token types.
-
Text.Pandoc.PDF:
-
Text.Pandoc.Logging: add `YamlWarning` constructor to `LogMessage` [API change] (#10312).
-
Text.Pandoc.Format: remove duplicate typst entry (#10388, Caleb Mclennan).
-
Fix a typo in the `ua.yaml` localization for “See” (Jens).
-
Lua subsystem (Albert Krewinkel):
- Remove prefixes from Lua type names (#8574). Lua type names were inconsistent with regard to the use of prefixes; all prefixes are removed now, and Lua types now have the same name as the Haskell types. The use of app-specific prefixes is suggested by the Lua manual to avoid collisions. However, this shouldn’t be a problem with pandoc, as it cannot be used as a Lua package.
-
doc/libraries.md: Add newly developed Haskell packages. Sort list alphabetically (Albert Krewinkel).
-
doc/lua-filters.md: document `pandoc.List:iter` method (Albert Krewinkel). List objects have a new function `iter` that returns an iterator function that returns the next list item on each call.
-
MANUAL.txt:
-
Fix comments in TEI writer referring to DocBook (#10430, Evan Silberman).
-
Fix several typos in documentation (#10349, Andreas Deininger).
-
Allow Diff 1.0.
-
Add font-settings.latex partial to pandoc.cabal (#10379).
-
Bump upper bound for data-default.
-
Use latest typst, texmath, pandoc-lua-marshal, commonmark-pandoc, commonmark-extensions, skylighting, skylighting-format-blaze-html.
pandoc 3.5
-
Add command-line options `--list-of-figures`/`--lof` and `--list-of-tables`/`--lot` (#10029, Akash Patel). Only docx, latex, and context are affected by these options currently. Setting the `lof` and `lot` variables will also work for the formats that are currently supported.
-
Defaults files: interpolation of environment variables now works for `to` and `from` fields (#8024). This is needed because these files can contain paths of custom readers/writers.
-
Docx reader:
- Reset lists after headers in same list `numId` (#10258). To accomplish this, we add a Heading constructor to BodyPart and include on it all the information list items have.
-
DocBook reader:
- Parse id, class, and tabstyle on tables (#10181, Erik Rask). Add parsing of id (xml:id), class, and tabstyle XML attributes for table and informaltable in the DocBook reader. The tabstyle value is put in the “custom-style” attribute.
-
Dokuwiki reader:
- Be more forgiving about misaligned lists, like dokuwiki itself (#8863).
- Improve blockquote parsing in dokuwiki. Allow for quoted code blocks.
- Enable smart extension.
- Properly parse `--` and `---` as dashes.
- Fix block quote behavior (#6461). Blockquotes are not really block containers in DokuWiki; the lines are interpreted literally (so, e.g., you can’t start a list), and line breaks are added at the ends.
-
EPUB reader:
- Fix links to other files in the EPUB, making them internal links to a fragment derived from the filename (#10207). There was already code to handle links like `#foo`, but not to handle links like `ch0001.html#foo`.
-
LaTeX reader:
- Add em, ex, px, mu to list of units for dimension args (#10212).
-
ANSI writer:
- Fix subscripts (Evan Silberman).
-
DokuWiki writer:
- Don’t emit `<HTML>` tags (#7413). The use of these tags is now strongly discouraged for security reasons, and will be removed. We previously used them as a fallback for lists that could not be represented using DokuWiki syntax, e.g. ordered lists with fancy numbers or lists with multiple blocks in their items. We also used them for block quotes with multiple blocks as their contents. We now use the `<WRAP>` syntax (from the optional WRAP plugin) to handle lists with multiple blocks as their contents. A new method of handling block quotes with complex contents has the side benefit of also handling nested block quotes, which weren’t supported before. `<HTML>` and `<html>` tags are now emitted only for raw HTML blocks and inlines, and only if the `raw_html` extension is enabled. (It is now a valid extension for `dokuwiki`, though off by default.)
-
Docx writer:
- Support `--list-of-figures` and `--list-of-tables` (or `lof` and `lot` variables) (Akash Patel).
-
HTML writer:
- Don’t emit missing title/lang warnings if the template does not contain the `pagetitle` or `lang` variables respectively (#9370).
-
LaTeX writer:
- Better fix for lists in definition lists (#10241). In commit a26ec96 we added an empty `\item[]` to the beginning of a list that occurs first in a definition list, to avoid having one item on the line with the label. This gave bad results in some cases (#10241) and there is a more idiomatic solution anyway: using `\hfill`.
- Avoid error on `refs` div with empty citations (#10185). If there are no citations, don’t emit an empty CSLReferences environment.
-
RST writer:
- Change bullet list hang from 3 to 2. This accords with the style in the RST reference docs.
- Handle cases where indented context starts with block quote (#10236). In these cases we emit an empty comment to fix the point from which indentation is measured; otherwise the block quote is not parsed as a block quote. This affects list items and admonitions.
- Don’t enclose the list table in a `.. table::`; this leads to doubled captions (#10226).
- Fix alignment of list table items corresponding to cells (#10227).
-
JATS template:
- Support `floats-group` (Albert Krewinkel, see #10196). The content of the `floats-group` variable is now rendered in a `<floats-group>` element when using the publishing or archiving tag sets.
-
LaTeX and Beamer templates:
- Split old default.latex into two templates, `default.latex` and `default.beamer`, factoring common parts into partials: `fonts.latex`, `common.latex`, `passoptions.latex`, `hypersetup.latex`, `after-header-includes.latex`.
- Make `default.beamer` the default template for beamer.
- Add `shorttitle`, `shortsubtitle`, `shortauthor`, `shortinstitute`, `shortdate` variables to beamer template (#10248, Thomas Hodgson).
- Make `--number-sections` work with beamer (#12045, Thomas Hodgson).
- Support a list of images for `titlegraphic` in beamer template (#10246, Thomas Hodgson). Title graphic options will be applied to each title graphic. Images will be separated by `\enspace`.
- Beamer theme options (#10243)
- Add theme options to beamer template: `colorthemeoptions`, `fontthemeoptions`, `innerthemeoptions`, `outerthemeoptions` (#10243, Thomas Hodgson).
- Don’t load amsmath, amssymb in beamer template. These are loaded by beamer automatically.
-
Text.Pandoc.SelfContained:
- Improve handling of links to remote CSS (#10261).
-
Text.Pandoc.Class:
- Allow extracting `data:` URIs even in PandocPure (`--sandbox`) (#10249).
- Export `extractURIData` [API change].
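As a rough illustration of why `data:` URIs can be handled even under `--sandbox`, the sketch below decodes one using nothing but the URI text itself, with no file or network access. It is written in Python and is not the code behind `extractURIData`; the function name `extract_data_uri` and its return shape are assumptions made for this example.

```python
import base64
from urllib.parse import unquote_to_bytes


def extract_data_uri(uri):
    """Return (mime_type, payload_bytes) for a data: URI, else None.

    Hypothetical helper: everything needed is inside the URI string,
    so no I/O is performed.
    """
    if not uri.startswith("data:"):
        return None
    header, sep, body = uri[len("data:"):].partition(",")
    if not sep:
        return None  # malformed: no comma between header and payload
    params = header.split(";")
    mime = params[0] or "text/plain"  # default media type when omitted
    if "base64" in params[1:]:
        payload = base64.b64decode(body)
    else:
        payload = unquote_to_bytes(body)  # percent-encoded payload
    return mime, payload


print(extract_data_uri("data:text/plain;base64,aGVsbG8="))
```

Anything that is not a `data:` URI yields `None` here, which stands in for the case where a sandboxed run would refuse to fetch the resource.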
-
Text.Pandoc.PDF:
- Read `.toc` and `.log` files from output directory (#10186). When this is different from the input directory, this is where `.toc` and `.log` files are written.
-
Text.Pandoc.Shared:
- Modify `addPandocAttributes` for changes in commonmark-pandoc. The new commonmark-pandoc version automatically adds the attribute `wrapper="1"` on all Divs and Spans that are introduced just as containers for attributes that belong properly to their contents. So we don’t need to add the attribute here. This gives much better results in some cases. Previously the wrapper attribute was being added even for explicit Divs and Spans in djot, but it is not needed in these cases.
-
Text.Pandoc.Options:
- Add `writerListOfFigures` and `writerListOfTables` fields to `WriterOptions` (#8245, Akash Patel). [API change]
-
Text.Pandoc.App:
- Add `optListOfFigures` and `optListOfTables` to `Opt` (#8245) [API change].
-
Lua subsystem (Albert Krewinkel):
-
Update List module (#9835). The module now comes with a method `:at(index[, def])` that gives access to items by index, accepts negative indices to count from the end, and will return the `def` value as a default if the list has no item at the given position. Furthermore, the list constructor `pandoc.List` now accepts iterators. E.g., `pandoc.List(text:gmatch '%S+')` returns the list of words in `text`.
-
Support character styling via `pandoc.layout`. The `Doc` values produced and handled by the `pandoc.layout` module can now be styled using `bold`, `italic`, `underlined`, or `strikeout`. The style is ignored in normal rendering, but becomes visible when rendering to ANSI output. The `pandoc.layout.render` function now takes a third parameter that defines the output style, either plain or ansi.
-
It is now possible to return a single filter from a filter file, e.g.

    -- Switch single- and double quotes
    return {
      Quoted = function (elem)
        elem.quotetype = elem.quotetype == 'SingleQuote'
          and 'DoubleQuote' or 'SingleQuote'
        return elem
      end
    }

The filter must not contain numerical indexes, or it might be treated as a list of filters.
-
Add `list_of_figures` and `list_of_tables` to writer options (Akash Patel).
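The indexing rules described for the List module's `:at(index[, def])` method earlier in this section can be modelled as below. This is a Python stand-in for the semantics only, not the Lua implementation: the function name `at` is hypothetical, and the treatment of index `0` (returning the default) is an assumption, since the entry does not mention it.

```python
def at(items, index, default=None):
    """Model of a 1-based lookup with negative indices and a default.

    Mirrors the behaviour described in the changelog entry; `items` is an
    ordinary Python list standing in for a pandoc List.
    """
    length = len(items)
    if index < 0:
        index = length + index + 1  # -1 refers to the last item
    if 1 <= index <= length:
        return items[index - 1]  # convert the 1-based index to 0-based
    return default  # assumption: out-of-range and 0 both give the default


words = ["alpha", "beta", "gamma"]
first = at(words, 1)
last = at(words, -1)
missing = at(words, 7, "n/a")
```

The point of the default argument is that a caller can ask for a position that may not exist without first checking the list length.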
-
-
Use latest releases of commonmark, commonmark-pandoc, texmath, djot.
-
Stop depending on package SHA (Albert Krewinkel). Use `crypton` instead.
-
`linux/make_artifacts.sh`: add riscv64 support (Olivier Benz).
-
Fix invalid XML in `test/docx/normalize.docx` (#10242).
-
`doc/lua-filters.md`: list functions in `pandoc.utils` alphabetically (Albert Krewinkel).
-
MANUAL.txt:
pandoc 3.4
-
New output format: `ansi` (for formatted console output) (Evan Silberman). Most Pandoc elements are supported and printed in a reasonable way, if not always ideally. This version does no detection of terminal capabilities, nor does it fall back to different output styles for less-capable terminals.
-
Add command line options `--table-caption-position` and `--figure-caption-position`. These allow the user to specify whether to put captions above or below tables and figures, respectively. The following output formats are supported: HTML (and related such as EPUB), LaTeX (and Beamer), Docx, ODT/OpenDocument, Typst.
-
Change default `--pdf-engine` via HTML to WeasyPrint (#10142). `wkhtmltopdf` is deprecated. `weasyprint` is the easiest-to-install, maintained alternative. For better results, one might prefer `pagedjs-cli`.
-
Org reader:
- Fix parsing of src blocks with an `-i` flag (#10071, Albert Krewinkel). Tabs are now preserved in the contents of src blocks if the block has the `-i` flag.
-
RTF reader:
- Handle images inside `shp` contexts (#10145).
-
RST reader:
-
Improve simple table support (#10093). Multiline rows occur only when the first cell is empty; we were previously treating lines with any empty cell as row continuations. In addition, we no longer wrap multiline cells in Para if they can be represented as Plain. This is consistent with docutils behavior.
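The row-continuation rule for simple tables stated above (a line continues the previous row only when its first cell is empty) might be sketched like this. The Python function below is an illustration under assumptions, not the reader's code: `merge_continuation_rows` is a hypothetical name, rows are assumed to be already split into cell strings, and joining continued text with a newline is a choice made for the example.

```python
def merge_continuation_rows(rows):
    """Fold lines whose first cell is empty into the preceding row.

    A line with some other empty cell still starts a new row, which is
    the distinction the changelog entry draws.
    """
    merged = []
    for row in rows:
        is_continuation = bool(merged) and row[0].strip() == ""
        if is_continuation:
            previous = merged[-1]
            merged[-1] = [
                old + "\n" + new.strip() if new.strip() else old
                for old, new in zip(previous, row)
            ]
        else:
            merged.append([cell.strip() for cell in row])
    return merged


lines = [
    ["a", "first line"],
    ["", "second line"],  # first cell empty: continues row "a"
    ["b", ""],            # second cell empty: still a new row
    ["c", "x"],
]
table = merge_continuation_rows(lines)
```

Under the earlier behaviour described in the entry, the `["b", ""]` line would also have been treated as a continuation; here it becomes its own row.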
-
LaTeX reader:
-
Typst reader:
- Change how “block” elements are handled. Previously they were always parsed as divs. But actually they can occur in some “inline” contexts. Now we first try to parse them as inlines, and only as blocks if that fails. A surrounding Div or Span element is added only if there is an identifier.
-
HTML reader:
- Only parse main element’s contents (if present) (#10140). If main has an id or class, we include a div with that id or class; otherwise just the contents.
- Read TeX annotation in MathML content if present (#9971).
- Better handle KaTeX-generated math (#9971). KaTeX emits the mathml followed by a span with an HTML fallback. Previously pandoc was converting both. We now ignore the HTML fallback span, marked with class `katex-html`.
-
New module: Text.Pandoc.Writers.ANSI [API change] (Evan Silberman).
-
Docx writer:
- Add “SuppressAuthor” and “AuthorOnly” to citationMode when `+citations` is used (thomjur).
- Support `custom-style` attribute for docx table (Sebbones).
- Support `--number-offsets`.
- Make table/figure rendering sensitive to caption position settings.
-
OpenDocument writer:
- Make table/figure rendering sensitive to caption position settings.
-
Typst writer/template:
- Implement figure caption positions by triggering a show rule in the default template, which determines caption positions for figures and tables globally.
- Don’t include trailing semicolon after `@` style citations with suffixes (#10148).
- Template: move header-includes before show doc (#9996, Gordon Woodhull).
-
LaTeX writer:
-
HTML writer/template:
- Make `<figcaption>` placement sensitive to caption position settings. For tables, `<caption>` must be the first element, and positioning is determined by CSS, so here we set a variable which the default template is sensitive to.
- Use `makeSectionsWithOffsets` for `writerNumberOffsets`, instead of the old, inefficient code.
- Don’t add doc-biblioref role to every link in a citation; only to links to the bibliography (#10156).
- Add `data-` when rendering `label` attribute (#10048).
-
Markdown writer:
-
Avoid emitting markdown caption if table has fallen back to raw HTML, which will then contain a `<caption>` tag (#10094).
-
Make math sensitive to `tex_math_gfm` extension (#9121). This means that in GFM output, the “new style” math will be used by default, e.g.

    $`x=y`$

    ```math
    x = y
    ```

To defeat this and get the older behavior, namely

    $x=y$

    $$x=y$$

one could use `-t gfm-tex_math_gfm`.
-
-
AsciiDoc writer:
- Add `link:` prefix when needed (#10105). AsciiDoc requires it except for `http`, `https`, `irc`, `mailto`, `ftp` schemes (#10105).
- Preserve original base level (#10062). We used to normalize so that the base level is always 1, but asciidoc no longer seems to care about that, and the behavior creates difficulties when we are converting fragments.
- Don’t emit empty figure caption (#10047).
-
ODT writer:
- Add TableCaption to styles.xml (#10058, Ian Max Andolina).
-
LaTeX template:
- Fix wrong beamer color in (sub)section page (Jonathan).
-
Text.Pandoc.Options:
- Add `CaptionPosition` and new `WriterOptions` fields `writerFigureCaptionPosition` and `writerTableCaptionPosition` [API change].
-
Text.Pandoc.Opt:
- Change default for optNumberOffset to `[]`. This behaves the same as `[0,0,0,0,0]`.
- Add `Opt` fields `optFigureCaptionPosition` and `optTableCaptionPosition` [API change].
-
Text.Pandoc.Format: change `formatFromFilePaths` so that it is smarter about URLs. URLs are parsed, and we take the format from the path component, if present (#10141). This means that `https://emacs.org/` will be treated as HTML, while `https://emacs.org/sample.org` will be treated as Org.
-
Text.Pandoc.URI:
- Add unofficial `gemini:` to list of URI schemes (Pau RE).
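The URL handling described above for `formatFromFilePaths` (parse the URL, then take the extension from its path component only) can be illustrated with a short Python sketch. It is not pandoc's Haskell code: `format_from_source`, the tiny `EXTENSION_FORMATS` table, and the `url_default` parameter are hypothetical names introduced for this example.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical, deliberately small extension table for the example.
EXTENSION_FORMATS = {
    ".org": "org",
    ".md": "markdown",
    ".rst": "rst",
    ".html": "html",
}


def format_from_source(source, url_default="html"):
    """Guess an input format from a file path or an http(s) URL.

    For URLs only the path component is inspected, so a host name that
    happens to end in ".org" is never mistaken for a file extension.
    """
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        extension = posixpath.splitext(parsed.path)[1].lower()
        return EXTENSION_FORMATS.get(extension, url_default)
    extension = posixpath.splitext(source)[1].lower()
    return EXTENSION_FORMATS.get(extension)  # None when unknown


print(format_from_source("https://emacs.org/"))
print(format_from_source("https://emacs.org/sample.org"))
```

The two calls correspond to the two URLs in the entry: the first has no extension in its path and falls back to the assumed URL default, while the second takes `org` from `sample.org`.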
-
Text.Pandoc.Shared:
- Add `makeSectionsWithOffsets` [API change].
- Remove `stripEmptyParagraphs` [API change] (Albert Krewinkel). This function is no longer used.
-
Text.Pandoc.Highlighting: Expose `formatANSI` [API change] (Evan Silberman).
-
Text.Pandoc.Writers.Shared: export `to{Sub,Super}scriptInline` [API change] (Evan Silberman).
-
Remove use of partial functions (e.g. `head`) in code.
-
Use latest skylighting-core, skylighting, doclayout, texmath, typst.
-
pandoc-lua-engine: Add accessors for several writer options, including some that were added in previous releases.
-
pandoc-server: Initialize some missing fields in WriterOptions: `writerEpubTitlePage`, `writerChunkTemplate`, `writerListTables`, `writerFigureCaptionPosition`, `writerTableCaptionPosition`.
-
CONTRIBUTING.md: Summarize steps for adding a new cli option.
-
MANUAL.txt:
- Clarify that the `--number-offset` option should only directly affect numbering of the first section heading in a document; subsequent headings will increment normally.
- Fix asciidoc link (#10039).
- Fix CSL Docs broken link (#10100, Tristano Ajmone).
- Document the use of `luatexja` when CJKmainfont is used with lualatex (#3873, Kolen Cheung).
- Add a `citations` (typst) section to the manual (#9127).
- Clarify that `citations` affects both input and output for `org`.
- Add note on `--citeproc` that you may need to disable `citations` extension on the output format (e.g., `-t markdown-citations`) to see the rendered citation (#9127, #10012).
-
INSTALL.md: reorganise info on static binaries and add conda-forge install options (#10098, #10069, Ian Max Andolina).
pandoc 3.3
-
New cli option: `--link-images`. This causes images to be linked rather than embedded in ODT.
-
Allow `--number-sections` to take an optional `true|false` argument.
-
RTF reader:
- Handle `\*\shppict` without dropping image (#10025).
-
TWiki Reader:
- Recognize WikiWords as internal links (#9941).
- Avoid partial function.
-
Typst reader:
- Ignore “pad” and just parse its body (#9958).
- Use typst 0.5.0.5. Fixes parsing of equations like `$1.$`.
-
Docx writer:
- Fix regression with nested lists (#9994). The bug affects e.g. ordered lists with bullet sublists; after the sublist the top-level list reverts to bullets instead of being properly numbered. This is a regression introduced in version 3.2.1.
-
BibTeX writer:
- Ensure that “literal” names are enclosed in braces (#9987).
-
Man writer:
- Use default middle header when metadata does not include `header` (#9943). This change causes pandoc to omit the middle header parameter when `header` is not set, rather than emitting `""`. The parameter is optional and man will use a default based on the section if it is not specified.
-
HTML templates: don’t load polyfill (#9918). This was added in a period when MathJax required polyfill. MathJax no longer recommends this and polyfill should no longer be necessary on any reasonably modern browser.
-
Translations:
- Add `ua.yaml` (Jens Oehlschlägel).
- Add a script (`tools/update-translations.py`) and Makefile target (`update-translations`) to update translation data automatically from babel and polyglossia upstream (Stephen Huan).
- Use this script to update language data, increasing the number of languages we cover (Stephen Huan). Fix a few small bugs in existing translations.
-
Fix some mistakes with Japanese language code (#9938). In several places we were mistakenly assuming that the BCP 47 code for Japanese language was `jp`. It is `ja`.
-
Text.Pandoc.Options:
- New field in WriterOptions: `writerLinkImages` [API change] (#9815).
-
Text.Pandoc.App.Opt:
- New field in Opt: `optLinkImages` [API change] (#9815).
-
Lua subsystem:
-
Keep `lpeg` and `re` as “loaded” modules (Albert Krewinkel). The modules `lpeg` and `re` are now treated as if they had been loaded with `require`. Previously the modules were only assigned to global values, but could be loaded again via `require`, which made it possible to use a system-wide installation. However, this proved to be confusing.

The old behavior can be restored by adding the following lines to the top of Lua scripts, or to the `init.lua` in the data dir.

    debug.registry()['_LOADED'].lpeg = nil
    debug.registry()['_LOADED'].re = nil
-
-
`pandoc-cli`: Include pandoc copyright in Lua version info (Albert Krewinkel).
-
`pandoc-cli`: Refer printing of version info to the Lua interpreter (Albert Krewinkel). The Lua interpreter no longer terminates when called with `-v` or `--version` arguments, thus improving compatibility with the default `lua` interpreter program.
-
Avoid partial functions in JATS reader, DocBook writer, Haddock reader.
-
Allow tls 2.1.x.
-
MANUAL.txt:
- Make documentation of extensions clearer (#9060).
- Fix section level for two Extensions entries.
-
lua-filters.md: Partially autogenerate docs for module `pandoc` (Albert Krewinkel). The documentation system isn’t powerful enough to generate the full documentation automatically.
pandoc 3.2.1
-
Fix `gfm_auto_identifiers` to replace emojis with their aliases, as documented (#9876).
-
CSV reader:
- Turn line breaks into LineBreaks not SoftBreaks (#9797).
-
Docx reader:
- Support task lists (#8211).
- Fix a small bug in parsing delimiters in numbered lists, which led to the default delimiter being used wrongly in some cases.
- Improve handling of captions.
- Support HorizontalRule. We support both pandoc-style and the style described on a Microsoft support page, an empty paragraph with a bottom border (#6285).
- React to `"left"` value on `jc` attribute.
- Handle column and cell alignments (#8551). We take the column alignments from the first body row.
- Fix a bug that caused comments inside insertions or deletions to be ignored (#9833).
-
HTML reader:
- Better handle non-`li` elements in `ul` and `ol` (#9809). For example, a `p` after a closed `li` will be incorporated into the previous `li`. This mirrors what browsers do with this invalid HTML.
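The recovery rule for stray elements inside `ul` and `ol` could look roughly like the following Python sketch. It only models the grouping step on already-parsed children and is not the HTML reader's code: `group_list_children` is a hypothetical name, and letting a stray element that precedes any `li` start its own item is an assumption the entry does not cover.

```python
def group_list_children(children):
    """Group the direct children of a list element into list items.

    `children` is a list of (tag, text) pairs. Each `li` starts a new
    item; any other element is folded into the item before it.
    """
    items = []
    for tag, text in children:
        if tag == "li" or not items:
            # Assumption: a stray element with no preceding li becomes
            # an item of its own rather than being dropped.
            items.append([text])
        else:
            items[-1].append(text)
    return items


children = [("li", "one"), ("p", "stray paragraph"), ("li", "two")]
items = group_list_children(children)
```

In this example the stray `p` ends up as a second block of the first item, matching the behaviour the entry describes for a `p` after a closed `li`.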
-
LaTeX reader:
- Fix parsing of dimensions beginning with `.`, e.g. `\kern.1pt` (#9902).
-
Markdown reader:
- Allow author-only textual citations (#7219). E.g. `-@reese2002` outside of brackets.
-
RST reader:
-
Textile reader:
- Don’t let spans begin right after a symbol (#9878).
-
Texinfo writer:
- Ensure proper escaping in all node/link contexts.
- Target node rather than anchor when possible in internal links.
- Remove illegal characters from internal link anchors (#6177).
- Use two commas not one in `@ref`.
- Don’t add anchors to headings. We don’t need them, now that we make internal links use the node.
- Avoid duplicate node names.
- Improve menus. Properly handle the case where the node name is different from the descriptive title.
-
Texinfo template: add variables for filename and version.
-
Typst reader:
-
Fix an incomplete pattern match (#9807).
-
Handle inline bodies ending in a parbreak. E.g.
`#strong[ test ]
-
-
ConTeXt template: remove `\setupbackend[export=yes]` (#9820).
-
Docx writer:
- Omit `jc` attribute on table cells with AlignDefault (#5662).
- Better formatting for task lists. Task lists are now properly formatted, with no bullet (#5198).
- Replace an expensive generic traverse to remove Space elements, for better performance.
- The new OpenXML template had spaces for metadata that need to be filled with OpenXML fragments with the proper shape. This patch ensures that everything is the right shape.
- Wrap figures with `id` in a bookmark (#8662).
- Add eastAsia font hints to `w:r` (#9817). We do this when the text in the run contains any CJK characters. This ensures that ambiguous code points (e.g. quotation marks) will be represented as “wide” characters when together with CJK characters.
- Clean up Abstract Title and Subtitle in default reference docx. Center Subtitle, remove color.
- Allow OpenXML templates to be used with `docx` (#8338, #9069, #7256, #2928). The `--reference-doc` option allows customization of styles in docx output, but it does not allow one to adjust the content of the output (e.g., changing the order in which metadata, the table of contents, and the body of the document are displayed), or adding boilerplate text before or after the document body. For these changes, one can now use `--template` with an OpenXML template. (See the default `openxml` template for a sample.) `--include-before-body` and `--include-after-body` can also now be used with `docx` output. The included files must be OpenXML fragments suitable for inclusion in the document body.
- New unexported module Text.Pandoc.Writers.Docx.OpenXML.
-
HTML writer:
-
Ensure URI escaping needed for `html4` (#9905). Unicode characters need not be escaped for html5, and still won’t be.
-
Don’t emit unnecessary classes in HTML tables (#9325, Thomas Soeiro). Pandoc used to emit a `header` class on the `tr` element that forms the table header. This is no longer needed, because `thead > tr` will do the same thing. Similarly, pandoc used to emit `even` and `odd` classes on `tr`s, allowing striped styling. This is no longer needed, because one can use e.g. `tbody tr:nth-child(2n)`.

Compatibility warning: users who relied on these classes to style tables may need to adjust their CSS.
-
-
JATS writer:
- Support `supplementary-material` in metadata for `jats_articlepublishing` (#9818).
-
LaTeX writer:
- New method for ensuring images don’t overflow (#9660). Previously we relied on graphicx internals and made global changes to Gin to force images to be resized if they exceed textwidth. This approach is brittle and caused problems with `\includesvg` (see #9660). The new approach uses a new macro `\pandocbounded` that is now defined in the LaTeX template. (Thanks here to Falk Hanisch in mrpiggi/svg#60.) The LaTeX writer has been changed to enclose `\includegraphics` and `\includesvg` commands in this macro when they don’t explicitly specify a width or height. In addition, the writer now adds `keepaspectratio` to the `\includegraphics` or `\includesvg` options if `height` is specified without width, or vice versa. Previously, this was set in the preamble as a global option. Users should attend to the following compatibility issues:
  - If custom templates are used with the new LaTeX writer, they will have to be updated to include the new `\pandocbounded` macro, or an error will be raised because of the undefined macro.
  - Documents that specify explicit dimensions for an image may render differently, if the dimensions are greater than the line width or page height. Previously pandoc would shrink these images to fit, but the new behavior takes the specified dimensions literally. In addition, pandoc previously always enforced `keepaspectratio`, even when width and height were both specified, so images with width and height specified that do not conform to their intrinsic aspect ratio will appear differently.
- Task lists must be unordered (#9185).
- Specify language option for `selnolig` and only include it if `english` or `german` is used (#9863). (This includes changes to the LaTeX template.) This should restore proper ligature suppression when lualatex is used.
- Fix `--toc-depth` with beamer output (#9861). Previously only top-level sections were ever included in the TOC, regardless of the setting of `--toc-depth`.
- Use `\linewidth` instead of `\columnwidth` or `\textwidth` for resizing figures, table cells, etc. in LaTeX (#9775). `\linewidth`, unlike the others, is sensitive to indented environments like lists.
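The overflow handling for images without an explicit width or height comes down to simple arithmetic. The Python sketch below is not the definition of `\pandocbounded` (that lives in the LaTeX template). It only illustrates the general idea of shrinking an image, never enlarging it, so that it fits both a line width and a text height while keeping its aspect ratio. The function name `fit_within` and the sample dimensions are hypothetical.

```python
def fit_within(width: float, height: float,
               max_width: float, max_height: float) -> tuple[float, float]:
    """Scale (width, height) down to fit the box, preserving aspect ratio.

    Illustrative only: a hypothetical helper, not pandoc's or LaTeX's code.
    Images that already fit are returned unchanged (never enlarged).
    """
    # Use the most restrictive of the two ratios, capped at 1.0.
    scale = min(1.0, max_width / width, max_height / height)
    return (width * scale, height * scale)


# An 800x400 image in a 400x500 area is limited by the width.
print(fit_within(800, 400, 400, 500))   # (400.0, 200.0)
# A small image is left alone.
print(fit_within(100, 50, 400, 500))    # (100.0, 50.0)
```

Images that do specify a width or height skip this step entirely, which is why such images are now rendered at their literal dimensions.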
-
LaTeX template: put `babel-lang` in options to beamer (#9868). This is required to make beamer use proper localized terms for things like “Section.”
-
Markdown writer:
-
Typst writer:
- Support “.typst:no-figure” and “typst:figure:kind=kind” attributes (#9778, Carlos Scheidegger). This extends support for fine-grained properties in Typst. If the `typst:no-figure` class is present on a Table, the table will not be placed in a figure. If the `typst:figure:kind` attribute is present, its value will be used for the figure’s `kind` (#9777). These features are documented in `doc/typst-property-output.md`.
-
Typst template:
-
Textile writer:
- Get rid of header, odd, even classes on `tr` (#9376).
-
Text.Pandoc.Class:
- `fillMediaBag`: Convert IOErrors to warnings when fetching absolute paths (#9859, Albert Krewinkel). This will allow many conversions that would have failed with an error to succeed (albeit without images or other needed resources).
-
Text.Pandoc.ImageSize:
- Don’t prefer exif width/height when they conflict with image width/height (#9871). That was a mistaken call in #6936. Usually when these values disagree, it is because the image has been resized by a tool that leaves the original exif values the same, so the width/height metadata are more likely to be correct than exif width/height.
-
Text.Pandoc.SelfContained:
- Strip CRs from XML before base64 encoding for data URI (so tests can work on Windows).
- Only create `<svg>` elements for SVG images when the image has the class `inline-svg`. Otherwise just use a `data` URI as we do with other images (#9787).
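The first item in this list (stripping CRs before base64 encoding) matters because the encoded payload otherwise differs between platforms. A minimal Python sketch of the idea follows; it is not the Haskell code in Text.Pandoc.SelfContained. The helper name `svg_data_uri` and the MIME type used here are assumptions made for illustration.

```python
import base64


def svg_data_uri(xml_bytes: bytes) -> str:
    """Build a data URI for SVG/XML content, dropping carriage returns first.

    Hypothetical helper for illustration; the MIME type is an assumption.
    """
    normalized = xml_bytes.replace(b"\r", b"")
    encoded = base64.b64encode(normalized).decode("ascii")
    return "data:image/svg+xml;base64," + encoded


# The same document with Unix and Windows line endings yields one URI.
unix = b"<svg>\n</svg>\n"
windows = b"<svg>\r\n</svg>\r\n"
print(svg_data_uri(unix) == svg_data_uri(windows))  # True
```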
-
Lua subsystem (Albert Krewinkel):
- Split Init module into more modules. The module has grown unwieldy and is therefore split into three internal Haskell modules, `Init`, `Module`, and `Run`.
- Add function `pandoc.utils.run_lua_filter` (#9803).
- Add function `pandoc.template.get` (#9854, co-authored by Carsten Gips). The function allows specifying a template with the same argument value that would be used with the `--template` command line parameter.
- Keep CommonState object in the registry. The state is an internal value and should be treated as such. The `PANDOC_STATE` global is merely a copy...
pandoc 3.2
-
Change to `--file-scope` behavior (#8741): previously a Div with an identifier derived from the filename would be added around the contents of each file. This caused problems for “chunking” files into chapters, e.g. in EPUB. We no longer add the surrounding Div. This cooperates better with chunking. Note, however, that if you have relied on the old behavior to link to the beginning of the contents of a file using its filename as identifier, that will no longer work.
-
Markdown reader:
- Allow repeated labels in numbered example lists. Previously if you tried to use the same label as an earlier example list item, you’d get a new number, not the old one, and references to the label would go to the second occurrence. Now an existing label will be reused, and no new number will be generated. Caveat: this only works reliably when the re-used example list item occurs by itself in a list, or occurs in a list of previously used example list items that occur in exactly the same order as previously.
- Fix `normalCite` so it doesn’t consume past a closing `]` boundary (#9710). This was causing an exponential performance bug on long lists of links containing potential emphasis characters.
- Generalize `inlinesInBalancedBrackets` to `inBalancedBrackets`, with a parameter for the inner parser.
- Auto-close unclosed divs (#9635). This applies to both fenced and HTML-ish varieties. Otherwise we face an exponential performance problem with backtracking. A warning is issued when a div is implicitly closed.
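The label reuse for numbered example lists (the first item in this list) can be pictured with a small Python sketch. This is only an illustration of the numbering rule, not the reader's actual Haskell logic, and it ignores the ordering caveat mentioned there. The function `number_examples` and the convention that an empty string means "no label" are hypothetical.

```python
def number_examples(labels: list[str]) -> list[int]:
    """Assign example numbers, reusing the number of a repeated label.

    Hypothetical sketch: an empty string stands for an unlabeled item,
    which always receives a fresh number.
    """
    seen: dict[str, int] = {}
    numbers: list[int] = []
    next_number = 1
    for label in labels:
        if label and label in seen:
            # Repeated label: reuse the old number, generate no new one.
            numbers.append(seen[label])
            continue
        if label:
            seen[label] = next_number
        numbers.append(next_number)
        next_number += 1
    return numbers


print(number_examples(["good", "bad", "", "good"]))  # [1, 2, 3, 1]
```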
-
RST reader:
- Fix `figclass` and `align` annotations for figures (#7473, Gokul Rajiv).
-
LaTeX writer:
-
LaTeX reader:
-
LaTeX template: add `titlegraphicoptions` variable (#9207, Guilhem Saurel).
-
Docx reader:
-
RTF reader:
- Don’t try to handle non-default code pages (#9683). Emit a warning instead.
-
OpenDocument writer:
- Implement custom-style for spans (#9657).
-
Typst writer:
- Add blank line in definition lists with multiple definitions (see #9704).
- Property output (#9648, Gordon Woodhull). The Typst writer will pass on specially marked attributes as raw Typst parameters on selected elements. This allows extensive customization using filters. A separate document (`doc/typst-property-output.md`) has been added that provides extensive documentation and examples of the use of this feature.
-
Markdown writer:
- Don’t try to align columns in pipe tables with lines greater than COLUMNS. The alignment just reduces readability when the lines soft wrap.
- Don’t use `raw_attribute` syntax for raw blocks, unless there is no other option (see #9677). Macros in a `raw_attribute` block don’t get interpreted when it is read again by pandoc’s markdown reader.
-
ConTeXt writer:
- Replace deprecated `\sc` with `\setsmallcaps` (#9518, James P. Ascher).
-
Docx writer:
- Use conventional styles/indents for Word bullet lists (#7280).
-
`reference.docx`:
- Use current standard Word theme (#7280). This includes using the sans-serif font Aptos instead of the serif font Cambria, and default colors for headings.
- Remove duplicate `DefaultParagraphFont` in `styles.xml`.
-
New module Text.Pandoc.Transforms [API change] (Albert Krewinkel). This module exports the following functions which were formerly exported from Text.Pandoc.Shared: `headerShift`, `filterIpynbOutput`, `eastAsianLineBreakFilter`, as well as some functions that were previously not exported.
-
Text.Pandoc.Shared:
- `headerShift`, `filterIpynbOutput`, and `eastAsianLineBreakFilter` are no longer exported from this module; they are now exported from Text.Pandoc.Transforms (Albert Krewinkel).
-
Text.Pandoc.Error:
- Improve reporting of unsupported extensions errors (#9247, Albert Krewinkel).
-
Text.Pandoc.App:
- Move “transforms” after filters (#9664). This will mean that `--shift-heading-level-by` affects a heading added by `reference-section-title`.
-
Text.Pandoc.App.CommandLineOptions:
- Simplify output for `OptVersion`. Omit the information about versions of dependencies. We no longer emit version info at this level anyway; `pandoc-cli` intercepts and handles `--version`. This code would only be called if someone used the pandoc library function `handleWithOptInfo` in their own program.
-
Text.Pandoc.ImageSize:
- Export `ImageSize` datatype.
-
Text.Pandoc.SelfContained:
- Merge class attribute when both img and svg specify it (#9652, Carlos Scheidegger).
-
Text.Pandoc.Logging:
- Add `ScriptingInfo` constructor for `LogMessage` [API change] (Albert Krewinkel).
- Make `DocxParserWarning` a WARNING, not INFO. [API change].
- Add `UnsupportedCodePage` constructor to `LogMessage` [API change].
- Add `UnclosedDiv` constructor for `LogMessage` [API change].
-
Lua subsystem (Albert Krewinkel):
- Add a `pandoc.log` module.
- Update to pandoc-lua-marshal version 0.2.7 (#8916). This fixes counterintuitive behavior of the `content` property on BulletList and OrderedList items. Unmarshalling of that field now matches the behavior of the constructor.
- Use newest zip module. This adds a `symlink` function to Entry objects, making it possible to check whether an entry represents a symbolic link.
- Improve `pandoc.json.decode` docs.
- Update and fix docs for `pandoc.types.Version` and `pandoc.utils.type`.
- Add new module `pandoc.image`. The module provides basic querying functions for image properties.
- Bump pandoc-lua-engine to 0.2.1.4.
-
Use latest KaTeX CDN asset (#9707, Salim B).
-
`pandoc-cli`: ensure UTF8 when emitting version info.
-
tools/update-lua-module-docs.lua: improve script-internal docs, cleanup (Albert Krewinkel).
-
Allow network 3.2.
-
Use latest versions of texmath, djot, skylighting-core, skylighting.
-
Fix command test for #9652.
-
Fix some typos in code comments (#9638, guqicun).
-
Command tests: include regular PATH after directory with the test executable (ensures that DLLs will be found on Windows).
-
MANUAL.txt:
- Document `handout` variable for beamer (#9742).
- Document formats affected by `--slide-level` (#9745).
- Update the list of required LaTeX packages (#9728, Albert Krewinkel).
- Use more descriptive link text for ODT (#9673).
- Add clarification about `toc-title` in `docx`, `pptx` (#9645).
- Better document truthiness for conditionals (#9661).
- Mention that `custom-style` works with ODT (Ian Max Andolina).
- Harmonize typographic dashes (#9688, Salim B). Standardize on `---` with no space.
-
INSTALL.md: Minor tweaks (#9705, Leo Heitmann Ruiz).
pandoc 3.1.13
-
Org reader:
- Fix treatment of `id` property under heading (#9639).
-
DocBook reader:
- Add empty title to admonition div if not present (#9569). This allows admonition elements (e.g. `<note>`) to work with `gfm` admonitions even if the `<title>` is not present.
-
DokuWiki reader:
-
Typst reader:
- Support Typst 0.11 table features: col/rowspans, table head and foot (#9588).
- Parse cell col/rowspans.
-
CSLJson writer:
- Put `$` or `$$` around math in `csljson` output (#9616).
-
ConTeXt writer:
- Fix options order with `\externalfigure`. The dimensions should come after the class if both are present.
-
Typst writer:
- Put label after Span, not before. Labels get applied to preceding markup item.
- Support Typst 0.11 table features (#9588): colspans, rowspans, cell alignment overrides, relative column widths, header and footer, multiple table bodies with intermediate headers. Row heads are not yet supported.
- The default typst template has been modified so that tables don’t have lines by default. As is standard with pandoc, we only add a line under a header or over a footer. However, a different default stroke pattern can easily be added in a template.
- More reliable escaping in inline `[..]` contexts (#9586). For example, we need to escape `[\1. April]` or it will be treated as an ordered list.
- Handle `unnumbered` on headings (#9585).
-
LaTeX writer:
- Fix math inside strikeout (#9597).
-
Text.Pandoc.Writers.Shared:
- Export `isOrderedListMarker` [API change].
-
Change lhs tests so they don’t use `--standalone`. This will avoid test failures due to minor changes in skylighting versions, e.g. #9589.
-
Use latest texmath, typst.
-
Require pandoc-lua-marshal 0.2.6 (#9613, Albert Krewinkel). Fixes an issue arising when the value of `content` properties on BlockQuote, Figure, and Div elements was an empty list.
-
Update lua-filters.md (#9611, Carlos Scheidegger).
pandoc 3.1.12.3
-
Markdown reader: Fix bug with footnotes at end of fenced div (#9576).
-
LaTeX reader:
- Improve tokenization of `@` (#9555). Make tokenization sensitive to `\makeatletter`/`\makeatother`. Previously we just always treated `@` as a letter. This led to bad results, e.g. with the sequence `\@`. E.g., `a\@ b` would parse as “ab” and `a\@b` as “a”.
- Make `withRaw` work inside `parseFromToks` (#9517). This is needed for raw environments to work inside table cells.
- Better handling of table colwidths (#9579). Previously the parser just failed if the column width specified in `p{}` wasn’t a multiple of `\linewidth`. This led to cases where content was skipped.
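The `@` tokenization issue (the first item in this list) is easier to see with a toy tokenizer. The Python sketch below is a deliberately simplified model of TeX control-sequence scanning, not pandoc's Haskell tokenizer. It assumes that a control word is a backslash followed by letters and swallows the spaces after it, while a control symbol is a backslash plus one non-letter. The function name `tokenize` is hypothetical.

```python
def tokenize(src: str) -> list[str]:
    """Split LaTeX-like input into control sequences and single characters.

    Simplified illustration: '@' counts as a letter only between
    \\makeatletter and \\makeatother.
    """
    tokens: list[str] = []
    at_is_letter = False

    def is_letter(c: str) -> bool:
        return c.isalpha() or (c == "@" and at_is_letter)

    i = 0
    while i < len(src):
        if src[i] != "\\":
            tokens.append(src[i])
            i += 1
            continue
        j = i + 1
        if j < len(src) and is_letter(src[j]):
            # Control word: take all letters, then skip trailing spaces.
            while j < len(src) and is_letter(src[j]):
                j += 1
            name = src[i:j]
            while j < len(src) and src[j] == " ":
                j += 1
        else:
            # Control symbol: backslash plus one character (if any).
            j = min(j + 1, len(src))
            name = src[i:j]
        tokens.append(name)
        if name == "\\makeatletter":
            at_is_letter = True
        elif name == "\\makeatother":
            at_is_letter = False
        i = j
    return tokens


print(tokenize(r"a\@ b"))                 # ['a', '\\@', ' ', 'b']
print(tokenize(r"\makeatletter a\@ b"))   # ['\\makeatletter', 'a', '\\@', 'b']
```

In the second call the space after `\@` disappears because `\@` is scanned as a control word, which mirrors the “ab” result described above when `@` is always treated as a letter.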
-
Typst writer:
- Add “kind” parameter to figures with tables (#9574).
- Avoid unnecessary box around image in figure (#9236).
- Omit width/height in images unless explicitly specified (#9236). Previously we computed width/height for images that didn’t have size information, because otherwise typst would expand the image to fit page width. This typst behavior has changed in 0.11. This change fixes a bug in which images would sometimes overflow page margins, depending on their intrinsic size.
- Don’t add hard-coded `inset` to tables (#9580). Instead, set this globally in the default template, allowing it to be customized.
-
LaTeX template: Fix block headings support for unnumbered paragraphs (#9542, #6018, Oliver Fabel).
-
HTML templates: Replace polyfill provider (#9537, @SukkaW). Replace polyfill.io with cdnjs.cloudflare.com/polyfill. polyfill.io has been acquired by Funnull, and the service has become unstable.
-
Korean translations: delete colon in translation for “to”. This was invalid YAML, and not desired anyway, since a colon is added.
-
Use latest commonmark, commonmark-extensions. This fixes a 3.12 regression in parsing of commonmark/gfm autolinks (jgm/commonmark-hs#151).
-
Depend on djot 0.1.1.3, which fixes a serious parsing bug affecting regular paragraphs after lists.
-
Depend on latest skylighting, skylighting-core, typst-hs, texmath.
-
MANUAL.txt: Change broken link to IDML cookbook (#9563).
pandoc 3.1.12.2
-
Docx reader:
-
Markdown reader: fix regression in link parsing with wikilinks extensions (#9481). This fixes a regression introduced in 3.1.12.
-
Org reader/writer: support admonitions (#9475).
-
Org writer: omit extra blank line at end of quote block.
-
Typst writer: ensure that `-`, `+`, etc. are escaped at beginning of block (#9478). Our recent relaxing of escaping (#9386) caused problems for things like emphasized `-` characters that were rendered using `#strong[-]#`. This now gets rendered as `#strong[\-]`.
-
LaTeX writer: fix bug when a language is specified in two different ways (#9472). If you used `lang: de-DE` but then had a span or div with `lang=de`, the preamble would try to load `ngerman` twice, leading to an error. This fix ensures that a language is only loaded once.
-
Docx writer: Don’t copy over `footnotePr` in `settings.xml` from reference.docx (#9522).
-
EPUB writer: omit EPUB2-specific meta tag on EPUB3 (#9493). This caused a validation failure in epubs with cover images.
-
Lua: avoid crashing when an error message is not valid UTF-8 (Albert Krewinkel).
-
Text.Pandoc.SelfContained:
- Add `role="img"` to svgs.
- Add `aria-label` to svg elements with `alt` text if present. Screen readers ignore `alt` attributes on svg elements but do pay attention to `aria-label` (#9525).
-
Text.Pandoc.Shared: Fix regression in section numbering in `makeSections` (#9516). Starting with pandoc 3.1.12, unnumbered sections incremented the section number.
-
Text.Pandoc.Class: fix `openUrl` TLS negotiation (#9483). With the release of TLS 2.0.0, the TLS library started requiring Extended Main Secret for the TLS handshake. This caused problems connecting to zotero’s server and others that do not support TLS 1.3. This commit relaxes this requirement.
-
Depend on djot 0.1.1.0 (fixes rendering on multiline block attributes).
-
Use new releases of skylighting-format-blaze-html (#9520). Fixes auto-wrapping of long source lines in HTML print media.
-
Use new commonmark-extensions (fixes issue with the `rebase_relative_paths` extension when used with commonmark/gfm).
-
Makefile: improve epub-validation target (#9493). Use `--epub-cover-image` to catch issues that only arise with that.
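The `makeSections` numbering regression fixed in this release (#9516) concerns a simple rule: unnumbered sections must not advance the section counter. The Python sketch below illustrates that rule for top-level sections only. It is not the Haskell implementation in Text.Pandoc.Shared, and the function `number_sections` with its `(title, is_unnumbered)` input pairs is hypothetical.

```python
def number_sections(sections: list[tuple[str, bool]]) -> list[tuple[str, str]]:
    """Pair each top-level section title with its number ("" if unnumbered).

    Hypothetical sketch: each input item is (title, is_unnumbered).
    """
    result: list[tuple[str, str]] = []
    counter = 0
    for title, unnumbered in sections:
        if unnumbered:
            # The counter is deliberately left untouched here.
            result.append((title, ""))
        else:
            counter += 1
            result.append((title, str(counter)))
    return result


print(number_sections([("Intro", False), ("Preface", True), ("Methods", False)]))
# [('Intro', '1'), ('Preface', ''), ('Methods', '2')]
```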
pandoc 3.1.12.1
-
EPUB writer: omit EPUBv3-specific accessibility features on epub2 (#9469). Fixes a regression in 3.1.12.
-
More fixes for SVG ids with `--self-contained` (#9467). This generalizes the fix to #9420 so it applies to things like `style="fill(url(#..."` and should fix problems with SVGs including gradients.
-
Powerpoint writer: properly handle math in headings and tables (#9465). This ensures that paragraphs containing math are wrapped in a `mc:AlternateContent` node as required.
-
Makefile: make validate-epub check v2 output too.