% Pandoc User's Guide
% John MacFarlane
% January 27, 2012
Synopsis
========
pandoc [*options*] [*input-file*]...
Description
===========
Pandoc is a [Haskell] library for converting from one markup format to
another, and a command-line tool that uses this library. It can read
[markdown] and (subsets of) [Textile], [reStructuredText], [HTML],
[LaTeX], [MediaWiki markup], and [DocBook XML]; and it can write plain
text, [markdown], [reStructuredText], [XHTML], [HTML 5], [LaTeX]
(including [beamer] slide shows), [ConTeXt], [RTF], [DocBook XML],
[OpenDocument XML], [ODT], [Word docx], [GNU Texinfo], [MediaWiki
markup], [EPUB] (v2 or v3), [FictionBook2], [Textile], [groff man] pages, [Emacs
Org-Mode], [AsciiDoc], and [Slidy], [Slideous], [DZSlides], or [S5] HTML
slide shows. It can also produce [PDF] output on systems where LaTeX is
installed.
Pandoc's enhanced version of markdown includes syntax for footnotes,
tables, flexible ordered lists, definition lists, fenced code blocks,
superscript, subscript, strikeout, title blocks, automatic tables of
contents, embedded LaTeX math, citations, and markdown inside HTML block
elements. (These enhancements, described below under
[Pandoc's markdown](#pandocs-markdown), can be disabled using the
`markdown_strict` input or output format.)
In contrast to most existing tools for converting markdown to HTML, which
use regex substitutions, Pandoc has a modular design: it consists of a
set of readers, which parse text in a given format and produce a native
representation of the document, and a set of writers, which convert
this native representation into a target format. Thus, adding an input
or output format requires only adding a reader or writer.
Using `pandoc`
--------------
If no *input-file* is specified, input is read from *stdin*.
Otherwise, the *input-files* are concatenated (with a blank
line between each) and used as input. Output goes to *stdout* by
default (though output to *stdout* is disabled for the `odt`, `docx`,
`epub`, and `epub3` output formats). For output to a file, use the
`-o` option:
    pandoc -o output.html input.txt
Instead of a file, an absolute URI may be given. In this case
pandoc will fetch the content using HTTP:
    pandoc -f html -t markdown http://www.fsf.org
If multiple input files are given, `pandoc` will concatenate them all (with
blank lines between them) before parsing.
The format of the input and output can be specified explicitly using
command-line options. The input format can be specified using the
`-r/--read` or `-f/--from` options, the output format using the
`-w/--write` or `-t/--to` options. Thus, to convert `hello.txt` from
markdown to LaTeX, you could type:
    pandoc -f markdown -t latex hello.txt
To convert `hello.html` from html to markdown:
    pandoc -f html -t markdown hello.html
Supported output formats are listed below under the `-t/--to` option.
Supported input formats are listed below under the `-f/--from` option. Note
that the `rst`, `textile`, `latex`, and `html` readers are not complete;
there are some constructs that they do not parse.
If the input or output format is not specified explicitly, `pandoc`
will attempt to guess it from the extensions of
the input and output filenames. Thus, for example,
    pandoc -o hello.tex hello.txt
will convert `hello.txt` from markdown to LaTeX. If no output file
is specified (so that output goes to *stdout*), or if the output file's
extension is unknown, the output format will default to HTML.
If no input file is specified (so that input comes from *stdin*), or
if the input files' extensions are unknown, the input format will
be assumed to be markdown unless explicitly specified.
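
The fallback rules above can be summarized in a small shell sketch. The function names and most of the extension table are illustrative assumptions: only the `.tex` mapping and the two defaults come from the text above, and pandoc's own table is larger.

```shell
# Sketch of the format-guessing fallbacks described above.
# Hypothetical helpers, not part of pandoc; the extension table is
# deliberately tiny and only partly taken from this guide.

guess_input_format() {
    case "$1" in
        *.html) echo html ;;      # assumed mapping
        *.tex)  echo latex ;;     # assumed mapping
        *)      echo markdown ;;  # stdin or unknown extension
    esac
}

guess_output_format() {
    case "$1" in
        *.tex)  echo latex ;;     # as in the hello.tex example
        *.html) echo html ;;      # assumed mapping
        *)      echo html ;;      # stdout or unknown extension
    esac
}

guess_input_format hello.txt     # prints "markdown"
guess_output_format hello.tex    # prints "latex"
guess_output_format -            # prints "html"
```

When the guess would be wrong, passing `-f` and `-t` explicitly avoids relying on these fallbacks.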
Pandoc uses the UTF-8 character encoding for both input and output.
If your local character encoding is not UTF-8, you
should pipe input and output through `iconv`:
    iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
Creating a PDF
--------------
Earlier versions of pandoc came with a program, `markdown2pdf`, that
used pandoc and pdflatex to produce a PDF. This is no longer needed,
since `pandoc` can now produce `pdf` output itself. To produce a PDF, simply
specify an output file with a `.pdf` extension. Pandoc will create a LaTeX
file and use pdflatex (or another engine, see `--latex-engine`) to convert it
to PDF:
    pandoc test.txt -o test.pdf
Production of a PDF requires that a LaTeX engine be installed (see
`--latex-engine`, below), and assumes that the following LaTeX packages are
available: `amssymb`, `amsmath`, `ifxetex`, `ifluatex`, `listings` (if the
`--listings` option is used), `fancyvrb`, `longtable`, `url`,
`graphicx`, `hyperref`, `ulem`, `babel` (if the `lang` variable is set),
`fontspec` (if `xelatex` or `lualatex` is used as the LaTeX engine), `xltxtra`
and `xunicode` (if `xelatex` is used).
`hsmarkdown`
------------
A user who wants a drop-in replacement for `Markdown.pl` may create
a symbolic link to the `pandoc` executable called `hsmarkdown`. When
invoked under the name `hsmarkdown`, `pandoc` will behave as if
invoked with `-f markdown_strict --email-obfuscation=references`,
and all command-line options will be treated as regular arguments.
However, this approach does not work under Cygwin, due to problems with
its simulation of symbolic links.
[Cygwin]: http://www.cygwin.com/
[`iconv`]: http://www.gnu.org/software/libiconv/
[CTAN]: http://www.ctan.org "Comprehensive TeX Archive Network"
[TeX Live]: http://www.tug.org/texlive/
[MacTeX]: http://www.tug.org/mactex/
Options
=======
General options
---------------
`-f` *FORMAT*, `-r` *FORMAT*, `--from=`*FORMAT*, `--read=`*FORMAT*
: Specify input format. *FORMAT* can be `native` (native Haskell),
`json` (JSON version of native AST), `markdown` (pandoc's
extended markdown), `markdown_strict` (original unextended markdown),
`textile` (Textile), `rst` (reStructuredText), `html` (HTML),
`docbook` (DocBook XML), `mediawiki` (MediaWiki markup),
or `latex` (LaTeX). If `+lhs` is appended to `markdown`, `rst`, or
`latex`, the input will be treated as literate Haskell source:
see [Literate Haskell support](#literate-haskell-support), below.
Markdown syntax extensions can be individually enabled or disabled
by appending `+EXTENSION` or `-EXTENSION` to the format name.
So, for example, `markdown_strict+footnotes+definition_lists`
is strict markdown with footnotes and definition lists enabled,
and `markdown-pipe_tables+hard_line_breaks` is pandoc's markdown
without pipe tables and with hard line breaks. See [Pandoc's
markdown](#pandocs-markdown), below, for a list of extensions and
their names.
`-t` *FORMAT*, `-w` *FORMAT*, `--to=`*FORMAT*, `--write=`*FORMAT*
: Specify output format. *FORMAT* can be `native` (native Haskell),
`json` (JSON version of native AST), `plain` (plain text),
`markdown` (pandoc's extended markdown), `markdown_strict` (original
unextended markdown), `rst` (reStructuredText), `html` (XHTML
1), `html5` (HTML 5), `latex` (LaTeX), `beamer` (LaTeX beamer slide show),
`context` (ConTeXt), `man` (groff man), `mediawiki` (MediaWiki markup),
`textile` (Textile), `org` (Emacs Org-Mode), `texinfo` (GNU Texinfo),
`docbook` (DocBook XML), `opendocument` (OpenDocument XML), `odt`
(OpenOffice text document), `docx` (Word docx), `epub` (EPUB book), `epub3`
(EPUB v3), `fb2` (FictionBook2 e-book), `asciidoc` (AsciiDoc), `slidy`
(Slidy HTML and javascript slide show), `slideous` (Slideous HTML and
javascript slide show), `dzslides` (HTML5 + javascript slide show), `s5`
(S5 HTML and javascript slide show), or `rtf` (rich text format). Note that
`odt`, `docx`, `epub`, and `epub3` output will not be directed to *stdout*; an output
filename must be specified using the `-o/--output` option. If `+lhs` is
appended to `markdown`, `rst`, `latex`, `beamer`, `html`, or `html5`, the
output will be rendered as literate Haskell source: see [Literate Haskell
support](#literate-haskell-support), below. Markdown syntax extensions can
be individually enabled or disabled by appending `+EXTENSION` or
`-EXTENSION` to the format name, as described above under `-f`.
`-o` *FILE*, `--output=`*FILE*
: Write output to *FILE* instead of *stdout*. If *FILE* is
`-`, output will go to *stdout*. (Exception: if the output
format is `odt`, `docx`, `epub`, or `epub3`, output to stdout is disabled.)
`--data-dir=`*DIRECTORY*
: Specify the user data directory to search for pandoc data files.
If this option is not specified, the default user data directory
will be used:
        $HOME/.pandoc
in unix and
        C:\Documents And Settings\USERNAME\Application Data\pandoc
in Windows. A `reference.odt`, `reference.docx`, `default.csl`,
`epub.css`, `templates`, `slidy`, `slideous`, or `s5` directory
placed in this directory will override pandoc's normal defaults.
`-v`, `--version`
: Print version.
`-h`, `--help`
: Show usage message.
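
To make the `+EXTENSION`/`-EXTENSION` suffixes described under `-f` above more concrete, here is a rough shell sketch that splits a format name into its base format and its modifiers. The function name `describe_format` is hypothetical, and the sketch only mirrors the naming scheme; it does not check that an extension exists.

```shell
# Split a format spec such as "markdown-pipe_tables+hard_line_breaks"
# into the base format and the enabled/disabled extensions.
# Hypothetical helper for illustration; not part of pandoc.
describe_format() {
    spec=$1
    base=${spec%%[+-]*}           # text before the first + or -
    rest=${spec#"$base"}          # the remaining +EXT / -EXT modifiers
    printf 'base=%s\n' "$base"
    while [ -n "$rest" ]; do
        sign=${rest%"${rest#?}"}  # first character: + or -
        rest=${rest#?}
        name=${rest%%[+-]*}       # extension name up to the next sign
        rest=${rest#"$name"}
        case $sign in
            +) printf 'enable=%s\n' "$name" ;;
            -) printf 'disable=%s\n' "$name" ;;
        esac
    done
}

describe_format markdown-pipe_tables+hard_line_breaks
```

For the spec shown, the sketch reports `markdown` as the base, `pipe_tables` as disabled, and `hard_line_breaks` as enabled, which matches the reading given under `-f`.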
Reader options
--------------
`-R`, `--parse-raw`
: Parse untranslatable HTML codes and LaTeX environments as raw HTML
or LaTeX, instead of ignoring them. Affects only HTML and LaTeX
input. Raw HTML can be printed in markdown, reStructuredText, HTML,
Slidy, Slideous,
DZSlides, and S5 output; raw LaTeX can be printed in markdown,
reStructuredText, LaTeX, and ConTeXt output. The default is for the
readers to omit untranslatable HTML codes and LaTeX environments.
(The LaTeX reader does pass through untranslatable LaTeX *commands*,
even if `-R` is not specified.)
`-S`, `--smart`
: Produce typographically correct output, converting straight quotes
to curly quotes, `---` to em-dashes, `--` to en-dashes, and
`...` to ellipses. Nonbreaking spaces are inserted after certain
abbreviations, such as "Mr." (Note: This option is significant only when
the input format is `markdown`, `markdown_strict`, or `textile`. It
is selected automatically when the input format is `textile` or the
output format is `latex` or `context`, unless `--no-tex-ligatures`
is used.)
`--old-dashes`
: Selects the pandoc <= 1.8.2.1 behavior for parsing smart dashes: `-` before
a numeral is an en-dash, and `--` is an em-dash. This option is selected
automatically for `textile` input.
`--base-header-level=`*NUMBER*
: Specify the base level for headers (defaults to 1).
`--indented-code-classes=`*CLASSES*
: Specify classes to use for indented code blocks--for example,
`perl,numberLines` or `haskell`. Multiple classes may be separated
by spaces or commas.
`--normalize`
: Normalize the document after reading: merge adjacent
`Str` or `Emph` elements, for example, and remove repeated `Space`s.
`-p`, `--preserve-tabs`
: Preserve tabs instead of converting them to spaces, which is the default.
Note that this will only affect tabs in literal code spans and code
blocks; tabs in regular text will be treated as spaces.
`--tab-stop=`*NUMBER*
: Specify the number of spaces per tab (default is 4).
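
The dash and ellipsis rules listed under `-S/--smart` can be approximated with a `sed` pipeline. This is only a sketch under stated assumptions: it ignores quotes and nonbreaking spaces, rewrites every line blindly, and emits HTML entities so the result is easy to inspect. The function name `smart_punct` is hypothetical.

```shell
# Approximate the dash and ellipsis conversions of -S/--smart.
# Illustration only; pandoc's parser is context-aware, this is not.
smart_punct() {
    # Order matters: `---` must be handled before `--`, otherwise
    # three hyphens would be read as an en-dash plus a hyphen.
    sed -e 's/---/\&mdash;/g' \
        -e 's/--/\&ndash;/g' \
        -e 's/\.\.\./\&hellip;/g'
}

printf '%s\n' 'wait---pages 3--5...' | smart_punct
# prints: wait&mdash;pages 3&ndash;5&hellip;
```

The `--old-dashes` behavior would need different rules (`--` as an em-dash, and `-` before a numeral as an en-dash), so it is not covered here.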
General writer options
----------------------
`-s`, `--standalone`
: Produce output with an appropriate header and footer (e.g. a
standalone HTML, LaTeX, or RTF file, not a fragment). This option
is set automatically for `pdf`, `epub`, `epub3`, `fb2`, `docx`, and `odt`
output.
`--template=`*FILE*
: Use *FILE* as a custom template for the generated document. Implies
`--standalone`. See [Templates](#templates) below for a description
of template syntax. If no extension is specified, an extension
corresponding to the writer will be added, so that `--template=special`
looks for `special.html` for HTML output. If the template is not
found, pandoc will search for it in the user data directory
(see `--data-dir`). If this option is not used, a default
template appropriate for the output format will be used (see
`-D/--print-default-template`).
`-V` *KEY[=VAL]*, `--variable=`*KEY[:VAL]*
: Set the template variable *KEY* to the value *VAL* when rendering the
document in standalone mode. This is generally only useful when the
`--template` option is used to specify a custom template, since
pandoc automatically sets the variables used in the default
templates. If no *VAL* is specified, the key will be given the
value `true`.
`-D` *FORMAT*, `--print-default-template=`*FORMAT*
: Print the default template for an output *FORMAT*. (See `-t`
for a list of possible *FORMAT*s.)
`--no-wrap`
: Disable text wrapping in output. By default, text is wrapped
appropriately for the output format.
`--columns`=*NUMBER*
: Specify length of lines in characters (for text wrapping).
`--toc`, `--table-of-contents`
: Include an automatically generated table of contents (or, in
the case of `latex`, `context`, and `rst`, an instruction to create
one) in the output document. This option has no effect on `man`,
`docbook`, `slidy`, `slideous`, or `s5` output.
`--no-highlight`
: Disables syntax highlighting for code blocks and inlines, even when
a language attribute is given.
`--highlight-style`=*STYLE*
: Specifies the coloring style to be used in highlighted source code.
Options are `pygments` (the default), `kate`, `monochrome`,
`espresso`, `zenburn`, `haddock`, and `tango`.
`-H` *FILE*, `--include-in-header=`*FILE*
: Include contents of *FILE*, verbatim, at the end of the header.
This can be used, for example, to include special
CSS or javascript in HTML documents. This option can be used
repeatedly to include multiple files in the header. They will be
included in the order specified. Implies `--standalone`.
`-B` *FILE*, `--include-before-body=`*FILE*
: Include contents of *FILE*, verbatim, at the beginning of the
document body (e.g. after the `<body>` tag in HTML, or the
`\begin{document}` command in LaTeX). This can be used to include
navigation bars or banners in HTML documents. This option can be
used repeatedly to include multiple files. They will be included in
the order specified. Implies `--standalone`.
`-A` *FILE*, `--include-after-body=`*FILE*
: Include contents of *FILE*, verbatim, at the end of the document
body (before the `</body>` tag in HTML, or the
`\end{document}` command in LaTeX). This option can be used
repeatedly to include multiple files. They will be included in the
order specified. Implies `--standalone`.
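
As an illustration of the three include options, the following sketch writes three small fragments and wraps the pandoc call in a function. The file names, the fragment contents, and the `build_page` function are all hypothetical.

```shell
# Hypothetical fragments; pandoc inserts each file verbatim.
cat > header.html <<'EOF'
<link rel="stylesheet" href="site.css" />
EOF

cat > before.html <<'EOF'
<div class="banner">Draft</div>
EOF

cat > after.html <<'EOF'
<div class="footer">Generated with pandoc</div>
EOF

# Wrapped in a function so the snippet can be sourced without
# running pandoc. $1 is the input file, $2 the output file.
build_page() {
    pandoc -H header.html -B before.html -A after.html -o "$2" "$1"
}
```

Calling `build_page input.txt output.html` should place the link element at the end of the header and the two `div` elements at the start and end of the body. No explicit `-s` is needed, because each of these options implies `--standalone`.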
Options affecting specific writers
----------------------------------
`--self-contained`
: Produce a standalone HTML file with no external dependencies, using
`data:` URIs to incorporate the contents of linked scripts, stylesheets,
images, and videos. The resulting file should be "self-contained,"
in the sense that it needs no external files and no net access to be
displayed properly by a browser. This option works only with HTML output
formats, including `html`, `html5`, `html+lhs`, `html5+lhs`, `s5`,
`slidy`, `slideous`,
and `dzslides`. Scripts, images, and stylesheets at absolute URLs
will be downloaded; those at relative URLs will be sought first relative
to the working directory, then relative to the user data directory (see
`--data-dir`), and finally relative to pandoc's default data directory.
`--offline`
: Deprecated synonym for `--self-contained`.
`-5`, `--html5`
: Produce HTML5 instead of HTML4. This option has no effect for writers
other than `html`. (*Deprecated:* Use the `html5` output format instead.)
`--ascii`
: Use only ascii characters in output. Currently supported only
for HTML output (which uses numerical entities instead of
UTF-8 when this option is selected).
`--reference-links`
: Use reference-style links, rather than inline links, in writing markdown
or reStructuredText. By default inline links are used.
`--atx-headers`
: Use ATX style headers in markdown output. The default is to use
setext-style headers for levels 1-2, and then ATX headers.
`--chapters`
: Treat top-level headers as chapters in LaTeX, ConTeXt, and DocBook
output. When the LaTeX template uses the report, book, or
memoir class, this option is implied. If `beamer` is the output format,
top-level headers will become `\part{..}`.
`-N`, `--number-sections`
: Number section headings in LaTeX, ConTeXt, or HTML output.
By default, sections are not numbered.
`--no-tex-ligatures`
: Do not convert quotation marks, apostrophes, and dashes to
the TeX ligatures when writing LaTeX or ConTeXt. Instead, just
use literal unicode characters. This is needed for using advanced
OpenType features with XeLaTeX and LuaLaTeX. Note: normally
`--smart` is selected automatically for LaTeX and ConTeXt
output, but it must be specified explicitly if `--no-tex-ligatures`
is selected. If you use literal curly quotes, dashes, and ellipses
in your source, then you may want to use `--no-tex-ligatures`
without `--smart`.
`--listings`
: Use the `listings` package for LaTeX code blocks.
`-i`, `--incremental`
: Make list items in slide shows display incrementally (one by one).
The default is for lists to be displayed all at once.
`--slide-level`=*NUMBER*
: Specifies that headers with the specified level create
slides (for `beamer`, `s5`, `slidy`, `slideous`, `dzslides`). Headers
above this level in the hierarchy are used to divide the
slide show into sections; headers below this level create
subheads within a slide. The default is to set the slide level
based on the contents of the document; see
[Structuring the slide show](#structuring-the-slide-show), below.
`--section-divs`
: Wrap sections in `<div>` tags (or `<section>` tags in HTML5),
and attach identifiers to the enclosing `<div>` (or `<section>`)
rather than the header itself.
See [Section identifiers](#header-identifiers-in-html-latex-and-context), below.
`--email-obfuscation=`*none|javascript|references*
: Specify a method for obfuscating `mailto:` links in HTML documents.
*none* leaves `mailto:` links as they are. *javascript* obfuscates
them using javascript. *references* obfuscates them by printing their
letters as decimal or hexadecimal character references.
`--id-prefix`=*STRING*
: Specify a prefix to be added to all automatically generated identifiers
in HTML and DocBook output, and to footnote numbers in markdown output.
This is useful for preventing duplicate identifiers when generating
fragments to be included in other pages.
`-T` *STRING*, `--title-prefix=`*STRING*
: Specify *STRING* as a prefix at the beginning of the title
that appears in the HTML header (but not in the title as it
appears at the beginning of the HTML body). Implies
`--standalone`.
`-c` *URL*, `--css=`*URL*
: Link to a CSS style sheet. This option can be used repeatedly to
include multiple files. They will be included in the order specified.
`--reference-odt=`*FILE*
: Use the specified file as a style reference in producing an ODT.
For best results, the reference ODT should be a modified version
of an ODT produced using pandoc. The contents of the reference ODT
are ignored, but its stylesheets are used in the new ODT. If no
reference ODT is specified on the command line, pandoc will look
for a file `reference.odt` in the user data directory (see
`--data-dir`). If this is not found either, sensible defaults will be
used.
`--reference-docx=`*FILE*
: Use the specified file as a style reference in producing a docx file.
For best results, the reference docx should be a modified version
of a docx file produced using pandoc. The contents of the reference docx
are ignored, but its stylesheets are used in the new docx. If no
reference docx is specified on the command line, pandoc will look
for a file `reference.docx` in the user data directory (see
`--data-dir`). If this is not found either, sensible defaults will be
used. The following styles are used by pandoc: [paragraph]
Normal, Title, Authors, Date, Heading 1, Heading 2, Heading 3,
Heading 4, Heading 5, Block Quote, Definition Term, Definition,
Body Text, Table Caption, Picture Caption; [character] Default
Paragraph Font, Body Text Char, Verbatim Char, Footnote Reference,
Hyperlink.
`--epub-stylesheet=`*FILE*
: Use the specified CSS file to style the EPUB. If no stylesheet
is specified, pandoc will look for a file `epub.css` in the
user data directory (see `--data-dir`). If it is not
found there, sensible defaults will be used.
`--epub-cover-image=`*FILE*
: Use the specified image as the EPUB cover. It is recommended
that the image be less than 1000px in width and height.
`--epub-metadata=`*FILE*
: Look in the specified XML file for metadata for the EPUB.
The file should contain a series of Dublin Core elements,
as documented at <http://dublincore.org/documents/dces/>.
For example:
        <dc:rights>Creative Commons</dc:rights>
        <dc:language>es-AR</dc:language>
By default, pandoc will include the following metadata elements:
`<dc:title>` (from the document title), `<dc:creator>` (from the
document authors), `<dc:date>` (from the document date, which should
be in [ISO 8601 format]), `<dc:language>` (from the `lang`
variable, or, if it is not set, the locale), and `<dc:identifier
id="BookId">` (a randomly generated UUID). Any of these may be
overridden by elements in the metadata file.
`--epub-embed-font=`*FILE*
: Embed the specified font in the EPUB. This option can be repeated
to embed multiple fonts. To use embedded fonts, you
will need to add declarations like the following to your CSS (see
`--epub-stylesheet`):
        @font-face {
          font-family: DejaVuSans;
          font-style: normal;
          font-weight: normal;
          src:url("DejaVuSans-Regular.ttf");
        }
        @font-face {
          font-family: DejaVuSans;
          font-style: normal;
          font-weight: bold;
          src:url("DejaVuSans-Bold.ttf");
        }
        @font-face {
          font-family: DejaVuSans;
          font-style: italic;
          font-weight: normal;
          src:url("DejaVuSans-Oblique.ttf");
        }
        @font-face {
          font-family: DejaVuSans;
          font-style: italic;
          font-weight: bold;
          src:url("DejaVuSans-BoldOblique.ttf");
        }
        body { font-family: "DejaVuSans"; }
`--latex-engine=`*pdflatex|lualatex|xelatex*
: Use the specified LaTeX engine when producing PDF output.
The default is `pdflatex`. If the engine is not in your PATH,
the full path of the engine may be specified here.
Citation rendering
------------------
`--bibliography=`*FILE*
: Specify bibliography database to be used in resolving
citations. The database type will be determined from the
extension of *FILE*, which may be `.mods` (MODS format),
`.bib` (BibLaTeX format, which will normally work for BibTeX
files as well), `.bibtex` (BibTeX format),
`.ris` (RIS format), `.enl` (EndNote format),
`.xml` (EndNote XML format), `.wos` (ISI format),
`.medline` (MEDLINE format), `.copac` (Copac format),
or `.json` (citeproc JSON). If you want to use multiple
bibliographies, just use this option repeatedly.
`--csl=`*FILE*
: Specify [CSL] style to be used in formatting citations and
the bibliography. If *FILE* is not found, pandoc will look
for it in
        $HOME/.csl
in unix and
        C:\Documents And Settings\USERNAME\Application Data\csl
in Windows. If the `--csl` option is not specified, pandoc
will use a default style: either `default.csl` in the
user data directory (see `--data-dir`), or, if that is
not present, the Chicago author-date style.
`--citation-abbreviations=`*FILE*
: Specify a file containing abbreviations for journal titles and
other bibliographic fields (indicated by setting `form="short"`
in the CSL node for the field). The format is described at
<http://citationstylist.org/2011/10/19/abbreviations-for-zotero-test-release/>.
Here is a short example:
        { "default": {
            "container-title": {
              "Lloyd's Law Reports": "Lloyd's Rep",
              "Estates Gazette": "EG",
              "Scots Law Times": "SLT"
            }
          }
        }
`--natbib`
: Use natbib for citations in LaTeX output.
`--biblatex`
: Use biblatex for citations in LaTeX output.
Math rendering in HTML
----------------------
`-m` [*URL*], `--latexmathml`[=*URL*]
: Use the [LaTeXMathML] script to display embedded TeX math in HTML output.
To insert a link to a local copy of the `LaTeXMathML.js` script,
provide a *URL*. If no *URL* is provided, the contents of the
script will be inserted directly into the HTML header, preserving
portability at the price of efficiency. If you plan to use math on
several pages, it is much better to link to a copy of the script,
so it can be cached.
`--mathml`[=*URL*]
: Convert TeX math to MathML (in `docbook` as well as `html` and `html5`).
In standalone `html` output, a small javascript (or a link to such a
script if a *URL* is supplied) will be inserted that allows the MathML to
be viewed on some browsers.
`--jsmath`[=*URL*]
: Use [jsMath] to display embedded TeX math in HTML output.
The *URL* should point to the jsMath load script (e.g.
`jsMath/easy/load.js`); if provided, it will be linked to in
the header of standalone HTML documents. If a *URL* is not provided,
no link to the jsMath load script will be inserted; it is then
up to the author to provide such a link in the HTML template.
`--mathjax`[=*URL*]
: Use [MathJax] to display embedded TeX math in HTML output.
The *URL* should point to the `MathJax.js` load script.
If a *URL* is not provided, a link to the MathJax CDN will
be inserted.
`--gladtex`
: Enclose TeX math in `<eq>` tags in HTML output. These can then
be processed by [gladTeX] to produce links to images of the typeset
formulas.
`--mimetex`[=*URL*]
: Render TeX math using the [mimeTeX] CGI script. If *URL* is not
specified, it is assumed that the script is at `/cgi-bin/mimetex.cgi`.
`--webtex`[=*URL*]
: Render TeX formulas using an external script that converts TeX
formulas to images. The formula will be concatenated with the URL
provided. If *URL* is not specified, the Google Chart API will be used.
Options for wrapper scripts
---------------------------
`--dump-args`
: Print information about command-line arguments to *stdout*, then exit.
This option is intended primarily for use in wrapper scripts.
The first line of output contains the name of the output file specified
with the `-o` option, or `-` (for *stdout*) if no output file was
specified. The remaining lines contain the command-line arguments,
one per line, in the order they appear. These do not include regular
Pandoc options and their arguments, but do include any options appearing
after a `--` separator at the end of the line.
`--ignore-args`
: Ignore command-line arguments (for use in wrapper scripts).
Regular Pandoc options are not ignored. Thus, for example,
        pandoc --ignore-args -o foo.html -s foo.txt -- -e latin1
is equivalent to
        pandoc -o foo.html -s
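
A wrapper script might consume the `--dump-args` output along the following lines. The function name `read_dump_args` is hypothetical, and the sample input is what the description of `--dump-args` implies for the command line shown in the comment, not captured pandoc output.

```shell
# Read --dump-args output from stdin and label each part.
# Hypothetical helper for a wrapper script; not part of pandoc.
read_dump_args() {
    # First line: output file name, or "-" for stdout.
    IFS= read -r outfile
    printf 'output: %s\n' "$outfile"
    # Remaining lines: one non-option argument per line.
    while IFS= read -r arg; do
        printf 'argument: %s\n' "$arg"
    done
}

# Stand-in for: pandoc --dump-args -o foo.html -s foo.txt -- -e latin1
printf '%s\n' foo.html foo.txt -e latin1 | read_dump_args
```

In a real wrapper, the `printf` stand-in would be replaced by the pandoc call itself, and the labelled lines by whatever processing the wrapper needs.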
[LaTeXMathML]: http://math.etsu.edu/LaTeXMathML/
[jsMath]: http://www.math.union.edu/~dpvc/jsmath/
[MathJax]: http://www.mathjax.org/
[gladTeX]: http://ans.hsh.no/home/mgg/gladtex/
[mimeTeX]: http://www.forkosh.com/mimetex.html
[CSL]: http://CitationStyles.org
Templates
=========
When the `-s/--standalone` option is used, pandoc uses a template to
add header and footer material that is needed for a self-standing
document. To see the default template that is used, just type
    pandoc -D FORMAT
where `FORMAT` is the name of the output format. A custom template
can be specified using the `--template` option. You can also override
the system default templates for a given output format `FORMAT`
by putting a file `templates/default.FORMAT` in the user data
directory (see `--data-dir`, above). *Exceptions:* For `odt` output,
customize the `default.opendocument` template. For `pdf` output,
customize the `default.latex` template.
Templates may contain *variables*. Variable names are sequences of
alphanumerics, `-`, and `_`, starting with a letter. A variable name
surrounded by `$` signs will be replaced by its value. For example,
the string `$title$` in
    <title>$title$</title>
will be replaced by the document title.
To write a literal `$` in a template, use `$$`.
Some variables are set automatically by pandoc. These vary somewhat
depending on the output format, but include:
`header-includes`
: contents specified by `-H/--include-in-header` (may have multiple
values)
`toc`
: non-null value if `--toc/--table-of-contents` was specified
`include-before`
: contents specified by `-B/--include-before-body` (may have
multiple values)
`include-after`
: contents specified by `-A/--include-after-body` (may have
multiple values)
`body`
: body of document
`title`
: title of document, as specified in title block
`author`
: author of document, as specified in title block (may have
multiple values)
`date`
: date of document, as specified in title block
`lang`
: language code for HTML or LaTeX documents
`slidy-url`
: base URL for Slidy documents (defaults to
`http://www.w3.org/Talks/Tools/Slidy2`)
`slideous-url`
: base URL for Slideous documents (defaults to `default`)
`s5-url`
: base URL for S5 documents (defaults to `ui/default`)
`fontsize`
: font size (10pt, 11pt, 12pt) for LaTeX documents
`documentclass`
: document class for LaTeX documents
`geometry`
: options for LaTeX `geometry` package, e.g. `margin=1in`;
may be repeated for multiple options
`mainfont`, `sansfont`, `monofont`, `mathfont`
: fonts for LaTeX documents (works only with xelatex
and lualatex)
`theme`
: theme for LaTeX beamer documents
`colortheme`
: colortheme for LaTeX beamer documents
`linkcolor`
: color for internal links in LaTeX documents (`red`, `green`,
`magenta`, `cyan`, `blue`, `black`)
`urlcolor`
: color for external links in LaTeX documents
`links-as-notes`
: causes links to be printed as footnotes in LaTeX documents
Variables may be set at the command line using the `-V/--variable`
option. This allows users to include custom variables in their
templates.
Templates may contain conditionals. The syntax is as follows:
$if(variable)$
X
$else$
Y
$endif$
This will include `X` in the template if `variable` has a non-null
value; otherwise it will include `Y`. `X` and `Y` are placeholders for
any valid template text, and may include interpolated variables or other
conditionals. The `$else$` section may be omitted.
When variables can have multiple values (for example, `author` in
a multi-author document), you can use the `$for$` keyword:
$for(author)$
<meta name="author" content="$author$" />
$endfor$
You can optionally specify a separator to be used between
consecutive items:
$for(author)$$author$$sep$, $endfor$
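The effect of `$for$` with a separator can be pictured as joining one rendered copy of the loop body per value. The following Python sketch models only that behaviour; `render_for` is a hypothetical helper, not part of pandoc.

```python
def render_for(values, body, sep=""):
    """Render body(value) once per value, placing sep between consecutive items."""
    return sep.join(body(value) for value in values)

authors = ["Jane Doe", "John Roe"]

# Corresponds to: $for(author)$$author$$sep$, $endfor$
print(render_for(authors, lambda a: a, sep=", "))
```

With the two sample authors this prints `Jane Doe, John Roe`. The separator appears only between items, never after the last one.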
If you use custom templates, you may need to revise them as pandoc
changes. We recommend tracking the changes in the default templates,
and modifying your custom templates accordingly. An easy way to do this
is to fork the pandoc-templates repository
(<http://github.com/jgm/pandoc-templates>) and merge in changes after each
pandoc release.
Pandoc's markdown
=================
Pandoc understands an extended and slightly revised version of
John Gruber's [markdown] syntax. This document explains the syntax,
noting differences from standard markdown. Except where noted, these
differences can be suppressed by using the `markdown_strict` format instead
of `markdown`. An extension can be enabled by adding `+EXTENSION`
to the format name and disabled by adding `-EXTENSION`. For example,
`markdown_strict+footnotes` is strict markdown with footnotes
enabled, while `markdown-footnotes-pipe_tables` is pandoc's
markdown without footnotes or pipe tables.
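To make the format-name convention concrete, here is a small Python sketch that splits such a string into its base format and its enabled and disabled extensions. The function `parse_format` is hypothetical, and the assumption that names consist only of letters, digits, and underscores is mine rather than pandoc's.

```python
import re

def parse_format(spec):
    """Split a format string such as 'markdown-footnotes-pipe_tables'."""
    # Assumption: format and extension names use only letters, digits and '_'.
    match = re.fullmatch(r"([A-Za-z0-9_]+)((?:[+-][A-Za-z0-9_]+)*)", spec)
    if match is None:
        raise ValueError(f"cannot parse format: {spec!r}")
    base, modifiers = match.groups()
    enabled, disabled = [], []
    for sign, name in re.findall(r"([+-])([A-Za-z0-9_]+)", modifiers):
        (enabled if sign == "+" else disabled).append(name)
    return base, enabled, disabled

print(parse_format("markdown_strict+footnotes"))
print(parse_format("markdown-footnotes-pipe_tables"))
```

The first call yields `('markdown_strict', ['footnotes'], [])` and the second `('markdown', [], ['footnotes', 'pipe_tables'])`.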
Philosophy
----------
Markdown is designed to be easy to write, and, even more importantly,
easy to read:
> A Markdown-formatted document should be publishable as-is, as plain
> text, without looking like it's been marked up with tags or formatting
> instructions.
> -- [John Gruber](http://daringfireball.net/projects/markdown/syntax#philosophy)
This principle has guided pandoc's decisions in finding syntax for
tables, footnotes, and other extensions.
There is, however, one respect in which pandoc's aims are different
from the original aims of markdown. Whereas markdown was originally
designed with HTML generation in mind, pandoc is designed for multiple
output formats. Thus, while pandoc allows the embedding of raw HTML,
it discourages it, and provides other, non-HTMLish ways of representing
important document elements like definition lists, tables, mathematics, and
footnotes.
Paragraphs
----------
A paragraph is one or more lines of text followed by one or more blank lines.
Newlines are treated as spaces, so you can reflow your paragraphs as you like.
If you need a hard line break, put two or more spaces at the end of a line.
**Extension: `escaped_line_breaks`**
A backslash followed by a newline is also a hard line break.
Headers
-------
There are two kinds of headers, Setext and atx.
### Setext-style headers ###
A setext-style header is a line of text "underlined" with a row of `=` signs
(for a level-one header) or `-` signs (for a level-two header):
A level-one header
==================
A level-two header
------------------
The header text can contain inline formatting, such as emphasis (see
[Inline formatting](#inline-formatting), below).
### Atx-style headers ###
An atx-style header consists of one to six `#` signs and a line of
text, optionally followed by any number of `#` signs. The number of
`#` signs at the beginning of the line is the header level:
## A level-two header
### A level-three header ###
As with setext-style headers, the header text can contain formatting:
# A level-one header with a [link](/url) and *emphasis*
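As an illustration of how the level is read off an atx-style header, the following Python sketch extracts the level and text from a single line. It is a simplified model with a hypothetical function name; it looks at one line in isolation, so it does not enforce any requirement about preceding blank lines, and it leaves inline formatting in the text untouched.

```python
import re

def parse_atx_header(line):
    """Return (level, text) for an atx-style header line, or None."""
    # One to six leading '#', the text, then an optional closing run of '#'.
    match = re.match(r"(#{1,6})(?!#)\s*(.*?)\s*#*\s*$", line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)

print(parse_atx_header("## A level-two header"))
print(parse_atx_header("### A level-three header ###"))
```

These calls give `(2, 'A level-two header')` and `(3, 'A level-three header')`. A line that begins with seven or more `#` signs, or with none, returns `None`.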
**Extension: `blank_before_header`**
Standard markdown syntax does not require a blank line before a header.
Pandoc does require this (except, of course, at the beginning of the
document). The reason for the requirement is that it is all too easy for a
`#` to end up at the beginning of a line by accident (perhaps through line
wrapping). Consider, for example:
I like several of their flavors of ice cream:
#22, for example, and #5.
### Header identifiers in HTML, LaTeX, and ConTeXt ###
**Extension**
Each header element in pandoc's HTML, LaTeX, and ConTeXt output is given a
unique identifier. This identifier is based on the text of the header.
To derive the identifier from the header text,
- Remove all formatting, links, etc.
- Remove all punctuation, except underscores, hyphens, and periods.
- Replace all spaces and newlines with hyphens.
- Convert all alphabetic characters to lowercase.
- Remove everything up to the first letter (identifiers may
not begin with a number or punctuation mark).
- If nothing is left after this, use the identifier `section`.
Thus, for example,
Header                            Identifier
-------------------------------   ----------------------------
Header identifiers in HTML        `header-identifiers-in-html`
*Dogs*?--in *my* house?           `dogs--in-my-house`
[HTML], [S5], or [RTF]?           `html-s5-or-rtf`
3. Applications                   `applications`
33                                `section`
These rules should, in most cases, allow one to determine the identifier
from the header text. The exception is when several headers have the
same text; in this case, the first will get an identifier as described
above; the second will get the same identifier with `-1` appended; the
third with `-2`; and so on.
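The derivation steps and the rule for repeated headers can be sketched in Python as follows. This is an approximation for illustration, not pandoc's code: the function names are hypothetical, and the first step (removing formatting and links) is assumed to have been done already, apart from simple emphasis and bracket marks, which drop out with the other punctuation.

```python
import re

def header_identifier(text):
    """Derive an identifier from header text using the steps listed above."""
    # Remove punctuation except underscores, hyphens, and periods.
    text = re.sub(r"[^\w\s.-]", "", text)
    # Replace each space or newline with a hyphen, then lowercase.
    text = re.sub(r"\s", "-", text).lower()
    # Remove everything up to the first letter.
    for index, char in enumerate(text):
        if char.isalpha():
            return text[index:]
    return "section"  # nothing is left

def assign_identifiers(headers):
    """Give repeated identifiers the suffixes -1, -2, and so on."""
    seen = {}
    result = []
    for text in headers:
        base = header_identifier(text)
        count = seen.get(base, 0)
        result.append(base if count == 0 else f"{base}-{count}")
        seen[base] = count + 1
    return result

print(header_identifier("*Dogs*?--in *my* house?"))
print(assign_identifiers(["Notes", "Notes", "Notes"]))
```

The first call prints `dogs--in-my-house`, matching the table above, and the second prints `['notes', 'notes-1', 'notes-2']`.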
These identifiers are used to provide link targets in the table of
contents generated by the `--toc/--table-of-contents` option. They
also make it easy to provide links from one section of a document to
another. A link to this section, for example, might look like this:
See the section on
[header identifiers](#header-identifiers-in-html).
Note, however, that this method of providing links to sections works
only in HTML, LaTeX, and ConTeXt formats.
If the `--section-divs` option is specified, then each section will
be wrapped in a `div` (or a `section`, if `--html5` was specified),
and the identifier will be attached to the enclosing `<div>`
(or `<section>`) tag rather than the header itself. This allows entire
sections to be manipulated using JavaScript or treated differently in
CSS.
Block quotations
----------------
Markdown uses email conventions for quoting blocks of text.
A block quotation is one or more paragraphs or other block elements
(such as lists or headers), with each line preceded by a `>` character
and a space. (The `>` need not start at the left margin, but it should
not be indented more than three spaces.)
> This is a block quote. This
> paragraph has two lines.
>
> 1. This is a list inside a block quote.
> 2. Second item.
A "lazy" form, which requires the `>` character only on the first
line of each block, is also allowed:
> This is a block quote. This
paragraph has two lines.
> 1. This is a list inside a block quote.
2. Second item.
Among the block elements that can be contained in a block quote are
other block quotes. That is, block quotes can be nested:
> This is a block quote.
>
> > A block quote within a block quote.
**Extension: `blank_before_blockquote`**
Standard markdown syntax does not require a blank line before a block
quote. Pandoc does require this (except, of course, at the beginning of the
document). The reason for the requirement is that it is all too easy for a
`>` to end up at the beginning of a line by accident (perhaps through line
wrapping). So, unless the `markdown_strict` format is used, the following does
not produce a nested block quote in pandoc:
> This is a block quote.
>> Nested.
Verbatim (code) blocks
----------------------
### Indented code blocks ###
A block of text indented four spaces (or one tab) is treated as verbatim
text: that is, special characters do not trigger special formatting,
and all spaces and line breaks are preserved. For example,
    if (a > 3) {
      moveShip(5 * gravity, DOWN);
    }
The initial (four space or one tab) indentation is not considered part
of the verbatim text, and is removed in the output.
Note: blank lines in the verbatim text need not begin with four spaces.
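The removal of the initial indentation can be sketched as follows. This Python fragment is an illustrative model only; the function name `strip_code_indent` is hypothetical.

```python
def strip_code_indent(lines):
    """Remove the initial four spaces or one tab from each code-block line."""
    stripped = []
    for line in lines:
        if line.startswith("    "):
            stripped.append(line[4:])
        elif line.startswith("\t"):
            stripped.append(line[1:])
        elif line.strip() == "":
            stripped.append("")  # blank lines need not be indented
        else:
            raise ValueError(f"not part of an indented code block: {line!r}")
    return stripped

block = [
    "    if (a > 3) {",
    "      moveShip(5 * gravity, DOWN);",
    "",
    "    }",
]
print("\n".join(strip_code_indent(block)))
```

Only the first four spaces (or the first tab) are removed, so any further indentation inside the code is preserved in the output.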
### Fenced code blocks ###
**Extension: `fenced_code_blocks`**
In addition to standard indented code blocks, pandoc supports
*fenced* code blocks. These begin with a row of three or more
tildes (`~`) or backticks (`` ` ``) and end with a row of tildes or
backticks that must be at least as long as the starting row. Everything
between these lines is treated as code. No indentation is necessary:
~~~~~~~
if (a > 3) {
  moveShip(5 * gravity, DOWN);
}
~~~~~~~
Like regular code blocks, fenced code blocks must be separated
from surrounding text by blank lines.
If the code itself contains a row of tildes or backticks, just use a longer
row of tildes or backticks at the start and end:
~~~~~~~~~~~~~~~~
~~~~~~~~~~
code including tildes
~~~~~~~~~~
~~~~~~~~~~~~~~~~
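When code is wrapped in fences programmatically, a fence length that is always safe can be computed from the code itself. The Python sketch below is one conservative way to do this, under the assumption that any run of the fence character anywhere in the code should be outgrown; `fence_code` is a hypothetical helper.

```python
import re

def fence_code(code, fence_char="~"):
    """Wrap code in a fence longer than any run of fence_char it contains."""
    runs = re.findall(re.escape(fence_char) + "+", code)
    longest = max((len(run) for run in runs), default=0)
    # A fence needs at least three characters.
    fence = fence_char * max(3, longest + 1)
    return f"{fence}\n{code}\n{fence}"

sample = "~~~~~~~~~~\ncode including tildes\n~~~~~~~~~~"
print(fence_code(sample))
```

For the sample, which contains rows of ten tildes, the opening and closing fences are eleven tildes long. Code without any tildes gets the minimum fence of three.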