Parsing is the first step of converting markdown to HTML. The parsing rules are responsible for separating markdown syntax from plain text. The parsers scan the markdown content and use various rules to produce a list of tokens.
Each token represents either a piece of markdown syntax (for example, the beginning of a fenced code block, a list item, etc.) or plain text that will be included as-is (escaped) in the final HTML text. The tokens will later be used by the Renderer to produce the actual HTML.
There are three kinds of parsing rules in Remarkable:
- Core rules
- Block rules
- Inline rules
Each uses different data structures and signatures. Unless you wish to modify the internal workflow of Remarkable, you will most likely only deal with Block and Inline rules.
Tokens come in three kinds:
- Tag token
- Content (`text`) token
- Block content (`inline`) tokens
All tokens have the following properties:
- `type`: The type of the token
- `level`: The nesting level of the associated markdown structure in the source.
Tokens generated by block parsing rules will also include a `lines` property, which is a two-element array marking the first and last line of the `src` used to generate the token.
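As a rough illustration of these common properties, the plain objects below mimic the shape of tokens a block parser might emit for a one-line blockquote. The type names, the `lines` values, and the `outline` helper are assumptions made for this sketch, not output captured from Remarkable.

```javascript
// Hypothetical token objects: the property names follow the description above,
// while the concrete type names and values are illustrative assumptions.
const tokens = [
  { type: "blockquote_open", level: 0, lines: [0, 1] },
  { type: "paragraph_open", level: 1, lines: [0, 1] },
  { type: "inline", level: 2, lines: [0, 1], content: "quoted text", children: [] },
  { type: "paragraph_close", level: 1 },
  { type: "blockquote_close", level: 0 },
];

// Indent each token type by its nesting level to make the structure visible.
function outline(tokenList) {
  return tokenList.map((t) => "  ".repeat(t.level) + t.type).join("\n");
}

console.log(outline(tokens));
```

Printing the outline shows how `level` increases by one for each structure nested inside another.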
Parsing rules usually generate at least three tokens:
- The start or open token marking the beginning of the markdown structure
- The content token (usually with type `inline` for a block rule, or `text` for an inline rule)
- The end or close token marking the end of the markdown structure
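The open/content/close pattern can be sketched as follows for a paragraph. `makeParagraphTokens` is a hypothetical helper written for this example and is not part of Remarkable; the type names and the choice to put `lines` only on the first two tokens are likewise assumptions.

```javascript
// Sketch of the three tokens a block rule for a paragraph might produce.
// `makeParagraphTokens` is a hypothetical helper, not a Remarkable function.
function makeParagraphTokens(text, level, firstLine, lastLine) {
  return [
    // Open token: marks the beginning of the structure.
    { type: "paragraph_open", level: level, lines: [firstLine, lastLine] },
    // Content token: holds the raw text, to be processed by inline rules later.
    {
      type: "inline",
      level: level + 1,
      lines: [firstLine, lastLine],
      content: text,
      children: [],
    },
    // Close token: marks the end of the structure.
    { type: "paragraph_close", level: level },
  ];
}

const paragraphTokens = makeParagraphTokens("Hello *world*", 0, 0, 1);
console.log(paragraphTokens.map((t) => t.type).join(" -> "));
```

The content token sits one level deeper than the open and close tokens that surround it.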
Tag tokens are used to represent markdown syntax. Each tag token represents a special markdown syntax in the original markdown source. They are usually used for the open and close tokens. Examples include the "```" at the beginning of a fenced code block, the start of a list item, or the end of an emphasized part of a line.
Tag tokens have a variety of types, and each type is associated with a rendering rule.
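One way to picture the link between a tag token type and its rendering rule is a lookup table keyed by type. The sketch below is a simplified assumption for illustration: `renderRules` and `renderTag` are hypothetical names, and Remarkable's actual renderer may use different signatures.

```javascript
// Hypothetical rule table: one rendering function per tag token type.
const renderRules = new Map([
  ["paragraph_open", () => "<p>"],
  ["paragraph_close", () => "</p>\n"],
  ["em_open", () => "<em>"],
  ["em_close", () => "</em>"],
]);

// Look up the rule for a token by its type and apply it.
function renderTag(token) {
  const rule = renderRules.get(token.type);
  if (rule === undefined) {
    throw new Error("No rendering rule for token type: " + token.type);
  }
  return rule(token);
}

console.log(renderTag({ type: "em_open", level: 0 }));
```

Using a `Map` keeps the lookup limited to the types that were registered explicitly.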
Text tokens represent plain text. They are usually used for the content of inline structures. Most of them are generated automatically by the inline parser, but inline parsing rules sometimes generate them explicitly.
A text token has a `content` property containing the text it represents.
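Since plain text ends up escaped in the final HTML, rendering a text token can be pictured as escaping its `content`. The functions `escapeHtml` and `renderText` below are hypothetical helpers for this sketch and only approximate what a real renderer does.

```javascript
// Minimal HTML escaping; `&` must be replaced first to avoid double escaping.
function escapeHtml(str) {
  return str
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical rendering of a text token: emit its escaped content.
function renderText(token) {
  return escapeHtml(token.content);
}

const textToken = { type: "text", content: "1 < 2 & 3 > 2", level: 0 };
console.log(renderText(textToken)); // 1 &lt; 2 &amp; 3 &gt; 2
```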
Inline tokens represent the content of a block structure. These tokens have two additional properties:
- `content`: The content of the block. This might include inline markdown syntax which may need further processing by the inline rules.
- `children`: This is initialized with an empty array (`[]`) and will be filled with the inline parser tokens as the inline parsing rules are applied.
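To show how `children` goes from empty to populated, here is a toy inline pass that only understands `*emphasis*`. It is a deliberate simplification and not Remarkable's inline parser; `fillChildren`, the token type names, and the `level` values are assumptions made for the example.

```javascript
// Toy inline pass: splits `content` on *emphasis* markers and fills `children`.
// A simplification for illustration only; not Remarkable's inline parser.
function fillChildren(inlineToken) {
  const src = inlineToken.content;
  const pattern = /\*([^*]+)\*/g;
  let last = 0;
  let match;
  while ((match = pattern.exec(src)) !== null) {
    // Plain text before the emphasized span becomes a text token.
    if (match.index > last) {
      inlineToken.children.push({ type: "text", content: src.slice(last, match.index), level: 0 });
    }
    // The emphasized span becomes an open tag, a text token, and a close tag.
    inlineToken.children.push({ type: "em_open", level: 0 });
    inlineToken.children.push({ type: "text", content: match[1], level: 1 });
    inlineToken.children.push({ type: "em_close", level: 0 });
    last = pattern.lastIndex;
  }
  // Any trailing plain text becomes a final text token.
  if (last < src.length) {
    inlineToken.children.push({ type: "text", content: src.slice(last), level: 0 });
  }
  return inlineToken;
}

const inlineToken = { type: "inline", level: 1, content: "Hello *world*!", children: [] };
fillChildren(inlineToken);
console.log(inlineToken.children.map((t) => t.type).join(", "));
// text, em_open, text, em_close, text
```

The `content` string is left untouched, while `children` now holds the tokens that a renderer would walk.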