Formal syntax guide #14

Open
IcedQuinn opened this issue Oct 31, 2022 · 2 comments


IcedQuinn commented Oct 31, 2022

Hello. I came across your project and was contemplating implementing a parser in Nim. I noticed there isn't really a syntax guide, so I read the API docs and the available examples and attempted a draft of one.


Node: a node associates some flags, zero or more children, zero or more tags, and a textual identifier together. Flags store the particular type of node the parser believes the text is (such as a number) and contextual information (such as whether it comes before or after a comma).
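As a rough illustration of that description, a node could be modelled like this in Python. This is only a sketch: the flag names (`NUMERIC`, `BEFORE_COMMA`, and so on) and the field names are my own placeholders, not the names the actual library uses.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class NodeFlag(Flag):
    # Hypothetical flag names; the real parser defines its own set.
    NONE = 0
    NUMERIC = auto()
    STRING = auto()
    BEFORE_COMMA = auto()
    AFTER_COMMA = auto()


@dataclass
class Node:
    text: str                                   # the textual identifier
    flags: NodeFlag = NodeFlag.NONE             # type and context information
    tags: list["Node"] = field(default_factory=list)      # zero or more tags
    children: list["Node"] = field(default_factory=list)  # zero or more children


# A numeric node that was followed by a comma.
n = Node("42", NodeFlag.NUMERIC | NodeFlag.BEFORE_COMMA)
```

Using a bit-flag enum keeps "what kind of text is this" and "what punctuation surrounded it" in one field, which matches how the draft describes flags.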

Tags: a tag is a node that begins with an @. A tag may also be directly followed by one block, which contains the arguments of the tag. Zero or more tags are written before the node they belong to.

Blocks: blocks begin with an opening symbol, contain zero or more nodes, and end with a closing symbol. The opening and closing symbols do not need to match but the symbol used is recorded as a flag so the programmer may enforce a standard if so chosen.

Opening symbols: {, (, and [

Closing symbols: }, ), and ]
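Since the parser only records which symbols were used, enforcing matched pairs would be up to the caller. A minimal sketch of such a check, where `delimiters_match` is a hypothetical helper of mine and not part of the library:

```python
# Conventional pairing of the opening and closing symbols listed above.
PAIRS = {"{": "}", "(": ")", "[": "]"}


def delimiters_match(opener: str, closer: str) -> bool:
    """Return True when a block's recorded closer is the usual partner of its opener."""
    return PAIRS.get(opener) == closer
```

A caller that wants strict syntax could run this over every block after parsing and reject input such as `( ... ]`.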

Implied block: The character : after a node's name indicates an implied block. An implied block may consist of a single block or it may consist of zero or more nodes up to the first separator encountered. This allows two forms of syntax to exist: foo: bar baz, ... and foo: { ... }.

Separator: ; or ,. A separator punctuates a list. Which separator was used is recorded in a node's flags so the programmer may enforce a standard if chosen.

Flags: flags hold special meanings attributed to a node. This can be whether the parser thinks the text represents a number or a string, what kind of boundary characters were used for the string, whether it came before a comma or semicolon, and so on.

Line comments: a line comment begins with // and goes until a new line character.

Block comments: a block comment begins with /* and continues until the matching */ is read. Block comments also nest, so for every /* there must exist a matching */.
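Nesting means a simple "search for the next */" is not enough; the scanner has to count depth. A small Python sketch of that idea (the function name `skip_block_comment` is just for illustration):

```python
def skip_block_comment(src: str, pos: int) -> int:
    """Return the index just past the block comment that starts at pos."""
    if not src.startswith("/*", pos):
        raise ValueError("no block comment at this position")
    depth = 0
    i = pos
    while i < len(src):
        if src.startswith("/*", i):
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif src.startswith("*/", i):
            depth -= 1          # leaving one level of nesting
            i += 2
            if depth == 0:
                return i        # the outermost comment is closed
        else:
            i += 1
    raise ValueError("unterminated block comment")
```

For example, on `/* a /* b */ c */ foo` the function skips both levels and leaves ` foo` for the parser.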

Comments: comments come in line or block forms. Comments are read during parsing but are thrown away. They exist for authors to keep notes to themselves that are of no interest to the computer.

Escape codes: an escape code is a \ followed by one or more other symbols. For example \\ means the backslash is itself escaped and so should be replaced with a single \ character in the output. Escape codes apply only inside strings.

Strings: a string begins and ends with a boundary character. All text between the boundaries is stored within a single node as the node's text. If the boundary characters are entered in triplicate, the string is allowed to contain multiple lines. Escape codes make it possible to insert special characters that cannot otherwise be entered. Single quotes, double quotes, and backticks are allowed as boundaries or triple boundaries.
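To make the string and escape rules concrete, here is a hedged Python sketch of a string reader. The escape table is an assumption on my part (the draft only confirms `\\`), and whether the real parser resolves escapes or leaves them for the caller is something I have not verified.

```python
BOUNDARIES = ("'", '"', "`")
# Assumed escape table; only the backslash entry comes from the draft above.
ESCAPES = {"\\": "\\", "n": "\n", "t": "\t", "'": "'", '"': '"', "`": "`"}


def read_string(src: str, pos: int):
    """Read a string starting at pos; return (text, boundary, index after it)."""
    quote = src[pos]
    if quote not in BOUNDARIES:
        raise ValueError("not a string boundary")
    triple = src.startswith(quote * 3, pos)
    boundary = quote * 3 if triple else quote
    i = pos + len(boundary)
    out = []
    while i < len(src):
        if src.startswith(boundary, i):
            return "".join(out), boundary, i + len(boundary)
        ch = src[i]
        if ch == "\\" and i + 1 < len(src):
            nxt = src[i + 1]
            # Unknown escapes are kept verbatim rather than rejected.
            out.append(ESCAPES.get(nxt, "\\" + nxt))
            i += 2
            continue
        if ch == "\n" and not triple:
            raise ValueError("newline inside a single-line string")
        out.append(ch)
        i += 1
    raise ValueError("unterminated string")
```

The returned boundary is what would be recorded in the node's flags. One caveat of this sketch: an empty string immediately followed by another quote of the same kind is read as a triple boundary.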

Examples

foo: {
	bar baz
}
  • Create a node called foo
  • Open an implicit grouping
  • Create an explicit group with {
  • Create nodes bar and baz inside the group
  • Close the explicit group with }
  • Because we are in an implicit group, the newly created group would normally be added as a child of foo. Since the implicit group consists of a single explicit group, though, the special rule applies: its children are moved individually to foo rather than the group being added as its own node.

IcedQuinn commented Nov 2, 2022

I think I may have misunderstood implicit grouping. It looks like : is an assignment of a single interesting node. So foo: bar baz only assigns bar to foo and baz is a separate thing entirely. That would match the examples which use only whitespace to separate two named blocks.


Oh. Seems if the child is a block then the block itself is the entire group and the parser moves on. Otherwise a set of nodes is read up until a separator is encountered.
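Here is how I would sketch that rule in Python. It is a toy: it handles only bare names, the colon, blocks, and separators (no strings, tags, or comments), and ending an implied block at a closing symbol or at end of input is my own assumption.

```python
import re
from dataclasses import dataclass, field

OPEN, CLOSE, SEP = "{([", "})]", ";,"
TOKEN = re.compile(r"[A-Za-z_0-9]+|[{}()\[\]:;,]")


@dataclass
class Node:
    text: str
    children: list = field(default_factory=list)


def parse_nodes(toks, i, implied=False):
    """Parse sibling nodes from toks[i]; return (nodes, next index)."""
    nodes = []
    while i < len(toks) and toks[i] not in CLOSE:
        if toks[i] in SEP:
            if implied:
                break           # a separator ends an implied block
            i += 1
            continue
        node, i = parse_node(toks, i)
        nodes.append(node)
    return nodes, i


def parse_block(toks, i):
    """Parse an explicit block whose opener is at toks[i]."""
    children, i = parse_nodes(toks, i + 1)
    if i >= len(toks):
        raise ValueError("unclosed block")
    return children, i + 1      # the closer need not match the opener


def parse_node(toks, i):
    if toks[i] in OPEN:         # anonymous block
        node = Node("")
        node.children, i = parse_block(toks, i)
        return node, i
    node = Node(toks[i])
    i += 1
    if i < len(toks) and toks[i] == ":":
        i += 1
        if i < len(toks) and toks[i] in OPEN:
            # The block itself is the whole group.
            node.children, i = parse_block(toks, i)
        else:
            # Otherwise read nodes until a separator.
            node.children, i = parse_nodes(toks, i, implied=True)
    return node, i


def parse(src):
    toks = TOKEN.findall(src)
    nodes, i = parse_nodes(toks, 0)
    if i != len(toks):
        raise ValueError("unexpected closing symbol")
    return nodes
```

With this sketch, `foo: { bar baz } qux` gives foo the children bar and baz and leaves qux as a sibling, while `foo: bar baz, qux` gives the same shape through the separator rule.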


I managed to implement this based on all of the above: https://git.sr.ht/~icedquinn/icedmetadesk

@IcedQuinn

Checking in on this again. Will be doing a short review of what happened since I last posted this in a bit.
