How are delegates selected? #35

Open
jeffzoch opened this issue Feb 4, 2021 · 1 comment

jeffzoch commented Feb 4, 2021

I'm finding that the order in which I declare my delegates in a parser grammar affects whether or not the input parses. I have a grammar like the following:

internal class Parser: Grammar<List<Command>>() {
    internal val comments by regexToken("#.*\n", true)
    internal val str by regexToken("\".*\"")
    internal val queryType by regexToken("[A-Z]+(?:_[A-Z]+)*")
    internal val word by regexToken("[A-Za-z]+")
    internal val LPAR by literalToken("(")
    internal val RPAR by literalToken(")")
    internal val COLON by literalToken(":")
    internal val LBRACE by literalToken("{")
    internal val RBRACE by literalToken("}")
    internal val equals by literalToken("=")
    internal val ws by regexToken("\\s+",true)
    internal val newline by regexToken("[\r\n]+",true)
    internal val comma by literalToken(",")
    internal val param: Parser<ValueMetadata> by (word and -COLON and word) map { (p, t) ->
        ValueMetadata(p.text, Type.valueOf(t.text))
    }
    val params by -LPAR and separatedTerms(param, comma, true) and -RPAR
    val outputs by -LPAR and separatedTerms(param, comma, true) and -RPAR
    val cmdParser by ( -LBRACE  and queryType and -equals and str and -RBRACE )
    val funcParser: Parser<Command> by (word and params and -COLON and params and cmdParser) map {
        (name, inputs, outputs, cmdFunc) ->
        val (type,cmd) = cmdFunc
        Command(name.text,
                inputs,
                outputs,
                cmd.text.subSequence(1, cmd.length - 1).toString(),
                QueryType.valueOf(type.text)
        )
    }

    override val rootParser: Parser<List<Command>> by zeroOrMore(funcParser)
}

that's meant to parse

# Documentation that should be ignored
findFoo(test:String,entity:String):(foo:String,bar:Int) {
  SQL_QUERY = "select foo,bar from baz where z = :name and y = :entity"
}
# Documentation that should be ignored
findBar(test:String,entity:String):(foo:String,bar:Int) {
  SQL_QUERY = "select foo,bar from baz where z = :name and y = :entity"
}

into a list of Commands. Just by switching the order of str, queryType, and word, the parse fails or passes on different test cases, with errors like:

Could not parse input: UnparsedRemainder(startsWith=word@2 for "findFoo" at 39 (2:1))

h0tk3y (Owner) commented Mar 17, 2021

The tokens you declare with delegation are matched in the same order as they are declared. So if the tokenization is ambiguous (which is often the case), the tokens declared earlier take priority.
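
To make the ordering effect concrete, here is a minimal plain-Kotlin sketch of a first-match tokenizer. It is not better-parse's actual implementation; `TokenDef`, `tokenize`, and the `noMatch` marker are hypothetical names used only for illustration. The two patterns are copied from the grammar in the question, and `queryTypeBounded` is an assumed variant with a trailing `\b`.

```kotlin
// Hypothetical stand-in for a token definition: a name plus a pattern.
data class TokenDef(val name: String, val pattern: Regex)

// First-match tokenizer (illustrative only): at each position, try the
// definitions in declaration order and take the first one that matches
// exactly at that position.
fun tokenize(input: String, defs: List<TokenDef>): List<Pair<String, String>> {
    val out = mutableListOf<Pair<String, String>>()
    var pos = 0
    while (pos < input.length) {
        var matched = false
        for (def in defs) {
            val m = def.pattern.find(input, pos)
            if (m != null && m.range.first == pos && m.value.isNotEmpty()) {
                out += def.name to m.value
                pos = m.range.last + 1
                matched = true
                break
            }
        }
        if (!matched) {
            // Nothing matches here: record the untokenized rest and stop.
            out += "noMatch" to input.substring(pos)
            break
        }
    }
    return out
}

// Patterns copied from the grammar in the question.
val queryType = TokenDef("queryType", Regex("[A-Z]+(?:_[A-Z]+)*"))
val word = TokenDef("word", Regex("[A-Za-z]+"))
// Assumed variant: a trailing word boundary so "S" in "String" is rejected.
val queryTypeBounded = TokenDef("queryType", Regex("[A-Z]+(?:_[A-Z]+)*\\b"))

fun main() {
    // queryType first: the capital "S" of "String" is taken as a queryType.
    println(tokenize("String", listOf(queryType, word)))
    // word first: "SQL" is taken as a word and "_QUERY" cannot be tokenized.
    println(tokenize("SQL_QUERY", listOf(word, queryType)))
    // Bounded queryType first: both inputs come out as a single token.
    println(tokenize("String", listOf(queryTypeBounded, word)))
    println(tokenize("SQL_QUERY", listOf(queryTypeBounded, word)))
}
```

Under this sketch, neither ordering of the unbounded `queryType` and `word` handles both `String` and `SQL_QUERY`, which would be consistent with different test cases failing as the declarations are reordered. Whether the same happens inside the library depends on its real matching rules.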

Note also this section in the README:

Note: the tokens order matters in some cases, because the tokenizer tries to match them in exactly this order. For instance, if literalToken("a") is listed before literalToken("aa"), the latter will never be matched. Be careful with keyword tokens! If you match them with regexes, a word boundary \b in the end may help against ambiguity.
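
The two cases from the README note can be sketched with the standard `Regex` class alone. `firstMatchAt` is a hypothetical helper, not part of better-parse; it only mimics "try the patterns in order, take the first hit", and the `if` keyword and `ident` patterns are made-up examples.

```kotlin
// Returns the name and text of the first pattern (in list order) that matches
// exactly at `pos`, or null when none does. Illustrative helper only.
fun firstMatchAt(
    input: String,
    pos: Int,
    patterns: List<Pair<String, Regex>>
): Pair<String, String>? {
    for ((name, regex) in patterns) {
        val m = regex.find(input, pos) ?: continue
        if (m.range.first == pos) return name to m.value
    }
    return null
}

fun main() {
    val a = "a" to Regex(Regex.escape("a"))
    val aa = "aa" to Regex(Regex.escape("aa"))
    // "a" is listed first, so it wins and "aa" is never reached.
    println(firstMatchAt("aa", 0, listOf(a, aa)))
    // Listing the longer literal first lets it match.
    println(firstMatchAt("aa", 0, listOf(aa, a)))

    val ifKeyword = "if" to Regex("if")
    val ifBounded = "if" to Regex("if\\b")
    val ident = "ident" to Regex("[a-z]+")
    // Without \b the keyword pattern claims the prefix of "iffy".
    println(firstMatchAt("iffy", 0, listOf(ifKeyword, ident)))
    // With \b the keyword is rejected and the identifier matches whole.
    println(firstMatchAt("iffy", 0, listOf(ifBounded, ident)))
    // A real keyword followed by a space still matches.
    println(firstMatchAt("if x", 0, listOf(ifBounded, ident)))
}
```

Either fix works in this sketch: declare the more specific token first, or constrain it with `\b` so it cannot claim a prefix of a longer word.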
