-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
javascript parser fails too early on incomplete statement in function body #3727
Comments
For all targets of the javascript/javascript/ grammar that have been ported (CSharp, Go, Java, JavaScript, Python3, and just now after updating Cpp locally), I get only one message. I use the default error strategy. The code is generated using trgen, with Antlr 4.13.1.
Even the old tester outputs only one message. That uses Antlr 4.11.1.
What Antlr4 version and target are you using? |
We are using Antlr 4.12.0 But for 4.12.0 or 4.13.0 it detects 3 errors for So it looks like master is also using antlr 4.11.1? |
The Antlr Maven tester--which is obsolete--sets the version that the code is compiled against. For version 1.22 of the plugin, the runtime is 4.11.1. Separately, the tool version is specified in the top-level pom.xml. It should be the value that the tester runtime uses, 4.11.1. These values should not be updated unless there are releases of both the tool and the tester so that version skews cannot occur. Yes, we have new version of the Antlr tool, but the tester is still stuck at 4.11.1. There is an Issue asking for a new version, but that was a while ago. And, it hasn't been published since last year. The problem is that the testers are compiled against a specific version of Antlr. Instead, tools like these should use dynamic loads and calls, so any version of the Antlr .jar can be used. Instead, I suggest using trgen. Attached is a .zip file with a driver for Linux. Generated-Java.zip. I modified the "build.sh" file in the generated code to use 4.12.0 instead, and I still only get one error message, the same as with other versions. (Note, there's a version required for the grammar in the desc.xml, but I added it because of the Go target, which only started working with 4.13.0, and it is only informational at this point. The grammar should work for other versions of Antlr.) |
This seems to be fixed by #3387
|
This problem is back if we add an optional identifier to the anonymous function production in order to support function expressions like examples/FunctionExpression.js
So the anonymousFunction becomes:
Then the weird thing is that a function declaration with one error in the function body (see examples/FunctionDeclaration.js) is not parsed as a function declaration and it has 3 errors instead of 1. Could it be that the recovery algorithm doesn't work well because the grammar is now ambiguous? |
I was thinking the same thing, that the recovery does not work well with ambiguous grammars. But, then, I checked parsing for
For a syntactically valid functionBody example, consider
For sure, it feels that there is something wrong with the error recovery, but I haven't had enough debug time on the problem to figure out what it is doing exactly. Here is an analysis starting with the first SyntaxError() called.
Recover() is called, with the error state is 202 and current token is "function".
After Recover(), the current token is "function". Recover() is called a second time.
After Recover(), we are now on "onAction". SyntaxError() is called a second time.
The current token is "{". Recover() is called with the error state 778 and current token is "function". Recover() is called a second time.
We are in state 1093. Afterwards, the current token is still '{'. Recover() is called a third time.
Current state is 202. After, the current token is still '{'. Recover() is called a fourth time.
The error state is now 202. Notably, the routine now Consume()'s one token, and we are now on "event".
We are now on token "}", state 911, and call Recover(), which finally moves past the problem. AnalysisAdaptivePredict() flags an error while trying to figure out which alt of The error recovery only consumes input until it arrives at a token that is valid for a state that has the error. This not a minimal error correction strategy. There is another possible problem here in that AdaptivePredict() should have just chosen the |
For
function onAction (event) { event. }
The parser (generated for the Javascript grammar) reports 3 syntax errors instead of 1:
Pb#SYNTAX_ERROR 0[33..34]:no viable alternative at input 'function onAction(event) { event.}'
Pb#SYNTAX_ERROR 0[25..26]:no viable alternative at input '{'
Pb#SYNTAX_ERROR 0[33..34]:mismatched input '}' expecting {'#', 'null', BooleanLiteral, 'break', 'do', 'instanceof', 'typeof', 'case', 'else', 'new', 'var', 'catch', 'finally', 'return', 'void', 'continue', 'for', 'switch', 'while', 'debugger', 'function', 'this', 'with', 'default', 'if', 'throw', 'delete', 'in', 'try', 'as', 'from', 'of', 'class', 'enum', 'extends', 'super', 'const', 'export', 'import', 'async', 'await', 'yield', 'implements', StrictLet, NonStrictLet, 'private', 'public', 'interface', 'package', 'protected', 'static', Identifier}
I think it should report only the last one.
It doesn't recognize the function declaration, although up to the { the code is correct and it could recognize that we have a function declaration and a function body.
Not sure how to change the sync/recover methods of the ErrorStrategy, or how can this issue be fixed.
The text was updated successfully, but these errors were encountered: