Possible improvements for tool builders #29

neon12345 · 2024-02-09T15:24:49Z

I am the author of a tool for converting the antlr parse tree into a denser AST, which can output a standalone JavaScript version to query/transform/print/visualize using CSS selectors. (The end goal is to use machine learning with rule induction for AST transformation/printing.) Demo

Since there is the possibility of changing things with a new version, it would be nice to have a unique antlr state for as many positions in the grammar as possible. This does not currently apply to recursive rules.

ftomassetti · 2024-02-09T15:29:40Z

Hi @neon12345 I trying the demo but I get an error. Can you help me figuring out how to use the tool?
I also created several libraries to generate an AST on top of ANTLR's parse tree. One we created for TypeScript is called Tylasu https://github.com/Strumenta/tylasu . Maybe there are some overlaps with you work.

I think getting suggestions for ANTLR 5 is very useful, especially when coming from tool builders like you.

Regarding your question, I am not sure I get what you mean with "unique antlr state for as many positions in the grammar as possible". Could you give me an example?

neon12345 · 2024-02-09T15:38:02Z

@ftomassetti

Your ISP needs to support IPv6. Other than that, when you press the Run button, the source code on the top left is sent to the server and the returned JavaScript AST is transformed with the script at the bottom into the top right result editor. There is also some help accessible with the bottom tab.

neon12345 · 2024-02-09T15:47:00Z

@ftomassetti

For recursive rules, the parentState is set for the context. What we need from the context and currently have to patch in, is the state before the precpred.

        _localctx = _tracker.createInstance<PatternContext>(parentContext, parentState);
        pushNewRecursionContext(_localctx, startState, RulePattern);
        setState(1913);

        if (!(precpred(_ctx, 2))) throw FailedPredicateException(this, "precpred(_ctx, 2)");
        setState(1914);
        match(Swift5Parser::AS);

ftomassetti · 2024-02-12T07:31:48Z

@neon12345 this is how I see the demo: https://www.loom.com/share/1523249bd9f2458785db29ef2c4ea421
I guess my ISP should support IPv6

neon12345 · 2024-02-12T08:34:13Z

@ftomassetti
Can you test IPv6 compatibility with for example https://ipv6-test.com or https://test-ipv6.com/ to be sure? Also try ping antlr.syncsocial.de

kaby76 · 2024-02-12T12:17:07Z

For recursive rules, the parentState is set for the context. What we need from the context and currently have to patch in, is the state before the precpred.

Can't this be compute using "parentState" and the ATN containing that NFA state number? Given a parse tree, get "parentState" and "see if it is one of your recursive productions". Then, examine the ATN: follow the edges in the ATN back until you find a transition that involves "_p". Presumably, this would be a PrecedencePredicateTransition edge.

neon12345 · 2024-02-12T12:29:31Z

Can't this be compute using "parentState" and the ATN containing that NFA state number?

The way how we work with antlr is a visitor that gives the information about the position in the grammar with the antlr states. From a parsing perspective, it is best to get the information from the current context without many calculations. Thus from our perspective, it would be more valuable to give another state to the context. But I don't know other use cases and why the state had to be stored this way.

ftomassetti · 2024-02-12T13:48:33Z

@ftomassetti Can you test IPv6 compatibility with for example https://ipv6-test.com or https://test-ipv6.com/ to be sure? Also try ping antlr.syncsocial.de

Oh... surprisingly ipv6 does not work for me, so I am afraid I cannot watch the demo

kaby76 · 2024-02-12T16:34:53Z

it would be more valuable to give another state to the context. But I don't know other use cases and why the state had to be stored this way.

Considering the result of a parse is a parse tree, it seems fine to "hang" the "calling state of the ATN to another ATN" in the parent parse tree node. (It really should have been an edge ID because, possibly, one could have multiple edges with the same non-terminal symbol from a state.)

To avoid computing the precedence predicate state in a parse tree traversal, you could pre-compute prior to parse the precedence predicate state from the ATNs, and place them in a O(1) map.

So, for the java grammar, state "1372" would be the "parentState" for an ExpressionContext, but the computed precedence predicate state be "1370".

neon12345 · 2024-02-12T17:44:58Z

@kaby76
Thanks for the suggestions, I don't have a problem getting the values I need and it is fine for me to keep things like they are. I just wonder if the stored information is used as intended or could be improved, considering the state is also not consistently stored in the parse tree.

ericvergnaud pinned this issue Feb 9, 2024

ericvergnaud unpinned this issue Feb 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible improvements for tool builders #29

Possible improvements for tool builders #29

neon12345 commented Feb 9, 2024

ftomassetti commented Feb 9, 2024

neon12345 commented Feb 9, 2024

neon12345 commented Feb 9, 2024

ftomassetti commented Feb 12, 2024 •

edited

Loading

neon12345 commented Feb 12, 2024 •

edited

Loading

kaby76 commented Feb 12, 2024

neon12345 commented Feb 12, 2024

ftomassetti commented Feb 12, 2024

kaby76 commented Feb 12, 2024

neon12345 commented Feb 12, 2024

Possible improvements for tool builders #29

Possible improvements for tool builders #29

Comments

neon12345 commented Feb 9, 2024

ftomassetti commented Feb 9, 2024

neon12345 commented Feb 9, 2024

neon12345 commented Feb 9, 2024

ftomassetti commented Feb 12, 2024 • edited Loading

neon12345 commented Feb 12, 2024 • edited Loading

kaby76 commented Feb 12, 2024

neon12345 commented Feb 12, 2024

ftomassetti commented Feb 12, 2024

kaby76 commented Feb 12, 2024

neon12345 commented Feb 12, 2024

ftomassetti commented Feb 12, 2024 •

edited

Loading

neon12345 commented Feb 12, 2024 •

edited

Loading