-
-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document / Clarify naming rules for identifiers #883
Comments
The regular expression in the metamodel expresses the rules for a valid identifier in the Concerto (.cto) language, including the escaping rules allowed in the Concerto (.cto) language, but it shouldn't do that. For example, you can define an enum in Concerto like:
This is legal, and it gives you enum values that look like this in the AST (when saved as JSON): [
{"$class": "concerto.metamodel@1.0.0.EnumProperty", "name": "1D"},
{"$class": "concerto.metamodel@1.0.0.EnumProperty", "name": "2D"},
{"$class": "concerto.metamodel@1.0.0.EnumProperty", "name": "3D"}
] This model works fine, and you can validate objects against it: Where it breaks down is when trying to validate the AST JSON against the metamodel, because the metamodel has this regex validator applied to the "name" field of the EnumProperty concept (as well as the Property and Declaration concepts):
But this regex is expressing the rules of (unparsed, potentially escaped) identifier strings in the Concerto (.cto) language, which it shouldn't be doing. The strings in the AST are already parsed and are not escaped. Because the value from the AST (e.g. "2D") doesn't match this regex, the model fails to validate against the metamodel, even though the model is valid. Put graphically with a similar example, we have something like this:
The regex should apply to the name after parsing (i.e. A somewhat analogous bug would be if the regex was trying to account for JSON escaping rules, when it would never receive a JSON-escaped value because the JSON unescaping would already have been done by the JSON parser. In this case, the regex accounts for Concerto escaping rules, but it never receives a Concerto-escaped value because the unescaping has already been done by the Concerto parser. |
I'm not sure the "Difficulty: Challenging" tag is quite warranted, unless there are backwards-compatibility concerns or you want to change the design. I'm not aware of any backwards-compatibility concerns, though. The parser seems to be parsing the name correctly, and the AST seems to be storing it correctly, and in most or all cases the Concerto tools work fine with the generated model and interpret the names from the AST correctly. It's just that some names which should be legal (according to the Concerto language) are considered to be invalid according to the metamodel. Fixing the regex just makes something work that didn't work before. Also, I think the design for Concerto names is okay and nothing needs to change except the regex. The solution is simply to express any identifier naming rules that you want in the metamodel against the parsed, non-escaped name rather than the Concerto-escaped name. The current regex is purely syntactic and allows any non-empty name as long as it's appropriately escaped, so with escaping out of the picture "any non-empty name" can be matched by simply replacing the regex with a length validator ( However, you have the option of making the regex smarter than that by incorporating some of your higher-level rules that are currently only expressed in Javascript code. For example, you probably have a rule that says "no declaration can be named '$class' or '$identifier' because those are reserved property names", and that rule could potentially be expressed in the regex. But that is optional. The current regex doesn't attempt that kind of validation at all, and attempting to add new validations into the regex is not really in the scope of fixing the existing regex bug. |
Thank you @DS-AdamMilazzo that is very helpful. Another dimension to consider is all the code generation targets. An identifier like |
Yes, but only the JSON property name truly needs to be preserved, and I think all those languages have a way to specify the JSON property name used for serialization. If |
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
Okay, bot, I'm commenting... |
The behaviour described here is a side-effect of an bug in the identifier grammar in the parser. I see three options to resolve this issue:
Option two provides backwards compatibility, however it circumvents the desired behaviour of identifiers. This option essentially would allow leading numbers in identifiers, which forces us to solve the problem of codegen conflicts/escaping (ref @dselman). |
What counts as 'undesired behavior' here? Presumably the whole purpose of the escaping mechanism is to allow the specification of names that are prohibited in their unescaped form. (Otherwise, why have an escaping mechanism at all?) So what subset of the otherwise prohibited names do you think should continue to be allowed via escaping, and what subset do you think should be blocked? |
Bug Report 🐛
.cto (o String \u0032\u005C3D) --parse--> AST ("2\3D") --serialize--> JSON ("2\3D")
-serialize--> Concerto (\u0032\u005C3D)
The regex should apply to the name after parsing (i.e. 2\3D in this example), because that's the true name, and that's what's stored in the Concerto object for the AST. The name/AST can be rendered into JSON (using JSON escaping rules) or rendered into Concerto (using Concerto escaping rules), but the escaping is not part of the value. It's part of the serialization format (JSON, Concerto, whatever).
Expected Behavior
Unclear. We need to tighten / document the expectations for round tripping from CTO to JSON of escaped identifiers and ensure that the CTO parser grammar is in sync with the AST regex, and code that checks for valid identifiers.
Current Behavior
Possible Solution
Steps to Reproduce
Context (Environment)
Desktop
Detailed Description
Possible Implementation
The text was updated successfully, but these errors were encountered: