-
-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: validate local and external references #26
feat: validate local and external references #26
Conversation
solves #7 |
Sorry, but that's not what is meant by "external reference". It means external to the schema document, not external to the workspace. I wasn't thinking about fetching schemas over HTTP and at this point and I don't want to go there. There are complications involved in doing that right that we don't want to deal with right now. For this task, you should only be considering schemas defined in the workspace or pre-registered in
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/address",
...
}
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/customer",
"type": "object",
"properties": {
...,
"address": { "$ref": "/schemas/address" } <-- External reference to a schema in the workspace
}
} |
Ahh! Got it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick review to point out a couple things. I didn't review in too much detail yet, because I know you're going to have to change a lot of it to get back on the right track.
What exactly is workspace, is it just a folder containing many schema.json files ? If we open a folder in an IDE and the folder has many schema.json files and with them it has another folder which also has schema,json files then how will the workspaces be defined here? |
"Workspace" is an LSP concept. Open a directory in vscode and look at the file tree for that directory. That's your workspace. Everything that's visible to the IDE/editor is your workspace. |
ok! Just a little confused. let's consider this example.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/c",
...
}
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/b",
...
}
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/a",
"type": "object",
"properties": {
...,
"address": { "$ref": "/schemas/b" }
}
...
} is the ref in another example just to kind of stress more on what I'm confused, is the ref pointing to $id or the location
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/c",
...
}
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/b",
...
}
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/a",
"type": "object",
"properties": {
...,
"address": { "$ref": "/schemas/b" }
}
...
} |
If you haven't read Structuring a Complex Schema on the website, definitely do that. It should help with the basic concepts. There are two schema identification concepts. One is the retrieval URI and the other is self-identification. Because these are files that exist on the file system, they have a However, JSON Schema also defines a way for schemas to declare their own identifier in addition to the natural identifier they get based on their file system location. This is done this using the So, going to your example, LSP-ID: {
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/c",
...
} LSP-ID: {
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/b",
...
} LSP-ID: {
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/schemas/a",
"type": "object",
"properties": {
...,
"address": { "$ref": "/schemas/b" }
}
...
} The question is, what does the reference at Now consider if the schema at |
got it! I almost wrote another huge example, but I read the comment once again and now I completely get it! |
I think I'm done with all the tasks and my PR is ready for review!! |
Please rebase and resolve conflicts. Also, make sure you run tests locally to make sure they pass before pushing. Don't rely on automation. |
It's definitely related to your changes. The tests pass on "main". |
Thanks for pointing that out @jdesrosiers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A found a couple bugs in some edge cases you probably hadn't considered.
Reference to an embedded Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/main",
"allOf": [
{ "$ref": "a" },
{ "$ref": "b" },
]
"$defs": {
"embedded-a": {
"$id": "a"
},
"embedded-b": {
"$id": "https://example.com/b"
}
}
}
These are valid references that get flagged as invalid.
$ref in a non-schema location
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://example.com/main",
"properties": {
"$ref": "foo"
},
"not-a-keyword": { "$ref": "foo" }
}
These get flagged as invalid references, but neither are references. They just look like references.
The design isn't great in that finding references is coupled to validating references. There are other features where we'll need to use or know about references. Ideally, we'd be able to reuse the reference identification functionality you've developed for those other features, but we can't because it's too coupled to validating references.
Similarly, I'd like to see the validation of the references decoupled from language server concerns. Eventually, I want to use the functionality developed here to make a CLI that validates all the schemas in a workspace. That means I need to use this validation functionality outside of the context of a language server. So, things like calling buildDiagnostic
is a problem. It should just return the validation information. Then we can call it from the language server and run buildDiagnostic
on the result or call it from the CLI and run whatever is need to present those errors on the console.
I will get on those asap! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't gone through this in detail yet, but this is a lot more code than I was expecting. I feel like this should be simpler. I think you're duplicating things that are already handled. I would expect you to make use of the managed instances. Every time the workspace changes, every schema resource in the workspace is revalidated and is available in the schemaResourceCache
. The schemas are already decomposed so there are no embedded schemas to worry about. You just need to collect the identifiers for every schema resource to get the list of valid identifiers. Then you just need to iterate over those schemaResources looking for and validating references, which shouldn't be too hard since you don't have to worry about embedded schemas.
Hey @jdesrosiers , I have refactored my solution more. I have removed the identifiers store but I have kept the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have kept the
inactiveDocumentStore
, since that caches the documents which are not open in the editor
I think you've misunderstood what I was telling you in my last review. The schemaResourceCache
should contain all schemas in the workspace, not just the ones that are open. All of them. You shouldn't need to invalidDocumentStore
. It should all be there already.
I have made some changes, please take a look @jdesrosiers ! |
Seems like there were a lot of changes made for schema resources 😅 |
Yes, sorry. There was a big change introducing a more powerful and performant AST for working with schemas that replaced |
Closing in favor of #44 |
validate external references
$id
validate internal references
TODO: