JSON Schema Ref Expander

This tool can normalize given JSON schemas and check them for recursion. It is based on Draft04.

Getting started

How to use the tool

Normalization:
To normalize your schemas, some additional parameters are needed:
- For -repositorytype, choose between -normal, -testsuite, and -corpus, depending on where your schemas come from.
- For -allowDistributedSchemas, choose between -true and -false, depending on whether you want to allow distributed schemas.
- For -fetchSchemasOnline, choose between -true and -false, depending on whether you want to download references from the internet. If -false is chosen, a referenced URI is only looked up in the file UriOfFiles.csv. Each line of UriOfFiles.csv should have the form file, URI, which means that the schema in file is used whenever URI is referenced. file should be stored in a directory called Store. If the referenced URI is not listed in UriOfFiles.csv, a StoreException is thrown. If -true is chosen, the tool instead goes on to load the schema from the URI.
- Finally, "pathToDir" should be the path to the directory in which the schemas are stored. The path should be in quotation marks.
  
  java -jar jarfile -normalize -repositorytype -allowDistributedSchemas -fetchSchemasOnline "pathToDir"
  
  If -corpus was chosen as the repository type, an additional parameter with the path to the file repos_fullpath.csv (pathToReposFullpath) is needed.
  
  java -jar jarfile -normalize -corpus -allowDistributedSchemas -fetchSchemasOnline "pathToDir" "pathToReposFullpath"
- "linksToPermalinks" should be the path to a CSV file that maps link prefixes to permalink prefixes, so that all web references in a schema are loaded using the permalink instead of the original link. This parameter is optional and can be omitted if no permalinks should be used.
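The lookup behavior described for -fetchSchemasOnline can be sketched as follows. This is a minimal illustration in Python, not the tool's actual Java code: the names `StoreError`, `load_uri_table`, and `resolve` are hypothetical, and the order "look in UriOfFiles.csv first, then fall back to the URI" is an assumption read from the description above.

```python
import csv
import io


class StoreError(Exception):
    """Hypothetical stand-in for the tool's StoreException."""


def load_uri_table(csv_text):
    """Parse lines of the form 'file, URI' into a URI -> file mapping."""
    table = {}
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        file_name, uri = (cell.strip() for cell in row)
        table[uri] = file_name
    return table


def resolve(uri, table, fetch_online):
    """Return the location the schema for `uri` would be read from."""
    if uri in table:
        return "Store/" + table[uri]  # local copy listed in UriOfFiles.csv
    if fetch_online:
        return uri  # assumed fallback: download from the URI itself
    raise StoreError(uri + " is not listed in UriOfFiles.csv")


# Hypothetical file content, for illustration only.
table = load_uri_table("draft4.json, http://json-schema.org/draft-04/schema\n")
print(resolve("http://json-schema.org/draft-04/schema", table, fetch_online=False))
```

With `fetch_online=False`, a URI that is missing from the table raises the error, which mirrors the StoreException case described above.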
Recursion checking:
See here for an explanation.
Again, pathToDir should be the path to the directory in which the schemas are stored.
java -jar jarfile -recursion "pathToDir"
Statistics:
The tool gathers statistics about the distribution of single-file and distributed schemas and about the frequency of recursion in them. It also records how the number of lines of code changes from the unnormalized to the normalized schemas, and it creates an overall overview.
Again, pathToDir should be the path to the directory in which the schemas are stored. pathToNormalizedDir should be the path to the directory in which the normalized schemas are stored.
java -jar jarfile -stats "pathToDir" "pathToNormalizedDir"

Dockerfile

A Dockerfile can be found here. It normalizes the schemas of the TestSuite (commit 0c223de), the SchemaStore (commit 2ad0b3d), and the SchemaCorpus (commit 9c0e796), and afterward gathers the statistics. To keep this process reproducible, all external references have already been downloaded, and these downloaded copies are used.

Normalization process

In a normalized schema, all references follow the JSON Pointer syntax, and each of them points either to a direct child of the definitions section or to the top-level schema. As a consequence, distributed schemas are consolidated into one file.
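The condition above can be checked mechanically. The following Python sketch is not part of the tool; the names `collect_refs` and `non_normalized_refs` are hypothetical. It treats every string-valued "$ref" key as a reference, which is a simplification (a property that happens to be named "$ref" would be misread).

```python
import re

# "#" (the top-level schema) or "#/definitions/<name>" with no further segment.
NORMALIZED_REF = re.compile(r"#(/definitions/[^/]+)?")


def collect_refs(node):
    """Yield every string-valued "$ref" found anywhere in a parsed schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from collect_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_refs(item)


def non_normalized_refs(schema):
    """Return the references that violate the normal form."""
    return [r for r in collect_refs(schema) if not NORMALIZED_REF.fullmatch(r)]


schema = {
    "properties": {
        "name": {"type": "string"},
        "surname": {"$ref": "#/properties/name"},
        "children": {"type": "array", "items": {"$ref": "#"}},
    }
}
print(non_normalized_refs(schema))  # ['#/properties/name']
```

An empty result means that every reference already points to the top-level schema or to a direct child of the definitions section.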

Examples

{
  "properties": {
    "name": {"type": "string"},
    "surname": {"$ref": "#/properties/name"},
    "children": {
      "type": "array",
      "items": {"$ref": "#"}
    }
  }
}

All references use JSON Pointer syntax, but "#/properties/name" does not point to a direct child of the definitions section. Therefore, the referenced schema is resolved and copied to the definitions section. The normalized version of the above schema:

{
  "properties": {
    "name": {"type": "string"},
    "surname": {"$ref": "#/definitions/properties_name"},
    "children": {
      "type": "array",
      "items": {"$ref": "#"}
    }
  },
  "definitions": {
    "properties_name": {"type": "string"}
  }
}

Schemas can also be distributed across separate files, as in the following example:

{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "surname": {"type": "string"},
    "address": {"$ref": "folder/locations.json#/definitions/address"}
  }
}

File: schema.json

{
  "definitions": {
    "address": {
      "street": {"type": "string"},
      "number": {"type": "integer"},
      "city": {"type": "string"},
      "country": {"type": "string"}
    }
  }
}

File: folder/locations.json

The schema in schema.json references a subschema in folder/locations.json. Therefore, the referenced content is copied to the definitions section of schema.json. The normalized version of the schema in schema.json:

{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "surname": {"type": "string"},
    "address": {"$ref": "#/definitions/folder_locations.json_definitions_address"}
  },
  "definitions": {
    "folder_locations.json_definitions_address": {
      "street": {"type": "string"},
      "number": {"type": "integer"},
      "city": {"type": "string"},
      "country": {"type": "string"}
    }
  }
}
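Both examples suggest that the name of a copied definition is built from the file path and the pointer segments of the original reference, joined with underscores. The Python sketch below reproduces that pattern. It is inferred from the two examples only, so the actual naming rules of the tool (for instance for name clashes or escaped characters) may differ, and the function names are hypothetical.

```python
def definition_name(ref):
    """Derive a definitions key from a reference, as seen in the examples."""
    file_part, _, pointer = ref.partition("#")
    # Treat the file path and the JSON Pointer as one list of segments.
    segments = [s for s in (file_part + pointer).split("/") if s]
    return "_".join(segments)


def normalized_ref(ref):
    """Rewrite a reference so that it targets the copied definition."""
    if ref == "#":
        return ref  # the top-level schema is already a valid target
    return "#/definitions/" + definition_name(ref)


print(normalized_ref("#/properties/name"))
print(normalized_ref("folder/locations.json#/definitions/address"))
```

For the two references used in the examples, this yields "#/definitions/properties_name" and "#/definitions/folder_locations.json_definitions_address".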

Recursion checking

A distinction is made between guarded and unguarded recursion. The difference is that the behavior of unguarded-recursive schemas during validation is not defined, so validation may not terminate in finite time. The tool only guarantees correct output for normalized schemas.
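The distinction can be illustrated with a small Python sketch for normalized schemas. It assumes one common reading of the terms, which this README does not spell out: a reference is guarded if it is reached through a keyword that descends into a child of the instance (such as properties or items), and unguarded if it is reached only through keywords applied to the same instance (allOf, anyOf, oneOf, not, dependencies, or a bare $ref). All names are hypothetical, and the tool's own check may work differently.

```python
SAME_INSTANCE = {"allOf", "anyOf", "oneOf", "not", "dependencies"}
CHILD_INSTANCE = {"properties", "patternProperties", "additionalProperties",
                  "items", "additionalItems"}
MAP_KEYWORDS = {"properties", "patternProperties", "dependencies"}


def ref_edges(schema, guarded=False):
    """Yield (target, guarded) for each $ref inside one schema node."""
    if not isinstance(schema, dict):
        return
    if isinstance(schema.get("$ref"), str):
        yield schema["$ref"], guarded
    for keyword in SAME_INSTANCE | CHILD_INSTANCE:
        if keyword not in schema:
            continue
        value = schema[keyword]
        if keyword in MAP_KEYWORDS and isinstance(value, dict):
            children = list(value.values())  # name -> subschema map
        elif isinstance(value, list):
            children = value  # list of subschemas
        else:
            children = [value]  # single subschema (or a boolean)
        for child in children:
            yield from ref_edges(child, guarded or keyword in CHILD_INSTANCE)


def has_unguarded_recursion(schema):
    """True if some reference cycle uses only unguarded references."""
    nodes = {"#": schema}
    for name, body in schema.get("definitions", {}).items():
        nodes["#/definitions/" + name] = body
    unguarded = {
        ref: {t for t, g in ref_edges(body) if not g and t in nodes}
        for ref, body in nodes.items()
    }

    def reaches(start, current, seen):
        for nxt in unguarded[current]:
            if nxt == start:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(start, nxt, seen):
                    return True
        return False

    return any(reaches(n, n, {n}) for n in nodes)


print(has_unguarded_recursion({"properties": {"child": {"$ref": "#"}}}))  # False
print(has_unguarded_recursion({"anyOf": [{"$ref": "#"}]}))  # True
```

The first schema is recursive, but each step of the recursion moves into a property of the instance, so validation makes progress. The second schema refers back to itself on the same instance, which is the unguarded case.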

Applicability on other drafts

This tool is based on Draft04 and therefore uses its specific keywords. There are two major problems when normalizing schemas written for a later draft.

One is that "id" was replaced with "$id". To keep things working, before normalization the tool scans for the keyword "$id" in the schema. If one is found, "$id" will be used for base-URI resolution.
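The keyword selection described above can be sketched in a few lines of Python. This is an illustration, not the tool's code: the function names are hypothetical, and scanning the whole document rather than only the top level is an assumption.

```python
def uses_dollar_id(node):
    """Return True if the key "$id" occurs anywhere in the parsed schema."""
    if isinstance(node, dict):
        return "$id" in node or any(uses_dollar_id(v) for v in node.values())
    if isinstance(node, list):
        return any(uses_dollar_id(v) for v in node)
    return False


def id_keyword(schema):
    """Pick the keyword used for base-URI resolution during normalization."""
    return "$id" if uses_dollar_id(schema) else "id"


print(id_keyword({"id": "http://example.com/root.json"}))
print(id_keyword({"definitions": {"a": {"$id": "http://example.com/a.json"}}}))
```

The first call selects "id" and the second selects "$id", because a "$id" keyword appears inside the definitions.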

Another is that in Draft06 and above, unknown keywords should be explicitly ignored, which is not the case in Draft04. Because the tool follows Draft04, references can resolve to ids that appear under unknown keywords. This causes no problem unless an id is used more than once, which should never be the case.

Keep in mind that schemas written for later drafts are still normalized using Draft04-specific keywords.

License

This work is licensed under the Apache 2.0 License.
