sdbs-uni-p/jsonschema-refexpander

JSON Schema Ref Expander

This tool can normalize given JSON schemas and check them for recursion. It is based on Draft04.

Getting started

How to use the tool

  • Normalization:
    To normalize your schemas, some additional parameters are needed:
    • For -repositorytype, choose between -normal, -testsuite and -corpus, depending on where your schemas come from.
    • For -allowDistributedSchemas, choose between -true and -false, depending on whether you want to allow distributed schemas.
    • For -fetchSchemasOnline, choose between -true and -false, depending on whether you want to download references from the internet. If -false is chosen, a referenced URI is looked up only in the file UriOfFiles.csv. Each line of UriOfFiles.csv should have the form file, URI, which means that the schema in file is used whenever URI is referenced. file should be stored in a directory called Store. If the referenced URI is not in UriOfFiles.csv, a StoreException is thrown. If -true is chosen, the tool instead goes on to try loading the schema from the URI.
    • Finally, "pathToDir" should be the path to the directory in which the schemas are stored. The path should be in quotation marks.

      java -jar jarfile -normalize -repositorytype -allowDistributedSchemas -fetchSchemasOnline "pathToDir"

      If -corpus was chosen as the repository type, an additional parameter with the path to the file repos_fullpath.csv ("pathToReposFullpath") is needed.

      java -jar jarfile -normalize -corpus -allowDistributedSchemas -fetchSchemasOnline "pathToDir" "pathToReposFullpath"

    • "linksToPermalinks" should be the path to a CSV file that maps a link prefix to a permalink prefix, so that all web references in a schema are loaded using the permalink instead of the original link. This parameter is optional and can be omitted if no permalinks should be used.
  • Recursion checking:
    See the Recursion checking section for an explanation.
    Again, "pathToDir" should be the path to the directory in which the schemas are stored.
    java -jar jarfile -recursion "pathToDir"

  • Statistics:
    The tool gathers statistics about the distribution of single-file and distributed schemas and about the frequency of recursion in them. It also records how the number of lines of code changes from the unnormalized to the normalized schemas, and creates an overall overview.
    Again, "pathToDir" should be the path to the directory in which the schemas are stored. "pathToNormalizedDir" should be the path to the directory in which the normalized schemas are stored.
    java -jar jarfile -stats "pathToDir" "pathToNormalizedDir"
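
The offline lookup described for -fetchSchemasOnline can be pictured with a short Python sketch. This is not the tool's code: the function names load_uri_table and locate are hypothetical, StoreException is only a stand-in for the exception named above, and the sketch assumes that UriOfFiles.csv is consulted first even when -true is chosen.

```python
class StoreException(Exception):
    """Hypothetical stand-in for the tool's StoreException."""


def load_uri_table(csv_text):
    """Parse lines of the form 'file, URI' into a {URI: file} mapping."""
    table = {}
    for line in csv_text.splitlines():
        if "," not in line:
            continue  # skip blank or malformed lines
        file_name, uri = line.split(",", 1)
        table[uri.strip()] = file_name.strip()
    return table


def locate(uri, table, fetch_online):
    """Return where a referenced URI would be loaded from."""
    if uri in table:
        return "Store/" + table[uri]  # local copy listed in the CSV
    if fetch_online:
        return uri  # assumption: the caller would download from the URI
    raise StoreException(uri + " is not listed in UriOfFiles.csv")
```

With a table built from the line "draft4.json, http://json-schema.org/draft-04/schema", locate returns "Store/draft4.json" for that URI and raises StoreException for any other URI when fetch_online is False.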

Dockerfile

A Dockerfile is provided. It normalizes the schemas of the TestSuite (commit 0c223de), the SchemaStore (commit 2ad0b3d) and the SchemaCorpus (commit 9c0e796) and then gathers the statistics. To keep this process reproducible, all external references have already been downloaded, and these downloaded copies are used.

Normalization process

In a normalized schema, all references follow the JSON Pointer syntax, and each of them points either to a direct child of the definitions-section or to the top-level schema. Distributed schemas are therefore consolidated into one file.

Examples

{
  "properties": {
    "name": {"type": "string"},
    "surname": {"$ref": "#/properties/name"},
    "children": {
      "type": "array",
      "items": {"$ref": "#"}
    }
  }
}

All references are in JSON Pointer syntax, but "#/properties/name" does not point to a direct child of the definitions-section. Therefore this reference is resolved and its target is copied to the definitions-section. The normalized version of the above schema:

{
  "properties": {
    "name": {"type": "string"},
    "surname": {"$ref": "#/definitions/properties_name"},
    "children": {
      "type": "array",
      "items": {"$ref": "#"}
    }
  },
  "definitions": {
    "properties_name": {"type": "string"}
  }
}
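
The rewrite shown in this example can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's implementation: the function names are hypothetical, the definition name is derived by joining the pointer segments with "_" (as the example suggests), and name collisions are not handled.

```python
import copy


def resolve_pointer(doc, pointer):
    """Follow a JSON Pointer such as '/properties/name' inside doc."""
    if not pointer:
        return doc
    node = doc
    for token in pointer.lstrip("/").split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def is_normalized(ref):
    """True for '#' or '#/definitions/<name>' (a direct child of definitions)."""
    parts = ref.split("/")
    return ref == "#" or (len(parts) == 3 and parts[:2] == ["#", "definitions"])


def normalize_local_refs(schema):
    """Return a copy in which every in-file $ref targets '#' or a definition."""
    result = copy.deepcopy(schema)
    definitions = result.setdefault("definitions", {})

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#") and not is_normalized(ref):
                name = ref[2:].replace("/", "_")  # assumed naming scheme
                if name not in definitions:
                    definitions[name] = copy.deepcopy(resolve_pointer(schema, ref[1:]))
                    walk(definitions[name])  # the copied target may hold refs too
                node["$ref"] = "#/definitions/" + name
            for value in list(node.values()):
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    if not definitions:
        del result["definitions"]  # nothing had to be copied
    return result
```

Applied to the first schema above, the sketch leaves {"$ref": "#"} untouched, rewrites the surname reference to "#/definitions/properties_name", and adds a definitions-section containing properties_name.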



Schemas can also be distributed across separate files, as in the following example:

{
  "type": "object",  
  "properties": {
    "name": {"type": "string"},
    "surname": {"type": "string"},
    "address": {"$ref": "folder/locations.json#/definitions/address"}
  }
}

File: schema.json

{
  "definitions": {
    "address": {
      "street": {"type": "string"},
      "number": {"type": "integer"},
      "city": {"type": "string"},
      "country": {"type": "string"}
    }
  }
}

File: folder/locations.json

The schema in schema.json has a reference to a child in folder/locations.json. Therefore the content of the reference is copied to the definitions-section. The normalized version of the schema in schema.json:

{
  "type": "object",  
  "properties": {
    "name": {"type": "string"},
    "surname": {"type": "string"},
    "address": {"$ref": "#/definitions/folder_locations.json_definitions_address"}
  },
  "definitions": {
    "folder_locations.json_definitions_address": {
      "street": {"type": "string"},
      "number": {"type": "integer"},
      "city": {"type": "string"},
      "country": {"type": "string"}
    }
  }
}
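
Consolidating a distributed schema follows the same idea. The sketch below is only an illustration: inline_external_refs and definition_name are hypothetical names, the files dictionary is an in-memory stand-in for reading schemas from disk, and references inside the copied target are not re-based against their original file.

```python
import copy


def definition_name(ref):
    """Derive a definitions key from a reference (assumed naming scheme)."""
    return ref.replace("#", "").replace("/", "_")


def inline_external_refs(schema, files):
    """Copy the targets of file references into the definitions-section."""
    result = copy.deepcopy(schema)
    definitions = result.setdefault("definitions", {})

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and not ref.startswith("#"):
                path, _, pointer = ref.partition("#")
                target = files[path]  # parsed content of the other file
                for token in filter(None, pointer.split("/")):
                    target = target[token]
                name = definition_name(ref)
                definitions.setdefault(name, copy.deepcopy(target))
                node["$ref"] = "#/definitions/" + name
            for value in list(node.values()):
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result
```

For a reference to "folder/locations.json#/definitions/address", the derived key is folder_locations.json_definitions_address, which matches the name used in the normalized schema above.
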

Recursion checking

A distinction is made between guarded and unguarded recursion. The difference is that the behavior of unguarded-recursive schemas during validation is undefined, so validation may not terminate in finite time. The tool only guarantees correct output for normalized schemas.
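
One way to picture such a check on a normalized schema is a cycle search over its references. The sketch below is not the tool's algorithm. It assumes that a reference counts as guarded when it sits below a keyword that descends into a part of the instance (properties, patternProperties, additionalProperties, items, additionalItems), and that recursion is unguarded when a cycle exists that uses no guarded reference. All names are hypothetical, and a keyword-like name below an unknown keyword would be misread.

```python
GUARDS = {"properties", "patternProperties", "additionalProperties",
          "items", "additionalItems"}


def ref_edges(node, guarded=False):
    """Yield (target, guarded) for every $ref found below node."""
    if isinstance(node, dict):
        if isinstance(node.get("$ref"), str):
            yield node["$ref"], guarded
        for key, value in node.items():
            yield from ref_edges(value, guarded or key in GUARDS)
    elif isinstance(node, list):
        for item in node:
            yield from ref_edges(item, guarded)


def build_graph(schema):
    """Nodes are '#' (the schema without its definitions) and each definition."""
    nodes = {"#": {k: v for k, v in schema.items() if k != "definitions"}}
    for name, body in schema.get("definitions", {}).items():
        nodes["#/definitions/" + name] = body
    return {name: list(ref_edges(body)) for name, body in nodes.items()}


def has_cycle(graph, only_unguarded):
    """Depth-first search for a cycle, optionally ignoring guarded edges."""
    white, grey, black = 0, 1, 2
    color = dict.fromkeys(graph, white)

    def visit(name):
        color[name] = grey
        for target, guarded in graph[name]:
            if (only_unguarded and guarded) or target not in color:
                continue
            if color[target] == grey or (color[target] == white and visit(target)):
                return True
        color[name] = black
        return False

    return any(color[name] == white and visit(name) for name in graph)


def classify(schema):
    graph = build_graph(schema)
    if has_cycle(graph, only_unguarded=True):
        return "unguarded recursion"
    if has_cycle(graph, only_unguarded=False):
        return "guarded recursion"
    return "not recursive"
```

Under these assumptions, a schema whose items keyword refers back to "#" is classified as guarded recursion, while {"allOf": [{"$ref": "#"}]} is classified as unguarded recursion.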

Applicability on other drafts

This tool is based on Draft04 and therefore uses its specific keywords. There are two major problems when normalizing schemas that use a later draft.

One is that "id" was replaced with "$id". To keep things working, before normalization the tool scans for the keyword "$id" in the schema. If one is found, "$id" will be used for base-URI resolution.
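
The scan described here could look roughly like the following Python sketch. The function names are hypothetical and the tool may decide differently. In particular, this sketch would also react to a property that merely happens to be named "$id".

```python
def uses_dollar_id(node):
    """Return True if the key '$id' occurs anywhere in the schema."""
    if isinstance(node, dict):
        return "$id" in node or any(uses_dollar_id(v) for v in node.values())
    if isinstance(node, list):
        return any(uses_dollar_id(v) for v in node)
    return False


def id_keyword(schema):
    """Pick the keyword used for base-URI resolution."""
    return "$id" if uses_dollar_id(schema) else "id"
```

A schema that only uses "id" keeps the Draft04 keyword, while a single occurrence of "$id" switches the whole schema to "$id".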

Another is that in Draft06 and above unknown keywords should be explicitly ignored, which is not the case in Draft04. Therefore, ids located under unknown keywords can still be referenced. This causes no problem unless an id is used more than once, which should never be the case.

Keep in mind that schemas using later drafts are still normalized using Draft04-specific keywords.

License

This work is licensed under the Apache 2.0 License.
