Nyaml #1282

Closed
wants to merge 35 commits into from
Changes from 30 commits
Commits
0233d91
adding links to first references of the vocabulary items
sanbrock Jun 15, 2023
a671d15
do not display first reference redundantly if it is the only reference
sanbrock Jun 15, 2023
124b41e
reformatting
sanbrock Jun 15, 2023
8351366
changing to shorter link with tooltip
sanbrock Jun 15, 2023
1af3a42
linting
sanbrock Jun 15, 2023
8c0881d
linting
sanbrock Jun 15, 2023
d5faa6b
supporting unicode char for latex
sanbrock Jun 15, 2023
1138ebf
short tooltip and link
sanbrock Jun 15, 2023
cb92872
adjusted conf.py
sanbrock Jun 15, 2023
c117842
removing pynxtools as dependecy
sanbrock Jun 15, 2023
185f649
linting
sanbrock Jun 15, 2023
5a1de22
linting
sanbrock Jun 15, 2023
286c0c2
imports
sanbrock Jun 15, 2023
6837fb6
Adds pyproject
domna Jun 16, 2023
73d795b
Merge pull request #1276 from FAIRmat-NFDI/origin/python-package
sanbrock Jun 16, 2023
52a21ee
linting
sanbrock Jun 16, 2023
d3d101f
adjusted default location of definitions inside the module
sanbrock Jun 16, 2023
025c078
new characters as Code Camp suggested
sanbrock Jun 16, 2023
adf098e
make new char available for latex
sanbrock Jun 16, 2023
222a3c0
make new char available for latex
sanbrock Jun 16, 2023
7252a49
make new char available for latex
sanbrock Jun 16, 2023
fd4b4a6
collapsing doc_enum-s
sanbrock Jun 16, 2023
90ff26b
missing sphinx dependency
sanbrock Jun 16, 2023
dc32585
nyaml2nxdl
sanbrock Jun 19, 2023
5d08208
linting
sanbrock Jun 19, 2023
f76a1a9
linting
sanbrock Jun 19, 2023
9462a1d
imports
sanbrock Jun 19, 2023
49fc2dc
Merge remote-tracking branch 'origin/link_first_reference' into nyaml
sanbrock Jun 19, 2023
b65bf23
fixing imports
sanbrock Jun 19, 2023
90a9e45
test case added
sanbrock Jun 19, 2023
015aa77
removing h5py dependency
sanbrock Jun 21, 2023
f2cd2ac
remove dependencies also from pypi configuration
sanbrock Jun 21, 2023
5d8949c
removing h5-y dependency
sanbrock Jun 21, 2023
5e934d6
fixing imports
sanbrock Jun 21, 2023
ac5b156
removing the unnecessary ignoring of unknowns
sanbrock Jun 21, 2023
20 changes: 20 additions & 0 deletions .gitignore
@@ -12,3 +12,23 @@ makelog.txt
# Unknown
/python/
__github_creds__.txt

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
4 changes: 4 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,4 @@
recursive-include applications/ *.nxdl.xml
recursive-include contributed_definitions/ *.nxdl.xml
recursive-include base_classes/ *.nxdl.xml
include ./ *.xsd
13 changes: 13 additions & 0 deletions Makefile
@@ -6,6 +6,7 @@
PYTHON = python3
SPHINX = sphinx-build
BUILD_DIR = "build"
NXDL_DIRS := contributed_definitions applications base_classes

.PHONY: help install style autoformat test clean prepare html pdf impatient-guide all local

@@ -49,6 +50,9 @@ test ::

clean ::
$(RM) -rf $(BUILD_DIR)
for dir in $(NXDL_DIRS); do\
$(RM) -rf $${dir}/nyaml;\
done

prepare ::
$(PYTHON) -m dev_tools manual --prepare --build-root $(BUILD_DIR)
@@ -83,6 +87,15 @@ all ::
@echo "HTML built: `ls -lAFgh $(BUILD_DIR)/manual/build/html/index.html`"
@echo "PDF built: `ls -lAFgh $(BUILD_DIR)/manual/build/latex/nexus.pdf`"

NXDLS := $(foreach dir,$(NXDL_DIRS),$(wildcard $(dir)/*.nxdl.xml))
nyaml : $(DIRS) $(NXDLS)
for file in $^; do\
mkdir -p "$${file%/*}/nyaml";\
nyaml2nxdl --input-file $${file};\
FNAME=$${file##*/};\
mv -- "$${file%.nxdl.xml}_parsed.yaml" "$${file%/*}/nyaml/$${FNAME%.nxdl.xml}.yaml";\
done


# NeXus - Neutron and X-ray Common Data Format
#
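
The shell parameter expansions in the `nyaml` target above are compact, so as a reading aid the same path arithmetic can be sketched in Python. This is an illustration only and not part of the PR: the function name `nyaml_paths` is hypothetical, and the `_parsed.yaml` suffix is simply taken from the `mv` line of the recipe.

```python
from pathlib import PurePosixPath

NXDL_SUFFIX = ".nxdl.xml"


def nyaml_paths(nxdl_file):
    """Return (produced, target) paths for one NXDL file.

    produced: where the recipe expects the converter output, i.e.
              "${file%.nxdl.xml}_parsed.yaml"
    target:   where the recipe moves it, i.e.
              "${file%/*}/nyaml/${FNAME%.nxdl.xml}.yaml"
    """
    source = PurePosixPath(nxdl_file)
    if not source.name.endswith(NXDL_SUFFIX):
        raise ValueError(f"not an NXDL file: {nxdl_file}")
    # Strip the double extension, mirroring ${FNAME%.nxdl.xml}
    stem = source.name[: -len(NXDL_SUFFIX)]
    produced = source.with_name(f"{stem}_parsed.yaml")
    target = source.parent / "nyaml" / f"{stem}.yaml"
    return produced, target


produced, target = nyaml_paths("base_classes/NXbeam.nxdl.xml")
print(produced)
print(target)
```

For `base_classes/NXbeam.nxdl.xml` the converter output is expected next to the input and is then moved into the `nyaml` subdirectory that the `clean` target removes.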
99 changes: 82 additions & 17 deletions dev_tools/docs/nxdl.py
@@ -12,6 +12,7 @@
from ..globals.errors import NXDLParseError
from ..globals.nxdl import NXDL_NAMESPACE
from ..globals.urls import REPO_URL
from ..utils import nexus as pynxtools_nxlib
from ..utils.types import PathLike
from .anchor_list import AnchorRegistry

@@ -109,7 +110,7 @@ def _parse_nxdl_file(self, nxdl_file: Path):
# print official description of this class
self._print("")
self._print("**Description**:\n")
self._print_doc(self._INDENTATION_UNIT, ns, root, required=True)
self._print_doc_enum("", ns, root, required=True)

# print symbol list
node_list = root.xpath("nx:symbols", namespaces=ns)
@@ -119,7 +120,7 @@ def _parse_nxdl_file(self, nxdl_file: Path):
elif len(node_list) > 1:
raise Exception(f"Invalid symbol table in {nxclass_name}")
else:
self._print_doc(self._INDENTATION_UNIT, ns, node_list[0])
self._print_doc_enum("", ns, node_list[0])
for node in node_list[0].xpath("nx:symbol", namespaces=ns):
doc = self._get_doc_line(ns, node)
self._print(f" **{node.get('name')}**", end="")
@@ -498,6 +499,35 @@ def _print_doc(self, indent, ns, node, required=False):
self._print(f"{indent}{line}")
self._print()

def long_doc(self, ns, node):
length = 0
line = "documentation"
fnd = False
blocks = self._get_doc_blocks(ns, node)
for block in blocks:
lines = block.splitlines()
length += len(lines)
for single_line in lines:
if len(single_line) > 2 and single_line[0] != "." and not fnd:
fnd = True
line = single_line
return (length, line, blocks)

def _print_doc_enum(self, indent, ns, node, required=False):
collapse_indent = indent
node_list = node.xpath("nx:enumeration", namespaces=ns)
(doclen, line, blocks) = self.long_doc(ns, node)
if len(node_list) + doclen > 1:
collapse_indent = f"{indent} "
self._print(f"{indent}{self._INDENTATION_UNIT}.. collapse:: {line} ...\n")
self._print_doc(
collapse_indent + self._INDENTATION_UNIT, ns, node, required=required
)
if len(node_list) == 1:
self._print_enumeration(
collapse_indent + self._INDENTATION_UNIT, ns, node_list[0]
)

def _print_attribute(self, ns, kind, node, optional, indent, parent_path):
name = node.get("name")
index_name = name
@@ -506,12 +536,9 @@ def _print_attribute(self, ns, kind, node, optional, indent, parent_path):
)
self._print(f"{indent}.. index:: {index_name} ({kind} attribute)\n")
self._print(
f"{indent}**@{name}**: {optional}{self._format_type(node)}{self._format_units(node)}\n"
f"{indent}**@{name}**: {optional}{self._format_type(node)}{self._format_units(node)} {self.get_first_parent_ref(f'{parent_path}/{name}', 'attribute')}\n"
)
self._print_doc(indent + self._INDENTATION_UNIT, ns, node)
node_list = node.xpath("nx:enumeration", namespaces=ns)
if len(node_list) == 1:
self._print_enumeration(indent + self._INDENTATION_UNIT, ns, node_list[0])
self._print_doc_enum(indent, ns, node)

def _print_if_deprecated(self, ns, node, indent):
deprecated = node.get("deprecated", None)
@@ -549,17 +576,12 @@ def _print_full_tree(self, ns, parent, name, indent, parent_path):
f"{self._format_type(node)}"
f"{dims}"
f"{self._format_units(node)}"
f" {self.get_first_parent_ref(f'{parent_path}/{name}', 'field')}"
"\n"
)

self._print_if_deprecated(ns, node, indent + self._INDENTATION_UNIT)
self._print_doc(indent + self._INDENTATION_UNIT, ns, node)

node_list = node.xpath("nx:enumeration", namespaces=ns)
if len(node_list) == 1:
self._print_enumeration(
indent + self._INDENTATION_UNIT, ns, node_list[0]
)
self._print_doc_enum(indent, ns, node)

for subnode in node.xpath("nx:attribute", namespaces=ns):
optional = self._get_required_or_optional_text(subnode)
@@ -585,10 +607,12 @@ def _print_full_tree(self, ns, parent, name, indent, parent_path):
# target = hTarget.replace(".. _", "").replace(":\n", "")
# TODO: https://github.com/nexusformat/definitions/issues/1057
self._print(f"{indent}{hTarget}")
self._print(f"{indent}**{name}**: {optional_text}{typ}\n")
self._print(
f"{indent}**{name}**: {optional_text}{typ} {self.get_first_parent_ref(f'{parent_path}/{name}', 'group')}\n"
)

self._print_if_deprecated(ns, node, indent + self._INDENTATION_UNIT)
self._print_doc(indent + self._INDENTATION_UNIT, ns, node)
self._print_doc_enum(indent, ns, node)

for subnode in node.xpath("nx:attribute", namespaces=ns):
optional = self._get_required_or_optional_text(subnode)
@@ -619,8 +643,49 @@ def _print_full_tree(self, ns, parent, name, indent, parent_path):
f"(suggested target: ``{node.get('target')}``)"
"\n"
)
self._print_doc(indent + self._INDENTATION_UNIT, ns, node)
self._print_doc_enum(indent, ns, node)

def _print(self, *args, end="\n"):
# TODO: change instances of \t to proper indentation
self._rst_lines.append(" ".join(args) + end)

def get_first_parent_ref(self, path, tag):
nx_name = path[1 : path.find("/", 1)]
path = path[path.find("/", 1) :]

try:
parents = pynxtools_nxlib.get_inherited_nodes(path, nx_name)[2]
except FileNotFoundError:
return ""
if len(parents) > 1:
parent = parents[1]
parent_path = parent_display_name = parent.attrib["nxdlpath"]
parent_path_segments = parent_path[1:].split("/")
parent_def_name = parent.attrib["nxdlbase"][
parent.attrib["nxdlbase"]
.rfind("/") : parent.attrib["nxdlbase"]
.rfind(".nxdl")
]

# Case where the first parent is a base_class
if parent_path_segments[0] == "":
return ""

# special treatment for NXnote@type
if (
tag == "attribute"
and parent_def_name == "/NXnote"
and parent_path == "/type"
):
return ""

if tag == "attribute":
pos_of_right_slash = parent_path.rfind("/")
parent_path = (
parent_path[:pos_of_right_slash]
+ "@"
+ parent_path[pos_of_right_slash + 1 :]
)
parent_display_name = f"{parent_def_name[1:]}{parent_path}"
return f":ref:`⤆ </{parent_display_name}-{tag}>`"
return ""
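
The string slicing in `get_first_parent_ref` is easier to follow in isolation. The sketch below reproduces only the part that turns a parent's `nxdlbase` and `nxdlpath` attributes into a reference target. It is a reading aid under stated assumptions, not the PR's code: the helper name `reference_target` is hypothetical, the inherited-node lookup is left out, and the definition name is assumed to be prepended for every tag (the indentation of that line is not recoverable from the diff above).

```python
def reference_target(nxdlbase, nxdlpath, tag):
    """Build the target used inside the :ref: role for a parent node.

    nxdlbase: path of the NXDL file that defines the parent
    nxdlpath: path of the parent node inside that definition
    tag:      "group", "field" or "attribute"
    """
    # Definition name: text between the last "/" and ".nxdl"
    def_name = nxdlbase[nxdlbase.rfind("/") + 1 : nxdlbase.rfind(".nxdl")]
    path = nxdlpath
    if tag == "attribute":
        # Attributes are addressed as parent@name instead of parent/name
        cut = path.rfind("/")
        path = f"{path[:cut]}@{path[cut + 1:]}"
    return f"/{def_name}{path}-{tag}"


print(reference_target("base_classes/NXentry.nxdl.xml", "/DATA/signal", "attribute"))
print(reference_target("applications/NXmx.nxdl.xml", "/ENTRY/title", "field"))
```

The file paths in the two calls are made-up inputs chosen to exercise the attribute and non-attribute branches.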
72 changes: 72 additions & 0 deletions dev_tools/nyaml2nxdl/README.md
@@ -0,0 +1,72 @@
# YAML to NXDL converter and NXDL to YAML converter

**NOTE: Please use Python 3.8 or above to run this converter.**

**Tool's purpose**: Offer a simple YAML-based schema alongside an XML-based schema to describe NeXus instances. These can be NeXus application definitions or classes,
such as base or contributed classes. Users create NeXus instances by writing either a YAML file or an XML file that details a hierarchy of data/metadata elements.
Both the forward (YAML -> NXDL.XML) and the backward (NXDL.XML -> YAML) conversions are implemented.

**How the tool works**:
- yaml2nxdl.py
1. Reads the user-specified NeXus instance, in either YAML or XML format.
2. If the input is YAML, creates an instantiated NXDL schema XML tree by walking the dictionary nest.
If the input is XML, creates a YAML file by walking the dictionary nest.
3. Writes the tree to disk, either as a YAML file or as a properly formatted NXDL XML schema file.
4. Optionally, if the --append argument is given,
the XML or YAML input file is interpreted as an extension of a base class, and the entries it contains
are appended below a standard NeXus base class.
You need to specify both your input file (with YAML or XML extension) and the NeXus class (with no extension).
Both the .yml and the .nxdl.xml files of the extended class are printed.

```console
user@box:~$ python yaml2nxdl.py

Usage: python yaml2nxdl.py [OPTIONS]

Options:
--input-file TEXT The path to the input data file to read.
--append TEXT Parse xml NeXus file and append to specified base class,
write the base class name with no extension.
--check-consistency Check consistency by generating another version of the input file.
E.g. for input file: NXexample.nxdl.xml the output file
NXexample_consistency.nxdl.xml.
--verbose Additional std output info is printed to help debugging.
--help Show this message and exit.

```
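
To make the backward direction (NXDL.XML -> YAML) in step 2 more concrete, here is a minimal sketch of walking an XML tree into a nested dictionary whose keys follow the `name(type)` convention described under "Rule set". It is not the converter's implementation: the function `element_to_dict` and the sample document are hypothetical, the NXDL XML namespace is omitted for brevity, and only `group`, `field`, `doc` and `units` are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free sample; real NXDL files declare an XML namespace.
NXDL_SAMPLE = """
<definition name="NXexample" type="group" extends="NXobject">
  <group name="mybeam1" type="NXbeam">
    <field name="voltage" type="NX_FLOAT" units="NX_VOLTAGE">
      <doc>Applied voltage.</doc>
    </field>
  </group>
</definition>
"""


def element_to_dict(element):
    """Recursively turn groups and fields into a dict keyed 'name(type)'."""
    result = {}
    for child in element:
        if child.tag == "doc":
            result["doc"] = (child.text or "").strip()
        elif child.tag in ("group", "field"):
            name = child.get("name", "")
            nx_type = child.get("type")
            key = f"{name}({nx_type})" if nx_type else name
            body = element_to_dict(child)
            if child.get("units"):
                body["units"] = child.get("units")
            result[key] = body
    return result


tree = element_to_dict(ET.fromstring(NXDL_SAMPLE.strip()))
print(tree)
```

The resulting dictionary could then be handed to any YAML serializer; that step is left out here so the sketch stays within the standard library.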

## Documentation

**Rule set**: When transcoding YAML files, several rules must be followed.
* Named NeXus groups are instances of NeXus classes, especially base or contributed classes. Creating (NXbeam) is a simple example of a request to define a group named according to NeXus default rules. mybeam1(NXbeam) or mybeam2(NXbeam) are examples of how to create multiple named instances at the same hierarchy level.
* Members of groups are so-called fields or attributes. A simple example of a member is voltage. Here the datatype is implied automatically as the default NeXus NX_CHAR type. By contrast, voltage(NX_FLOAT) can be used to instantiate a member which should be of NeXus type NX_FLOAT.
* Attributes can belong to either groups or fields. Names of attributes have to be preceded by \@ to mark them as attributes.
* Optionality: All fields, groups and attributes in `application definitions` are `required` by default, unless they are explicitly marked as `recommended` or `optional`.
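
The naming rules above can be summarized in a short sketch that classifies a single YAML key. This is an illustration under assumptions, not the converter's code: `classify_key` and `KEY_PATTERN` are hypothetical names, the attribute prefix is taken literally as a backslash followed by `@`, and a type is treated as a group class when it starts with `NX` but not with `NX_`.

```python
import re

# name, optionally followed by a type in parentheses, e.g. "mybeam1(NXbeam)"
KEY_PATTERN = re.compile(r"^(?P<name>[^()]*)(?:\((?P<type>[^()]+)\))?$")
ATTRIBUTE_PREFIX = "\\@"  # assumption: attributes are written as \@name


def classify_key(key):
    """Return (kind, name, nx_type) for one YAML key."""
    is_attribute = key.startswith(ATTRIBUTE_PREFIX)
    body = key[len(ATTRIBUTE_PREFIX):] if is_attribute else key
    match = KEY_PATTERN.match(body)
    if match is None:
        raise ValueError(f"cannot parse key: {key!r}")
    name = match.group("name")
    # NX_CHAR is the default type when none is given
    nx_type = match.group("type") or "NX_CHAR"
    if is_attribute:
        return ("attribute", name, nx_type)
    if nx_type.startswith("NX") and not nx_type.startswith("NX_"):
        return ("group", name, nx_type)
    return ("field", name, nx_type)


for key in ["(NXbeam)", "mybeam1(NXbeam)", "voltage", "voltage(NX_FLOAT)", "\\@long_name"]:
    print(key, "->", classify_key(key))
```

An unnamed group such as `(NXbeam)` yields an empty name, which leaves the naming to the NeXus default rules mentioned above.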

**Special keywords**: Several keywords can be used as children of groups, fields, and attributes to specify their properties. Groups, fields and attributes are nodes of the XML tree.
* **doc**: A human-readable description/docstring.
* **exists**: Options are recommended, required, or a list of the form [min, 1, max, infty]. Here the number 1 can be replaced by any uint, and infty indicates that there is no restriction on how frequently the entry can occur inside the NXDL schema at the same hierarchy level.
* **link**: Defines links between nodes.
* **units**: A statement introducing NeXus-compliant NXDL units arguments, like NX_VOLTAGE.
* **dimensions**: Details which dimensional arrays to expect.
* **enumeration**: Python list of strings which are considered as recommended entries to choose from.
* **dim_parameters**: `dim` is a child of `dimensions` and may have several attributes, such as `ref` and
`incr`, in addition to `index` and `value`. When writing a `yaml` schema definition, please use the following structure:
```
dimensions:
rank: integer value
dim: [[ind_1, val_1], [ind_2, val_2], ...]
dim_parameters:
ref: [ref_value_1, ref_value_2, ...]
incr: [incr_value_1, incr_value_2, ...]
```
Keep in mind that all the lists must have the same length.
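
The equal-length requirement can be checked mechanically. The following sketch pairs each `dim` entry with its `dim_parameters` values and raises an error on a length mismatch. It is a hedged illustration, not the converter's behaviour: the function name `expand_dimensions` and the example values are hypothetical.

```python
def expand_dimensions(dimensions):
    """Merge `dim` pairs with the parallel `dim_parameters` lists.

    Returns one dict per dim, e.g. {"index": 1, "value": "n_x", "ref": "x"}.
    Raises ValueError if any parameter list differs in length from `dim`.
    """
    dims = dimensions.get("dim", [])
    params = dimensions.get("dim_parameters", {})
    for name, values in params.items():
        if len(values) != len(dims):
            raise ValueError(
                f"dim_parameters[{name!r}] has {len(values)} entries, "
                f"expected {len(dims)}"
            )
    expanded = []
    for position, (index, value) in enumerate(dims):
        entry = {"index": index, "value": value}
        for name, values in params.items():
            entry[name] = values[position]
        expanded.append(entry)
    return expanded


example = {
    "rank": 2,
    "dim": [[1, "n_x"], [2, "n_y"]],
    "dim_parameters": {"ref": ["x", "y"], "incr": [0, 1]},
}
for entry in expand_dimensions(example):
    print(entry)
```

Each resulting dictionary corresponds to the attributes of one `dim` child of `dimensions`.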

## Next steps

The NOMAD team is currently working on establishing a one-to-one mapping between
NeXus definitions and the NOMAD MetaInfo. As soon as this is in place, the YAML files will
be annotated with further metadata so that they can serve two purposes.
On the one hand, they can serve as an instance for a schema to create a GUI representation
of a NOMAD Oasis ELN schema. On the other hand, the YAML to NXDL converter will skip all
those pieces of information which are irrelevant from a NeXus perspective.
22 changes: 22 additions & 0 deletions dev_tools/nyaml2nxdl/__init__.py
@@ -0,0 +1,22 @@
#!/usr/bin/env python3
"""
# Load paths
"""
# -*- coding: utf-8 -*-
#
# Copyright The NOMAD Authors.
#
# This file is part of NOMAD. See https://nomad-lab.eu for further info.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#