feat: yaml_check internal CLI utility #22
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.vscode | ||
__pycache__ | ||
*.rock | ||
.venv/ | ||
yaml_checker/yaml_checker.egg-info |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,111 @@ | ||||||
# YAML Checker | ||||||
|
||||||
An internal CLI util for formatting and validating YAML files. This project | ||||||
relies on Pydantic and Ruamel libraries. | ||||||
Comment on lines +3 to +4: I guess the second line is not necessary, since there is a
||||||
|
||||||
**Installation** | ||||||
```bash | ||||||
pip install -e yaml_checker | ||||||
``` | ||||||
|
||||||
**Usage** | ||||||
``` | ||||||
usage: yaml_checker [-h] [-v] [-w] [--config CONFIG] [files ...] | ||||||
Comment on lines +12 to +13: Would it be possible to add the following line here?
|
||||||
|
||||||
positional arguments: | ||||||
files Additional files to process (optional). | ||||||
|
||||||
options: | ||||||
-h, --help show this help message and exit | ||||||
-v, --verbose Enable verbose output. | ||||||
-w, --write Write yaml output to disk. | ||||||
--config CONFIG CheckYAML subclass to load | ||||||
Comment on lines +16 to +22: nit: let's be consistent with the trailing dots. Usually we don't have trailing dots in help.
Comment on lines +16 to +22: How about the following? Or, something like that.
|
||||||
``` | ||||||
|
||||||
**Example** | ||||||
|
||||||
```bash | ||||||
# Lets cat a demonstration file for comparison. | ||||||
$ cat yaml_checker/demo/slice.yaml | ||||||
Comment on lines +28 to +29: Let's have this as a
||||||
# yaml_checker --config=Chisel demo/slice.yaml

package: grep

essential:
  - grep_copyright

# hello: world

slices:
  bins:
    essential:
      - libpcre2-8-0_libs # tests

      # another test
      - libc6_libs
    contents:
      /usr/bin/grep:

  deprecated:
    # These are shell scripts requiring a symlink from /usr/bin/dash to
    # /usr/bin/sh.
    # See: https://manpages.ubuntu.com/manpages/noble/en/man1/grep.1.html
    essential:
      - dash_bins
      - grep_bins
    contents:
      # we ned this leading comment
      /usr/bin/rgrep: # this should be last

      /usr/bin/fgrep:

      # careful with this path ...
      /usr/bin/egrep: # it is my favorite
  copyright:
    contents:
      /usr/share/doc/grep/copyright:
# Note: Missing new line at EOF | ||||||
|
||||||
# Now we can run the yaml_checker to format the same file. | ||||||
# Note how comments are preserved during sorting of lists and | ||||||
# dict type objects. If you want to test the validator, | ||||||
# uncomment the hello field. | ||||||
Comment on lines +67 to +72: Let's end the previous code block before line 67, have this text as normal text in the markdown and start another code block after line 72.
||||||
$ yaml_checker --config=Chisel yaml_checker/demo/slice.yaml | ||||||
Comment: Like the previous comment, let's have this as a
||||||
# yaml_checker --config=Chisel demo/slice.yaml

package: grep

essential:
  - grep_copyright

# hello: world

slices:
  bins:
    essential:
      - libc6_libs
      - libpcre2-8-0_libs # tests

      # another test
    contents:
      /usr/bin/grep:

  deprecated:
    # These are shell scripts requiring a symlink from /usr/bin/dash to
    # /usr/bin/sh.
    # See: https://manpages.ubuntu.com/manpages/noble/en/man1/grep.1.html
    essential:
      - dash_bins
      - grep_bins
    contents:
      # we ned this leading comment

      # careful with this path ...
      /usr/bin/egrep: # it is my favorite
      /usr/bin/fgrep:
      /usr/bin/rgrep: # this should be last
  copyright:
    contents:
      /usr/share/doc/grep/copyright:
|
||||||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# yaml_checker --config=Chisel demo/slice.yaml

package: grep

essential:
  - grep_copyright

# hello: world

slices:
  bins:
    essential:
      - libpcre2-8-0_libs # tests

      # another test
      - libc6_libs
    contents:
      /usr/bin/grep:

  deprecated:
    # These are shell scripts requiring a symlink from /usr/bin/dash to
    # /usr/bin/sh.
    # See: https://manpages.ubuntu.com/manpages/noble/en/man1/grep.1.html
    essential:
      - dash_bins
      - grep_bins
    contents:
      # we ned this leading comment
      /usr/bin/rgrep: # this should be last

      /usr/bin/fgrep:

      # careful with this path ...
      /usr/bin/egrep: # it is my favorite
  copyright:
    contents:
      /usr/share/doc/grep/copyright:
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
pydantic==2.8.2 | ||
ruamel.yaml==0.18.6 | ||
Comment: nit: Let's add a line break at the end here.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
from pathlib import Path

from setuptools import find_packages, setup


def read_text(filename):
    filepath = Path(__file__).parent / filename
    return filepath.read_text()


setup(
    name="yaml_checker",
    version="0.1.0",
    long_description=read_text("README.md"),
    packages=find_packages(),
    install_requires=read_text("requirements.txt"),
    entry_points={
        "console_scripts": [
            "yaml_checker=yaml_checker.__main__:main",
            "clayaml=yaml_checker.__main__:main",
        ],
    },
)
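Note on `install_requires`: the setup script above passes the raw text of `requirements.txt` straight through. setuptools accepts either a string or a list of strings here, so this works, but a list is the more common form. A minimal sketch of that alternative follows; `parse_requirements` is a hypothetical helper, not part of this PR.

```python
def parse_requirements(text: str) -> list[str]:
    """Return requirement specifiers, skipping blank lines and comment lines.

    Hypothetical helper for illustration only; it is not defined in the PR.
    """
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            requirements.append(line)
    return requirements


# Same content as the requirements.txt in this diff (no trailing newline).
sample = "pydantic==2.8.2\nruamel.yaml==0.18.6"
print(parse_requirements(sample))
```

With such a helper, the call would read `install_requires=parse_requirements(read_text("requirements.txt"))`.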
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
import argparse
import logging
from pathlib import Path

from .config.base import YAMLCheckConfigBase

# TODO: display all available configs in help
parser = argparse.ArgumentParser()

parser.add_argument(
    "-v", "--verbose", action="store_true", help="Enable verbose output."
)

parser.add_argument(
    "-w", "--write", action="store_true", help="Write yaml output to disk."
)

parser.add_argument(
    "--config",
    type=str,
    default="YAMLCheckConfigBase",
    help="CheckYAML subclass to load",
)

parser.add_argument(
    "files", type=Path, nargs="*", help="Additional files to process (optional)."
)


def main():
    args = parser.parse_args()

    log_level = logging.DEBUG if args.verbose else logging.INFO
    logging.basicConfig(level=log_level)

    check_yaml_config = YAMLCheckConfigBase.configs[args.config]

    yaml = check_yaml_config()

    for file in args.files:
        data = yaml.load(file.read_text())
        data = yaml.apply_rules(data)
        yaml.validate_model(data)

        output = yaml.dump(data)

        if args.write:
            file.write_text(output)
        else:
            print(output)


if __name__ == "__main__":
    main()
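To see what the CLI surface above produces without touching any files, an explicit argument list can be passed to `parse_args`. The sketch below rebuilds an equivalent parser on its own; the `build_parser` helper, the `prog` name, and the sample arguments are illustrative assumptions, not part of the PR.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the module-level parser in the diff."""
    parser = argparse.ArgumentParser(prog="yaml_checker")
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="Enable verbose output."
    )
    parser.add_argument(
        "-w", "--write", action="store_true", help="Write yaml output to disk."
    )
    parser.add_argument(
        "--config",
        type=str,
        default="YAMLCheckConfigBase",
        help="CheckYAML subclass to load",
    )
    parser.add_argument(
        "files", type=Path, nargs="*", help="Additional files to process (optional)."
    )
    return parser


# Parse a sample command line instead of sys.argv.
args = build_parser().parse_args(["--config=Chisel", "-w", "demo/slice.yaml"])
print(args.config, args.write, args.verbose, args.files)
```

Because `files` uses `nargs="*"`, running with no positional arguments yields an empty list, so the loop in `main()` simply does nothing.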
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
from importlib import import_module
from pathlib import Path

submodule_root = Path(__file__).parent
package_name = __name__

# import all submodules so our configs registry is populated
for submodule in submodule_root.glob("*.py"):
    submodule_name = submodule.stem

    if submodule_name.startswith("_"):
        continue

    import_module(f"{__name__}.{submodule_name}")
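The import loop above matters because registration happens as a side effect of class creation: a config only shows up in the registry once the module defining it has been imported. A minimal, self-contained sketch of that metaclass pattern follows; `ConfigRegistry`, `Base`, and `Chisel` are stand-in names for illustration, not the classes from this PR.

```python
class ConfigRegistry(type):
    """Metaclass that records every class it creates in a shared dict."""

    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # `configs` is resolved through the MRO, so subclasses write into
        # the dict defined on the base class.
        cls.configs.setdefault(cls.__name__, cls)


class Base(metaclass=ConfigRegistry):
    configs = {}  # name -> class, filled in as classes are defined


class Chisel(Base):  # merely defining the class registers it
    pass


print(sorted(Base.configs))  # ['Base', 'Chisel']
```

A lookup such as `Base.configs["Chisel"]` then returns the class itself, which is how a `--config` string can be turned into something instantiable.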
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
import fnmatch
import logging
from io import StringIO
from pathlib import Path
from typing import Any

from pydantic import BaseModel
from ruamel.yaml import YAML


class YAMLCheckConfigReg(type):
    def __init__(cls, *args, **kwargs):
        """Track all subclass configurations of YAMLCheckConfigBase for CLI"""
        super().__init__(*args, **kwargs)
        name = cls.__name__
        if name not in cls.configs:
            cls.configs[name] = cls


class YAMLCheckConfigBase(metaclass=YAMLCheckConfigReg):
    configs = {}  # Store configs for access from CLI
    rules = {}  # map glob strings to class method names

    class Model(BaseModel):
        """Pydantic BaseModel to provide validation"""

        class Config:
            extra = "allow"

    class Config:
        """ruamel.yaml configuration set before loading."""

        preserve_quotes = True
        width = 80
        map_indent = 2
        sequence_indent = 4
        sequence_dash_offset = 2

    def __init__(self):
        """YAMLCheck Base Config"""
        self.yaml = YAML()

        # load Config into yaml
        for attr in dir(self.Config):
            if attr.startswith("__"):
                continue

            attr_val = getattr(self.Config, attr)

            if hasattr(self.yaml, attr):
                setattr(self.yaml, attr, attr_val)
            else:
                raise AttributeError(f"Invalid ruamel.yaml attribute: {attr}")

    def load(self, yaml_str: str):
        """Load YAML data from string"""
        data = self.yaml.load(yaml_str)

        return data

    def dump(self, data: Any):
        """Dump data to YAML string"""
        with StringIO() as sio:
            self.yaml.dump(data, sio)
            sio.seek(0)

            return sio.read()

    def validate_model(self, data: Any):
        """Apply validate data against model"""
        if issubclass(self.Model, BaseModel):
            _ = self.Model(**data)

    def _apply_rules(self, path: Path, data: Any):
        """Recursively apply rules starting from the outermost elements."""
        logging.debug(f"Walking path {path}.")

        # recurse over dicts and lists
        if isinstance(data, dict):
            for key, value in data.items():
                data[key] = self._apply_rules(path / str(key), value)

        elif isinstance(data, list):
            for index, item in enumerate(data):
                data[index] = self._apply_rules(path / str(item), item)

        # scan for applicable rules at each directory
        # TODO: selection of rules here does not scale well and should be improved
        for key, value in self.rules.items():
            if fnmatch.fnmatch(path, key):
                logging.debug(f'Applying rule "{value}" at {path}')
                rule = getattr(self, value)
                data = rule(path, data)

        return data

    def apply_rules(self, data: Any):
        """Walk all objects in data and apply rules where applicable."""
        return self._apply_rules(Path("/"), data)
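The rule dispatch in `_apply_rules` can be tried in isolation with the standard library alone. The sketch below is a simplified stand-in under stated assumptions: the `sort_list` rule, the `RULES` table, and the glob pattern are hypothetical, and rules are plain callables rather than method names looked up with `getattr`. It uses `PurePosixPath` so the paths always contain forward slashes.

```python
import fnmatch
from pathlib import PurePosixPath


def sort_list(path, data):
    """Hypothetical rule: return a sorted copy of a list."""
    return sorted(data)


# Hypothetical rule table mapping glob patterns to rule callables.
RULES = {"/slices/*/essential": sort_list}


def apply_rules(path, data):
    """Walk nested dicts and lists; children are processed before their parent."""
    if isinstance(data, dict):
        for key, value in data.items():
            data[key] = apply_rules(path / str(key), value)
    elif isinstance(data, list):
        for index, item in enumerate(data):
            data[index] = apply_rules(path / str(item), item)

    # Apply every rule whose glob matches the current path.
    for pattern, rule in RULES.items():
        if fnmatch.fnmatch(str(path), pattern):
            data = rule(path, data)

    return data


doc = {"slices": {"bins": {"essential": ["libpcre2-8-0_libs", "libc6_libs"]}}}
result = apply_rules(PurePosixPath("/"), doc)
print(result["slices"]["bins"]["essential"])  # ['libc6_libs', 'libpcre2-8-0_libs']
```

Note that `*` in `fnmatch` also matches `/`, so a pattern like `/slices/*/essential` matches at any depth below `/slices`, not just one level down. Unlike the ruamel.yaml-based tool, this sketch operates on plain Python containers and therefore carries no comments.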
Comment: suggestion: let's stick with clayaml since this is an internal tool anyway.