[WIP] new config system, 1.2 tagset support #700
so that other classes inheriting from it can use them
* Move methods from SafeConstructor to BaseConstructor
* Move methods from SafeRepresenter to BaseRepresenter
More and more YAML libraries are implementing YAML 1.2, either new ones that start with 1.2 or older ones adding support for it. The syntax also changed in YAML 1.2, but this pull request is about the schema changes.

As an example, in 1.1, `Y`, `yes`, `NO`, `on` etc. are resolved as booleans. This sounds convenient, but it also means that all 22 of these strings must be quoted if they are not meant as booleans. A very common obstacle is the country code for Norway, `NO` (the "Norway Problem"). YAML 1.2 improves this by reducing the list of boolean representations.

Other types have been improved as well. The 1.1 regular expression for float allows `.` and `._` as floats, although there isn't a single digit in these strings.

While the 1.2 Core Schema, the recommended default for 1.2, still allows a few variations (`true`, `True` and `TRUE`, etc.), the 1.2 JSON Schema is there to match JSON behaviour regarding types, so it allows only `true` and `false`.

Note that this implementation of the YAML JSON Schema might not be exactly as the spec defines it (under the spec, any plain scalar that does not resolve to a number, null or boolean would be an error).
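To make the boolean difference concrete, here is a small standalone sketch that does not use PyYAML at all. The 1.1 set is taken from the YAML 1.1 bool type definition (the 22 strings mentioned above); a given library may recognise a slightly different set. The names `BOOL_SCALARS` and `is_bool` are made up for this illustration.

```python
# Plain scalars treated as booleans, per schema.
# Assumption: the 1.1 set follows the YAML 1.1 bool type definition;
# real implementations may differ in detail.
BOOL_SCALARS = {
    "1.1": {
        "y", "Y", "yes", "Yes", "YES", "n", "N", "no", "No", "NO",
        "true", "True", "TRUE", "false", "False", "FALSE",
        "on", "On", "ON", "off", "Off", "OFF",
    },
    "1.2 core": {"true", "True", "TRUE", "false", "False", "FALSE"},
    "1.2 json": {"true", "false"},
}


def is_bool(scalar, schema):
    """Return True if an unquoted scalar would resolve to a boolean."""
    return scalar in BOOL_SCALARS[schema]


# Show how the same unquoted scalar is classified under each schema.
for scalar in ["NO", "yes", "TRUE", "true"]:
    row = {name: is_bool(scalar, name) for name in BOOL_SCALARS}
    print(scalar, row)
```

Under this sketch the country code `NO` is a boolean only in the 1.1 set, and `TRUE` stops being one once the JSON set is used.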
Short usage example:

```python
class MyCoreLoader(yaml.BaseLoader):
    pass

class MyCoreDumper(yaml.CommonDumper):
    pass

MyCoreLoader.init_tags('core')
MyCoreDumper.init_tags('core')

data = yaml.load(input, Loader=MyCoreLoader)
output = yaml.dump(data, Dumper=MyCoreDumper)
```

Detailed example code to play with:

```python
import yaml

class MyCoreLoader(yaml.BaseLoader):
    pass
MyCoreLoader.init_tags('core')

class MyJSONLoader(yaml.BaseLoader):
    pass
MyJSONLoader.init_tags('json')

class MyCoreDumper(yaml.CommonDumper):
    pass
MyCoreDumper.init_tags('core')

class MyJSONDumper(yaml.CommonDumper):
    pass
MyJSONDumper.init_tags('json')

input = """
- TRUE
- yes
- ~
- true
#- .inf
#- 23
#- #empty
#- !!str #empty
#- 010
#- 0o10
#- 0b100
#- 0x20
#- -0x20
#- 1_000
#- 3:14
#- 0011
#- +0
#- 0001.23
#- !!str +0.3e3
#- +0.3e3
#- &x foo
#- *x
#- 1e27
#- 1x+27
"""

print('--------------------------------------------- BaseLoader')
data = yaml.load(input, Loader=yaml.BaseLoader)
print(data)
print('--------------------------------------------- SafeLoader')
data = yaml.load(input, Loader=yaml.SafeLoader)
print(data)
print('--------------------------------------------- CoreLoader')
data = yaml.load(input, Loader=MyCoreLoader)
print(data)
print('--------------------------------------------- JSONLoader')
data = yaml.load(input, Loader=MyJSONLoader)
print(data)

print('--------------------------------------------- SafeDumper')
out = yaml.dump(data, Dumper=yaml.SafeDumper)
print(out)
print('--------------------------------------------- MyCoreDumper')
out = yaml.dump(data, Dumper=MyCoreDumper)
print(out)
print('--------------------------------------------- MyJSONDumper')
out = yaml.dump(data, Dumper=MyJSONDumper)
print(out)
```
This way people can play with it, we don't promise this wrapper will stay around forever, and the newly created classes CommonDumper/CommonRepresenter aren't exposed.

```python
MyCoreLoader = yaml.experimental_12_Core_loader()
data = yaml.load(input, Loader=MyCoreLoader)

MyCoreDumper = yaml.experimental_12_Core_dumper()
out = yaml.dump(data, Dumper=MyCoreDumper)
```
* Loader/Dumper config mixins to create dynamic types and configure them at instantiation with generated partials
* New `FastestBaseLoader`/`FastestBaseDumper` base classes to auto-select C-back impl if available
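The "dynamic types configured at instantiation" idea from the commit message above can be sketched without PyYAML. Everything here is hypothetical: `ExampleDumper` is a stand-in class, and `configured` is an invented helper, not the mixin API this PR adds. It only illustrates generating a throwaway subclass whose constructor pre-binds some keyword defaults.

```python
class ExampleDumper:
    """Hypothetical stand-in for a dumper; not PyYAML's real signature."""

    def __init__(self, stream, indent=2, sort_keys=True):
        self.stream = stream
        self.indent = indent
        self.sort_keys = sort_keys


def configured(base, **defaults):
    """Return a new subclass of `base` that pre-binds keyword defaults.

    Keyword arguments given at call time still override the defaults.
    """
    def __init__(self, *args, **kwargs):
        base.__init__(self, *args, **{**defaults, **kwargs})

    return type("Configured" + base.__name__, (base,), {"__init__": __init__})


# The generated class stays related to its base in the class hierarchy.
WideDumper = configured(ExampleDumper, indent=4)
dumper = WideDumper(None)
print(dumper.indent, issubclass(WideDumper, ExampleDumper))
```

Because the subclass is an ordinary object, it is garbage-collected like any other once nothing references it. Note that this sketch assumes the configured values are always passed by keyword.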
```python
# preserve wrapped config defaults for values where we didn't get a default
# FIXME: share this code with the one in __init__.dump_all (and implement on others)
dumper_init_kwargs = dict(
```
this would probably be better done with `inspect.signature()`; then it's 100% dynamic off whatever init args the current class accepts
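A rough sketch of what the `inspect.signature()` approach could look like, using a made-up `ExampleDumper` class rather than any real PyYAML class. The helper name `accepted_init_kwargs` is also an assumption for illustration. It keeps only the keyword arguments that the class's `__init__` actually declares.

```python
import inspect


class ExampleDumper:
    """Hypothetical stand-in; the parameter list is invented for the demo."""

    def __init__(self, stream, default_flow_style=False, indent=None, width=None):
        self.stream = stream
        self.default_flow_style = default_flow_style
        self.indent = indent
        self.width = width


def accepted_init_kwargs(cls, candidates):
    """Filter `candidates` down to names that cls.__init__ accepts by keyword."""
    params = inspect.signature(cls.__init__).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {
        name
        for name, param in params.items()
        if name != "self" and param.kind in keyword_kinds
    }
    return {key: value for key, value in candidates.items() if key in accepted}


# 'sort_keys' is not an __init__ parameter of ExampleDumper, so it is dropped.
kwargs = accepted_init_kwargs(ExampleDumper, {"indent": 4, "sort_keys": True})
dumper = ExampleDumper(stream=None, **kwargs)
print(kwargs)
```

One caveat: `inspect.signature()` can raise `ValueError` for some callables implemented in C, so a C-backed class might still need a fallback.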
This is a great start, Matt! I really like the approach overall of generating a configured subclass around the existing architecture. The new While we can add this method to the Loader and Dumper classes I think that:
I think that's a super clean top level way of using the new ideas you've added.
Also, the I assume you can do this:
All this is to say, I really like where you have gone so far, at least as I understand it. I'd just like to see the common usage idioms be even cleaner. I'll try to write up a file of all the possible usages so we can discuss them among the release team.
Any updates on this?
Here's a quick and dirty crack at a more broadly-encompassing dynamic config system like we've talked about before...
* `TagSet` type with some pre-defined instances for common 1.2 schemas
* `yaml.load()`/`yaml.dump()` et al to get the customized behavior, and the customized subclasses are GC'd once dropped on the floor.
* `FastestBaseLoader` and `FastestBaseDumper` helper classes backed by libyaml if available, and the pure-Python version if not

We can also wrap up the existing individual types that make up the tagsets so that users can pick and choose, and combine partial schemas at will without having to redefine everything. Once that's all done, we should actually be able to completely redefine all the existing currently unrelated Loader and Dumper subclasses using this, and they'll actually be related in the class hierarchy instead of just sharing mixins.
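The "libyaml if available, pure Python if not" selection can be illustrated with a small helper. This is only a sketch of the idea, not how `FastestBaseLoader` is implemented in this PR; `pick_fastest` is a hypothetical name, and the demo uses stand-in namespaces so it runs without PyYAML installed.

```python
from types import SimpleNamespace


def pick_fastest(module, c_name, py_name):
    """Return module.<c_name> when it exists, otherwise module.<py_name>."""
    cls = getattr(module, c_name, None)
    return cls if cls is not None else getattr(module, py_name)


# Stand-in classes and namespaces that mimic a package built with and
# without its C extension (names chosen to resemble PyYAML's).
class PySafeLoader:
    pass


class CSafeLoader:
    pass


with_c = SimpleNamespace(SafeLoader=PySafeLoader, CSafeLoader=CSafeLoader)
without_c = SimpleNamespace(SafeLoader=PySafeLoader)

print(pick_fastest(with_c, "CSafeLoader", "SafeLoader").__name__)
print(pick_fastest(without_c, "CSafeLoader", "SafeLoader").__name__)
```

With PyYAML installed, `pick_fastest(yaml, "CSafeLoader", "SafeLoader")` should behave like the familiar `try: from yaml import CSafeLoader ... except ImportError:` idiom, since the C classes are only present in the `yaml` namespace when the libyaml bindings were built.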
A few examples of what's in and working:
json tagset on fastest available loader
minimal custom tagset
override dumper behavior so it doesn't have to be specified every call to `dump`
Still TODO: