Log of changes in the released versions
- hotfix: `skipND` was not passed correctly to the underlying function when calling `dump_jsonld()` (see the sketch below)
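A minimal usage sketch of the fixed call. Only the names `dump_jsonld` and `skipND` come from this entry; the exact call signature and the semantics of `skipND` (skipping the data of datasets above a given dimensionality) are assumptions:

```python
import h5rdmtoolbox as h5tbx

with h5tbx.File() as h5:  # without arguments a temporary file is created
    h5.attrs["title"] = "demo file"
    h5.create_dataset("u", data=[1.0, 2.0, 3.0])
    filename = h5.hdf_filename

# assumed signature: dump the file metadata as a JSON-LD string;
# skipND is assumed to control which N-dimensional data is skipped
jsonld_str = h5tbx.dump_jsonld(filename, skipND=1)
```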
- `rootgroup` added as alias for `rootparent`
- `ZenodoRecord` has a new property `env_name_for_token` to define the environment variable name to be used for the Zenodo token (see the sketch below)
- bugfix downloading Zenodo files
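A hedged sketch of how the new property might be used. Only the name `env_name_for_token` comes from this entry; the import path, the constructor argument, and whether the property is writable per instance are assumptions:

```python
import os

from h5rdmtoolbox.repository.zenodo import ZenodoRecord  # assumed import path

# assumption: the API token is stored under a custom environment variable
os.environ["MY_ZENODO_TOKEN"] = "<your-zenodo-api-token>"

repo = ZenodoRecord(1234567)  # hypothetical record ID
repo.env_name_for_token = "MY_ZENODO_TOKEN"  # tell the record where to find the token
```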
- allowing higher versions of `pymongo`
- bugfix dumping JSON-LD
- a JSON-LD string can be assigned to an RDF object (see https://h5rdmtoolbox.readthedocs.io/en/latest/userguide/wrapper/FAIRAttributes.html)
- make compliant with higher `pydantic` and `ontolutils` versions
- concrete version selection for other dependencies
- Downloaded files are cached by their checksum and/or URL. This avoids multiple downloads of the same file.
- `RepositoryFile` has a new abstract property `suffix`, `RepositoryInterface` has a new abstract method `get_jsonld`
- `RepositoryInterface` has new abstract properties `identifier` and `title` (a sketch of the abstract layer follows below)
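A sketch of what the abstract layer named in these entries might look like. The class and member names come from the changelog; all signatures and return types are assumptions:

```python
from abc import ABC, abstractmethod


class RepositoryFile(ABC):
    """A single file of a repository record."""

    @property
    @abstractmethod
    def suffix(self) -> str:
        """File suffix, e.g. '.jsonld' (return type assumed)."""


class RepositoryInterface(ABC):
    """Interface to an online data repository such as Zenodo."""

    @property
    @abstractmethod
    def identifier(self) -> str:
        """Persistent identifier of the record (return type assumed)."""

    @property
    @abstractmethod
    def title(self) -> str:
        """Title of the record."""

    @abstractmethod
    def get_jsonld(self) -> str:
        """Return the record metadata as a JSON-LD string (assumed)."""
```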
- bugfixes
- update package dependency versions
- minor bugfixes and updates in documentation
- using suffix `.jsonld` instead of `.json` for JSON-LD files, as recommended (see https://www.w3.org/TR/json-ld/#iana-considerations)
- bugfixes in documentation (links, figures, ...)
- enhancing Zenodo interfaces:
  - removed deprecated methods, e.g. `get()` from `AbstractZenodoInterface`
  - using a cached JSON dict for Zenodo records; call `refresh()` to update the JSON
- minor bugfixes
- introduced property `files`, which is `Dict[str, RepositoryFile]` (see the sketch below)
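A hedged usage sketch combining the cached record metadata, `refresh()` and the new `files` property; the import path and record ID are assumptions:

```python
from h5rdmtoolbox.repository.zenodo import ZenodoRecord  # assumed import path

repo = ZenodoRecord(1234567)  # hypothetical record ID

# files is a Dict[str, RepositoryFile]:
for name, file in repo.files.items():
    print(name, file.suffix)

repo.refresh()  # re-fetch the cached JSON of the record from Zenodo
```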
- improve URL handling by using properties instead of class variables
- removed deprecated methods
- The repository interface to Zenodo has one single upload method `upload_file` with the parameter `metamapper`. It is a callable which extracts meta information from the actual file to be uploaded. This is especially useful and specifically intended for HDF5 files. Unless the value for `metamapper` is `None`, the `upload_file` method will use the built-in HDF5 extraction function automatically on HDF5 files. A hedged sketch follows below.
- Clarified the abstraction for HDF5 database interfaces: `HDF5DBInterface` is the top abstraction from which `ExtHDF5Interface` inherits. `ExtHDF5Interface` makes use of external databases such as MongoDB.
- fix issue in online documentation: `mongomock` is used to run the MongoDB Jupyter notebook in the documentation
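A hedged sketch of a custom `metamapper` callable. Only the method name `upload_file` and the parameter name `metamapper` come from this entry; the expected signature and return value are assumptions:

```python
from typing import Dict

import h5py


def my_metamapper(filename) -> Dict:
    """Hypothetical metamapper: collect the root attributes of an HDF5 file
    as the meta information accompanying the upload."""
    with h5py.File(filename, "r") as h5:
        return dict(h5.attrs)


# assumed call on a repository interface instance `repo`:
# repo.upload_file("data.h5", metamapper=my_metamapper)
```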
- codemeta.json file is updated with author and institution ROR ID
- calling the RDF accessor on an attribute name will only work if the attribute already exists. If not, an error is raised.
- Likewise, if an attribute is deleted, the entry in the RDF accessor dictionary is deleted
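A minimal sketch of the behaviour described in the two entries above; the accessor syntax and the raised error type are assumptions based on the linked FAIRAttributes documentation:

```python
import h5rdmtoolbox as h5tbx

with h5tbx.File() as h5:
    h5.attrs["title"] = "my dataset"  # the attribute must exist first ...
    # ... before an RDF predicate can be attached to it (assumed accessor syntax):
    h5.rdf["title"].predicate = "http://purl.org/dc/terms/title"

    # calling the accessor for a non-existing attribute raises an error (type assumed):
    # h5.rdf["does_not_exist"].predicate = "..."  # -> KeyError

    del h5.attrs["title"]  # deleting the attribute also removes its RDF accessor entry
```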
- minor fixes
- add `$in` operator to query functions (see the sketch below)
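A hedged sketch of the MongoDB-style `$in` operator in a query; the `find()` and `create_dataset` signatures and the attribute values are assumptions:

```python
import h5rdmtoolbox as h5tbx

with h5tbx.File() as h5:
    h5.create_dataset("u", data=1.2, attrs={"units": "m/s"})
    h5.create_dataset("p", data=101325.0, attrs={"units": "Pa"})

    # match all objects whose "units" attribute is one of the listed values:
    results = h5.find({"units": {"$in": ["m/s", "km/s"]}})
```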
- add argument `rebase` to layout specifications
- important changes:
  - improved and consistent support of RDF/JSON-LD. This means an HDF5 file can be created from a JSON-LD file and vice versa. The JSON-LD file contains the structural and contextual metadata of the HDF5 file.
  - namespaces are outsourced to `ontolutils`
- minor changes:
  - When a file is opened with a filename which does not exist and mode is None, the file will NOT be created. This was the case in the past, but it may lead to unwanted empty files.
  - Bugfix namespace creation
  - some method renaming and refactoring
  - accessors are refactored and improved (especially shifted away from xarray and fully integrated in HDF)
- Hotfix dumping JSON-LD data (dimension scales were the issue)
- Add codemeta namespace
- Improved JSON-LD export
- Updated qudt namespace
- Colab notebook will be managed on a separate branch; the README link points to that branch
- Improved assignment of IRI to attributes
- Export of a JSON-LD file possible
- Updated documentation
- bugfixes
- bugfix: setting a default value for toolbox validators in a convention YAML file was not working. Fixed it.
- simplified and cleaned up much code, especially the convention subpackage
- added identifier utils
- updated and improved documentation
- fixed an unnecessary call in `create_dataset`, which wrote the data twice. Now, the time to write the data is comparable to the time `h5py` needs (for small datasets `h5py` is still faster due to the constant overhead `h5tbx` adds).
major changes:
- zenodo is not a dependency anymore but is implemented as a new subpackage of the toolbox
- zenodo is part of `repository`, which is designed to provide interfaces to different data repositories (however, only `zenodo` is implemented at the moment)
- the database architecture is changed similarly, such that it has a more logical structure
- both above changes follow a more or less strict inheritance structure from abstract classes defining the interface to repositories or databases (databases are meant to be local, like MongoDB, SQL, etc.; repositories are online data storage, like Zenodo, which allows searching for metadata but not within the raw files). A sketch of the database side of this hierarchy follows below.
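A hedged sketch of the database side of the inheritance structure. The class names `HDF5DBInterface` and `ExtHDF5Interface` appear earlier in this log; all method names and signatures here are assumptions:

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class HDF5DBInterface(ABC):
    """Top abstraction for HDF5 database interfaces."""

    @abstractmethod
    def find(self, flt: Dict) -> List:
        """Query with a MongoDB-style filter (assumed signature)."""


class ExtHDF5Interface(HDF5DBInterface):
    """Abstraction for interfaces backed by an external database such as MongoDB."""
```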
- Python 3.8 to 3.12 inclusive are supported
- IRI as persistent identifier is now supported, which fulfills the F3 requirement of the FAIR principles ("Metadata clearly and explicitly include the identifier of the data they describe", https://www.go-fair.org/fair-principles/)
- package renaming and reorganization: `conventions` is now `convention`, `layout` is now a module, and `repository` is new
- usage of IRI (persistent identifier) is now supported
- scale and offset are now implemented in the package; they should no longer be defined in a convention
- bugfix normalization extension
- bugfix exporting xr.DataArray built with the toolbox to netCDF
- support usage of IRI to describe metadata
- bugfix requirements
- add `$exist` to the `find()`-methods inside HDF files
- bugfix 0D time data as dimension
- module query functions (`find`, ...) can guess the filename from objects that have an `hdf_filename` attribute (`hasattr(obj, 'hdf_filename')`); see the sketch below
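A hedged sketch of this behaviour. Only the `hdf_filename` attribute lookup comes from the entry; the module path of the query function and its filter syntax are assumptions:

```python
import h5rdmtoolbox as h5tbx


class MyCase:
    """Any object exposing an `hdf_filename` attribute (hypothetical example)."""

    def __init__(self, filename):
        self.hdf_filename = filename


case = MyCase("data.h5")
# assumed module path: the query function guesses the filename from case.hdf_filename
results = h5tbx.database.find(case, {"units": "m/s"})
```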
- bugfix in zenodo search (did Zenodo change their API?)
- 0D data is written to MongoDB
- new utils like computing file size
- update to new zenodo_search package due to change in backend at Zenodo.org
- `find`, `find_one` and `distinct` can be called on HDF files (see the sketch below)
- small bugfixes
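A hedged sketch of the three calls. Only the method names come from the entry; the signatures shown are assumptions modeled on MongoDB naming:

```python
import h5rdmtoolbox as h5tbx

with h5tbx.File() as h5:
    h5.create_dataset("u", data=1.0, attrs={"units": "m/s"})
    h5.create_dataset("v", data=2.0, attrs={"units": "m/s"})

    res_all = h5.find({"units": "m/s"})      # all matching objects
    res_one = h5.find_one({"units": "m/s"})  # first match only
    values = h5.distinct("units")            # assumed: distinct attribute values
```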
- bugfix standard attribute validation
- bugfix in `EngMeta.ipynb`
- working with time data is now possible (see the sketch below):
  - time data can be created using the high-level method `create_time_dataset`
  - slicing the above "time datasets" will return xarray data
  - See `docs/wrapper/Misc.ipynb`
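A hypothetical sketch of creating a time dataset. Only the method name `create_time_dataset` comes from this entry; all arguments are assumptions:

```python
from datetime import datetime, timedelta

import h5rdmtoolbox as h5tbx

timestamps = [datetime(2023, 1, 1) + timedelta(seconds=i) for i in range(10)]

with h5tbx.File() as h5:
    ds = h5.create_time_dataset("time", data=timestamps)  # assumed signature
    print(ds[:])  # slicing returns xarray data (per the entry above)
```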
- fixed issue with user-defined & nested standard attributes
- Initial version, published on PyPI