Releases: DataONEorg/hashstore
1.1.0
HashStore 1.1.0 🎉
Release date: 2024-10-01
Release Notes
This minor HashStore release refactors the storage of data objects from persistent identifier-based to content identifier-based hashes with a tagging system, while optimizing thread safety, synchronization, readability, and logging.
Overview of Major Changes ⚙️
- Data objects are now stored with their content identifier and are managed with reference files that establish the relationship between a data object and its authority-based or persistent identifiers (pids) I-73
- Clients can now store a data object without an identifier. They are then expected to call `tag_object` separately to create the connection between the data object and its identifier. Additionally, we recommend calling `delete_if_invalid_object` afterwards, which removes a data object that is determined to be invalid
- Refactored `delete_object` to also remove all associated metadata for a given identifier, and improved the atomicity of the process by renaming the files first and then deleting them I-87
- Metadata (ex. sysmeta, annotations) are now stored with a document name formed by the hash of the `pid+format_id`, in a hashstore directory formed with the hash of the `pid` I-99
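The layout described above can be illustrated with a small toy store. This is only a sketch under stated assumptions, not HashStore's implementation: the class `TinyStore`, its directory names, the helper `sha256_hex`, the use of SHA-256, and the plain string concatenation of `pid` and `format_id` are all hypothetical choices made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class TinyStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, root: Path) -> None:
        # Directory names are assumptions, not HashStore's actual layout.
        self.objects = root / "objects"
        self.pid_refs = root / "refs" / "pids"
        self.cid_refs = root / "refs" / "cids"
        self.metadata = root / "metadata"
        for directory in (self.objects, self.pid_refs, self.cid_refs, self.metadata):
            directory.mkdir(parents=True, exist_ok=True)

    def store_object(self, data: bytes) -> str:
        """Store bytes under their content identifier (cid) and return it."""
        cid = sha256_hex(data)
        path = self.objects / cid
        if not path.exists():  # identical content is written only once
            path.write_bytes(data)
        return cid

    def tag_object(self, pid: str, cid: str) -> None:
        """Link an identifier to a cid using two reference files."""
        # pid reference file: named by the hash of the pid, contains the cid
        (self.pid_refs / sha256_hex(pid.encode())).write_text(cid)
        # cid reference file: lists every pid that points at this content
        cid_ref = self.cid_refs / cid
        pids = set(cid_ref.read_text().splitlines()) if cid_ref.exists() else set()
        pids.add(pid)
        cid_ref.write_text("\n".join(sorted(pids)))

    def find_cid(self, pid: str) -> str:
        """Look up the cid that a pid was tagged with."""
        return (self.pid_refs / sha256_hex(pid.encode())).read_text()

    def metadata_path(self, pid: str, format_id: str) -> Path:
        """Directory is hash(pid); document name is hash(pid + format_id)."""
        directory = sha256_hex(pid.encode())
        document = sha256_hex((pid + format_id).encode())
        return self.metadata / directory / document


# Store first, tag afterwards: the object exists before any identifier does.
store = TinyStore(Path(tempfile.mkdtemp()))
cid = store.store_object(b"example data")
store.tag_object("doi:10.0000/example", cid)
```

Because the object file is named by its content hash, two identifiers tagged to the same bytes share one stored object, and the cid reference file records which identifiers still depend on it.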
New Features & Enhancements 🛠️
- New Public API methods: `tag_object`, `delete_if_invalid_object` and supporting methods & processes. `tag_object` creates reference files linking an identifier (ex. pid) to its content identifier. I-124, I-75, I-76, I-81, I-97, I-101, I-109, I-111, I-113, I-114, I-122
- The keys and values in the `hashstore.yaml` config file are now written with a YAML library to ensure the reliability of the written content I-138
- Misc. improvements to the hashstore client, along with a script entry point that is set up as part of the `poetry install` process and simplifies the client syntax/usage I-92, I-82, I-94, I-106
- Enhanced thread/process synchronization with specific threading and multiprocessing locks to address race conditions (improved pytest time to less than 2s!) I-98
- Added thread safety to all public API calls when working with metadata objects I-99
- Refactored `ObjectMetadata` to be a `@dataclass` I-126
- Various bug fixes and optimizations across the codebase to improve readability and clarity and to resolve linting warnings I-139, I-136, I-72, I-112, I-119, I-121, I-125, I-85
- Revised python docstrings into reStructuredText (`sphinx` autodocumentation compatible format) and added type hints I-70, I-137
- Cleaned up logging statements, which now utilize a logger object I-93, I-140, I-90
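Two of the items above, the `@dataclass` refactor and the identifier-level synchronization, can be sketched briefly. The following is a hedged illustration, not HashStore's code: the field names on `ObjectMetadata` and the helper class `IdentifierLocks` are assumptions, and only the threading case is shown (sharing state across processes would need a multiprocessing-aware structure instead).

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Set


@dataclass
class ObjectMetadata:
    """Hypothetical fields; the notes only say the class is a dataclass."""

    pid: Optional[str]
    cid: str
    obj_size: int
    hex_digests: Dict[str, str] = field(default_factory=dict)


class IdentifierLocks:
    """Let only one thread at a time work on a given identifier (hypothetical)."""

    def __init__(self) -> None:
        self._condition = threading.Condition()
        self._in_use: Set[str] = set()

    def acquire(self, identifier: str) -> None:
        with self._condition:
            # Wait until no other thread holds this identifier.
            while identifier in self._in_use:
                self._condition.wait()
            self._in_use.add(identifier)

    def release(self, identifier: str) -> None:
        with self._condition:
            self._in_use.discard(identifier)
            # Wake waiting threads so they can re-check their identifier.
            self._condition.notify_all()

    @contextmanager
    def holding(self, identifier: str) -> Iterator[None]:
        """Hold the identifier for the duration of a with-block."""
        self.acquire(identifier)
        try:
            yield
        finally:
            self.release(identifier)
```

Locking per identifier rather than with one global lock lets operations on unrelated identifiers run concurrently while still serializing work on the same one.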
1.0.0
HashStore 1.0.0 🎉
We are excited to announce the first major release of HashStore (1.0.0). To start using HashStore, include it in your project using your preferred package manager or download the source code from our GitHub repository. To see code/usage examples, please refer to our documentation.
Key Features
- Public API:
  - store_object
  - store_metadata
  - delete_object
  - delete_metadata
  - retrieve_object
  - retrieve_metadata
  - get_hex_digest
- Command Line Tool
  - Create a new HashStore or interact directly with a HashStore with the command line/terminal
Developer Notes:
- HashStore has been extensively tested with Python's multiprocessing and threading standard libraries. Please see issue-32 for more details.
- Interrupting `store_object` abruptly (such as with a forceful volume unmount or a keyboard interrupt) will leave temporary files behind. To manage these files, we recommend implementing a separate monitor/watchdog to keep on top of the processes.
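One possible shape for such a monitor is a periodic sweep that removes temporary files that have not been modified for a while. This is a minimal sketch under assumptions: the function name `remove_stale_tmp_files`, the idea that temporary files live in a single directory, and the age threshold are all hypothetical and should be adapted to the actual store layout.

```python
import time
from pathlib import Path
from typing import List


def remove_stale_tmp_files(tmp_dir: Path, max_age_seconds: float) -> List[Path]:
    """Delete regular files in tmp_dir not modified within max_age_seconds."""
    cutoff = time.time() - max_age_seconds
    removed: List[Path] = []
    for path in tmp_dir.iterdir():
        try:
            if path.is_file() and path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        except FileNotFoundError:
            # Another process finished with or cleaned up the file first.
            continue
    return removed
```

The age threshold should be comfortably longer than the slowest expected store operation so that files belonging to uploads still in progress are left alone.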
Feedback and Contributions:
- HashStore is an open source project, and we welcome full participation in the project. Contributions are reviewed and suggestions are made to increase the value of HashStore to the community. We strive to incorporate code, documentation, and other useful contributions quickly and efficiently while maintaining a high-quality software product.
- We would appreciate any feedback! If you encounter any issues, have suggestions, or want to contribute to the project, please create an issue or submit a pull request on our GitHub repository.