HashStore-java 1.1.0 🎉
Release date: 2024-11-21
Release Notes
The HashStore-java 1.1.0 release brings feature parity with the parallel Python-based HashStore 1.1.0 version. Key features include storing data objects based on the hashes of their content identifiers, a tagging system that uses persistent identifiers to manage and retrieve data objects and metadata, and support for storing multiple metadata documents (ex. annotations, sysmeta) for a single identifier.
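The storage model described above can be sketched with a small in-memory toy in Java. This is not the HashStore API: the class ToyContentStore and its methods (store, tag, retrieve, objectCount) are hypothetical names chosen for illustration, and the sketch assumes the content identifier is the SHA-256 hex digest of an object's bytes, which these notes do not specify.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of content-addressed storage with pid tagging (hypothetical, not HashStore code). */
final class ToyContentStore {
    private final Map<String, byte[]> objectsByCid = new HashMap<>();
    private final Map<String, String> cidByPid = new HashMap<>();

    /** Assumption: the content identifier (cid) is the SHA-256 hex digest of the bytes. */
    static String contentIdentifier(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Stores the bytes under their cid; identical content is kept only once. */
    String store(byte[] content) {
        String cid = contentIdentifier(content);
        objectsByCid.putIfAbsent(cid, content.clone());
        return cid;
    }

    /** Tags an already stored object with a persistent identifier (pid). */
    void tag(String pid, String cid) {
        if (!objectsByCid.containsKey(cid)) {
            throw new IllegalArgumentException("No object stored for cid: " + cid);
        }
        cidByPid.put(pid, cid);
    }

    /** Resolves a pid to its object bytes, or returns null when the pid has not been tagged. */
    byte[] retrieve(String pid) {
        String cid = cidByPid.get(pid);
        return cid == null ? null : objectsByCid.get(cid).clone();
    }

    /** Number of distinct objects held by the toy store. */
    int objectCount() {
        return objectsByCid.size();
    }

    public static void main(String[] args) {
        ToyContentStore store = new ToyContentStore();
        byte[] content = "abc".getBytes(StandardCharsets.UTF_8);
        String cid = store.store(content);   // store first, without an identifier
        store.tag("example-pid", cid);       // then connect a pid to the object
        System.out.println(cid);
        System.out.println(Arrays.equals(store.retrieve("example-pid"), content));
    }
}
```

Keeping the pid-to-cid mapping separate from the object map is what lets identical content be stored once while still being reachable through an identifier.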
Overview of Features & Enhancements ⚙️
- Data objects are now stored with their content identifier and are managed with reference files that establish the relationship between a data object and its authority-based or persistent identifiers (pids) I-55, I-60
- Clients can now store a data object without an identifier. They are then expected to call tagObject separately to create the connection between a data object and its identifier. Additionally, we recommend calling deleteIfInvalidObject afterwards, which will remove a data object that is determined to be invalid I-65, I-80
- Removed unused storeObject method overloads to simplify the public API I-82
- Introduced new utility classes, HashStoreConverter and FileHashStoreLinks, to assist Metacat in migrating existing data and metadata objects into HashStore via the creation of hard links (rather than writing and storing a data object) I-95, I-110, I-101
- Added guard rails to prevent illegal characters being used for given persistent identifiers I-81
- Refactored deleteObject to also remove all associated metadata for a given identifier and improved the atomicity of the process by first renaming the files before proceeding to delete I-57
- Metadata (ex. sysmeta, annotations) are stored with a document name formed by the hash of the pid+format_id and stored in a hashstore directory formed with the hash of the pid I-61
- Optimized the .jar file packaging process to include only necessary dependencies to create a more streamlined and efficient build I-67, I-104
- Clarified exception handling within HashStoreFactory by replacing .fillInStackTrace() with .getMessage() to provide more informative error messages I-68.
- Refined the HashStore initialization process to ensure clients specify a precise folder path (storePath) when creating a HashStore instance via HashStoreFactory. This update includes improved exception handling and clearer messaging for debugging purposes I-69.
- Refactored findObject to be a private method and to provide comprehensive information to assist with internal processes I-72, I-87
- Explicitly set default permission settings when generating temporary files and directories to ensure HashStore objects are group-readable (facilitating proper access control for external services) I-108, I-111
- Revised the NonMatchingChecksumException to include the calculated hex digests, providing more detailed information for debugging checksum mismatches I-114.
- Upgraded to Java 17 from 1.8 I-54
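The metadata naming rule from the list above (document name from the hash of pid+format_id, directory from the hash of the pid) can be illustrated with a short Java sketch. The class MetadataNaming and its methods are hypothetical helpers, not HashStore code. The sketch assumes SHA-256 over UTF-8 bytes, treats pid+format_id as plain string concatenation, and ignores any directory sharding the real store may apply.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch of the metadata naming rule (hypothetical helper, not HashStore code). */
final class MetadataNaming {

    /** Assumption: SHA-256 over the UTF-8 bytes, rendered as lowercase hex. */
    static String hexDigest(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Directory name: hash of the pid alone, so all documents for one pid share a directory. */
    static String metadataDirectory(String pid) {
        return hexDigest(pid);
    }

    /** Document name: hash of pid+format_id (assumed here to be plain concatenation). */
    static String metadataDocumentName(String pid, String formatId) {
        return hexDigest(pid + formatId);
    }

    /** Relative location of one metadata document under an assumed flat layout. */
    static String relativePath(String pid, String formatId) {
        return metadataDirectory(pid) + "/" + metadataDocumentName(pid, formatId);
    }

    public static void main(String[] args) {
        // Two metadata documents for the same pid land in the same directory
        // but get different document names because their format ids differ.
        System.out.println(relativePath("example-pid", "sysmeta-format"));
        System.out.println(relativePath("example-pid", "annotations-format"));
    }
}
```

Because the directory depends only on the pid, all metadata for one identifier sits together, which fits the deleteObject change that removes every associated metadata document for an identifier.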
Bug Fixes 🛠️
- Resolved potential race condition issues & improved thread safety of tagObject I-75, I-74, I-83, I-97
- Resolved an issue where temporary .tmp files were not consistently deleted when the same data object was stored multiple times, improving reliability in storage processes I-88
- Updated inaccurate JUnit examples regarding system metadata format I-62
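The temporary-file fix noted above can be pictured with the following Java sketch of one way to guarantee cleanup when the same content is stored twice. It is an illustration under stated assumptions, not the actual HashStore change: the class TempFileCleanup, the method storeWithCleanup, the flat directory layout, and the use of SHA-256 for file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: always remove the .tmp file, even when the object already exists (hypothetical). */
final class TempFileCleanup {

    /** Assumption: objects are named by the SHA-256 hex digest of their content. */
    static String hexDigest(byte[] content) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes content to a .tmp file, then moves it to a file named after its digest. */
    static Path storeWithCleanup(Path storeDir, byte[] content) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile(storeDir, "object", ".tmp");
            Files.write(tmp, content);
            Path target = storeDir.resolve(hexDigest(content));
            try {
                Files.move(tmp, target); // without REPLACE_EXISTING this fails if target exists
            } catch (FileAlreadyExistsException duplicate) {
                // Same content was stored before: keep the existing object.
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Runs on success, duplicate, and failure, so no .tmp file is left behind.
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // Best-effort cleanup in this sketch.
                }
            }
        }
    }

    /** Lists the file names in a directory, sorted. */
    static List<String> listNames(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.map(p -> p.getFileName().toString()).sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stores the same content twice in a fresh directory and returns what is left on disk. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("toy-store");
            byte[] content = "abc".getBytes(StandardCharsets.UTF_8);
            storeWithCleanup(dir, content);
            storeWithCleanup(dir, content);
            return listNames(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Putting the delete in a finally block means the duplicate path and the error path share the same cleanup as the success path, where the move has already consumed the temporary file.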