1.1.0

@doulikecookiedough doulikecookiedough released this 21 Nov 18:44
99dbd78

HashStore-java 1.1.0 🎉

Release date: 2024-11-21

Release Notes

The HashStore-java 1.1.0 release brings feature parity with the parallel Python-based HashStore 1.1.0 version. Key features include storing data objects by their content identifiers (hashes of their content), a tagging system that uses persistent identifiers to manage and retrieve data objects and metadata, and support for storing multiple metadata documents (e.g., annotations, sysmeta) for a single identifier.

Overview of Features & Enhancements ⚙️

  • Data objects are now stored with their content identifier, and are managed with reference files that establish the relationship between a data object and its authority-based or persistent identifiers (pids) I-55, I-60
  • Clients can now store a data object without an identifier. They are then expected to call tagObject separately to create the connection between the data object and its identifier. Additionally, we recommend calling deleteIfInvalidObject afterwards, which removes a data object that is determined to be invalid I-65, I-80
  • Removed unused storeObject method overloads to simplify the public API I-82
  • Introduced new utility classes, HashStoreConverter and FileHashStoreLinks, to assist Metacat in migrating existing data and metadata objects into HashStore via the creation of hard links (rather than writing and storing a data object) I-95, I-110, I-101
  • Added guard rails to prevent illegal characters from being used in persistent identifiers I-81
  • Refactored deleteObject to also remove all associated metadata for a given identifier, and improved the atomicity of the process by renaming the files before deleting them I-57
  • Metadata documents (e.g., sysmeta, annotations) are stored with a document name formed from the hash of the pid+format_id, in a HashStore directory formed from the hash of the pid I-61
  • Optimized the .jar file packaging process to include only necessary dependencies to create a more streamlined and efficient build I-67, I-104
  • Clarified exception handling within HashStoreFactory by replacing .fillInStackTrace() with .getMessage() to provide more informative error messages I-68
  • Refined the HashStore initialization process to ensure clients specify a precise folder path (storePath) when creating a HashStore instance via HashStoreFactory. This update includes improved exception handling and clearer messaging for debugging purposes I-69
  • Refactored findObject to be a private method and to provide comprehensive information to assist with internal processes I-72, I-87
  • Explicitly set default permission settings when generating temporary files and directories to ensure HashStore objects are group-readable (facilitating proper access control for external services) I-108, I-111
  • Revised the NonMatchingChecksumException to include the calculated hex digests, providing more detailed information for debugging checksum mismatches I-114
  • Upgraded from Java 8 (1.8) to Java 17 I-54
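
The metadata naming scheme described above (document name from the hash of pid+format_id, directory from the hash of the pid) can be illustrated with a minimal sketch. This is not HashStore's implementation: the class and method names are hypothetical, SHA-256 is an assumed algorithm, and any additional directory layout the store applies is not modeled here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not part of the HashStore API.
public class MetadataPathSketch {

    // Returns the lowercase hex digest of a UTF-8 string.
    static String hexDigest(String input, String algorithm) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Directory comes from hash(pid); document name from hash(pid + formatId).
    // SHA-256 is an assumption made for this sketch.
    static String metadataRelativePath(String pid, String formatId)
            throws NoSuchAlgorithmException {
        String directory = hexDigest(pid, "SHA-256");
        String documentName = hexDigest(pid + formatId, "SHA-256");
        return directory + "/" + documentName;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Placeholder pid and format id values.
        System.out.println(metadataRelativePath("example.pid.1", "example-format-id"));
    }
}
```

Under this scheme, two metadata documents for the same pid share a directory but get distinct document names, which is what allows several metadata formats to coexist for one identifier.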

Bug Fixes 🛠️

  • Resolved potential race condition issues & improved thread safety of tagObject I-75, I-74, I-83, I-97
  • Resolved an issue where temporary .tmp files were not consistently deleted when the same data object was stored multiple times, improving reliability in storage processes I-88
  • Updated inaccurate JUnit examples regarding system metadata format I-62
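
As an illustration of the temporary-file fix listed above, the following sketch shows one common way to guarantee that a .tmp file never outlives a store attempt, including when the same object has already been stored. It is not the project's actual fix: the class name, method name, and file layout are hypothetical, and the sketch assumes the temporary directory and destination are on the same filesystem so that an atomic move is possible.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not part of the HashStore API.
public class TempFileCleanupSketch {

    // Writes data to a temp file, then moves it into place unless the
    // destination already exists. Returns true if a new file was stored.
    static boolean storeOnce(InputStream data, Path tmpDir, Path destination)
            throws IOException {
        Path tmp = Files.createTempFile(tmpDir, "object", ".tmp");
        try {
            // createTempFile already created the file, so replace its contents.
            Files.copy(data, tmp, StandardCopyOption.REPLACE_EXISTING);
            if (Files.exists(destination)) {
                return false; // duplicate: nothing to move
            }
            Files.move(tmp, destination, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } finally {
            // Runs on success, duplicate, and failure alike; a no-op after a move.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Placing the cleanup in a finally block means the duplicate path and any exception path are covered by the same deletion call, rather than relying on each branch to remember it.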