Skip to content

VEuPathDB/lib-hash-id

Repository files navigation

HashID

Provides an implementation of a concrete type for a collidable 128-bit digest based identifier that is safer and more robust than passing around a raw hash string.

Reasoning

HashID guarantees that its value is a valid MD5 hash value and allows consuming code to rely on that guarantee and make safe assumptions and calls without needing to verify the contents of the ID.

This effectively eliminates the class of bugs that could arise from a missed validation on a raw input string or byte array.

In addition to the standard constructors taking a hex string hash, or a raw byte array, HashID provides convenience methods for generating new HashID instances by generating an MD5 hash of a given arbitrary input, including from InputStream.

Usage

dependencies {
  implementation("org.veupathdb.lib:hash-id:1.1.0")
}

Getting a New HashID

From a raw value
// Create a HashID by MD5 hashing a string value.
var myID_1 = HashID.ofMD5("my raw value");

// Create a HashID by MD5 hashing the contents of an InputStream.
var myID_2 = HashID.ofMD5(someInputStream);

// Create a HashID by MD5 hashing the stringified form of an arbitrary object.
var myID_3 = HashID.ofMD5(someStringifiableValue);
From a hash value
// Construct a hash ID from a valid Hex string
var myID_1 = new HashID(myMD5String);

// Construct a hash ID from a valid `byte[16]` array.
var myID_2 = new HashID(myMD5ByteArray);

Validation

The HashID type validates its wrapped value on construction to ensure that it cannot contain an invalid value. This means that you can rely on the fact that if your code was called, passing in a HashID instance, the wrapped value is, in fact, a valid 128-bit digest.

Invalid Instantiation
try {
  var myID_1 = new HashID("Hello world!");
} catch (IllegalArgumentException e) {
  System.out.println("Oops, can't construct a HashID from an invalid string.")
}

try {
  var myID_2 = new HashID(new byte[19]);
} catch (IllegalArgumentException e) {
  System.out.println("Oops, can't construct a HashID from an invalid byte array.")
}

Schema

Json Schema:
https://raw.githubusercontent.com/VEuPathDB/lib-hash-id/v1.1.0/schema/hash-id.json
RAML DataType
https://raw.githubusercontent.com/VEuPathDB/lib-hash-id/v1.1.0/schema/hash-id.raml

Implementation

HashID is effectively just a "new-type" wrapper around an immutable 16 byte array that offers methods for accessing the raw value as either the contained byte array or as a 32 character hex string.

A HashID is safe to use in Sets or as keys in a Map.