Rust bindings to Google CityHash's C++ API.
CityHash-sys does not load the standard library (a.k.a. no_std).
CityHash provides hash functions for strings. The functions mix the input bits thoroughly but are not suitable for cryptography. CityHash-sys is tested on little-endian architectures but should also work on big-endian ones.
use cityhash_sys::CityHashBuildHasher;
use std::collections::HashMap;
const KEY: &str = "hash";
const VALUE: &str = "me!";
// Create a HashMap that uses CityHash64 to hash keys
let mut map = HashMap::with_hasher(CityHashBuildHasher::default());
map.insert(KEY, VALUE);
assert_eq!(map.get(&KEY), Some(&VALUE));
Note: CityHashBuildHasher is an alias for the 64-bit CityHash64Hasher. CityHash32Hasher and CityHash128Hasher are also available, but their results are still u64. See the documentation for more details.
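The note above says every hasher yields a u64, which follows from the standard library: Hasher::finish always returns u64, whatever the internal width. The sketch below illustrates that wiring with the standard library only. Fnv32Hasher and Fnv32BuildHasher are hypothetical stand-ins (FNV-1a, not CityHash), and the crate's real types may be implemented differently.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Stand-in 32-bit hasher (FNV-1a), used only to illustrate the wiring.
/// It is NOT CityHash and is not part of cityhash-sys.
struct Fnv32Hasher(u32);

impl Default for Fnv32Hasher {
    fn default() -> Self {
        Fnv32Hasher(0x811c_9dc5) // FNV-1a 32-bit offset basis
    }
}

impl Hasher for Fnv32Hasher {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u32::from(byte);
            self.0 = self.0.wrapping_mul(0x0100_0193); // FNV-1a 32-bit prime
        }
    }

    // `Hasher::finish` always returns `u64`, so the 32-bit state is widened.
    fn finish(&self) -> u64 {
        u64::from(self.0)
    }
}

/// Hypothetical alias showing one way a build-hasher alias can be declared.
type Fnv32BuildHasher = BuildHasherDefault<Fnv32Hasher>;

/// Builds a map that hashes keys through the stand-in hasher, then reads a key back.
fn lookup_demo() -> Option<&'static str> {
    let mut map: HashMap<&str, &str, Fnv32BuildHasher> =
        HashMap::with_hasher(Fnv32BuildHasher::default());
    map.insert("hash", "me!");
    map.get("hash").copied()
}
```

Whatever the hasher's internal width, HashMap only ever sees the u64 returned by finish, so a 32-bit state is zero-extended here.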
The Rust bindings provide a safe interface to all of Google's CityHash hash functions that do not use the x86_64 CRC intrinsic:
32-bit hash
// uint32 CityHash32(const char *, size_t);
fn city_hash_32(buf: &[u8]) -> u32;
64-bit hash
// uint64 CityHash64(const char *, size_t);
fn city_hash_64(buf: &[u8]) -> u64;
// uint64 CityHash64WithSeed(const char *, size_t, uint64);
fn city_hash_64_with_seed(buf: &[u8], seed: u64) -> u64;
// uint64 CityHash64WithSeeds(const char *, size_t, uint64, uint64);
fn city_hash_64_with_seeds(buf: &[u8], seed_0: u64, seed_1: u64) -> u64;
128-bit hash
// uint128 CityHash128(const char *, size_t);
fn city_hash_128(buf: &[u8]) -> u128;
// uint128 CityHash128WithSeed(const char *, size_t, uint128);
fn city_hash_128_with_seed(buf: &[u8], seed: u128) -> u128;
// uint64 Hash128to64(const uint128&);
fn city_hash_128_to_64(hash: u128) -> u64;
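To give an idea of what folding a 128-bit hash down to 64 bits involves, here is a pure-Rust sketch in the style of the upstream Hash128to64 helper. The multiplier and mixing steps are quoted from memory of Google's city.h and should be checked against the upstream source. The function name fold_128_to_64 is hypothetical, and the mapping of the u128 halves to the C++ low/high words is an assumption, so results may not match city_hash_128_to_64.

```rust
/// Multiplier of the upstream helper, as recalled from city.h (assumption).
const K_MUL: u64 = 0x9ddf_ea08_eb38_2d69;

/// Sketch of a 128-to-64-bit fold in the style of `Hash128to64`.
/// Assumes the low 64 bits of the `u128` correspond to the C++ low word.
fn fold_128_to_64(hash: u128) -> u64 {
    let low = hash as u64; // truncating cast keeps the low 64 bits
    let high = (hash >> 64) as u64;

    // Murmur-inspired mixing: multiply, xor-shift, repeat.
    let mut a = (low ^ high).wrapping_mul(K_MUL);
    a ^= a >> 47;
    let mut b = (high ^ a).wrapping_mul(K_MUL);
    b ^= b >> 47;
    b.wrapping_mul(K_MUL)
}
```

Each step is either a multiplication by an odd constant or an xor-shift, so both halves of the input influence every output bit region. In real code, prefer the binding itself.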
Note: Depending on your compiler and hardware, CityHash128() is likely faster than CityHash64() on sufficiently long strings. It is slower than necessary on shorter strings.
Some functions are available only if the target is x86_64 and supports at least the sse4.2 target feature, because they use the CRC-32 intrinsic _mm_crc32_u64. To enable those functions, compile with -C target-feature=+sse4.2 or above (avx or avx2).
Note that, depending on the length of the buffer you want to hash, it can be faster to use the non-intrinsic version.
If the buffer to hash is less than 900 bytes, CityHashCrc128WithSeed and CityHashCrc128 internally call CityHash128WithSeed and CityHash128, respectively. In that case, it is better to call CityHash128WithSeed or CityHash128 directly.
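The selection rule described above can be sketched as a small helper that decides which 128-bit variant to call. Everything here is hypothetical and not part of cityhash-sys: the Variant enum, the function names, and the choice to treat a buffer of exactly 900 bytes as short (the conservative reading of the threshold). Only the cfg! predicates are standard Rust.

```rust
/// Which 128-bit variant a caller might pick (hypothetical, not a crate type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Variant {
    /// Non-intrinsic path, e.g. city_hash_128.
    Plain,
    /// CRC-32 intrinsic path, e.g. city_hash_crc_128.
    Crc,
}

/// Length up to which the CRC variants are assumed to bring no benefit,
/// taken from the 900-byte figure in the text.
const CRC_FALLBACK_LEN: usize = 900;

/// True when this build targets x86_64 with sse4.2 enabled at compile time,
/// for example via `-C target-feature=+sse4.2`.
fn crc_compiled_in() -> bool {
    cfg!(all(target_arch = "x86_64", target_feature = "sse4.2"))
}

/// Picks the CRC variant only when it is available and the buffer is long
/// enough; exactly 900 bytes is treated as short.
fn choose_variant(len: usize, crc_available: bool) -> Variant {
    if crc_available && len > CRC_FALLBACK_LEN {
        Variant::Crc
    } else {
        Variant::Plain
    }
}
```

A caller would pass buf.len() and crc_compiled_in(), then call the matching function. The CRC functions are declared unsafe, so that branch still needs an unsafe block.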
128-bit hash with CRC-32 intrinsic
// uint128 CityHashCrc128(const char *, size_t);
unsafe fn city_hash_crc_128(buf: &[u8]) -> u128;
// uint128 CityHashCrc128WithSeed(const char *, size_t, uint128);
unsafe fn city_hash_crc_128_with_seed(buf: &[u8], seed: u128) -> u128;
256-bit hash with CRC-32 intrinsic
// void CityHashCrc256(const char *, size_t, uint64 *);
unsafe fn city_hash_crc_256(buf: &[u8]) -> [u64; 4];
On 64-bit hardware, CityHash is suitable for short string hashing, e.g., most hash table keys, especially city_hash_64, which is faster than city_hash_128.
On 32-bit hardware, CityHash is the nearest competitor of Murmur3 on x86.
See the Google CityHash README for more details.