pack-objects: use 64-bit name hash
This change takes the "uniform" full-name hash and the "locality-preserving"
name hash and combines them into a single 64-bit name hash. This increases
the memory load of the packing process (by four bytes per object).

Sorting by name-hash first and full-name-hash second should keep objects
with the same name-hash adjacent while ordering them by full-name-hash
within each group. However, this does not produce a significant difference
in the size of the pack-files:

Test                                           HEAD~1                 HEAD
------------------------------------------------------------------------------------------------
5313.2: thin pack                              0.08(0.06+0.01)        0.08(0.06+0.01) +0.0%
5313.3: thin pack size                                  852.7K                 852.7K +0.0%
5313.4: thin pack with --full-name-hash        0.03(0.01+0.01)        0.03(0.01+0.01) +0.0%
5313.5: thin pack size with --full-name-hash            401.9K                 401.9K +0.0%
5313.6: big pack                               1.02(1.87+0.14)        1.11(1.91+0.20) +8.8%
5313.7: big pack size                                    58.5M                  58.5M -0.0%
5313.8: big pack with --full-name-hash         0.91(1.46+0.12)        1.00(1.52+0.16) +9.9%
5313.9: big pack size with --full-name-hash              58.0M                  58.0M +0.0%
5313.10: repack                                104.65(391.84+12.31)   105.51(383.09+14.05) +0.8%
5313.11: repack size                                    438.6M                 438.4M -0.1%
5313.12: repack with --full-name-hash          22.71(74.94+6.08)      22.91(74.73+6.16) +0.9%
5313.13: repack size with --full-name-hash              167.6M                 168.3M +0.4%
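
The ordering described above follows from how the two 32-bit values are packed: with the name-hash in the upper 32 bits, a plain integer comparison of the 64-bit value compares name-hash first and falls back to full-name-hash only on a tie. A minimal sketch, assuming two hypothetical helpers (`combine_hashes` and `compare_combined` are illustrative names, not functions in Git):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Git): pack two 32-bit hashes into one
 * 64-bit key, with the name-hash in the most-significant bits.
 */
static inline uint64_t combine_hashes(uint32_t name_hash, uint32_t full_name_hash)
{
	return ((uint64_t)name_hash << 32) | (uint64_t)full_name_hash;
}

/* Hypothetical three-way comparison of two combined keys (qsort-style). */
static inline int compare_combined(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Self-check: the high half dominates, the low half only breaks ties. */
static inline void check_ordering(void)
{
	/* Different name-hash: the low 32 bits cannot change the result. */
	assert(compare_combined(combine_hashes(1, 0xffffffffu),
				combine_hashes(2, 0)) < 0);
	/* Same name-hash: the full-name-hash decides. */
	assert(compare_combined(combine_hashes(7, 10),
				combine_hashes(7, 11)) < 0);
	/* The original name-hash is recoverable from the upper half. */
	assert((uint32_t)(combine_hashes(7, 10) >> 32) == 7);
}
```

Casting to `uint64_t` before the shift matters: shifting a 32-bit value left by 32 is undefined behavior in C, which is why the diff below widens `pack_name_hash(name)` first.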

Signed-off-by: Derrick Stolee <stolee@gmail.com>
derrickstolee committed Sep 11, 2024
1 parent 0ec4b66 commit ab5a3e5
Showing 2 changed files with 6 additions and 3 deletions.
7 changes: 5 additions & 2 deletions builtin/pack-objects.c
@@ -268,10 +268,13 @@ static struct oidmap configured_exclusions;
 static struct oidset excluded_by_config;
 static int use_full_name_hash;
 
-static inline uint32_t pack_name_hash_fn(const char *name)
+static inline uint64_t pack_name_hash_fn(const char *name)
 {
 	if (use_full_name_hash)
-		return pack_full_name_hash(name);
+		/* Use name-hash as most-significant bits. */
+		return (((uint64_t)pack_name_hash(name)) << 32) |
+			(uint64_t) pack_full_name_hash(name);
+
 	return pack_name_hash(name);
 }
 
2 changes: 1 addition & 1 deletion pack-objects.h
@@ -88,7 +88,7 @@ struct object_entry {
 	struct pack_idx_entry idx;
 	void *delta_data; /* cached delta (uncompressed) */
 	off_t in_pack_offset;
-	uint32_t hash; /* name hint hash */
+	uint64_t hash; /* name hint hash */
 	unsigned size_:OE_SIZE_BITS;
 	unsigned size_valid:1;
 	uint32_t delta_idx; /* delta base object */