{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":103164780,"defaultBranch":"master","name":"git","ownerLogin":"derrickstolee","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2017-09-11T17:11:03.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/570044?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1726758318.0","currentOid":""},"activityList":{"items":[{"before":"27bd942cc8eca497a06de7933bc9feeeed3bb5d1","after":"4059aca9c00047af8b8b2e06655f06ff4cb52d9e","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-19T20:09:19.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: thread the path-based compression\n\nAdapting the implementation of ll_find_deltas(), create a threaded\nversion of the --path-walk compression step in 'git pack-objects'.\n\nThis involves adding a 'regions' member to the thread_params struct,\nallowing each thread to own a section of paths. We can simplify the way\njobs are split because there is no value in extending the batch based on\nname-hash the way sections of the object entry array are attempted to be\ngrouped. We re-use the 'list_size' and 'remaining' items for the purpose\nof borrowing work in progress from other \"victim\" threads when a thread\nhas finished its batch of work more quickly.\n\nUsing the Git repository as a test repo, the p5313 performance test\nshows that the resulting size of the repo is the same, but the threaded\nimplementation gives gains of varying degrees depending on the number of\nobjects being packed. 
(This was tested on a 16-core machine.)\n\nTest HEAD~1 HEAD\n-------------------------------------------------------------\n5313.6: thin pack with --path-walk 0.01 0.01 +0.0%\n5313.7: thin pack size with --path-walk 475 475 +0.0%\n5313.12: big pack with --path-walk 1.99 1.87 -6.0%\n5313.13: big pack size with --path-walk 14.4M 14.3M -0.4%\n5313.18: repack with --path-walk 98.14 41.46 -57.8%\n5313.19: repack size with --path-walk 197.2M 197.3M +0.0%\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: thread the path-based compression"}},{"before":"80beb909b63c3214e28f6ae01ee529c7399ed369","after":"51a1ba3d0274faf276f866d4bb61365dfabb1a07","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T20:09:19.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. 
Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":"1d8fb85c6c15be0028707a54ee1d6d3d9d6a5c87","after":"f078d6cf89bf5e80cfcd00ae274fb76aa7c40d6c","ref":"refs/heads/backfill-on-full","pushedAt":"2024-09-19T20:09:19.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"backfill: assume --sparse when sparse-checkout is enabled\n\nThe previous change introduced the '--[no-]sparse' option for the 'git\nbackfill' command, but did not assume it as enabled by default. However,\nthis is likely the behavior that users will most often want to happen.\nWithout this default, users with a small sparse-checkout may be confused\nwhen 'git backfill' downloads every version of every object in the full\nhistory.\n\nHowever, this is left as a separate change so this decision can be reviewed\nindependently of the value of the '--[no-]sparse' option.\n\nAdd a test of adding the '--sparse' option to a repo without sparse-checkout\nto make it clear that supplying it without a sparse-checkout is an error.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"backfill: assume --sparse when sparse-checkout is enabled"}},{"before":"a89ee77358d72fe84f9486b7712447ef7d3461dd","after":"1d8fb85c6c15be0028707a54ee1d6d3d9d6a5c87","ref":"refs/heads/backfill-on-full","pushedAt":"2024-09-19T15:54:40.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick 
Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"backfill: assume --sparse when sparse-checkout is enabled\n\nThe previous change introduced the '--[no-]sparse' option for the 'git\nbackfill' command, but did not assume it as enabled by default. However,\nthis is likely the behavior that users will most often want to happen.\nWithout this default, users with a small sparse-checkout may be confused\nwhen 'git backfill' downloads every version of every object in the full\nhistory.\n\nHowever, this is left as a separate change so this decision can be reviewed\nindependently of the value of the '--[no-]sparse' option.\n\nAdd a test of adding the '--sparse' option to a repo without sparse-checkout\nto make it clear that supplying it without a sparse-checkout is an error.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"backfill: assume --sparse when sparse-checkout is enabled"}},{"before":"b84cf7ab5bf66b9f0c6a4a54ed7be9ece0075125","after":"27bd942cc8eca497a06de7933bc9feeeed3bb5d1","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-19T15:54:40.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: thread the path-based compression\n\nAdapting the implementation of ll_find_deltas(), create a threaded\nversion of the --path-walk compression step in 'git pack-objects'.\n\nThis involves adding a 'regions' member to the thread_params struct,\nallowing each thread to own a section of paths. We can simplify the way\njobs are split because there is no value in extending the batch based on\nname-hash the way sections of the object entry array are attempted to be\ngrouped. 
We re-use the 'list_size' and 'remaining' items for the purpose\nof borrowing work in progress from other \"victim\" threads when a thread\nhas finished its batch of work more quickly.\n\nUsing the Git repository as a test repo, the p5313 performance test\nshows that the resulting size of the repo is the same, but the threaded\nimplementation gives gains of varying degrees depending on the number of\nobjects being packed. (This was tested on a 16-core machine.)\n\nTest HEAD~1 HEAD\n-------------------------------------------------------------\n5313.6: thin pack with --path-walk 0.01 0.01 +0.0%\n5313.7: thin pack size with --path-walk 475 475 +0.0%\n5313.12: big pack with --path-walk 1.99 1.87 -6.0%\n5313.13: big pack size with --path-walk 14.4M 14.3M -0.4%\n5313.18: repack with --path-walk 98.14 41.46 -57.8%\n5313.19: repack size with --path-walk 197.2M 197.3M +0.0%\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: thread the path-based compression"}},{"before":"dcb5b05f912f52c73906ee051f4bf87e4139d1dd","after":"80beb909b63c3214e28f6ae01ee529c7399ed369","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T15:54:40.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. 
Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":"21c06f3c563ef3a69cb01d2caecc54b20da5fcbf","after":"dcb5b05f912f52c73906ee051f4bf87e4139d1dd","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T15:52:52.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. 
Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":"2d68dbac2863533fe63b5fda89d981110e727fba","after":"21c06f3c563ef3a69cb01d2caecc54b20da5fcbf","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T15:51:42.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. 
Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":"7b97e4d7a9c01c17fe9f0bba83937b334b0baa52","after":"2d68dbac2863533fe63b5fda89d981110e727fba","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T15:50:20.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. 
Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":null,"after":"7b97e4d7a9c01c17fe9f0bba83937b334b0baa52","ref":"refs/heads/survey-on-full","pushedAt":"2024-09-19T15:05:18.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"survey: add report of \"largest\" paths\n\nSince we are already walking our reachable objects using the path-walk API,\nlet's now collect lists of the paths that contribute most to different\nmetrics. Specifically, we care about\n\n * Number of versions.\n * Total size on disk.\n * Total inflated size (no delta or zlib compression).\n\nThis information can be critical to discovering which parts of the\nrepository are causing the most growth, especially on-disk size. Different\npacking strategies might help compress data more efficiently, but the toal\ninflated size is a representation of the raw size of all snapshots of those\npaths. 
Even when stored efficiently on disk, that size represents how much\ninformation must be processed to complete a command such as 'git blame'.\n\nSince the on-disk size is likely to be fragile, stop testing the exact\noutput of 'git survey' and check that the correct set of headers is\noutput.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"survey: add report of \"largest\" paths"}},{"before":"33fa4316d636a6c5eff97910c2ed6dd452182f56","after":"a89ee77358d72fe84f9486b7712447ef7d3461dd","ref":"refs/heads/backfill-on-full","pushedAt":"2024-09-19T15:05:18.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"backfill: assume --sparse when sparse-checkout is enabled\n\nThe previous change introduced the '--[no-]sparse' option for the 'git\nbackfill' command, but did not assume it as enabled by default. However,\nthis is likely the behavior that users will most often want to happen.\nWithout this default, users with a small sparse-checkout may be confused\nwhen 'git backfill' downloads every version of every object in the full\nhistory.\n\nHowever, this is left as a separate change so this decision can be reviewed\nindependently of the value of the '--[no-]sparse' option.\n\nAdd a test of adding the '--sparse' option to a repo without sparse-checkout\nto make it clear that supplying it without a sparse-checkout is an error.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"backfill: assume --sparse when sparse-checkout is enabled"}},{"before":"a9fc233390ae00e3d4b156be64d6b3974e30d8a1","after":"b84cf7ab5bf66b9f0c6a4a54ed7be9ece0075125","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-19T15:05:18.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick 
Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: thread the path-based compression\n\nAdapting the implementation of ll_find_deltas(), create a threaded\nversion of the --path-walk compression step in 'git pack-objects'.\n\nThis involves adding a 'regions' member to the thread_params struct,\nallowing each thread to own a section of paths. We can simplify the way\njobs are split because there is no value in extending the batch based on\nname-hash the way sections of the object entry array are attempted to be\ngrouped. We re-use the 'list_size' and 'remaining' items for the purpose\nof borrowing work in progress from other \"victim\" threads when a thread\nhas finished its batch of work more quickly.\n\nUsing the Git repository as a test repo, the p5313 performance test\nshows that the resulting size of the repo is the same, but the threaded\nimplementation gives gains of varying degrees depending on the number of\nobjects being packed. 
(This was tested on a 16-core machine.)\n\nTest HEAD~1 HEAD\n-------------------------------------------------------------\n5313.6: thin pack with --path-walk 0.01 0.01 +0.0%\n5313.7: thin pack size with --path-walk 475 475 +0.0%\n5313.12: big pack with --path-walk 1.99 1.87 -6.0%\n5313.13: big pack size with --path-walk 14.4M 14.3M -0.4%\n5313.18: repack with --path-walk 98.14 41.46 -57.8%\n5313.19: repack size with --path-walk 197.2M 197.3M +0.0%\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: thread the path-based compression"}},{"before":null,"after":"965a08a5d526ae75428727d0f9aa22ea22a25ed9","ref":"refs/heads/background-quiet-credentials","pushedAt":"2024-09-19T13:50:02.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"scalar: configure maintenance during 'reconfigure'\n\nThe 'scalar reconfigure' command is intended to update registered repos\nwith the latest settings available. 
However, up to now we were not\nreregistering the repos with background maintenance.\n\nIn particular, this meant that the background maintenance schedule would\nnot be updated if there are improvements between versions.\n\nBe sure to register repos for maintenance during the reconfigure step.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"scalar: configure maintenance during 'reconfigure'"}},{"before":null,"after":"33fa4316d636a6c5eff97910c2ed6dd452182f56","ref":"refs/heads/backfill-on-full","pushedAt":"2024-09-19T02:05:21.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"backfill: assume --sparse when sparse-checkout is enabled\n\nThe previous change introduced the '--[no-]sparse' option for the 'git\nbackfill' command, but did not assume it as enabled by default. However,\nthis is likely the behavior that users will most often want to happen.\nWithout this default, users with a small sparse-checkout may be confused\nwhen 'git backfill' downloads every version of every object in the full\nhistory.\n\nHowever, this is left as a separate change so this decision can be reviewed\nindependently of the value of the '--[no-]sparse' option.\n\nAdd a test of adding the '--sparse' option to a repo without sparse-checkout\nto make it clear that supplying it without a sparse-checkout is an error.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"backfill: assume --sparse when sparse-checkout is enabled"}},{"before":"20e309f8f6a94a4ea826475706d39509b6baec61","after":"7e47fc8cb53647ad92c86801204c3089a5dfe8e6","ref":"refs/heads/full-name","pushedAt":"2024-09-18T20:46:01.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick 
Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight 
thousand distinct values. 279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":"f47464efe72067c96887ddc9e45071b259c3c880","after":"e27b149960d8ebc6c48188947ea8d965477bee50","ref":"refs/heads/full-name-wip","pushedAt":"2024-09-18T20:46:01.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: output debug info about deltas\n\nIn order to debug what is going on during delta calculations, add a\n--debug-file= option to 'git pack-objects'. 
This leads to sending\na JSON-formatted description of the delta information to that file.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: output debug info about deltas"}},{"before":"5cad24b1b0533948e9bd10ba3abc5a565c4eca8d","after":"20e309f8f6a94a4ea826475706d39509b6baec61","ref":"refs/heads/full-name","pushedAt":"2024-09-18T20:22:37.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. 
This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":"2e285fa6b30efd638877a8209e8ede03b1d7fb74","after":"5cad24b1b0533948e9bd10ba3abc5a565c4eca8d","ref":"refs/heads/full-name","pushedAt":"2024-09-18T20:16:39.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. 
This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":null,"after":"f47464efe72067c96887ddc9e45071b259c3c880","ref":"refs/heads/full-name-wip","pushedAt":"2024-09-18T20:16:28.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: output debug info about deltas\n\nIn order to debug what is going on during delta calculations, add a\n--debug-file= option to 'git pack-objects'. 
This leads to sending\na JSON-formatted description of the delta information to that file.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: output debug info about deltas"}},{"before":null,"after":"3fb745257b30a643ee78c9a7c52ab107c82e4745","ref":"refs/heads/full-base","pushedAt":"2024-09-18T20:15:32.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"ci updates\n\nThis batch is solely to unbreak the 32-bit CI jobs that can no\nlonger work with Ubuntu xenial image that is too ancient.\n\nSigned-off-by: Junio C Hamano ","shortMessageHtmlLink":"ci updates"}},{"before":"27ab560b628fc5981c1b10835a3f8d232ec2e336","after":"02d577fd7bd398279b340e2700784bee76dbacb7","ref":"refs/heads/full-name-windows","pushedAt":"2024-09-18T20:05:58.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. 
This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":"5064a49bb6e392221eb938d3178d6b6b6207706b","after":"a9fc233390ae00e3d4b156be64d6b3974e30d8a1","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-18T20:05:58.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: thread the path-based compression\n\nAdapting the implementation of ll_find_deltas(), create a threaded\nversion of the --path-walk compression step in 'git pack-objects'.\n\nThis involves adding a 'regions' member to the thread_params struct,\nallowing each thread to own a section of paths. We can simplify the way\njobs are split because there is no value in extending the batch based on\nname-hash the way sections of the object entry array are attempted to be\ngrouped. 
We re-use the 'list_size' and 'remaining' items for the purpose\nof borrowing work in progress from other \"victim\" threads when a thread\nhas finished its batch of work more quickly.\n\nUsing the Git repository as a test repo, the p5313 performance test\nshows that the resulting size of the repo is the same, but the threaded\nimplementation gives gains of varying degrees depending on the number of\nobjects being packed. (This was tested on a 16-core machine.)\n\nTest HEAD~1 HEAD\n-------------------------------------------------------------\n5313.6: thin pack with --path-walk 0.01 0.01 +0.0%\n5313.7: thin pack size with --path-walk 475 475 +0.0%\n5313.12: big pack with --path-walk 1.99 1.87 -6.0%\n5313.13: big pack size with --path-walk 14.4M 14.3M -0.4%\n5313.18: repack with --path-walk 98.14 41.46 -57.8%\n5313.19: repack size with --path-walk 197.2M 197.3M +0.0%\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: thread the path-based compression"}},{"before":"5dd47c416091846d5fba59d37fef7f11333969fb","after":"5064a49bb6e392221eb938d3178d6b6b6207706b","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-18T19:25:14.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: thread the path-based compression\n\nAdapting the implementation of ll_find_deltas(), create a threaded\nversion of the --path-walk compression step in 'git pack-objects'.\n\nUsing the Git repository as a test repo, the p5313 performance test\nshows that the resulting size of the repo is the same, but the threaded\nimplementation gives gains of varying degrees depending on the number of\nobjects being packed. 
(This was tested on a 16-core machine.)\n\nTest HEAD~1 HEAD\n--------------------------------------------------------------\n5313.6: thin pack with --path-walk 0.01 0.01 +0.0%\n5313.7: thin pack size with --path-walk 475 475 +0.0%\n5313.12: big pack with --path-walk 2.27 2.01 -11.5%\n5313.13: big pack size with --path-walk 13.3M 13.3M +0.0%\n5313.18: repack with --path-walk 98.00 41.53 -57.6%\n5313.19: repack size with --path-walk 215.0K 215.0K +0.0%\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: thread the path-based compression"}},{"before":"6fe77a7eafd7ff8c5fb2d39016a9044081bd8498","after":"27ab560b628fc5981c1b10835a3f8d232ec2e336","ref":"refs/heads/full-name-windows","pushedAt":"2024-09-18T19:25:02.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. 
This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":"1f2ee2ad39b7535ccb74fdc2add50b99d725c402","after":"5dd47c416091846d5fba59d37fef7f11333969fb","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-18T17:41:53.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: refactor path-walk delta phase\n\nPreviously, the --path-walk option to 'git pack-objects' would compute\ndeltas inline with the path-walk logic. This would make the progress\nindicator look like it is taking a long time to enumerate objects, and\nthen very quickly computed deltas.\n\nInstead of computing deltas on each region of objects organized by tree,\nstore a list of regions corresponding to these groups. 
These can later\nbe pulled from the list for delta compression before doing the \"global\"\ndelta search.\n\nThe current implementation is not integrated with threads, but could be\ndone in a future update.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: refactor path-walk delta phase"}},{"before":"955b25ff9b99b6d9d8325a08f6dea647575a8179","after":"1f2ee2ad39b7535ccb74fdc2add50b99d725c402","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-18T15:58:02.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"scalar: enable path-walk during push via config\n\nRepositories registered with Scalar are expected to be client-only\nrepositories that are rather large. This means that they are more likely to\nbe good candidates for using the --path-walk option when running 'git\npack-objects', especially under the hood of 'git push'. Enable this config\nin Scalar repositories.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"scalar: enable path-walk during push via config"}},{"before":null,"after":"6fe77a7eafd7ff8c5fb2d39016a9044081bd8498","ref":"refs/heads/full-name-windows","pushedAt":"2024-09-18T15:38:47.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. 
Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":"63fde3444b539eca217443f595cdb08833db54ba","after":"2e285fa6b30efd638877a8209e8ede03b1d7fb74","ref":"refs/heads/full-name","pushedAt":"2024-09-17T19:30:40.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"test-tool: add helper for name-hash values\n\nAdd a new test-tool helper, name-hash, to output the value of the\nname-hash algorithms for the input list of strings, one per line.\n\nSince the name-hash values can be stored in the .bitmap files, it is\nimportant that these hash functions do not change across Git versions.\nAdd a simple test to t5310-pack-bitmaps.sh to provide some testing of\nthe current values. Due to how these functions are implemented, it would\nbe difficult to change them without disturbing these values.\n\nCreate a performance test that uses test_size to demonstrate how\ncollisions occur for these hash algorithms. 
This test helps inform\nsomeone as to the behavior of the name-hash algorithms for their repo\nbased on the paths at HEAD.\n\nMy copy of the Git repository shows modest statistics around the\ncollisions of the default name-hash algorithm:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 4.5K\n5314.2: number of distinct name-hashes 4.1K\n5314.3: number of distinct full-name-hashes 4.5K\n5314.4: maximum multiplicity of name-hashes 13\n5314.5: maximum multiplicity of fullname-hashes 1\n\nHere, the maximum collision multiplicity is 13, but around 10% of paths\nhave a collision with another path.\n\nIn a more interesting example, the microsoft/fluentui [1] repo had these\nstatistics at time of committing:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 19.6K\n5314.2: number of distinct name-hashes 8.2K\n5314.3: number of distinct full-name-hashes 19.6K\n5314.4: maximum multiplicity of name-hashes 279\n5314.5: maximum multiplicity of fullname-hashes 1\n\n[1] https://github.com/microsoft/fluentui\n\nThat demonstrates that of the nearly twenty thousand path names, they\nare assigned around eight thousand distinct values. 
279 paths are\nassigned to a single value, leading the packing algorithm to sort\nobjects from those paths together, by size.\n\nIn this repository, no collisions occur for the full-name-hash\nalgorithm.\n\nIn a more extreme example, an internal monorepo had a much worse\ncollision rate:\n\nTest this tree\n-----------------------------------------------------------------\n5314.1: paths at head 221.6K\n5314.2: number of distinct name-hashes 72.0K\n5314.3: number of distinct full-name-hashes 221.6K\n5314.4: maximum multiplicity of name-hashes 14.4K\n5314.5: maximum multiplicity of fullname-hashes 2\n\nEven in this repository with many more paths at HEAD, the collision rate\nwas low and the maximum number of paths being grouped into a single\nbucket by the full-path-name algorithm was two.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"test-tool: add helper for name-hash values"}},{"before":null,"after":"955b25ff9b99b6d9d8325a08f6dea647575a8179","ref":"refs/heads/path-walk-on-full","pushedAt":"2024-09-13T14:43:46.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"scalar: enable path-walk during push via config\n\nRepositories registered with Scalar are expected to be client-only\nrepositories that are rather large. This means that they are more likely to\nbe good candidates for using the --path-walk option when running 'git\npack-objects', especially under the hood of 'git push'. 
Enable this config\nin Scalar repositories.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"scalar: enable path-walk during push via config"}},{"before":"ab5a3e562ae3b944f43e658284e4736f283468df","after":"63fde3444b539eca217443f595cdb08833db54ba","ref":"refs/heads/full-name","pushedAt":"2024-09-12T03:01:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"derrickstolee","name":"Derrick Stolee","path":"/derrickstolee","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/570044?s=80&v=4"},"commit":{"message":"pack-objects: use two-pass delta calculation\n\nFirst by full-name-hash, then by name-hash to catch deltas across the same\npath.\n\nSigned-off-by: Derrick Stolee ","shortMessageHtmlLink":"pack-objects: use two-pass delta calculation"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xOVQyMDowOToxOS4wMDAwMDBazwAAAAS7RxOi","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xOVQyMDowOToxOS4wMDAwMDBazwAAAAS7RxOi","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOS0xMlQwMzowMTowOC4wMDAwMDBazwAAAASz4Guq"}},"title":"Activity · derrickstolee/git"}