forked from git/git
for-each-ref: add --count-matches option
In order to count references of different types based on their initial prefixes, there are two current approaches:

1. Run 'git for-each-ref' on all refs and parse the output to count those that match each prefix.
2. Run 'git for-each-ref "prefix/"' and pipe that output to 'wc -l' to count the number of output lines.

Each of these approaches is wasteful as it requires sending the list of matching reference names across a pipe plus the cost of parsing that output.

Instead, it would be helpful to have a Git command that counts the number of refs matching a list of patterns. This change adds a new mode, '--count-matches', to 'git for-each-ref' so we can make use of the existing infrastructure around parsing refspecs in the correct places. '--count' is already taken as a "maximum number" of refs to output.

An alternative approach could be to make a brand-new builtin that is focused on counting ref matches. This would involve duplicating a bit of code around parsing refspecs, but would not be terribly difficult. The actual overlap of implementation here with 'git for-each-ref' is small enough that we could instead extract this elsewhere. My gut feeling is that this behavior doesn't merit a new builtin.

The implementation is extremely simple: iterate through all references and compare each ref to each refspec. On a match, increment a counter value for that refspec. In the end, output each refspec followed by the count.

If all given refspecs were prefixes, then it is tempting to instead use the kind of counting we use for questions like "how many OIDs start with this short hex string?" Navigating to the start and end of the range of refs starting with that prefix is possible in the packed-refs file. However, the records do not have a constant size, so we cannot infer the number of references in that range using the current format. (Perhaps, in the future, we will have a ref storage system that allows this kind of counting to be easy to do in O(log N) time.)
A new performance test is included to check the performance of iterating through these references and counting them appropriately. This presents a 3x improvement over the trivial piping through to 'wc -l', and that assumes there is a single pattern to match instead of multiple. We can see that testing three patterns sequentially adds to the total time, but doing a single process with --count-matches continues to be as fast. (It's difficult to tell since it _also_ matches the sum of the three for this example repo.)

Test                                                          this tree
----------------------------------------------------------------------------
1501.2: count refs/heads/: git for-each-ref | wc -l           0.01(0.00+0.01)
1501.3: count refs/heads/: git for-each-ref --count-match     0.00(0.00+0.00)
1501.4: count refs/tags/: git for-each-ref | wc -l            0.02(0.00+0.02)
1501.5: count refs/tags/: git for-each-ref --count-match      0.00(0.00+0.00)
1501.6: count refs/remotes: git for-each-ref | wc -l          0.15(0.08+0.07)
1501.7: count refs/remotes: git for-each-ref --count-match    0.04(0.01+0.02)
1501.8: count all patterns: git for-each-ref | wc -l          0.18(0.08+0.10)
1501.9: count all patterns: git for-each-ref --count-match    0.04(0.02+0.02)

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
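The counting loop described in the commit message (compare each ref to each pattern, bump a per-pattern counter, print the totals at the end) can be sketched outside of Git as a small shell function. This is only an illustration, not the code from this commit: the function name `count_matches` is hypothetical, patterns are treated as literal prefixes (the real 'git for-each-ref' pattern matching is richer), and the "pattern count" output format is an assumption.

```shell
#!/bin/sh

# count_matches PREFIX...
# Hypothetical helper: reads one ref name per line on stdin and prints
# "PREFIX COUNT" for every prefix given on the command line.
# Assumption: a pattern matches when the ref name starts with it.
count_matches () {
	awk '
	BEGIN {
		# Copy the prefixes out of ARGV and blank the entries so that
		# awk reads refs from stdin instead of opening them as files.
		n = ARGC - 1
		for (i = 1; i <= n; i++) {
			prefix[i] = ARGV[i]
			count[i] = 0
			ARGV[i] = ""
		}
	}
	{
		# Compare the current ref against every prefix.
		for (i = 1; i <= n; i++)
			if (index($0, prefix[i]) == 1)
				count[i]++
	}
	END {
		# Report in the order the prefixes were given.
		for (i = 1; i <= n; i++)
			print prefix[i], count[i]
	}
	' "$@"
}
```

Something like `git for-each-ref --format='%(refname)' | count_matches refs/heads/ refs/tags/` would exercise it against a real repository. Note that this still pays for the pipe and the output parsing, which is exactly the overhead the in-process '--count-matches' mode is meant to avoid.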
1 parent 4aa261d, commit 9121e02
Showing 6 changed files with 145 additions and 3 deletions.
@@ -0,0 +1,35 @@
#!/bin/sh

test_description='Ref iteration performance tests'
. ./perf-lib.sh

test_perf_large_repo

# Optimize ref backend store
test_expect_success 'setup' '
	git pack-refs
'

for pattern in "refs/heads/" "refs/tags/" "refs/remotes"
do
	test_perf "count $pattern: git for-each-ref | wc -l" "
		git for-each-ref $pattern | wc -l
	"

	test_perf "count $pattern: git for-each-ref --count-match" "
		git for-each-ref --count-matches $pattern
	"
done

test_perf "count all patterns: git for-each-ref | wc -l" "
	git for-each-ref refs/heads/ | wc -l &&
	git for-each-ref refs/tags/ | wc -l &&
	git for-each-ref refs/remotes/ | wc -l
"

test_perf "count all patterns: git for-each-ref --count-match" "
	git for-each-ref --count-matches \
		refs/heads/ refs/tags/ refs/remotes/
"

test_done