Add benchmarks for scanning files #36078
Merged
What does this PR do?
It adds benchmarks for scanning files, so we can compare how much slower the fingerprint mode is than using device and inode in the scanner.
Why is it important?
Our customers would want to know the performance impact.
Benchmark Results
Baseline: main branch, before the scanner revamp and before the fingerprint mode was introduced
Git HEAD at a755bbc
Average ~33194198.2 ns/op
After the scanner revamp, with prospector.scanner.fingerprint.enabled: false
Git HEAD at 061cb88
Average ~5398583 ns/op
The scanner optimisations made in #35734 cut the time per operation by ~84% in normal mode (device+inode, not fingerprint).
prospector.scanner.fingerprint.enabled: true, prospector.scanner.fingerprint.length: 1024
The results were taken with the change from #36073 applied.
Git HEAD at 061cb88
Average ~22251610.4 ns/op
Using fingerprint mode with length 1024 takes ~4.1 times as long per operation as the default device+inode mode in the new scanner (put the other way, device+inode needs ~76% less time). However, it still takes ~33% less time than the default device+inode mode in the old scanner (baseline).
prospector.scanner.fingerprint.enabled: true, prospector.scanner.fingerprint.length: 512
Using fingerprint mode with a shorter length of 512 does not significantly affect the performance.
CPU Profile
As you can see from the CPU profile collected for the prospector.scanner.fingerprint.enabled: true, prospector.scanner.fingerprint.length: 1024 case, the main contributor (94.01%) is the syscall to either open or close the file. So, optimisations on hashing would not make much of a difference.
Full report: profile001.pdf
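To illustrate why the open/close syscalls dominate, here is a minimal sketch of the two identification strategies. It is not the FileScanner code from this PR: the function names (deviceInode, fingerprint, compareTwins) are hypothetical, SHA-256 over the first bytes of the file is an assumption, treating files shorter than the fingerprint length as an error is an assumption, and the device+inode lookup is Unix-only.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"syscall"
)

// deviceInode identifies a file by its device and inode numbers.
// A single stat call is enough; the file is never opened.
// Unix-only: it relies on syscall.Stat_t.
func deviceInode(path string) (string, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return "", fmt.Errorf("no stat_t available for %s", path)
	}
	return fmt.Sprintf("%d-%d", uint64(st.Dev), uint64(st.Ino)), nil
}

// fingerprint identifies a file by a hash of its first `length` bytes
// (SHA-256 is an assumption of this sketch). Every call pays for an open,
// a read and a close in addition to the hashing itself.
func fingerprint(path string, length int64) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	// Assumption: a file shorter than `length` is reported as an error.
	if _, err := io.CopyN(h, f, length); err != nil {
		return "", fmt.Errorf("reading %d bytes from %s: %w", length, path, err)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// compareTwins writes two files with identical content into a temporary
// directory and reports whether each strategy considers them the same file.
func compareTwins(length int64) (sameFingerprint, sameInode bool, err error) {
	dir, err := os.MkdirTemp("", "scan-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	data := make([]byte, 2*length)
	for i := range data {
		data[i] = byte('a' + i%26)
	}

	var fps, ids [2]string
	for i, name := range []string{"a.log", "b.log"} {
		path := filepath.Join(dir, name)
		if err = os.WriteFile(path, data, 0o600); err != nil {
			return false, false, err
		}
		if fps[i], err = fingerprint(path, length); err != nil {
			return false, false, err
		}
		if ids[i], err = deviceInode(path); err != nil {
			return false, false, err
		}
	}
	return fps[0] == fps[1], ids[0] == ids[1], nil
}

func main() {
	sameFP, sameID, err := compareTwins(1024)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("same fingerprint:", sameFP, "same device+inode:", sameID)
}
```

In this sketch the fingerprint path opens and closes every file it identifies, while the device+inode path only stats it, which is consistent with the profile being dominated by open/close rather than by hashing.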
Conclusion
Even with the fingerprint mode activated, the new FileScanner is faster than the old FileScanner was before the optimisations introduced in #35734.
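The relative figures can be re-derived from the three averages reported under Benchmark Results. The following is only a small arithmetic sketch; the constant and function names are made up for illustration, and the values are the ns/op averages copied from this description.

```go
package main

import "fmt"

// Average ns/op values copied from the Benchmark Results section.
const (
	baselineNs    = 33194198.2 // old scanner, device+inode (a755bbc)
	inodeNs       = 5398583.0  // new scanner, fingerprint disabled (061cb88)
	fingerprintNs = 22251610.4 // new scanner, fingerprint length 1024 (061cb88)
)

// reduction returns the percentage of time per operation saved
// when going from `from` to `to`.
func reduction(from, to float64) float64 {
	return (from - to) / from * 100
}

// slowdown returns how many times longer `slow` takes than `fast`.
func slowdown(slow, fast float64) float64 {
	return slow / fast
}

func main() {
	fmt.Printf("new device+inode vs baseline: %.1f%% less time\n",
		reduction(baselineNs, inodeNs))
	fmt.Printf("fingerprint vs new device+inode: %.1fx as long\n",
		slowdown(fingerprintNs, inodeNs))
	fmt.Printf("new device+inode vs fingerprint: %.1f%% less time\n",
		reduction(fingerprintNs, inodeNs))
	fmt.Printf("fingerprint vs baseline: %.1f%% less time\n",
		reduction(baselineNs, fingerprintNs))
}
```

Note that the ~76% figure measures the time device+inode saves relative to fingerprint mode; expressed as a ratio, fingerprint mode takes roughly four times as long per operation.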
Related issues