Improve performance when there are many files #1184
Merged
Probably this PR fixes #661.
The execution time of `steep check` increases in O(n^2) order (n == number of files). This PR improves the performance. The order will probably be O(n).
Problem
`TypeCheckRequest#all_paths` is called on each file. This method allocates a large `Set` if `steep check` checks many files, which makes `steep check` slow.
Solution
Avoid allocating the large set in the `all_paths` method.
Benchmarking
I benchmarked the `steep check` command with the following script, which creates many `.rb` and `.rbs` files.
The result is the following.
https://docs.google.com/spreadsheets/d/1cF6KGS12i2_dBOW0GQ59EDEkhTnpxR-qAokEoB73VKQ/edit?usp=sharing
With this patch, the execution time looks linear. (The n=10000 case looks strange. I guess the execution time is unexpectedly long for some unrelated reason, but I'm not sure.)
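To make the Problem and Solution sections above more concrete, here is a minimal sketch of the pattern. It is not Steep's actual code: `RequestSketch`, `include_slow?`, `include_fast?`, and `each_path` are hypothetical names, and the three path sets are an assumption about how such a request might store its paths. It shows why building a merged set once per file costs O(n) per call, and therefore O(n^2) over n files, and how querying the existing sets avoids that allocation.

```ruby
require "set"

# Hypothetical stand-in for a type check request. The class and method
# names are illustrative only and are not taken from Steep's source.
class RequestSketch
  def initialize(library_paths:, signature_paths:, code_paths:)
    @library_paths = library_paths.to_set
    @signature_paths = signature_paths.to_set
    @code_paths = code_paths.to_set
  end

  # Builds a brand-new Set containing every path on each call.
  # With n files this copies O(n) elements per call.
  def all_paths
    @library_paths + @signature_paths + @code_paths
  end

  # Slow lookup: allocates the merged Set just to ask one question.
  # Calling this once per file gives O(n^2) work overall.
  def include_slow?(path)
    all_paths.include?(path)
  end

  # Fast lookup: asks each existing Set in turn, allocating nothing.
  def include_fast?(path)
    @library_paths.include?(path) ||
      @signature_paths.include?(path) ||
      @code_paths.include?(path)
  end

  # Iterates over the paths without materializing a merged Set.
  # A path stored in more than one set would be yielded more than once.
  def each_path(&block)
    return enum_for(:each_path) unless block

    @library_paths.each(&block)
    @signature_paths.each(&block)
    @code_paths.each(&block)
  end
end
```

Both lookups return the same answer for any path. The difference is only that `include_fast?` does constant work per call, so checking every file stays linear in the number of files.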
Profiling
I found this problem while investigating Steep's memory usage issue. Using the `memory_profiler` gem on a Rails application, I found that the `Set` allocation was a bottleneck.
With the memory profiler, I've confirmed that this change dramatically decreases the memory allocation size for `Set`. The results below are for the n=4000 case.
before
after
BTW, I am currently working on reducing the peak memory usage of Steep. This change affects the execution time but, unfortunately, probably doesn't affect the peak memory usage, because the allocated sets are short-lived objects.
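As a rough way to see the short-lived allocations without installing a profiler, the sketch below counts allocated objects with `GC.stat(:total_allocated_objects)` from Ruby's standard runtime. This is a coarse stand-in for the `memory_profiler` gem used above, not a replacement for it: it counts objects rather than bytes, so it understates how large each temporary `Set` is. The helper names and the generated file names are hypothetical.

```ruby
require "set"

# Number of objects allocated while the block runs, read from GC.stat.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Compares two lookup strategies over hypothetical path sets and returns
# [objects allocated when merging, objects allocated when querying directly].
def compare_allocations(n: 4000, lookups: 100)
  code = (1..n).map { |i| "lib/file#{i}.rb" }.to_set
  sigs = (1..n).map { |i| "sig/file#{i}.rbs" }.to_set
  libs = Set["core.rbs"]
  target = "lib/file#{n}.rb"

  # Merge on every lookup: each iteration builds temporary Sets
  # that become garbage as soon as the lookup finishes.
  merged = allocated_objects do
    lookups.times { (libs + sigs + code).include?(target) }
  end

  # Query the existing sets: no merged Set is built.
  direct = allocated_objects do
    lookups.times do
      libs.include?(target) || sigs.include?(target) || code.include?(target)
    end
  end

  [merged, direct]
end

merged, direct = compare_allocations
puts "merged: #{merged} objects, direct: #{direct} objects"
```

Because the temporary sets are discarded right after each lookup, they add allocation and GC work without necessarily raising peak memory, which matches the observation above.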