Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance when there are many files #1184

Merged

Conversation

pocke
Copy link
Contributor

@pocke pocke commented Jul 4, 2024

Probably this PR fixes #661.

The execution time of steep check increases with O(n^2) order (n == number of files).
This PR improves the performance. The order will probably be O(n).

Problem

TypeCheckRequest#all_paths is called on each file. This method allocates a large Set if steep check checks many files. It makes steep check slow.

Solution

Avoid allocating the large set in all_paths method.

Benchmarking

I benchmarked steep check command with the following script, creating many .rb and .rbs files.

# Usage: ruby setup.rb 100

require 'pathname'

lib = Pathname('lib').tap { _1.rmtree; _1.mkpath }
sig = Pathname('sig').tap { _1.rmtree; _1.mkpath }

ARGV.first.to_i.times do |i|
  lib.join("foo#{i}.rb").write(<<~RUBY)
    class Foo#{i}
    end
  RUBY
  sig.join("foo#{i}.rbs").write(<<~RUBY)
    class Foo#{i}
    end
  RUBY
end

The result is the following.

Screen Shot 2024-07-04 at 16 32 08

https://docs.google.com/spreadsheets/d/1cF6KGS12i2_dBOW0GQ59EDEkhTnpxR-qAokEoB73VKQ/edit?usp=sharing

With this patch, the execution time looks linear. (The n=10000 case looks strange... I guess the execution time is unexpectedly long for some unintentional reason. But I'm not sure)

Profiling

I found this problem while investigating Steep's memory usage issue. I found a bottleneck for the Set allocation with memory_profiler gem on a Rails application.

I've confirmed that this change dramatically decreases the memory allocation size for Set with the memory profiler.
The results are n=4000 cases.

before

Total allocated: 7563837242 bytes (5427818 objects)
Total retained:  1062365 bytes (171 objects)

allocated memory by gem
-----------------------------------
7114488384  set
 253915960  pathname
  53144417  steep/lib
  47872344  language_server-protocol-3.17.0.3
  46181320  json-2.7.2
  29675372  rbs-3.5.1
  12868648  uri
   3050296  other
   2588320  activesupport-7.1.3.4
     36592  optparse
      3671  rainbow-3.1.1
      3600  securerandom-0.3.1
      2776  bundled_gems
      2001  concurrent-ruby-1.3.3
      1893  arm64-darwin21
      1448  logger-1.6.0
       200  digest

after

Total allocated: 450887826 bytes (5314647 objects)
Total retained:  1062365 bytes (171 objects)

allocated memory by gem
-----------------------------------
 253915960  pathname
  53144417  steep/lib
  47872472  language_server-protocol-3.17.0.3
  46181320  json-2.7.2
  29675372  rbs-3.5.1
  12868648  uri
   3050296  other
   2588320  activesupport-7.1.3.4
   1538880  set
     36592  optparse
      3671  rainbow-3.1.1
      3560  securerandom-0.3.1
      2776  bundled_gems
      2001  concurrent-ruby-1.3.3
      1893  arm64-darwin21
      1448  logger-1.6.0
       200  digest

BTW, I am currently working on reducing the peek memory usage of Steep. This change affects the execution time but probably doesn't affect the peek memory usage, unfortunately, because they're short-lived objects.

Copy link
Owner

@soutaro soutaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@soutaro soutaro merged commit c05eaec into soutaro:master Jul 5, 2024
21 checks passed
@soutaro soutaro added this to the Steep 1.8 milestone Jul 5, 2024
@pocke pocke deleted the Improve_performance_when_there_are_many_files branch July 5, 2024 01:09
@soutaro soutaro added the Released The PR is already included in a published release label Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Released The PR is already included in a published release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Large number of files slows down the checking process.
2 participants