Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to run linux workflows on large runners #6273

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

rashidnhm
Copy link
Contributor

@rashidnhm rashidnhm commented Sep 13, 2024

Context:

Currently the CI gets congested when large amounts of pull requests are being updated simultaneously. This pull request gives PRs an escape hatch and use large runners and use different queue to have CI jobs be picked up.

Description of the Change:

This pull request adds two new features:

  • Ability to add the urgent label to any pull request and switch it over to large runners
  • Automatic swap of rc branch to large runner
    • This assumes the rc branch is of the format vX.Y.Z-rcN

Large runners, albeit slightly more powerful than standard runners, can be spawned at a much higher volume than standard runners ... this is because we pay per minute for these runners vs being included on our GitHub Plan.

If a PR needs CI run without waiting for a runner, add the urgent label to the pull request.

Important Note:

Benefits:
Ability to leverage large runner to have quick time for a runner to pick up a job.

Possible Drawbacks:
Though we dictate the pool size of large runners, it is possible to still saturate it.

Related GitHub Issues:
None. sc-73711

Copy link
Contributor

Hello. You may have forgotten to update the changelog!
Please edit doc/releases/changelog-dev.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@rashidnhm rashidnhm added the urgent Mark a pull request as high priority label Sep 13, 2024
id: runner_group
env:
# We are not able to use \d to check numeric values as bash does not allow them (not POSIX compliant)
RC_BRANCH_FORMAT_REGEX: v[0-9]+\.[0-9]+\.[0-9]+_rc[0-9]?
Copy link
Contributor

@mudit2812 mudit2812 Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RC_BRANCH_FORMAT_REGEX: v[0-9]+\.[0-9]+\.[0-9]+_rc[0-9]?
RC_BRANCH_FORMAT_REGEX: v[0-9]+\.[0-9]+\.[0-9]+-rc[0-9]?

We usually use dashes between the version number and "rc0", not underscores

Comment on lines +15 to +20
determine_runner:
if: github.event.pull_request.draft == false
name: Determine runner type to use
uses: ./.github/workflows/determine-workflow-runner.yml
with:
default_runner: ubuntu-latest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it make sense to potentially use the large runners for running black and pylint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you say we should exclude format/docs_check/upload to codecov from large runners?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding upload-to-codecov and format makes sense to me. I think including docs has some merit, considering that it's actually very useful for new features or big refactors to verify that the sphinx build passes without issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about it, the more sense it makes to me to use the large runners for the actions above. If a PR is marked as urgent, we should treat all workflows as such.

Copy link
Contributor

@mudit2812 mudit2812 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite good! Just a couple of questions about where it might make sense to not use large runners.

@rashidnhm
Copy link
Contributor Author

There is also a syntax error with the runs-on expression, I will fix it up Monday morning!

Copy link

codecov bot commented Sep 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.58%. Comparing base (d0344b0) to head (18fd83e).
Report is 4 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6273      +/-   ##
==========================================
- Coverage   99.59%   99.58%   -0.01%     
==========================================
  Files         443      443              
  Lines       42255    42273      +18     
==========================================
+ Hits        42082    42096      +14     
- Misses        173      177       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@mudit2812 mudit2812 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the test runtimes expected to stay the same even while using the large runners? I don't see noticeable improvements in the runtimes based on the most recent CI run, especially in the core tests. Additionally, looks like the jax tests are quite imbalanced when using the large runners. We might need different durations.json files for the case where large runners are used.

If that's expected, once the conversation about using large runners with the docs and format workflows is resolved, I'm happy to approve.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
urgent Mark a pull request as high priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants