
fix(Full-Partition-Scan): Increase scan interval #5033

Merged
1 commit merged into scylladb:master on Jul 20, 2022

Conversation

@yarongilor (Contributor) commented Jul 19, 2022:

In order to decrease the connection-request load that the scan thread puts on the cluster, the intervals of the four large-partition longevities are increased to 5 minutes.
Fixes: https://github.com/scylladb/scylla-cluster-tests/issues/4960

Notes:

  1. I'm not sure whether this should be backported or not (not a must).
  2. The interval unit is kept in seconds rather than changed to minutes, to preserve the ability to test this lower granularity in the future if needed (a sketch of the unit choice follows below).
    That is because this is an important feature for an important customer (Discord), and scylla-bench doesn't have a stable version supporting it.

https://trello.com/c/U3bhzNUL
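For illustration, a minimal, hypothetical sketch of the kind of scan loop whose interval this PR raises, showing the interval kept in seconds while being set to 5 minutes. The constant name and the loop below are assumptions for the example, not the actual SCT thread implementation; the real change edits the large-partition longevity test configurations.

```python
import time

# Hypothetical sketch only; the real PR edits the longevity test
# configurations rather than a constant like this one.
SCAN_INTERVAL_SECONDS = 5 * 60  # 5 minutes, kept in seconds so a lower
                                # granularity can still be tested later

def run_scan_loop(run_scan, stop_event):
    """Run one full-partition scan per interval until asked to stop."""
    while not stop_event.is_set():
        run_scan()                         # issue the scan queries
        time.sleep(SCAN_INTERVAL_SECONDS)  # throttle connection requests
```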

PR pre-checks (self review)

  • I followed KISS principle and best practices
  • I didn't leave commented-out/debugging code
  • I added the relevant backport labels
  • New configuration options are added and documented (in sdcm/sct_config.py)
  • I have added tests to cover my changes (Infrastructure only - under unit-test/ folder)
  • All new and existing unit tests passed (CI)
  • I have updated the Readme/doc folder accordingly (if needed)

Commit message:

	In order to decrease the connection-request load on the cluster from the scan thread, the intervals of the four large-partition longevities are increased to 5 minutes.
	Fixes: scylladb#4960
@roydahan (Contributor) left a comment:

LGTM.
No need to backport for now.

@fgelcer (Contributor) left a comment:

In issue #4960 it was asked not only to increase the timeout, but also to align the unit used (from seconds to minutes) with the one used by the run_fullscan threads.
Please change the unit to minutes.

@roydahan (Contributor) commented:

I disagree with that.
@yarongilor wrote correctly that he left the unit as seconds in case we want to change it back in the future.

@fgelcer (Contributor) commented Jul 19, 2022:

The issue this PR claims to fix has a very clear request: we should align the meaning of the interval, and also not run the scan at such small intervals.

So if we really want to keep it in seconds, to retain the ability to return to this smaller granularity, we should change the other one to use seconds instead of minutes.
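For illustration only, a sketch of the unit alignment being debated, using one hypothetical helper; it does not reflect the actual options in sdcm/sct_config.py.

```python
# Hypothetical illustration: keep one canonical unit (seconds) internally and
# convert any minute-based interval, so both scan threads share the same unit.
def to_seconds(value: int, unit: str = "s") -> int:
    """Normalize an interval given in seconds ('s') or minutes ('m')."""
    return value * 60 if unit == "m" else value

# 5 minutes and 300 seconds describe the same interval once normalized.
assert to_seconds(5, "m") == to_seconds(300, "s") == 300
```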

@roydahan (Contributor) commented:

Not needed.

@roydahan added the backport/none (Backport is not required) label on Jul 20, 2022.
@roydahan merged commit 415907a into scylladb:master on Jul 20, 2022.
@fgelcer (Contributor) commented Jul 26, 2022:

@roydahan, IIUC we are seeing this issue on the 2022.1 Azure longevity: the instances there are roughly half of what we usually use for the parallel longevity in AWS, so I believe the cluster is heavily overloaded, and these full scans are certainly not helping.

Or maybe we should disable the full scans for Azure until we get a better quota to run the test with the resources we need?
@soyacz, WDYT?

@soyacz (Contributor) commented Jul 27, 2022:

I wouldn't disable it; we don't hit many issues related to high load on the 2022.1 branch (performance differs on master: higher ops/s and load, while on 2022.1 both ops/s and load are lower).
