`full_scan` intervals are in minutes while `run_full_partition_scan` intervals are in seconds #4960

fruch · 2022-06-30T08:15:02Z

run_fullscan: '{"ks_cf": "scylla_bench.test", "interval": 10}' # 'ks.cf|random, interval(min)'

run_full_partition_scan: '{"ks_cf": "scylla_bench.test", "interval": 2, "pk_name":"pk", "rows_count": 5000, "validate_data": "true", "include_data_column": "true"}' # 'ks.cf, interval(sec), partition-key name, number-of-rows-per-partition, validate reversed query output, include data-column or only validate pk + ck'

Some cases have 2-second intervals, which seems to be very problematic (since it reconnects ~3 times each round).

We should align the meaning of `interval`, and also not run the scan at such short intervals.
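To make the unit mismatch concrete, here is a minimal sketch that converts both option values to seconds. The helper `interval_in_seconds` and the `INTERVAL_UNIT_SECONDS` table are hypothetical names used only for illustration, not SCT code; the units are taken from the inline config comments above.

```python
import json

# Unit of the "interval" field for each option, per the inline config comments:
# run_fullscan uses minutes, run_full_partition_scan uses seconds.
INTERVAL_UNIT_SECONDS = {
    "run_fullscan": 60,
    "run_full_partition_scan": 1,
}


def interval_in_seconds(option_name: str, raw_value: str) -> int:
    """Hypothetical helper: parse the JSON option value and return its interval in seconds."""
    params = json.loads(raw_value)
    return params["interval"] * INTERVAL_UNIT_SECONDS[option_name]


fullscan = '{"ks_cf": "scylla_bench.test", "interval": 10}'
partition_scan = (
    '{"ks_cf": "scylla_bench.test", "interval": 2, "pk_name":"pk", '
    '"rows_count": 5000, "validate_data": "true", "include_data_column": "true"}'
)

print(interval_in_seconds("run_fullscan", fullscan))                   # 600
print(interval_in_seconds("run_full_partition_scan", partition_scan))  # 2
```

With the same field name, one scan runs every 600 seconds and the other every 2 seconds.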


fgelcer · 2022-06-30T08:35:46Z

@yarongilor , please fix it ASAP

yarongilor · 2022-07-03T07:18:11Z

@fruch , @fgelcer , why is it problematic? Why is it 'ASAP'? It was a request by @roydahan in the original PR.
@roydahan , can it be changed to a higher value, like 60 seconds?

fgelcer · 2022-07-03T07:21:22Z

> @fruch , @fgelcer , why is it problematic? Why is it 'ASAP'? It was a request by @roydahan in the original PR. @roydahan , can it be changed to a higher value, like 60 seconds?

We may be overloading some tests with these high-frequency scans... and as @fruch says, we have a few reconnects each round, which can have an impact on the whole test/network...

A side change (less important) is to align the units used.

yarongilor · 2022-07-03T07:41:59Z

I don't think it ever caused a network issue or anything similar. It is quite negligible since it uses only a single connection.
The interval format is different on purpose.
These are the only tests that cover reversed queries, and we don't currently have any other alternatives for testing that.
IMPORTANT NOTE: it will be more reasonable to reduce this frequency once a new stable scylla-bench version exists, since it should already include reversed-queries support.
I think until then, we can close this issue unless we see some unwanted impact on tests.

fruch · 2022-07-03T08:00:12Z

> I don't think it ever caused a network issue or anything similar. It is quite negligible since it uses only a single connection. The interval format is different on purpose. These are the only tests that cover reversed queries, and we don't currently have any other alternatives for testing that. IMPORTANT NOTE: it will be more reasonable to reduce this frequency once a new stable scylla-bench version exists, since it should already include reversed-queries support. I think until then, we can close this issue unless we see some unwanted impact on tests.

Who exactly is working on reversed-query support for scylla-bench?

And how is scylla-bench stability related to SCT choking scylla with all those rapid full scans?

Also, it's not 1 single connection, it's 1 connection × number of shards each time a session is opened. And the issue isn't overloading the network but overloading the scylla cluster (which in turn can cause scylla-bench queries to time out, as well as the full scans themselves, in some cases).
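As a rough illustration of that arithmetic, the sketch below estimates how many connections the scan thread opens per minute at different intervals. The shard count of 8 is an assumed value, the 3 sessions per round comes from the "~3 times each round" estimate at the top of this issue, and the function name is hypothetical. The result is an upper bound, since it ignores how long each scan takes and any per-node multiplier.

```python
def connections_per_minute(interval_sec: float, shards: int, sessions_per_round: int = 3) -> float:
    """Upper-bound estimate: every round opens `sessions_per_round` sessions,
    and every session opens one connection per shard."""
    return 60 * sessions_per_round * shards / interval_sec


SHARDS = 8  # assumption for illustration only; real clusters differ

for interval in (2, 120, 300):
    rate = connections_per_minute(interval, SHARDS)
    print(f"interval={interval}s -> up to {rate} connections/minute")
```

Under these assumptions, a 2-second interval opens up to 720 connections per minute, versus 12 at a 2-minute interval.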

…terval make `run_full_partition_scan.interval` 2 minutes instead of 2sec, to avoid overloading scylla during this use case Ref: scylladb#4960

roydahan · 2022-07-17T19:17:41Z

@yarongilor I think in the beginning you set it that low by mistake, and then we saw it could handle this, so we kept it.
I think it's OK now to reduce it to run every few minutes.

Regarding the scylla-bench support, I don't remember what happened with this.
I recall that we got the support for it, but maybe there was a bug in scylla-bench that caused it to crash and we had to revert to the previous version?
Please revive this thread / issue.

yarongilor · 2022-07-19T08:30:57Z

> @yarongilor I think in the beginning you set it that low by mistake, and then we saw it could handle this, so we kept it. I think it's OK now to reduce it to run every few minutes.
>
> Regarding the scylla-bench support, I don't remember what happened with this. I recall that we got the support for it, but maybe there was a bug in scylla-bench that caused it to crash and we had to revert to the previous version? Please revive this thread / issue.

@roydahan , correct, this is what happened.
So there's the following s-b open issue: scylladb/scylla-bench#90
And there are 2 tasks that are 'done':
https://trello.com/c/A93qT8XQ
https://trello.com/c/eaotAoI2

In order to decrease the connection-request load that the scan thread puts on the cluster, the intervals of 4 large-partitions longevities are increased to 5 minutes. Fixes: scylladb#4960

roydahan · 2022-07-19T10:13:07Z

Did you see the comment from Dmitry from February?
He said that s-b exited due to reaching the error limit, not due to a coredump.

fruch assigned yarongilor and fgelcer Jun 30, 2022

fruch added the Bug Something isn't working right label Jun 30, 2022

fruch changed the title ~~rull_scan interval are in minutes while run_full_partition_scan intervals are in secoands~~ full_scan interval are in minutes while run_full_partition_scan intervals are in secoands Jul 3, 2022

roydahan mentioned this issue Jul 19, 2022

fix(Full-Partition-Scan): Increase scan interval #5033

Merged

7 tasks

roydahan closed this as completed in 415907a Jul 20, 2022
