
Fix flaky tests with 'await' #3972

Merged
merged 7 commits on Oct 17, 2024
Conversation

atakavci
Contributor

No description provided.

@atakavci marked this pull request as draft September 27, 2024 13:07
@atakavci changed the title from "fix flaky test - set and get dummy key" to "Fix flaky test - set and get dummy key" Sep 30, 2024
@atakavci marked this pull request as ready for review September 30, 2024 07:04
Collaborator

@sazzad16 left a comment


3 extra commands seem like a lot. Isn't 1 enough?

@atakavci
Contributor Author

@sazzad16 I agree with you, it should have been OK with 1. I tried with 1 in the past, and it still fails intermittently. I just want to get rid of this flakiness once and for all.
TBH I'm looking for another approach (not a Thread.sleep) but couldn't come up with a clean one.

@sazzad16
Collaborator

I tried with 1 in the past, and it still fails intermittently.

@atakavci In that case, I'd take the Thread.sleep(); at least those would be easier to document and understand 😓

@atakavci
Contributor Author

atakavci commented Sep 30, 2024

@sazzad16 I'm strongly against the use of sleep in unit tests. Since it isn't sensitive or responsive to the environment (resource utilization, latencies, etc.), there will always be an unpredictable requirement for how long we need to wait.
Is it good enough to add some comments and signal the significance of these dummy operations to the future maintainer?

@sazzad16
Collaborator

@atakavci

there will always be an unpredictable requirement for how long we need to wait.

Well, the same thing can be said about the number of commands you're executing. It's possible that if we ran the tests enough times, we would find flaky tests failing that would've passed if 4 or 5 commands were executed instead of 3.

I'm strongly against the use of sleep in unit tests.

I respect that. But in my humble understanding, by executing dummy commands you're just doing Thread.sleep() without writing Thread.sleep(). By executing 3 commands, you're just using a larger sleep time compared to the equivalent time of executing 1 command.

... to the future maintainer

This is what I indicated in my last comment. A line of Thread.sleep() + comments would be much easier to understand than 3 lines of dummy commands + comments.

@atakavci
Contributor Author

atakavci commented Oct 1, 2024

@sazzad16 I see your point; let me try to explain mine.

Well, the same thing can be said about the number of commands you're executing.

This is not correct. When you execute a similar task, you get the same kind of delay, and the delays will always be proportional on average, since the task uses the same resources and is impacted by the same set of factors.
So when you do a 'read X' and then a 'read Y', you get the same ratio of execution times on average.

you're just doing Thread.sleep() without writing Thread.sleep().

Right, in a sense the purpose is the same, but sleep is utilization agnostic. It doesn't care whether the execution is faster or slower than usual/expected. If we put 1000 ms, it will always be 1000 ms (and it's not even guaranteed that the thread will be back exactly after 1000 ms; furthermore, if we start to spread these sleeps here and there in the tests, the pipeline will inevitably start to degrade).
So Thread.sleep will not remain proportional to system throughput and speed, as I described above, yet command execution will.

I appreciate your stance on readability and would really like to improve it, without using sleep.

@killergerbah

Hello, I am seeing some of these tests fail in my own fork of Jedis. In my case the 'simple' test is failing.

Can this be resolved by implementing an assertion method that evaluates eventual consistency? For example, repeatedly querying the key with some small sleeps and a very large total timeout (e.g. 30 seconds). This could look something like:

assertEventuallyTrue(() -> cache.getSize() == 1)
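
A minimal sketch of what such a helper could look like (the helper name follows the suggestion above; the 30-second cap, the 50 ms poll interval, and the cache.getSize() check are illustrative, not an existing Jedis utility):

import static org.junit.Assert.fail;

import java.util.function.BooleanSupplier;

// Illustrative polling assertion: re-evaluates the condition with short sleeps
// until it holds or the overall 30-second timeout is exceeded.
public final class EventualAssertions {

  public static void assertEventuallyTrue(BooleanSupplier condition) throws InterruptedException {
    long deadline = System.currentTimeMillis() + 30_000L;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        fail("condition did not become true within 30 seconds");
      }
      Thread.sleep(50); // short poll between checks
    }
  }
}

// Hypothetical usage from the comment above:
//   assertEventuallyTrue(() -> cache.getSize() == 1);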

@atakavci
Contributor Author

atakavci commented Oct 1, 2024

Hi @killergerbah, I'm trying to figure out the best options, and I'd like to have a way of addressing it without having to wait a fixed, long period of time in case of a failure.

@atakavci requested a review from tishun October 2, 2024 12:50
@tishun

tishun commented Oct 9, 2024

I am also not a fan of doing either Thread.sleep or sending some (arbitrary) number of commands.
Ideally we should be able to react to the environment change, based on some server push, but when we are testing the server push itself we are left with two choices (that I can think of):

  1. the solution that @killergerbah suggested, accepting the shortcoming that it might take us some time to fail (but this is ok as we should mostly NOT fail and would only slow down the tests in case of a regression)
  2. subscribe to an alternative mechanism, e.g. Redis keyspace notifications, where this would work (see the sketch below)

What do you think?
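
For reference, a rough sketch of what option 2 could look like with Jedis keyspace notifications (this assumes a local Redis on the default port, database 0, and that notify-keyspace-events may be enabled from the test; key, event, and class names are illustrative):

import java.util.concurrent.CountDownLatch;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class KeyspaceNotificationSketch {

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch subscribed = new CountDownLatch(1);
    CountDownLatch notified = new CountDownLatch(1);

    // psubscribe blocks, so the subscription needs its own connection and thread.
    Thread subscriber = new Thread(() -> {
      try (Jedis sub = new Jedis("localhost", 6379)) {
        sub.configSet("notify-keyspace-events", "KEA"); // enable keyspace notifications
        sub.psubscribe(new JedisPubSub() {
          @Override
          public void onPSubscribe(String pattern, int subscribedChannels) {
            subscribed.countDown(); // subscription is active; safe to perform the write
          }

          @Override
          public void onPMessage(String pattern, String channel, String message) {
            notified.countDown(); // the server pushed the SET event we were waiting for
            punsubscribe();
          }
        }, "__keyevent@0__:set"); // fired whenever a key in db 0 is SET
      }
    });
    subscriber.start();
    subscribed.await();

    try (Jedis jedis = new Jedis("localhost", 6379)) {
      jedis.set("foo", "bar"); // the write the test wants to observe
    }

    notified.await(); // only assert once the notification has arrived
    subscriber.join();
  }
}

Note that the subscription occupies a second, dedicated connection.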

@atakavci
Contributor Author

subscribe to an alternative mechanism, e.g. Redis keyspace notifications, where this would work

This requires another connection, which leaves the test case possibly flaky again. If we subscribe on the original connection, then it is no different from making a 'write' on the same connection, which will block on a response read.

The first choice remains feasible, though at its heart it is still just a 'sleep'.

There are options like tapping the port for incoming data or the pending buffer, but wouldn't that be the perfect example of over-engineering?
I believe getting the pipeline healthy again is more important than this topic. Let me put in a sleep cycle and we'll move forward.

@ggivo
Contributor

ggivo commented Oct 11, 2024

What do you think about using an already available library for this? There is a nice one that integrates with assertions, Hamcrest, etc.: https://github.com/awaitility/awaitility

@tishun

tishun commented Oct 11, 2024

What do you think about using an already available library for this? There is a nice one that integrates with assertions, Hamcrest, etc.: https://github.com/awaitility/awaitility

I vote for that. It would only add dependencies in the test phase, so it has no impact on the driver itself.
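
As a sketch of how that could look for the flaky check (timings and the cache-size condition are illustrative; Awaitility would be a test-scope dependency only):

import static org.junit.Assert.assertEquals;

import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

import org.awaitility.Awaitility;

public class AwaitilitySketch {

  // Polls every 50 ms and gives up after 5 seconds; the test proceeds as soon as
  // the asserted condition holds, so no fixed sleep is needed.
  static void awaitCacheSize(IntSupplier cacheSize, int expected) {
    Awaitility.await().atMost(5, TimeUnit.SECONDS).pollInterval(50, TimeUnit.MILLISECONDS)
        .untilAsserted(() -> assertEquals(expected, cacheSize.getAsInt()));
  }
}

e.g. awaitCacheSize(cache::getSize, 1) in place of the dummy commands or a Thread.sleep.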

@atakavci
Contributor Author

@sazzad16 It looks like there are other flaky tests in the pipeline.
Could you point out the ones you are aware of?

Comment on lines +347 to +348
Awaitility.await().atMost(5, TimeUnit.SECONDS).pollInterval(50, TimeUnit.MILLISECONDS)
.untilAsserted(() -> assertEquals("aof", j.get("wait")));
Collaborator

❤️


Love it. Got to add it to Lettuce tests too at some point!

sazzad16 previously approved these changes Oct 14, 2024
@sazzad16 changed the title from "Fix flaky test - set and get dummy key" to "Fix flaky tests with 'await'" Oct 14, 2024
@uglide changed the base branch from 5.2.0 to master October 15, 2024 11:34
@uglide dismissed sazzad16's stale review October 15, 2024 11:34

The base branch was changed.


@tishun left a comment


I think this is the best solution for now.


@atakavci merged commit 498fee3 into redis:master Oct 17, 2024
5 checks passed
atakavci added a commit to atakavci/jedis that referenced this pull request Oct 17, 2024
* fix flaky test - dumy set-get to gain time

* adding same commands for 'simple'

* introdue tryAssert in CSC tests

* remove leftovers

* introduce awaitility for polling

* nit

* fix pipelining test
untilasserted
sazzad16 pushed a commit that referenced this pull request Oct 20, 2024