
[common] Introduce testlogger as a workaround of poor lifecycle #1398

Merged
9 commits merged into cadence-workflow:master on Nov 11, 2024

Conversation

@3vilhamster (Contributor) commented Nov 8, 2024

What changed?
I copied over the backend's workaround for racy logs in tests.
We have the same issue in the client, which cannot establish a clean stop for its internal processes on the first stop. This is not a problem for real workflow usage, but it can cause test failures due to racy log messages emitted after the test ends.
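
For context, this is roughly the failure mode being described (a hypothetical sketch, not code from this PR): an internal goroutine that keeps logging through *testing.T after the test body has returned.

```go
package example

import (
	"testing"
	"time"
)

// Hypothetical illustration of the race this PR works around: a background
// goroutine outlives the test and logs through *testing.T too late.
func TestRacyLog(t *testing.T) {
	go func() {
		// Simulates an internal client process that has no clean stop.
		time.Sleep(10 * time.Millisecond)
		// If this runs after the test returns, the testing package panics with
		// "Log in goroutine after TestRacyLog has completed".
		t.Log("late message")
	}()
	// The test returns immediately; the goroutine above may still be running.
}
```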

Sidenote:
I changed the approach from an atomic flag to an RWMutex, since this situation requires the following semantics (see the sketch after this list):

  1. Concurrent log writes are allowed, so taking an RLock does not block them.
  2. On Cleanup we must wait for all in-flight writes and then acquire the write lock.
  3. All writes after that point must go to the fallback logger.
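
A minimal sketch of that locking scheme (hypothetical names and structure, not the actual testlogger code), assuming the logger wraps a *testing.T and a fallback writer:

```go
package testlogger

import (
	"sync"
	"testing"
)

// Sketch of the RWMutex scheme described above; field and method names
// are illustrative only.
type logger struct {
	mu        sync.RWMutex
	completed bool
	t         *testing.T
	fallback  func(msg string) // e.g. writes to stderr instead of t
}

func newLogger(t *testing.T, fallback func(string)) *logger {
	l := &logger{t: t, fallback: fallback}
	t.Cleanup(func() {
		l.mu.Lock() // blocks until all in-flight Log calls release their RLock
		defer l.mu.Unlock()
		l.completed = true
	})
	return l
}

func (l *logger) Log(msg string) {
	l.mu.RLock() // concurrent writes may proceed in parallel
	defer l.mu.RUnlock()
	if l.completed {
		l.fallback(msg) // the test has finished: never touch t again
		return
	}
	l.t.Log(msg)
}
```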

Why?
Improving test stability

How did you test it?
I ran the unit tests a few hundred times.
This is a bit tricky, since TestLoggerShouldNotFailIfLoggedLate requires reruns, so I ran `for i in {1..100}; do go test . --count 1; done`, both with and without the race detector.

Potential risks
Some tests could still be flaky, but they were already failing before this fix.

codecov bot commented Nov 8, 2024

Codecov Report

Attention: Patch coverage is 95.00000% with 5 lines in your changes missing coverage. Please review.

Project coverage is 81.91%. Comparing base (7a3beaa) to head (e209ff6).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/common/testlogger/testlogger.go | 94.56% | 4 Missing and 1 partial ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| internal/common/convert.go | 100.00% <100.00%> (ø) |
| internal/common/testlogger/testlogger.go | 94.56% <94.56%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5ec4386...e209ff6

@Groxx (Member) left a comment


LGTM though I don't see the reason for the added mutex.
If something seemed to be flaky without it, I can pretty much guarantee that it didn't fix it, so there's something else that needs fixing instead.

@Groxx (Member) commented Nov 8, 2024

From some more reading and thinking and IRL chat: I'm pretty sure the mutex doesn't change anything, but it's not risky or anything. So yeah, still approved, merge any time.

Now that there's a counter-example showing this logger isn't reliable, I think it's pretty clear that the atomics aren't ensuring "fallback logger is used after test completes". They currently allow a goroutine to pause after checking that the atomic is not yet completed, then the test completes, and the goroutine resumes and uses the T logger.

But I can make a PR for that if you want. The mutex here shouldn't make anything worse, and a mutex will be needed to fix the "atomics are not enough" issue anyway.
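
For illustration, a hypothetical atomic-only variant showing that window (not the actual implementation):

```go
package testlogger

import (
	"sync/atomic"
	"testing"
)

// Hypothetical atomic-only logger, kept only to show the gap described above.
type atomicLogger struct {
	completed atomic.Bool
	t         *testing.T
	fallback  func(msg string)
}

func (l *atomicLogger) Log(msg string) {
	if l.completed.Load() {
		l.fallback(msg)
		return
	}
	// A goroutine can be descheduled right here while the test finishes and
	// Cleanup flips completed; when it resumes, it still calls t.Log too late.
	l.t.Log(msg)
}
```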

@3vilhamster merged commit b51e891 into cadence-workflow:master on Nov 11, 2024
13 checks passed
@3vilhamster deleted the introduce-testlogger branch on November 11, 2024 at 14:09