db: add TestWALFailoverRandomized #3893

Merged Aug 28, 2024 (1 commit)


@jbowens (Collaborator) commented Aug 27, 2024

Add a randomized test of WAL failover and recovery of a DB from failover WAL logs.

Close #3865.

@jbowens jbowens requested a review from a team as a code owner August 27, 2024 22:09
@jbowens jbowens requested a review from sumeerbhola August 27, 2024 22:09

@jbowens (Collaborator, Author) commented Aug 27, 2024

Stressing produces a failure:

    open_test.go:1657: 
                Error Trace:    /Users/jackson/go/src/github.com/cockroachdb/pebble/open_test.go:1657
                                                        /Users/jackson/go/src/github.com/cockroachdb/pebble/open_test.go:1693
                Error:          Received unexpected error:
                                pebble: error when replaying WAL: pebble/record: invalid chunk
                                (1) replaying wal 43, offset (secondary/000043-003.log: 0), 105095 from previous files
                                Wraps: (2) attached stack trace
                                  -- stack trace:
                                  | github.com/cockroachdb/pebble.(*DB).replayWAL
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/open.go:885
                                  | github.com/cockroachdb/pebble.Open
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/open.go:505
                                  | github.com/cockroachdb/pebble.TestWALFailoverRandomized.func4
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/open_test.go:1656
                                  | github.com/cockroachdb/pebble.TestWALFailoverRandomized
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/open_test.go:1693
                                  | [...repeated from below...]
                                Wraps: (3) pebble: error when replaying WAL
                                Wraps: (4) forced error mark
                                  | "pebble: corruption"
                                  | github.com/cockroachdb/errors/withstack/*withstack.withStack::
                                Wraps: (5) attached stack trace
                                  -- stack trace:
                                  | github.com/cockroachdb/pebble/internal/base.CorruptionErrorf
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/internal/base/error.go:30
                                  | github.com/cockroachdb/pebble/record.init
                                  |     /Users/jackson/go/src/github.com/cockroachdb/pebble/record/record.go:150
                                  | runtime.doInit1
                                  |     /usr/local/go/src/runtime/proc.go:7176
                                  | runtime.doInit
                                  |     /usr/local/go/src/runtime/proc.go:7143
                                  | runtime.main
                                  |     /usr/local/go/src/runtime/proc.go:253
                                  | runtime.goexit
                                  |     /usr/local/go/src/runtime/asm_arm64.s:1222
                                Wraps: (6) pebble/record: invalid chunk
                                Error types: (1) *hintdetail.withDetail (2) *withstack.withStack (3) *errutil.withPrefix (4) *markers.withMark (5) *withstack.withStack (6) *errutil.leafError
                Test:           TestWALFailoverRandomized

@RaduBerinde (Member) left a comment

:lgtm:

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @jbowens and @sumeerbhola)


open_test.go line 1656 at r1 (raw file):

			t.Log("initiating hard crash")
			setIsCrashing(true)
			// Take a strict clone of the filesystem and use that going forward.

[nit] crash-consistent clone
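
For readers unfamiliar with the term, here is a minimal sketch of what a crash-consistent clone means. It assumes a toy in-memory filesystem in which only data written before the last sync survives a crash. The `toyFS` type and its methods are hypothetical names for illustration; they are not Pebble's `vfs` API and not the code under review. The sketch also simplifies: a real crash could keep some unsynced bytes, and directory syncs are ignored here.

```go
package main

import "fmt"

// toyFile records all written bytes plus the length of the prefix that
// has been synced and would therefore survive a crash.
type toyFile struct {
	data   []byte
	synced int
}

// toyFS is a hypothetical in-memory filesystem used only for this sketch.
type toyFS struct {
	files map[string]*toyFile
}

func newToyFS() *toyFS { return &toyFS{files: make(map[string]*toyFile)} }

// Write appends p to the named file, creating it if needed.
// The new bytes are not durable until Sync is called.
func (fs *toyFS) Write(name string, p []byte) {
	f, ok := fs.files[name]
	if !ok {
		f = &toyFile{}
		fs.files[name] = f
	}
	f.data = append(f.data, p...)
}

// Sync marks everything written to the named file so far as durable.
func (fs *toyFS) Sync(name string) {
	if f, ok := fs.files[name]; ok {
		f.synced = len(f.data)
	}
}

// Read returns the current contents of the named file, or "" if absent.
func (fs *toyFS) Read(name string) string {
	if f, ok := fs.files[name]; ok {
		return string(f.data)
	}
	return ""
}

// CrashClone returns an independent copy that keeps only the synced
// prefix of each file, modeling the state visible after a hard crash.
func (fs *toyFS) CrashClone() *toyFS {
	clone := newToyFS()
	for name, f := range fs.files {
		durable := append([]byte(nil), f.data[:f.synced]...)
		clone.files[name] = &toyFile{data: durable, synced: len(durable)}
	}
	return clone
}

func main() {
	fs := newToyFS()
	fs.Write("000043.log", []byte("batch-1;"))
	fs.Sync("000043.log")
	fs.Write("000043.log", []byte("batch-2;")) // never synced

	crashed := fs.CrashClone()
	fmt.Printf("live:    %q\n", fs.Read("000043.log"))
	fmt.Printf("crashed: %q\n", crashed.Read("000043.log"))
}
```

In this sketch the live filesystem still holds both batches, while the clone holds only `batch-1;`, which is the state a recovery test would reopen the DB against.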

@jbowens (Collaborator, Author) left a comment

TFTR!

The failure was resolved by #3896.

Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @sumeerbhola)

@jbowens jbowens merged commit 983d98b into cockroachdb:master Aug 28, 2024
11 checks passed
@jbowens jbowens deleted the wal-stress branch August 28, 2024 19:26
Issues this pull request may close: wal: reader batch corruption