-
Notifications
You must be signed in to change notification settings - Fork 888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GODRIVER-2335 Preemptively cancel in progress operations when SDAM heartbeats timeout. #1423
Changes from 6 commits
eb7f25e
4cfdf49
ef6679f
78e4d2c
b4352fc
d938fed
9dfddd9
97d4d33
10bd0ef
61e0bc0
31b50f1
c3233d8
f998418
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,6 +32,11 @@ var ( | |
"Write commands with snapshot session do not affect snapshot reads": "Test fails frequently. See GODRIVER-2843", | ||
// TODO(GODRIVER-2943): Fix and unskip this test case. | ||
"Topology lifecycle": "Test times out. See GODRIVER-2943", | ||
// The current logic, which was implemented with GODRIVER-2577, only clears pools and cancels in-progress ops if | ||
// the heartbeat fails twice. Therefore, we skip the following spec tests, which requires canceling ops immediately. | ||
"Connection pool clear uses interruptInUseConnections=true after monitor timeout": "Godriver clears after multiple timeout", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we add a comment above these tests with a TODO to resolve or a more concise explanation as to why we are skipping them? Not running these tests avoids ever executing the runOnThread logic. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @qingyang-hu So is the reason we skip this is because these cases don't apply to the Go Driver? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correct. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we skip these spec tests, the new "cancel in-progress operations" feature seems untested by any new or existing tests. We should add tests that make sure the feature works. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm going to update the test cases for GODRIVER-2577 in server_test.go to sync up with the changes. |
||
"Error returned from connection pool clear with interruptInUseConnections=true is retryable": "Godriver clears after multiple timeout", | ||
"Error returned from connection pool clear with interruptInUseConnections=true is retryable for write": "Godriver clears after multiple timeout", | ||
} | ||
|
||
logMessageValidatorTimeout = 10 * time.Millisecond | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
{ | ||
"version": 1, | ||
"style": "unit", | ||
"description": "Connections MUST be interrupted as soon as possible (interruptInUseConnections=true)", | ||
"poolOptions": { | ||
"backgroundThreadIntervalMS": 10000 | ||
}, | ||
"operations": [ | ||
{ | ||
"name": "ready" | ||
}, | ||
{ | ||
"name": "checkOut" | ||
}, | ||
{ | ||
"name": "checkOut", | ||
"label": "conn" | ||
}, | ||
{ | ||
"name": "clear", | ||
"interruptInUseConnections": true | ||
}, | ||
{ | ||
"name": "waitForEvent", | ||
"event": "ConnectionPoolCleared", | ||
"count": 1, | ||
"timeout": 1000 | ||
}, | ||
{ | ||
"name": "waitForEvent", | ||
"event": "ConnectionClosed", | ||
"count": 2, | ||
"timeout": 1000 | ||
}, | ||
{ | ||
"name": "close" | ||
} | ||
], | ||
"events": [ | ||
{ | ||
"type": "ConnectionCheckedOut", | ||
"connectionId": 1, | ||
"address": 42 | ||
}, | ||
{ | ||
"type": "ConnectionCheckedOut", | ||
"connectionId": 2, | ||
"address": 42 | ||
}, | ||
{ | ||
"type": "ConnectionPoolCleared", | ||
"interruptInUseConnections": true | ||
}, | ||
{ | ||
"type": "ConnectionClosed", | ||
"reason": "stale", | ||
"address": 42 | ||
}, | ||
{ | ||
"type": "ConnectionClosed", | ||
"reason": "stale", | ||
"address": 42 | ||
}, | ||
{ | ||
"type": "ConnectionPoolClosed", | ||
"address": 42 | ||
} | ||
], | ||
"ignore": [ | ||
"ConnectionCreated", | ||
"ConnectionPoolReady", | ||
"ConnectionReady", | ||
"ConnectionCheckOutStarted", | ||
"ConnectionPoolCreated", | ||
"ConnectionCheckedIn" | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
version: 1 | ||
style: unit | ||
description: Connections MUST be interrupted as soon as possible (interruptInUseConnections=true) | ||
poolOptions: | ||
# ensure it's not involved by default | ||
backgroundThreadIntervalMS: 10000 | ||
operations: | ||
- name: ready | ||
- name: checkOut | ||
- name: checkOut | ||
label: conn | ||
- name: clear | ||
interruptInUseConnections: true | ||
- name: waitForEvent | ||
event: ConnectionPoolCleared | ||
count: 1 | ||
timeout: 1000 | ||
- name: waitForEvent | ||
event: ConnectionClosed | ||
count: 2 | ||
timeout: 1000 | ||
- name: close | ||
events: | ||
- type: ConnectionCheckedOut | ||
connectionId: 1 | ||
address: 42 | ||
- type: ConnectionCheckedOut | ||
connectionId: 2 | ||
address: 42 | ||
- type: ConnectionPoolCleared | ||
interruptInUseConnections: true | ||
- type: ConnectionClosed | ||
reason: stale | ||
address: 42 | ||
- type: ConnectionClosed | ||
reason: stale | ||
address: 42 | ||
- type: ConnectionPoolClosed | ||
address: 42 | ||
ignore: | ||
- ConnectionCreated | ||
- ConnectionPoolReady | ||
- ConnectionReady | ||
- ConnectionCheckOutStarted | ||
- ConnectionPoolCreated | ||
- ConnectionCheckedIn |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
{ | ||
"version": 1, | ||
"style": "integration", | ||
"description": "clear with interruptInUseConnections = true closes pending connections", | ||
"runOn": [ | ||
{ | ||
"minServerVersion": "4.9.0" | ||
} | ||
], | ||
"failPoint": { | ||
"configureFailPoint": "failCommand", | ||
"mode": "alwaysOn", | ||
"data": { | ||
"failCommands": [ | ||
"isMaster", | ||
"hello" | ||
], | ||
"closeConnection": false, | ||
"blockConnection": true, | ||
"blockTimeMS": 1000 | ||
} | ||
}, | ||
"poolOptions": { | ||
"minPoolSize": 0 | ||
}, | ||
"operations": [ | ||
{ | ||
"name": "ready" | ||
}, | ||
{ | ||
"name": "start", | ||
"target": "thread1" | ||
}, | ||
{ | ||
"name": "checkOut", | ||
"thread": "thread1" | ||
}, | ||
{ | ||
"name": "waitForEvent", | ||
"event": "ConnectionCreated", | ||
"count": 1 | ||
}, | ||
{ | ||
"name": "clear", | ||
"interruptInUseConnections": true | ||
}, | ||
{ | ||
"name": "waitForEvent", | ||
"event": "ConnectionCheckOutFailed", | ||
"count": 1 | ||
} | ||
], | ||
"events": [ | ||
{ | ||
"type": "ConnectionCheckOutStarted" | ||
}, | ||
{ | ||
"type": "ConnectionCreated" | ||
}, | ||
{ | ||
"type": "ConnectionPoolCleared", | ||
"interruptInUseConnections": true | ||
}, | ||
{ | ||
"type": "ConnectionClosed" | ||
}, | ||
{ | ||
"type": "ConnectionCheckOutFailed" | ||
} | ||
], | ||
"ignore": [ | ||
"ConnectionCheckedIn", | ||
"ConnectionCheckedOut", | ||
"ConnectionPoolCreated", | ||
"ConnectionPoolReady" | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
version: 1 | ||
style: integration | ||
description: clear with interruptInUseConnections = true closes pending connections | ||
runOn: | ||
- | ||
minServerVersion: "4.9.0" | ||
failPoint: | ||
configureFailPoint: failCommand | ||
mode: "alwaysOn" | ||
data: | ||
failCommands: ["isMaster","hello"] | ||
closeConnection: false | ||
blockConnection: true | ||
blockTimeMS: 1000 | ||
poolOptions: | ||
minPoolSize: 0 | ||
operations: | ||
- name: ready | ||
- name: start | ||
target: thread1 | ||
- name: checkOut | ||
thread: thread1 | ||
- name: waitForEvent | ||
event: ConnectionCreated | ||
count: 1 | ||
- name: clear | ||
interruptInUseConnections: true | ||
- name: waitForEvent | ||
event: ConnectionCheckOutFailed | ||
count: 1 | ||
events: | ||
- type: ConnectionCheckOutStarted | ||
- type: ConnectionCreated | ||
- type: ConnectionPoolCleared | ||
interruptInUseConnections: true | ||
- type: ConnectionClosed | ||
- type: ConnectionCheckOutFailed | ||
ignore: | ||
- ConnectionCheckedIn | ||
- ConnectionCheckedOut | ||
- ConnectionPoolCreated | ||
- ConnectionPoolReady |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the significance of 10 seconds here? Can we constantize this value akin to
waitForEventTimeout
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the specs: