fix(NODE-6370) response messages to large commands can be lost under load #4245
Description
NODE-6370 describes scenarios in which executing commands can hang indefinitely.
Debugging showed that response messages were sometimes lost because the 'data' listener on the socket was not attached in time. The cause was that the 'drain' event was awaited before the 'data' listener was added (in readMany).
This PR changes when the 'drain' event is handled. This would generally only be observable in these scenarios:
What is changing?
The 'drain' event is no longer awaited when a response will be read. The response cannot arrive until after the socket has drained, so awaiting 'drain' is unnecessary in that case. Skipping the wait ensures that the 'data' listener is attached before the response arrives.
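The ordering problem can be illustrated with a minimal sketch. It is not the driver's code: it assumes a plain EventEmitter as a stand-in for the socket, and the helper names (simulateServer, firstDataOrTimeout, drainThenListen, listenImmediately) are hypothetical. To make the race deterministic, the stand-in emits 'drain' and 'data' back-to-back in the same tick, which is a simplification of what happens under load.

```typescript
import { EventEmitter, once } from 'node:events';

// Hypothetical stand-in for a loaded socket: 'drain' is followed
// immediately by the response arriving as a 'data' event.
function simulateServer(socket: EventEmitter): void {
  setImmediate(() => {
    socket.emit('drain');
    socket.emit('data', 'response');
  });
}

// Resolve with the first 'data' chunk, or null if none arrives within `ms`.
function firstDataOrTimeout(socket: EventEmitter, ms: number): Promise<string | null> {
  return new Promise(resolve => {
    const onData = (chunk: string): void => {
      clearTimeout(timer);
      resolve(chunk);
    };
    const timer = setTimeout(() => {
      socket.off('data', onData);
      resolve(null);
    }, ms);
    socket.once('data', onData);
  });
}

// Old ordering: await 'drain' first, then attach the 'data' listener.
// The continuation after `await` runs only after both events were emitted,
// so the 'data' event has already passed with no listener attached.
async function drainThenListen(socket: EventEmitter): Promise<string | null> {
  await once(socket, 'drain');
  return firstDataOrTimeout(socket, 50);
}

// New ordering: attach the 'data' listener right away, without awaiting 'drain'.
async function listenImmediately(socket: EventEmitter): Promise<string | null> {
  return firstDataOrTimeout(socket, 50);
}
```

In this model, drainThenListen resolves to null (the response is lost and the caller would hang without the timeout), while listenImmediately resolves to 'response'.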
Is there new documentation needed for these changes?
No
What is the motivation for this change?
The current driver is unsafe for use in production environments that experience high load and save large documents, because commands can hang indefinitely when a response message is lost.
Double check the following
npm run check:lint
type(NODE-xxxx)[!]: description
feat(NODE-1234)!: rewriting everything in coffeescript
(This area already has a lot of tests covering it. A test for the actual issue is impractical.)