DRIVERS-716 Improved Bulk Write API #1534

isabelatkinson · 2024-03-01T17:32:24Z

Summary

Specifies the driver API for the new bulkWrite command. The API in the new specification was previously discussed and approved in WRITING-13533. I made a few changes to the syntax, which are called out in inline comments.

A Rust driver implementation of this spec is currently in review (mongodb/mongo-rust-driver#1034). A C driver implementation is also in progress.

Test Plan

I've added unified tests in the crud directory for bulk write options, errors, and results. There are also some prose tests for bulk write batching. Prose tests seemed preferable to extending the unified test runner so that it could build very large documents based on hello response values.

Basic tests are also added for the following specs:

Transactions: executing a bulk write within a transaction
Retryable writes: executing bulk writes with/without multi: true operations
Stable API: appending an API version to a bulk write command

Please complete the following before merging:

Update changelog.
Make sure there are generated JSON files from the YAML test files.
Test changes in at least one language driver.
Test these changes against all server versions and topologies (including standalone, replica set, sharded clusters, and serverless).

source/crud/bulk-write.md

source/crud/tests/README.md

source/crud/bulk-write.md

jmikola

Minor comments on spec files. I'm not reviewing the test files.

jmikola · 2024-04-29T18:30:23Z

source/client-side-operations-timeout/tests/README.md

+### 11. Multi-batch bulkWrites
+
+This test MUST only run against standalones on server versions 8.0 and higher. The `bulkWrite` call takes an
+exceedingly long time on replicasets and sharded clusters. Drivers MAY adjust the timeouts used in this test to allow


Why would an insert batch of 50 1MB documents take an "exceedingly long time"? Also, if a fail point is being used on each command, couldn't you also get by using fewer, larger documents that split into 2+ commands and trigger the timeout?

isabelatkinson · 2024-04-30

Good point, updated to insert a few large documents. I lifted most of this test from the existing insertMany test for consistency, but the actual writes happening shouldn't matter much as long as the same blockConnection is being used. Some basic local benchmarking of inserting a few large documents didn't show any time differences based on topology, so I also removed that language.

Caveat: Kevin and I can't verify that this works because neither the Rust nor the C driver implements CSOT, so we'll need to wait for a driver that does have CSOT to implement this.
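
For readers following this thread, here is a rough sketch of the kind of `failCommand` fail point being discussed: it blocks each matching `bulkWrite` command for a fixed time, so a multi-batch bulk write can exceed the operation timeout no matter how small the individual writes are. The helper name and the `times` and `blockTimeMS` values are illustrative assumptions, not the values used by the test.

```python
def make_block_fail_point(command_name, times, block_ms):
    """Build a failCommand fail point document that delays matching commands."""
    return {
        "configureFailPoint": "failCommand",
        "mode": {"times": times},          # how many commands to affect
        "data": {
            "failCommands": [command_name],
            "blockConnection": True,       # delay the command instead of failing it
            "blockTimeMS": block_ms,
        },
    }


# Hypothetical values: block two bulkWrite batches for about one second each.
fail_point = make_block_fail_point("bulkWrite", times=2, block_ms=1010)
```

With a fail point like this, two blocked batches are enough to trip a timeout shorter than the combined block time, which is why a few large documents that split into 2+ commands suffice.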

jmikola · 2024-04-29T18:38:05Z

source/crud/bulk-write.md

+successful operations and errors will be returned. This field is optional and defaults to false on
+the server.
+
+`errorsOnly` corresponds to the `verboseResults` option defined on `BulkWriteOptions`. If the user


Suggested change

-`errorsOnly` corresponds to the `verboseResults` option defined on `BulkWriteOptions`. If the user
+`errorsOnly` corresponds inversely to the `verboseResults` option defined on `BulkWriteOptions`. If the user

Same language used earlier in BulkWriteOptions definition.
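
The inverse relationship can be illustrated with a small sketch. The function name is hypothetical, and the `False` default for `verbose_results` is an assumption about the option's default rather than something stated in this hunk.

```python
def errors_only_for(verbose_results=False):
    """Map the driver-side verboseResults option to the command's errorsOnly field."""
    # The two flags are inverses: asking for verbose results means the server
    # must not restrict its reply to errors only.
    return not verbose_results


# Fragment of a command document; other fields are omitted for brevity.
summary_command = {"bulkWrite": 1, "errorsOnly": errors_only_for()}
verbose_command = {"bulkWrite": 1, "errorsOnly": errors_only_for(verbose_results=True)}
```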

jmikola · 2024-04-29T18:46:03Z

source/crud/bulk-write.md

+while writeModels.hasNext() {
+ ops = DocumentSequence {}
+ nsInfo = DocumentSequence {}
+ loop {


I inferred that Rust is being used here, but in the interest of readability across teams, can we consider wrapping the `while` and `if` conditions in parentheses and replacing `loop` with something like `while (true)`?

AFAIK, most of the other specs use something resembling JavaScript for their code examples. I'm less concerned about this specific example and more about setting a precedent for other authors using their own languages in specs.

If no one else cares, you can disregard this comment.
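
As a language-neutral illustration of the loop shape in the quoted hunk, here is a minimal Python sketch of splitting write models into batches. It is not the spec's algorithm: the `max_ops` and `max_bytes` limits and the `size_of` callback are hypothetical stand-ins for the server-reported limits and encoded sizes, and the `nsInfo` sequence is left out.

```python
def split_into_batches(write_models, max_ops, max_bytes, size_of):
    """Group write models into batches bounded by op count and total byte size."""
    batches = []
    models = iter(write_models)
    pending = next(models, None)
    while pending is not None:          # outer loop: one iteration per batch
        ops = []
        ops_bytes = 0
        while True:                     # inner loop: fill the current batch
            size = size_of(pending)
            if ops and (len(ops) == max_ops or ops_bytes + size > max_bytes):
                break                   # batch is full; pending starts the next one
            if not ops and size > max_bytes:
                raise ValueError("a single write model exceeds the byte budget")
            ops.append(pending)
            ops_bytes += size
            pending = next(models, None)
            if pending is None:
                break                   # no more models to consume
        batches.append(ops)
    return batches


# Strings stand in for encoded write models; len() stands in for their byte size.
by_count = split_into_batches(["aaaa", "bbbb", "cccc"], max_ops=2, max_bytes=100, size_of=len)
by_bytes = split_into_batches(["aaaa", "bbbb", "cccc"], max_ops=10, max_bytes=8, size_of=len)
```

Both calls produce two batches: the first is cut by the operation count, the second by the byte budget.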

jmikola · 2024-04-29T18:48:44Z

source/crud/bulk-write.md

+Drivers MUST attempt to consume the contents of the cursor returned in the server's `bulkWrite`
+response before returning to the user. This is required regardless of whether the user requested
+verbose or summary results, as the results cursor always contains any write errors that occurred.
+If the cursor contains a nonzero cursor ID, drivers MUST perform `getMore`s until the cursor has


Alternatively: "MUST execute getMore until" if the "getMores" formatting appears strange.
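
A minimal sketch of the cursor-draining requirement quoted above, assuming the usual cursor reply shape (`id`, `firstBatch`, `nextBatch`). The `get_more` callable is a hypothetical stand-in for sending a `getMore` command, and the result documents are illustrative.

```python
def drain_results_cursor(first_response, get_more):
    """Collect every result document, issuing getMore until the cursor ID is 0."""
    cursor = first_response["cursor"]
    results = list(cursor["firstBatch"])
    cursor_id = cursor["id"]
    while cursor_id != 0:               # a nonzero ID means more results remain
        reply = get_more(cursor_id)
        results.extend(reply["cursor"]["nextBatch"])
        cursor_id = reply["cursor"]["id"]
    return results


# Simulated server replies: two getMore round trips, the last one closing the cursor.
remaining = [
    {"cursor": {"id": 42, "nextBatch": [{"ok": 1, "idx": 1}]}},
    {"cursor": {"id": 0, "nextBatch": [{"ok": 0, "idx": 2}]}},
]
first = {"cursor": {"id": 42, "firstBatch": [{"ok": 1, "idx": 0}]}}
all_results = drain_results_cursor(first, lambda cursor_id: remaining.pop(0))
```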

jmikola · 2024-04-29T19:52:26Z

source/crud/bulk-write.md

+
+Unlike the other result types, `InsertOneResult` contains an `insertedId` field that is generated
+driver-side, either by recording the `_id` field present in the user's insert document or creating
+and adding one. Drivers MUST only record these `insertedId`s in a `BulkWriteResult` when a


Suggested change

-and adding one. Drivers MUST only record these `insertedId`s in a `BulkWriteResult` when a
+and adding one. Drivers MUST only record these `insertedId` fields in a `BulkWriteResult` when a

Related to my earlier suggestion about "getMores". Feel free to disregard.
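
For context, the driver-side `_id` handling described in the hunk could look roughly like the sketch below. The helper name is hypothetical, and `os.urandom(12).hex()` is only a stand-in for a real ObjectId. The sketch tracks candidate IDs by operation index; the condition under which they are reported in a `BulkWriteResult` is cut off in the quoted hunk and is not modeled here.

```python
import os


def prepare_insert(document, index, candidate_ids):
    """Ensure the insert document has an _id and remember it by operation index."""
    doc = dict(document)                      # avoid mutating the caller's document
    if "_id" not in doc:
        doc["_id"] = os.urandom(12).hex()     # stand-in for a generated ObjectId
    candidate_ids[index] = doc["_id"]
    return doc


candidate_ids = {}
with_id = prepare_insert({"_id": 7, "x": 1}, 0, candidate_ids)
without_id = prepare_insert({"x": 2}, 1, candidate_ids)
```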

jmikola · 2024-04-29T19:58:09Z

source/crud/bulk-write.md

+
+The Command Batching [Total Message Size](#total-message-size) section uses a 1000 byte overhead
+allowance to approximate the number of non-`bulkWrite`-specific bytes contained in an `OP_MSG` sent
+for a `bulkWrite` batch. This number was determined by constructing `OP_MSG`s with various fields


Suggested change

-for a `bulkWrite` batch. This number was determined by constructing `OP_MSG`s with various fields
+for a `bulkWrite` batch. This number was determined by constructing `OP_MSG` messages with various fields

Feel free to disregard.
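
To show how a 1000 byte overhead allowance might enter a batching calculation, here is a hedged sketch. The formula is an assumption made for illustration (message size limit minus the allowance minus the command document size) and the function name is hypothetical; the spec's Total Message Size section is the authority on the real calculation.

```python
OVERHEAD_ALLOWANCE_BYTES = 1000  # allowance quoted in the hunk above


def document_sequence_budget(max_message_size_bytes, command_document_bytes):
    """Bytes left for the ops and nsInfo document sequences (illustrative formula)."""
    budget = max_message_size_bytes - OVERHEAD_ALLOWANCE_BYTES - command_document_bytes
    if budget <= 0:
        raise ValueError("command document leaves no room for write operations")
    return budget


# Hypothetical inputs: a 48,000,000 byte message limit and a 200 byte command document.
budget = document_sequence_budget(48_000_000, 200)
```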

source/versioned-api/tests/crud-api-version-1.yml

source/etc/generate-handshakeError-tests.py

source/transactions/tests/unified/mongos-pin-auto-tests.py

kevinAlbs

LGTM. Great work!

jyemin

LGTM!

isabelatkinson added 20 commits February 29, 2024 09:45

initial work

fb27ce1

retryability

0ede8ae

changelog

afb1ab6

transactions

e2cb197

unified test runner changes

a24c7bd

add schema

5a98994

add batching prose tests

98fbfda

restructure spec

7f59210

various cleanup

dbb2083

schema version updates

bc888f8

move run on requirements to top level

b425c23

always use verbose results

d7294cd

clean up yaml anchors

82799a7

clean up command started events

905e2ee

move tests into crud directory

981e6c3

fix typo

03c3bde

update some language

01c9ae7

self review changes

fab1deb

language updates

7e1e897

minor updates

d907502

isabelatkinson commented Mar 4, 2024

source/crud/bulk-write.md (outdated)

source/crud/bulk-write.md (outdated)

isabelatkinson commented Mar 4, 2024

source/crud/bulk-write.md (outdated)

isabelatkinson marked this pull request as ready for review March 4, 2024 21:04

isabelatkinson requested review from a team as code owners March 4, 2024 21:04

isabelatkinson requested review from alcaeus and removed request for a team March 4, 2024 21:04

isabelatkinson requested a review from jmikola April 26, 2024 17:49

language

2c9702d

kevinAlbs reviewed Apr 29, 2024

jmikola approved these changes Apr 29, 2024

isabelatkinson added 3 commits April 30, 2024 12:39

improve CSOT test

3fdd543

jeremy language suggestions

7ef7c8c

add oid to calculation info

e133ed3

kevinAlbs mentioned this pull request May 1, 2024

CDRIVER-4363 add client bulk write mongodb/mongo-c-driver#1590

Merged

kevinAlbs reviewed May 1, 2024

source/versioned-api/tests/crud-api-version-1.yml

kevinAlbs reviewed May 1, 2024

source/etc/generate-handshakeError-tests.py

kevinAlbs reviewed May 1, 2024

source/transactions/tests/unified/mongos-pin-auto-tests.py

isabelatkinson added 5 commits May 2, 2024 09:34

update batching tests

b9f39a6

remove auto-encryption support

fb659af

server versions

c12cf6c

Merge branch 'bulk-write' of github.com:isabelatkinson/specifications…

35b2250

… into bulk-write

update csot test formatting

a3d1652

isabelatkinson requested a review from kevinAlbs May 2, 2024 16:10

kevinAlbs approved these changes May 2, 2024

isabelatkinson added 8 commits May 8, 2024 09:44

1.20 -> 1.21

d23399c

save files

91bec49

Merge branch 'master' into bulk-write

ebe592e

fix lint errors

cd64072

fix schema versions

ae0d729

json

05852ad

bump schema version in makefile

ad57a1e

merge 1.20 and 1.21

5a6532f

jyemin approved these changes May 9, 2024

isabelatkinson merged commit 10919c9 into mongodb:master May 9, 2024
3 checks passed

kevinAlbs mentioned this pull request Aug 21, 2024

DRIVERS-716 skip bulkWrite tests on Atlas Serverless #1636

Merged

3 tasks

stIncMale mentioned this pull request Aug 28, 2024

Implement Java sync improved bulk write API and unified spec tests mongodb/mongo-java-driver#1486

Open
