DRIVERS-2862: Benchmark Collection and Client BulkWrite #1733

Status: Open (1 commit to merge into base branch master)

@adelinowona adelinowona commented Nov 19, 2024

This PR adds benchmarks for Collection::BulkWrite and Client::BulkWrite, covering both insert-only and mixed operations. The benchmarking spec already has Small doc Bulk Insert and Large doc Bulk Insert benchmarks, which may look sufficient for benchmarking Collection::BulkWrite with insert-only operations. However, those benchmarks are implemented using insertMany, and not all drivers use Collection::BulkWrite in their implementation of insertMany. With that in mind, I have added explicit Collection::BulkWrite benchmarks to be implemented by all drivers that implement Collection::BulkWrite. This ensures we maintain comprehensive coverage of batch-write performance.

Here are results of the new benchmarks as implemented on the C# driver:

| Name | MB/sec |
| ----------------------------------- | ------------- |
| SmallDocClientBulkWriteMixedOps | 1.8424719941 |
| SmallDocCollectionBulkWriteMixedOps | 0.7477088677 |
| LargeDocClientBulkWriteInsert | 94.242845107 |
| LargeDocCollectionBulkWriteInsert | 96.493395532 |
| LargeDocBulkInsert | 96.5527236825 |
| SmallDocClientBulkWriteInsert | 36.6208512962 |
| SmallDocCollectionBulkWriteInsert | 41.0499232896 |
| SmallDocBulkInsert | 42.1881191706 |
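
To make the Client versus Collection comparison easier to read, the ratios can be computed directly from the figures above. This is only a sketch over the numbers as posted for the C# driver; the `results` dictionary and the `ratio` helper are illustrative names, not part of the spec or any driver.

```python
# Throughput figures (MB/sec) copied from the C# results above.
results = {
    "SmallDocClientBulkWriteMixedOps": 1.8424719941,
    "SmallDocCollectionBulkWriteMixedOps": 0.7477088677,
    "LargeDocClientBulkWriteInsert": 94.242845107,
    "LargeDocCollectionBulkWriteInsert": 96.493395532,
    "SmallDocClientBulkWriteInsert": 36.6208512962,
    "SmallDocCollectionBulkWriteInsert": 41.0499232896,
}


def ratio(client_name: str, collection_name: str) -> float:
    """Return Client::BulkWrite throughput relative to Collection::BulkWrite."""
    return results[client_name] / results[collection_name]


# Each pair compares the Client benchmark with its Collection counterpart.
pairs = [
    ("SmallDocClientBulkWriteMixedOps", "SmallDocCollectionBulkWriteMixedOps"),
    ("SmallDocClientBulkWriteInsert", "SmallDocCollectionBulkWriteInsert"),
    ("LargeDocClientBulkWriteInsert", "LargeDocCollectionBulkWriteInsert"),
]

for client_name, collection_name in pairs:
    print(f"{client_name} / {collection_name} = {ratio(client_name, collection_name):.2f}")
```

On these numbers, Client::BulkWrite is roughly 2.5 times faster for small-document mixed operations and slightly slower for insert-only workloads. A single run on one driver is not enough to generalize from.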


Please complete the following before merging:

  • Update changelog.
  • Test changes in at least one language driver.
  • Test these changes against all server versions and topologies (including standalone, replica set, sharded
    clusters, and serverless).

@adelinowona adelinowona requested a review from a team as a code owner November 19, 2024 16:49
@adelinowona adelinowona requested review from JamesKovacs, BorisDog, dariakp and ShaneHarvey and removed request for a team November 19, 2024 16:49

@BorisDog BorisDog left a comment

partial review


| Phase | Description |
| ----------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Setup | Construct a MongoClient object. Drop the `perftest` database. Load the SMALL_DOC dataset into memory as a language-appropriate document type (or JSON string for C). Make 10,000 copies of the document. DO NOT manually add an `_id` field; leave it to the driver or database. Construct a list of write models with insert, replace and delete operations for each copy of the document. |

The instruction "Make X copies, DO NOT add _id" is placed inconsistently: it is part of the Setup step for bulk insert but part of the Do task step in other places. Should it be part of the same step everywhere?


| Phase | Description |
| ----------- |-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Setup | Construct a MongoClient object. Drop the `perftest` database. Load the SMALL_DOC dataset into memory as a language-appropriate document type (or JSON string for C). Make 10,000 copies of the document. DO NOT manually add an `_id` field; leave it to the driver or database. Construct a list of write models with insert, replace and delete operations for each copy of the document. |

I think we need more details here: ordering and explicit count for each operation kind.
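
To illustrate why ordering and counts matter, here is one possible reading of the Setup phase, sketched with the standard library only. Everything the spec text leaves open is an assumption here: one insert, one replace, and one delete per copy (30,000 models in total), interleaved in that order, with empty filters for replace and delete. The tuple representation, the `small_doc` stand-in, and `build_mixed_write_models` are hypothetical names for illustration.

```python
import copy

NUM_COPIES = 10_000  # copy count taken from the Setup phase text

# Stand-in for the SMALL_DOC dataset; a real benchmark loads it from the spec's data files.
small_doc = {"name": "example", "value": 1}


def build_mixed_write_models(doc: dict, copies: int = NUM_COPIES) -> list:
    """Build an interleaved list of insert, replace, and delete models.

    ASSUMPTIONS (not stated in the spec text): one operation of each kind
    per document copy, in insert -> replace -> delete order, and an empty
    filter for replace and delete.
    """
    models = []
    for _ in range(copies):
        doc_copy = copy.deepcopy(doc)  # no _id is added; that is left to the driver or database
        models.append(("insert", doc_copy))
        models.append(("replace", {}, doc_copy))  # (kind, filter, replacement)
        models.append(("delete", {}))  # (kind, filter)
    return models


models = build_mixed_write_models(small_doc)
print(len(models))
```

With a real driver, each tuple would become that driver's write model type (for example `InsertOne`, `ReplaceOne`, and `DeleteOne` in PyMongo). A grouped ordering, such as all inserts first, would produce a different workload, so the spec needs to pin this down.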
