mongodb · comandeo · Aug 25, 2023 · Aug 8, 2023 · Aug 11, 2023 · Aug 16, 2023
@@ -268,10 +268,11 @@ selecting a server for a retry attempt.
 3a. Selecting the server for retry
 ''''''''''''''''''''''''''''''''''
 
-If the driver cannot select a server for a retry attempt or the newly selected
-server does not support retryable reads, retrying is not possible and drivers
-MUST raise the previous retryable error. In both cases, the caller is able to
-infer that an attempt was made.
+The server on which the operation failed MUST be provided to the server selection
+mechanism as a deprioritized server. If the driver cannot select a server for
+a retry attempt or the newly selected server does not support retryable reads,
+retrying is not possible and drivers MUST raise the previous retryable error.
+In both cases, the caller is able to infer that an attempt was made.
 
 3b. Sending an equivalent command for a retry attempt
 '''''''''''''''''''''''''''''''''''''''''''''''''''''''
@@ -357,9 +358,17 @@ and reflects the flow described above.
  */
  function executeRetryableRead(command, session) {
  Exception previousError = null;
+ Server previousServer = null;
  while true {
  try {
- server = selectServer();
+ if (previousServer == null) {
+ server = selectServer();
+ } else {
+ // If a previous attempt was made, deprioritize the previous server
+ // where the command failed.
+ deprioritizedServers = [ previousServer ];
+ server = selectServer(deprioritizedServers);
+ }
  } catch (ServerSelectionException exception) {
  if (previousError == null) {
  // If this is the first attempt, propagate the exception.
@@ -416,9 +425,11 @@ and reflects the flow described above.
  } catch (NetworkException networkError) {
  updateTopologyDescriptionForNetworkError(server, networkError);
  previousError = networkError;
+ previousServer = server;
  } catch (NotWritablePrimaryException notPrimaryError) {
  updateTopologyDescriptionForNotWritablePrimaryError(server, notPrimaryError);
  previousError = notPrimaryError;
+ previousServer = server;
  } catch (DriverException error) {
  if ( previousError != null ) {
  throw previousError;
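
The deprioritization flow in the pseudocode above can be sketched as a small, runnable simulation. This is only an illustration under stated assumptions: `FakeMongos`, `RetryableError`, `select_server`, and `execute_retryable_read` are hypothetical names, not a driver API. The sketch allows a single retry (the non-CSOT case) and raises the last retryable error once attempts run out.

```python
class RetryableError(Exception):
    """Hypothetical stand-in for a network or NotWritablePrimary error."""


class FakeMongos:
    """Hypothetical server that fails its first `failures` commands."""

    def __init__(self, address, failures=0):
        self.address = address
        self.failures = failures

    def run(self, command):
        if self.failures > 0:
            self.failures -= 1
            raise RetryableError(f"{self.address} failed {command}")
        return f"{command} ok on {self.address}"


def select_server(servers, deprioritized=()):
    """Prefer servers that are not deprioritized; fall back to all of them."""
    preferred = [s for s in servers if s not in deprioritized]
    candidates = preferred or list(servers)
    if not candidates:
        raise LookupError("no suitable server")
    # Assumption: take the first candidate; real drivers choose differently.
    return candidates[0]


def execute_retryable_read(servers, command, attempts=None):
    """Run `command`, retrying once and deprioritizing the failed server."""
    attempts = [] if attempts is None else attempts
    previous_error = None
    previous_server = None
    for _ in range(2):  # first attempt plus one retry
        deprioritized = [] if previous_server is None else [previous_server]
        try:
            server = select_server(servers, deprioritized)
        except LookupError:
            if previous_error is None:
                raise  # first attempt: propagate the selection failure
            raise previous_error
        attempts.append(server.address)
        try:
            return server.run(command)
        except RetryableError as error:
            previous_error = error
            previous_server = server
    raise previous_error
```

With two servers where the first fails once, the retry lands on the second server. With a single server that fails once, the retry goes back to that same server because no alternative exists.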

@@ -232,6 +232,56 @@ This test requires MongoDB 4.2.9+ for ``blockConnection`` support in the failpoi
 
 9. Disable the failpoint.
 
+Retrying Reads in a Sharded Cluster
+===================================
+
+These tests will be used to ensure drivers properly retry reads on a different
+mongos.
+
+Retryable Reads Are Retried on a Different Mongos if One Is Available
+---------------------------------------------------------------------
+
+This test MUST be executed against a sharded cluster that has at least two
+mongos instances.
+
+1. Ensure that a test is run against a sharded cluster that has at least two
+ mongoses. If there are more than two mongoses in the cluster, pick two to
+ test against.
+
+2. Create a client per mongos using the direct connection, and configure fail
+ points on each of the picked mongoses, so that each mongos raises
+ a retryable error once.
+
+3. Create a client with ``retryReads=true`` that connects to the cluster,
+ providing the two selected mongoses as seeds.
+
+4. Enable command monitoring, and execute a read command that is
+ supposed to fail on both mongoses.
+
+5. Assert that there were failed command events from each mongos.
+
+6. Disable the fail points.
+
+
+Retryable Reads Are Retried on the Same Mongos if No Other Is Available
+-----------------------------------------------------------------------
+
+1. Ensure that a test is run against a sharded cluster. If there are multiple
+ mongoses in the cluster, pick one to test against.
+
+2. Create a client that connects to the mongos using the direct connection,
+ and configure a fail point so that the mongos raises a retryable error once.
+
+3. Create a client with ``retryReads=true`` that connects to the cluster,
+ providing the selected mongos as the seed.
+
+4. Enable command monitoring, and execute a read command that is
+ supposed to fail.
+
+5. Assert that there was a failed command event and a successful command event.
+
+6. Disable the fail point.
+
 
 Changelog
 =========

@@ -395,11 +395,12 @@ of the following conditions is reached:
  <../client-side-operations-timeout/client-side-operations-timeout.rst#retryability>`__.
 - CSOT is not enabled and one retry was attempted.
 
-For each retry attempt, drivers MUST select a writable server. If the driver
-cannot select a server for a retry attempt or the selected server does not
-support retryable writes, retrying is not possible and drivers MUST raise the
-retryable error from the previous attempt. In both cases, the caller is able
-to infer that an attempt was made.
+For each retry attempt, drivers MUST select a writable server. The server on
+which the operation failed MUST be provided to the server selection mechanism
+as a deprioritized server. If the driver cannot select a server for a retry
+attempt or the selected server does not support retryable writes, retrying is
+not possible and drivers MUST raise the retryable error from the previous
+attempt. In both cases, the caller is able to infer that an attempt was made.
 
 If a retry attempt also fails, drivers MUST update their topology according to
 the SDAM spec (see: `Error Handling`_). If an error would not allow the caller
@@ -492,11 +493,15 @@ The above rules are implemented in the following pseudo-code:
  }
  }
 
- /* If we cannot select a writable server, do not proceed with retrying and
+ /*
+ * We try to select a server that is not the one that failed by passing the
+ * failed server as a deprioritized server.
+ * If we cannot select a writable server, do not proceed with retrying and
  * throw the previous error. The caller can then infer that an attempt was
  * made and failed. */
  try {
- server = selectServer("writable");
+ deprioritizedServers = [ server ];
+ server = selectServer("writable", deprioritizedServers);
  } catch (Exception ignoredError) {
  throw previousError;
  }

@@ -456,6 +456,50 @@ and sharded clusters.
  mode: "off",
  })
 
+#. Test that in a sharded cluster writes are retried on a different mongos if
+ one is available
+
+ This test MUST be executed against a sharded cluster that has at least two
+ mongos instances.
+
+ 1. Ensure that a test is run against a sharded cluster that has at least two
+ mongoses. If there are more than two mongoses in the cluster, pick two to
+ test against.
+
+ 2. Create a client per mongos using the direct connection, and configure fail
+ points on each of the picked mongoses, so that each mongos raises
+ a retryable error once.
+
+ 3. Create a client with ``retryWrites=true`` that connects to the cluster,
+ providing the two selected mongoses as seeds.
+
+ 4. Enable command monitoring, and execute a write command that is
+ supposed to fail on both mongoses.
+
+ 5. Assert that there were failed command events from each mongos.
+
+ 6. Disable the fail points.
+
+#. Test that in a sharded cluster writes are retried on the same mongos if no
+ other is available
+
+ This test MUST be executed against a sharded cluster.
+
+ 1. Ensure that a test is run against a sharded cluster. If there are multiple
+ mongoses in the cluster, pick one to test against.
+
+ 2. Create a client that connects to the mongos using the direct connection,
+ and configure a fail point so that the mongos raises a retryable error once.
+
+ 3. Create a client with ``retryWrites=true`` that connects to the cluster,
+ providing the selected mongos as the seed.
+
+ 4. Enable command monitoring, and execute a write command that is
+ supposed to fail.
+
+ 5. Assert that there was a failed command event and a successful command event.
+
+ 6. Disable the fail point.
+
 Changelog
 =========
 

@@ -843,7 +843,9 @@ For multi-threaded clients, the server selection algorithm is as follows:
 2. If the topology wire version is invalid, raise an error and log a
  `"Server selection failed" message`_.
 
-3. Find suitable servers by topology type and operation type
+3. Find suitable servers by topology type and operation type. In the case of
+ sharded clusters, a list of deprioritized servers may be provided;
+ these servers should be selected only if there are no other suitable servers.
 
 4. Filter the suitable servers by calling the optional, application-provided server
  selector.
@@ -915,7 +917,9 @@ as follows:
 5. If the topology wire version is invalid, raise an error and log a
  `"Server selection failed" message`_.
 
-6. Find suitable servers by topology type and operation type
+6. Find suitable servers by topology type and operation type. In the case of
+ sharded clusters, a list of deprioritized servers may be provided;
+ these servers should be selected only if there are no other suitable servers.
 
 7. Filter the suitable servers by calling the optional, application-provided
  server selector.
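
The "find suitable servers" step with deprioritized servers can be illustrated with a short sketch. It rests on several assumptions: `ServerDescription`, `filter_deprioritized`, and `select_server` are hypothetical names; the topology is taken to be sharded already; and the remaining selection steps (application-provided selector, latency window, and so on) are omitted.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerDescription:
    """Hypothetical stand-in for a driver's server description."""

    address: str


def filter_deprioritized(suitable, deprioritized=()):
    """Drop deprioritized servers unless that would leave nothing to select."""
    excluded = {server.address for server in deprioritized}
    preferred = [s for s in suitable if s.address not in excluded]
    # Fall back to the full list so that a lone mongos can still be retried.
    return preferred if preferred else list(suitable)


def select_server(suitable, deprioritized=(), rng=random):
    """Pick one server at random from the filtered candidates."""
    candidates = filter_deprioritized(suitable, deprioritized)
    if not candidates:
        raise LookupError("no suitable servers")
    # Assumption: uniform random choice stands in for the later steps.
    return rng.choice(candidates)
```

The fallback to the unfiltered list is what makes deprioritized servers eligible only when no other suitable server remains, rather than excluding them outright.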