Skip to content

Commit

Permalink
Use shard-id of the master if the replica does not support shard-id (…
Browse files Browse the repository at this point in the history
…#12805)

If there are nodes in the cluster that do not support shard-id, they
will gossip shard-id. From the perspective of nodes that support shard-id,
their shard-id is meaningless (since shard-id is randomly generated when
we create a node.)

Nodes that support shard-id will save the shard-id information in nodes.conf.
If the node is restarted according to nodes.conf, the server will report a
corrupted cluster config file error. Because auxShardIdSetter will reject
configurations with inconsistent master-replica shard-ids.

A cluster-wide consensus for the node's shard_id is not necessary. The key
is maintaining consistency of the shard_id on each individual 7.2 node.
As the cluster progressively upgrades to version 7.2, we can expect the
shard_ids across all nodes to naturally converge and align.

In this PR, when processing the gossip, if sender is a replica and does not
support shard-id, set the shard_id to the shard_id of its master.
  • Loading branch information
enjoy-binbin authored Jan 7, 2024
1 parent 38f0234 commit 4cae66f
Show file tree
Hide file tree
Showing 2 changed files with 19 additions and 1 deletion.
1 change: 1 addition & 0 deletions src/cluster.h
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,7 @@ int clusterNodeIsMaster(clusterNode *n);
char *clusterNodeIp(clusterNode *node);
int clusterNodeIsSlave(clusterNode *node);
clusterNode *clusterNodeGetSlaveof(clusterNode *node);
clusterNode *clusterNodeGetMaster(clusterNode *node);
char *clusterNodeGetName(clusterNode *node);
int clusterNodeTimedOut(clusterNode *node);
int clusterNodeIsFailing(clusterNode *node);
Expand Down
19 changes: 18 additions & 1 deletion src/cluster_legacy.c
Original file line number Diff line number Diff line change
Expand Up @@ -2597,11 +2597,23 @@ void clusterProcessPingExtensions(clusterMsg *hdr, clusterLink *link) {
/* We know this will be valid since we validated it ahead of time */
ext = getNextPingExt(ext);
}

/* If the node did not send us a hostname extension, assume
* they don't have an announced hostname. Otherwise, we'll
* set it now. */
updateAnnouncedHostname(sender, ext_hostname);
updateAnnouncedHumanNodename(sender, ext_humannodename);
/* If the node did not send us a shard-id extension, it means the sender
* does not support it (old version), node->shard_id is randomly generated.
* A cluster-wide consensus for the node's shard_id is not necessary.
* The key is maintaining consistency of the shard_id on each individual 7.2 node.
* As the cluster progressively upgrades to version 7.2, we can expect the shard_ids
* across all nodes to naturally converge and align.
*
* If sender is a replica, set the shard_id to the shard_id of its master.
* Otherwise, we'll set it now. */
if (ext_shardid == NULL) ext_shardid = clusterNodeGetMaster(sender)->shard_id;

updateShardId(sender, ext_shardid);
}

Expand Down Expand Up @@ -5544,7 +5556,7 @@ void addShardReplyForClusterShards(client *c, list *nodes) {
addReplyBulkCString(c, "slots");

/* Use slot_info_pairs from the primary only */
while (n->slaveof != NULL) n = n->slaveof;
n = clusterNodeGetMaster(n);

if (n->slot_info_pairs != NULL) {
serverAssert((n->slot_info_pairs_count % 2) == 0);
Expand Down Expand Up @@ -5805,6 +5817,11 @@ clusterNode *clusterNodeGetSlaveof(clusterNode *node) {
return node->slaveof;
}

clusterNode *clusterNodeGetMaster(clusterNode *node) {
while (node->slaveof != NULL) node = node->slaveof;
return node;
}

char *clusterNodeGetName(clusterNode *node) {
return node->name;
}
Expand Down

0 comments on commit 4cae66f

Please sign in to comment.