-
Notifications
You must be signed in to change notification settings - Fork 490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11243. SCM SafeModeRule Support EC. #7008
base: master
Are you sure you want to change the base?
Conversation
@@ -199,6 +201,11 @@ public void onMessage(final ContainerReportFromDatanode reportFromDatanode, | |||
// list | |||
processMissingReplicas(datanodeDetails, expectedContainersInDatanode); | |||
containerManager.notifyContainerReportProcessing(true, true); | |||
if (reportFromDatanode.isRegister()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the CONTAINER_REPORT
is completed, we send the message to CONTAINER_REGISTRATION_REPORT
to ensure that the container count is accurate.
@errose28 @siddhantsangwan Can you help review this pr? Thank you very much! |
@slfan1989 Thanks for taking this up, I was earlier thinking of fixing this myself. I'll review the PR soon. |
@siddhantsangwan Can you help review this pr? Thank you very much! The unit test errors are not caused by our changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slfan1989 I've reviewed this partly. Have some comments below.
if (replicationConfig != null && replicationConfig instanceof ECReplicationConfig) { | ||
ECReplicationConfig ecReplicationConfig = (ECReplicationConfig) replicationConfig; | ||
int data = ecReplicationConfig.getData(); | ||
if (uuids != null && uuids.size() > data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Ratis, just one replica per container is required. So for EC, data number of Datanodes should be sufficient. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. For EC, the amount of data we have is already sufficient. I will improve the code.
if (ratisContainerMap.containsKey(containerID)) { | ||
ratisContainerDNsMap.computeIfAbsent(containerID, key -> Sets.newHashSet()); | ||
ratisContainerDNsMap.get(containerID).add(datanodeUUID); | ||
if (!reportedConatinerIDSet.contains(containerID)) { | ||
Set<UUID> uuids = ratisContainerDNsMap.get(containerID); | ||
if (uuids != null && uuids.size() >= 1) { | ||
ratisContainerWithMinReplicas.getAndAdd(1); | ||
reportedConatinerIDSet.add(containerID); | ||
getSafeModeMetrics() | ||
.incCurrentContainersWithOneReplicaReportedCount(); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't really understand this change. It seems correct, but is there any reason this logic isn't the same as before? Why do we need to track Datanodes in a set for Ratis containers? Is it because ratisContainerDNsMap
and reportedConatinerIDSet
are going to be used somewhere else as well? Or is it done this way just so it's similar to the EC logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the question!
I didn't really understand this change. It seems correct, but is there any reason this logic isn't the same as before?
The previous logic was correct. I made this modification for two reasons:
- To align with the EC's logic and improve code readability.
- To facilitate the retrieval of additional data in future pr. For example, this will allow users not only to understand the progress but also to identify which
containers
have not reported and whichDataNodes
are included in the reported containers.
Why do we need to track Datanodes in a set for Ratis containers? Is it because ratisContainerDNsMap and reportedConatinerIDSet are going to be used somewhere else as well?
The type of ratisContainerDNsMap
is Map<Long, Set<UUID>>
, where the key is the ContainerId
. The reason for using a Set
as the value is to avoid retaining duplicate DN information, as we may encounter the same DN registering multiple times.
Or is it done this way just so it's similar to the EC logic?
Here's one reason; it has already been explained in the previous comment.
Can we modify it this way? The original code contains some insufficient information.
...hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/safemode/ContainerSafeModeRule.java
Outdated
Show resolved
Hide resolved
long ratisCutOff = (long) Math.ceil(ratisMaxContainer * safeModeCutoff); | ||
long ecCutOff = (long) Math.ceil(ecMaxContainer * safeModeCutoff); | ||
|
||
getSafeModeMetrics().setNumContainerWithOneReplicaReportedThreshold(ratisCutOff); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's set EC metrics as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will improve this part of the code.
private void reInitializeRule() { | ||
containerMap.clear(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like most of the code inside this method is the same as before. If possible, let's refactor this to avoid repetition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will improve this part of the code.
@@ -75,10 +79,18 @@ public void setNumContainerWithOneReplicaReportedThreshold(long val) { | |||
this.numContainerWithOneReplicaReportedThreshold.set(val); | |||
} | |||
|
|||
public void setNumContainerWithECDataReplicaReportedThreshold(long val) { | |||
this.numContainerWithECDataReplicaReportedThreshold.incr(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use set() instead of incr().
Thank you very much for reviewing this PR! I will respond to your questions as soon as possible. |
…m/safemode/ContainerSafeModeRule.java Co-authored-by: Siddhant Sangwan <siddhantsangwan027@gmail.com>
What changes were proposed in this pull request?
We aim for SCM to immediately switch to leader once it exits safe mode. Currently, due to certain issues, we need to wait for at least one full container report from a DataNode before proceeding with the switch.
Currently, SCM SafeMode has the following issues:
DataNodeSafeModeRule
cannot effectively verify the registration status of DataNodes. In most cases, as long as there are more than one DataNode, this rule passes. Therefore, we need to strengthen this rule.ContainerSafeModeRule
does not support verification of EC (Erasure Coding) Containers.EC
Containers differ significantly fromRATIS/THREE
Containers because EC Containers require determining how many replicas are needed based on the EC type. For instance, forEC-6-3-1024K
, we need to ensure that the Container reports having all 6 replicas before it can provide services.This PR aims to enhance and improve the above two points.
For code Improve:
For the registration of Datanodes, we need to obtain the complete list of Datanodes from SCM. This list can be retrieved from the Pipeline. I pass PipelineManager as a parameter into DataNodeSafeModeRule to calculate the number of Datanodes.
Enhance replica validation for EC containers. Obtain the required replicas based on ECReplicationConfig. Consider container reporting complete only when sufficient replicas have been reported.
Modify the message sending location of ContainerSafeModeRule.
ozone/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMDatanodeProtocolServer.java
Lines 241 to 256 in 311245b
There are some issues in this part of the code. The handling of
NODE_REGISTRATION_CONT_REPORT
andCONTAINER_REPORT
is asynchronous. There is a scenario whereNODE_REGISTRATION_CONT_REPORT
processing completes, butCONTAINER_REPORT
processing does not. This still leads to insufficient EC replicas issue.I adjusted the sending position of NODE_REGISTRATION_CONT_REPORT (requiring the message to be sent only after CONTAINER_REPORT processing completes) and introduced a new type, CONTAINER_REGISTRATION_REPORT, to distinguish it.
Page display:
What is the link to the Apache JIRA
JIRA: HDDS-11243: SCM SafeModeRule Support EC.
How was this patch tested?
Junit Test
&Production environment validation