Improve data locality by considering Kubernetes topology #595
Comments
Summary of topology provider logic. The topology is in the form:
These are used to check that a topology has been created for a specific DataNode role group. Internally, the steps are as follows (ignoring caching logic):
We can use the information here 🟢 for this ticket.
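For reference, a minimal sketch of how such a topology provider can plug into HDFS rack awareness via the DNSToSwitchMapping interface. The lookupNodeLabel helper and the "/node-label" path format are illustrative assumptions, not the actual hdfs-topology-provider implementation, which is more involved (configurable labels, caching, role-group checks):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.net.DNSToSwitchMapping;

// Illustrative sketch only: maps DataNode addresses to a topology path derived
// from the Kubernetes node they run on.
public class KubernetesTopologyMapping implements DNSToSwitchMapping {

    // Hypothetical cache of DataNode address -> Kubernetes node label.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    @Override
    public List<String> resolve(List<String> names) {
        List<String> paths = new ArrayList<>(names.size());
        for (String name : names) {
            // lookupNodeLabel is a stand-in for the actual Kubernetes API lookup
            // (pod IP -> pod -> spec.nodeName or a configured node label).
            String label = cache.computeIfAbsent(name, this::lookupNodeLabel);
            paths.add(label != null ? "/" + label : "/default-rack");
        }
        return paths;
    }

    @Override
    public void reloadCachedMappings() {
        cache.clear();
    }

    @Override
    public void reloadCachedMappings(List<String> names) {
        names.forEach(cache::remove);
    }

    // Placeholder: would query the Kubernetes API for the node hosting this address.
    private String lookupNodeLabel(String datanodeAddress) {
        return null;
    }
}
```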
I'm going to leave a comment here which may be helpful, or you may wish to remove it. We have a CDH5 cluster that runs a large daily Spark batch job to read Parquet and generate HFiles for bulk loading into HBase. On our new ST installation, configured to run the job as equivalently as possible (600 cores, 36G executor memory, same code) but with newer libraries (e.g. Spark 2.3 -> 3) and hardware, we see the job run for 7.5 hrs. I'm still learning our setup, but Grafana shows peaks of 640 MiB/s on the DataNode transmit and receive graphs during this job. We might have a good environment for evaluating any fix (perhaps running patched code?) and we'd be happy to assist. We run 100s of Spark jobs like this daily.
That's great to know - thank you! We'll definitely bear this in mind when we pick up this issue.
Description
As users of the HDFS operator and a Stackable-deployed HDFS, we want HDFS to ensure data locality by talking to a DataNode on the same Kubernetes node as the client first, if one exists.
Value
HDFS tries to store the first copy of a block on a "local" machine before shipping data to remote machines over the network. This relies on a simple IP address comparison in the HDFS code, which breaks on Kubernetes because pods don't share the same IP even when they run on the same Kubernetes node.
I believe we can improve this situation by changing the HDFS code to consider the Kubernetes node while looking for a "local" machine.
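To make the mismatch concrete, here is a toy illustration (all IPs and node names are made up): an address comparison between a client pod and a DataNode pod on the same Kubernetes node reports "not local", while comparing the Kubernetes nodes they are scheduled on reports "local".

```java
// Toy illustration of the locality mismatch on Kubernetes (all values made up).
public class LocalityExample {
    public static void main(String[] args) {
        String clientPodIp = "10.244.1.17";    // client pod, scheduled on node "worker-1"
        String dataNodePodIp = "10.244.1.23";  // DataNode pod, scheduled on the same node "worker-1"

        // The existing HDFS check effectively compares addresses: not local.
        System.out.println("IP-based locality:   " + clientPodIp.equals(dataNodePodIp)); // false

        // A Kubernetes-aware check would compare the hosting nodes instead: local.
        String clientNode = "worker-1";
        String dataNodeNode = "worker-1";
        System.out.println("Node-based locality: " + clientNode.equals(dataNodeNode)); // true
    }
}
```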
We already have precedent with the hdfs-topology-provider, which does something similar. I believe we can plug this logic into the chooseLocalOrFavoredStorage method of BlockPlacementPolicyDefault.
We want this because it will probably benefit all workloads that use HDFS with locally attached storage together with things like Spark or HBase, where processing can happen on the same Kubernetes node as the storage. The benefit is less network traffic and a boost in performance.
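A minimal sketch of what such a check could look like, assuming a resolver from pod IP to Kubernetes node name is available. KubernetesNodeResolver and isLocalOnKubernetesNode are hypothetical names; the actual integration into chooseLocalOrFavoredStorage would have to follow the real method signatures of BlockPlacementPolicyDefault in the targeted Hadoop version:

```java
import java.util.Objects;

import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;

// Sketch of a Kubernetes-aware locality check that a custom
// BlockPlacementPolicy could consult instead of the plain IP comparison.
public class KubernetesLocalityChecker {

    // Hypothetical interface: resolves a pod IP to the Kubernetes node hosting it,
    // e.g. via the Kubernetes API (pod status.podIP -> spec.nodeName).
    public interface KubernetesNodeResolver {
        String resolveNodeName(String podIp);
    }

    private final KubernetesNodeResolver resolver;

    public KubernetesLocalityChecker(KubernetesNodeResolver resolver) {
        this.resolver = resolver;
    }

    /**
     * Returns true if the writer (client) and the DataNode are hosted on the
     * same Kubernetes node, even though their pod IPs differ.
     */
    public boolean isLocalOnKubernetesNode(String writerIp, DatanodeDescriptor datanode) {
        String writerNode = resolver.resolveNodeName(writerIp);
        String datanodeNode = resolver.resolveNodeName(datanode.getIpAddr());
        return writerNode != null && Objects.equals(writerNode, datanodeNode);
    }
}
```

If this ends up as a separate placement policy class, it would presumably be wired up on the NameNode the same way other placement policies are, i.e. via the dfs.block.replicator.classname setting that selects the BlockPlacementPolicy implementation.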
Dependencies
It probably makes sense to reuse code from the hdfs-topology-provider project.
Tasks
Acceptance Criteria
Release Notes
The HDFS NameNodes will now look at the Kubernetes topology when considering whether a client request is made locally or not. This means they will consider all clients "local" that are hosted on the same Kubernetes node as a DataNode.