-
Notifications
You must be signed in to change notification settings - Fork 61
Add fields to specify ray head/worker resources + additional config map #429
base: master
Are you sure you want to change the base?
Add fields to specify ray head/worker resources + additional config map #429
Conversation
Thank you for opening this pull request! 🙌 These tips will help get your PR across the finish line:
|
Signed-off-by: Guozhen La <guozhen.la@gmail.com>
5efa9fe
to
bae1d8b
Compare
Codecov Report
@@ Coverage Diff @@
## master #429 +/- ##
==========================================
+ Coverage 75.92% 78.48% +2.55%
==========================================
Files 18 18
Lines 1458 1250 -208
==========================================
- Hits 1107 981 -126
+ Misses 294 212 -82
Partials 57 57
Flags with carried forward coverage won't be shown. Click here to find out more. |
c33fb09
to
8d77c78
Compare
Signed-off-by: Guozhen La <guozhen.la@gmail.com>
8d77c78
to
5561ee9
Compare
Signed-off-by: Guozhen La <guozhen.la@gmail.com>
41b461a
to
8af47ad
Compare
Signed-off-by: Guozhen La <guozhen.la@gmail.com>
Signed-off-by: Guozhen La <guozhen.la@gmail.com>
Co-authored-by: Kevin Su <pingsutw@gmail.com> Signed-off-by: Guozhen La <68618788+guozhen-la@users.noreply.github.com>
@@ -19,13 +21,19 @@ message RayCluster { | |||
HeadGroupSpec head_group_spec = 1; | |||
// WorkerGroupSpecs are the specs for the worker pods | |||
repeated WorkerGroupSpec worker_group_spec = 2; | |||
// Namespace used to create the ray cluster | |||
string namespace = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make it more generic, could we add it to the task metadata here
// Namespace used to create the ray cluster | ||
string namespace = 3; | ||
// Kubernetes service account used by the ray cluster | ||
string k8s_service_account = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just remembered that we could override the service account by using pod template. That said, we can remove k8s_service_account
in the idl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pingsutw how does it look like in the flytekit UX? we added k8s_service_account
to the RayJobConfig
. Would we not need it anymore and instead expect the user to override it using pod template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may not be nice user experience if the user has to specify some attributes to override in RayJobConfig
and some through a pod template
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to make it general because others may want to override the service account for spark, Dask, TensorFlow tasks, etc. How about adding a namespace arg to the task decorator?
@task(rayJobConfig(...), service_account="")
def ray_task()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting just keeping the Ray-specific parameters in the RayJobConfig? Are we then going to create a parameter in the task decorator for each non-Ray configuration we want to override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Namespace is not a common config that people will override so we can keep it in the rayjob.
Are you suggesting just keeping the Ray-specific parameters in the RayJobConfig? Are we then going to create a parameter in the task decorator for each non-Ray configuration we want to override?
Yes, ideally. people may want to override the service account for the spark job. any other configs you want to override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pingsutw sorry we were away for a work event last week. So if I understand correctly, you're just suggesting moving k8s_sa
outside of rayjob
and allow users to configure it at the decorator level?
@task(rayJobConfig(...), service_account="")
def ray_task():
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the concern around generalizability (other tasks also configuring service_account
). Is this still not a better user-experience to configure everything in the task config object versus a decorator-level argument?
@task(rayJobConfig(namespace="", k8s="...",...))
def ray_task()
@task(sparkJobConfig(k8s="...",...))
def spark_task()
Adding these fields in support of the following features:
Type
Are all requirements met?
Complete description
How did you fix the bug, make the feature etc. Link to any design docs etc
Tracking Issue
Remove the 'fixes' keyword if there will be multiple PRs to fix the linked issue
fixes https://github.com/flyteorg/flyte/issues/
Follow-up issue
NA
OR
https://github.com/flyteorg/flyte/issues/