Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use env variable for the pytorch init-container image in case of usin… #2366

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

abhijeet-dhumal
Copy link

@abhijeet-dhumal abhijeet-dhumal commented Jan 6, 2025

What this PR does / why we need it:

  • In Disconnected environment, the images with tags/digests are supported but use of image digest is more preferred because of immutability, consistency and security.
  • The update suggested in this PR allows to pass a pytorch init container image using an environment variable which allows to use the image with digest to be used in case of disconnected environment.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #

  • Added provision to use environment variable for the pytorch init container image in case of using other image reference or same image with digest in case of disconnected environment.

Checklist:

  • Docs included if any changes are user facing

…g other image reference or same image but with digest in case of disconnected environment
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@abhijeet-dhumal
Copy link
Author

abhijeet-dhumal commented Jan 6, 2025

Hello @jinchihe @kuizhiqing May I request your review on this PR ? :

Actually my usecase is particularly related to air-gapped environment where the ImageContentSourcePolicy(ICSP) and ClusterImageSet(CIS) are adjusted such as the images required must be referred to digest rather than tag.
The disconnected cluster pulls images from a specific set of mirrors instead of their original registry location. When ICSP is applied, the cluster uses image digest instead of tag to ensure consistency and immutability of images across different mirrors.

In this case, it seems the PytorchJob init container image is hardcoded. The need of this update is to allow user to provide init container image to be used for the training-operator. Is there any other alternative way to provide this init container image using some parameter or flag mentioned here !

Also additionally I would need your guidance for testing/verifying this update :) !!

@abhijeet-dhumal abhijeet-dhumal marked this pull request as ready for review January 8, 2025 09:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant