Profile/pressure analysis best practices #34

Open
bhack opened this issue Jun 1, 2023 · 5 comments
Labels
documentation Improvements or additions to documentation

Comments

@bhack

bhack commented Jun 1, 2023

Could you add a clear list of best practices for testing the sidecar resource allocation? In the current documentation it seems to be a bit of "black magic".
On a real workload it is hard to tell whether the sidecar is under pressure and needs higher memory and CPU limits.

@bhack
Author

bhack commented Jun 4, 2023

See also #35

@songjiaxun
Collaborator

Thanks for the question. This is a great suggestion. Our team will discuss how to provide best practices guidance and update you here.

@songjiaxun added the documentation label on Jul 14, 2023
@bhack
Author

bhack commented Nov 24, 2023

@songjiaxun Do you think this could be solved by your comment at #61 (comment)?

@bhack
Author

bhack commented Jan 26, 2024

@songjiaxun As an alternative to unlimited resources, do you think we could occasionally log the CPU occupancy so that we can fine-tune the resource reservation?
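
To make the idea concrete, here is a minimal sketch of what such periodic logging could look like. It is not something the sidecar does today as far as this thread shows. It assumes the container runs under cgroup v2 and that its own counters are readable at `/sys/fs/cgroup/cpu.stat`; the function names, the interval, and the sample count are hypothetical.

```python
import time
from pathlib import Path

# Assumption: cgroup v2, with the container's own counters at this path.
CPU_STAT = Path("/sys/fs/cgroup/cpu.stat")


def parse_cpu_stat(text: str) -> dict:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_cores_used(before: dict, after: dict, elapsed_s: float) -> float:
    """Average number of CPU cores used between two samples."""
    used_usec = after["usage_usec"] - before["usage_usec"]
    return used_usec / (elapsed_s * 1_000_000)


def log_cpu_usage(interval_s: float = 60.0, samples: int = 5) -> None:
    """Hypothetical logging loop: print average core usage per interval."""
    prev = parse_cpu_stat(CPU_STAT.read_text())
    for _ in range(samples):
        time.sleep(interval_s)
        cur = parse_cpu_stat(CPU_STAT.read_text())
        print(f"cpu cores used: {cpu_cores_used(prev, cur, interval_s):.3f}")
        prev = cur
```

Comparing the logged value against the configured CPU limit over a representative run would give a starting point for the reservation, rather than guessing.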

@bhack
Author

bhack commented Apr 19, 2024

On Autopilot we need to limit the CPU resources assigned to the sidecar, so how can we be sure that the sidecar is not a bottleneck?
In deep learning jobs, the dataloaders/data workers need CPU resources assigned to the main container. These have to cooperate with the sidecar's resources to transfer files, preprocess data, and feed the GPUs.

We need a reliable way to tell whether the bottleneck is the sidecar or the resources assigned to the dataloaders/data workers.
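
One possible heuristic, sketched below, is to compare how often each container hits its CPU limit. It assumes cgroup v2, where each container's `cpu.stat` file exposes `nr_periods` and `nr_throttled` counters once a CPU limit is set. The function names and the 10% threshold are hypothetical, and throttling only reflects CPU-limit pressure, not memory or I/O.

```python
def throttled_fraction(before: dict, after: dict) -> float:
    """Fraction of CFS periods in which the container was throttled
    between two samples of its cpu.stat counters."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


def likely_cpu_bottlenecks(fractions: dict, threshold: float = 0.1) -> list:
    """Names of containers throttled in more than `threshold` of periods,
    most throttled first. The threshold is an arbitrary starting point."""
    over = {name: f for name, f in fractions.items() if f > threshold}
    return sorted(over, key=over.get, reverse=True)
```

If the sidecar shows a high throttled fraction while the main container does not, raising the sidecar CPU limit is the first thing to try; the reverse points at the dataloader resources.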


2 participants