feat: Add model_load_time metric #397
Conversation
Can you also add me to this PR?
// Compute how long the model took to load, using the monotonic clock so
// the measurement is unaffected by wall-clock adjustments.
const uint64_t now_ns =
    std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now().time_since_epoch())
        .count();
uint64_t time_to_load_ns = now_ns - loaded.second->load_start_ns_;
// Convert the elapsed nanoseconds to a double-precision duration in
// seconds for the metric update.
std::chrono::duration<double> time_to_load =
    std::chrono::duration_cast<std::chrono::duration<double>>(
        std::chrono::nanoseconds(time_to_load_ns));
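(For reference, a minimal sketch of how the resulting seconds value could feed a per-model gauge, assuming a prometheus-cpp style gauge; load_time_gauge is a hypothetical handle, and only Gauge::Set(double) is standard prometheus-cpp API.)
// Hypothetical reporting step: publish the load time in seconds on a
// per-model gauge. How the gauge handle is obtained is assumed here.
load_time_gauge.Set(time_to_load.count());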
Why don't you time the load here and directly store the load time in the model info? You can still put the metric update logic here.
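(In other words, the suggestion is roughly the sketch below; load_time_ns_ is a hypothetical member on the model info, not necessarily the PR's actual field name.)
// Alternative sketch: measure the elapsed time at the point the load
// actually completes and stash it on the model info, so OnLoadFinal()
// only reads a precomputed value when it updates the metric.
const uint64_t now_ns =
    std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now().time_since_epoch())
        .count();
model_info->load_time_ns_ = now_ns - model_info->load_start_ns_;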
The reason I added the metric update in ModelLifeCycle::OnLoadFinal() is that this is where we set the status to READY for the model:
loaded.second->state_ = ModelReadyState::READY;
If I set the model load time there, there is a chance that ModelLifeCycle::OnLoadFinal() exits without the model being loaded, and we might end up with an incorrect metric.
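(Illustration of the guard being described: a minimal sketch that only reports the metric once the state has become READY. ReportModelLoadTime is the reporting hook mentioned in this PR, but its arguments here are assumed.)
// Sketch: update the load-time metric only after the model has actually
// reached READY, so a failed or aborted load never produces a sample.
if (loaded.second->state_ == ModelReadyState::READY) {
  const uint64_t now_ns =
      std::chrono::duration_cast<std::chrono::nanoseconds>(
          std::chrono::steady_clock::now().time_since_epoch())
          .count();
  const double load_time_s =
      static_cast<double>(now_ns - loaded.second->load_start_ns_) * 1e-9;
  ReportModelLoadTime(loaded.second, load_time_s);  // hypothetical arguments
}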
Can you build with the metrics-disabled flags and make sure it works?
LGTM. Make sure you have addressed all the comments before merging.
A minor comment on the model_info->load_start_ns_ lifecycle. LGTM overall!
What does the PR do?
Adds a new per-model metric, nv_load_time, to the metrics endpoint.
Added a load-time gauge metric per model.
Example of the new metric:
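(A hypothetical sample of what the exported gauge could look like in Prometheus exposition format; the exact metric name, help text, labels, and values should be checked against the server PR.)
# HELP nv_load_time Time taken to load the model, in seconds
# TYPE nv_load_time gauge
nv_load_time{model="example_model",version="1"} 2.31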
Checklist
<commit_type>: <Title>
Commit Type: Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
triton-inference-server/server#7697
Where should the reviewer start?
The ReportModelLoadTime function usage.
Test plan:
Tests are added in the server PR.