improve observability for procedure #3999

Closed
3 tasks
waynexia opened this issue May 21, 2024 · 4 comments · Fixed by #4675
Labels
A-metasrv Involves code in the meta server C-feature Category Features C-user-experience Category User Experience help wanted Extra attention is needed

Comments

@waynexia
Member

What problem does the new feature solve?

Currently we cannot inspect an individual procedure: which state it is in, how long it has been running, or which resources it holds. This makes it hard to debug or verify anything related to one specific procedure.

This ticket proposes improving the observability of procedures by exposing the information above somewhere, such as an HTTP API or a table in information_schema.

Useful links

  • procedure state

    pub enum ProcedureState {
        /// The procedure is running.
        #[default]
        Running,
        /// The procedure is finished.
        Done { output: Option<Output> },
        /// The procedure is failed and can be retried.
        Retrying { error: Arc<Error> },
        /// The procedure is failed and commits state before rolling back the procedure.
        PrepareRollback { error: Arc<Error> },
        /// The procedure is failed and can be rollback.
        RollingBack { error: Arc<Error> },
        /// The procedure is failed and cannot proceed anymore.
        Failed { error: Arc<Error> },
    }

  • acquiring locks

    for key in self.meta.lock_key.keys_to_lock() {
        // Acquire lock for each key.
        let key_guard = match key {
            StringKey::Share(key) => self.manager_ctx.key_lock.read(key.clone()).await.into(),
            StringKey::Exclusive(key) => {
                self.manager_ctx.key_lock.write(key.clone()).await.into()
            }
        };
        guard.key_guards.push(key_guard);
    }

  • implement cluster_info table in information_schema: "feat: adds information_schema cluster_info table" #3832

What does the feature do?

  • Improve ProcedureState::Running to contain a String. This string shows a procedure's current stage.
  • Count the procedure's execution time and store it.
  • Expose this information in information_schema.
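The three points above could be sketched roughly as follows. This is only an illustration: the `stage` field is the proposed addition, while `ProcedureInfo`, `to_info`, and all column names are hypothetical and not taken from the codebase. The enum is a cut-down stand-in for the real `ProcedureState`.

```rust
use std::time::Duration;

/// Simplified stand-in for the real `ProcedureState`; most variants are omitted.
#[derive(Debug, Clone, PartialEq)]
pub enum ProcedureState {
    /// Running, with a free-form description of the current stage (proposed field).
    Running { stage: String },
    /// Finished.
    Done,
}

/// Hypothetical row shape that an information_schema table could expose.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcedureInfo {
    pub procedure_id: String,
    pub state: String,
    pub stage: Option<String>,
    pub elapsed_ms: u64,
}

/// Builds one row from a procedure's state and its measured execution time.
pub fn to_info(procedure_id: &str, state: &ProcedureState, elapsed: Duration) -> ProcedureInfo {
    let (name, stage) = match state {
        ProcedureState::Running { stage } => ("Running", Some(stage.clone())),
        ProcedureState::Done => ("Done", None),
    };
    ProcedureInfo {
        procedure_id: procedure_id.to_string(),
        state: name.to_string(),
        stage,
        // Millisecond precision; the cast truncates only for absurdly long durations.
        elapsed_ms: elapsed.as_millis() as u64,
    }
}
```

Keeping the stage as an `Option<String>` column means non-running states do not need a placeholder value.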

Implementation challenges

No response

@waynexia waynexia added help wanted Extra attention is needed C-feature Category Features C-user-experience Category User Experience A-metasrv Involves code in the meta server labels May 21, 2024
@Kelvinyu1117
Contributor

I would like to work on it.
How can I verify my change locally for this improvement?

@evenyag
Contributor

evenyag commented Jun 4, 2024

> Improve ProcedureState::Running to contain a String. This string shows a procedure's current stage.

For this feature, one approach is adding a debug or trace log in the runner to print the current state of the procedure.

    let state = self.meta.state();
    match state {
        ProcedureState::Running => {}

@WenyXu Do you have any suggestions?
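A rough sketch of the logging approach, assuming the runner can format a line each time it observes a state. The enum below is a simplified mirror of the variants quoted in this issue (errors are plain strings and `Done` carries no output), and `state_log_line` is a hypothetical helper, not an existing function. In the real runner the returned string would presumably go to a debug or trace log macro.

```rust
/// Simplified mirror of the states quoted in this issue; errors are plain strings here.
#[derive(Debug)]
pub enum ProcedureState {
    Running,
    Done,
    Retrying { error: String },
    PrepareRollback { error: String },
    RollingBack { error: String },
    Failed { error: String },
}

/// Hypothetical helper: formats one log line describing the procedure's current state.
pub fn state_log_line(procedure_id: &str, state: &ProcedureState) -> String {
    let detail = match state {
        ProcedureState::Running => "running".to_string(),
        ProcedureState::Done => "done".to_string(),
        ProcedureState::Retrying { error } => format!("retrying after error: {error}"),
        ProcedureState::PrepareRollback { error } => {
            format!("preparing rollback after error: {error}")
        }
        ProcedureState::RollingBack { error } => format!("rolling back after error: {error}"),
        ProcedureState::Failed { error } => format!("failed: {error}"),
    };
    format!("procedure {procedure_id}: {detail}")
}
```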

@Kelvinyu1117
Contributor

Kelvinyu1117 commented Jun 4, 2024

For the execution time, it should measure how long execute_procedure_in_loop() takes, right?
What time precision (milliseconds, microseconds, nanoseconds) do you want for the execution_time?

@evenyag
Contributor

evenyag commented Jun 4, 2024

Milliseconds should be enough.
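A minimal sketch of measuring elapsed time at millisecond precision with `std::time::Instant`. The `timed_ms` helper is hypothetical, and it wraps a synchronous closure for simplicity; the real `execute_procedure_in_loop()` is presumably async, so the start and stop points would sit around its `.await` instead.

```rust
use std::time::Instant;

/// Runs `f` and returns its result together with the elapsed wall-clock time in milliseconds.
pub fn timed_ms<T>(f: impl FnOnce() -> T) -> (T, u128) {
    // `Instant` is monotonic, so the measurement is not affected by system clock changes.
    let start = Instant::now();
    let out = f();
    (out, start.elapsed().as_millis())
}
```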
