Experiment termination leads to a flood of messages #535

Open
MichaelRoeder opened this issue Sep 15, 2021 · 0 comments

Description

A large number of log messages are generated when the platform controller forces an experiment to terminate. This has two drawbacks:

  1. It is harder to find the cause of the termination in the logs because they are filled with exceptions (e.g., RabbitMQ exceptions); even the controller reports errors that are caused by its own behavior.
  2. The ELK stack is forced to process all of these log messages.

The majority of these messages could be avoided by changing the behavior of the platform controller.

Reproducibility

Start an experiment. Terminate it. Check the logs.

Expected behavior

The following changes should be implemented:

  1. Mark an experiment as forced to stop. This allows the controller to check whether it makes sense to send additional messages (e.g., the message that a container stopped) to the command queue. The other containers do not have to be informed about the termination since they will be terminated as well.
  2. Terminate containers in the right order. At the moment, the platform controller appears to start with the top element of the experiment's container tree. In many cases, however, this container contains the experiment's RabbitMQ message broker. Stopping it first causes a lot of exceptions in all connected containers, which can easily be avoided by changing the order of termination.
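
To make the two proposed changes more concrete, here is a rough sketch in Python. It is not taken from the platform controller's code: `Container`, `Experiment`, `forced_stop`, `sent_messages` and `termination_order` are hypothetical names, and the list `sent_messages` only stands in for the command queue. The sketch also assumes that one acceptable order is to stop children before their parent (post-order), so that the top container, which often contains the RabbitMQ broker, is stopped last.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical node in the container tree of an experiment."""
    name: str
    children: list[Container] = field(default_factory=list)


def termination_order(root: Container) -> list[str]:
    """Return container names in post-order: children before their parent."""
    order: list[str] = []
    for child in root.children:
        order.extend(termination_order(child))
    order.append(root.name)
    return order


class Experiment:
    """Hypothetical experiment record carrying a forced-stop marker."""

    def __init__(self, root: Container) -> None:
        self.root = root
        self.forced_stop = False
        # Stand-in for the command queue (assumption, not the real API).
        self.sent_messages: list[str] = []

    def on_container_stopped(self, name: str) -> None:
        # Change 1: no notification is sent while the controller itself
        # is tearing the experiment down.
        if not self.forced_stop:
            self.sent_messages.append(f"container stopped: {name}")

    def force_terminate(self) -> list[str]:
        self.forced_stop = True
        stopped: list[str] = []
        # Change 2: stop leaves first so the top container goes last.
        for name in termination_order(self.root):
            stopped.append(name)  # a real controller would stop the container here
            self.on_container_stopped(name)
        return stopped


tree = Container("rabbitmq-broker", [
    Container("system"),
    Container("benchmark", [Container("data-generator")]),
])
experiment = Experiment(tree)
print(experiment.force_terminate())
# ['system', 'data-generator', 'benchmark', 'rabbitmq-broker']
print(experiment.sent_messages)
# []
```

With the marker set before the first container is stopped, no "container stopped" messages are produced during the forced termination, and the container holding the broker stays up until every container that depends on it is gone.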