Telegraf Generating Orphaned DBus Processes on RHEL Servers #13481
Comments
IMO, telegraf itself should unconditionally disable the kwallet integration. The integration, AIUI, was an unintentional side-effect of using this library.
@crflanigan I have put up #13489 can you download an artifact and verify this no longer crashes? Thanks
@powersj Sure thing!
Our initial testing shows that this fix doesn't cause the issue. We will keep testing and let you know what (if anything) we find.
@powersj
Brilliant, thanks for the quick turn around on testing
You bet buddy!
Let's create a new issue and if you could please get logs from 1.27.2 I would appreciate it.
Relevant telegraf.conf
The Telegraf configuration appears to be irrelevant, as the problem is related to the Telegraf Secret Store, which initializes when the agent starts regardless of whether you are using it.
Logs from Telegraf
System info
Telegraf 1.25.2 - RHEL 6, 7, 8
Docker
No response
Steps to reproduce
Reproducing this has been tricky because it does not always appear to occur. On systems that were impacted (hundreds or more), any of the following resolved the issue: reverting Telegraf to an earlier version, stopping the Telegraf service and removing the orphaned processes, or performing the actions described below.
What we have seen:
Upgrading Telegraf from version 1.14 to 1.25.2 on RHEL servers seems to create an issue where DBus generates many orphaned processes. This eventually causes the system to hit the ceiling of available PIDs. Rolling back to 1.14 seems to clear the problem.
Example from one of our systems:
What we found:
The Telegraf Secret Store appears to depend on github.com/99designs/keyring. The import chain is as follows: plugins/secretstores/all/os.go loads telegraf/plugins/secretstores/os/os.go, which imports the keyring package, whose kwallet.go runs the following init() function:
From here we found that we can bypass this DBus activity by creating an environment variable
DISABLE_KWALLET=1
in the Telegraf startup script, though setting it through the terminal should also work. Investigating further, this behavior appears to be a known issue in this package that has yet to be solved.
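For anyone who cannot easily edit the startup script, the same workaround can be sketched as a small Go wrapper that launches a command with DISABLE_KWALLET=1 added to its environment. This is only an illustration: the helper name `withKWalletDisabled` and the wrapper itself are hypothetical and not part of Telegraf or the keyring package.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withKWalletDisabled returns a copy of env with DISABLE_KWALLET=1 set,
// dropping any previous value of that variable.
// Hypothetical helper for illustration; not part of Telegraf.
func withKWalletDisabled(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if strings.HasPrefix(kv, "DISABLE_KWALLET=") {
			continue
		}
		out = append(out, kv)
	}
	return append(out, "DISABLE_KWALLET=1")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: wrapper <command> [args...]")
		return
	}
	// Run the given command with the workaround variable in its environment.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withKWalletDisabled(os.Environ())
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "command failed:", err)
		os.Exit(1)
	}
}
```

It would be invoked with the usual Telegraf command line as arguments, for example `wrapper telegraf --config /etc/telegraf/telegraf.conf` (the config path here is an assumption).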
As an aside, it looks like the keyring package isn't being actively maintained; its last release was in December of 2022.
Expected behavior
Telegraf works as expected.
Actual behavior
Telegraf inadvertently creates thousands of orphaned DBus processes, which eventually exhaust the available PIDs and cause system degradation.
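To check whether a host is affected, a rough Go sketch like the one below could count DBus processes that have been reparented to PID 1 and compare that against the kernel's PID limit. It rests on two assumptions not confirmed in this report: that the orphaned processes have a command name starting with `dbus-`, and that orphans are adopted by PID 1. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseStat extracts the command name and parent PID from the contents of
// a /proc/<pid>/stat file, which has the form "pid (comm) state ppid ...".
func parseStat(stat string) (comm string, ppid int, err error) {
	start := strings.IndexByte(stat, '(')
	end := strings.LastIndexByte(stat, ')')
	if start < 0 || end < start {
		return "", 0, fmt.Errorf("malformed stat line")
	}
	comm = stat[start+1 : end]
	fields := strings.Fields(stat[end+1:]) // fields[0]=state, fields[1]=ppid
	if len(fields) < 2 {
		return "", 0, fmt.Errorf("too few fields in stat line")
	}
	ppid, err = strconv.Atoi(fields[1])
	return comm, ppid, err
}

// isOrphanedDBus reports whether a process looks like an orphaned DBus
// helper. Assumption: name starts with "dbus-" and the parent is PID 1.
func isOrphanedDBus(comm string, ppid int) bool {
	return strings.HasPrefix(comm, "dbus-") && ppid == 1
}

func main() {
	paths, _ := filepath.Glob("/proc/[0-9]*/stat")
	count := 0
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process may have exited in the meantime
		}
		comm, ppid, err := parseStat(string(data))
		if err == nil && isOrphanedDBus(comm, ppid) {
			count++
		}
	}
	fmt.Println("dbus processes parented by PID 1:", count)
	if data, err := os.ReadFile("/proc/sys/kernel/pid_max"); err == nil {
		fmt.Println("kernel pid_max:", strings.TrimSpace(string(data)))
	}
}
```

A legitimately running system bus can also be parented by PID 1, so a small nonzero count is not by itself a sign of the problem; a count in the thousands would be.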
Additional info
No response