Policy controller noisy connection logging #12755
Comments
It seems the stream is getting torn down every few minutes. Can you tell how often that happens, and whether the interval is consistent? Besides the log entries, is this causing changes to policy resources (Server, HTTPRoute, AuthorizationPolicy, etc.) to go undetected?
The errors are logged every 5 or 10 minutes (either of those, not a range) for about an hour. Sometimes all of the resource watchers fail around the same time. Sometimes they take turns: the HTTPRoute watchers fail for one hour, then the MeshTLSAuthentication watchers fail for another hour. The last log entry was six hours ago, though. I don't directly use those resources; we only have the default resources created during installation. Is there any way to test this out?
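To answer the interval question more precisely, a small script can measure the gaps between log entries. This is only a sketch: it assumes the logs were captured with `kubectl logs --timestamps`, so that each line starts with an RFC 3339 timestamp, and that they were already filtered down to the noisy message. The function names are hypothetical.

```python
import sys
from datetime import datetime


def parse_timestamps(lines):
    """Return the leading timestamp of every line that has one.

    Assumes each line starts with an RFC 3339 timestamp, as produced by
    `kubectl logs --timestamps`. Only second precision is kept, and lines
    without a parseable timestamp are skipped.
    """
    stamps = []
    for line in lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue
        try:
            # Keep "YYYY-MM-DDTHH:MM:SS" and drop fractional seconds and zone.
            stamps.append(datetime.strptime(parts[0][:19], "%Y-%m-%dT%H:%M:%S"))
        except ValueError:
            continue
    return stamps


def gaps_in_minutes(stamps):
    """Return the gap, in minutes, between consecutive timestamps."""
    ordered = sorted(stamps)
    return [
        round((later - earlier).total_seconds() / 60, 1)
        for earlier, later in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    # Read the filtered log lines from standard input.
    print(gaps_in_minutes(parse_timestamps(sys.stdin)))
```

Piping the filtered log lines into the script prints one gap per pair of consecutive entries, which makes it easy to see whether the interval really is a steady 5 or 10 minutes.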
Are you basing this only off of the INFO log messages, or is there some Linkerd behavior that is not working? The log messages do not necessarily indicate an error. They are logged at INFO and not WARN.
Based on the log. Even though the log level is INFO, the message looks like an error. I don't see any obvious indication that it is not working properly, but I want to make sure that this does not come back to bite me in production in the future.
I understand that the logging is noisy, so I'll leave this issue open, but I would not expect this to indicate operational problems.
What is the issue?
The policy controller fails when it tries to watch some Kubernetes resources (I think all of them). According to Cilium, not a single packet is dropped (I used Hubble to check this), but the controller says the connection was dropped. Using curl within the container, I can make the same GET request to the API server and get a response, so the CNI is not dropping this connection.
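A single GET with curl returns quickly, whereas a watch is a long-lived streaming response, so a closer test is to hold a watch open and time how long it lasts. The following is a rough sketch, not what the controller itself does. It assumes it runs in a pod with Python available (the controller image has none) and a service account allowed to watch the resource. The API group, version, and resource name passed at the bottom are assumptions to adjust for your cluster.

```python
import json
import ssl
import time
import urllib.request
from pathlib import Path

# Standard in-cluster locations for the service account credentials.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")
API_SERVER = "https://kubernetes.default.svc"


def watch_url(group, version, plural, timeout_seconds=300):
    """Build a cluster-wide watch URL for a resource in a named API group."""
    return (
        f"{API_SERVER}/apis/{group}/{version}/{plural}"
        f"?watch=true&timeoutSeconds={timeout_seconds}"
    )


def stream_watch(url):
    """Hold a watch open, print each event, and return its lifetime in seconds."""
    token = (SA_DIR / "token").read_text().strip()
    ctx = ssl.create_default_context(cafile=str(SA_DIR / "ca.crt"))
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    start = time.monotonic()
    with urllib.request.urlopen(req, context=ctx) as resp:
        # Watch events arrive as newline-delimited JSON objects.
        for line in resp:
            event = json.loads(line)
            name = event.get("object", {}).get("metadata", {}).get("name")
            print(event.get("type"), name)
    return time.monotonic() - start


if __name__ == "__main__":
    # Group, version, and resource are assumptions; adjust as needed.
    url = watch_url("policy.linkerd.io", "v1beta3", "httproutes", 900)
    print(f"watch ended after {stream_watch(url):.0f}s")
```

If the stream ends well before the requested `timeoutSeconds`, something between the pod and the API server is closing the connection early; if it ends at roughly that timeout, the server closed it as requested.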
How can it be reproduced?
Using terraform:
values-ha.yaml
Logs, error output, etc
Every few minutes, the policy controller logs:
Running it with a debug log level, I can see this:
Output of linkerd check -o short
Environment
Possible solution
No response
Additional context
I modified the policy controller container image by adding ls, wget, sh, and curl.
Would you like to work on fixing this bug?
None