[NPM] [Linux] race condition when editing NetPol with "except" CIDR #2840

Closed
huntergregory opened this issue Jul 10, 2024 · 0 comments · Fixed by #2841
Labels
bug linux npm Related to NPM.

huntergregory commented Jul 10, 2024

A race condition can occur when editing a NetworkPolicy that has an "except" CIDR block. If the policy has roughly 5 or more CIDR blocks, one of which has an "except" section, then editing the policy (or deleting and later recreating it) can leave an NPM Pod in an incapacitated state where it can't enforce this policy or any future policy changes.

Symptoms

There will be a repeating error log about an unknown argument "nomatch":

2024/07/10 23:28:41 [1] [DataPlane] [BACKGROUND] failed to add policy one at a time. default/policy-with-cidr-except. err: [DataPlane] [ADD-NETPOL] error while applying IPSets: ipset restore failed when applying ipsets: Operation [RunCommandWithFile] failed with error code [999], full cmd [], full error after 5 tries, failed to run command [ipset restore] with error: error running command [ipset restore] with err [exit status 2] and stdErr [ipset v7.5: Unknown argument: `nomatch'
Try `ipset help' for more information.
]
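
To check already-collected NPM logs for this symptom, a minimal sketch in Python follows. It assumes the logs are available as a list of text lines; the helper name has_nomatch_error is hypothetical, and the matched substring is copied from the ipset stderr in the log above.

```python
def has_nomatch_error(log_lines):
    """Return True if any log line shows ipset rejecting 'nomatch'.

    Hypothetical helper for illustration; not part of NPM.
    """
    # Substring copied from the ipset stderr in the error log above.
    marker = "Unknown argument: `nomatch'"
    return any(marker in line for line in log_lines)
```

A match only indicates that the error was logged at some point; whether it is still repeating has to be judged from the timestamps.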

Prevention and Mitigation

If the issue occurs, restart NPM Pods to mitigate.

The issue can be avoided by:

  1. Not editing a NetworkPolicy that has an "except" CIDR block.
  2. Not creating another NetworkPolicy with the same name and namespace after deleting a NetworkPolicy that had an "except" CIDR block.

Cause

When the race condition is hit, NPM tries to delete the CIDR "except" members from an IPSet while still passing the "nomatch" option, which causes a non-retriable syntax error. The command it runs looks like:

ipset -D 10.0.0.0/32 nomatch

but it must instead be

ipset -D 10.0.0.0/32
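
The difference between the two commands can be sketched in Python as below. This is an illustration under assumptions, not the change made in #2841: the helper name delete_member_args is hypothetical, and it mirrors the abbreviated command form above, which omits the set name.

```python
def delete_member_args(member):
    """Build the arguments for deleting an IPSet member.

    `member` is the text used when the member was added, for example
    "10.0.0.0/32 nomatch". As the error above shows, ipset rejects
    "nomatch" on delete, so the option is dropped here.
    Hypothetical helper for illustration; not NPM's actual code.
    """
    parts = [p for p in member.split() if p != "nomatch"]
    return ["-D"] + parts
```

With this sketch, both "10.0.0.0/32 nomatch" and "10.0.0.0/32" produce the same delete arguments, so an "except" member is removed the same way as a regular member.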