Better error handling of "unexpected EOF" #89

Open
bendikp opened this issue Jan 18, 2023 · 2 comments

@bendikp
Member

bendikp commented Jan 18, 2023

When parsing the scan Job logs fails, we get an "unexpected EOF" error in the controller logs, and the scan Job is never retried.

{
  "level": "error",
  "ts": "2023-01-17T20:06:40.568Z",
  "msg": "Reconciler error",
  "controller": "job",
  "controllerGroup": "batch",
  "controllerKind": "Job",
  "Job": {
    "name": "deployment-vuln-app-app-eccf8-89288",
    "namespace": "image-scanner-jobs"
  },
  "namespace": "image-scanner-jobs",
  "name": "deployment-vuln-app-app-eccf8-89288",
  "reconcileID": "b13fa5da-bb8d-4a65-9463-8d5e0f7b68ab",
  "error": "unexpected EOF",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:235"
}
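
For context, here is a minimal Go sketch of one way this exact message can arise, assuming the controller streams the Job logs through the `Decoder` in `encoding/json`. A report that ends early makes `Decode` return `io.ErrUnexpectedEOF`, whose text is "unexpected EOF". The `scanReport` type, the `parseScanLog` helper and the job name are hypothetical, not the project's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// scanReport is a hypothetical, heavily reduced stand-in for the scanner output.
type scanReport struct {
	ArtifactName string `json:"ArtifactName"`
}

// parseScanLog decodes one report from the Job log stream and wraps any
// failure with the Job name, so the error says what was being parsed.
func parseScanLog(jobName string, logs io.Reader) (*scanReport, error) {
	var report scanReport
	if err := json.NewDecoder(logs).Decode(&report); err != nil {
		return nil, fmt.Errorf("parsing scan report from logs of job %q: %w", jobName, err)
	}
	return &report, nil
}

// isTruncated reports whether parsing raw fails because the JSON ends early.
func isTruncated(raw string) bool {
	_, err := parseScanLog("example-job", strings.NewReader(raw))
	return errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	// The closing brace is missing, as if the log stream had been cut off.
	_, err := parseScanLog("example-job", strings.NewReader(`{"ArtifactName": "vuln-app"`))
	fmt.Println(err)
	// prints: parsing scan report from logs of job "example-job": unexpected EOF
}
```

Wrapping with `%w` keeps the original cause matchable through `errors.Is`, while the message itself names the Job whose logs failed to parse.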
@erikgb
Member

erikgb commented Jan 18, 2023

I think we should start by just improving the error handling and updating the CIS status with the error. Then we can see how to improve this further. WDYT?
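
A rough sketch of what recording the error on the status could look like. The `imageScanStatus` and `scanCondition` types, the `Scanned` condition type and the reason strings are all hypothetical stand-ins, not the actual CIS API; a real change would use the project's own status types.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// scanCondition and imageScanStatus are hypothetical, simplified stand-ins
// for the real CIS status types.
type scanCondition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

type imageScanStatus struct {
	Conditions []scanCondition
}

// setCondition replaces the condition with the same Type, or appends it.
func setCondition(status *imageScanStatus, c scanCondition) {
	for i := range status.Conditions {
		if status.Conditions[i].Type == c.Type {
			status.Conditions[i] = c
			return
		}
	}
	status.Conditions = append(status.Conditions, c)
}

// recordScanError stores a failed scan on the status, so the failure is
// visible on the resource and not only in the controller logs.
func recordScanError(status *imageScanStatus, err error) {
	reason := "ScanReportParseError" // hypothetical reason strings
	if errors.Is(err, io.ErrUnexpectedEOF) {
		reason = "ScanReportTruncated"
	}
	setCondition(status, scanCondition{
		Type:    "Scanned",
		Status:  "False",
		Reason:  reason,
		Message: err.Error(),
	})
}

// exampleStatus records a wrapped "unexpected EOF" on an empty status.
func exampleStatus() imageScanStatus {
	var status imageScanStatus
	recordScanError(&status, fmt.Errorf("parsing scan report: %w", io.ErrUnexpectedEOF))
	return status
}

func main() {
	status := exampleStatus()
	fmt.Printf("%+v\n", status.Conditions[0])
}
```

Keeping a separate reason for the truncated case would make it easy to spot on the resource whether the logs were cut off or the content was simply not what the parser expected.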

@erikgb erikgb self-assigned this Jan 18, 2023
@padlar
Contributor

padlar commented Jan 18, 2023

I agree with temporarily updating the CIS status with the error. As I see it, the problem is triggered from outside, and the best we can do is to fail and retry the scan job:
-- Is the trivy-server sending different JSON output in the failed scenario? If yes, then the above solution is good enough.
-- If the problem is with the transport layer or a k8s log byte limit, we gotta figure out the root cause.
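
To tell these two cases apart, something like the sketch below might help. It assumes the report is decoded with the `Decoder` in `encoding/json`, which returns `io.ErrUnexpectedEOF` for output that stops mid-document and a `*json.SyntaxError` or `*json.UnmarshalTypeError` for output that is complete but not in the expected shape. The `classifyLogs` helper, its return labels and the reduced report struct are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// classifyLogs decodes raw as a (heavily reduced) scan report and labels the
// outcome. The labels are illustrative only.
func classifyLogs(raw string) string {
	var report struct {
		Results []json.RawMessage `json:"Results"`
	}
	err := json.NewDecoder(strings.NewReader(raw)).Decode(&report)

	var syntaxErr *json.SyntaxError
	var typeErr *json.UnmarshalTypeError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, io.EOF):
		// No output at all.
		return "empty"
	case errors.Is(err, io.ErrUnexpectedEOF):
		// Output stopped mid-document: points at transport or log size limits.
		return "truncated"
	case errors.As(err, &syntaxErr), errors.As(err, &typeErr):
		// Complete output, but not the JSON shape the parser expects.
		return "unexpected-format"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classifyLogs(`{"Results": [{"Target": "al`)) // cut off mid-string
	fmt.Println(classifyLogs(`FATAL scan error`))            // not JSON at all
	fmt.Println(classifyLogs(`{"Results": "oops"}`))         // JSON, wrong shape
}
```

If failed scans mostly come out as "truncated", that would point toward the second case and a root cause outside the parser.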
