convert some jobs to Ginkgo --label-filter #32648
@@ -1067,7 +1068,7 @@ presubmits:
- '--node-test-args=--container-runtime-endpoint=unix:///var/run/crio/crio.sock --container-runtime-process-name=/usr/local/bin/crio --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/crio.service --kubelet-cgroups=/system.slice/kubelet.service" --extra-log="{\"name\": \"crio.log\", \"journalctl\": [\"-u\", \"crio\"]}"'
- --node-tests=true
- --provider=gce
- --test_args=--nodes=8 --focus="\[NodeConformance\]|\[NodeFeature:.+\]|\[NodeFeature\]" --skip="\[Flaky\]|\[Slow\]|\[Serial\]"
- --test_args=--nodes=8 --label-filter='(NodeConformance || !(NodeFeature: isEmpty)) && !Flaky && !Slow && !Serial'
This ran 353 tests with old and new args.
@@ -702,7 +702,7 @@ presubmits:
- '--node-test-args=--feature-gates=SidecarContainers=true --service-feature-gates=SidecarContainers=true --container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime-process-name=/usr/bin/containerd --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/containerd.service" --extra-log="{\"name\": \"containerd.log\", \"journalctl\": [\"-u\", \"containerd*\"]}"'
- --node-tests=true
- --provider=gce
- --test_args=--nodes=1 --timeout=4h --focus="\[Serial\].*\[NodeFeature:SidecarContainers\]|\[NodeFeature:SidecarContainers\].*\[Serial\]" --skip="\[Flaky\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeFeature:Eviction\]"
- --test_args=--nodes=1 --timeout=4h --label-filter='Serial && NodeFeature: containsAny SidecarContainers && !Flaky && !Benchmark && NodeSpecialFeature: isEmpty && !(NodeFeature: containsAny Eviction)'
4 tests with old and new flags:
kubernetes$ _output/bin/ginkgo --dry-run --silence-skips -v --label-filter='Serial && NodeFeature: containsAny SidecarContainers && !Flaky && !Benchmark && NodeSpecialFeature: isEmpty && !(NodeFeature: containsAny Eviction)' ./test/e2e_node/
...
[sig-node] Device Plugin [Feature:DevicePluginProbe] [NodeFeature:DevicePluginProbe] [Serial] DevicePlugin [Serial] [Disruptive] Can schedule a pod with a restartable init container [NodeFeature:SidecarContainers] [sig-node, Feature:DevicePluginProbe, NodeFeature:DevicePluginProbe, Serial, Disruptive, NodeFeature:SidecarContainers]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/device_plugin_test.go:610
• [0.000 seconds]
------------------------------
[sig-node] CPU Manager [Serial] [Feature:CPUManager] With kubeconfig updated with static CPU Manager policy run the CPU Manager tests should not reuse CPUs of restartable init containers [NodeFeature:SidecarContainers] [sig-node, Serial, Feature:CPUManager, NodeFeature:SidecarContainers]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/cpu_manager_test.go:713
• [0.000 seconds]
------------------------------
[sig-node] [NodeFeature:SidecarContainers] [Serial] Containers Lifecycle should restart the containers in right order after the node reboot [sig-node, NodeFeature:SidecarContainers, Serial]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/container_lifecycle_test.go:3128
• [0.000 seconds]
------------------------------
[sig-node] POD Resources [Serial] [Feature:PodResources] [NodeFeature:PodResources] with SRIOV devices in the system with CPU manager Static policy should return the expected responses [NodeFeature:SidecarContainers] [sig-node, Serial, Feature:PodResources, NodeFeature:PodResources, NodeFeature:SidecarContainers]
/nvme/gopath/src/k8s.io/kubernetes/test/e2e_node/podresources_test.go:901
• [0.000 seconds]
------------------------------
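For comparison, the old `--focus`/`--skip` pair selected tests by matching regular expressions against the full test name. The following is only a rough sketch of that selection: it uses `grep -E` as a stand-in for Ginkgo's regex matching (an assumption, the regex dialects differ in details), takes two test names from the dry-run output above, and adds two hypothetical names to show what gets excluded.

```shell
# Old-style selection: regexes matched against the full test name.
focus='\[Serial\].*\[NodeFeature:SidecarContainers\]|\[NodeFeature:SidecarContainers\].*\[Serial\]'
skip='\[Flaky\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeFeature:Eviction\]'

# select_tests reads test names on stdin, keeps those matching $focus,
# then drops those matching $skip.
select_tests() {
  grep -E "$focus" | grep -vE "$skip"
}

# The first two names come from the dry-run output; the last two are
# hypothetical and only illustrate what is filtered out.
names='[sig-node] [NodeFeature:SidecarContainers] [Serial] Containers Lifecycle should restart the containers in right order after the node reboot
[sig-node] CPU Manager [Serial] [Feature:CPUManager] With kubeconfig updated with static CPU Manager policy run the CPU Manager tests should not reuse CPUs of restartable init containers [NodeFeature:SidecarContainers]
[sig-node] hypothetical [Serial] [Flaky] example [NodeFeature:SidecarContainers]
[sig-node] hypothetical non-serial example [NodeFeature:SidecarContainers]'

printf '%s\n' "$names" | select_tests
```

Only the two real names are printed. The regex has to spell out both orders of `[Serial]` and `[NodeFeature:SidecarContainers]`, which the label filter expresses as a plain `&&`.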
nice - this is much easier to read and grok.
indeed
Quoting turned out to be a bit tricky: without quoting the string, YAML treats it as an object because of the ":".
Not a big problem, just something to remember...
- name: SKIP
value: \[Slow\]|\[Disruptive\]|\[Flaky\]|\[Feature:.+\]|PodSecurityPolicy|LoadBalancer|load.balancer|In-tree.Volumes.\[Driver:.nfs\]|PersistentVolumes.NFS|Network.should.set.TCP.CLOSE_WAIT.timeout|Simple.pod.should.support.exec.through.an.HTTP.proxy|subPath.should.support.existing|should.provide.basic.identity
value: PodSecurityPolicy|LoadBalancer|load.balancer|In-tree.Volumes.\[Driver:.nfs\]|PersistentVolumes.NFS|Network.should.set.TCP.CLOSE_WAIT.timeout|Simple.pod.should.support.exec.through.an.HTTP.proxy|subPath.should.support.existing|should.provide.basic.identity
One small difference is that tests which depend only on an alpha or beta feature gate are now allowed to run. However, we currently don't have any such tests, because each of them also has to add a Feature:<feature gate name> label so that it is skipped in normal jobs.
This runs 2322 tests with the old and new flags.
- name: FOCUS
value: "."
- name: LABEL_FILTER
value: !Slow && !Disruptive && !Flaky && Feature: isSubsetOf Alpha
kubernetes-sigs/kind#3582 got merged. Does that mean that it is usable now?
In other words, can we merge this PR?
/cc @aojea @BenTheElder
The new --label-filter expression is easier.
b716288 to 3d87520
/lgtm @pohly do you mind sending an update to the mailing list with the existing changes and a brief explanation of the benefits?
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: aojea, pohly
@@ -32,7 +32,7 @@ periodics:
kind build node-image --image=dra/node:latest . &&
trap 'kind export logs "${ARTIFACTS}/kind"; kind delete cluster' EXIT &&
kind create cluster --retain --config test/e2e/dra/kind.yaml --image dra/node:latest &&
KUBERNETES_PROVIDER=local KUBECONFIG=${HOME}/.kube/config GINKGO_PARALLEL_NODES=8 E2E_REPORT_DIR=${ARTIFACTS} hack/ginkgo-e2e.sh -ginkgo.focus=DynamicResourceAllocation -ginkgo.skip=\[Serial\]
KUBERNETES_PROVIDER=local KUBECONFIG=${HOME}/.kube/config GINKGO_PARALLEL_NODES=8 E2E_REPORT_DIR=${ARTIFACTS} hack/ginkgo-e2e.sh -ginkgo.filter='Feature: containsAny DynamicResourceAllocation && !Serial'
Of course, now that I look at this one last time after merging I am spotting a typo! 🥵
s/-ginkgo.filter/-ginkgo.label-filter/
@@ -188,7 +188,7 @@ periodics:
- '--node-test-args=--feature-gates="DynamicResourceAllocation=true" --service-feature-gates="DynamicResourceAllocation=true,SchedulerQueueingHints=true" --runtime-config=resource.k8s.io/v1alpha2=true --container-runtime-endpoint=unix:///var/run/crio/crio.sock --container-runtime-process-name=/usr/local/bin/crio --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/crio.service --kubelet-cgroups=/system.slice/kubelet.service" --extra-log="{\"name\": \"crio.log\", \"journalctl\": [\"-u\", \"crio\"]}"'
- --node-tests=true
- --provider=gce
- --test_args=--focus="\[Feature:DynamicResourceAllocation\]" --skip="\[Flaky\]"
- "--test_args=--label-filter='Feature: containsAny DynamicResourceAllocation && !Flaky'"
e2e-node tests do not preserve white space properly 😢
I0617 08:30:21.206725 9333 ssh.go:146] Running the command ssh, with args: [-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no -o ServerAliveInterval=30 -o LogLevel=ERROR -i /workspace/.ssh/google_compute_engine core@34.83.230.94 -- sudo /bin/bash -c 'cd /tmp/node-e2e-20240617T083009 && set -o pipefail; timeout -k 30s 3900.000000s ./ginkgo --label-filter='Feature: containsAny DynamicResourceAllocation && !Flaky' --no-color -v ./e2e_node.test -- --system-spec-name= --system-spec-file= --extra-envs= --runtime-config= --v 4 --node-name=n1-standard-4-fedora-coreos-40-20240519-3-0-gcp-x86-64-54709c74 --report-dir=/tmp/node-e2e-20240617T083009/results --report-prefix=fedora --image-description="fedora-coreos-40-20240519-3-0-gcp-x86-64" --feature-gates="DynamicResourceAllocation=true" --service-feature-gates="DynamicResourceAllocation=true" --runtime-config=resource.k8s.io/v1alpha2=true --container-runtime-endpoint=unix:///var/run/crio/crio.sock --container-runtime-process-name=/usr/local/bin/crio --container-runtime-pid-file= --kubelet-flags="--cgroup-driver=systemd --cgroups-per-qos=true --cgroup-root=/ --runtime-cgroups=/system.slice/crio.service --kubelet-cgroups=/system.slice/kubelet.service" --extra-log="{\"name\": \"crio.log\", \"journalctl\": [\"-u\", \"crio\"]}" 2>&1 | tee -i /tmp/node-e2e-20240617T083009/results/n1-standard-4-fedora-coreos-40-20240519-3-0-gcp-x86-64-54709c74-ginkgo.log']
E0617 08:30:21.848227 9333 ssh.go:149] failed to run SSH command: out: ginkgo run failed
Found no test suites
, err: exit status 1
In this case, it's a limitation of ssh, but that is something that the e2e_node tests should be aware of. The solution for "run complex shell commands via ssh" is to invoke ssh /bin/sh and then feed it the commands on stdin.
Will prepare a fix.
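A minimal sketch of the "feed the commands on stdin" idea, not the actual fix: the script text travels on stdin, so it is parsed exactly once and the quoted filter survives. `REMOTE_SHELL` is a hypothetical variable introduced here so the example runs locally; over ssh it would be something like `ssh <host> /bin/sh`.

```shell
# Hypothetical knob: defaults to a local shell so the sketch is runnable.
# For a real remote run this would be e.g. "ssh <host> /bin/sh".
REMOTE_SHELL=${REMOTE_SHELL:-/bin/sh}

# The quoted heredoc delimiter ('EOF') prevents any local expansion, so the
# receiving shell sees the script byte for byte and parses it exactly once.
# $REMOTE_SHELL is left unquoted on purpose so it can hold a command plus arguments.
output=$($REMOTE_SHELL <<'EOF'
printf '[%s]\n' --label-filter='Feature: containsAny DynamicResourceAllocation && !Flaky'
EOF
)

printf '%s\n' "$output"
# prints: [--label-filter=Feature: containsAny DynamicResourceAllocation && !Flaky]
```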
Except that this occurs so deep down that properly composing the input script has the same problems.
I guess it boils down to "use double quotes" for --test_args. Let's try with that: #32774
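The quoting problem can be reproduced locally without ssh. This is a sketch under the assumption that the remote side re-parses the joined command line with a shell, as the logged `/bin/bash -c '...'` invocation suggests. The `printf` payload is a stand-in for the real `ginkgo` call.

```shell
# Stand-in payload: prints its single argument in brackets.
payload_prefix='printf "[%s]\n" '

single="--label-filter='Feature: containsAny DynamicResourceAllocation && !Flaky'"
double='--label-filter="Feature: containsAny DynamicResourceAllocation && !Flaky"'

# Both command lines wrap the payload in single quotes, like the logged
# `/bin/bash -c '...'` invocation.
cmd_single="bash -c '${payload_prefix}${single}'"
cmd_double="bash -c '${payload_prefix}${double}'"

# reparse hands a complete command line to a fresh shell, which splits and
# unquotes it again. Errors from the mangled variant are suppressed.
reparse() {
  bash -c "$1" 2>/dev/null || true
}

broken=$(reparse "$cmd_single")
intact=$(reparse "$cmd_double")

printf 'single quotes: %s\n' "$broken"   # [--label-filter=Feature:]
printf 'double quotes: %s\n' "$intact"   # [--label-filter=Feature: containsAny DynamicResourceAllocation && !Flaky]
```

With inner single quotes, the first inner quote closes the outer one, so the filter is cut off at the first space. Double quotes nest cleanly inside the single-quoted wrapper.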
The new --label-filter expression is easier.
Depends on: