Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump go 1.22.2->1.22.5 and add E2E tests #639

Merged
merged 33 commits into from
Aug 5, 2024

Conversation

carreter
Copy link
Contributor

@carreter carreter commented Jul 8, 2024

Made some E2E tests that:

  • spin up a kind cluster
  • deploy konnectivity servers and agents
  • get metrics to confirm all servers are connected to all agents

Currently do this for two scenarios: 1 agent + 1 server and 3 agents + 3 servers.

Had to run go mod tidy and go mod vendor along the way to pull in some dependencies, couldn't figure out how to do this without bumping the go patch version. Hope this is ok, I can try and rework it if not!

@k8s-ci-robot
Copy link
Contributor

Hi @carreter. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 8, 2024
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 8, 2024
@carreter
Copy link
Contributor Author

carreter commented Jul 8, 2024

@avrittrohwer is on vacation this week, is anyone else available to review?

@carreter
Copy link
Contributor Author

carreter commented Jul 8, 2024

Blocks #635 FYI

@cheftako
Copy link
Contributor

cheftako commented Jul 9, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 9, 2024
@carreter
Copy link
Contributor Author

carreter commented Jul 9, 2024

@tallclair for visibility. Working on cleaning up the PR to see if we can avoid upgrading the go version and minimize the vendored dependencies!

@carreter
Copy link
Contributor Author

Seems like sigs.k8s.io/e2e-fromework v0.4.0 requires go 1.22.3 so it is impossible to downgrade:

$ go mod tidy -go 1.22.2
go: sigs.k8s.io/e2e-framework@v0.4.0 requires go@1.22.3, but 1.22.2 is requested

Here are the new packages that are getting pulled into vendor/:

$ git diff master -- vendor/modules.txt | grep -e "+# " | sed -e "s/\+# //" -e "/.*=>.*/d"
github.com/evanphx/json-patch/v5 v5.9.0
github.com/gorilla/websocket v1.5.0
github.com/imdario/mergo v0.3.15
github.com/moby/spdystream v0.2.0
github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f
github.com/vladimirvivien/gexe v0.2.0
sigs.k8s.io/controller-runtime v0.18.2
sigs.k8s.io/e2e-framework v0.4.0
sigs.k8s.io/yaml v1.4.0

@cheftako Where can I check if these are ok to include?

@@ -1,7 +1,6 @@
module sigs.k8s.io/apiserver-network-proxy

go 1.22
toolchain go1.22.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, is there a reason for removing the go1.22.4 toolchain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go mod tidy did this automatically. Reverted.

Makefile Outdated
@@ -80,6 +80,10 @@ test:
test-integration: build
go test -mod=vendor -race ./tests -agent-path $(PWD)/bin/proxy-agent

.PHONY: test-e2e
test-e2e: docker-build
go test -mod=vendor ./e2e -agent-image ${AGENT_FULL_IMAGE}-$(TARGETARCH):${TAG} -server-image ${SERVER_FULL_IMAGE}-$(TARGETARCH):${TAG}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason not to run with -race enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, just didn't know about it. Added the -race flag.

)

// renderServerTemplate renders a template from e2e/templates/server into a kubernetes object.
func renderServerTemplate(file string, params any) (client.Object, *schema.GroupVersionKind, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on having just a single renderTemplate where callers pass a file path relative to templates like "agent/service.yaml"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like this would be significantly easier to use. Will do!

@@ -0,0 +1,65 @@
package e2e

import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on putting this logic in main_test.go? When reading through main_test.go it was not immediately obvious where these funcs were defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

e2e/main_test.go Outdated
envfuncs.CreateCluster(kindCluster, kindClusterName),
envfuncs.LoadImageToCluster(kindClusterName, *agentImage),
envfuncs.LoadImageToCluster(kindClusterName, *serverImage),
func(ctx context.Context, cfg *envconf.Config) (context.Context, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on making this a named renderAndApplyManifests() func?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@@ -0,0 +1,82 @@
apiVersion: apps/v1
kind: StatefulSet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for using a StatefulSet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatefulSets give the pods a unique identifier we can reach them with.

metricsFamilies, err := getMetrics(fmt.Sprintf("%v-%v:%v/metrics", agentPod.Name, agentServiceHost, adminPort))

Replicas: 1,
Image: *serverImage,
Args: []KeyValue{
{Key: "log-file", Value: "/var/log/konnectivity-server.log"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could write these without Key and Value e.g. {"log-file", "/var/log/konnectivity-server.log"}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

e2e/main_test.go Outdated
os.Exit(testenv.Run(m))
}

func getMetrics(url string) (map[string]*io_prometheus_client.MetricFamily, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on having this return a friendlier type? Current usage always fetches a gauge so we could have this be getMetrics(url, name) (int, error). That would keep the int(connectionsMetric.GetMetric()[0].GetGauge().GetValue()) complexity all in this func

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a getMetricsGaugeValue() func


for _, agentPod := range agentPods.Items {

metricsFamilies, err := getMetrics(fmt.Sprintf("%v-%v:%v/metrics", agentPod.Name, agentServiceHost, adminPort))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought on pulling this into a agentsAreConnected() func? That way we could reuse in singleserver_singleagent_test as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@carreter
Copy link
Contributor Author

@avrittrohwer Addressed your comments. Working on including this in the GitHub workflows now.

@avrittrohwer
Copy link
Contributor

/retest

@carreter
Copy link
Contributor Author

Excluded the e2e tests from the main test runners (fast-test and test) in the Makefile. Hopefully that fixes this.

@carreter
Copy link
Contributor Author

/retest

@carreter
Copy link
Contributor Author

As per @cheftako and @ipochi 's suggestion, I have changed the StatefulSet out for a Deployment and address the pods directly by their IP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a short README.md to the e2e directory that goes over what maint_test.go sets up (RBAC and service) and what each specific test is expected to set up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, could you take a look?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

e2e/README.md Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't see 'KinD' capitalization used on the linked docs, keep it 'kind' for consistency

@carreter
Copy link
Contributor Author

Hmm, unsure why the tests are failing suddenly. They were working locally. Will do some digging.

@carreter
Copy link
Contributor Author

/retest

go.mod Outdated
@@ -21,7 +21,9 @@ require (
k8s.io/client-go v0.30.2
k8s.io/component-base v0.30.2
k8s.io/klog/v2 v2.130.0
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.0.0
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.29.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pleaser restore v0.0.0 to avoid confusion. Note there is a replace directive below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@carreter carreter requested a review from jkh52 July 30, 2024 16:46

metricsParser := &expfmt.TextParser{}
metricsFamilies, err := metricsParser.TextToMetricFamilies(resp.Body)
defer resp.Body.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not critical but I think the defer should really be right after the http.Get error check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

func getMetricsGaugeValue(url string, name string) (int, error) {
resp, err := http.Get(url)
if err != nil {
return 0, fmt.Errorf("could not get metrics: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debuggability is easier if you include the url in with the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

var serverPods *corev1.PodList
err := client.Resources().List(ctx, serverPods, resources.WithLabelSelector("k8s-app=konnectivity-server"))
if err != nil {
t.Fatalf("couldn't get server pods: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again debuggability goes up if you have a reference to relevant request (server pods) with the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a reference to the label selector being used to discover the pods.

for _, serverPod := range serverPods.Items {
numConnections, err := getMetricsGaugeValue(fmt.Sprintf("%v:%v/metrics", serverPod.Status.PodIP, adminPort), "konnectivity_network_proxy_server_ready_backend_connections")
if err != nil {
t.Fatalf("couldn't get agent metric 'konnectivity_network_proxy_server_ready_backend_connections' for pod %v", serverPod.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nout found is the obvious error but there could be other reasons we got an error here and we did not check the error type. I would suggest logging it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

if numConnections != expectedConnections {
t.Errorf("incorrect number of connected agents (want: %v, got: %v)", expectedConnections, numConnections)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%v can mask some interesting errors in a case like this. They may be unequal because they are of different types but their toString method generates the same string. If you believe they should both be ints then it is better to use %d.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{"server-port", "8090"},
{"agent-port", "8091"},
{"health-port", "8092"},
{"admin-port", strconv.Itoa(adminPort)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not wrong but its not clear to me why your type converting what appears to be a constant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler complains that adminPort is an int if I don't conver it.

@cheftako
Copy link
Contributor

cheftako commented Aug 5, 2024

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 5, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avrittrohwer, carreter, cheftako, ipochi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 5, 2024
@k8s-ci-robot k8s-ci-robot merged commit 4171b34 into kubernetes-sigs:master Aug 5, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants