-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wasm binary served from kuadrant operator workload #686
Conversation
@@ -23,10 +23,6 @@ on: | |||
description: DNS Operator bundle version | |||
default: latest | |||
type: string | |||
wasmShimVersion: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heads up @didierofrivia
The wasm shim is no longer part of the operator bundle. Instead, it is added as part of the kuadrant operator image build process.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #686 +/- ##
==========================================
+ Coverage 80.20% 82.92% +2.72%
==========================================
Files 64 77 +13
Lines 4492 5776 +1284
==========================================
+ Hits 3603 4790 +1187
- Misses 600 653 +53
- Partials 289 333 +44
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Makefile
Outdated
@@ -305,15 +295,26 @@ test-unit: clean-cov generate fmt vet ## Run Unit tests. | |||
|
|||
##@ Build | |||
|
|||
WASM_SHIM = $(PROJECT_PATH)/kuadrant-ratelimit-wasm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a more generic name for this, that perhaps continues to work if we, say, add something like ext authz to the functions performed by the component as well?
WASM_SHIM = $(PROJECT_PATH)/kuadrant-ratelimit-wasm | |
WASM_SHIM = $(PROJECT_PATH)/kuadrant-wasm-shim |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are planning to have just one wasm shim to do everything, it makes sense @guicassolato suggestion... however, if the name is meant to only rate limiting purposes, it's OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go for the more generic name too... unless we need to discriminate one day, which I doubt, this is the most portable option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what's better between one wasm shim per function (RL, auth) or a single one for all. I can see pros and cons for both, although today I'd probably be more inclined to a single one, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial idea was that if ext authz was being done by wasm, it would be called kuadrant-authz-wasm
. And would be a different wasm binary. Discrimination comes from the fact that they are essentially speaking different languages. While rate limiting uses RLS, ext auth uses external authorization gRPC protocol. So configuration of each wasm module, I pre-asume, would be different. Not even speaking about potentially different release stream.
But if you prefer to go for a generic name for now, I am happy with it. It can always be changed in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that RLS and ext_authz are indeed what make the two wasm-shims (two functions of a single wasm-shim) different one another.
On the other hand, other than that, arguably the two wasm-shims are practically identical:
- both need to decide whether a request matches;
- both need to evaluate well-known attributes;
- both perform a grpc call to a service, basically saying "should I let traffic go through? decide based on this payload.";
- both expect a boolean response, maybe with some metadata that typically become HTTP headers.
Moreover, if the auth layer needs to propagate data to the RL layer, with one wasm-shim, that can be done "over-the-wire", while with two, one wasm-shim needs to inject Envoy Dynamic Metadata so the other one can retrieve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a design pattern, I always pick one module does one thing. For multiple reasons. But this is a discussion we do not need to have now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one module does one thing
Not wrong. Neither we should repeat ourselves, another design principle 😜
Maybe this is one of those cases where following all the "rules" we have in life becomes impractical, like "The early bird catches the worm" but also "Good things come to those who wait."
.gitignore
Outdated
@@ -31,5 +31,7 @@ tmp | |||
/catalog/kuadrant-operator-catalog.Dockerfile | |||
/coverage/ | |||
|
|||
/kuadrant-ratelimit-wasm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe call it kuadrant-ratelimit-shim
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be kuadrant-wasm-shim
Dockerfile
Outdated
@@ -16,12 +25,16 @@ COPY controllers/ controllers/ | |||
COPY pkg/ pkg/ | |||
|
|||
# Build | |||
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o manager main.go | |||
RUN WASM_SHIM_SHA256=$(cat /opt/kuadrant/wasm-shim/kuadrant-ratelimit-wasm.sha256) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case we follow what Gui suggested, this var makes sense... if not, probably something like RATELIMIT_SHIM_SHA256
would make more sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeping as WASM_SHIM_SHA256
make/wasm-shim.mk
Outdated
} | ||
|
||
.PHONY: wasm-shim | ||
wasm-shim: $(WASM_SHIM) ## Download opm locally if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
## Download the wasm-shim locally if necessary
Applied ready for a new review |
config/deploy/kustomization.yaml
Outdated
resources: | ||
- ../default | ||
- ../dependencies | ||
patchesStrategicMerge: | ||
- manager_debug_mode.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we defaulting to debug
on make deploy
now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! we can no longer run make run
to run locally (an actual process in your local machine) which was configured with debug level to see all details.
So, now, we (developers) have only "make deploy" dev/Testing deployment option so I decided to deploy with debug to have all available logs.
This is not affecting OLM deployments for which LOG_LEVEL
defaults to "INFO" level LOG_MODE
to production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. IDK. I think debug is OK maybe for local-deploy
, but deploy
is a legit target that blindly applies the deploy manifests to the kubectl context. It seems risky to enable debug by default. If this eventually ends up propagated to internal components config such as auth, it would leak sensitive data to the logs by default.
Maybe commenting this patch and let devs decide when to enable it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! we can no longer run
make run
to run locally (an actual process in your local machine) which was configured with debug level to see all details.
Are we OK with this? I would pretty much always run controllers locally when developing, and a development environment where we are forced to re-deploy an image is not ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Envoy is configured to download some file from the operator. If envoy runs in kubernetes and the operator in your local machine.... what can we do to make that download happen?
I was also heavily using make run
. But the decision to serve the binary from the operator makes it very hard to keep that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to keep if you want.. but does not work for all the use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. IDK. I think debug is OK maybe for local-deploy, but deploy is a legit target that blindly applies the deploy manifests to the kubectl context. It seems risky to enable debug by default. If this eventually ends up propagated to internal components config such as auth, it would leak sensitive data to the logs by default.
I do not think make deploy
will ever be used outside development. Anyway, it is faster to update than discuss. I made the changes and make deploy
does not change. make local-deploy
patches the deployment to setup debug/development mode
make local-deploy patches the deployment to have debug log level and development log mode.
Back to draft. Offline, enhancements were asked:
|
There is currently no easy way to push the wasm binary into Envoy container using Istio API or EnvoyGateway API. Istio approachWhen using Istio Wasm OCI image API, providing a image url, Istio's proxy (aka istio-agent which is a xds proxy living in the same container as Envoy) will fetch the image and validate the sha256 checksum and then, cache it locally. The envoy configuration will look like: config:
name: istio-system.kuadrant-istio-ingressgateway
vmConfig:
runtime: envoy.wasm.runtime.v8
code:
local:
filename: /var/lib/istio/data/81221938ebcbc4550eb35f72aac65d0939d89335832b5a022226fb2526806e9e/676b5f025bb67d0993fa37dc6b4de18ca515938a5d30e7451ba745c3098e485c.wasm
configuration: {} Envoy Gateway approachRecently merged envoyproxy/gateway#3564 with the Wasm OCI feature, describes as:
Kuadrant's available optionsTherefore, the available options for kuadrant to distribute the wasm binary:
|
What
part of #325
Follow-up work after discarding #593
The rate limiting wasm binary is embedded in the kuadrant operator at build time. The kuadrant operator exposes the wasm binary in port
8082
(as a configuration parameter) at the endpoint/kuadrant-wasm-shim
. The operator's build process requires the wasm binary to be available locally. Then, the SHA256 checksum of the wasm binary is computed and stored internally in a Golang variable. In order to have the wasm binary locally available at build time, there is a new makefile target to fetch the specified version of the wasm binary from the Github release assets.The kuadrant operator exposes the wasm binary deploying a new kubernetes service called
kuadrant-operator-controller-manager-wasm-shim-service
. It looks like this (some fields were removed for simplicity)The wasm module integrates with the gateway in the data plane via
the Wasm Network filter.
The source code of the compiled Wasm binaries is hosted at
Kuadrant's Wasm-Shim project.
Currently, at runtime, the istio control plane downloads an oci wasm image. Usually from cluster external image repo like quay.io. This clearly opens a risky door to inject malicious code.
This architecture enables so-called offline or disconnected installs,
which allow having the entire cluster disconnected from the internet,
at least regarding the Wasm module.
How
Istio
The following sequence diagram shows the workflow when Envoy is managed by Istio
Verification Steps
Create a HTTPRoute to route traffic to the service via Istio Ingress Gateway:
Export the gateway hostname and port:
Verify the route works:
/kuadrant-wasm-shim
kubectl get wasmplugin kuadrant-istio-ingressgateway -n istio-system -o jsonpath="{.spec.url}"
It should return
Note that the url follows the kubernetes format
http://<service-name>.<namespace>.svc.cluster.local:<port>/<endpoint>
.kubectl get wasmplugin kuadrant-istio-ingressgateway -n istio-system -o jsonpath="{.spec.sha256}"
It should return some sha256 value (may not be the same)
The sha256 checksum value should match the one shown in the operator logs
which gives
There is no need to stop the http client... it should not be affected and rate limiting should work all the time
First delete cached wasm binary
Undeploy running kuadrant operator
Download the new wasm version v0.4.0-alpha.4
It should report the new sha256 checksum
The next command will build a new operator image with the new wasm binary and deploy it
kubectl get wasmplugin kuadrant-istio-ingressgateway -n istio-system -o jsonpath="{.spec.sha256}"
It should return some sha256 value (may not be the same)
The sha256 checksum value should match the one shown in the operator logs
which gives