We had a K8ssandra cluster with Medusa enabled deployed in AKS 1.29.
The cluster admin upgraded AKS to 1.30; after that, k8ssandra-operator was in CrashLoopBackOff and the K8ssandra pods were not starting.
The K8ssandra pods finally started 4.5 hours after the AKS upgrade.
Did you expect to see something different?
I expect a Kubernetes upgrade not to cause K8ssandra unavailability.
How to reproduce it (as minimally and precisely as possible):
1. Deploy an AKS 1.29 cluster
2. Deploy k8ssandra-operator v1.14
3. Deploy a K8ssandra cluster with at least 3 replicas and Medusa backups enabled
4. Create a MedusaBackupSchedule or create MedusaBackupJobs
5. Delete the node pool containing the Cassandra pods, so that all Cassandra pods are evicted
6. k8ssandra-operator should crash when it attempts to reconcile a MedusaBackupJob for this K8ssandra cluster
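For step 4, a minimal MedusaBackupSchedule along these lines is enough to trigger periodic MedusaBackupJobs (names and values here are illustrative, not the manifests from the affected cluster; field layout follows the k8ssandra Medusa docs):

```yaml
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupSchedule
metadata:
  name: medusa-backup-schedule   # illustrative name
  namespace: k8ssandra-operator  # adjust to your install
spec:
  cronSchedule: "0 1 * * *"      # nightly backup
  backupSpec:
    backupType: differential
    cassandraDatacenter: dc1     # your CassandraDatacenter name
```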
Error:

```
15m Warning FailedCreate statefulset/cs-1855ea8cc2-cs-1855ea8cc2-default-sts create Pod cs-1855ea8cc2-cs-1855ea8cc2-default-sts-0 in StatefulSet cs-1855ea8cc2-cs-1855ea8cc2-default-sts failed error: Internal error occurred: failed calling webhook "mpod.kb.io": failed to call webhook: Post "https://c3aiops-k8ssandra-operator-webhook-service.c3-opsadmin.svc:443/mutate-v1-pod-secrets-inject?timeout=10s": no endpoints available for service "c3aiops-k8ssandra-operator-webhook-service"
```
We didn't find any pod with a backupSummary, but we also didn't encounter any errors, so createMedusaBackup is called with a nil pointer for backupSummary.
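The missing guard can be sketched as follows; the type and function names are hypothetical stand-ins for the operator's internals, not its actual API. The point is that when no pod returned a summary, the reconcile should fail explicitly instead of dereferencing nil:

```go
package main

import (
	"errors"
	"fmt"
)

// BackupSummary stands in for the summary the operator collects from a
// Cassandra pod; the fields are illustrative.
type BackupSummary struct {
	BackupName string
}

// createMedusaBackup sketches the guard this issue asks for: reject a nil
// summary with an error so the reconcile loop can requeue, rather than
// panicking with a nil pointer dereference.
func createMedusaBackup(summary *BackupSummary) error {
	if summary == nil {
		return errors.New("no backup summary found: cannot create MedusaBackup")
	}
	fmt.Printf("creating MedusaBackup %s\n", summary.BackupName)
	return nil
}

func main() {
	// Simulate the failure mode from this issue: all Cassandra pods were
	// evicted, no summary was collected, and nil is passed down.
	if err := createMedusaBackup(nil); err != nil {
		fmt.Println("requeue with error:", err)
	}
}
```

Returning an error here lets controller-runtime requeue the MedusaBackupJob until pods are back, instead of crash-looping the whole operator.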
Environment
K8ssandra Operator version:
1.14.0
Kubernetes version information:
1.29 -> 1.30
Kubernetes cluster kind:
AKS
Manifests:
K8ssandraCluster manifest (after recovery):
MedusaBackupSchedule:
Logs once k8ssandra-operator reached a Running state after the above crash:
Anything else we need to know?:
I checked the code, and I believe all k8ssandra-operator versions are impacted, not only 1.14.
┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: K8OP-294