Releases: kbst/terraform-kubestack
v0.16.1-beta.0
- GKE: Provide configuration variable to enable Cloud TPU feature #249
- EKS: Provision per AZ NAT gateways when opting for private IP nodes #250
- GKE: Allow configuring K8s pod and service CIDRs #251
- EKS: Add ability to specify additional tags for nodes #253 - thanks @mark5cinco
- AKS: Allow setting availability zones for the default node pool's nodes #254 - thanks @ajrpayne
- Sign images #255 and quickstart artifacts #257 with sigstore/cosign - thanks @cpanato
- EKS: Prevent provider-level tags from constantly showing up as changes for node pools #262
Upgrade Notes
No special steps required. Update the module versions and the image tag in the Dockerfile. Then run the pipeline to apply.
v0.16.0-beta.0
- GKE: Allow controlling NAT gateway IP addresses and ports per VM #213 - thanks PentoHQ and @Spazzy757
- AKS: Remove deprecated end_date_relative attribute #220
- GKE: Fix preemptible, auto_repair and auto_upgrade node pool attributes #221 - thanks @gullitmiranda
- EKS: Default to separate control plane and node pool subnets #222
- EKS: Add convenience node pool module #224
- GKE: Add convenience node pool module #229
- GKE: Fix deprecated provider attributes #231
- EKS: Support NLB in ingress-dns module #228 - thanks @markszabo
- AKS: Allow configuring whether log analytics is enabled or not #232 - thanks @to266
- AKS: Expose only critical addons taint option for default node pool #238 - thanks @to266
- EKS: Allow setting cluster endpoint access controls #234 - thanks @markszabo
- GKE: Allow setting cluster endpoint access controls #239 - thanks @markszabo
- Update CLI versions, including Terraform to v1.0.11, also fixes az CLI jsmin error #240
- AKS: Add convenience node pool module #235 - thanks @to266
- AKS: Remove in-cluster azurerm provider configuration (features attribute) #244
- AKS: Expose cluster upgrade opt-in and make channel configurable #241 - thanks @to266
- Update CLI versions, most importantly Terraform to v1.0.11 #242
- EKS: Fix DNS host recreation due to depends_on #246
Upgrade Notes
Provider versions
Both the azurerm and the google Terraform providers have breaking changes in their respective Kubernetes cluster resource attributes. This Kubestack release updates the modules to adhere to the latest provider versions. As such, you should run terraform init --upgrade to update the locked provider versions in your .terraform.lock.hcl file.
Once you do, you will also get a new version of the kustomization provider, which switches the default ID format. This switch was necessary to allow the provider to handle apiVersion updates, as required for Ingress, without recreating the resources. But the change of the ID format means the IDs change in the Terraform state, which causes an unnecessary destroy-and-recreate plan.
You have two options:
- You can follow the instructions in the Kustomization provider docs to terraform state mv the resources, after which the plan will not try to recreate them anymore.
- If you prefer to stay on the old ID format for now, you can set legacy_id_format = true in the kustomization provider blocks, usually located in the *_providers.tf files.
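For illustration, opting into the legacy format could look like the following sketch, assuming an EKS cluster module named eks_zero and a kustomization provider configured from the module's kubeconfig output; adapt it to your existing provider blocks:
provider "kustomization" {
  alias = "eks_zero"

  # keep your existing kubeconfig configuration
  kubeconfig_raw = module.eks_zero.kubeconfig

  # stay on the pre-v0.16.0 ID format to avoid a destroy-and-recreate plan
  legacy_id_format = true
}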
EKS
Split control plane and node pool subnets
Previously, the default node pool and the control plane used the same subnets. Additionally, those subnets used a /24 CIDR, which means there was only a small number of IP addresses for the control plane and the default node pool to share.
This release switches to separate control plane and node pool subnets by default. The subnet change requires the node pool to be recreated. If recreating the default node pool is not an option at this point in time, you can retain the previous behavior by setting cluster_vpc_legacy_node_subnets = true.
See the following subnet visualization for the default setup: https://www.davidc.net/sites/default/subnets/subnets.html?network=10.18.0.0&mask=16&division=23.ff4011
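As a sketch of the opt-out, assuming the default eks_zero module name and that the attribute is set per environment inside the cluster module's configuration map; the module source and all other arguments stay whatever your repository already uses:
module "eks_zero" {
  source = "___YOUR_EXISTING_MODULE_SOURCE___"

  configuration = {
    apps = {
      # ...existing cluster configuration...

      # retain the pre-v0.16.0 shared control plane and node pool subnets
      cluster_vpc_legacy_node_subnets = true
    }
    ops = {}
  }
}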
DNS host recreation
While not strictly required, this change fixes a bug that recreates the Route53 DNS hosts unnecessarily. The depends_on on the module propagates to the data sources, which causes them to only be read on apply; this makes the zone_id in the DNS hosts known-after-apply and results in a recreate plan.
To avoid this, remove the depends_on line from the eks_zero_dns_zone module in your eks_zero_ingress.tf file.
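For orientation, a hypothetical excerpt of eks_zero_ingress.tf, reduced to the relevant line; keep your module source and other arguments unchanged, and note that the actual depends_on target may differ in your file:
module "eks_zero_dns_zone" {
  source = "___YOUR_EXISTING_MODULE_SOURCE___"

  # ...existing arguments stay unchanged...

  # remove this line so the data sources are read at plan time again
  # depends_on = [module.eks_zero]
}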
AKS
In-module provider configuration
Previously, the AKS modules included an in-module provider configuration to set the provider's required features attribute. However, in-module provider configurations are highly discouraged, and in this release Kubestack removed the in-module configuration for the azurerm provider from the AKS module.
This means that, during the upgrade, AKS users have to add the following to their aks_zero_providers.tf:
provider "azurerm" {
features {}
}
Azure log analytics
Making the Azure log analytics configurable for the AKS module required making the azurerm_log_analytics_workspace and azurerm_log_analytics_solution resources conditional using count. This change may show up as a one-time "outside of Terraform" change when upgrading existing AKS configurations to Kubestack version v0.16.0-beta.0. Log analytics can be enabled using enable_log_analytics = true, the default, or disabled using enable_log_analytics = false.
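As a sketch, assuming the toggle is set per environment in the AKS cluster module's configuration map; where exactly you set it may differ in your repository:
module "aks_zero" {
  source = "___YOUR_EXISTING_MODULE_SOURCE___"

  configuration = {
    apps = {
      # ...existing cluster configuration...

      # true is the default; set to false to disable the log analytics
      # workspace and solution resources
      enable_log_analytics = true
    }
    ops = {}
  }
}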
v0.15.1-beta.1
EKS: Bugfix release: Update EKS-D image to Kubernetes v1.21.1
v0.15.1-beta.0
- EKS: Support for aws-cn or aws-us-gov partitions #201 - thanks @jdmarble
- Remove cluster services module after it had been removed in v0.15.0 #203
- GKE: Replace data external/sh with dedicated data source to get user #204
- GKE: Truncate metadata_name to meet google_service_account account_id max length #205 - thanks @Spazzy757
- Add tests for non string value inheritance in configuration module #206
- AKS: Allow enabling Azure policy agent, disabled by default #207 thanks @feend78
- EKS: Allow custom CIDRs for the VPC and its per AZ default subnets #208
- GKE: Allow customizing location and taints for additional GKE node pools #209
- AKS: Remove deprecated azuread_service_principal_password value attribute #210
Upgrade Notes
AKS
When upgrading an AKS cluster, the Azure policy agent block, even if disabled by default, requires applying with --target:
terraform apply --target module.aks_zero
v0.15.0-beta.0
- Deprecate cluster-manifests in favor of cluster-service-modules #199
- EKS: Allow setting Kubernetes version - thanks @Spazzy757 #188
- AKS: Allow setting Kubernetes version #200
Upgrade Notes
Like any Kubestack upgrade, change the version of your cluster module(s) and the image tag in the Dockerfiles. The deprecation of the cluster-manifests additionally requires manual changes. While we strive to avoid high effort migrations like this, in this case the benefits drastically outweigh the downsides. Because Kubestack is still in beta, we also decided not to take on the long term maintenance effort of providing a backwards compatible alternative for such a significant change.
The previous approach of defining Kustomize overlays under manifests/ and having the cluster modules implicitly provision the resulting Kubernetes resources had two major drawbacks:
- It required every team member to learn about both Terraform and Kustomize. And their opposing paradigms caused significant mental overhead for every change.
- It also meant that, because manifests were defined using YAML, it was not easily possible to customize the Kubernetes resources based on values coming from Terraform.
With this release, Kubestack cluster modules do not provision the YAML in manifests/ implicitly anymore. Instead, all catalog services are now available as first class Terraform modules. In addition, there is a new custom-manifests module that can be used to provision your bespoke YAML in the same way as the modules from the catalog.
This change simplifies the mental model: both clusters and cluster services are now simply Terraform modules. At the same time, because cluster-service-modules still use Kustomize under the hood, the benefit of low effort maintenance when following new upstream releases is preserved. And because the overlay is now defined dynamically using HCL, you can fully customize all Kubernetes resources with Terraform values.
To learn more about how these modules allow full customization of the Kubernetes resources from Terraform, check the detailed documentation.
Overview
There are three cases to consider for this upgrade:
- For services from the catalog, migrate to using the dedicated module. Usage instructions are provided on each service's catalog page.
- For bespoke YAML, consider using the custom-manifests module or use the Kustomization provider directly. The module uses the explicit depends_on approach internally and simplifies the Terraform configuration in your root module. It is possible to use the custom-manifests module to apply an entire overlay from manifests/overlays as is. But it is recommended to call the module once for each base instead, to clearly separate independent Kubernetes resources from each other in the Terraform state.
- For the ingress setup, migrating to the dedicated module requires using the nginx ingress cluster service module and setting it up to integrate with the default IP/DNS ingress setup. Refer to the AKS, EKS and GKE specific migration instructions for the required code changes.
Migration strategies
In all three cases, Terraform will generate destroy-and-recreate plans for all affected Kubernetes resources, because even though the Kubernetes resources don't change, their location in the Terraform state changes. You have two options here. You can either plan a maintenance window and run the destroy-and-recreate apply. Or you can manually terraform state mv resources in the state until Terraform no longer generates a destroy-and-recreate plan, and only then run the apply.
Ingress specifics
For the ingress migration, you will in any case have a small service disruption, because the service of type LoadBalancer needs to be switched to fix the issue of two cloud loadbalancers being created. This destroys the old loadbalancer, creates a new one and switches DNS over. During testing, this caused 5-10 minutes of downtime. For critical environments and least disruption, we suggest lowering the DNS TTL ahead of time and letting the decrease propagate before starting the migration during a maintenance window. In AWS' case, the new ELB gets a new CNAME. Since Kubestack uses Route53 alias records, the disruption is kept to a minimum. For both Azure and Google Cloud, Kubestack uses reserved IP addresses. These will be reused by the new cloud loadbalancers, leaving DNS unchanged. But for all three providers, it will take a bit of time until the new loadbalancers are back in service.
v0.14.1-beta.0
- Updates Terraform to v0.15.1 to use rotated key for validating providers #190
Upgrade Notes
For v0.14.x users
Update the version in Dockerfile, Dockerfile.loc and clusters.tf to v0.14.1-beta.0.
For users of previous Kubestack releases
Users of Kubestack releases before v0.14 who are not ready to update to this latest Kubestack release can manually update the Terraform version they use to a Terraform patch release that matches the minor release bundled with their current Kubestack release.
First, select the updated Terraform version for your Kubestack release:
- Kubestack v0.10.0-beta.0: update Terraform to v0.12.31
- Kubestack v0.11.0-beta.0 to v0.12.1-beta.0: update Terraform to v0.13.7
- Kubestack v0.12.2-beta.0 to v0.13.0-beta.1: update Terraform to v0.14.11
Then edit your Dockerfile to download the new Terraform binary in a temporary stage and copy the binary over to your final stage.
# Add before FROM kubestack/framework
FROM python:3.8-slim as tfupgrade
RUN apt-get update && apt-get install -y curl unzip
ARG TERRAFORM_VERSION=___DESIRED_TERRAFORM_VERSION___
RUN echo "TERRAFORM_VERSION: ${TERRAFORM_VERSION}" \
&& curl -LO https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip \
&& unzip terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /tmp \
&& chmod +x /tmp/terraform \
&& /tmp/terraform version
# Add after FROM kubestack/framework
COPY --from=tfupgrade /tmp/terraform /opt/bin/terraform
v0.14.0-beta.1
- Upgrade to Terraform v0.15.0 and refactor provider handling #179
- Migrate from strict versioning inside modules to the new .terraform.lock.hcl approach
- Remove vendored providers from container images deb7e56
- Updated CLI versions in container image
- AKS: Fix AKS vnet names to support multiple clusters and environments in the same resource group #175
- Evaluate testing for configuration and metadata modules using the new terraform test feature #185
Upgrade Notes
AKS
For clusters provisioned using the Azure network plugin (network_plugin = "azure"), the changed virtual network (vnet) and subnet names will trigger a destroy-and-recreate plan. To keep the legacy names for existing clusters that you do not wish to recreate, set legacy_vnet_name = true.
v0.13.0-beta.1
v0.13.0-beta.0
- EKS: Provision OIDC provider to support IAM roles for service accounts
- GKE: Enable workload identity support by default
- AKS: Use managed identities instead of service principals by default
- AKS: Add support to set the sku_tier - thanks @to266
- Preparations towards a stable interface to extend Kubestack with bespoke modules
- All cluster modules now provide current_config, current_metadata and kubeconfig outputs
- The inheritance module now supports maps of objects, not just maps of strings
Upgrade Notes
Please note that Pod-managed identities, the Azure equivalent of Google workload identity and AWS IAM roles for service accounts, are only in preview and not yet available through the Azure Terraform provider. As such, this release only adds support for improved IAM integration on GKE and EKS.
EKS
Migrate to IAM roles for service accounts (IRSA)
The EKS module now provisions an OIDC provider by default to enable IAM roles for service accounts. Users who do not wish to provision the OIDC provider can disable this by setting disable_openid_connect_provider = true.
Using IRSA, in addition to the OIDC provider, requires configuring IAM roles and Kubernetes annotations. Refer to the EKS documentation for more details. This feature alone does not prevent workloads on the cluster from assuming node roles.
GKE
Migrate to workload identities
The GKE module now enables workload identity support on the cluster by default and sets the workload metadata config for the default node pool to GKE_METADATA_SERVER.
Terraform generates an in-place update but the migration may cause disruptions if workloads currently depend on node roles. Please refer to the GKE documentation for more information.
To retain the previous behaviour, Kubestack users can set disable_workload_identity = true, or alternatively leave workload identity enabled but set the workload metadata config explicitly, e.g. node_workload_metadata_config = "EXPOSE".
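As an illustration of both options, assuming the default gke_zero module name and that the attributes are set per environment in the cluster module's configuration map; adapt the sketch to your repository:
module "gke_zero" {
  source = "___YOUR_EXISTING_MODULE_SOURCE___"

  configuration = {
    apps = {
      # ...existing cluster configuration...

      # option 1: retain the previous behaviour and disable workload identity
      disable_workload_identity = true

      # option 2: instead, keep workload identity enabled but expose the
      # node metadata explicitly
      # node_workload_metadata_config = "EXPOSE"
    }
    ops = {}
  }
}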
AKS
Migrate from service principals to managed identities
AKS requires a service principal to grant the Kubernetes control plane access to the Azure API, e.g. to provision loadbalancers. AKS has moved from manual service principals to managed identities. From the Azure documentation:
Clusters using service principals eventually reach a state in which the service principal must be renewed to keep the cluster working. Managing service principals adds complexity, which is why it's easier to use managed identities instead.
The change in Terraform triggers a destroy-and-recreate plan for the cluster and its dependencies.
Users may be able to migrate by following the Azure documentation to manually migrate from service principals to managed identities using the az CLI, before running the Terraform apply. At the time of writing, the upgrade is marked as a preview feature. Please refer to the upstream documentation, but the following worked during release testing:
az extension add --name aks-preview
# needs Owner permissions
az aks update -g RESOURCE_GROUP -n CLUSTER_NAME --enable-managed-identity
az aks nodepool upgrade --cluster-name CLUSTER_NAME --name default --node-image-only --resource-group RESOURCE_GROUP
After that, a Terraform plan will merely destroy the old service principal related resources, but not recreate the cluster.
Kubestack AKS users who do not wish to migrate and want to retain the previous configuration using manually maintained service principals can set disable_managed_identities = true instead of doing the manual migration.
v0.12.2-beta.0
- Updated Terraform to v0.14.5 #160
- Updated CLI and Terraform provider versions #158
- Clarified incorrect variable descriptions #152, #154 thanks @soulshake