- TKG 1.4 with NSX-ALB (vSphere Network)
- Important note
- NSX ALB deployment
- Preparing an Ubuntu VM
- Tanzu Kubernetes Grid Deployment
- Tanzu Kubernetes Grid Post Deployment
- Deploying Guest Clusters
==Please make sure to read the official VMware documentation as well, especially the one related to the specific version you're going to deploy. Tanzu Kubernetes Grid is a work in progress and many procedures change between releases==
NSX ALB (also known as AVI Networks) is a distributed load balancer that can be used for Tanzu environments. It will provide VIPs for the Kubernetes control plane as well as for any application that requires a service of type "LoadBalancer". It replaces both HA-Proxy and MetalLB for Tanzu with the vSphere networking configuration.
NSX Advanced Load Balancer includes the following components:
- Avi Controller manages VirtualService objects and interacts with the vCenter Server infrastructure to manage the lifecycle of the service engines (SEs). It is the portal for viewing the health of VirtualServices and SEs and the associated analytics that NSX Advanced Load Balancer provides. It is also the point of control for monitoring and maintenance operations such as backup and restore.
- Avi Kubernetes Operator (AKO) is a Kubernetes controller that each cluster runs on one of its nodes. Each AKO pod uses its cluster's Kubernetes API to watch for changes in the cluster's LoadBalancer and Ingress specifications, or other relevant custom resource definitions. When the AKO detects a change, it calls the Avi Controller API to make the change in the Avi resources, for example create a new load balancer VirtualService object and connect it with pods running in the cluster.
- AKO Operator on the management cluster manages the lifecycle and configuration of the AKO on each workload cluster, and can make runtime changes to the AKO configuration.
- Service Engines (SE) implement the data plane in a VM.
- SE Groups group Service Engines into isolated sets, for example to dedicate them to specific namespaces. This lets you control SEs collectively and set maximum SE counts for different resource types, such as CPU and Memory.
As a picture is worth a thousand words, here is a diagram of the network configuration as deployed in our lab:
The following networks are configured on our vSphere / NSX environment to host Tanzu Kubernetes Grid:
Name | Role | Type | CIDR | DHCP |
---|---|---|---|---|
mgmt-k8s-ingress | Front-End Network | NSX-T Overlay | 10.30.230.64/27 | No |
demo-tkg-12 | Management K8s | NSX-T Overlay | 10.30.231.0/24 | Yes |
Management Network | Management VMware | VLAN | 10.30.224.0/25 | No |
It's important to enable DHCP on the Management K8s network (demo-tkg-12 here), so that Kubernetes nodes (both master and worker) as well as AVI Service Engines can receive an IP address.
This section describes the configuration of each segment created for TKG deployment (including NSX ALB requirements)
The front-end network will host every VIP related to TKG, whether for the control plane of the Kubernetes clusters or for applications deployed on them.
Here, we are using mgmt-k8s-ingress as this segment / port group.
The Management K8s network will be used as the default network for every VM that Tanzu deploys (Kubernetes master nodes, Kubernetes worker nodes, and AVI Service Engines (SE)).
As VMs on this network are provisioned using Cluster API (CAPV), it requires a DHCP server with both DNS and NTP options configured.
In our lab, we are using the demo-tkg-12 NSX-T segment for this purpose, with the following DHCP configuration:
The Management Network is used to provide management IP addresses to the AVI controllers as well as the AVI Service Engines (SE).
In our lab environment, this network is the VLAN that also carries the vCenter and ESXi management addresses; the port group is named Management Network-08877285-62bb-4fa2-9ff2-e63c81af33a3.
We need at least 6 IP addresses available on this network:
- 4 for AVI Controllers (3 controllers + 1 VIP)
- 2 for AVI SE (a pair of SEs is created for each Kubernetes cluster deployed)
Before deploying, create the DNS records. They are mandatory for a working cluster, especially when you want to upgrade it.
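For example, you can quickly check from the bootstrap VM that the records resolve before going further (the hostnames below are placeholders for your own controller names and cluster VIP):

nslookup avi-ctrl-01.your.domain
nslookup avi-ctrl-02.your.domain
nslookup avi-ctrl-03.your.domain
nslookup avi-ctrl-vip.your.domain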
- Create the DNS records
- Download the AVI Controller OVA from the VMware site (TKG download page). Make sure to download the version specified in the TKG documentation (version 20.1.6 or 20.1.3 for TKG 1.4)
- Download the latest patch corresponding to the release of NSX ALB you selected
- Deploy the same OVA three times to form a cluster
- Power On each controller
- Using your browser, connect to one controller
- Set the new admin password
- Enter the passphrase, DNS resolver, DNS Search Domain, SMTP information
- On the Multi-Tenant part, leave the default settings
Make sure to do this for only one controller before going to the next step
- Connect to the controller you previously configured via its web admin interface
- Go to Administration > Controller > Nodes and click the Edit button
- Fill the form to create the cluster
- Wait 5 minutes for the process of cluster creation to start
Once the cluster is running, we can patch it with the latest version we found on the NSX ALB download site.
- Go to Administration > Controller > Software. And click Upload From Computer
- Select the patch you previously downloaded
- Once the transfer is complete, go to Administration > Controller > System Update
- Select the patch and click on Upgrade
- Leave all options at their defaults and click on Continue, then Confirm
The next step is to configure the Cloud; in our case, our vCenter and the networks related to it. The Management Network we are going to configure in this wizard will be used as the management interface for all AVI Service Engines (SE).
- Go to Infrastructure > Clouds. Click on Create > VMware vCenter/vSphere ESX
- Fill in the name (the name of the vCenter in our case)
- Fill in the login information and leave all other options at their defaults.
- Select the Datacenter and leave all other options by default
- Select the Management Network, the Default Service Engine Group, and fill in the information for the management IP addresses that will be used for the AVI SEs.
==For each subnet to be configured on NSX ALB, use the real network CIDR instead of the one shown in the NSX-T interface (which is the gateway address, not the network CIDR)==
In this step, we are going to create the IPAM and DNS profiles. The IPAM profile will be used to provide VIP addresses on the front-end network (mgmt-k8s-ingress).
- Go to Templates > Profiles > IPAM/DNS profiles and click on Create > IPAM Profile
- Fill in the name and click on + Add Usable Network and select the front-end Network
- Go to Templates > Profiles > IPAM/DNS profiles and click on Create > DNS Profile
- Give a name to the template and insert the domain name you want to use
- Go back to Infrastructure > Clouds and click on the Cloud we previously created.
- On the Infrastructure tab, select the IPAM and DNS profiles we just created.
- Go to Infrastructure > Networks > Select Cloud and click on the edit button of the Front-End network (mgmt-k8s-ingress)
- Click on +Add Subnet, and fill in the CIDR of the Front-End network (10.30.230.64/27)
- Then click on +Add Static IP Address Pool, and fill in the IP range you want for your Front-End IP addresses (10.30.230.66-10.30.230.90)
- Go back to Infrastructure > Networks > Select Cloud and click on the edit button of the Management K8s network (demo-tkg-12)
- Tick the "DHCP Enabled" box
AVI Service Engines will be deployed automatically through AKO. We need to configure a few settings to make sure the SEs are deployed on the correct datastore, with a name prefix we choose.
- Go to Infrastructure > Service Engine Group, select the Cloud (at the top of the page) and edit the Default-Group
- Click on the Advanced Tab
- Select the prefix you want to use, the vSphere Cluster, and the datastore
Follow VMware's documentation for the certificate. There is no trap for this one
You can use either the Essentials or the Enterprise license tier.
If you have the Enterprise license, you'll be able to use these features:
- DNS Delegation
- L7 Ingress through NSX ALB
- GSLB
- ...
Follow VMware's documentation for License. There is no trap for this one
- Go to Infrastructure > Routing > Select Cloud : Your Cloud and click Create
- Insert 0.0.0.0/0 as the Gateway subnet and insert the gateway of the Front-End subnet as the Next Hop
- First we need to update our repositories and install prerequisites
sudo apt-get update
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
- Add Docker’s official GPG key:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
- Use the following command to set up the stable repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- Update the apt package index, and install the latest version of Docker Engine and containerd
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
- Add your user to the docker group
sudo usermod -aG docker $USER
- Log out and log back in so that your group membership is re-evaluated
- Verify that you can run docker commands without sudo
docker run hello-world
- Install helm following the official documentation : https://helm.sh/docs/intro/install/
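If you want a quick way to do it, the official installer script is one of the options described in the Helm documentation:

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 -o get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh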
To enable autocompletion of kubectl, tanzu and helm commands, run the following commands (make sure you have executed kubectl and tanzu at least once before):
vmware@tkg-jump:~$ sudo apt install bash-completion
vmware@tkg-jump:~$ kubectl completion bash > ~/.kube/completion.bash.inc
printf "
# Kubectl shell completion
source '$HOME/.kube/completion.bash.inc'
" >> $HOME/.bash_profile
source $HOME/.bash_profile
vmware@tkg-jump:~$ tanzu completion bash > ~/.kube-tkg/completion.bash.inc
printf "
# Tanzu shell completion
source '$HOME/.kube-tkg/completion.bash.inc'
" >> $HOME/.bash_profile
source $HOME/.bash_profile
vmware@tkg-jump:~$ mkdir ~/.helm
vmware@tkg-jump:~$ helm completion bash > ~/.helm/completion.bash.inc
printf "
# Helm shell completion
source '$HOME/.helm/completion.bash.inc'
" >> $HOME/.bash_profile
source $HOME/.bash_profile
To deploy your first TKG management cluster, you need:
- NSX ALB deployed and configured
- A bootstrap VM with all the required tools installed (docker, kubectl...). We use an Ubuntu VM in our scenario
First we are going to use the UI installer, as it helps us fill in the YAML file that configures the management cluster. Instead of finishing and running the installation from the UI, we will use the command line it provides to actually launch the deployment. This gives better verbosity on what happens in the background.
If you run the UI installer on a VM or a Linux machine that doesn't have a browser, it can be helpful to forward port 8080 of the bootstrap VM to your machine. To do so, execute the following command on your machine:
ssh -L 8080:localhost:8080 vmware@10.30.228.17
To create the SSH key pair that will be used for Tanzu, follow these steps on a Linux machine:
ssh-keygen -t ed25519 -C "vmware@tkg" -f ~/tkg
Both public and private keys are created in the home folder under the names tkg and tkg.pub
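To display the public key so you can copy it into the installer UI later:

cat ~/tkg.pub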
- On the bootstrap VM, execute the following command :
tanzu management-cluster create --ui
- Access the UI with a web browser http://localhost:8080/#/ui (either on your machine or on the bootstrap VM. See Section SSH Port Forwarding)
- Fill in the vCenter information and credentials, as well as the public key you created on section SSH Key
- Select Deploy TKG Management Cluster
- Click on the tile you want (either Development or Production), and select the Instance Type. Select NSX Advanced Load Balancer as the Control Plane Endpoint Provider
- Insert the FQDN of the AVI Controller VIP, the username and password. Add the Root CA certificate that was used to sign the AVI controller certificate in section Certificate
- Add Metadata (not mandatory)
- Select the VM Folder where the K8s VMs will be deployed, as well as the Datastore and Resource Pool
- Select the Network that will be used for Kubernetes VMs. It's our Management K8s network here, the one that requires a DHCP
- Choose if you want to have LDAP or OIDC authentication here.
- Select the OS image you want to use
- (Optional) Register with TMC
- Click on Review Configuration
For LDAPS Identity Management, it's important to specify UserPrincipalName, even though it is optional. If you don't, the Pinniped deployment won't work
Clicking Review Configuration automatically creates the YAML file that contains all the information you provided via the UI. We are going to use it to deploy the management cluster with more verbosity. Copy the CLI command at the bottom of the page, then paste and execute it on your bootstrap VM.
Example :
tanzu management-cluster create --file /home/vmware/.config/tanzu/tkg/clusterconfigs/201puronh8.yaml -v 6
Once you execute this command, it will ask you two questions:
- Reply N for the first one (Do you want to configure vSphere with Tanzu?)
- Reply Y for the second (Would you like to deploy a non-integrated Tanzu Kubernetes Grid management cluster on vSphere 7.0? [y/N])
After the management cluster has been deployed, you need to execute a few commands to be able to connect to it, especially when you have deployed it with OIDC or LDAPS (Pinniped).
- Get the admin context
tanzu management-cluster kubeconfig get tkg-mgmt --admin
- The command line will return the command to authenticate to the management cluster as admin (creation of a Kubernetes context)
- Execute this command, e.g.
kubectl config use-context tkg-mgmt-vsphere-20220106172239-admin@tkg-mgmt-vsphere-20220106172239
- After connecting to the management cluster as admin (Connect to the Management Cluster)
- Verify that all apps are reconciled successfully
kubectl get apps -A
After deploying the management cluster, we need to create a load balancer service for Pinniped.
- Create a file pinniped-supervisor-svc-overlay.yaml with the following content:
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "pinniped-supervisor", "namespace": "pinniped-supervisor"}})
---
#@overlay/replace
spec:
  type: LoadBalancer
  selector:
    app: pinniped-supervisor
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: 8443
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "dexsvc", "namespace": "tanzu-system-auth"}}), missing_ok=True
---
#@overlay/replace
spec:
  type: LoadBalancer
  selector:
    app: dex
  ports:
    - name: dex
      protocol: TCP
      port: 443
      targetPort: https
- Convert the file into a base64-encoded string:
cat pinniped-supervisor-svc-overlay.yaml | base64 -w 0
- Get the name of the pinniped-addon secret:
kubectl get secrets -n tkg-system | grep pinniped-addon
- Patch the pinniped-addon secret, which contains the Pinniped configuration values, with the overlay values (replace mgmt-pinniped-addon with the secret name returned by the previous command, and OVERLAY-BASE64 with the base64 output from the earlier step):
kubectl patch secret mgmt-pinniped-addon -n tkg-system -p '{"data": {"overlays.yaml": "OVERLAY-BASE64"}}'
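As a sketch, the previous three steps can also be chained in a single shell sequence (the secret name will differ in your environment):

SECRET=$(kubectl get secrets -n tkg-system -o name | grep pinniped-addon | cut -d/ -f2)
OVERLAY=$(base64 -w 0 < pinniped-supervisor-svc-overlay.yaml)
kubectl patch secret "$SECRET" -n tkg-system -p "{\"data\": {\"overlays.yaml\": \"$OVERLAY\"}}"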
- After a few seconds, list the pinniped-supervisor (and dexsvc if using LDAP) services to confirm that they now have type LoadBalancer:
kubectl get services -n pinniped-supervisor
- Delete pinniped-post-deploy-job to re-run it:
kubectl delete jobs pinniped-post-deploy-job -n pinniped-supervisor
- Wait for the Pinniped post-deploy job to re-create, run, and complete, which may take a few minutes. You can check its status with kubectl get job:
kubectl get job pinniped-post-deploy-job -n pinniped-supervisor
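If you prefer a blocking command instead of polling, kubectl wait can be used (the timeout value is arbitrary):

kubectl wait --for=condition=complete job/pinniped-post-deploy-job -n pinniped-supervisor --timeout=10m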
If you use NSX ALB with the Enterprise edition, you can enable the DNS service on it and delegate part of your domain to it, so that the DNS records required when using Ingress services are created automatically.
Follow these steps to enable the DNS Service on AVI
- Go to Administration > Settings > DNS Service > Create Virtual Service
- Specify a name, the network where the VIP of the DNS server will reside (mgmt-k8s-ingress), and its IPv4 subnet. Check the "Ignore network reachability constraints for the server pool" box
- Click on Next on all other pages without changing the default configuration
- Verify the VIP that was chosen for the DNS Service under Applications > VS VIPs
This section needs to be completed
Once the DNS server is created on AVI, or external-dns is configured, we need to create a new zone with delegation on the Windows DNS server:
- Open the DNS Manager
- Expand the Forward Lookup Zone where you want to create your subzone, right-click on it and select New Delegation
- Fill in the name of the subzone you want to create
- Add the FQDN and the IP address of the DNS server which will manage this zone (for example, the AVI VIP)
Do not pay attention to the error message about failed validation in the last step; the VIP address for DNS on the AVI controller is not pingable
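To confirm the delegation works end to end, you can query the AVI DNS virtual service directly and then through your corporate DNS (the VIP, record name and domain below are lab placeholders; DNS answers even though the VIP does not reply to ping):

nslookup test-app.tkg.your.domain 10.30.230.66
nslookup test-app.tkg.your.domain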
You can configure how NSX ALB (through AKO) will be automatically set up on TKG guest clusters. The main scenario where you'll want to do this is when you want to use NSX ALB as the Ingress controller of your K8s clusters.
There are 3 ways to configure NSX ALB as the default L7 ingress for Kubernetes:
- L7 Ingress in ClusterIP Mode
- L7 Ingress in NodePortLocal Mode (Not yet supported by VMware)
- L7 Ingress in NodePort Mode
As the 2nd method using NodePortLocal mode is not supported by VMware, we won't describe it.
Methods 1 and 3 have their pros and cons.
L7 Ingress in ClusterIP Mode
Pros
- ClusterIP used, so no need to change parameters when deploying apps through helm charts (default mode)
- Performance (an AVI SE for each Workload Cluster)
Cons
- Each SE group can only be used by one workload cluster, so you need a dedicated AKODeploymentConfig per cluster for AKO to work in this mode.
L7 Ingress in NodePort Mode
Pros
- An AVI SE Group can host multiple TKG Guest Clusters
Cons
- The services of your workloads must be set to NodePort instead of ClusterIP even when accompanied by an ingress object. This ensures that NodePorts are created on the worker nodes and traffic can flow through the SEs to the pods via the NodePorts.
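As an illustration, many public Helm charts expose the service type as a value you can override at install time; the exact parameter name depends on the chart (bitnami/nginx is only used here as an example):

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-app bitnami/nginx --set service.type=NodePort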
- Make sure you use an Enterprise License on your AVI Controller
- Create an AVI SE Group for each TKG Guest Cluster that will require AVI as the Ingress controller
- Set the context of kubectl to your management cluster
kubectl config use-context tkg-mgmt-vsphere-20220106172239-admin@tkg-mgmt-vsphere-20220106172239
- Create an AKODeploymentConfig YAML file for the new configuration. Set the parameters as shown in the following sample:
apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  name: ako-shared-svc                        # Change name
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: vcf-lab-m1-vc                    # Change Cloud
  clusterSelector:
    matchLabels:
      ako-l7-shared-svc: "true"               # Change LABEL
  controller: vcf-lab-lb-ctr-vip.sddc.cce.ge  # Change IP
  dataNetwork:
    cidr: 10.30.230.64/27                     # Use CIDR of Front-End Network
    name: mgmt-k8s-ingress                    # Use Front-End Network
  extraConfigs:
    disableStaticRouteSync: false             # required
    image:
      pullPolicy: IfNotPresent
      repository: projects.registry.vmware.com/tkg/ako
      version: v1.3.2_vmware.1
    ingress:
      disableIngressClass: false              # required
      nodeNetworkList:                        # required
        - cidrs:
            - 10.30.231.0/24                  # Use CIDR of K8s Management Network
          networkName: demo-tkg-12            # Use K8s Management Network
      serviceType: ClusterIP                  # required
      shardVSSize: MEDIUM                     # required
  serviceEngineGroup: tkg-shared-svc-se       # Change SE Group
The label under clusterSelector.matchLabels (ako-l7-shared-svc: "true" in this sample) is the label and value you will apply to a workload cluster in a later step to assign this configuration to it.
- Apply the file to create the AKODeploymentConfig
kubectl apply -f ./ako-shared-svc.yaml
- Label one of your workload clusters to match the selector. Do not label more than one workload cluster.
kubectl label cluster tkg-shared-svc ako-l7-shared-svc="true"
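You can verify that the label was applied with:

kubectl get clusters --show-labels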
- Set the context of kubectl to the workload cluster.
kubectl config use-context tkg-shared-svc-admin@tkg-shared-svc
- Run the following command to verify that the serviceType changed from NodePort to ClusterIP (it can take a few minutes):
watch "kubectl get cm avi-k8s-config -n avi-system -o=jsonpath='{.data.serviceType}'"
- Delete the AKO pod so it redeploys and reads the new configuration file.
kubectl delete pod ako-0 -n avi-system
- In the Avi Controller UI, go to Applications > Virtual Services to see the L7 virtual service that was created for the cluster.
- Gather a wildcard certificate for the subzone you created on Configure DNS Delegation (NSX ALB or External-DNS)
- Convert the certificate and the key to base64
- Extracting the private key from a PFX
openssl pkcs12 -in wild-tkg-mgmt.pfx -nocerts -out wild.key
- Decrypting the private key
openssl rsa -in wild.key -out wild.key
- Extracting the certificate from the PFX
openssl pkcs12 -in wild-tkg-mgmt.pfx -clcerts -nokeys -out wild.pem
- Convert to a single-line base64 string (hence -w 0) to be used in the YAML file
cat wild.pem | base64 -w 0
cat wild.key | base64 -w 0
- Create a secret with the name router-certs-default in the same namespace where the AKO pod is running (avi-system). Ensure that the secret has tls.crt and tls.key fields in the data section.
apiVersion: v1
kind: Secret
metadata:
  name: router-certs-default
  namespace: avi-system
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS1CRUdJTiB...LS0=
  tls.key: LS0tLS1CRUdJTi...tCg==
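Alternatively, assuming the PEM files extracted above, kubectl can create the same secret directly without manual base64 handling:

kubectl create secret tls router-certs-default -n avi-system --cert=wild.pem --key=wild.key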
- Add the annotation ako.vmware.com/enable-tls to the required Ingresses and set its value to "true"
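As an illustration, a minimal Ingress carrying this annotation could look like the following (the name, host and backend service are placeholders for your own application):

cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    ako.vmware.com/enable-tls: "true"
spec:
  rules:
    - host: my-app.tkg.your.domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
EOF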