Important: If any commands require sudo privileges and your user doesn't have passwordless sudo enabled, copy the commands from the Makefile and run them in your favorite shell.
Important: This was set up as a proof of concept of a production system. For local development, use a configuration with 1 control-plane and 2 worker nodes so RAM usage stays under control (assuming the system has at least 5 GB of RAM available).
Important: For any custom configurations, rename `.env.template` to `.env` and use it with the `Makefile`. Look for commands with `custom` in the name.
Important: The dockerized GitLab Omnibus installation provided here is for demonstration purposes only and is not suitable for production use. Harden the `gitlab.rb` settings and ensure encryption and SSL before setting up a standalone GitLab Omnibus deployment.
Install Docker with,
make install-docker
Install Go with,
make install-go
Set the Go paths in your `.bashrc` or `.zshrc`,
export PATH=$PATH:/usr/local/go/bin
export GOPATH=$HOME/go
Install KinD with,
make install-kind
If you are bootstrapping the KinD cluster with more than 100 docker.io image pulls in a span of 6 hours, you'll hit the Docker Hub pull limit (the images are pulled anonymously, so the 200 pulls per logged-in session won't apply). Another case is that you may want to load custom images directly into the cluster without going through a Docker image registry. Note the `imagePullPolicy` setting: it shouldn't be `Always`, and images shouldn't use the `latest` tag.
In this case, the easiest solution is to pull all Docker images to your local machine and load them into the KinD cluster with,
kind load docker-image IMAGE_NAME:TAG
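For example (the image name and cluster name below are placeholders), pulling an image locally and loading it into a named cluster looks like,
docker pull nginx:1.21
kind load docker-image nginx:1.21 --name YOUR_CLUSTER_NAME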
The longest but safest option in the long run (I trust you NOT to use self-signed certs and distribute them with `kind-config.yaml` in any kind of production environment) is to host a private registry with Harbor and keep all necessary images in it. If you have the patience to upload every image your cluster needs into the private registry, then congratulations!! You are one step closer to creating an air-gapped secure cluster. Add the domain name for the Harbor setup against your IP (not localhost) in the `/etc/hosts` file.
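For example (the IP address is a placeholder; use your machine's own IP), the `/etc/hosts` entry could look like,
192.168.1.50    harbor.localdomain.com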
The commands below are given in order and are run from the Harbor folder; please update the values with your own if needed,
make harbor-cert
make harbor-download
make harbor-yml
make harbor-prepare
make harbor-install
Stop and start Harbor containers if needed,
make harbor-down
make harbor-up
Update the `private_repo` variable in `.env` and run the following command to pull, tag and push the necessary Docker images to your private registry,
make cluster-private-images
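For reference, the `.env` entry might look like the following (the exact registry address is an assumption; match it to your Harbor domain and port),
private_repo=harbor.localdomain.com:9443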
To use images from the private image registry, look for the commands in `custom` mode.
Update the `nfs_share` variable in `.env` and copy the certificates to the NFS share so pods like Jenkins can trust it for Docker related operations,
make create-cert-nfs-dir
make copy-cert-nfs
Download the Kubernetes source code with (this will take a few minutes),
make download-k8s-source
Build a custom node image and tag it with your private image registry. (optional TODO) Add more packages to your node image if necessary.
make build-node-image
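Under the hood this is expected to wrap a `kind build node-image` invocation; a minimal sketch (the image tag is an assumption, and the Kubernetes source is assumed to sit at the default `$GOPATH/src/k8s.io/kubernetes` location from the previous step),
kind build node-image --image harbor.localdomain.com:9443/kindest/node:v1.21.1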
If you are not using a private image registry like Harbor, create the KinD cluster with,
make cluster-create
For any custom settings or a private image registry, copy `cluster/kind-config-custom.yaml.template` to `cluster/kind-config-custom.yaml` and update it with your certificate and key names and the mount point. Update `apiServerAddress` and `apiServerPort` with your current IP address and any port on which to expose the cluster.
Important: The KinD cluster should not be exposed publicly. These settings are not suitable for any production environment. Please be aware of the security concerns before exposing a local KinD cluster publicly.
networking:
  disableDefaultCNI: true
  apiServerAddress: "YOUR_IP"
  apiServerPort: YOUR_PORT
extraMounts:
  - containerPath: /etc/ssl/certs
    hostPath: harbor/certs
containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."harbor.localdomain.com:9443"]
      endpoint = ["https://harbor.localdomain.com:9443"]
    [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor.localdomain.com".tls]
      cert_file = "/etc/ssl/certs/harbor.localdomain.com.cert"
      key_file = "/etc/ssl/certs/harbor.localdomain.com.key"
Run the following first,
make custom-mode
Create cluster with,
make cluster-create-custom
Destroy the KinD cluster with (NFS storage contents won't be deleted),
make delete-cluster
If you are lazy like me and don't want to read through all these commands:
For a private image registry (set up Harbor first, and copy the custom cluster config and certificates as in the steps above), set up everything except GitLab and Jenkins,
make all-custom -i
For Docker Hub and public image repositories, set up everything except GitLab and Jenkins,
TODO
With every system reboot, the exposed API server endpoint and the certificate in the kubeconfig will change. Regenerate the kubeconfig of the current cluster for kubectl with the command below.
This will not work for an HA setup. The haproxy loadbalancer container doesn't get the certificate update this way; copying the API address IP and certificate over to the loadbalancer Docker container is still TODO. For an HA KinD cluster you have to destroy the cluster every time before shutdown and recreate it later.
make kubectl-config
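The same thing can be done with KinD directly (the cluster name is a placeholder); this is roughly what the make target is expected to do,
kind get kubeconfig --name YOUR_CLUSTER_NAME > ~/.kube/config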
If the cluster creation process takes a long time at the "Starting control-plane" step and exits with an error similar to,
The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
It means you probably have some physical or virtual network settings that KinD can't work with. For example, a KVM bridge network requires you to use a bridge network and a bridge slave network based on the physical network interface; KinD does not support this scenario. After reverting to the default network connection based on the physical network device, the setup process completed.
Kubernetes version 1.21 node images were used to set up this cluster. Provide your customized cluster name in the Makefile commands in the create and delete cluster sections. Two clusters with the same name can't exist.
If you need to use a different Kubernetes node image version, be aware of Kubernetes feature gates and their default values for that version. If a feature gate's default value is true, the KinD config doesn't support setting it to true again via the cluster config YAML files. For example, `TTLAfterFinished` is true by default in 1.21 but false in previous versions, so specifying it as true again for a 1.21 cluster in the `featureGates` section of `cluster/kind-config.yaml` won't work.
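For a node image where the gate still defaults to false (for example a pre-1.21 image), it can be enabled at the top level of the KinD cluster config; a minimal sketch,
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  "TTLAfterFinished": true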
If Docker restarts for any reason, check whether the loadbalancer container was autostarted; otherwise you won't be able to regenerate the kubeconfig for kubectl when it is unable to connect to the KinD cluster.
Create cluster network using CNI manifests,
make cluster-network
Here the Calico manifest is used with BGP peering and pod CIDR `192.168.0.0/16` settings. For an updated version or any change in the manifest, download it from,
curl https://docs.projectcalico.org/manifests/calico.yaml -O
All Calico pods must be running before installing other components in the cluster. If you want to use a different CNI, download its manifest and replace the filename in the Makefile.
Run the following command to let the Calico manifest pull from the private registry,
make cluster-network-custom
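In either mode, verify that all Calico pods report Running before proceeding (this assumes the default Calico manifest, which deploys into the kube-system namespace),
kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl get pods -n kube-system -l k8s-app=calico-kube-controllers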
If the pod description shows an error like `x509: certificate signed by unknown authority`, make sure your domain and CA certificates are available inside the KinD nodes (Docker containers) and that the containerd CRI can access them.
If the pod description shows errors like failing liveness and readiness probes, make sure no pod IP overlaps your LAN network IP range.
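One way to check the certificate case (KinD node containers are named following the `<cluster-name>-control-plane` / `<cluster-name>-worker` pattern; the cluster name below is a placeholder) is to look inside a node,
docker exec -it YOUR_CLUSTER_NAME-control-plane ls /etc/ssl/certs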
Delete Calico CNI with,
make cluster-network-delete
On custom mode,
make cluster-network-custom-delete
If an NFS server isn't installed, run this command to install it and configure the NFS location,
make install-nfs-server
Add your location to the `/etc/exports` file in this format,
YOUR_NFS_PATH *(rw,sync,no_root_squash,insecure,no_subtree_check)
Restart NFS server to apply changes,
sudo systemctl restart nfs-server.service
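To re-export and verify the share (standard NFS utilities, not specific to this repo), you can also run,
sudo exportfs -rav
showmount -e localhost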
The `k8s-sigs.io/nfs-subdir-external-provisioner` storage provisioner is used to better simulate a production scenario where log, metric and data storage are usually centralized and retained even if containers get destroyed and rescheduled.
Rename `nfs-deploy.yaml.template` to `nfs-deploy.yaml` and update the following values with your own (make sure write permission is present on the folder); see the example after the list,
YOUR_NFS_SHARE_PATH
YOUR_NFS_SERVER_IP
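In the upstream nfs-subdir-external-provisioner deployment these placeholders typically end up in the provisioner's env and volume definition; a sketch with assumed values (IP and path are placeholders),
env:
  - name: NFS_SERVER
    value: 192.168.1.50            # YOUR_NFS_SERVER_IP
  - name: NFS_PATH
    value: /srv/nfs/kubedata       # YOUR_NFS_SHARE_PATH
volumes:
  - name: nfs-client-root
    nfs:
      server: 192.168.1.50
      path: /srv/nfs/kubedata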
The MetalLB loadbalancer is used to simulate a production scenario where different services are assigned IP addresses or domain names by cloud loadbalancer services. On premises this is generally handled by a loadbalancer like haproxy, which loadbalances and routes traffic to the appropriate nodes. MetalLB is not strictly required to run the stack; a simple NodePort service will work for development purposes as well.
Rename `metallb-config.yaml.template` to `metallb-config.yaml` and update the following values with your own; see the example after the list,
IP_RANGE_START
IP_RANGE_END
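A typical layer2 address pool (the pre-0.13 MetalLB ConfigMap format; the protocol and IP range are assumptions, use a free range in your LAN) looks like,
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250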
Kubernetes dashboard, metrics server and a cluster-admin role serviceaccount manifest are added. Please don't use this serviceaccount for anything remotely related to production systems.
Apply the manifest files using kustomization,
make cluster-config
For custom images, assuming you already pushed images with the proper tags to your private registry (make sure you updated the files in the `custom` folder),
make cluster-config-custom
Access dashboard using proxy and service account token,
kubectl proxy
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
make get-token
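If you prefer to fetch a token manually, the usual pattern is to read the serviceaccount's secret; a sketch assuming a serviceaccount named `admin-user` in the `kubernetes-dashboard` namespace (adjust the names to the manifests in this repo),
kubectl -n kubernetes-dashboard get secret \
  $(kubectl -n kubernetes-dashboard get sa admin-user -o jsonpath='{.secrets[0].name}') \
  -o jsonpath='{.data.token}' | base64 --decode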
Delete the manifest files using kustomization,
make cluster-config-delete
On custom mode,
make cluster-config-custom-delete
The EFK stack is used without SSL configuration or custom index, filter and tag rewrite rules. This is to simulate a logging scenario in a production environment. Custom configuration can be applied to the fluentd daemonset using a configmap; maybe in the future a generic config file will be included. Elasticsearch runs as a statefulset, and as long as it is not deleted from the cluster using the manifest files, it will retain data in the NFS share location and persist across any pod restart or reschedule. Kibana runs on NodePort `30003`, so make sure to enable that port on a control-plane node in the KinD cluster config.
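Enabling the port in the KinD config means adding an `extraPortMappings` entry on a control-plane node, for example,
- role: control-plane
  extraPortMappings:
    - containerPort: 30003
      hostPort: 30003
      protocol: TCP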
Apply manifests with,
make cluster-logging
On custom mode with private image registry
make cluster-logging-custom
Delete EFK with (the persistent volume will be renamed with the prefix `archived` and the data will not be available unless copied manually to new volumes),
make cluster-logging-delete
On custom mode,
make cluster-logging-custom-delete
Prometheus, Grafana, Alertmanager and the custom CRDs associated with them are taken exactly as is from the `kube-prometheus` project (https://github.com/prometheus-operator/kube-prometheus). Please note the Kubernetes compatibility matrix and download the appropriate release for your version. This system uses release-0.8. Before applying the manifests, go to `manifests/grafana-service.yaml` and add a NodePort to the service.
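A sketch of that change (the NodePort number is an assumption; pick any free port in the NodePort range, preferably one also exposed in the KinD config),
spec:
  type: NodePort
  ports:
    - name: http
      port: 3000
      targetPort: http
      nodePort: 30002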
Rename `pv.yaml.template` to `pv.yaml` and update the following values with your own (make sure write permission is present on the folder),
YOUR_NFS_SHARE_PATH
YOUR_NFS_SERVER_IP
Create a new file `manifests/grafana-credentials.yaml` with the following contents so that persistent `admin:admin@123` credentials are applied if the Grafana pod is restarted.
apiVersion: v1
kind: Secret
metadata:
  name: grafana-credentials
  namespace: monitoring
data:
  user: YWRtaW4=
  password: YWRtaW5AMTIz
Add the following env entries in `manifests/grafana-deployment.yaml` to use the persistent credentials,
env:
  - name: GF_SECURITY_ADMIN_USER
    valueFrom:
      secretKeyRef:
        name: grafana-credentials
        key: user
  - name: GF_SECURITY_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: grafana-credentials
        key: password
Replace the following section in `manifests/grafana-deployment.yaml`,
- emptyDir: {}
  name: grafana-storage
with this one,
- name: grafana-storage
  persistentVolumeClaim:
    claimName: grafana-storage-pv-claim
Apply setup prerequisites with,
make cluster-monitoring-setup
Apply manifests with,
make cluster-monitoring
For private image registry in custom mode,
make cluster-monitoring-setup-custom
make cluster-monitoring-custom
Delete prometheus, grafana, alertmanager and custom CRDs with,
make cluster-monitoring-delete
make cluster-monitoring-uninstall
For custom mode,
make cluster-monitoring-custom-delete
make cluster-monitoring-custom-uninstall
Istio is used as the service mesh. Install the istioctl operator with,
make cluster-istioctl-install
Create the istio-system namespace and install the Istio core components with the demo profile. Modify `istioctl install` to enable any other modules or configurations,
make cluster-istio-install
Install istio components with private image registry,
make cluster-istio-custom-install
Apply the Grafana, Prometheus, Kiali and Jaeger manifests to trace service communication and see service mesh metrics. Expose the Grafana and Kiali dashboards via NodePort. In `istio/samples/addons/grafana.yaml`, update the Grafana service with the following,
spec:
  type: NodePort
  ports:
    - name: service
      port: 3000
      protocol: TCP
      targetPort: 3000
      nodePort: 30004
In `istio/samples/addons/kiali.yaml`, update the Kiali service with the following,
spec:
  type: NodePort
  ports:
    - name: http
      protocol: TCP
      port: 20001
      nodePort: 30005
    - name: http-metrics
      protocol: TCP
      port: 9090
Apply the manifests with (if any error comes up on the first run, please run it again),
make cluster-istio-addons
For private image registries, apply the service port changes in the files first, then run the following,
make custom-mode
make cluster-istio-custom-addons
If any error comes up on the first run, apply the manifests again with,
make cluster-istio-custom-addons-apply
Delete istio components, addons and custom CRDs with,
make cluster-istio-delete
For custom mode,
make cluster-istio-custom-delete
To install Helm v3, run the command below; then use `helm repo add repo_name repo_address` to add a repository and `helm install release_name repo_name/chart_name` to install a chart,
make cluster-helm-install
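For example (the repository and chart are arbitrary and only illustrate the pattern),
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-nginx bitnami/nginx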
Set the GitLab CE dockerized installation environment variables by renaming `docker-compose.yml.template` to `docker-compose.yml` in the gitlab folder. Set your IP, SSH and HTTP ports. From the gitlab folder, run the following,
make gitlab-up
To view progress log,
make log
To check the status and running processes, or to enter the container,
make check-status
make check-system
make shell
To stop, start, restart gitlab,
make stop
make start
make restart
To get the initial `root` admin account password or set it,
make get-root-pass
make set-root-pass
Log into GitLab at http://YOUR_IP:3080 with the user `root` and the password from the commands above. If it is hosted on a non-private IP, disable new user sign-ups from the admin panel.
To take full backup,
make before-backup take-backup after-backup
Delete and remove gitlab with
make gitlab-down
sudo rm -rf YOUR_NFS_SHARE_PATH/gitlab/*
Rename `pv.yaml.template` to `pv.yaml` in the jenkins folder and update the following values with your own (make sure write permission is present on the folder),
YOUR_NFS_SHARE_PATH
YOUR_NFS_SERVER_IP
If you don't want to execute Docker related operations (build, run, etc.), remove the `docker` container from the containers section in `jenkins/jenkins-deploy.yaml`.
You can also declare a separate `docker:dind` based deployment and service manifest, mention the cluster DNS name of that DinD service in the `DNS:` env variable and replace the `DOCKER_HOST` value with it in `jenkins/jenkins-deploy.yaml`.
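A sketch of that env change (the service name and namespace are assumptions; port 2375 assumes a DinD daemon without TLS),
- name: DOCKER_HOST
  value: tcp://docker-dind.jenkins.svc.cluster.local:2375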
Important: Docker DinD as a separate service randomly closes connections, as found during testing. If no other pod is going to use the DinD service, it is recommended to attach it as a sidecar of the Jenkins container.
Install jenkins in cluster with,
make cluster-jenkins
Install Jenkins in the cluster with a custom image derived from the LTS version in the private image registry,
In Jenkins folder,
make build
In root folder,
make custom-mode
make cluster-jenkins-custom
Get the Jenkins initial password to log in with the `admin` account,
make get-jenkins-token
Get jenkins service account access token for jenkins pipeline,
make get-jenkins-sa-token
If you want to install jenkins standalone with docker-compose,
In Jenkins folder,
make build
make jenkins-up
make log
make get-token
Detailed steps on how to create CI pipelines in Jenkins are in the readme of the jenkins folder.
Delete jenkins from cluster,
make cluster-jenkins-delete
For custom mode,
make cluster-jenkins-custom-delete
For docker-compose in jenkins folder,
make jenkins-down
To delete jenkins data,
sudo rm -rf YOUR_NFS_SHARE_PATH/jenkins/*
AWS S3-like object bucket storage is provided by MinIO. The storage class is `k8s-nfs` with an NFS share backend, using the MinIO operator and the custom `tenants.minio.min.io` CRD. The old way, probably sufficient for most development cases, can be found at https://github.com/kubernetes/examples/tree/master/staging/storage/minio which provides both standalone and statefulset examples. For this repo the MinIO operator from https://github.com/minio/operator is used. The `init.yaml` is generated and applied first, and then the `tenant.yaml` file is generated.
kubectl minio init --namespace minio-operator -o > minio/init.yaml
kubectl create namespace minio
kubectl minio tenant create minio --servers 1 --volumes 4 --capacity 200Gi --namespace minio --storage-class k8s-nfs -o > minio/tenant.yaml
You can go ahead and modify the login credentials in the `minio-creds-secret` secret in `tenant.yaml` for local use. Deploy the manifests with,
make cluster-minio
For custom install with private image registry,
make cluster-minio-custom
Start the MinIO console on localhost temporarily with (exposing the tenant CRD console service on a NodePort permanently is TODO),
kubectl port-forward service/minio-console 9443:9443 --namespace minio
You can also generate manifest files from helm chart with,
helm template minio --namespace minio-operator --create-namespace minio/minio-operator --output-dir minio
Delete MinIO from cluster,
make cluster-minio-delete
For custom mode,
make cluster-minio-custom-delete
Apply dnsutils manifest with,
make dnsutils
For custom install with private image registry,
make dnsutils-custom
Use dnsutils to check service availability for cluster-internal or external DNS addresses,
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
kubectl exec -i -t dnsutils -- nslookup jenkins.jenkins.svc.cluster.local
Delete dnsutils from cluster,
make dnsutils-delete
For custom mode,
make dnsutils-custom-delete