
My home Kubernetes (k3s) cluster managed by GitOps (Flux). Built on Proxmox using Terraform.


prettoandre/pve-terraform-k3s-cluster

 
 


My home Kubernetes cluster ⛵

... managed by Flux and serviced with RenovateBot 🤖






📖  Overview

This repository is my home Kubernetes cluster in a declarative state. Flux watches my cluster folder and applies changes to the cluster based on the YAML manifests it finds there.

Feel free to open a GitHub issue or join the k8s@home Discord if you have any questions.

This repository is built off the k8s-at-home/template-cluster-k3s repository.


✨  Cluster setup

This cluster consists of VMs provisioned on PVE via the Terraform Proxmox provider. The VMs run k3s, provisioned on top of Ubuntu 20.10 using the Ansible Galaxy role ansible-role-k3s. The cluster is not hyper-converged: block storage is provided by the underlying PVE Ceph cluster using rook-ceph-external.

See my server/ansible directory for my playbooks and roles, and server/terraform for infrastructure provisioning.

🎨  Cluster components

  • kube-vip: Uses BGP to load balance the control-plane API, making it highly available without requiring external HA proxy solutions.
  • calico: For internal cluster networking using BGP.
  • traefik: Provides ingress for cluster services.
  • rook-ceph: Provides persistent volumes, allowing any application to consume RBD block storage from the underlying PVE cluster.
  • SOPS: Encrypts secrets so they are safe to store, even in a public repository.
  • external-dns: Creates DNS entries in a separate coredns deployment which is backed by my cluster's etcd deployment.
  • cert-manager: Configured to create TLS certs for all ingress services automatically using Let's Encrypt.
  • kasten-k10: Provides disaster recovery via snapshots and out-of-band backups.

📂  Repository structure

The Git repository contains the following directories under cluster. They are listed below in the order Flux applies them.

  • base directory is the entrypoint to Flux
  • crds directory contains custom resource definitions (CRDs) that need to exist globally in your cluster before anything else exists
  • core directory (depends on crds) contains important infrastructure applications (grouped by namespace) that should never be pruned by Flux
  • apps directory (depends on core) is where your common applications (grouped by namespace) could be placed; Flux will prune resources here if they are no longer tracked by Git
./cluster
├── ./apps
├── ./base
├── ./core
└── ./crds
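
The apply order described above can be pictured as a small dependency graph. The following is only a sketch of that idea in Python, not how Flux itself is configured: the `DEPENDS_ON` and `PRUNED` names are hypothetical, and the assumption that crds hangs off base is inferred from the stated order rather than from the manifests.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Directory -> directories it depends on, following the list above.
# Assumption: "crds" is reached through the "base" entrypoint.
DEPENDS_ON = {
    "base": set(),
    "crds": {"base"},
    "core": {"crds"},
    "apps": {"core"},
}

# Per the list above, only resources under "apps" are pruned.
PRUNED = {"apps"}


def apply_order(depends_on):
    """Return the directories in an order that satisfies every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for directory in apply_order(DEPENDS_ON):
        prune = "prune" if directory in PRUNED else "keep"
        print(f"{directory}: {prune}")
```

Because each directory depends on exactly one predecessor, the sorted order matches the order of the list above.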

🤖  Automate all the things!


🕸️  Networking

In my network, Calico is configured with BGP on my Brocade ICX 6610. With BGP enabled, I advertise load balancer addresses using externalIPs on my Kubernetes services, so I do not need MetalLB. Another benefit is that I can reach any pod's IP directly from any device on my local network. All physical hardware (including local clients) is interconnected with 10 Gb networking, with a separate dedicated 10 Gb network for Ceph traffic.

| Name                        | CIDR           |
|-----------------------------|----------------|
| Management                  | 10.75.10.0/24  |
| Physical Servers            | 10.75.30.0/24  |
| CoroSync0                   | 10.75.31.0/24  |
| CoroSync1                   | 10.75.32.0/24  |
| Ceph Cluster                | 10.75.33.0/24  |
| Virtual Servers             | 10.75.40.0/24  |
| K8s external services (BGP) | 10.75.45.0/24  |
| K8s pods                    | 172.22.0.0/16  |
| K8s services                | 172.24.0.0/16  |
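
As a quick sanity check on the address plan above, here is a minimal Python sketch that looks up which network an address belongs to and confirms that none of the ranges overlap. The CIDRs are copied from the table; the helper names `network_for` and `overlapping_pairs` are hypothetical and exist only for this illustration.

```python
import ipaddress
from itertools import combinations

# Networks copied from the table above.
NETWORKS = {
    "Management": ipaddress.ip_network("10.75.10.0/24"),
    "Physical Servers": ipaddress.ip_network("10.75.30.0/24"),
    "CoroSync0": ipaddress.ip_network("10.75.31.0/24"),
    "CoroSync1": ipaddress.ip_network("10.75.32.0/24"),
    "Ceph Cluster": ipaddress.ip_network("10.75.33.0/24"),
    "Virtual Servers": ipaddress.ip_network("10.75.40.0/24"),
    "K8s external services (BGP)": ipaddress.ip_network("10.75.45.0/24"),
    "K8s pods": ipaddress.ip_network("172.22.0.0/16"),
    "K8s services": ipaddress.ip_network("172.24.0.0/16"),
}


def network_for(address):
    """Return the name of the network containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in NETWORKS.items():
        if ip in network:
            return name
    return None


def overlapping_pairs():
    """Return every pair of network names whose ranges overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(NETWORKS.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    print(network_for("10.75.45.10"))
    print(overlapping_pairs())
```

An address advertised over BGP from the external services range resolves to that row of the table, and an empty overlap list means the pod, service, and physical ranges can be routed without ambiguity.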

🤷‍♂️  DNS

(this section blindly copied from Devin Buhl as I could never attempt to explain this in a better way)

To preface this, I should mention that I only use one domain name for both internal and externally facing applications. This is also the most complicated thing to explain, but I will try to sum it up.

On pfSense, under Services: DNS Resolver: Domain Overrides, I have a Domain Override set for my domain, with the address pointing to my in-cluster-non-cluster service CoreDNS load balancer IP. This allows me to use split-horizon DNS. external-dns reads my cluster's Ingresses and inserts DNS records containing the sub-domain and load balancer IP (of traefik) into the in-cluster-non-cluster service CoreDNS service, and into Cloudflare depending on whether an annotation is present on the Ingress. See the diagram below for a visual representation.
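
The record routing described above can be sketched roughly as follows. This is an illustration only, not external-dns code. It assumes one reading of the paragraph: every Ingress host gets an internal CoreDNS record, and annotated Ingresses additionally get a Cloudflare record. The annotation key `example.com/public-dns`, the target names, and both helper functions are hypothetical.

```python
# Hypothetical annotation key; the real key used in the cluster is not
# stated in this README.
PUBLIC_ANNOTATION = "example.com/public-dns"


def dns_targets(ingress):
    """Decide which DNS backends should receive records for an Ingress."""
    annotations = ingress.get("metadata", {}).get("annotations", {})
    targets = ["coredns"]  # assumed: internal records are always written
    if annotations.get(PUBLIC_ANNOTATION) == "true":
        targets.append("cloudflare")  # assumed: public only when annotated
    return targets


def records_for(ingress, load_balancer_ip):
    """Build (backend, hostname, ip) tuples for every host on the Ingress."""
    rules = ingress.get("spec", {}).get("rules", [])
    hosts = [rule["host"] for rule in rules if "host" in rule]
    return [
        (target, host, load_balancer_ip)
        for target in dns_targets(ingress)
        for host in hosts
    ]


if __name__ == "__main__":
    ingress = {
        "metadata": {"annotations": {PUBLIC_ANNOTATION: "true"}},
        "spec": {"rules": [{"host": "app.example.com"}]},
    }
    # 10.75.45.2 is a made-up address inside the BGP external services range.
    for record in records_for(ingress, "10.75.45.2"):
        print(record)
```

With split-horizon DNS, local clients resolve the hostname through the pfSense domain override and land on the internal record, while outside clients only ever see what was published to Cloudflare.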


⚙️  Hardware

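| Device           | Count | OS Disk Size | Data Disk Size                   | RAM   | Purpose                              |
|------------------|-------|--------------|----------------------------------|-------|--------------------------------------|
| Intel R1208GL4DS | 4     | 120GB SSD    | 2x480GB SSD, 4x900GB 10.6k SAS   | 64GB  | Proxmox hypervisors and Ceph cluster |
| Intel R1208GL4DS | 1     | 120GB SSD    | 2x900GB 10.6k SAS                | 32GB  | Backup cold spare                    |
| NAS (franxx)     | 1     | 120GB SSD    | 16x8TB RAIDZ2, 6x4TB ZFS Mirror  | 192GB | Media and shared file storage        |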

🔧  Tools

| Tool       | Purpose                                                                     |
|------------|-----------------------------------------------------------------------------|
| direnv     | Sets KUBECONFIG environment variable based on present working directory     |
| go-task    | Alternative to makefiles, who honestly likes that?                          |
| pre-commit | Enforces code consistency and verifies no secrets are pushed                |
| stern      | Tail logs in Kubernetes                                                     |

🤝  Thanks

A lot of inspiration for my cluster came from the people who have shared their clusters over at awesome-home-kubernetes.
