Smana/cloud-native-ref

Reference Repository for Building a Cloud Native Platform

This is an opinionated set of configurations for managing a Cloud Native platform using GitOps principles.

This repository provides a comprehensive guide and set of tools for building, managing, and maintaining a Cloud Native platform. It includes configurations for Kubernetes, Crossplane, Flux, OpenBao, and more, with a focus on security, scalability, and best practices.

☑️ Curated Toolset and Use Cases

| Technology | Domain | What it is used for |
|---|---|---|
| Kubernetes | Infrastructure | Container orchestration, core platform on which applications are deployed |
| Crossplane | Infrastructure | Framework to compose application and infrastructure components, providing proper abstraction levels |
| OpenTofu | Infrastructure | Open-source alternative to Terraform for provisioning and managing infrastructure |
| Harbor | Application | Secure container image registry with scanning and signing capabilities |
| Headlamp | Application | Web-based GUI for Kubernetes cluster management |
| CloudNativePG | Data | Kubernetes operator managing PostgreSQL clusters with high availability and failover support |
| Valkey | Data | Redis-like key-value data store |
| Dagger | Continuous Integration | CI/CD tool used to define and run pipelines as code |
| Flux | Continuous Delivery | GitOps engine ensuring that what is defined in the GitHub repository is deployed on Kubernetes |
| VictoriaMetrics | Observability | High-performance monitoring solution for collecting and querying metrics |
| Tailscale | Networking | VPN solution for secure connections between Kubernetes clusters and other resources |
| Gateway API | Networking | Defines standard APIs for configuring Kubernetes ingress and traffic routing |
| Cilium | Networking | Advanced networking, security, and observability for Kubernetes using eBPF |
| External DNS | Networking | Synchronizes Kubernetes resources with DNS providers like Route 53, Cloudflare, and others |
| OpenBao | Security | Open-source fork of Vault for secure secret storage, encryption, and access management |
| Cert-manager | Security | Automates the creation and renewal of TLS certificates |
| ZITADEL | Security | Cloud-native identity and access management system |
| ExternalSecrets Operator | Security | Synchronizes secrets from external secret managers (e.g., Vault, AWS Secrets Manager) into Kubernetes |
| Managed Services | Managed Services | Cloud services such as DNS (Route53), IAM, Load Balancing, KMS (encryption of sensitive data), and Storage (S3) |

🚀 Getting started

Deploying the whole stack takes three steps:

  1. 📡 Install the network requirements
  2. 🔒 Deploy an OpenBao instance
  3. ☸️ Bootstrap the EKS cluster and Flux components

🔄 Flux Dependencies Matter

Flux is a foundational component responsible for deploying all resources as soon as the Kubernetes cluster becomes operational. Below is a diagram that highlights the key dependencies in our setup:

graph TD;
    Namespaces-->CRDs;
    CRDs-->Crossplane;
    Crossplane-->EPIs["EKS Pod Identities"];
    EPIs["EKS Pod Identities"]-->Security;
    EPIs["EKS Pod Identities"]-->Infrastructure;
    EPIs["EKS Pod Identities"]-->Observability;
    Observability-->Apps["Other apps"];
    Infrastructure-->Apps["Other apps"];
    Security-->Infrastructure;
    Security-->Observability

The diagram can be hard to read, so here are the key points:

  • Namespaces - Namespaces are the foundational resources in Kubernetes. All subsequent resources can be scoped to namespaces.

  • Custom Resource Definitions (CRDs) - CRDs extend Kubernetes' capabilities by defining new resource types. These must be established before they can be utilized in other applications.

  • Crossplane - Used to provision the necessary infrastructure components from Kubernetes.

  • EKS Pod Identities - Created using Crossplane, these IAM roles are necessary to grant specific AWS API permissions to certain applications.

  • Security - Among other things, this step deploys external-secrets, which is essential for using sensitive data in our applications.
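
To make the ordering concrete, here is a minimal Python sketch that derives one valid deployment order from the edges of the diagram above. It is only an illustration of the dependency logic, not how Flux itself computes anything; the `DEPENDS_ON` mapping and the `reconcile_order` helper are hypothetical names introduced for this example.

```python
from graphlib import TopologicalSorter

# Each key depends on the components in its set.
# The edges are copied from the dependency diagram above.
DEPENDS_ON = {
    "CRDs": {"Namespaces"},
    "Crossplane": {"CRDs"},
    "EKS Pod Identities": {"Crossplane"},
    "Security": {"EKS Pod Identities"},
    "Infrastructure": {"EKS Pod Identities", "Security"},
    "Observability": {"EKS Pod Identities", "Security"},
    "Other apps": {"Infrastructure", "Observability"},
}


def reconcile_order(depends_on):
    """Return one valid order in which every component follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, name in enumerate(reconcile_order(DEPENDS_ON), start=1):
        print(f"{step}. {name}")
```

Infrastructure and Observability do not depend on each other, so their relative position may vary. Every other constraint from the diagram is preserved in the result.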

🏗️ Crossplane Configuration

Requirements and Security Concerns

When the cluster is initialized, we define the permissions for the Crossplane controllers using OpenTofu. This involves attaching a set of IAM policies to a role, which the controllers then use to manage AWS resources.

We prioritize security by adhering to the principle of least privilege. This means we only grant the necessary permissions, avoiding any excess. For instance, although Crossplane allows it, I have chosen not to give the controllers the ability to delete stateful services like S3, IAM or Route53. This decision is a deliberate step to minimize potential risks.

Additionally, I have put a constraint on the resources the controllers can manage. Specifically, they are limited to managing only those resources which are prefixed with xplane-. This restriction helps in maintaining a more controlled and secure environment.
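
The two constraints above (no deletion of stateful services, and only `xplane-`-prefixed resources) could be expressed in an IAM policy along the following lines. This is a hedged sketch, not the policy used in this repository: the selected actions, the resource ARN patterns, the explicit `Deny` statement, and the `crossplane_policy` helper are all assumptions made for illustration.

```python
import json

PREFIX = "xplane-"  # prefix mentioned in the text above


def crossplane_policy(prefix=PREFIX):
    """Build an illustrative IAM policy document (NOT the repository's actual policy)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed allow-list: create and manage, but no delete actions.
                "Sid": "ManagePrefixedResources",
                "Effect": "Allow",
                "Action": [
                    "s3:CreateBucket",
                    "s3:PutBucketTagging",
                    "iam:CreateRole",
                    "iam:GetRole",
                    "iam:TagRole",
                    "iam:AttachRolePolicy",
                ],
                # Only resources whose names start with the prefix.
                "Resource": [
                    f"arn:aws:s3:::{prefix}*",
                    f"arn:aws:iam::*:role/{prefix}*",
                ],
            },
            {
                # Assumed extra guardrail: an explicit deny on deleting stateful resources.
                "Sid": "DenyDeletingStatefulServices",
                "Effect": "Deny",
                "Action": [
                    "s3:DeleteBucket",
                    "iam:DeleteRole",
                    "route53:DeleteHostedZone",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(crossplane_policy(), indent=2))
```

Omitting delete actions from the allow-list is already enough to prevent deletion. The explicit `Deny` is shown only as one possible way to keep that guarantee if broader permissions are attached later.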

How is Crossplane Deployed?

Crossplane allows provisioning and managing Cloud Infrastructure (and even more) using native Kubernetes features. It needs to be installed and set up in three successive steps:

  1. Installation of the Kubernetes operator.
  2. Deployment of the AWS provider, which provides custom resources, including AWS roles, policies, etc.
  3. Installation of compositions that will generate AWS resources.

📦 OCI Registry with Harbor

The Harbor installation follows best practices for high availability. It leverages recent Crossplane features such as Composition functions and relies on the following components:

  • A CloudNativePG instance
  • A Valkey cluster deployed with the Bitnami Helm chart
  • S3 storage for artifacts

🏷️ Related blog post: Going Further with Crossplane: Compositions and Functions

🔗 VPN connection using Tailscale

The VPN configuration is done within the opentofu/network directory. You can follow the steps described in this README to provision a server that gives access to private resources within AWS.

Most of the time, we don't want to expose our resources publicly. For instance, platform tools such as Grafana and Harbor should be accessed over a secure connection. The risk becomes even more significant when dealing with the Kubernetes API: one of the primary recommendations for securing a cluster is to limit access to the API.

I intentionally created a distinct directory for provisioning the network and the secure connection, so that there is no confusion with the EKS provisioning.

🏷️ Related blog post: Beyond Traditional VPNs: Simplifying Cloud Access with Tailscale

🔑 Private PKI with OpenBao

ℹ️ OpenBao is an open-source fork of HashiCorp Vault.

The OpenBao instance is created in two steps:

  1. Create the cluster as described here
  2. Then configure it using this directory

The provided code sets up and configures a highly available, secure, and cost-efficient OpenBao cluster. It covers creating an OpenBao instance in either development or high-availability mode, initializing OpenBao, managing security tokens, and configuring a robust Public Key Infrastructure (PKI). The focus is on balancing performance, security, and cost, using a multi-node cluster, ephemeral nodes on Spot instances, and a tiered CA structure.

🏷️ Related blog post: TLS with Gateway API: Efficient and Secure Management of Public and Private Certificates

👁️ Observability

To effectively identify issues and optimize performance, a comprehensive monitoring stack is essential. Several tools are available to provide detailed insights into system health, covering key areas such as metrics, logs, tracing, and profiling. Here's an overview of our current setup:

  • Metrics: We’ve implemented a combination of VictoriaMetrics and Grafana operators to collect, visualize, and analyze metrics. This stack enables real-time monitoring, custom dashboards, and the ability to configure alerts and notifications for proactive issue management.

  • Logs: (Coming soon)

🧪 CI

Overview

We leverage Dagger for all our CI tasks. Here's what is currently run:

  • Validation of Kubernetes and Kustomize manifests using kubeconform
  • Validation of Terraform/OpenTofu configurations using pre-commit-terraform
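
As a rough idea of what the manifest validation step amounts to, the Python sketch below collects YAML files and builds a kubeconform invocation. It is not the Dagger pipeline used in this repository: the `find_manifests` and `kubeconform_command` helpers are hypothetical, the choice of flags is an assumption, and Kustomize overlays would first need to be rendered, which this sketch does not do.

```python
import subprocess
import sys
from pathlib import Path


def find_manifests(root):
    """Return all YAML files under root, sorted for reproducible runs."""
    base = Path(root)
    return sorted(
        p for p in base.rglob("*")
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


def kubeconform_command(files):
    """Build the argument list for a strict kubeconform run over the given files."""
    return [
        "kubeconform",
        "-strict",                  # reject properties not in the schema and duplicated keys
        "-ignore-missing-schemas",  # skip kinds without an available schema (e.g. some CRDs)
        "-summary",                 # print a summary at the end
        *[str(f) for f in files],
    ]


if __name__ == "__main__":
    # Requires the kubeconform binary to be available on PATH.
    manifests = find_manifests(sys.argv[1] if len(sys.argv) > 1 else ".")
    if manifests:
        sys.exit(subprocess.run(kubeconform_command(manifests)).returncode)
```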

🏠 Using Self-Hosted Runners

This feature can be enabled within the tooling kustomization. By leveraging Self-Hosted GitHub Runners, we achieve:

  • Access to Private Endpoints: Directly interact with internal resources that are not publicly accessible.
  • Increased Security: Run CI tasks within our secure internal environment.

For detailed information on setting up and using GitHub Self-Hosted Runners, please refer to this documentation.

🏷️ Related blog post: Dagger: The missing piece of the developer experience

💬 Chatting and contributing

  • 🗨️ Slack Channel: Feel free to come and chat with us if you have any issues, ideas, or questions.
  • 💡 Discussions: Explore improvement areas, define the roadmap, and prioritize issues.
  • 🛠️ Issues: Track tasks and report bugs to ensure prompt resolution.
  • 📅 Project: Detailed project planning and prioritization information.
