[IBCDPE-935] Setting up declerative defintion of TF resources (#11)
* Start the process of organizing the tf resources
BryanFauble authored Jul 18, 2024
1 parent 28a6d2b commit beeac99
Showing 58 changed files with 1,357 additions and 561 deletions.
164 changes: 101 additions & 63 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,102 @@
# EKS-stack

Leveraging spot.io, we spin up an EKS stack behind an existing private VPC that has scale-to-zero capabilities. To deploy this stack:

TODO: Instructions need to be rewritten. Deployment is occurring through spacelift.io

<!-- 1. log into dpe-prod via jumpcloud and export the credentials (you must have admin)
2. run `terraform apply`
3. This will deploy the terraform stack. The terraform backend state is stored in an S3 bucket. The terraform state is stored in the S3 bucket `s3://dpe-terraform-bucket`
4. The spot.io account token is stored in AWS secrets manager: `spotinst_token`
5. Add `AmazonEBSCSIDriverPolicy` and `SecretsManagerReadWrite` to the IAM policy -->
# Purpose

This repo is used to deploy an EKS cluster to AWS. CI/CD is managed through Spacelift.

# Directory Structure
```
.: Contains references to all the "Things" that are going to be deployed
├── common-resources: Resources that are environment independent
│   ├── contexts: Contexts that we'll attach across environments
│   └── policies: Rego policies that can be attached to 0..* spacelift stacks
├── dev: Development/sandbox environment
│   ├── spacelift: Terraform scripts to manage spacelift resources
│   │   └── dpe-sandbox: Spacelift specific resources to manage the CI/CD pipeline
│   └── stacks: The deployable cloud resources
│       ├── dpe-sandbox-k8s: K8s + supporting AWS resources
│       └── dpe-sandbox-k8s-deployments: Resources deployed inside of a K8s cluster
└── modules: Templatized collections of terraform resources that are used in a stack
    ├── apache-airflow: K8s deployment for apache airflow
    │   └── templates: Resources used during deployment of airflow
    ├── sage-aws-eks: Sage specific EKS cluster for AWS
    ├── sage-aws-k8s-node-autoscaler: K8s node autoscaler using spotinst ocean
    └── sage-aws-vpc: Sage specific VPC for AWS
```

The root `main.tf` contains all the "Things" that are going to be deployed.
In this top-level directory, the terraform files bring together
everything that should be deployed in spacelift declaratively. The items declared in
this top-level directory are as follows:

1) A single root administrative stack that is responsible for deploying each and every resource to spacelift.
2) A spacelift space that everything is deployed under called `environment`.
3) Reference to the `terraform-registry` modules directory.
4) Reference to `common-resources` or reusable resources that are not environment specific.
5) The environment specific resources such as `dev`, `staging`, or `prod`
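
As a rough sketch of how the items above might fit together in a root `main.tf`: the resource labels, the `root` parent space, and the description text are assumptions made for illustration. Only the `environment` space name, the directory names, and the `parent_space_id` input of the `dev` module are taken from this repo. The administrative stack and the `terraform-registry` reference are omitted because their settings are not shown here.

```hcl
# Hypothetical root main.tf: a sketch, not the repository's actual file.
terraform {
  required_providers {
    spacelift = {
      source = "spacelift-io/spacelift"
    }
  }
}

# Item 2: the space that everything is deployed under.
resource "spacelift_space" "environment" {
  name             = "environment"
  parent_space_id  = "root" # assumption: direct child of the root space
  description      = "Parent space for all environment specific resources." # assumed wording
  inherit_entities = true
}

# Item 4: reusable resources that are not environment specific.
module "common" {
  source = "./common-resources"
}

# Item 5: one module per environment; `dev` accepts `parent_space_id`.
module "dev" {
  source          = "./dev"
  parent_space_id = spacelift_space.environment.id
}
```

Passing the space ID down as an input keeps each environment directory unaware of where it sits in the space hierarchy.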

This structure takes inspiration from https://github.com/antonbabenko/terraform-best-practices/tree/master/examples.

## AWS VPC + AWS EKS
This section describes the VPC (Virtual Private Cloud) that the EKS cluster is deployed
to.

### AWS VPC

The VPC used in this project is created with the [AWS VPC Terraform module](https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest).
It contains a number of defaults for our use case at Sage. Head on over to the module
definition to learn more.

### AWS EKS

[AWS EKS](https://aws.amazon.com/eks/) is a managed kubernetes service that handles
many of the tasks around running a k8s cluster. On top of it, we provide the
configurable parameters needed to run a number of workloads.

#### EKS API access
API access to the kubernetes cluster endpoint is set to `Public and private`.

##### Public
This allows users outside of the VPC to connect via `kubectl` and related tools to
interact with kubernetes resources. By default, this API server endpoint is public to
the internet, and access to the API server is secured using a combination of AWS
Identity and Access Management (IAM) and native Kubernetes Role Based Access Control
(RBAC).

##### Private
You can enable private access to the Kubernetes API server so that all communication
between your worker nodes and the API server stays within your VPC. You can limit the
IP addresses that can access your API server from the internet, or completely disable
internet access to the API server.
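
A minimal sketch of the `Public and private` setting, written against the plain `aws_eks_cluster` resource rather than the module used in this repo. The cluster name and all three variables are hypothetical; the `vpc_config` arguments are the AWS provider's own.

```hcl
# Sketch only: variable names and the cluster name are hypothetical.
variable "cluster_role_arn" {
  type        = string
  description = "IAM role ARN assumed by the EKS control plane"
}

variable "private_subnet_ids" {
  type        = list(string)
  description = "Subnets the cluster is attached to"
}

variable "allowed_public_cidrs" {
  type        = list(string)
  description = "CIDR blocks allowed to reach the public endpoint"
  default     = ["0.0.0.0/0"] # open to the internet unless narrowed
}

resource "aws_eks_cluster" "example" {
  name     = "example-cluster"
  role_arn = var.cluster_role_arn

  vpc_config {
    subnet_ids              = var.private_subnet_ids
    endpoint_public_access  = true # kubectl access from outside the VPC
    endpoint_private_access = true # node to API server traffic stays in the VPC
    public_access_cidrs     = var.allowed_public_cidrs
  }
}
```

Narrowing `allowed_public_cidrs` limits which IP addresses can reach the API server from the internet, and setting `endpoint_public_access` to `false` disables internet access entirely.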


#### EKS VPC CNI Plugin
This section describes the VPC CNI (Container Network Interface) that is being used
within the EKS cluster. The plugin is responsible for allocating VPC IP addresses to
Kubernetes nodes and configuring the necessary networking for Pods on each node.


#### Security groups for pods
Allows us to assign EC2 security groups directly to pods running in AWS EKS clusters.
This can be used as an alternative to, or in conjunction with, `Kubernetes network policies`.

#### Kubernetes network policies
Controls network traffic within the cluster, for example pod to pod traffic.

Further reading:
- https://docs.aws.amazon.com/eks/latest/userguide/cni-network-policy.html
- https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
- https://aws.amazon.com/blogs/containers/introducing-security-groups-for-pods/
- https://kubernetes.io/docs/concepts/services-networking/network-policies/
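
To illustrate pod to pod traffic control, here is a sketch of a network policy declared through the Terraform `kubernetes` provider. The namespace, labels, and port are hypothetical and nothing like this is defined in the repo. Whether the policy is enforced depends on network policy support being enabled in the cluster's CNI (see the first link above).

```hcl
# Sketch only: namespace, labels, and port are hypothetical.
resource "kubernetes_network_policy" "allow_frontend_to_api" {
  metadata {
    name      = "allow-frontend-to-api"
    namespace = "default"
  }

  spec {
    # Pods this policy applies to.
    pod_selector {
      match_labels = {
        app = "api"
      }
    }

    # Allow ingress only from pods labeled app=frontend on TCP 8080.
    ingress {
      from {
        pod_selector {
          match_labels = {
            app = "frontend"
          }
        }
      }

      ports {
        port     = "8080"
        protocol = "TCP"
      }
    }

    policy_types = ["Ingress"]
  }
}
```

Once a pod is selected by an `Ingress` policy, any traffic not matched by an `ingress` rule is dropped, so the `api` pods here would accept connections from `frontend` pods only.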


#### EKS Autoscaler

We use spot.io to manage the nodes attached to each EKS cluster. This tool has
scale-to-zero capabilities and dynamically adds or removes nodes from the cluster
depending on demand. The autoscaler is templatized and provided as a
terraform module to be used within an EKS stack.


#### Connecting to an EKS cluster for kubectl commands

To connect to the EKS stack running in AWS you'll need to make sure that you have
SSO set up for the account you'll be using. Once set up, run the commands below:
@@ -21,55 +109,5 @@ aws sso login --profile dpe-prod-admin
# AWS using my SSO session for the profile `dpe-prod-admin`. After authenticated
# assuming that we want to use the `role/eks_admin_role` to connect to the k8s
# cluster". This will update your kubeconfig with permissions to access the cluster.
aws eks update-kubeconfig --region us-east-1 --name dpe-k8 --role-arn arn:aws:iam::766808016710:role/eks_admin_role --profile dpe-prod-admin
```

## Future work

1. Create a separate VPC dedicated to the K8 cluster
2. Create CI/CD to deploy this stack
3. Push this entire stack behind a module
4. Create a module for the node groups so we can attach node groups to EKS cluster


## Adding a node group (WIP)

1. Add an EKS node group

```
two = {
  name           = "seqera"
  desired_size   = 1
  min_size       = 0
  max_size       = 10
  instance_types = ["t3.large"]
  capacity_type  = "SPOT"
}
```

2. Add an AWS IAM instance profile

```
data "aws_iam_instance_profiles" "profile2" {
  depends_on = [module.eks]
  role_name  = module.eks.eks_managed_node_groups["two"].iam_role_name
}

3. Add an ocean virtual node group

```
module "ocean-aws-k8s-vng_gpu" {
  source               = "spotinst/ocean-aws-k8s-vng/spotinst"
  name                 = "seqera" # Name of VNG in Ocean
  ocean_id             = module.ocean-aws-k8s.ocean_id
  subnet_ids           = var.subnet_ids
  iam_instance_profile = tolist(data.aws_iam_instance_profiles.profile2.arns)[0]
  # instance_types = ["g4dn.xlarge","g4dn.2xlarge"] # Limit VNG to specific instance types
  # spot_percentage = 50 # Change the spot %
  tags = var.tags
}
```
aws eks update-kubeconfig --region us-east-1 --name dpe-k8 --role-arn arn:aws:iam::766808016710:role/eks_admin_role --profile dpe-prod-admin
```
27 changes: 27 additions & 0 deletions common-resources/contexts/main.tf
@@ -0,0 +1,27 @@
# Kubeconfig hooks for Kubernetes deployments
resource "spacelift_context" "k8s-kubeconfig" {
  description = "Hooks used to set up the kubeconfig for connecting to the K8s cluster"
  name        = "Kubernetes Deployments Kubeconfig"
  space_id    = "root"

  before_init = [
    "aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME"
  ]

  before_plan = [
    "aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME"
  ]

  before_apply = [
    "aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME"
  ]

  before_perform = [
    "aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME"
  ]

  before_destroy = [
    "aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME"
  ]
}

File renamed without changes.
7 changes: 7 additions & 0 deletions common-resources/main.tf
@@ -0,0 +1,7 @@
module "policies" {
  source = "./policies"
}

module "contexts" {
  source = "./contexts"
}
File renamed without changes.
9 changes: 9 additions & 0 deletions common-resources/policies/outputs.tf
@@ -0,0 +1,9 @@
output "enforce_tags_on_resources_id" {
  value       = spacelift_policy.enforce-tags-on-resources.id
  description = "The ID for this spacelift_policy. Checks that a cost center tag is added."
}

output "check_estimated_cloud_spend_id" {
  value       = spacelift_policy.cloud-spend-estimation.id
  description = "The ID for this spacelift_policy"
}
File renamed without changes.
File renamed without changes.
22 changes: 0 additions & 22 deletions data.tf

This file was deleted.

4 changes: 1 addition & 3 deletions deployments/README.md
@@ -1,3 +1 @@
## Deployments

These are the different deployments that are within the kubernetes cluster
This directory is not actively used and will be removed in the future
11 changes: 11 additions & 0 deletions dev/main.tf
@@ -0,0 +1,11 @@
resource "spacelift_space" "development" {
  name             = "development"
  parent_space_id  = var.parent_space_id
  description      = "Contains all the resources to deploy out to the dev environment."
  inherit_entities = true
}

module "dpe-sandbox-spacelift" {
  source          = "./spacelift/dpe-sandbox"
  parent_space_id = spacelift_space.development.id
}
125 changes: 125 additions & 0 deletions dev/spacelift/dpe-sandbox/main.tf
@@ -0,0 +1,125 @@
resource "spacelift_space" "dpe-sandbox" {
  name             = "dpe-sandbox"
  parent_space_id  = var.parent_space_id
  description      = "Contains resources for the DPE team for sandbox testing."
  inherit_entities = true
}

resource "spacelift_stack" "k8s-stack" {
  github_enterprise {
    namespace = "Sage-Bionetworks-Workflows"
    id        = "sage-bionetworks-workflows-gh"
  }

  administrative          = false
  autodeploy              = true
  branch                  = "ibcdpe-935-vpc-updates"
  description             = "Infrastructure to support deploying to an EKS cluster"
  name                    = "DPE DEV Kubernetes Infrastructure"
  project_root            = "dev/stacks/dpe-sandbox-k8s"
  repository              = "eks-stack"
  terraform_version       = "1.7.2"
  terraform_workflow_tool = "OPEN_TOFU"
  space_id                = spacelift_space.dpe-sandbox.id
}

resource "spacelift_stack" "k8s-stack-deployments" {
  github_enterprise {
    namespace = "Sage-Bionetworks-Workflows"
    id        = "sage-bionetworks-workflows-gh"
  }

  administrative          = false
  autodeploy              = true
  branch                  = "ibcdpe-935-vpc-updates"
  description             = "Deployments internal to an EKS cluster"
  name                    = "DPE DEV Kubernetes Deployments"
  project_root            = "dev/stacks/dpe-sandbox-k8s-deployments"
  repository              = "eks-stack"
  terraform_version       = "1.7.2"
  terraform_workflow_tool = "OPEN_TOFU"
  space_id                = spacelift_space.dpe-sandbox.id
}

resource "spacelift_context_attachment" "k8s-kubeconfig-hooks" {
  context_id = "kubernetes-deployments-kubeconfig"
  stack_id   = spacelift_stack.k8s-stack-deployments.id
}

resource "spacelift_stack_dependency" "k8s-stack-to-deployments" {
  stack_id            = spacelift_stack.k8s-stack-deployments.id
  depends_on_stack_id = spacelift_stack.k8s-stack.id
}

resource "spacelift_stack_dependency_reference" "vpc-id-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "vpc_id"
  input_name          = "TF_VAR_vpc_id"
}

resource "spacelift_stack_dependency_reference" "private-subnet-ids-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "private_subnet_ids"
  input_name          = "TF_VAR_private_subnet_ids"
}

resource "spacelift_stack_dependency_reference" "security-group-id-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "node_security_group_id"
  input_name          = "TF_VAR_node_security_group_id"
}

resource "spacelift_stack_dependency_reference" "vpc-cidr-block-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "vpc_cidr_block"
  input_name          = "TF_VAR_vpc_cidr_block"
}

resource "spacelift_stack_dependency_reference" "region-name" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "region"
  input_name          = "REGION"
}

resource "spacelift_stack_dependency_reference" "cluster-name" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "cluster_name"
  input_name          = "CLUSTER_NAME"
}

resource "spacelift_stack_dependency_reference" "cluster-name-tfvar" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "cluster_name"
  input_name          = "TF_VAR_cluster_name"
}

# resource "spacelift_policy_attachment" "policy-attachment" {
#   policy_id = each.value.policy_id
#   stack_id  = spacelift_stack.k8s-stack.id
# }

resource "spacelift_stack_destructor" "k8s-stack-deployments-destructor" {
  depends_on = [
    spacelift_stack.k8s-stack,
  ]

  stack_id = spacelift_stack.k8s-stack-deployments.id
}

resource "spacelift_stack_destructor" "k8s-stack-destructor" {
  stack_id = spacelift_stack.k8s-stack.id
}

resource "spacelift_aws_integration_attachment" "k8s-aws-integration-attachment" {
  integration_id = "01HXW154N60KJ8NCC93H1VYPNM"
  stack_id       = spacelift_stack.k8s-stack.id
  read           = true
  write          = true
}

resource "spacelift_aws_integration_attachment" "k8s-deployments-aws-integration-attachment" {
  integration_id = "01HXW154N60KJ8NCC93H1VYPNM"
  stack_id       = spacelift_stack.k8s-stack-deployments.id
  read           = true
  write          = true
}
7 changes: 7 additions & 0 deletions dev/spacelift/dpe-sandbox/outputs.tf
@@ -0,0 +1,7 @@
output "k8s_stack_id" {
  value = spacelift_stack.k8s-stack.id
}

output "k8s_stack_deployments_id" {
  value = spacelift_stack.k8s-stack-deployments.id
}