[IBCDPE-939] Correcting some issues #6

Merged · 7 commits · May 20, 2024
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -0,0 +1 @@
* @Sage-Bionetworks-Workflows/dpe
19 changes: 19 additions & 0 deletions .github/workflows/tfsec_pr_commenter.yml
@@ -0,0 +1,19 @@
name: tfsec-pr-commenter
on:
  pull_request:
jobs:
  tfsec:
    name: tfsec PR commenter
    runs-on: ubuntu-latest

    permissions:
      contents: read
      pull-requests: write

    steps:
      - name: Clone repo
        uses: actions/checkout@master
      - name: tfsec
        uses: aquasecurity/tfsec-pr-commenter-action@v1.2.0
Contributor:
Very neat action!

        with:
          github_token: ${{ github.token }}
24 changes: 18 additions & 6 deletions README.md
@@ -1,15 +1,27 @@
# EKS-stack

Leveraging spot.io, we spin up an EKS stack behind an existing private VPC that has scale-to-zero capabilities. To deploy this stack
Leveraging spot.io, we spin up an EKS stack behind an existing private VPC that has scale-to-zero capabilities. To deploy this stack:

1. log into dpe-prod via jumpcloud and export the credentials (you must have admin)
TODO: Instructions need to be re-written. Deployment is occurring through spacelift.io

<!-- 1. log into dpe-prod via jumpcloud and export the credentials (you must have admin)
2. run `terraform apply`
3. This will deploy the terraform stack. The terraform backend state is stored in an S3 bucket. The terraform state is stored in the S3 bucket `s3://dpe-terraform-bucket`
4. The spot.io account token is stored in AWS secrets manager: `spotinst_token`
5. Add `AmazonEBSCSIDriverPolicy` and `SecretsManagerReadWrite` to the IAM policy

```
aws eks update-kubeconfig --name tyu-spot-ocean
5. Add `AmazonEBSCSIDriverPolicy` and `SecretsManagerReadWrite` to the IAM policy -->

To connect to the EKS stack running in AWS, you'll need to make sure that you have
SSO set up for the account you'll be using. Once set up, run the commands below:
```
# Log in with the profile you're using to authenticate. For example, mine is
# called `dpe-prod-admin`
aws sso login --profile dpe-prod-admin

# Update your kubeconfig with the proper values. This is saying "Authenticate with
# AWS using my SSO session for the profile `dpe-prod-admin`; once authenticated,
# assume `role/eks_admin_role` to connect to the k8s cluster". This will update
# your kubeconfig with permissions to access the cluster.
aws eks update-kubeconfig --region us-east-1 --name dpe-k8 --role-arn arn:aws:iam::766808016710:role/eks_admin_role --profile dpe-prod-admin
Contributor:
Thanks, these instructions worked for me!

```
➜  ~ kubectl get namespace
NAME              STATUS   AGE
airflow           Active   3d3h
default           Active   3d18h
kube-node-lease   Active   3d18h
kube-public       Active   3d18h
kube-system       Active   3d18h
spot-system       Active   3d14h
```

```

## Future work
40 changes: 8 additions & 32 deletions main.tf
@@ -19,7 +19,7 @@ resource "aws_iam_role" "admin_role" {

resource "aws_iam_role_policy_attachment" "admin_policy" {
role = aws_iam_role.admin_role.name
policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess"
}


@@ -132,44 +132,20 @@ module "eks" {

policy_associations = {
eks_admin_role = {
policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAdminPolicy"
policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
access_scope = {
type = "cluster"
}
}
}
}
# https://docs.aws.amazon.com/eks/latest/userguide/access-policies.html#access-policy-permissions
# TODO: Additional roles that need to be created:
# AmazonEKSAdminViewPolicy?
# AmazonEKSEditPolicy
# AmazonEKSViewPolicy

}
tags = var.tags
}

module "ocean-controller" {
source = "spotinst/ocean-controller/spotinst"

# Credentials.
spotinst_token = data.aws_secretsmanager_secret_version.secret_credentials.secret_string
spotinst_account = var.spotinst_account

# Configuration.
cluster_identifier = var.cluster_name
}

module "ocean-aws-k8s" {
source = "spotinst/ocean-aws-k8s/spotinst"
version = "1.2.0"

depends_on = [module.eks, module.vpc]

# Configuration
cluster_name = var.cluster_name
region = var.region
subnet_ids = module.vpc.private_subnets
worker_instance_profile_arn = tolist(data.aws_iam_instance_profiles.profile.arns)[0]
security_groups = [module.eks.node_security_group_id]
is_aggressive_scale_down_enabled = true
max_scale_down_percentage = 33
# Overwrite Name Tag and add additional
# tags = {
# "kubernetes.io/cluster/tyu-spot-ocean" = "owned"
# }
}
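
The TODO in the `policy_associations` block above points at additional EKS access policies (see the linked AWS docs). A minimal sketch of what one extra, read-only entry could look like, following the same shape as `eks_admin_role` — the `eks_readonly_role` access entry and the `aws_iam_role.readonly_role` resource are purely hypothetical and are not created by this PR:

```
# Sketch only: a read-only access entry mirroring eks_admin_role above.
# The IAM role "readonly_role" is hypothetical and not defined in this PR.
access_entries = {
  eks_readonly_role = {
    principal_arn = aws_iam_role.readonly_role.arn # hypothetical IAM role resource
    policy_associations = {
      readonly = {
        policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"
        access_scope = {
          type = "cluster"
        }
      }
    }
  }
}
```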
21 changes: 21 additions & 0 deletions modules/internal-k8-infra/data.tf
@@ -13,3 +13,24 @@ data "aws_secretsmanager_secret" "spotinst_token" {
data "aws_secretsmanager_secret_version" "secret_credentials" {
secret_id = data.aws_secretsmanager_secret.spotinst_token.id
}

# TODO: This should search for the VPC using some other value as ID would change
# on first startup and teardown/restart
data "aws_subnets" "node_subnets" {
filter {
name = "vpc-id"
values = ["vpc-0f30cfca319ebc521"]
}
}
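
One way to resolve the TODO above would be to look the VPC up by tag rather than by hardcoded ID. A rough sketch, assuming the VPC carries a `Name` tag such as `dpe-k8-vpc` (that tag value is a guess, not something defined in this PR):

```
# Sketch: resolve the VPC by tag instead of a hardcoded ID.
# The tag value "dpe-k8-vpc" is hypothetical.
data "aws_vpc" "node_vpc" {
  tags = {
    Name = "dpe-k8-vpc"
  }
}

data "aws_subnets" "node_subnets" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.node_vpc.id]
  }
}
```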

# TODO: This should dynamically search for the node group
data "aws_eks_node_group" "profile" {
cluster_name = var.cluster_name
node_group_name = "airflow-node-group-20240517054615841200000009"
}
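
Similarly, for the TODO above, the node group could be discovered from the cluster instead of being pinned to a generated name. A sketch using the `aws_eks_node_groups` data source, assuming the first (or only) node group is the one we want:

```
# Sketch: list the cluster's node groups and pick the first, rather than
# hardcoding the generated name. Assumes a single (or first-relevant) group.
data "aws_eks_node_groups" "all" {
  cluster_name = var.cluster_name
}

data "aws_eks_node_group" "profile" {
  cluster_name    = var.cluster_name
  node_group_name = tolist(data.aws_eks_node_groups.all.names)[0]
}
```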

data "aws_security_group" "eks_cluster_security_group" {
tags = {
Name = "${var.cluster_name}-node"
}
}
16 changes: 16 additions & 0 deletions modules/internal-k8-infra/main.tf
@@ -10,6 +10,22 @@ module "kubernetes-controller" {
cluster_identifier = var.cluster_name
}


module "ocean-aws-k8s" {
source = "spotinst/ocean-aws-k8s/spotinst"
version = "1.2.0"

# Configuration
cluster_name = var.cluster_name
region = var.region
subnet_ids = data.aws_subnets.node_subnets.ids
worker_instance_profile_arn = data.aws_eks_node_group.profile.node_role_arn
security_groups = [data.aws_security_group.eks_cluster_security_group.id]
is_aggressive_scale_down_enabled = true
max_scale_down_percentage = 33
tags = var.tags
}

resource "kubernetes_namespace" "airflow" {
metadata {
name = "airflow"
4 changes: 2 additions & 2 deletions modules/internal-k8-infra/provider.tf
@@ -8,14 +8,14 @@ provider "spotinst" {
}

provider "kubernetes" {
config_path = "~/.kube/config"
config_path = var.kube_config_path
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}

provider "helm" {
kubernetes {
config_path = "~/.kube/config"
config_path = var.kube_config_path
}
}
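
Since the kubeconfig location is now variable-driven, it can be overridden per environment instead of always reading `~/.kube/config` (the default set in variables.tf below). A hypothetical `terraform.tfvars` entry — the path shown is illustrative only:

```
# Hypothetical terraform.tfvars override; without it, the default "~/.kube/config" is used.
kube_config_path = "/home/runner/.kube/config"
```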
59 changes: 31 additions & 28 deletions modules/internal-k8-infra/variables.tf
@@ -1,28 +1,31 @@
variable "cluster_name" {
description = "Name of K8 cluster"
type = string
default = "dpe-k8"
}

variable "region" {
description = "AWS region"
type = string
default = "us-east-1"
}

variable "spotinst_account" {
description = "Spot.io account"
type = string
default = "act-ac6522b4"
}

variable "tags" {
description = "AWS Resource Tags"
type = map(string)
default = {
"CostCenter" = "No Program / 000000"
# "kubernetes.io/cluster/tyu-spot-ocean" = "owned",
# "key" = "kubernetes.io/cluster/tyu-spot-ocean",
# "value" = "owned"
}
}
variable "cluster_name" {
description = "Name of K8 cluster"
type = string
default = "dpe-k8"
}

variable "kube_config_path" {
description = "Kube config path"
type = string
default = "~/.kube/config"
}

variable "region" {
description = "AWS region"
type = string
default = "us-east-1"
}

variable "spotinst_account" {
description = "Spot.io account"
type = string
default = "act-ac6522b4"
}

variable "tags" {
description = "AWS Resource Tags"
type = map(string)
default = {
"CostCenter" = "No Program / 000000"
}
}
17 changes: 9 additions & 8 deletions modules/internal-k8-infra/versions.tf
@@ -1,8 +1,9 @@
terraform {
required_providers {
spotinst = {
source = "spotinst/spotinst"
version = "1.172.0" # Specify the version you wish to use
}
}
}
terraform {
required_version = "<= 1.5.7"
required_providers {
spotinst = {
source = "spotinst/spotinst"
version = "1.172.0" # Specify the version you wish to use
}
}
}
1 change: 1 addition & 0 deletions provider.tf
@@ -8,6 +8,7 @@ provider "spotinst" {
}

provider "kubernetes" {
config_path = var.kube_config_path
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token