[IBCDPE-1007] Monitoring and security scanning #14

Merged on Aug 15, 2024 (164 commits; changes shown from 163 commits)

Commits
c2c74de
Create a spacelift private workerpool
BryanFauble Jul 18, 2024
84c36db
Add the private workerpool module
BryanFauble Jul 18, 2024
3c08ae6
Allow conditional create of the workerpool
BryanFauble Jul 18, 2024
324ce7f
skip creating worker pool
BryanFauble Jul 18, 2024
4546b72
Add missed variable
BryanFauble Jul 18, 2024
d224c1b
increment workerpool
BryanFauble Jul 18, 2024
0e9324b
Correct version of helm chart
BryanFauble Jul 18, 2024
d3d5c24
Increment workerpool module version
BryanFauble Jul 18, 2024
c928d8a
Create the k8s worker pool
BryanFauble Jul 18, 2024
5b39416
Add warning for drift detection
BryanFauble Jul 18, 2024
83b96f1
Set to private worker pool id
BryanFauble Jul 18, 2024
63a6868
Enable drift detection via tf
BryanFauble Jul 18, 2024
f6378bb
correct resource name
BryanFauble Jul 18, 2024
ba2dbb4
Remove drift detection from stack
BryanFauble Jul 18, 2024
97f0412
Remove note
BryanFauble Jul 18, 2024
ebd8b3d
Comment out already imported block
BryanFauble Jul 18, 2024
b4517ec
Add module back for 2 step removal process
BryanFauble Jul 18, 2024
d53caa6
Remove private workerpool module
BryanFauble Jul 18, 2024
d60047a
Leave helm provider
BryanFauble Jul 18, 2024
dbf0d45
Merge branch 'ibcdpe-935-private-worker-pool' into ibcdpe-935-vpc-upd…
BryanFauble Jul 18, 2024
910b4e5
hacking around to get the helm_release out of state
BryanFauble Jul 18, 2024
98efa20
Leave module in to remove resources
BryanFauble Jul 18, 2024
3ada360
Remove module
BryanFauble Jul 18, 2024
846b1c2
Update to specify provider required versions in modules instead of pr…
BryanFauble Jul 18, 2024
90f38ef
Updating modules
BryanFauble Jul 18, 2024
b20890d
Remove provider that is not actually required
BryanFauble Jul 18, 2024
a6229bd
Try setting load bal ip ranges
BryanFauble Jul 18, 2024
589de1d
Capture flow logs
BryanFauble Jul 19, 2024
d1a8d28
Catpure flow logs
BryanFauble Jul 19, 2024
113e816
Add to documentation
BryanFauble Jul 19, 2024
200821a
Allow cloud watch logs to be toggled for the EKS module
BryanFauble Jul 19, 2024
9f7e206
Set cloudwatch retention to 1
BryanFauble Jul 19, 2024
a9cdfc3
Set log group class
BryanFauble Jul 19, 2024
00df837
Update to use new vpc module
BryanFauble Jul 19, 2024
5940683
Enable flow log
BryanFauble Jul 19, 2024
788c531
Increment module
BryanFauble Jul 19, 2024
1946b29
Change which port the frontend is running on
BryanFauble Jul 19, 2024
67d06ae
correct which port front-end is listening on
BryanFauble Jul 19, 2024
28d01db
update ports to 80 across the board
BryanFauble Jul 19, 2024
ffef0b2
Add security enforcement for pod
BryanFauble Jul 19, 2024
2d6694d
Leave enforcement on standard
BryanFauble Jul 19, 2024
169b977
set enforcement mode to strict
BryanFauble Jul 19, 2024
fb63178
Create a security group for client
BryanFauble Jul 19, 2024
45eb37a
Leave security group out
BryanFauble Jul 19, 2024
a6b5ff7
Leave out SG
BryanFauble Jul 20, 2024
33d841c
Leave out SG
BryanFauble Jul 20, 2024
fe22b73
Create aws integration for aws dev account
BryanFauble Jul 22, 2024
41d926f
Update integration ID for AWS
BryanFauble Jul 22, 2024
fda882c
Allow setting AWS account in EKS module
BryanFauble Jul 22, 2024
93ead31
Set AWS account to use for EKS module
BryanFauble Jul 22, 2024
d4e3f72
Change which spotinst account to connect to
BryanFauble Jul 22, 2024
984e83f
Apply pod level security group
BryanFauble Jul 22, 2024
3d84150
Add security groups to all pods
BryanFauble Jul 22, 2024
cf2e616
Single security group block
BryanFauble Jul 22, 2024
1b8fb87
rm tag
BryanFauble Jul 22, 2024
41bc5d9
Allow all ports
BryanFauble Jul 22, 2024
72faf17
egress from self
BryanFauble Jul 22, 2024
db0fb39
Allow self
BryanFauble Jul 22, 2024
6d51e35
Allow traffic from the EKS control plane
BryanFauble Jul 22, 2024
de781ff
Test allow egress to the control plane
BryanFauble Jul 22, 2024
c603640
Update to remove some testing
BryanFauble Jul 22, 2024
80e6308
Allow pod to node port 53 for DNS
BryanFauble Jul 22, 2024
a11ac32
Pass along and use the pod->node SG
BryanFauble Jul 22, 2024
14cda29
Increment EKS module used
BryanFauble Jul 22, 2024
6fbb444
Set type for node SG
BryanFauble Jul 22, 2024
092901b
Increment EKS module being used
BryanFauble Jul 22, 2024
ead32c0
Use private subnet cidrs in DNS rule
BryanFauble Jul 22, 2024
8ad2b53
increment eks module
BryanFauble Jul 22, 2024
ad16193
Correct var name
BryanFauble Jul 22, 2024
c9e6435
Correct definition
BryanFauble Jul 22, 2024
5e48a57
Update module
BryanFauble Jul 22, 2024
fa1d9f2
no array value
BryanFauble Jul 22, 2024
99a4e37
increment
BryanFauble Jul 22, 2024
dbb6403
Add ELB SG to pod
BryanFauble Jul 23, 2024
af78ec5
Allow inbound kubelet port from nodes
BryanFauble Jul 23, 2024
9d40ef8
Test allowing traffic from ELB
BryanFauble Jul 23, 2024
d15aa67
Try allowing all ports
BryanFauble Jul 23, 2024
40309d2
Swap over to standard enforcement
BryanFauble Jul 23, 2024
d70351c
default deny stars and client ns
BryanFauble Jul 23, 2024
bbf9574
Add more allowed connections
BryanFauble Jul 23, 2024
f3d4875
New policies
BryanFauble Jul 23, 2024
1c7cfdd
Capture CW
BryanFauble Jul 23, 2024
2f5947b
Increment module
BryanFauble Jul 23, 2024
3ce93d5
Allow cw logs to be created
BryanFauble Jul 23, 2024
106f75a
increment autoscaler
BryanFauble Jul 23, 2024
e8f82b5
Allow kube system traffic
BryanFauble Jul 23, 2024
e2301d2
correct port
BryanFauble Jul 23, 2024
d573cf7
Add egress policies as well
BryanFauble Jul 23, 2024
c5294f2
Set egress policy for client
BryanFauble Jul 23, 2024
9707f0b
Set NS and pod selector
BryanFauble Jul 23, 2024
11b5848
correct selector
BryanFauble Jul 23, 2024
045e143
Correct NS selectors
BryanFauble Jul 23, 2024
a307ea9
Adding docs and pushing changes to stand alone modules
BryanFauble Jul 23, 2024
63c5804
Point to main branch
BryanFauble Jul 23, 2024
89f7e84
Merge branch 'main' into ibcdpe-935-vpc-updates
BryanFauble Jul 25, 2024
e147349
Default to standard
BryanFauble Jul 25, 2024
07d593a
Add VPC diagram
BryanFauble Jul 25, 2024
525627e
Create VM and point to branch
BryanFauble Jul 26, 2024
62bb601
Delete bad copy
BryanFauble Jul 26, 2024
c8e3906
Remove notes
BryanFauble Jul 26, 2024
4bc4065
Set keepers on version
BryanFauble Jul 26, 2024
b808676
a
BryanFauble Jul 26, 2024
706894a
Deploy VM
BryanFauble Jul 26, 2024
4539387
Correct ID that changed for some reason
BryanFauble Jul 26, 2024
75705d6
Comment out other helm repo
BryanFauble Jul 26, 2024
e30b6cd
Increment
BryanFauble Jul 26, 2024
6c0ccaa
Correct cluster name
BryanFauble Jul 26, 2024
6baea65
Correct ver
BryanFauble Jul 26, 2024
cf9d591
Increment
BryanFauble Jul 26, 2024
b0bab3e
Allow desired capacity to be set
BryanFauble Jul 26, 2024
f34dca6
Bump capacity for testing
BryanFauble Jul 26, 2024
3fdc694
Create otel-collector
BryanFauble Jul 26, 2024
1e29e36
Deploy otel collector
BryanFauble Jul 26, 2024
2989cfe
Correct values interpolation
BryanFauble Jul 26, 2024
18c9a50
Increment
BryanFauble Jul 26, 2024
971cb39
Update values
BryanFauble Jul 26, 2024
a6ef47c
Deploy updated otel collector
BryanFauble Jul 26, 2024
588d36e
Create cert-manager deployment
BryanFauble Jul 26, 2024
96c3c70
Create trivy-operator
BryanFauble Jul 29, 2024
93d03ac
Enabled trivy service scrape
BryanFauble Jul 29, 2024
e29e990
Create a service scrape for the trivy operator
BryanFauble Jul 29, 2024
3e2f867
Create vulnarability dashboard in VM
BryanFauble Jul 29, 2024
7b62255
Remove stack dependency
BryanFauble Jul 29, 2024
cde5502
Exclude amazon specific images
BryanFauble Jul 29, 2024
e13cd42
Increment
BryanFauble Jul 29, 2024
c889cba
Remove var
BryanFauble Jul 29, 2024
d0027bb
Add policy reporter to view scan results
BryanFauble Jul 29, 2024
49718cd
Increment operator
BryanFauble Jul 29, 2024
9644ea2
Update trivy
BryanFauble Jul 29, 2024
ec7dbc8
Increment
BryanFauble Jul 29, 2024
0daffec
Correct mistake
BryanFauble Jul 29, 2024
ca41933
Increment operator
BryanFauble Jul 29, 2024
220739a
Remove CISKubeBenchReport
BryanFauble Jul 29, 2024
0c05c79
Increment
BryanFauble Jul 29, 2024
ed3a8e8
Add to readme
BryanFauble Jul 29, 2024
18beeaa
Remove not yet implemented modules
BryanFauble Jul 29, 2024
eae8e08
Set default resources
BryanFauble Jul 29, 2024
71cd4f5
Increment
BryanFauble Jul 29, 2024
879fb4f
Bump defaults
BryanFauble Jul 29, 2024
3ef7bd7
Bump up version
BryanFauble Jul 29, 2024
ba6caf6
Turn of alert and bump up scrap interval
BryanFauble Jul 30, 2024
6fbc513
Increment
BryanFauble Jul 30, 2024
7dc8b31
Adjust interval back
BryanFauble Jul 30, 2024
0121a37
Increment
BryanFauble Jul 30, 2024
4183734
flips accessGlobalSecretsAndServiceAccount to false for values-trivy …
BWMac Jul 31, 2024
f683b13
increments trivy-operator version
BWMac Jul 31, 2024
d037fba
increments trivy-operator version for deployment
BWMac Jul 31, 2024
97cdff2
Adding apache airflow module
BryanFauble Jul 31, 2024
89b641f
Deploy airflow
BryanFauble Jul 31, 2024
59ea85e
Leave airflow turned off
BryanFauble Jul 31, 2024
b931e5f
Deploy ariflow
BryanFauble Jul 31, 2024
b2c577c
Update eks module
BryanFauble Jul 31, 2024
e776195
Correct var reference
BryanFauble Jul 31, 2024
0bd5037
Correction
BryanFauble Jul 31, 2024
fdeb6f6
Increment eks module
BryanFauble Jul 31, 2024
a154a5f
Increment vpc version
BryanFauble Jul 31, 2024
862838a
Update the autoscaler to use nitro based instances
BryanFauble Jul 31, 2024
51c5318
Increment autoscaler
BryanFauble Jul 31, 2024
1949755
Update where filter is defined
BryanFauble Jul 31, 2024
7e20928
Increment autoscaler
BryanFauble Jul 31, 2024
3ad49d5
Set required properties
BryanFauble Jul 31, 2024
a55a11d
Increment
BryanFauble Jul 31, 2024
abbe412
Remove files that are not needed
BryanFauble Jul 31, 2024
122419c
Merge branch 'main' into ibcdpe-1007-monitoring
BryanFauble Aug 15, 2024
93 changes: 92 additions & 1 deletion README.md
@@ -17,6 +17,8 @@ This repo is used to deploy an EKS cluster to AWS. CI/CD is managed through Spac
└── modules: Templatized collections of terraform resources that are used in a stack
├── apache-airflow: K8s deployment for apache airflow
│ └── templates: Resources used during deployment of airflow
├── demo-network-policies: K8s deployment for a demo showcasing how to use network policies
├── demo-pod-level-security-groups-strict: K8s deployment for a demo showcasing how to use pod level security groups in strict mode
├── sage-aws-eks: Sage specific EKS cluster for AWS
├── sage-aws-k8s-node-autoscaler: K8s node autoscaler using spotinst ocean
└── sage-aws-vpc: Sage specific VPC for AWS
@@ -54,6 +56,10 @@ configurable parameters in order to run a number of workloads.
#### EKS API access
API access to the kubernetes cluster endpoint is set to `Public and private`.

Reading:

- <https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/network_connectivity.md>

##### Public
This allows clients outside of the VPC to connect via `kubectl` and related tools to
interact with kubernetes resources. By default, this API server endpoint is public to
@@ -78,9 +84,13 @@ Kubernetes nodes and configuring the necessary networking for Pods on each node.
Allows us to assign EC2 security groups directly to pods running in AWS EKS clusters.
This can be used as an alternative to, or in conjunction with, `Kubernetes network policies`.

See `modules/demo-pod-level-security-groups-strict` for more context on how this works.

#### Kubernetes network policies
Controls network traffic within the cluster, for example pod to pod traffic.

See `modules/demo-network-policies` for more context on how this works.
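
As a rough illustration of what such a policy can look like, the sketch below defines a default-deny policy using the Terraform `kubernetes_network_policy` resource. It is not copied from `modules/demo-network-policies`; the resource name `default_deny` and the `client` namespace are assumptions made for the example.

```hcl
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

# Sketch only: deny all traffic to and from pods in one namespace.
# The namespace name "client" is an assumption for illustration.
resource "kubernetes_network_policy" "default_deny" {
  metadata {
    name      = "default-deny"
    namespace = "client"
  }

  spec {
    # An empty pod selector matches every pod in the namespace.
    pod_selector {}

    # Listing both policy types without any ingress or egress rules
    # blocks traffic in both directions for the selected pods.
    policy_types = ["Ingress", "Egress"]
  }
}
```

Note that denying egress this way also blocks DNS lookups, so a policy like this is normally paired with explicit allow rules for the traffic the workload needs.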

Further reading:
- https://docs.aws.amazon.com/eks/latest/userguide/cni-network-policy.html
- https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
@@ -90,11 +100,26 @@ Further reading:

#### EKS Autoscaler

We use spot.io to manage the nodes attached to each EKS cluster. This tool has
scale-to-zero capabilities and will dynamically add or remove nodes from the cluster
depending on demand. The autoscaler is templatized and provided as a
terraform module to be used within an EKS stack.

Setup of spotio (Manual per AWS Account):

* Subscribe through the AWS Marketplace: <https://aws.amazon.com/marketplace/saas/ordering?productId=bc241ac2-7b41-4fdd-89d1-6928ec6dae15>
* "Set up your account" on the spotio website and link it to an existing organization
* Link the account through the AWS UI:
  * Create a policy (See the JSON in the spotio UI)
  * Create a role (See instructions in the spotio UI)

After this has been set up, the last step is to get an API token from the spotio UI and
add it to AWS Secrets Manager.

* Log into the spot UI and go to <https://console.spotinst.com/settings/v2/tokens/permanent>
* Create a new Permanent token, name it `{AWS-Account-Name}-token` or similar
* Copy the token and create an `AWS Secrets Manager` Plaintext secret named `spotinst_token` with a description `Spot.io token`
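
Once the secret exists, Terraform can read it at plan time so the token never has to be checked in. The sketch below is an assumption about how that wiring might look, not the exact configuration in this repo: it looks the secret up by the name `spotinst_token` and passes its value to the `spotinst` provider. The `spotinst_account` variable is hypothetical.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    spotinst = {
      source = "spotinst/spotinst"
    }
  }
}

# Hypothetical variable: the spot.io account ID linked to this AWS account.
variable "spotinst_account" {
  description = "Spot.io account ID"
  type        = string
}

# Look up the secret created in the manual step above (assumed name).
data "aws_secretsmanager_secret" "spotinst_token" {
  name = "spotinst_token"
}

data "aws_secretsmanager_secret_version" "secret_credentials" {
  secret_id = data.aws_secretsmanager_secret.spotinst_token.id
}

# Because the secret is stored as plaintext, secret_string is the token itself.
provider "spotinst" {
  token   = data.aws_secretsmanager_secret_version.secret_credentials.secret_string
  account = var.spotinst_account
}
```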


#### Connecting to an EKS cluster for kubectl commands

@@ -111,3 +136,69 @@ aws sso login --profile dpe-prod-admin
# cluster". This will update your kubeconfig with permissions to access the cluster.
aws eks update-kubeconfig --region us-east-1 --name dpe-k8 --role-arn arn:aws:iam::766808016710:role/eks_admin_role --profile dpe-prod-admin
```

### Spacelift
Here are some instructions on setting up spacelift.


#### Connecting a new AWS account for cloud integration

The steps below are an abbreviated version of the process described in the Spacelift documentation:
<https://docs.spacelift.io/integrations/cloud-providers/aws#setup-guide>

- Create a new role and set its name to something unique within the account, such as `spacelift-admin-role`
  - Description: "Role for spacelift CICD to assume when deploying resources managed by terraform"
  - Use the custom trust policy below:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::324880187172:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringLike": {
          "sts:ExternalId": "sagebionetworks@*"
        }
      }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::{{AWS ACCOUNT ID}}:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

- Attach a few policies to the role:
  - `PowerUserAccess`
  - Create an inline policy to allow interaction with IAM (needed if TF is going to be creating, editing, and deleting IAM roles/policies):
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:*Role",
        "iam:*RolePolicy",
        "iam:*RolePolicies",
        "iam:*Policy",
        "iam:*PolicyVersion",
        "iam:*OpenIDConnectProvider",
        "iam:*InstanceProfile"
      ],
      "Resource": "*"
    }
  ]
}
```
- Add a new `spacelift_aws_integration` resource to the `common-resources/aws-integrations` directory.
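
The manual role setup above could also be expressed in Terraform. The following is only a sketch under a few assumptions: the resource labels and the inline policy name `spacelift-iam-management` are invented for the example, and `{{AWS ACCOUNT ID}}` is taken to mean the account the role lives in. It would need to be applied once with credentials from outside Spacelift, since Spacelift cannot assume the role before it exists.

```hcl
# Sketch only: Terraform equivalent of the manual role setup.
# Assumes the AWS provider is already configured for the target account.
data "aws_caller_identity" "current" {}

resource "aws_iam_role" "spacelift_admin" {
  name        = "spacelift-admin-role"
  description = "Role for spacelift CICD to assume when deploying resources managed by terraform"

  # Same trust policy as the JSON above, with the account ID filled in
  # from the caller identity (an assumption about the placeholder).
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::324880187172:root" }
        Action    = "sts:AssumeRole"
        Condition = {
          StringLike = { "sts:ExternalId" = "sagebionetworks@*" }
        }
      },
      {
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "sts:AssumeRole"
      }
    ]
  })
}

# AWS managed policy listed in the steps above.
resource "aws_iam_role_policy_attachment" "power_user" {
  role       = aws_iam_role.spacelift_admin.name
  policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess"
}

# Inline policy allowing IAM management; the policy name is hypothetical.
resource "aws_iam_role_policy" "iam_management" {
  name = "spacelift-iam-management"
  role = aws_iam_role.spacelift_admin.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "iam:*Role",
          "iam:*RolePolicy",
          "iam:*RolePolicies",
          "iam:*Policy",
          "iam:*PolicyVersion",
          "iam:*OpenIDConnectProvider",
          "iam:*InstanceProfile"
        ]
        Resource = "*"
      }
    ]
  })
}
```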

8 changes: 8 additions & 0 deletions common-resources/aws-integrations/main.tf
@@ -0,0 +1,8 @@
# Resources derived from: https://registry.terraform.io/providers/spacelift-io/spacelift/latest/docs/resources/aws_integration
resource "spacelift_aws_integration" "org-sagebase-dnt-dev-aws-integration" {
  name                           = "org-sagebase-dnt-dev-aws-integration"
  role_arn                       = "arn:aws:iam::631692904429:role/spacelift-admin-role"
  generate_credentials_in_worker = false
  duration_seconds               = 3600
  space_id                       = "root"
}
8 changes: 8 additions & 0 deletions common-resources/aws-integrations/versions.tf
@@ -0,0 +1,8 @@
terraform {
  required_providers {
    spacelift = {
      source  = "spacelift-io/spacelift"
      version = "1.13.0"
    }
  }
}
4 changes: 4 additions & 0 deletions common-resources/main.tf
@@ -5,3 +5,7 @@ module "policies" {
module "contexts" {
  source = "./contexts"
}

module "aws-integrations" {
  source = "./aws-integrations"
}
1 change: 1 addition & 0 deletions dev/main.tf
@@ -8,4 +8,5 @@ resource "spacelift_space" "development" {
module "dpe-sandbox-spacelift" {
  source          = "./spacelift/dpe-sandbox"
  parent_space_id = spacelift_space.development.id
  admin_stack_id  = var.admin_stack_id
}
26 changes: 22 additions & 4 deletions dev/spacelift/dpe-sandbox/main.tf
@@ -13,7 +13,7 @@ resource "spacelift_stack" "k8s-stack" {

administrative = false
autodeploy = true
- branch = "ibcdpe-935-vpc-updates"
+ branch = "ibcdpe-1007-monitoring"
description = "Infrastructure to support deploying to an EKS cluster"
name = "DPE DEV Kubernetes Infrastructure"
project_root = "dev/stacks/dpe-sandbox-k8s"
@@ -31,7 +31,7 @@ resource "spacelift_stack" "k8s-stack-deployments" {

administrative = false
autodeploy = true
- branch = "ibcdpe-935-vpc-updates"
+ branch = "ibcdpe-1007-monitoring"
description = "Deployments internal to an EKS cluster"
name = "DPE DEV Kubernetes Deployments"
project_root = "dev/stacks/dpe-sandbox-k8s-deployments"
@@ -41,6 +41,16 @@ resource "spacelift_stack" "k8s-stack-deployments" {
space_id = spacelift_space.dpe-sandbox.id
}

# resource "spacelift_stack_dependency" "dependency-on-admin-stack" {
#   for_each = {
#     k8s-stack             = spacelift_stack.k8s-stack,
#     k8s-stack-deployments = spacelift_stack.k8s-stack-deployments
#   }

#   stack_id            = each.value.id
#   depends_on_stack_id = var.admin_stack_id
# }

resource "spacelift_context_attachment" "k8s-kubeconfig-hooks" {
context_id = "kubernetes-deployments-kubeconfig"
stack_id = spacelift_stack.k8s-stack-deployments.id
@@ -69,6 +79,12 @@ resource "spacelift_stack_dependency_reference" "security-group-id-reference" {
input_name = "TF_VAR_node_security_group_id"
}

resource "spacelift_stack_dependency_reference" "pod-to-node-security-group-id-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "pod_to_node_dns_sg_id"
  input_name          = "TF_VAR_pod_to_node_dns_sg_id"
}

resource "spacelift_stack_dependency_reference" "vpc-cidr-block-reference" {
stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
output_name = "vpc_cidr_block"
@@ -111,14 +127,16 @@ resource "spacelift_stack_destructor" "k8s-stack-destructor" {
}

resource "spacelift_aws_integration_attachment" "k8s-aws-integration-attachment" {
- integration_id = "01HXW154N60KJ8NCC93H1VYPNM"
+ # org-sagebase-dnt-dev-aws-integration
+ integration_id = "01J3R9GX6DC09QV7NV872DDYR3"
stack_id = spacelift_stack.k8s-stack.id
read = true
write = true
}

resource "spacelift_aws_integration_attachment" "k8s-deployments-aws-integration-attachment" {
- integration_id = "01HXW154N60KJ8NCC93H1VYPNM"
+ # org-sagebase-dnt-dev-aws-integration
+ integration_id = "01J3R9GX6DC09QV7NV872DDYR3"
stack_id = spacelift_stack.k8s-stack-deployments.id
read = true
write = true
5 changes: 5 additions & 0 deletions dev/spacelift/dpe-sandbox/variables.tf
@@ -10,3 +10,8 @@ variable "tags" {
"CostCenter" = "No Program / 000000"
}
}

variable "admin_stack_id" {
  description = "ID of the admin stack"
  type        = string
}
1 change: 0 additions & 1 deletion dev/stacks/dpe-sandbox-k8s-deployments/data.tf
@@ -13,4 +13,3 @@ data "aws_secretsmanager_secret" "spotinst_token" {
data "aws_secretsmanager_secret_version" "secret_credentials" {
  secret_id = data.aws_secretsmanager_secret.spotinst_token.id
}
