
[IBCDPE-935] VPC Updates & VPC CNI Exploration #13

Merged — 97 commits merged into main on Aug 15, 2024

Conversation

BryanFauble
Contributor

Problem:

  1. The deployment needed to move over to the dnt-dev AWS account.
  2. More instructions were needed on how to set up the AWS integration with Spacelift and the integration with SpotIO.
  3. Some exploration of how to configure the VPC CNI plugin was needed.
  4. Some files were no longer needed.
  5. A number of logs were not configured to be sent to CloudWatch.

Solution:

  1. Moved to the dnt-dev AWS account and set up the SpotIO and Spacelift integrations, along with notes on how these were configured.
  2. Created a demo module for both pod-level security groups and network policies, along with recommendations within the Apache Airflow module about things to implement there.
  3. Cleaned up some files that were no longer needed.
  4. Created the configuration to export logs out to CloudWatch, with a retention period of 1 day for this dev deployment.
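The log-export setup in item 4 could look something like the following sketch. The resource names, cluster name, and log group path are illustrative assumptions, not the module's actual identifiers:

```hcl
# Hypothetical sketch: ship EKS control-plane logs to CloudWatch with a short
# dev retention period. All names/values here are assumptions for illustration.
resource "aws_cloudwatch_log_group" "eks_control_plane" {
  # EKS writes control-plane logs to /aws/eks/<cluster-name>/cluster
  name              = "/aws/eks/dpe-sandbox/cluster"
  retention_in_days = 1 # dev only; a longer period is planned before prod
}

resource "aws_eks_cluster" "this" {
  name     = "dpe-sandbox"
  role_arn = var.cluster_role_arn

  # Enable the control-plane log types that should be exported to CloudWatch.
  enabled_cluster_log_types = ["api", "audit", "authenticator"]

  vpc_config {
    subnet_ids = var.subnet_ids
  }

  # Create the log group first so EKS does not create one with no retention set.
  depends_on = [aws_cloudwatch_log_group.eks_control_plane]
}
```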

Testing:

  1. Everything was deployed to the dnt-dev AWS account, and that is where testing for both pod-level security groups and network policies was done.
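The pod-level security group side of the demo described above can be sketched with the VPC CNI's `SecurityGroupPolicy` CRD, applied here via the Terraform Kubernetes provider. The namespace, labels, and resource names are assumptions for illustration:

```hcl
# Hypothetical sketch of a pod-level security group binding. With the VPC CNI's
# pod security group support enabled, the SecurityGroupPolicy CRD attaches the
# listed security groups to pods matched by the selector.
resource "kubernetes_manifest" "pod_security_group_policy" {
  manifest = {
    apiVersion = "vpcresources.k8s.aws/v1beta1"
    kind       = "SecurityGroupPolicy"
    metadata = {
      name      = "demo-pod-sg"   # assumed name
      namespace = "airflow"       # assumed namespace
    }
    spec = {
      podSelector = {
        matchLabels = { app = "demo" } # assumed label
      }
      securityGroups = {
        groupIds = [var.pod_to_node_dns_sg_id]
      }
    }
  }
}
```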

@BryanFauble BryanFauble requested a review from a team as a code owner July 23, 2024 21:46
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/root-spacelift-administrative-stack July 23, 2024 21:48 Inactive
Base automatically changed from ibcdpe-935-private-worker-pool to main July 25, 2024 16:54
@@ -13,6 +13,11 @@ variable "node_security_group_id" {
type = string
}

variable "pod_to_node_dns_sg_id" {


Should we have a default for this variable?

Contributor Author

No. We would have no way of knowing what it would be. The way this works is that in the Spacelift Terraform modules you'll see this block:

resource "spacelift_stack_dependency_reference" "pod-to-node-security-group-id-reference" {
  stack_dependency_id = spacelift_stack_dependency.k8s-stack-to-deployments.id
  output_name         = "pod_to_node_dns_sg_id"
  input_name          = "TF_VAR_pod_to_node_dns_sg_id"
}

This passes the output from the k8s infrastructure stack (which contains the EKS module) as input to the k8s deployments stack (which deploys the Kubernetes resources).
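For context, the `spacelift_stack_dependency.k8s-stack-to-deployments` referenced above would be declared roughly like this; the stack resource names are assumptions for illustration:

```hcl
# Hypothetical sketch: declare that the k8s deployments stack depends on the
# k8s infrastructure stack, so outputs can be wired across with
# spacelift_stack_dependency_reference blocks like the one above.
resource "spacelift_stack_dependency" "k8s-stack-to-deployments" {
  stack_id            = spacelift_stack.k8s-deployments-stack.id # assumed name
  depends_on_stack_id = spacelift_stack.k8s-stack.id
}
```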

  stack_id = spacelift_stack.k8s-stack.id
  read     = true
  write    = true
}

resource "spacelift_aws_integration_attachment" "k8s-deployments-aws-integration-attachment" {
  integration_id = "01HXW154N60KJ8NCC93H1VYPNM"
  # org-sagebase-dnt-dev-aws-integration


Is this for dev or prod?

Contributor Author

This is dev

  version            = "0.3.3"
  vpc_name           = "dpe-sandbox"
  capture_flow_logs  = true
  flow_log_retention = 1


Do we want to increase log retention to, e.g., 3 days in case we have things running and breaking over the weekend?

Contributor Author

For dev I don't think we will need to. However, yes, we will want a retention period longer than a day. I'm not sure at this point what that period should be, but it's something I will determine before we move over to prod.
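Under the hood, `capture_flow_logs` and `flow_log_retention` presumably map to something like the following; the resource names and log group path are assumptions:

```hcl
# Hypothetical sketch: VPC flow logs delivered to CloudWatch Logs with a
# configurable retention period.
resource "aws_cloudwatch_log_group" "flow_logs" {
  name              = "/aws/vpc/dpe-sandbox/flow-logs" # assumed path
  retention_in_days = var.flow_log_retention
}

resource "aws_flow_log" "this" {
  vpc_id          = aws_vpc.this.id
  traffic_type    = "ALL"
  log_destination = aws_cloudwatch_log_group.flow_logs.arn
  iam_role_arn    = var.flow_log_role_arn # role allowed to write to the group
}
```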

Contributor Author

Word on the street is this might be 90 days. Still looking to confirm with IT on this.

Contributor

@BWMac BWMac left a comment

LGTM!

variable "cloudwatch_retention" {
  description = "Number of days to retain CloudWatch Logs"
  type        = number
  default     = 1
}
Contributor

Is this enough retention?

Contributor Author

In a later PR I have bumped this to 90.

Contributor

@thomasyu888 thomasyu888 left a comment

🔥 LGTM! It'll be great to eventually deploy this in dpe-prod to run production workflows. Great reviews!

@BryanFauble BryanFauble merged commit c91409e into main Aug 15, 2024
2 of 10 checks passed
@BryanFauble BryanFauble deleted the ibcdpe-935-vpc-updates branch August 15, 2024 23:02