[SCHEMATIC-138] SigNoz cold storage and backups #47

Merged
merged 136 commits into from
Nov 21, 2024

@BryanFauble BryanFauble commented Nov 6, 2024

https://sagebionetworks.jira.com/browse/SCHEMATIC-138

Problem:

  1. We had nothing in place to support tiered storage (keeping some data on Elastic Block Store volumes and some on S3), which we need to stay cost-effective as the volume of data grows.
  2. We had nothing in place to back up or restore the ClickHouse data.

Solution:

  1. Implement an S3 module that creates an S3 bucket along with the policies an IAM role needs to manage that bucket.
  2. Use IRSA (IAM Roles for Service Accounts) so that a Kubernetes service account can assume an IAM role and receive its permissions. This allows the service account to access and manage the S3 bucket.
  3. Migrate from ArgoCD to FluxCD. FluxCD supports postRenderers, which let us inject custom configuration into the rendered Helm chart. ArgoCD lacks this feature, which prevented us from continuing to use it.
  4. Attach the clickhouse-backup tool (from the same creators as the ClickHouse operator in use) to support backing up and restoring the data.
  5. Update the storage configuration of the ClickHouse server to support tiered storage and offload "cold" data to S3, lowering the overall amount of data the EBS volumes need to hold.
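
For readers unfamiliar with how items 2 and 3 fit together, here is a minimal sketch of a Flux HelmRelease that uses a Kustomize post-renderer to add the IRSA role annotation to a service account in the rendered chart. It is an illustration under assumptions, not the configuration from this PR: the release name, namespace, chart source, service account name, AWS account ID, and role name are all hypothetical placeholders.

```yaml
# Sketch only: every name and the role ARN below are hypothetical placeholders.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: signoz
  namespace: signoz
spec:
  interval: 10m
  chart:
    spec:
      chart: signoz
      sourceRef:
        kind: HelmRepository
        name: signoz
  # Post-renderers run after Helm has rendered the chart templates,
  # so they can patch resources the chart values do not expose.
  postRenderers:
    - kustomize:
        patches:
          - target:
              kind: ServiceAccount
              name: clickhouse-backup
            # Strategic merge patch: adds the annotation and keeps any existing ones.
            patch: |
              apiVersion: v1
              kind: ServiceAccount
              metadata:
                name: clickhouse-backup
                annotations:
                  eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/clickhouse-backup
```

The `eks.amazonaws.com/role-arn` annotation is what EKS uses to associate a service account with an IAM role. The IAM role's trust policy must also allow the cluster's OIDC provider to assume it on behalf of that service account.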

Testing:

  1. Brad and I tested the backup/restore process with a new k8s cluster.
  2. We verified that data was being offloaded from the EBS volumes into S3.
  3. We verified that the resources were being deployed through FluxCD.
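
As background for step 2, the offloading depends on a ClickHouse storage policy that lists a local volume and an S3 volume. The fragment below is a sketch of such a policy in ClickHouse's YAML configuration format, not the settings merged in this PR: the bucket endpoint, the policy and volume names, and the `move_factor` value are assumptions chosen for illustration.

```yaml
# Sketch only: endpoint, names, and thresholds are hypothetical.
storage_configuration:
  disks:
    # The built-in local disk (backed by the EBS volume in this setup).
    default:
      keep_free_space_bytes: 10485760
    s3:
      type: s3
      endpoint: https://example-bucket.s3.us-east-1.amazonaws.com/data/
      # Take credentials from the environment instead of static keys.
      use_environment_credentials: true
  policies:
    tiered:
      volumes:
        # Volumes are tried in order: new parts land on the local disk first.
        default:
          disk: default
        s3:
          disk: s3
      # Start moving parts to the next volume once free space on the
      # current volume drops below this fraction.
      move_factor: 0.1
```

A table uses the policy only if it is created or altered with `storage_policy = 'tiered'`. Data can also be moved by age with a TTL clause that targets the `s3` volume.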

@BryanFauble BryanFauble force-pushed the schematic-138-cold-storage-and-backups branch from 330562c to e000497 Compare November 6, 2024 23:09
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/brad-sandbox-kubernetes-deployments November 7, 2024 21:53 Inactive
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/brad-sandbox-kubernetes-deployments November 7, 2024 22:12 Inactive
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/brad-sandbox-kubernetes-deployments November 7, 2024 22:19 Inactive
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/brad-sandbox-kubernetes-deployments November 7, 2024 22:53 Inactive
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/dpe-dev-kubernetes-deployments November 20, 2024 19:39 Inactive
@spacelift-int-sagebionetworks spacelift-int-sagebionetworks bot temporarily deployed to spacelift/dpe-dev-kubernetes-deployments November 20, 2024 19:48 Inactive
Comment on lines +1 to +6
# global

installCRDs: true
crds:
  # -- Add annotations to all CRD resources, e.g. "helm.sh/resource-policy": keep
  annotations: {}
All default config. I didn't see anything needing to be updated

@BryanFauble BryanFauble marked this pull request as ready for review November 20, 2024 20:56
@BryanFauble BryanFauble requested a review from a team as a code owner November 20, 2024 20:56
@BWMac BWMac left a comment

LGTM! Thanks again for your help on this

@BryanFauble BryanFauble merged commit 74f33bf into signoz-testing Nov 21, 2024
6 of 11 checks passed
@BryanFauble BryanFauble deleted the schematic-138-cold-storage-and-backups branch November 21, 2024 16:27
2 participants