Docs and DX improvements #6

Merged on Jul 30, 2024 (20 commits)
Commits
aaf291e
chore(DX): auto-generate .env via terraform
chapati23 Jul 24, 2024
3ac9ce5
chore: set author in package.json to Mento Labs
chapati23 Jul 24, 2024
9380336
chore: remove hardcoded function name and entry point from deploy script
chapati23 Jul 24, 2024
0a8a6e7
chore: renamed 'npm test' tasks to more standard names
chapati23 Jul 24, 2024
b81aa66
docs: updated & improved README
chapati23 Jul 24, 2024
d5a617b
chore: updated checkov linter
chapati23 Jul 24, 2024
b22eb3c
chore: increase printed log entries to 50 for get-logs.sh script
chapati23 Jul 24, 2024
04ccd00
docs(README): added Debugging section
chapati23 Jul 24, 2024
ccacb5d
chore: removed superfluous comments in cloud function tf
chapati23 Jul 25, 2024
17deeb2
build: made shell scripts more robust and reusable across projects
chapati23 Jul 25, 2024
9c1bd51
chore(DX): added caching speed up shell scripts
chapati23 Jul 26, 2024
f2bbed5
build: added 'npm run todo' task to print all TODOs/FIXMEs
chapati23 Jul 26, 2024
1e0cb2a
fix: added a shellcheck disable for a case it cant handle
chapati23 Jul 26, 2024
d92632b
Merge branch 'main' into chore/docs-and-dx-improvements
chapati23 Jul 26, 2024
aba11cd
Merge branch 'main' into chore/docs-and-dx-improvements
chapati23 Jul 26, 2024
0bf3c3a
Merge branch 'main' into chore/docs-and-dx-improvements
nvtaveras Jul 29, 2024
229000b
fix: update telegram bot token cmd
nvtaveras Jul 30, 2024
d1054a5
chore: docs improvements
nvtaveras Jul 30, 2024
7e83abd
docs: add victorops_webhook_url
nvtaveras Jul 30, 2024
e404180
chore: trunk
nvtaveras Jul 30, 2024
15 changes: 11 additions & 4 deletions .env.example
@@ -1,11 +1,18 @@
# YOU DO NOT NEED TO CREATE A .env FILE MANUALLY
# Terraform will automatically create a local .env file with the required environment variables when you run `terraform apply`.
# This file is just to illustrate the required environment variables for the function to work locally.

# Required for the function to be able to look up the Discord Webhook URL in GCP Secret Manager.
# Get it via `gcloud projects list --filter="name:governance-watchdog*" --format="value(projectId)")`
# You can check it manually via `gcloud projects list --filter="name:governance-watchdog*" --format="value(projectId)"`
GCP_PROJECT_ID=

# Required for the function to be able to look up the Discord Webhook URL and Telegram Bot Token in GCP Secret Manager.
# Get it via `gcloud secrets list`
# Required for the function to be able to look up secrets in GCP Secret Manager.
# You can check it manually via `gcloud secrets list`
DISCORD_WEBHOOK_URL_SECRET_ID=
TELEGRAM_BOT_TOKEN_SECRET_ID=

# Get it via inviting @MissRose_bot to the telegram group and then using the `/id` command (please remove the bot after you're done)
# You can check it manually either via
# a) `terraform state show "google_cloudfunctions2_function.watchdog_notifications" | grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"'`
# OR
# b) inviting @MissRose_bot to the telegram group and then using the `/id` command (please remove the bot after you're done)
TELEGRAM_CHAT_ID=
5 changes: 4 additions & 1 deletion .gitignore
@@ -11,4 +11,7 @@ errored.tfstate
.env.yaml
dist/
function-source.zip
node_modules/
node_modules/

# Local Stuff
.project_vars_cache
223 changes: 128 additions & 95 deletions README.md
@@ -8,18 +8,20 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance

![Architecture Diagram](arch-diagram.png)

- [Requirements](#requirements)
- [Local Development of Cloud Function Code](#local-development-of-cloud-function-code)
- [Requirements for local development](#requirements-for-local-development)
- [Local Infra Setup (when project is deployed already)](#local-infra-setup-when-project-is-deployed-already)
- [Running and testing the Cloud Function locally](#running-and-testing-the-cloud-function-locally)
- [Testing the Deployed Cloud Function](#testing-the-deployed-cloud-function)
- [Infra Setup (when project is deployed already)](#infra-setup-when-project-is-deployed-already)
- [First Time Infra Deployment via Terraform](#first-time-infra-deployment-via-terraform)
- [Updating the Cloud Function](#updating-the-cloud-function)
- [Infra Deployment via Terraform](#infra-deployment-via-terraform)
- [Google Cloud Permission Requirements](#google-cloud-permission-requirements)
- [Deployment from scratch](#deployment-from-scratch)
- [Migrate Terraform State to Google Cloud](#migrate-terraform-state-to-google-cloud)
- [Updating the Cloud Function](#updating-the-cloud-function)
- [Debugging Problems](#debugging-problems)
- [View Logs](#view-logs)
- [Teardown](#teardown)

## Requirements
## Requirements for local development

1. Install the `gcloud` CLI

@@ -51,6 +53,15 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance
# For other systems, see https://developer.hashicorp.com/terraform/install
```

1. Install `jq` (used in a few shell scripts)

```sh
# On macOS
brew install jq

# For other systems, see https://jqlang.github.io/jq/
```

1. Authenticate with Google Cloud default credentials in your local shell

```sh
@@ -70,46 +81,8 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance
1. A Telegram group to send notifications to

1. A Telegram bot must be in the group to receive the notifications.
If you're doing this from scratch, here's how to create a bot

- Open a new chat with @BotFather
- Use the `/newbot` command to create a new bot
- Copy the API key printed out at the end of the prompt and store it in your `terraform.tfvars`

```hcl
telegram_bot_token = "<bot-api-key>"
```

- Get the Chat ID by inviting @MissRose_bot to the group and then using the `/id` command
- Add the Chat ID to your `terraform.tfvars`

```hcl
telegram_chat_id = "<group-chat-id>"
```

- Remove @MissRose_bot after you got the Chat ID

## Local Development of Cloud Function Code

- `npm install` (couldn't use `pnpm` because Google Cloud Build failed trying to install pnpm at the time of writing)
- `cp .env.example .env` and fill in the required values (there are comments in the `.env.example` explaining how to get them)
- `npm start` to start a local cloud function
- `npm test` to call the local cloud function with a mocked payload, this will send a real Discord message into channel belonging to the webhook in `.env`:

```sh
curl -H "Content-Type: application/json" -d @src/proposal-created.fixture.json localhost:8080
```

## Testing the Deployed Cloud Function

You can test the deployed cloud function manually by using the `proposal-created.fixture.json` which contains a similar payload to what a QuickAlert would send to the cloud function:

```sh
./test-deployed-function.sh
# or `npm run test-in-prod` if you prefer npm to call this script
```

## Infra Setup (when project is deployed already)
## Local Infra Setup (when project is deployed already)

1. Set your local `gcloud` project to the watchdog project:

@@ -124,11 +97,11 @@ You can test the deployed cloud function manually by using the `proposal-created
terraform init
```

1. Create a `terraform.tfvars` file in the `./infra` folder, this is like `.env` for Terraform:
1. While inside the `infra` folder, create a `terraform.tfvars` file. This is like `.env` for Terraform:

```sh
touch ./infra/terraform.tfvars
# This file should be `.gitignore`d to avoid accidentally leaking sensitive data
touch terraform.tfvars
# This file is `.gitignore`d to avoid accidentally leaking sensitive data
```

1. Add the following values to your `terraform.tfvars`. You can look up all values in the Google Cloud console (or ask another dev to share their local `terraform.tfvars` with you)
@@ -147,24 +120,89 @@ You can test the deployed cloud function manually by using the `proposal-created
group_billing_admins = "<our-billing-admins-group>"
```

1. Add the Discord Webhook URL from Google Cloud Secret Manager into your local `terraform.tfvars`:
1. Add the Discord Webhook URL from Google Cloud Secret Manager to your local `terraform.tfvars`:

```sh
# You will need the "Secret Manager Secret Accessor" IAM role for this command to succeed
# You need the "Secret Manager Secret Accessor" IAM role for this command to succeed
echo "discord_webhook_url = \"$(gcloud secrets versions access latest --secret discord-webhook-url)\"" >> terraform.tfvars
```

1. Add the Telegram Bot Token and Chat ID to your local `terraform.tfvars`

```sh
# Get the chat ID from cloud function's terraform state
printf '\ntelegram_chat_id = "%s"\n' "$(terraform state show "google_cloudfunctions2_function.watchdog_notifications" | grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"')" >> terraform.tfvars

# Get the bot token from secret manager (you need the "Secret Manager Secret Accessor" IAM role for this command to succeed)
echo "telegram_bot_token = \"$(gcloud secrets versions access latest --secret telegram-bot-token)\"" >> terraform.tfvars
```

1. [Get our QuickNode API key from the QuickNode dashboard](https://dashboard.quicknode.com/api-keys) and add it to your local `terraform.tfvars`:

```sh
# ./infra/terraform.tfvars
discord_webhook_url = "<discord-webhook-url>"
quicknode_api_key = "<your-quicknode-api-key>"
```

This is necessary for Terraform to be able to create & destroy QuickAlerts as part of `terraform apply`

## First Time Infra Deployment via Terraform
1. Add the VictorOps Webhook URL to your local `terraform.tfvars`. You can get it by going to VictorOps, clicking `Integrations` > `Stackdriver`, and copying the URL. The routing key can be found under the `Settings` tab:

```sh
# ./infra/terraform.tfvars
victorops_webhook_url = "<victorops-webhook-url>/<victorops-routing-key>"
```

1. Auto-generate a local `.env` file by running `npm run generate:env`
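
The lookups in the steps above (chat ID from the Terraform state, values appended to `terraform.tfvars`) can be sketched as two small shell helpers. This is a minimal sketch and not part of the repository: `extract_chat_id` and `append_tfvar` are hypothetical names, and the parsing assumes `terraform state show` prints the variable as `"TELEGRAM_CHAT_ID" = "<value>"`. It uses `printf` because `echo` handles `\n` differently across shells.

```sh
# Hypothetical helper: reads `terraform state show` output on stdin and
# prints the TELEGRAM_CHAT_ID value with the surrounding quotes removed.
extract_chat_id() {
  grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"'
}

# Hypothetical helper: appends a `key = "value"` line to a tfvars file.
# The file defaults to terraform.tfvars in the current directory.
append_tfvar() {
  local key="$1" value="$2" file="${3:-terraform.tfvars}"
  printf '%s = "%s"\n' "${key}" "${value}" >> "${file}"
}
```

Under those assumptions, usage from inside the `infra` folder could look like `append_tfvar telegram_chat_id "$(terraform state show "google_cloudfunctions2_function.watchdog_notifications" | extract_chat_id)"`.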

## Running and testing the Cloud Function locally

- Make sure you generated a local `.env` file via `npm run generate:env` earlier
- `npm install` (couldn't use `pnpm` because Google Cloud Build failed trying to install pnpm at the time of writing)
- `npm start` to start a local cloud function
- `npm test` to call the local cloud function with a mocked payload. This sends a real Discord message into the channel belonging to the configured Discord Webhook:

```sh
curl -H "Content-Type: application/json" -d @src/proposal-created.fixture.json localhost:8080
```

## Testing the Deployed Cloud Function

You can test the deployed cloud function manually by using the `proposal-created.fixture.json` which contains a similar payload to what a QuickAlert would send to the cloud function:

```sh
./test-deployed-function.sh
# or `npm run test:prod` if you prefer npm to call this script
```
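
For illustration, a manual call against the deployed function could be sketched as follows. This is an assumption about how such a test might work, not the actual contents of `test-deployed-function.sh`: `call_deployed_function` is a hypothetical name, the URL lookup assumes a 2nd gen function that exposes `serviceConfig.uri`, and the request assumes the function accepts unauthenticated calls (otherwise an identity token header would be needed).

```sh
# Hypothetical sketch: look up the deployed function's URL and POST a
# JSON fixture to it, mirroring the local `curl` call against localhost.
call_deployed_function() {
  local function_name="$1" region="$2" fixture="$3"
  local url
  # Assumes a 2nd gen function, where the HTTPS URL is in serviceConfig.uri
  url=$(gcloud functions describe "${function_name}" --region="${region}" --format="value(serviceConfig.uri)")
  curl -H "Content-Type: application/json" -d "@${fixture}" "${url}"
}
```

Called as `call_deployed_function <function-name> <region> src/proposal-created.fixture.json`, this sends the same fixture payload that the local test uses.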

## Updating the Cloud Function

You have two options, using `terraform` or the `gcloud` cli. Both are perfectly fine to use.

1. Via `terraform` by running `npm run deploy:via:tf`
- How? The npm task will:
- Call `terraform apply` which re-deploys the function with the latest code from your local machine
- Pros
- Keeps the terraform state clean
- Same command for all changes, regardless of infra or cloud function code
- Cons
- Less familiar way of deploying cloud functions (if you're used to `gcloud functions deploy`)
- Less log output
- Slightly slower because `terraform apply` will always fetch the current state from the cloud storage bucket before deploying
2. Via `gcloud` by running `npm run deploy:via:gcloud`
- How? The npm task will:
- Look up the service account used by the cloud function
- Call `gcloud functions deploy` with the correct parameters
- Pros
- Familiar way of deploying cloud functions
- More log output making deployment failures slightly faster to debug
- Slightly faster because we're skipping the terraform state lookup
- Cons
- Will lead to inconsistent terraform state (because terraform is tracking the function source code and its version)
- Different commands to remember when updating infra components vs cloud function source code
- Will only work for updating a pre-existing cloud function's code, will fail for a first-time deploy
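
Because the `gcloud` route only works for a function that already exists, a small guard can pick the right npm task. This is a minimal sketch under that assumption and not something the repository provides: `function_exists` and `choose_deploy_task` are hypothetical helper names.

```sh
# Hypothetical helper: succeeds if the cloud function can be described,
# i.e. it has been deployed before.
function_exists() {
  local function_name="$1" region="$2"
  gcloud functions describe "${function_name}" --region="${region}" --format="value(name)" > /dev/null 2>&1
}

# Hypothetical helper: prints the npm task suited to the current state.
# First-time deploys go through terraform, updates can use gcloud.
choose_deploy_task() {
  if function_exists "$1" "$2"; then
    echo "deploy:via:gcloud"
  else
    echo "deploy:via:tf"
  fi
}
```

For example, `npm run "$(choose_deploy_task <function-name> <region>)"` would fall back to the terraform task on a fresh project.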

## Infra Deployment via Terraform

### Google Cloud Permission Requirements

@@ -176,28 +214,47 @@ In order to create this project from scratch using the [terraform-google-bootstr

### Deployment from scratch

1. Comment out the `backend` section in `main.tf` (because this bucket doesn't exist yet, it will be created by the first `terraform apply` run)

```hcl
# backend "gcs" {
# bucket = "governance-watchdog-terraform-state-<random-suffix>"
# }
```

1. Run `terraform init` to install the required providers and init a temporary local backend in a `terraform.tfstate` file

<!-- markdown-link-check-disable -->

1. [Create a Discord Webhook URL](https://support.discord.com/hc/en-us/articles/228383668-Intro-to-Webhooks) for the channel you want to receive notifications in <!-- markdown-link-check-enable -->

2. Add the Discord Webhook URL to your local `terraform.tfvars`:
1. Add the Discord Webhook URL to your local `terraform.tfvars`:

```sh
# This will be stored in Google Secret Manager upon deployment via Terraform
echo "discord_webhook_url = \"<discord-webhook-url>\"" >> terraform.tfvars
```

3. Comment out the `backend` section in `main.tf` (because this bucket doesn't exist yet, it will be created by the first `terraform apply` run)
1. Create a Telegram group and invite a new bot into it

```hcl
# backend "gcs" {
# bucket = "governance-watchdog-terraform-state-<random-suffix>"
# }
```
- Open a new telegram chat with @BotFather
- Use the `/newbot` command to create a new bot
- Copy the API key printed out at the end of the prompt and store it in your `terraform.tfvars`

```hcl
telegram_bot_token = "<bot-api-key>"
```

- Get the Chat ID by inviting @MissRose_bot to the group and then using the `/id` command
- Add the Chat ID to your `terraform.tfvars`

```hcl
telegram_chat_id = "<group-chat-id>"
```

4. Run `terraform init` to install the required providers and init a temporary local backend in a `terraform.tfstate` file
- Remove @MissRose_bot after you got the Chat ID

5. **Deploy the entire project via `terraform apply`**
1. **Deploy the entire project via `terraform apply`**

- You will see an overview of all resources to be created. Review them if you like and then type `yes` to confirm.
- This command can take up to 10 minutes because it does a lot of work creating and configuring all defined Google Cloud Resources
@@ -206,21 +263,17 @@ In order to create this project from scratch using the [terraform-google-bootstr

**Often a simple retry of `terraform apply` helps**. Sometimes a dependency of a resource has simply not finished creating when terraform already tried to deploy the next one, so waiting a few minutes for things to settle can help.

6. Set your local `gcloud` project to our freshly created one:
1. Set your local `gcloud` project ID to our freshly created one:

```sh
# If that `awk` magic fails, just look up the project ID manually via `gcloud projects list`
project_id=$(terraform state show "module.bootstrap.module.seed_project.module.project-factory.google_project.main" | grep 'project_id' | awk -F '"' '{print $2}')

gcloud config set project $project_id
gcloud auth application-default set-quota-project $project_id
./set-project-id.sh
```

7. Check that everything worked as expected
1. Check that everything worked as expected

```sh
# 1. Call the deployed function via:
npm run test-in-prod # or call the script directly via ./test-deployed-function.sh
npm run test:prod # or call the script directly via ./test-deployed-function.sh

# 2. Monitor the configured Discord channel for a message to appear
open https://discord.com/channels/966739027782955068/1262714272476037212
@@ -271,34 +324,14 @@ For all team members to be able to manage the Google Cloud infrastructure, you n
rm terraform.tfstate.backup
```

## Updating the Cloud Function
## Debugging Problems

You have two options, using `terraform` or the `gcloud` cli. Both are perfectly fine to use.
### View Logs

1. Via `terraform` by running `npm run deploy:via:tf`
- How? The npm task will:
- Compile TS to JS
- Zip the `./dist` folder into `function-source.zip`
- And then call `terraform apply` which re-deploys the function with the new source code from the zip file
- Pros
- Keeps the terraform state clean
- Same command for all changes, regardless of infra or cloud function code
- Cons
- Less familiar way of deploying cloud functions (if you're used to `gcloud functions deploy`)
- Less log output
- Slightly slower because `terraform apply` will always fetch the current state from the cloud storage bucket before deploying
2. Via `gcloud` by running `npm run deploy:via:gcloud`
- How? The npm task will:
- Generate a temporary `.env.yaml` (because for some reason gcloud does not support normal `.env` files)
- Look up the service account used by the cloud function
- Call `gcloud functions deploy` with the correct parameters
- Pros
- Familiar way of deploying cloud functions
- More log output making deployment failures slightly faster to debug
- Slightly faster because we're skipping the terraform state lookup
- Cons
- Will lead to inconsistent terraform state (because terraform is tracking the function source code and its version)
- Different commands to remember when updating infra components vs cloud function source code
For most problems, you'll likely want to check the cloud function logs first.

- `npm run logs` will print the latest 50 log entries into your local terminal for quick and easy access
- `npm run logs:url` will print the URL to the function logs in the Google Cloud Console for full access
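
If you want the same log output without npm, a direct `gcloud` call could look like the sketch below. This is an assumption about an equivalent command, not the actual contents of `get-logs.sh`: `print_logs` is a hypothetical name, and depending on your `gcloud` version a 2nd gen function may additionally need the `--gen2` flag.

```sh
# Hypothetical sketch: print the latest log entries of a cloud function.
# The limit defaults to 50 to match what `npm run logs` is described to print.
print_logs() {
  local function_name="$1" region="$2" limit="${3:-50}"
  gcloud functions logs read "${function_name}" --region="${region}" --limit="${limit}"
}
```

For example, `print_logs <function-name> europe-west1 100` would fetch the latest 100 entries instead.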

## Teardown

23 changes: 9 additions & 14 deletions deploy-via-gcloud.sh
@@ -1,19 +1,14 @@
#! /bin/bash
set -e # fail on any error
set -o pipefail # ensure non-zero exit codes are propagated in piped commands
#!/bin/bash
set -e # Fail on any error
set -o pipefail # Ensure piped commands propagate exit codes properly
set -u # Treat unset variables as an error when substituting

entry_point="watchdogNotifier"
function_name="watchdog-notifications"
region="europe-west1"
# Load the project variables
source ./set-project-vars.sh

printf "Looking up function name..."
function_name=$(gcloud functions list --format="value(name)" | grep '^watchdog-notifications')
printf ' \033[1m%s\033[0m\n' "${function_name}"

printf "Looking up project ID..."
project_name="governance-watchdog"
project_id=$(gcloud projects list --filter="name:${project_name}*" --format="value(projectId)")
printf ' \033[1m%s\033[0m\n' "${project_id}"
printf "Looking up entry point..."
entry_point=$(gcloud functions describe "${function_name}" --region="${region}" --format json | jq -r .buildConfig.entryPoint) # -r strips the JSON quotes from the value
printf ' \033[1m%s\033[0m\n' "${entry_point}"

printf "Looking up service account for function..."
service_account_email=$(gcloud functions describe "${function_name}" --region="${region}" --format="value(serviceConfig.serviceAccountEmail)")