Aegis

The modern concept of doing something "under someone's aegis" means doing something under the protection of a powerful, knowledgeable, or benevolent source. The word Aegis is identified with protection by a strong force rooted in Greek mythology and adopted by the Romans.

Aegis is a monitoring tool that exposes the result of on-chain view calls as Prometheus metrics that get ingested into Grafana. The system's ethos is that it should be generic and agnostic when it comes to business logic.

It allows you to turn a config.yaml like this:

# config.yaml
source: SortedOracles.numRates(address rateFeed)(uint256)
schedule: 0/10 * * * * *
type: gauge
chains: all
variants:
  - ['CELOUSD']
  - ['CELOEUR']
  - ['CELOBRL']
  - ['USDCUSD']
  - ['USDCEUR']
  - ['USDCBRL']

Into a Grafana Dashboard like this:

Running the app
Tests
Deployment
Checking the Logs
Configuration
Terraform for Grafana

Running the app

# install dependencies
pnpm install

# run a dev server
pnpm start

# run a dev server with hot reload
pnpm run start:dev

# run in prod mode
pnpm run start:prod

Tests

# unit tests
pnpm run test

# test coverage
pnpm run test:cov

Deployment

Deploying Aegis

There are three main components you have to think about:

1. The aegis service that polls view calls and exposes Prometheus metrics based on a config.yaml file.
2. A service that ingests the metrics. This could be:
   a. A grafana-agent instance which pushes the metrics to grafana-cloud.
   b. A Prometheus server that ingests the metrics.
3. (Optional) Helper smart contracts, which do any transformations needed to on-chain data for ingestion by aegis.

Deploying aegis is done by running

pnpm run deploy

To deploy the grafana-agent, follow the instructions in grafana-agent/README.md.

Deploying Grafana Resources

The Grafana Dashboard and Grafana Alerts are managed via Terraform and can be deployed via:

pnpm run tf:deploy

How to deploy a new rate feed

1. Update the config.yaml:
   - Add the new rate feed IDs and relayer signer wallets to global.vars
   - Add the new rate feeds as variants to the SortedOracles.isOldestReportExpired() metric
   - Add the new rate feeds as variants to the BreakerBox.getRateFeedTradingMode() metric
   - Add the new relayer signer as variants to the CELOToken.balanceOf() metric
2. Test the new config locally by running pnpm start and checking for any errors in the logs
3. After code review, deploy the new config via pnpm run deploy
4. After successful deployment, check the logs for any errors via pnpm run logs
5. Check that the new metrics appear in the Grafana Dashboard: pnpm run grafana
   - New rate feeds should be picked up automatically, but it might take a few minutes until they show up
6. Check that new Oracle Relayer Grafana Alerts have been added for the new Relayer Signer Wallets' CELO Balance

Checking the Logs

# Tails the logs of the prod aegis app
pnpm run logs

Configuration

The config.yaml has three immediate children:

interface Config {
  global: Global; // Global definitions
  chains: Chain[]; // Chain definitions
  metrics: Metric[]; // Metric definitions
}

Global Config

The global section hosts variables that can be referenced in the metrics section as arguments passed to view calls. Global variables can also be extended or overridden by chain-specific variables.

interface Global {
  vars: Record<string, string>;
}

Chain-specific Config

The chain section is an array of chain-specific settings:

interface Chain {
  id: string;
  label: string;
  httpRpcUrl: string;
  contracts: Record<string, string>;
  vars: Record<string, string>;
}

- id: If the chain id matches an import from viem's supported chains (for example, celoAlfajores or celo), Aegis uses that import, which enables multicall and other goodness. Arbitrary values also work; in that case, Aegis creates a custom chain instance.
- label is used to segment Prometheus metrics by chain, via the chain={label} label.
- contracts should list addresses for all contracts referenced in metrics.
- vars can contain chain-specific variables to extend or override entries in the global.vars object.
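
The way chain-specific vars extend or override global.vars can be pictured as a key-wise object merge. The following is a minimal sketch under that assumption, not necessarily how Aegis resolves variables internally: `resolveVars` is a hypothetical helper, and the addresses are placeholders.

```typescript
// Hypothetical helper: merge global vars with chain-specific vars.
// Assumption: a chain entry wins over a global entry with the same key.
type Vars = Record<string, string>;

function resolveVars(globalVars: Vars, chainVars: Vars = {}): Vars {
  // Later spreads win, so chain-specific values override global ones.
  return { ...globalVars, ...chainVars };
}

// Placeholder addresses, for illustration only.
const exampleGlobalVars: Vars = {
  CELOUSD: '0x' + '11'.repeat(20),
  CELOEUR: '0x' + '22'.repeat(20),
};
const exampleChainVars: Vars = {
  CELOEUR: '0x' + 'ee'.repeat(20), // overrides the global entry
  USDCUSD: '0x' + 'ff'.repeat(20), // extends the global entries
};

const resolvedVars = resolveVars(exampleGlobalVars, exampleChainVars);
console.log(resolvedVars);
```

With this merge, a variant such as ['CELOEUR'] would resolve to the chain-specific address on that chain and to the global address everywhere else.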

Metrics Config

Metric defines a metric with all its possible variations.

interface Metric {
  source: string;
  schedule: string;
  type: 'gauge';
  chains: 'all' | string[];
  variants: string[][];
}

The source of a metric is the view call, written as a string in the following format:
```
Contract.function(inputType inputName)(outputType)
```
For example:
```
SortedOracles.numRates(address rateFeed)(uint256)
```
The contract name must be defined in chains[id].contracts for every chain that the metric targets.
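
To make the source format concrete, here is a small sketch of how such a string could be split into its parts. `parseSource` and `ParsedSource` are hypothetical names for illustration; the actual parsing in Aegis may differ, and this version deliberately does not handle nested tuple types.

```typescript
// Hypothetical parser for the documented source format:
//   Contract.function(inputType inputName, ...)(outputType, ...)
interface ParsedSource {
  contract: string;
  functionName: string;
  inputs: { type: string; name: string }[];
  outputs: string[];
}

function parseSource(source: string): ParsedSource {
  const match = /^(\w+)\.(\w+)\(([^)]*)\)\(([^)]*)\)$/.exec(source.trim());
  if (!match) {
    throw new Error(`Invalid source: ${source}`);
  }
  const [, contract, functionName, rawInputs, rawOutputs] = match;

  // Split a comma-separated list and drop empty entries (e.g. for "()").
  const splitList = (raw: string): string[] =>
    raw
      .split(',')
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0);

  const inputs = splitList(rawInputs).map((entry) => {
    const [type, name] = entry.split(/\s+/);
    if (!type || !name) {
      throw new Error(`Input must be "type name": ${entry}`);
    }
    return { type, name };
  });

  return { contract, functionName, inputs, outputs: splitList(rawOutputs) };
}

const parsedSource = parseSource(
  'OracleHelper.deviation(address rateFeed)(uint256,uint256)',
);
console.log(parsedSource);
```

The contract name obtained this way is what would be looked up in chains[id].contracts, and the input names are what end up as Prometheus labels.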

The system currently supports various view call types, such as:
- SortedOracles.numRates(address rateFeed)(uint256): A single uint256 value which must not exceed Number.MAX_SAFE_INTEGER.
- CELOToken.balanceOf(address owner)(uint256): A single uint256 value, divided by 1e18, which must not exceed Number.MAX_SAFE_INTEGER.
- OracleHelper.deviation(address rateFeed)(uint256,uint256): Two uint256 values, value and scale. The metric exposed is then value/scale, which must also fit into a Number. The conversion keeps 1e6 of precision.
See the Adding a new Metric section to learn how to add new view calls to Aegis.
- schedule is a cron schedule definition.
- type can currently be only gauge.
- chains can be all or an array of chain IDs to which this metric will apply.
- variants is a list where each item is an array of arguments passed to the view call.
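
The numeric constraints on the supported view call types can be sketched as follows. This assumes results arrive as bigint (which is how viem returns uint256 values); `toSafeNumber` and `ratioToNumber` are hypothetical helpers, and reusing the value/scale helper for the 1e18 division of token balances is an assumption made for this sketch, not a description of the Aegis code.

```typescript
const MAX_SAFE = BigInt(Number.MAX_SAFE_INTEGER);

// Convert a uint256 (as bigint) to a Number, refusing values that would lose precision.
function toSafeNumber(value: bigint): number {
  if (value > MAX_SAFE) {
    throw new RangeError(`Value ${value} exceeds Number.MAX_SAFE_INTEGER`);
  }
  return Number(value);
}

// Compute value/scale while keeping six decimal places (1e6 of precision).
function ratioToNumber(
  value: bigint,
  scale: bigint,
  precision: bigint = 1_000_000n,
): number {
  if (scale === 0n) {
    throw new RangeError('scale must be non-zero');
  }
  // Multiply first so the integer division does not discard the decimals.
  const scaled = (value * precision) / scale;
  return toSafeNumber(scaled) / Number(precision);
}

const WEI_PER_TOKEN = 1_000_000_000_000_000_000n; // 1e18

// A plain counter, such as the result of numRates.
const exampleNumRates = toSafeNumber(10n);
// A token balance of 1.5 tokens expressed in wei.
const exampleBalance = ratioToNumber(1_500_000_000_000_000_000n, WEI_PER_TOKEN);
// A (value, scale) pair, such as the result of deviation.
const exampleDeviation = ratioToNumber(25n, 10_000n);

console.log(exampleNumRates, exampleBalance, exampleDeviation);
```

Doing the multiplication in bigint space before converting keeps large on-chain values exact up to the point where they are small enough to fit into a Number.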

Full Metrics Example

source: SortedOracles.numRates(address rateFeed)(uint256)
schedule: 0/10 * * * * *
type: gauge
chains: all
variants:
  - ['CELOUSD']
  - ['CELOEUR']
  - ['CELOBRL']
  - ['USDCUSD']
  - ['USDCEUR']
  - ['USDCBRL']

The variants are where the vars from the global and chain configs come into play. Here, we're calling the function six times and, each time, passing the value of the variant variables as the rateFeed argument to the view call.

All arguments of the view call will also be passed as metric labels to Prometheus. Thus, each metric records one value per variant per chain, i.e. (number of variants) × (number of chains) values in total.

An example of the Prometheus endpoint result:

SortedOracles_numRates{rateFeed="CELOBRL",chain="celo"} 10
SortedOracles_numRates{rateFeed="CELOEUR",chain="celo"} 10
SortedOracles_numRates{rateFeed="CELOUSD",chain="celo"} 10
SortedOracles_numRates{rateFeed="USDCBRL",chain="celo"} 0
SortedOracles_numRates{rateFeed="USDCEUR",chain="celo"} 0
SortedOracles_numRates{rateFeed="USDCUSD",chain="celo"} 10
SortedOracles_numRates{rateFeed="CELOBRL",chain="alfajores"} 5
SortedOracles_numRates{rateFeed="CELOEUR",chain="alfajores"} 5
SortedOracles_numRates{rateFeed="CELOUSD",chain="alfajores"} 5
SortedOracles_numRates{rateFeed="USDCBRL",chain="alfajores"} 5
SortedOracles_numRates{rateFeed="USDCEUR",chain="alfajores"} 5
SortedOracles_numRates{rateFeed="USDCUSD",chain="alfajores"} 6
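
The series names above follow from combining every variant with every targeted chain. A minimal sketch of that expansion is shown below; `MetricSketch`, `ChainSketch`, `selectChains`, and `seriesFor` are hypothetical names, and the chain ids and labels are taken from the examples in this document rather than from the Aegis source.

```typescript
interface ChainSketch {
  id: string;
  label: string;
}

interface MetricSketch {
  name: string; // e.g. "SortedOracles_numRates"
  argNames: string[]; // input names of the view call, used as label names
  chains: 'all' | string[];
  variants: string[][];
}

// Pick the chains a metric applies to: either all of them or those listed by id.
function selectChains(metric: MetricSketch, chains: ChainSketch[]): ChainSketch[] {
  const wanted = metric.chains;
  if (wanted === 'all') {
    return chains;
  }
  return chains.filter((chain) => wanted.includes(chain.id));
}

// Build one series name per (chain, variant) pair.
function seriesFor(metric: MetricSketch, chains: ChainSketch[]): string[] {
  const series: string[] = [];
  for (const chain of selectChains(metric, chains)) {
    for (const variant of metric.variants) {
      const labels = metric.argNames.map((name, i) => `${name}="${variant[i]}"`);
      labels.push(`chain="${chain.label}"`);
      series.push(`${metric.name}{${labels.join(',')}}`);
    }
  }
  return series;
}

const exampleChains: ChainSketch[] = [
  { id: 'celo', label: 'celo' },
  { id: 'celoAlfajores', label: 'alfajores' },
];
const exampleMetric: MetricSketch = {
  name: 'SortedOracles_numRates',
  argNames: ['rateFeed'],
  chains: 'all',
  variants: [['CELOUSD'], ['CELOEUR'], ['CELOBRL'], ['USDCUSD'], ['USDCEUR'], ['USDCBRL']],
};

const exampleSeries = seriesFor(exampleMetric, exampleChains);
console.log(exampleSeries.length); // 6 variants on 2 chains
console.log(exampleSeries[0]);
```

Setting chains to a list of ids instead of all would restrict the expansion to those chains only.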

Adding a new Metric

1. Add the contract you want to run a view call on to the chains[id].contracts section in config.yaml, and make sure to add the correct address for each chain
2. Add your new view call to the bottom of the metrics section
   1. If your view call needs any input parameters, make sure to define these either in global.vars or chains[id].vars, and reference them as variants in your metric
3. Extend the switch statement in the Metric.parse() function with the appropriate logic for your view call's contract & function name.
   1. If you already see another case for an existing view call using the same logic (i.e. another call returning a simple uint256), you can add the function name of your view call to that case
   2. If your view call requires new or adjusted logic, add a new case for your function name with the appropriate logic
4. Try out your changes locally by running pnpm run start:dev and check that the logs output the values you expect
5. If everything works locally, deploy your changes via pnpm run deploy
6. After successful deployment, check that everything works as expected by monitoring the logs via pnpm run logs
7. Create a new Grafana visualization consuming your newly added metric
   1. If you're not a Grafana expert, the easiest would be to create a new empty dashboard and manually compose your query via the UI. You can also take inspiration from viewing the configuration of existing queries on other dashboards.
8. Export your new Grafana visualization to Terraform format
   1. After you're happy with your manually created query, it's time to convert it to Terraform so we can manage all our Grafana config as code and under version control. Your manually created query should offer an Export option somewhere on the top right.
   2. From there, it should have the option to export as JSON, YAML, or Terraform (HCL); pick Terraform (HCL)
9. Add your export to the appropriate section of ./terraform/grafana-dashboard/dashboard.tf
   1. Finding the right place can be a bit annoying as the exported config is quite verbose. AI is your friend here. You can copy/paste the existing dashboard.tf into your LLM of choice and then ask it to insert your newly exported visualization into the right place.
10. Deploy your new Grafana visualization into the main Aegis dashboard via cd terraform && terraform apply
11. Ensure that it worked by reviewing the main Aegis dashboard in Grafana
12. If anything went wrong, roll back your changes to dashboard.tf and keep editing until you get it right :)

Terraform for Grafana

We use Terraform to deploy Grafana Dashboards and Grafana Alerts. The end-to-end Aegis flow is as follows:

1. The Aegis service executes view calls and forwards the results to Prometheus
2. Grafana ingests Prometheus metrics and allows us to visualize and react to them
3. Terraform deploys our metric visualizations into Grafana dashboards
4. Terraform also deploys our alert rules based on these metrics

Set up Terraform

Install Terraform

# On macOS
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# For other systems, see https://developer.hashicorp.com/terraform/install

Initialize Terraform
```
cd terraform
terraform init
```

Set up your local terraform.tfvars (think of it as .env for Terraform)

# Create the file
touch terraform/terraform.tfvars

####################
# terraform.tfvars #
####################

# Get this from LastPass
grafana_service_account_token =

# Get this from the Discord channel integration settings of #🚨︱stg-oracle-relayers
discord_alerts_webhook_url_staging =

# Get this from the Discord channel integration settings of #🚨︱prod-oracle-relayers
discord_alerts_webhook_url_prod =

# Get this from the Discord channel integration settings of #🏦︱reserve-alerts
discord_alerts_webhook_url_reserve =

# Get this from the Discord channel integration settings of #alerts-catch-all
discord_alerts_webhook_url_catch_all =

# Get this from [our VictorOps dashboard](https://portal.victorops.com/dash/mento-labs-gmbh#/advanced/grafana) (find the routing key under ["settings"](https://portal.victorops.com/dash/mento-labs-gmbh#/routekeys))
splunk_on_call_alerts_webhook_url =

Check that it's set up correctly

# You must be inside the ./terraform folder for this command
terraform plan

Grafana Dashboard

# Opens the Aegis Grafana Dashboard in your default browser
pnpm run grafana

We are using Terraform to deploy a Grafana Dashboard containing visualizations for all configured metrics.

To update the dashboard, you simply make the desired changes in ./terraform/grafana-dashboard and then run cd terraform && terraform apply to deploy them.

Grafana Alerts

We are using Terraform to deploy Discord and On-Call Alerts based on the Aegis metrics.

To update the alerts, you simply make the desired changes in ./terraform/grafana-alerts and then run cd terraform && terraform apply to deploy them.

Grafana uses the following concepts for managing alerts:

- Alert Rules: A set of evaluation criteria for when an alert should trigger
- Contact Points: Alert channels like Discord, Splunk/VictorOps, Email, etc.
- Notification Policies: Routing rules to determine which alerts get routed to which contact points

Terraform Troubleshooting

Terraform fails to delete Grafana Resources

At times, we've seen Terraform throw 409s when trying to delete old Grafana Resources (to replace them with new ones).

You have two choices when this happens:

1. Nuke everything via terraform destroy and re-deploy everything from a clean slate via terraform apply
2. Or try to manually delete the resources Terraform is struggling with via the Grafana API. There is a little helper script with some example API calls that you can use locally: grafana-api-interactions.sh
