Update Readme and Refactor Pipeline Code #157

Merged
merged 6 commits into from
Mar 18, 2024
11 changes: 5 additions & 6 deletions Makefile
@@ -19,25 +19,24 @@ deep: scan
baseline:
@detect-secrets scan --exclude-files '^(yarn.lock|.yarn/|.local/|openapi/)' > .secrets.baseline

# The default outer `test` target only run the top level cdk application unit tests under `./test`
test:
@yarn test

# Test only section
test-stateful:
@yarn run test ./test/stateful
test-stateless:
@yarn run test ./test/stateless

# Run all test suites - i.e. cdk app unit tests + each microservice app test suites
# Run all test suites for each app/microservice
# Each app root should have a Makefile `test` target that runs your app test pipeline, including compose stack up/down
# Note: running the `make test-suite` target from the repo root assumes your local dev env has all app toolchains, i.e.
# Python (conda or venv), Rust and Cargo, TypeScript and Node environment, Docker and container runtimes
suite: test-stateless
test-suite:
@(cd lib/workload/stateless/sequence_run_manager && $(MAKE) test)
@(cd lib/workload/stateless/metadata_manager && $(MAKE) test)
@(cd lib/workload/stateless/filemanager && $(MAKE) test)

# The default outer `test` target runs every test target: the top-level CDK unit tests under `./test` and each app test suite
test: test-stateless test-stateful test-suite

clean:
@yarn clean
@#for zf in $(shell find ./lib/workload/stateless/layers -maxdepth 1 -mindepth 1 -type f -iname '*.zip'); do rm -v $$zf; done
101 changes: 74 additions & 27 deletions README.md
@@ -1,6 +1,6 @@
# OrcaBus

UMCCR Orchestration Bus that leverage AWS EventBridge as Event Bus to automate the BioInformatics Workflows Pipeline.
UMCCR OrcaBus (Orchestration Bus) leverages AWS EventBridge as an Event Bus to automate the BioInformatics Workflows Pipeline.

## CDK

@@ -10,54 +10,101 @@ Please note; this is the _INVERSE_ of some typical standalone project setup such

In this repo, we flip this view such that the Git repo root is the TypeScript CDK project; that wraps our applications into `./lib/` directory. You may [sparse checkout](https://git-scm.com/docs/git-sparse-checkout) or directly open subdirectory to set up the application project alone if you wish; e.g. `webstorm lib/workload/stateless/metadata_manager` or `code lib/workload/stateless/metadata_manager` or `pycharm lib/workload/stateless/sequence_run_manager` or `rustrover lib/workload/stateless/filemanager`. However, `code .` is a CDK TypeScript project.

This root level CDK app contains 4 major stacks: `stateful-pipeline`,`stateless-pipeline` , `stateful` and `stateless`. Pipeline stack is the CI/CD automation with CodePipeline setup. The `stateful` stack holds and manages some long-running AWS infrastructure resources. The `stateless` stack manages self-mutating CodePipeline reusable CDK Constructs for the [MicroService Applications](docs/developer/MICROSERVICE.md). In terms of CDK deployment point-of-view, the microservice application will be "stateless" application such that it will be changing/mutating over time; whereas "the data" its holds like PostgreSQL server infrastructure won't be changing that frequent. When updating "stateful" resources, there involves additional cares, steps and ops-procedures such as backing up database, downtime planning and so on; hence stateful. We use [configuration constants](./config) to decouple the reference between `stateful` and `stateless` AWS resources.
There are 2 CDK apps here:

In most cases, we deploy with automation across operational target environments or AWS accounts: `beta`, `gamma`, `prod`. For some particular purpose (such as onboarding procedure, isolated experimentation), we can spin up the whole infrastructure into some unique isolated AWS account. These key CDK entrypoints are documented in the following sections: Automation and Manual.
- Stateful

This holds and manages long-running AWS stateful resources. The resources will typically be something that won't be
changing frequently and could not be torn down easily. For example, the RDS Cluster which contains application data. When
updating "stateful" resources, additional care is needed such as backing up the database, downtime planning and so on;
hence stateful.

- Stateless

In contrast to stateful resources, stateless resources can be redeployed quickly without worrying
about retained data. For example, AWS Lambda functions and API Gateway hold no data that must be retained when destroyed, and they spin up
easily. The [MicroService Applications](docs/developer/MICROSERVICE.md) resources will usually live here, and they
look up stateful resources when needed.

You can access the CDK command for each app via `yarn cdk-stateless` or `yarn cdk-stateful`. Each `cdk-*` script is
just an alias for `cdk` that points to a specific app, so the usual `cdk` subcommands and options apply (e.g. `yarn cdk-stateless --help`).

We use [configuration constants](./config) to reference constants between the `stateful` and `stateless` CDK apps.

In most cases, we deploy with automation across operational target environments or AWS accounts: `beta` (dev), `gamma` (staging),
`prod`. For some particular purpose (such as onboarding procedure, or isolated experimentation), we can spin up the
whole infrastructure into some unique isolated AWS account.

### Automation

_CI/CD through CodePipeline automation from AWS toolchain account_
_CI/CD through CodePipeline automation from the AWS toolchain account_

There are 2 pipeline stacks in this project, one for the stateful and one for the stateless stack deployment. There is a
script to access the `cdk` command for each pipeline:
-`cdk-stateless-pipeline` - for stateless pipeline
-`cdk-stateful-pipeline` - for stateful pipeline
There are 2 pipeline stacks in this project, one for the `stateful` and one for the `stateless` stack deployment. Both
pipelines are triggered from the `main` branch and configured as a self-mutating pipeline. The pipeline will automatically deploy
CDK changes from `beta` -> `gamma` -> `prod` account, where each transition has an approval stage before deploying to the next account.
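
The promotion order described above can be sketched as a small, dependency-free TypeScript model. This is only an illustration of the `beta` -> `gamma` -> `prod` ordering with an approval gate before each transition; it is not the CDK Pipelines API, and the names `planPromotion`, `describePlan` and `PipelineStep` are hypothetical.

```typescript
// Hypothetical model of the promotion order; it does not call the CDK.
type Environment = 'beta' | 'gamma' | 'prod';

interface PipelineStep {
  kind: 'deploy' | 'approval';
  target: Environment;
}

// Order in which a change is promoted, as described in the README.
const PROMOTION_ORDER: readonly Environment[] = ['beta', 'gamma', 'prod'];

// Build the ordered list of steps: the first environment deploys directly,
// every later environment is preceded by an approval step.
function planPromotion(order: readonly Environment[] = PROMOTION_ORDER): PipelineStep[] {
  const steps: PipelineStep[] = [];
  order.forEach((target, index) => {
    if (index > 0) {
      steps.push({ kind: 'approval', target });
    }
    steps.push({ kind: 'deploy', target });
  });
  return steps;
}

// Render the plan as a single readable line.
function describePlan(steps: PipelineStep[]): string {
  return steps.map((step) => `${step.kind}:${step.target}`).join(' -> ');
}

console.log(describePlan(planPromotion()));
```

In the actual pipeline stacks, the approval gate corresponds to the manual approval step attached to a stage; the model above only makes the ordering explicit.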

```
To run CDK against a pipeline stack, pass the pipeline stack name, either
`OrcaBusStatelessPipeline` or `OrcaBusStatefulPipeline`, to the corresponding app command (e.g. `yarn cdk-stateless
synth OrcaBusStatelessPipeline`).

In general, you do **NOT** need to touch the pipeline stack at all, as changes to the deployment stack will be taken care of
by the self-mutating pipeline. You might need to touch it if one of the build steps (unit
testing or `cdk synth`) gains a new dependency. For example, a Rust installation is required to build the Lambda asset.

```sh
# prerequisite before running cdk command to the OrcaBus Pipeline
make install
make check
make test
make test # This will test all tests available in this repo

yarn cdk-stateless-pipeline synth
yarn cdk-stateless-pipeline diff
yarn cdk-stateless-pipeline deploy
# accessing the stateless pipeline with cdk
yarn cdk-stateless synth OrcaBusStatelessPipeline
yarn cdk-stateless diff OrcaBusStatelessPipeline
yarn cdk-stateless deploy OrcaBusStatelessPipeline

# or for stateful pipeline
yarn cdk-stateful synth OrcaBusStatefulPipeline
yarn cdk-stateful diff OrcaBusStatefulPipeline
yarn cdk-stateful deploy OrcaBusStatefulPipeline
```

The pipeline is deployed on the toolchain/build account (bastion in the UMCCR AWS account).

### Manual

_manual deploy to an isolated specific AWS account_
_manual deployment from local computer to AWS account_

```
export AWS_PROFILE=dev
You may want to see your resources deployed quickly without relying on the pipeline to do it for you. You could do so by
deploying to the `beta` account by specifying the stack name with the relevant AWS Credentials.

make install
make check
make test
You can use the `yarn cdk-*` commands described above to deploy the microservice. Remember to use the credentials of the
account where the resource will be deployed and **NOT** the pipeline (toolchain) credentials.

You could list the CDK stacks with the `cdk ls` command to look at the stackId given to your microservice app.

yarn orcabus --help
```sh
yarn cdk-stateless ls

yarn cdk-orcabus list
yarn cdk-orcabus synth OrcaBusStatefulStack
yarn cdk-orcabus diff OrcaBusStatefulStack
yarn cdk-orcabus deploy OrcaBusStatefulStack
yarn cdk-orcabus deploy --all
yarn cdk-orcabus destroy --all
OrcaBusStatelessPipeline
OrcaBusStatelessPipeline/BetaDeployment/MetadataManager
...
```

For example, you can deploy the metadata manager stateless resources directly from your computer as follows.

```sh
yarn cdk-stateless deploy OrcaBusStatelessPipeline/BetaDeployment/MetadataManager
```
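
The stack identifier in the listing above follows a `pipeline/stage/stack` path layout. A minimal sketch of composing that path and the matching deploy command is shown below; the helpers `stackPath` and `deployCommand` are hypothetical and only build strings, they do not invoke the CDK.

```typescript
// Hypothetical helpers; the path layout is taken from the `cdk ls` output above.
type CdkAlias = 'cdk-stateless' | 'cdk-stateful';

// Join the pipeline stack, deployment stage and microservice stack into one path.
function stackPath(pipelineId: string, stageId: string, stackId: string): string {
  return [pipelineId, stageId, stackId].join('/');
}

// Assemble the argument list for a manual deploy of a single stack.
function deployCommand(alias: CdkAlias, path: string): string[] {
  return ['yarn', alias, 'deploy', path];
}

const metadataManager = stackPath('OrcaBusStatelessPipeline', 'BetaDeployment', 'MetadataManager');
console.log(deployCommand('cdk-stateless', metadataManager).join(' '));
// prints "yarn cdk-stateless deploy OrcaBusStatelessPipeline/BetaDeployment/MetadataManager"
```

Targeting one stack path keeps a manual deployment scoped to a single microservice rather than the whole stage.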

NOTE: If you deploy manually and the pipeline then starts running (e.g. after a new commit on the source branch), your stack will be overwritten with what is in the `main` branch.

## Development

_Heads up: Polyglot programming environment. We shorten some trivial steps into `Makefile` target. You may deduce step-by-step from `Makefile`, if any._

To develop your microservice application please read the [microservice guide](docs/developer/MICROSERVICE.md).

Do note that we have some shared resources that are expected to be used across microservices; see the [shared resource docs](docs/developer/SHARED_RESOURCES.md).

### Typography

When possible, please use either `OrcaBus` (upper camel case) or `orcabus` (all lower case).
38 changes: 0 additions & 38 deletions bin/orcabus.ts

This file was deleted.

1 change: 0 additions & 1 deletion cdk.json
@@ -1,5 +1,4 @@
{
"app": "yarn run -B ts-node --prefer-ts-exts bin/orcabus.ts",
"watch": {
"include": [
"**"
57 changes: 35 additions & 22 deletions docs/developer/MICROSERVICE.md
@@ -2,23 +2,24 @@

There are two high level tasks.

1. **uApp** : create your app using your favourite dev stack and toolchain. This should typically be in `./lib/workload/stateless/`
2. **CDK** : write up the deployment "CDK Construct" of your app; to wire up with the root level infrastructure "CDK App".
1. **µApp** : create your app using your favourite dev stack and toolchain. This should typically be in `./lib/workload/stateless/`
2. **CDK** : write up the deployment CDK Stack of your app

> NOTE:
> * We only have one CDK project at outer level of the Git repository root; i.e. a CDK project in TypeScript.
>
> * Nested CDK projects are discouraged to avoid confusion. If you need a specific CDK `App()` object instance for some experimentation, say building a `demo` CDK app, you can instantiate one through project root level `./bin/demo.ts` and, assemble your demo CDK Stack/Construct(s) under `./lib/` directory. Then, wire up as `yarn` target in project root `package.json > script` entry e.g. `"demo": "cdk --app 'npx ts-node --prefer-ts-exts bin/demo.ts'",` Then, call as `yarn demo list` and `yarn demo deploy --all` and so forth.

For either task _(developing an app and/or CDK deployment constructs)_, we promote code reuse for boilerplate and common best-practice patterns. Please take time to read the articles/concepts in the `Reading` section below for developer background technical alignment. We also share knowledge-base (KB) discussions, and revise and harmonise these high-level technical concepts through our routine OrcaBus catchup meetings.

## uApp
## µApp

µApp = microservice app

_Mac user: Option + M for the µ symbol_

_uApp = microservice app_

### Native Bootstrap

You may also just simply use "native toolchain boostrap" method. This could be the typical "getting started" of respective tool or framework. Some examples as follows.
You may also just simply use "native toolchain bootstrap" method. This could be the typical "getting started" of respective tool or framework. Some examples as follows.

```
cargo init
@@ -48,30 +49,42 @@ Think of; it is "the origin" of where your _now_ very complex application to dat

## CDK

Since it is the single CDK Project, all CDK dependencies are managed centrally at `package.json` at the Git repo project root and, the CDK CLI version is harmonised with localised Node.js execution through Yarn e.g. `yarn cdk list`. With this way, every developer's local dev environment and, automation CodePipeline environment will have the same CDK version, enforced.

### Infrastructure as Code for microservice

- Encourage to use CDK with TypeScript.
- You could write one CDK construct from scratch. However, prefer use Construct Library whenever possible.
- In the order of preference; please browse and make use of Construct patterns from the following.
1. https://docs.aws.amazon.com/solutions/latest/constructs/welcome.html
2. https://serverlessland.com
3. https://constructs.dev
- Please check existing microservice implementations for reference.
Since there is a single CDK project, all CDK dependencies are managed centrally in `package.json` at the Git repo project root, and the CDK CLI version is harmonised with localised Node.js execution through Yarn. This way, every developer's local dev environment and the automation CodePipeline environment are forced to use the same CDK version.

For example, to use https://docs.aws.amazon.com/solutions/latest/constructs/aws-cognito-apigateway-lambda.html

- At project root, execute as follows:
```
At project root, execute as follows:

```sh
yarn add @aws-solutions-constructs/aws-cognito-apigateway-lambda
```

- Or, to remove:
```
Or, to remove:

```sh
yarn remove @aws-solutions-constructs/aws-cognito-apigateway-lambda
```

### Infrastructure as Code for microservice

You are encouraged to use CDK with TypeScript.

In general, the microservice application should be deployed as an independent stack. Having your CDK as a stack allows you to deploy
your microservice app without touching other microservice apps.

Most probably, your microservice stack should only create new stateless resources, as most of the stateful part may already
be provisioned by the shared stateful stack. For example, your application may need an RDS cluster for its database,
but the shared stack has an existing RDS cluster that is intended to be used across microservices.

See [SHARED_RESOURCES.md](./SHARED_RESOURCES.md) for more shared resources detail.

Useful resources:

1. https://docs.aws.amazon.com/solutions/latest/constructs/welcome.html
2. https://serverlessland.com
3. https://constructs.dev


## Reading

1. https://trello.com/c/KDVIxQfm/1407-orcabus-v1-capture-high-level-orcabus-design-tech-choices
28 changes: 28 additions & 0 deletions docs/developer/SHARED_RESOURCES.md
@@ -0,0 +1,28 @@
# Shared Resources

In the stateful world of OrcaBus, some resources are shared so that they can be used across microservices.
These resources are deployed in a stack under the CDK stateful app.

These stateful resources usually have a unique name that can act as an ID for the resource. The unique name is
defined in the CDK config file, from where it can be passed to both the stateful and stateless stacks. The stateless stack can
use the resource via a CDK lookup.

## Database

An Amazon Aurora Serverless PostgreSQL cluster is provisioned to be used across microservices.

A security group that allows traffic to the RDS cluster is created and available for lookup, so you can attach it to your compute.
The security group name is in the CDK config, and your microservice can pass it in as one of the stack props.
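
As a rough sketch of this idea, the snippet below defines a shared name once and hands it to both the stateful and the microservice stack props. All identifiers and values here (`sharedResourceNames`, `example-shared-db`, the two props interfaces) are hypothetical placeholders, not the actual entries in the repo's config, and no CDK lookup call is shown.

```typescript
// Hypothetical names for illustration only; the real values live in the CDK config.
const sharedResourceNames = {
  dbClusterIdentifier: 'example-shared-db',
  dbSecurityGroupName: 'example-shared-db-sg',
} as const;

// Props for the stack that creates the shared resources.
interface StatefulStackProps {
  dbClusterIdentifier: string;
  dbSecurityGroupName: string;
}

// Props for a microservice stack that only needs to find the security group.
interface MicroserviceStackProps {
  dbSecurityGroupName: string;
}

// The stateful side names the resources from the shared constants.
function statefulProps(): StatefulStackProps {
  return { ...sharedResourceNames };
}

// The stateless side receives the same name and would use it for its lookup.
function microserviceProps(): MicroserviceStackProps {
  return { dbSecurityGroupName: sharedResourceNames.dbSecurityGroupName };
}

console.log(statefulProps().dbSecurityGroupName === microserviceProps().dbSecurityGroupName);
```

Because both sides read the same constant, neither stack needs a direct reference to the other; only the name is shared.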

Each RDS cluster can contain multiple databases, and each microservice is expected to create its own database and
role for its application. There is a microservice called `PostgresManager` that specifically handles this administrative
task on PostgreSQL.

RDS IAM authentication is enabled for the cluster, so you are encouraged to use it rather than relying on a username-password login to your
database. You can choose the authentication type when creating a role in RDS with the `PostgresManager`.

Please check the [PostgresManager](../../lib/workload/stateless/postgres_manager/README.md) README.


## EventBridge
...
2 changes: 1 addition & 1 deletion lib/pipeline/orcabus-stateful-pipeline-stack.ts
@@ -46,7 +46,7 @@ export class StatefulPipelineStack extends cdk.Stack {
});

const synthAction = new pipelines.CodeBuildStep('Synth', {
commands: ['yarn install --immutable', 'yarn run cdk-stateful-pipeline synth'],
commands: ['yarn install --immutable', 'yarn run cdk-stateful synth'],
input: unitTest,
primaryOutputDirectory: 'cdk.out',
rolePolicyStatements: [
10 changes: 5 additions & 5 deletions lib/pipeline/orcabus-stateless-pipeline-stack.ts
@@ -27,7 +27,7 @@ export class StatelessPipelineStack extends cdk.Stack {
`source $HOME/.cargo/env`,
`pip3 install cargo-lambda`,
],
commands: ['yarn install --immutable', 'make suite'],
commands: ['yarn install --immutable', 'make test-stateless', 'make test-suite'],
input: sourceFile,
primaryOutputDirectory: '.',
buildEnvironment: {
@@ -62,7 +62,7 @@ export class StatelessPipelineStack extends cdk.Stack {
`source $HOME/.cargo/env`,
`pip3 install cargo-lambda`,
],
commands: ['yarn install --immutable', 'yarn run cdk-stateless-pipeline synth'],
commands: ['yarn install --immutable', 'yarn run cdk-stateless synth'],
input: unitTest,
primaryOutputDirectory: 'cdk.out',
rolePolicyStatements: [
@@ -99,7 +99,7 @@ export class StatelessPipelineStack extends cdk.Stack {
const betaConfig = getEnvironmentConfig('beta');
if (!betaConfig) throw new Error(`No 'Beta' account configuration`);
pipeline.addStage(
new OrcaBusStatelessDeploymentStage(this, 'BetaStatelessDeployment', betaConfig.stackProps, {
new OrcaBusStatelessDeploymentStage(this, 'BetaDeployment', betaConfig.stackProps, {
account: betaConfig.accountId,
})
);
@@ -116,7 +116,7 @@ export class StatelessPipelineStack extends cdk.Stack {
// pipeline.addStage(
// new OrcaBusStatelessDeploymentStage(
// this,
// 'GammaStatelessDeployment',
// 'GammaDeployment',
// gammaConfig.stackProps,
// {
// account: gammaConfig.accountId,
@@ -131,7 +131,7 @@ export class StatelessPipelineStack extends cdk.Stack {
// const prodConfig = getEnvironmentConfig('prod');
// if (!prodConfig) throw new Error(`No 'Prod' account configuration`);
// pipeline.addStage(
// new OrcaBusStatelessDeploymentStage(this, 'ProdStatelessDeployment', prodConfig.stackProps, {
// new OrcaBusStatelessDeploymentStage(this, 'ProdDeployment', prodConfig.stackProps, {
// account: prodConfig?.accountId,
// }),
// { pre: [new pipelines.ManualApprovalStep('PromoteToProd')] }