💥 refactor: Refactor codebase and add docs
bmd1905 committed Aug 31, 2024
1 parent 8339814 commit ec83e7b
Showing 276 changed files with 70,176 additions and 175 deletions.
2 changes: 0 additions & 2 deletions .gitignore
@@ -14,7 +14,6 @@ dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
@@ -24,7 +23,6 @@ share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
3 changes: 1 addition & 2 deletions Jenkinsfile
@@ -68,8 +68,7 @@ pipeline {
steps {
script {
container('helm') {
sh("chmod +x ./cluster.sh")
sh("./cluster.sh")
sh("kubectl apply -f ./open-webui/kubernetes/manifest/base -n model-serving")
}
}
}
201 changes: 171 additions & 30 deletions README.md
@@ -5,7 +5,7 @@

[![Pipeline](./assets/prompt_alchemy.jpg)](#features)

## ! Please go to [here](https://bmd1905.github.io/PromptAlchemy/) to read the docs due to the heavily of the documentation)
## Please go [here](https://bmd1905.github.io/PromptAlchemy/) to read the docs, as the documentation is extensive.

## Target Audience: Developers

@@ -227,8 +227,7 @@ Next, deploy the web UI to your GKE cluster:

```bash
cd open-webui
kubens model-serving
kubectl apply -f ./kubernetes/manifest/base
kubectl apply -f ./kubernetes/manifest/base -n model-serving
```

![Deploy Open WebUI](assets/gifs/7-deploy-openwebui.gif)
@@ -245,7 +244,9 @@ For automated CI/CD pipelines, use Jenkins and Ansible as follows:

**1. Set up Jenkins Server:**

First create a Google Compute Engine instance named "jenkins-server" running Ubuntu 22.04 with a firewall rule allowing traffic on ports 8081 and 50000 from any source.
First, create a Service Account and assign it the `Compute Admin` role. Then create a JSON key file for the Service Account and store it in the `iac/ansible/secrets` directory.

Next, create a Google Compute Engine instance named "jenkins-server" running Ubuntu 22.04, with a firewall rule allowing traffic on ports 8081 and 50000.

```bash
ansible-playbook iac/ansible/deploy_jenkins/create_compute_instance.yaml
@@ -257,8 +258,20 @@ Deploy Jenkins on a server by installing prerequisites, pulling a Docker image,
ansible-playbook -i iac/ansible/inventory iac/ansible/deploy_jenkins/deploy_jenkins.yaml
```

![Create Ansible secrets](assets/gifs/9-create-ansible-secrets.gif)

**2. Access Jenkins:**

To access the Jenkins server through SSH, we need to create a public/private key pair. Run the following command to create a key pair:

```bash
ssh-keygen
```

Then open `Metadata` in the Compute Engine console and add your public key as an `ssh-keys` entry.

![Create SSH key pair](assets/gifs/10-setup-ssh-key.gif)
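
The `ssh-keys` step above can be sketched as a small helper. This is a minimal sketch, assuming the Compute Engine metadata format of one `USERNAME:PUBLIC_KEY` entry per line; the function name `format_ssh_keys_entry` is hypothetical and not part of this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print one `ssh-keys` metadata line for a user.
# Assumes the Compute Engine entry format "USERNAME:PUBLIC_KEY".
format_ssh_keys_entry() {
  local username="$1"
  local pubkey_file="$2"
  if [ ! -r "$pubkey_file" ]; then
    echo "public key not found: $pubkey_file" >&2
    return 1
  fi
  # A public key file holds a single line: "<type> <base64> <comment>".
  printf '%s:%s\n' "$username" "$(head -n 1 "$pubkey_file")"
}
```

For example, after `ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519`, running `format_ssh_keys_entry <USERNAME> ~/.ssh/id_ed25519.pub` prints the line to paste into the `ssh-keys` field.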

We need to find the Jenkins server password to be able to access the server. First, access the Jenkins server:

```bash
@@ -268,17 +281,19 @@ ssh <USERNAME>@<EXTERNAL_IP>
Then run the following command to get the password:

```bash
sudo docker exec -it jenkins bash
sudo docker exec -it jenkins-server bash
cat /var/jenkins_home/secrets/initialAdminPassword
```

![Get Jenkins password](assets/gifs/11-get-jenkins-password.gif)

Once Jenkins is deployed, access it via your browser:

```plaintext
http://<EXTERNAL_IP>:8081
```

Get password
![Access Jenkins](assets/gifs/12-access-jenkins-server.gif)

**3. Install Jenkins Plugins:**

@@ -287,53 +302,179 @@ Install the following plugins to integrate Jenkins with Docker, Kubernetes, and
- Docker
- Docker Pipeline
- Kubernetes
- GCloud SDK
- Google Kubernetes Engine

After installing the plugins, restart Jenkins.

```bash
sudo docker restart jenkins-server
```

![Install Jenkins Plugins](assets/gifs/13-install-plugins.gif)

**4. Configure Jenkins:**

Set up your GitHub repository in Jenkins, and add the necessary credentials for DockerHub and GKE.
4.1. Add webhooks to your GitHub repository to trigger Jenkins builds.

Go to the GitHub repository and click on `Settings`. Click on `Webhooks` and then click on `Add Webhook`. Enter the URL of your Jenkins server (e.g. `http://<EXTERNAL_IP>:8081/github-webhook/`). Then select `Let me select individual events`, check `Push` and `Pull Request`, and click on `Add Webhook`.

![Add webhooks to GitHub repository](assets/gifs/14-add-webhooks.gif)
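
The webhook URL from the step above can be built and checked from a terminal before pasting it into GitHub. This is a minimal sketch: the function names `jenkins_webhook_url` and `check_jenkins_reachable` are hypothetical, and the port 8081 and `/github-webhook/` path are taken from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the GitHub webhook payload URL for Jenkins.
# The port defaults to 8081, the port this guide exposes Jenkins on.
jenkins_webhook_url() {
  local external_ip="$1"
  local port="${2:-8081}"
  printf 'http://%s:%s/github-webhook/\n' "$external_ip" "$port"
}

# Hypothetical helper: print the HTTP status code returned by a URL.
# Assumption: this only shows that the port is reachable from your machine;
# it does not prove that GitHub can deliver events to Jenkins.
check_jenkins_reachable() {
  local url="$1"
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}\n' "$url"
}
```

For example, `check_jenkins_reachable "$(jenkins_webhook_url <EXTERNAL_IP>)"` prints a status code if the firewall rule and the Jenkins container are both in place.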

4.2. Add the GitHub repository as a Jenkins source code repository.

Go to the Jenkins dashboard and click on `New Item`. Enter a name for your project (e.g. `prompt-alchemy`) and select `Multibranch Pipeline`. Click on `OK`. Click on `Configure` and then click on `Add Source`. Select `GitHub` and click on `Add`. Enter the URL of your GitHub repository (e.g. `https://github.com/bmd1905/PromptAlchemy`). In the `Credentials` field, select `Add` and select `Username with password`. Enter your GitHub username and password (or use a personal access token). Click on `Test Connection` and then click on `Save`.

![Add Github repository as a Jenkins source code repository](assets/gifs/15-add-github-repo.gif)

4.3. Set up Docker Hub credentials.

First, create a Docker Hub account. Go to the Docker Hub website and click on `Sign Up`. Enter your username and password. Click on `Sign Up`. Click on `Create Repository`. Enter a name for your repository (e.g. `prompt-alchemy`) and click on `Create`.

From the Jenkins dashboard, go to `Manage Jenkins` > `Credentials`. Click on `Add Credentials`. Select `Username with password` and click on `Add`. Enter your Docker Hub username and access token, and set `ID` to `dockerhub`.

![Setup docker hub credentials](assets/gifs/16-setup-dockerhub-credentials.gif)

4.4. Set up Kubernetes credentials.

First, create a Service Account for the Jenkins server to access the GKE cluster. Go to the GCP console and navigate to IAM & Admin > Service Accounts. Create a new service account with the `Kubernetes Engine Admin` role. Give the service account a name and description. Click on the service account and then click on the `Keys` tab. Click on `Add Key` and select `JSON` as the key type. Click on `Create` and download the JSON file.

![Setup Kubernetes credentials](assets/gifs/17-setup-kubernetes-credentials.gif)

Then, from the Jenkins dashboard, go to `Manage Jenkins` > `Cloud`. Click on `New cloud` and select `Kubernetes`. Enter the name of your cluster (e.g. `gke-prompt-alchemy-cluster-1`), then enter the URL and certificate of your GKE cluster. In the `Kubernetes Namespace` field, enter the namespace of your cluster (e.g. `model-serving`). In the `Credentials` field, select `Add` and select `Google Service Account from private key`. Enter your project ID and the path to the JSON file.

![Setup Kubernetes credentials](assets/gifs/18-setup-kubernetes-credentials.gif)

**5. Test the setup:**

Push a new commit to your GitHub repository. You should see a new build in Jenkins.

![Test the setup](assets/gifs/19-test-cicd.gif)


### Monitoring with Prometheus
### Monitoring with Prometheus and Grafana

To monitor your deployed application, follow these steps:
**1. Create Discord webhook:**

**1. Install Dependencies:**
First, create a Discord webhook. Go to the Discord website and click on `Server Settings`. Click on `Integrations`. Click on `Create Webhook`. Enter a name for your webhook (e.g. `prompt-alchemy-discord-webhook`) and click on `Create`. Copy the webhook URL.

![Create Discord webhook](assets/gifs/20-create-discord-webhook.gif)


**2. Configure Helm Repositories**

First, we need to add the necessary Helm repositories for Prometheus and Grafana:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
```

These commands add the official Prometheus and Grafana Helm repositories and update your local Helm chart information.

**3. Install Dependencies**

Prometheus requires certain dependencies that can be managed with Helm. Navigate to the monitoring directory and build these dependencies:

```bash
cd deployments/monitoring/kube-prometheus-stack
helm dependency build
helm dependency build ./deployments/monitoring/kube-prometheus-stack
```

**2. Deploy Prometheus:**
**4. Deploy Prometheus**

Deploy Prometheus and its associated services using Helm:
Now, we'll deploy Prometheus and its associated services using Helm:

```bash
kubectl create namespace monitoring
helm upgrade --install -f deployments/monitoring/kube-prometheus-stack.expanded.yaml kube-prometheus-stack deployments/monitoring/kube-prometheus-stack -n monitoring
```

This setup will provide monitoring capabilities for your Kubernetes cluster, ensuring you can track performance and troubleshoot issues.
This command does the following:
- `helm upgrade --install`: This will install Prometheus if it doesn't exist, or upgrade it if it does.
- `-f deployments/monitoring/kube-prometheus-stack.expanded.yaml`: This specifies a custom values file for configuration.
- `kube-prometheus-stack`: This is the release name for the Helm installation.
- `deployments/monitoring/kube-prometheus-stack`: This is the chart to use for installation.
- `-n monitoring`: This specifies the namespace to install into.

![Deploy Prometheus](assets/gifs/21-start-monitoring-services.gif)
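
The flag breakdown above can be sketched as a small wrapper. This is a minimal sketch, not part of the repository: the function name `deploy_monitoring`, the variable names, and the `RUN` switch are illustrative assumptions, and Helm's `--create-namespace` flag is swapped in for the separate `kubectl create namespace monitoring` step.

```bash
#!/usr/bin/env bash
# Values taken from the command above; the variable names are illustrative.
RELEASE="kube-prometheus-stack"
CHART_DIR="deployments/monitoring/kube-prometheus-stack"
VALUES_FILE="deployments/monitoring/kube-prometheus-stack.expanded.yaml"
NAMESPACE="monitoring"

# Hypothetical wrapper: prints the Helm command, or runs it when RUN=1.
deploy_monitoring() {
  # Collect the Helm arguments as this function's positional parameters.
  set -- upgrade --install -f "$VALUES_FILE" "$RELEASE" "$CHART_DIR" \
    -n "$NAMESPACE" --create-namespace
  if [ "${RUN:-0}" = "1" ]; then
    helm "$@"
  else
    echo "helm $*"
  fi
}
```

Calling `deploy_monitoring` prints the command for review; `RUN=1 deploy_monitoring` executes it against the current cluster context.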

By default, the services are not exposed externally. To access them, you can use port-forwarding:

For Prometheus:
```bash
kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
```
Then access Prometheus at `http://localhost:9090`

For Grafana:
```bash
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
```
Then access Grafana at `http://localhost:3000`

The default credentials for Grafana are usually:
- Username: admin
- Password: prom-operator (you should change this immediately)

![Access Prometheus and Grafana](assets/gifs/22-access-prom-graf.gif)

**5. Test Alerting**

First, forward the Alertmanager port to your local machine so you can post a sample alert:

```bash
kubectl port-forward -n monitoring svc/alertmanager-operated 9093:9093
```

Then, in a new terminal, run the following command:

```bash
curl -XPOST -H "Content-Type: application/json" -d '[
{
"labels": {
"alertname": "DiskSpaceLow",
"severity": "critical",
"instance": "server02",
"job": "node_exporter",
"mountpoint": "/data"
},
"annotations": {
"summary": "Disk space critically low",
"description": "Server02 has only 5% free disk space on /data volume"
},
"startsAt": "2023-09-01T12:00:00Z",
"generatorURL": "http://prometheus.example.com/graph?g0.expr=node_filesystem_free_bytes+%2F+node_filesystem_size_bytes+%2A+100+%3C+5"
},
{
"labels": {
"alertname": "HighMemoryUsage",
"severity": "warning",
"instance": "server03",
"job": "node_exporter"
},
"annotations": {
"summary": "High memory usage detected",
"description": "Server03 is using over 90% of its available memory"
},
"startsAt": "2023-09-01T12:05:00Z",
"generatorURL": "http://prometheus.example.com/graph?g0.expr=node_memory_MemAvailable_bytes+%2F+node_memory_MemTotal_bytes+%2A+100+%3C+10"
}
]' http://localhost:9093/api/v2/alerts
```

This command posts two sample alerts. You can verify that Alertmanager received them by listing the active alerts:

```bash
curl http://localhost:9093/api/v2/alerts
```

Or, you can manually check the Discord channel.

## 📝 To-Do List
![Discord alert](assets/gifs/23-discord-alert.gif)
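
The `startsAt` values in the sample request are fixed timestamps. A minimal sketch of generating the payload with the current UTC time instead is given here; the function name `build_alert_payload` is hypothetical, and the arguments are inserted without JSON escaping, so it assumes plain values without quotes or backslashes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit a one-alert JSON array for POST /api/v2/alerts.
# Arguments: alert name, severity, summary. They are NOT JSON-escaped.
build_alert_payload() {
  local alertname="$1"
  local severity="$2"
  local summary="$3"
  local starts_at
  # Current time in UTC, in the same timestamp format as the sample request.
  starts_at="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  cat <<EOF
[
  {
    "labels": {
      "alertname": "${alertname}",
      "severity": "${severity}"
    },
    "annotations": {
      "summary": "${summary}"
    },
    "startsAt": "${starts_at}"
  }
]
EOF
}
```

With the port-forward still running, the output can be piped straight to Alertmanager: `build_alert_payload DiskSpaceLow critical "Disk space critically low" | curl -XPOST -H "Content-Type: application/json" -d @- http://localhost:9093/api/v2/alerts`.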

### 🚀 Deployment
- [x] Implement core features
- [x] Set up CI pipeline (Jenkins)
- [x] IaC (Ansible + Terraform)
- [x] Monitoring (Grafana + Prometheus + Alert)
- [x] Caching chatbot responses (Redis)
- [ ] Tracing (Jaeger)
- [ ] Set up CD pipeline (Argo CD)
- [ ] Optimize performance (Batching)
---

### 🌟 Post-Launch
- [ ] Create tutorials and examples
- [ ] Gather user feedback
- [ ] Implement enhancements
This setup provides comprehensive monitoring capabilities for your Kubernetes cluster. With Prometheus collecting metrics and Grafana visualizing them, you can effectively track performance, set up alerts for potential issues, and gain valuable insights into your infrastructure and applications.

## Contributing
We welcome contributions to PromptAlchemy! Please see our CONTRIBUTING.md for more information on how to get started.
Binary file added assets/gifs/10-setup-ssh-key.gif
Binary file added assets/gifs/11-get-jenkins-password.gif
Binary file added assets/gifs/12-access-jenkins-server.gif
Binary file added assets/gifs/13-install-plugins.gif
Binary file added assets/gifs/14-add-webhooks.gif
Binary file added assets/gifs/15-add-github-repo.gif
Binary file added assets/gifs/16-setup-dockerhub-credentials.gif
Binary file added assets/gifs/17-setup-kubernetes-credentials.gif
Binary file added assets/gifs/18-setup-kubernetes-credentials.gif
Binary file added assets/gifs/19-test-cicd.gif
Binary file added assets/gifs/20-create-discord-webhook.gif
Binary file added assets/gifs/21-start-monitoring-services.gif
Binary file added assets/gifs/22-access-prom-graf.gif
Binary file added assets/gifs/23-discord-alert.gif
Binary file added assets/gifs/9-create-ansible-secrets.gif
Binary file modified assets/prompt_alchemy.jpg
51 changes: 38 additions & 13 deletions deployments/monitoring/kube-prometheus-stack.expanded.yaml
@@ -1,13 +1,38 @@
grafana:
env:
GF_SERVER_ROOT_URL: http://promptalchemy.monitoring.com/grafana
GF_SERVER_SERVE_FROM_SUB_PATH: 'true'
# username is 'admin'
adminPassword: prom-operator
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/rewrite-target: /$2
hosts: ['promptalchemy.monitoring.com']
path: "/grafana"
alertmanager:
config:
global:
resolve_timeout: 5s
inhibit_rules:
- source_matchers:
- 'severity = critical'
target_matchers:
- 'severity =~ warning|info'
equal:
- 'namespace'
- 'alertname'
- source_matchers:
- 'severity = warning'
target_matchers:
- 'severity = info'
equal:
- 'namespace'
- 'alertname'
- source_matchers:
- 'alertname = InfoInhibitor'
target_matchers:
- 'severity = info'
equal:
- 'namespace'
route:
group_by: [ 'alertname', 'job' ]
group_wait: 15s
group_interval: 2m
repeat_interval: 4h
receiver: discord
receivers:
- name: discord
discord_configs:
- webhook_url: 'https://discord.com/api/webhooks/1279478330130825267/lP9uKOfkd-hbFfTcPCWkkBkwEIkb-1m1V_NWOGD7lnduqjawoQIJKkZU-PlQmn_w3Wbt' # deleted leu leu
- name: 'null'
templates:
- '/etc/alertmanager/config/*.tmpl'
Binary file added deployments/redis/charts/common-2.22.0.tgz
Binary file not shown.
8 changes: 4 additions & 4 deletions iac/ansible/deploy_jenkins/create_compute_instance.yaml
@@ -4,14 +4,14 @@
- name: Start an instance
gcp_compute_instance:
name: jenkins-server
machine_type: e2-medium
machine_type: e2-standard-2
# Refer to https://cloud.google.com/compute/docs/images/os-details#ubuntu_lts
# or use the command `gcloud compute images list --project=ubuntu-os-cloud`
zone: asia-southeast1-b
zone: us-central1-a
project: prompt-alchemy
# The service account is needed to create the resources
auth_kind: serviceaccount
service_account_file: ../secrets/prompt-alchemy-088144947091.json
service_account_file: ../secrets/prompt-alchemy-d32063c4c41d.json
disks:
- auto_delete: true
boot: true
@@ -42,4 +42,4 @@
description: Allow incoming traffic on port 30000
project: prompt-alchemy
auth_kind: serviceaccount
service_account_file: ../secrets/prompt-alchemy-088144947091.json
service_account_file: ../secrets/prompt-alchemy-d32063c4c41d.json
2 changes: 1 addition & 1 deletion iac/ansible/deploy_jenkins/deploy_jenkins.yaml
@@ -3,7 +3,7 @@
hosts: servers # Which host to apply, you can replace by `servers`, or by `servers_1, servers_2` for multiple groups
become: yes # To run commands as a superuser (e.g., sudo)
vars:
default_container_name: jenkins
default_container_name: jenkins-server
default_container_image: bmd1905/jenkins-k8s
tasks:
- name: Install aptitude
2 changes: 1 addition & 1 deletion iac/ansible/inventory
@@ -1,2 +1,2 @@
[servers]
35.240.191.221 ansible_ssh_private_key_file=/Users/bmd1905/.ssh/id_ed25519
34.121.184.104 ansible_ssh_private_key_file=/Users/bmd1905/.ssh/id_ed25519