
Commit

Merge remote-tracking branch 'upstream/master'
yang1666204 committed Apr 23, 2024
2 parents 8f13cb7 + 01f6fb1 commit 11a17fd
Showing 96 changed files with 942 additions and 345 deletions.
100 changes: 100 additions & 0 deletions .github/workflows/codeql.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
  push:
    branches: [ "master", "*_release" ]
    paths:
      - '**/*.go'
      - 'ui/**/*'
  pull_request:
    branches: [ "master", "*_release" ]
    paths:
      - '**/*.go'
      - 'ui/**/*'
  schedule:
    - cron: '26 23 * * 5'

jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    # Runner size impacts CodeQL analysis time. To learn more, please see:
    #   - https://gh.io/recommended-hardware-resources-for-running-codeql
    #   - https://gh.io/supported-runners-and-hardware-resources
    #   - https://gh.io/using-larger-runners (GitHub.com only)
    # Consider using larger runners or machines with greater resources for possible analysis time improvements.
    runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
    timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
    permissions:
      # required for all workflows
      security-events: write

      # required to fetch internal or private CodeQL packs
      packages: read

      # only required for workflows in private repositories
      actions: read
      contents: read

    strategy:
      fail-fast: false
      matrix:
        include:
          - language: go
            build-mode: autobuild
          - language: javascript-typescript
            build-mode: none
        # CodeQL supports the following values for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'
        # Use 'c-cpp' to analyze code written in C, C++, or both
        # Use 'java-kotlin' to analyze code written in Java, Kotlin, or both
        # Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript, or both
        # To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
        # see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
        # If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      # Initializes the CodeQL tools for scanning.
      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          build-mode: ${{ matrix.build-mode }}
          # If you wish to specify custom queries, you can do so here or in a config file.
          # By default, queries listed here will override any specified in a config file.
          # Prefix the list here with "+" to use these queries and those in the config file.

          # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
          # queries: security-extended,security-and-quality

      # If the analyze step fails for one of the languages you are analyzing with
      # "We were unable to automatically build your code", modify the matrix above
      # to set the build mode to "manual" for that language. Then modify this step
      # to build your code.
      # ℹ️ Command-line programs to run using the OS shell.
      # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
      - if: matrix.build-mode == 'manual'
        run: |
          echo 'If you are using a "manual" build mode for one or more of the' \
            'languages you are analyzing, replace this with the commands to build' \
            'your code, for example:'
          echo '  make bootstrap'
          echo '  make release'
          exit 1

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{matrix.language}}"
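
If autobuild ever fails for the Go analysis, the comments above suggest switching that matrix entry to manual mode and supplying build commands. A sketch of what that change could look like — `make build` is an assumed build command, not taken from this repository:

```yaml
strategy:
  matrix:
    include:
      - language: go
        build-mode: manual   # replaces 'autobuild' when autobuild fails
steps:
  # ... checkout and init steps unchanged ...
  - if: matrix.build-mode == 'manual'
    run: |
      # Replace with the project's actual build commands (assumed here).
      make build
```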
4 changes: 2 additions & 2 deletions README-CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,7 @@ echo $(kubectl get -n default secret oceanbase-dashboard-user-credentials -o jso
```
一个 NodePort 类型的 service 会默认创建,可以通过如下命令查看 service 的地址,然后在浏览器中打开。
```
kubectl get svc oceanbase-dashboard-ob-dashboard
kubectl get svc oceanbase-dashboard-oceanbase-dashboard
```
![oceanbase-dashboard-service](./docsite/static/img/oceanbase-dashboard-service.jpg)

Expand All @@ -154,7 +154,7 @@ ob-operator 以 kubebuilder 为基础,通过统一的资源管理器接口、

![ob-operator 架构设计](./docsite/static/img/ob-operator-arch.png)

有关架构细节可参见[架构设计文档](./docsite/i18n/zh-Hans/docusaurus-plugin-content-docs/current/developer/arch.md)
有关架构细节可参见[架构设计文档](https://oceanbase.github.io/ob-operator/zh-Hans/docs/developer/arch)

## 特性

Expand Down
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ echo $(kubectl get -n default secret oceanbase-dashboard-user-credentials -o jso
```
A service of type NodePort is created by default, you can check the address and port and open it in browser
```
kubectl get svc oceanbase-dashboard-ob-dashboard
kubectl get svc oceanbase-dashboard-oceanbase-dashboard
```
![oceanbase-dashboard-service](./docsite/static/img/oceanbase-dashboard-service.jpg)

Expand All @@ -153,7 +153,7 @@ ob-operator is built on top of kubebuilder and provides control and management o

![ob-operator Architecture](./docsite/static/img/ob-operator-arch.png)

For more detailed information about the architecture, please refer to the [Architecture Document](./docsite/docs/developer/arch.md).
For more detailed information about the architecture, please refer to the [Architecture Document](https://oceanbase.github.io/ob-operator/docs/developer/arch).


## Features
Expand Down
40 changes: 1 addition & 39 deletions docsite/docs/developer/arch.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,42 +66,4 @@ To address this issue, ob-operator adopts task flow mechanism and a global task

The relationship among the control loop, resource manager, and task manager is depicted in the following figure.

![The relationship among the control loop, resource manager, and task manager](/img/ob-operator-arch.png)


Tasks in the task flow are submitted by the Resource Manager (`ResourceManager`) to the global Task Manager (`TaskManager`) for execution. The overall relationship and interaction flow between the resources, Resource Manager, and Task Manager are illustrated in the following sequence diagram:

<main>
<pre class="mermaid">
sequenceDiagram
participant r as Resource
participant c as Controller (ResourceManager)
participant t as TaskManager
autonumber
r->>c: Resource changes
c->>t: Get task flow according to resource status
t->>t: Create goroutine to execute specific task
t->>c: Return task ID to controller
c->>r: Stores task ID and other task context in resource
loop Watch task progress
r->>c: Requeue and requeue
c->>t: Checks the task status
alt If task is still pending
t->>c: Empty result
c->>c: Continues loop and requeues resource with a shorter interval
else If task is finished
t->>c: Task results
alt if no other tasks in flow
c->>r: Updates status of resource
else if there are other tasks in flow
c->>r: Updates task context of resource
c->>t: Watches progress of new task, back to [6] loop
end
end
end
t->>t: Clean maps
</pre>
<script type="module">
import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs';
</script>
</main>
![The relationship among the control loop, resource manager, and task manager](/img/ob-operator-arch.png)
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 1
---

# Manage clusters

ob-operator defines the following custom resource definitions (CRDs) based on the deployment mode of OceanBase clusters:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 2
---

# Create a cluster

This topic describes how to create an OceanBase cluster by using ob-operator.
Expand Down Expand Up @@ -144,9 +148,9 @@ The following table describes available annotations. For short, the annotation `

| Annotation | Description |
| -- | -- |
| `independent-pvc-lifecycle` | `true`: PVCs won't be deleted even if the OBCluster is deleted. |
| `mode` | `standalone`: Require observer version >= 4.2.0. Bootstrap the single-node cluster with 127.0.0.1, which cannot contact other nodes any more. <br/> `service`: Require observer version >= 4.2.3. Create a specific K8s service for each OBServer and use the service's `ClusterIP` as the OBServer's IP address. |
| `single-pvc` | `true`: Create and bind a single PVC to an OBServer pod (three PVCs by default). |
| `independent-pvc-lifecycle` | `true`: Require ob-operator >= 2.1.1. PVCs won't be deleted even if the OBCluster is deleted. |
| `mode` | `standalone`: Require ob-operator >= 2.1.1 and observer version >= 4.2.0. Bootstrap the single-node cluster with 127.0.0.1, which cannot contact other nodes any more. <br/> `service`: Require ob-operator >= 2.2.0 and observer version >= 4.2.3. Create a specific K8s service for each OBServer and use the service's `ClusterIP` as the OBServer's IP address. |
| `single-pvc` | `true`: Require ob-operator >= 2.1.2. Create and bind a single PVC to an OBServer pod (three PVCs by default). |
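
As an illustration, the annotations from the table attach to the OBCluster's metadata. A sketch, assuming the `oceanbase.oceanbase.com/` annotation prefix used elsewhere in this commit:

```yaml
apiVersion: oceanbase.oceanbase.com/v1alpha1
kind: OBCluster
metadata:
  name: test
  namespace: oceanbase
  annotations:
    oceanbase.oceanbase.com/independent-pvc-lifecycle: "true"
    oceanbase.oceanbase.com/mode: service
spec:
  # ...
```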

### Create a cluster

Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
label: Zone management
position: 1
position: 3
link:
type: generated-index
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
label: Server management
position: 2
position: 4
link:
type: generated-index
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 5
---

# Upgrade a cluster

This topic describes how to upgrade an OceanBase cluster that is deployed by using ob-operator.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 6
---

# Manage parameters

This topic describes how to modify the parameters of an OceanBase cluster by using ob-operator.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 6.5
---

# Update resources

After the cluster is created and running, we may need to adjust the resource configuration of the OBServer nodes, such as CPU, memory, and storage volumes. This topic describes which resource configurations can be modified and how to modify them.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 7
---

# Delete a cluster

This topic describes how to delete an OceanBase cluster by using ob-operator.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
---
sidebar_position: 1
---
# Manage tenants

ob-operator defines the following resources for OceanBase Database tenants:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 1.5
---

# Create a tenant

This topic describes how to create a tenant by using ob-operator.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 3
---

# Delete a tenant

This topic describes how to use ob-operator to delete a tenant from a Kubernetes environment.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 4
---

# Perform tenant O&M operations

A tenant contains various resources. To prevent a tenant from becoming bloated and improve tenant O&M flexibility, ob-operator provides the tenant O&M resource `OBTenantOperation` for you to perform intra-tenant and inter-tenant O&M operations. ob-operator V2.1.0 supports three O&M operations: changing the password of the root user, activating a standby tenant, and executing a primary/standby tenant switchover. Standby tenant activation and primary/standby tenant switchover are related to the [physical standby database](../300.high-availability/600.standby-tenant-of-ob-operator.md) feature. Here are sample configurations of the three O&M operations:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
sidebar_position: 1
---

# High availability

ob-operator ensures the high availability of data by using the following features of OceanBase Database.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,30 +1,82 @@
# Restore service from node failure
---
sidebar_position: 2
---

This topic describes how to handle the faults of OBServer nodes by using ob-operator.
import Tabs from '@theme/Tabs'
import TabItem from '@theme/TabItem'

## Prerequisites
# Recover from node failure

* You must deploy an OceanBase cluster with at least 3 OBServer nodes and a tenant with 3 replicas or above.
* To use the static IP address feature, you must install `Calico` as the network plugin for the Kubernetes cluster.
This topic describes how to recover from the failures of OBServer nodes by using ob-operator.

## Restore policy
:::note

When a minority of OBServer nodes fail, the multi-replica mechanism of OceanBase Database ensures the availability of the cluster, and ob-operator detects a pod exception. Then, ob-operator creates an OBServer node, adds it to the cluster, and deletes the abnormal OBServer node.
OceanBase Database replicates the data of replicas on the abnormal OBServer node to the new node.
Before OceanBase 4.2.3.0, the database kernel could not communicate through virtual IP addresses, so an observer whose Pod IP address has changed cannot start normally. To restart an observer in place after a failure, you must pin the IP address of its node. Otherwise, you can only rely on the surviving majority of nodes: a new Pod is added to the cluster as a new node and data is synchronized to it to restore the original number of nodes.

:::


## Based on the multi-replica capability of OceanBase Database

### Prerequisites

- To successfully recover the OceanBase cluster, you must deploy at least three nodes and a tenant with at least three replicas.
- **This method can only handle the failure of a minority of nodes**, such as the failure of one node in a three-node cluster.

### Recovery policy

When an OBServer node fails, ob-operator detects the pod exception and creates a new OBServer node to join the cluster. The new OBServer node synchronizes data from the surviving replicas until all data is restored.

During the recovery process, if a majority of OBServer nodes fail, the cluster cannot be restored. In this case, you must manually restore the cluster by using the backup and restore feature of ob-operator.

## Based on the Calico network plugin

### Prerequisites

- To use the static IP address feature, you must install [Calico](https://docs.tigera.io/calico/latest/getting-started/kubernetes/) as the network plugin for the Kubernetes cluster.

### Restore policy

When a minority of OBServer nodes fail, the multi-replica mechanism of OceanBase Database ensures the availability of the cluster, and ob-operator detects a pod exception. Then, ob-operator creates an OBServer node, adds it to the cluster, and deletes the abnormal OBServer node. OceanBase Database replicates the data of replicas on the abnormal OBServer node to the new node.

If you have installed Calico for the Kubernetes cluster, this process can be easier. ob-operator can start a new observer process by using the IP address of the abnormal OBServer node. This way, the data on the abnormal OBServer node, if the data still exists, can be directly used without the replication step. Moreover, if a majority of OBServer nodes fail, this method can also restore the service after all new OBServer nodes are started.
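
Under the hood, Calico can pin a pod's IP address through a pod annotation, which is what makes restarting an observer with its original address possible. A rough illustration — ob-operator manages this automatically, and the address shown here is made up:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: obcluster-1-zone1-example   # hypothetical pod name
  annotations:
    cni.projectcalico.org/ipAddrs: '["10.244.1.23"]'   # Calico static IP assignment
spec:
  # ...
```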

## Based on Kubernetes service

### Prerequisites

- Only OceanBase Database of version 4.2.3.0 or later supports this feature.
- Create an OBCluster in `service` mode. The configuration is as follows:

```yaml
apiVersion: oceanbase.oceanbase.com/v1alpha1
kind: OBCluster
metadata:
  name: test
  namespace: oceanbase
  annotations:
    oceanbase.oceanbase.com/mode: service # This is the key configuration
spec:
  # ...
```

### Restore policy

After you create an OBCluster in `service` mode, ob-operator attaches a Kubernetes Service to each OBServer pod and uses the Service's ClusterIP as the observer's networking IP address.

1. When an OBServer pod restarts, the observer can restart in place by using the constant ClusterIP of the service for communication.
2. When an OBServer pod is mistakenly deleted, ob-operator will create a new OBServer pod and use the same ClusterIP for communication. The new node will automatically join the OceanBase cluster and resume service.
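
Conceptually, the per-pod Service that ob-operator attaches in `service` mode is an ordinary ClusterIP Service — roughly like the following sketch (the name and selector label are assumptions, not taken from ob-operator's source; 2881/2882 are OceanBase's standard SQL and RPC ports):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: obcluster-1-zone1-074bda77c272   # one Service per OBServer pod (name assumed)
  namespace: oceanbase
spec:
  type: ClusterIP   # the stable ClusterIP serves as the observer's address
  selector:
    ref-pod: obcluster-1-zone1-074bda77c272   # assumed selector label
  ports:
    - name: sql
      port: 2881
    - name: rpc
      port: 2882
```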

## Verification

You can verify the restore result of ob-operator by performing the following steps.

Delete the pod
1. Delete the pod:

```shell
kubectl delete pod obcluster-1-zone1-074bda77c272 -n oceanbase
```

View the restore result. The output shows that the pod of zone1 has been created and is ready.
2. View the restore result. The output shows that the pod of zone1 has been created and is ready.

```shell
kubectl get pods -n oceanbase
Expand All @@ -37,4 +89,4 @@ obcluster-1-zone1-94ecf05cb290 2/2 Running 0 1m

## Deployment suggestions

To deploy a production cluster with high availability, we recommend that you deploy an OceanBase cluster of at least three OBServer nodes, create tenants that each has at least three replicas, distribute nodes of each zone on different servers, and use the network plug-in Calico. This minimizes the risk of an unrecoverable cluster disaster. ob-operator also provides high-availability solutions based on the backup and restore of tenant data and the primary and standby tenants. For more information, see [Restore data from a backup](500.data-recovery-of-ob-operator.md) and [Back up a tenant](400.tenant-backup-of-ob-operator.md).
To deploy a production cluster with high availability, we recommend that you deploy an OceanBase cluster of at least three OBServer nodes, create tenants each of which has at least three replicas, distribute the nodes of each zone across different servers, and use the Calico network plug-in (or set the OBCluster to `service` mode in ob-operator 2.2.0 and later). This minimizes the risk of an unrecoverable cluster disaster. ob-operator also provides high-availability solutions based on the backup and restore of tenant data and on primary and standby tenants. For more information, see [Restore data from a backup](500.data-recovery-of-ob-operator.md) and [Back up a tenant](400.tenant-backup-of-ob-operator.md).