docs: admin documents (#964)
killme2008 committed May 18, 2024
1 parent b1f426c commit 8077b6e
Showing 16 changed files with 245 additions and 16 deletions.
8 changes: 4 additions & 4 deletions docs/nightly/en/reference/sql/create.md
@@ -50,7 +50,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name
...
[TIME INDEX (column)],
[PRIMARY KEY(column1, column2, ...)]
) ENGINE = engine WITH([TTL | REGIONS] = expr, ...)
) ENGINE = engine WITH([TTL | storage | ...] = expr, ...)
[
PARTITION ON COLUMNS(column1, column2, ...) (
<PARTITION EXPR>,
@@ -92,13 +92,13 @@ Users can add table options by using `WITH`. The valid options include the following:
| `memtable.type` | Type of the memtable. | String value, supports `time_series`, `partition_tree`. |
| `append_mode` | Whether the table is append-only | String value. Default is 'false', which removes duplicate rows by primary keys and timestamps. Set it to 'true' to enable append mode and create an append-only table that keeps duplicate rows. |

For example, to create a table with a storage data TTL (Time-To-Live) of seven days and a region number of 10:
For example, to create a table with a storage data TTL (Time-To-Live) of seven days:

```sql
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10);
) engine=mito with(ttl='7d');
```
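
The other options in the table above are set in the same way. For instance, a minimal sketch of an append-only table (the `http_logs` table and its columns are illustrative, not from this document):

```sql
-- append_mode keeps duplicate rows instead of deduplicating them
CREATE TABLE IF NOT EXISTS http_logs(
  ts TIMESTAMP TIME INDEX,
  message STRING,
) engine=mito with(append_mode='true');
```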

Create a table that stores the data in Google Cloud Storage:
@@ -107,7 +107,7 @@ Create a table that stores the data in Google Cloud Storage:
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10, storage="Gcs");
) engine=mito with(ttl='7d', storage="Gcs");
```

Create a table with custom compaction options. The table will attempt to partition data into 1-day time windows based on the timestamps of the data.
21 changes: 21 additions & 0 deletions docs/nightly/en/reference/sql/functions.md
@@ -37,6 +37,27 @@ Where the `datatype` can be any valid Arrow data type in this [list](https://arr

Please refer to the [API documentation](https://greptimedb.rs/script/python/rspython/builtins/greptime_builtin/index.html#functions)

### Admin Functions

GreptimeDB provides some administration functions to manage the database and data:

* `flush_table(table_name)` to flush a table's memtables into SST files by table name.
* `flush_region(region_id)` to flush a region's memtables into SST files by region id. Find the region id through the [REGION_PEERS](./information-schema/region-peers.md) table.
* `compact_table(table_name)` to schedule a compaction task for a table by table name.
* `compact_region(region_id)` to schedule a compaction task for a region by region id.
* `migrate_region(region_id, from_peer, to_peer, [timeout])` to migrate regions between datanodes; please read [Region Migration](/user-guide/operations/region-migration).
* `procedure_state(procedure_id)` to query a procedure state by its id.

For example:
```sql
-- Flush the table test --
select flush_table("test");

-- Schedule a compaction for table test --
select compact_table("test");
```
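
The region-level and procedure functions follow the same pattern. A sketch, assuming the region id was looked up in `REGION_PEERS` and the procedure id was returned by an earlier call (both values here are illustrative):

```sql
-- Flush the region with the given id --
select flush_region(4398046511104);

-- Schedule a compaction for the region --
select compact_region(4398046511104);

-- Query the state of a procedure by its id --
select procedure_state('538b7476-9f79-4e50-aa9c-b1de90710839');
```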


## Time and Date

### `date_trunc`
4 changes: 2 additions & 2 deletions docs/nightly/en/summary.yml
@@ -66,8 +66,9 @@
- api
- cluster
- Operations:
# - overview
- admin
- configuration
- back-up-&-restore-data
- kubernetes
- gtctl
- run-on-android
@@ -81,7 +82,6 @@
# - alert
# - import-data
# - export-data
# - back-up-&-restore-data
# - capacity-planning
- upgrade
- GreptimeCloud:
@@ -16,7 +16,7 @@ Of course, you can set TTL for every table when creating it:
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10);
) engine=mito with(ttl='7d');
```

The TTL of `temperatures` is set to seven days.
30 changes: 30 additions & 0 deletions docs/nightly/en/user-guide/operations/admin.md
@@ -0,0 +1,30 @@
# Administration

This document addresses strategies and practices used in the operation of GreptimeDB systems and deployments.

## Database/Cluster management

* [Installation](/getting-started/installation/overview.md) for GreptimeDB and the [g-t-control](./gtctl.md) command line tool.
* For database configuration, please read the [Configuration](./configuration.md) reference.
* [Monitoring](./monitoring.md) and [Tracing](./tracing.md) for GreptimeDB.
* GreptimeDB [Backup & Restore methods](./back-up-\&-restore-data.md).

### Runtime information

* Find the topology information of the cluster through the [CLUSTER_INFO](/reference/sql/information-schema/cluster-info.md) table.
* Find the region distribution of tables through the [REGION_PEERS](/reference/sql/information-schema/region-peers.md) table.

The `INFORMATION_SCHEMA` database provides access to system metadata, such as the name of a database or table, the data type of a column, etc. Please read the [reference](/reference/sql/information-schema/overview.md).
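
For example, a quick look at the runtime information (a minimal sketch; the exact output columns depend on your deployment):

```sql
-- Cluster topology --
SELECT * FROM INFORMATION_SCHEMA.CLUSTER_INFO;

-- Region distribution of tables --
SELECT * FROM INFORMATION_SCHEMA.REGION_PEERS;
```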

## Data management

* [The Storage Location](/user-guide/concepts/storage-location.md).
* Cluster Failover for GreptimeDB by [Setting Remote WAL](./remote-wal/quick-start.md).
* [Flush and Compaction for Table & Region](/reference/sql/functions#admin-functions).
* Partition the table into regions; read the [Table Sharding](/contributor-guide/frontend/table-sharding.md) reference.
* [Migrate the Region](./region-migration.md) for load balancing.
* [Expire Data by Setting TTL](/user-guide/concepts/features-that-you-concern#can-i-set-ttl-or-retention-policy-for-different-tables-or-measurements).

## Best Practices

TODO
46 changes: 46 additions & 0 deletions docs/nightly/en/user-guide/operations/back-up-&-restore-data.md
@@ -1 +1,47 @@
# Back up & restore data

Use the [`COPY` command](/reference/sql/copy.md) to back up and restore data.

## Backup Table

Back up the table `monitor` in `parquet` format to the file `/home/backup/monitor/monitor.parquet`:

```sql
COPY monitor TO '/home/backup/monitor/monitor.parquet' WITH (FORMAT = 'parquet');
```

Back up the data within a time range:

```sql
COPY monitor TO '/home/backup/monitor/monitor_20240518.parquet' WITH (FORMAT = 'parquet', START_TIME='2024-05-18 00:00:00', END_TIME='2024-05-19 00:00:00');
```

The above command exports the data of `2024-05-18`. Run such commands periodically to achieve incremental backups, as the sketch below shows.
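
For instance, the next day's increment could look like this (a sketch that assumes the same naming scheme):

```sql
COPY monitor TO '/home/backup/monitor/monitor_20240519.parquet' WITH (FORMAT = 'parquet', START_TIME='2024-05-19 00:00:00', END_TIME='2024-05-20 00:00:00');
```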

## Restore Table

Restore the `monitor` table:

```sql
COPY monitor FROM '/home/backup/monitor/monitor.parquet' WITH (FORMAT = 'parquet');
```

If the data is exported incrementally and all the files are under the same folder with different file names, you can restore them with the `PATTERN` option:

```sql
COPY monitor FROM '/home/backup/monitor/' WITH (FORMAT = 'parquet', PATTERN = '.*parquet');
```

## Backup & Restore Database

It's almost the same as for a table:

```sql
-- Back up the database public --
COPY DATABASE public TO '/home/backup/public/' WITH (FORMAT='parquet');
-- Restore the database public --
COPY DATABASE public FROM '/home/backup/public/' WITH (FORMAT='parquet');
```

Look at the folder `/home/backup/public/`; the command exports each table as a separate file.
17 changes: 17 additions & 0 deletions docs/nightly/en/user-guide/operations/region-migration.md
@@ -60,3 +60,20 @@ select migrate_region(region_id, from_peer_id, to_peer_id, replay_timeout);
| `from_peer_id` | The peer id of the migration source (Datanode). | **Required** | |
| `to_peer_id` | The peer id of the migration destination (Datanode). | **Required** | |
| `replay_timeout` | The timeout (in seconds) for replaying data. If the new Region fails to replay the data within the specified timeout, the migration fails; however, the data in the old Region will not be lost. | Optional | |

## Query the migration state

The `migrate_region` function returns the id of the procedure that executes the migration; query the procedure state by this id:

```sql
select procedure_state('538b7476-9f79-4e50-aa9c-b1de90710839');
```

When it's done, the state is output in JSON:

```json
{"status":"Done"}
```

Of course, you can confirm the region distribution by querying the `region_peers` and `partitions` tables in `information_schema`, as in the sketch below.
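
For example (a minimal sketch; the rows returned depend on your cluster):

```sql
select * from information_schema.region_peers;
```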

2 changes: 1 addition & 1 deletion docs/nightly/en/user-guide/table-management.md
Expand Up @@ -220,7 +220,7 @@ Using the following code to create a table through POST method:
curl -X POST \
-H 'authorization: Basic {{authorization if exists}}' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'sql=CREATE TABLE monitor (host STRING, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(), cpu FLOAT64 DEFAULT 0, memory FLOAT64, TIME INDEX (ts), PRIMARY KEY(host)) ENGINE=mito WITH(regions=1)' \
-d 'sql=CREATE TABLE monitor (host STRING, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(), cpu FLOAT64 DEFAULT 0, memory FLOAT64, TIME INDEX (ts), PRIMARY KEY(host)) ENGINE=mito' \
http://localhost:4000/v1/sql?db=public
```

8 changes: 4 additions & 4 deletions docs/nightly/zh/reference/sql/create.md
@@ -50,7 +50,7 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name
...
[TIME INDEX (column)],
[PRIMARY KEY(column1, column2, ...)]
) ENGINE = engine WITH([TTL | REGIONS] = expr, ...)
) ENGINE = engine WITH([TTL | storage | ...] = expr, ...)
[
PARTITION ON COLUMNS(column1, column2, ...) (
<PARTITION EXPR>,
@@ -93,13 +93,13 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name
| `memtable.type` | Type of the memtable. | String value, supports `time_series` and `partition_tree`. |
| `append_mode` | Whether the table is append-only | String value. Default is 'false', which removes duplicate rows by primary keys and timestamps. Set it to 'true' to enable append mode and create an append-only table that keeps duplicate rows. |

For example, to create a table with a storage data TTL (Time-To-Live) of seven days and a region number of 10:
For example, to create a table with a storage data TTL (Time-To-Live) of seven days:

```sql
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10);
) engine=mito with(ttl='7d');
```

Or create a table that stores its data separately on Google Cloud Storage:
@@ -108,7 +108,7 @@ CREATE TABLE IF NOT EXISTS temperatures(
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10, storage="Gcs");
) engine=mito with(ttl='7d', storage="Gcs");
```

Create a table with custom twcs compaction options. The table will attempt to partition data into 1-day time windows based on the timestamps of the data.
20 changes: 20 additions & 0 deletions docs/nightly/zh/reference/sql/functions.md
@@ -36,6 +36,26 @@ arrow_cast(expression, datatype)

Please refer to the [API documentation](https://greptimedb.rs/script/python/rspython/builtins/greptime_builtin/index.html#functions)

### Admin Functions

GreptimeDB provides some administration functions to manage the database and data:

* `flush_table(table_name)` to flush a table's memtables into SST files by table name.
* `flush_region(region_id)` to flush a region's memtables into SST files by region id. Find the region id through the [REGION_PEERS](./information-schema/region-peers.md) table.
* `compact_table(table_name)` to schedule a compaction task for a table by table name.
* `compact_region(region_id)` to schedule a compaction task for a region by region id.
* `migrate_region(region_id, from_peer, to_peer, [timeout])` to migrate regions between datanodes; please read [Region Migration](/user-guide/operations/region-migration).
* `procedure_state(procedure_id)` to query a procedure state by its id.

For example:
```sql
-- Flush the table test --
select flush_table("test");

-- Schedule a compaction task for the table test --
select compact_table("test");
```
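
A concrete `migrate_region` call might look like the following sketch, where the region id and the peer ids are illustrative and the last argument is the replay timeout in seconds:

```sql
-- Migrate the region from datanode 1 to datanode 2 with a 60-second replay timeout --
select migrate_region(4398046511104, 1, 2, 60);
```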

## Time and Date

### `date_trunc`
Expand Down
3 changes: 3 additions & 0 deletions docs/nightly/zh/summary-i18n.yml
Expand Up @@ -25,5 +25,8 @@ Frontend: Frontend
Datanode: Datanode
Metasrv: Metasrv
Reference: Reference
Admin: 管理
Administration: 管理
back-up-&-restore-data: 备份和恢复
SDK: SDK
SQL: SQL
@@ -16,7 +16,7 @@
CREATE TABLE IF NOT EXISTS temperatures(
ts TIMESTAMP TIME INDEX,
temperature DOUBLE DEFAULT 10,
) engine=mito with(ttl='7d', regions=10);
) engine=mito with(ttl='7d');
```

In the above SQL, the TTL of the `temperatures` table is set to seven days.
30 changes: 30 additions & 0 deletions docs/nightly/zh/user-guide/operations/admin.md
@@ -0,0 +1,30 @@
# Administration

This document describes strategies and practices used in the operation of GreptimeDB systems and deployments.

## Database/Cluster management

* [Installation](/getting-started/installation/overview.md) for GreptimeDB and the [g-t-control](./gtctl.md) command line tool.
* For database configuration, please read the [Configuration](./configuration.md) reference.
* [Monitoring](./monitoring.md) and [Tracing](./tracing.md) for GreptimeDB.
* GreptimeDB [Backup & Restore methods](./back-up-\&-restore-data.md).

### Runtime information

* Find the topology information of the cluster through the [CLUSTER_INFO](/reference/sql/information-schema/cluster-info.md) table.
* Find the region distribution of tables through the [REGION_PEERS](/reference/sql/information-schema/region-peers.md) table.

The `INFORMATION_SCHEMA` database provides access to system metadata, such as the name of a database or table, the data type of a column, etc. Please read the [reference](/reference/sql/information-schema/overview.md).

## Data management

* [The Storage Location](/user-guide/concepts/storage-location.md).
* Cluster failover for GreptimeDB by [setting Remote WAL](./remote-wal/quick-start.md).
* [Flush and Compaction for Table & Region](/reference/sql/functions#admin-functions).
* Partition the table into regions; read the [Table Sharding](/contributor-guide/frontend/table-sharding.md) reference.
* [Migrate the Region](./region-migration.md) for load balancing.
* [Expire data by setting TTL](/user-guide/concepts/features-that-you-concern#can-i-set-ttl-or-retention-policy-for-different-tables-or-measurements).

## Best Practices

TODO
48 changes: 47 additions & 1 deletion docs/nightly/zh/user-guide/operations/back-up-&-restore-data.md
@@ -1 +1,47 @@
TODO
# Back up & restore data

Use the [`COPY` command](/reference/sql/copy.md) to back up and restore data.

## Backup Table

Back up the table `monitor` in `parquet` format to the file `/home/backup/monitor/monitor.parquet`:

```sql
COPY monitor TO '/home/backup/monitor/monitor.parquet' WITH (FORMAT = 'parquet');
```

Back up the data within a time range:

```sql
COPY monitor TO '/home/backup/monitor/monitor_20240518.parquet' WITH (FORMAT = 'parquet', START_TIME='2024-05-18 00:00:00', END_TIME='2024-05-19 00:00:00');
```

The above command exports the data of `2024-05-18`. Run such commands periodically to achieve incremental backups.

## Restore Table

Restore the `monitor` table:

```sql
COPY monitor FROM '/home/backup/monitor/monitor.parquet' WITH (FORMAT = 'parquet');
```

If the data is exported incrementally and all the files are under the same folder with different file names, you can restore them with the `PATTERN` option:

```sql
COPY monitor FROM '/home/backup/monitor/' WITH (FORMAT = 'parquet', PATTERN = '.*parquet');
```

## Backup & Restore Database

The commands are similar to those for a table:

```sql
-- Back up the database public --
COPY DATABASE public TO '/home/backup/public/' WITH (FORMAT='parquet');

-- Restore the database public --
COPY DATABASE public FROM '/home/backup/public/' WITH (FORMAT='parquet');
```

After exporting, look at the folder `/home/backup/public/`; the command exports each table as a separate file.
16 changes: 16 additions & 0 deletions docs/nightly/zh/user-guide/operations/region-migration.md
@@ -59,3 +59,19 @@ select migrate_region(region_id, from_peer_id, to_peer_id, replay_timeout);
| `from_peer_id` | The peer id of the migration source (Datanode). | **Required** | |
| `to_peer_id` | The peer id of the migration destination (Datanode). | **Required** | |
| `replay_timeout` | The timeout (in seconds) for replaying data during migration. If the new Region fails to replay the data within the specified time, the migration fails; however, the data in the old Region will not be lost. | Optional | |

## Query the migration state

The `migrate_region` function returns the id of the procedure that executes the migration; query the procedure state by this id:

```sql
select procedure_state('538b7476-9f79-4e50-aa9c-b1de90710839');
```

When it completes successfully, the state is output in JSON:

```json
{"status":"Done"}
```

Of course, you can finally confirm whether the region distribution is as expected by querying `region_peers` and `partitions` in `information_schema`.
4 changes: 2 additions & 2 deletions docs/nightly/zh/user-guide/table-management.md
@@ -81,7 +81,7 @@ CREATE TABLE monitor (
ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP() TIME INDEX,
cpu FLOAT64 DEFAULT 0,
memory FLOAT64,
PRIMARY KEY(host)) ENGINE=mito WITH(regions=1);
PRIMARY KEY(host)) ENGINE=mito;
```

```sql
@@ -219,7 +219,7 @@ Query OK, 1 row affected (0.01 sec)
curl -X POST \
-H 'authorization: Basic {{authorization if exists}}' \
-H 'Content-Type: application/x-www-form-urlencoded' \
-d 'sql=CREATE TABLE monitor (host STRING, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(), cpu FLOAT64 DEFAULT 0, memory FLOAT64, TIME INDEX (ts), PRIMARY KEY(host)) ENGINE=mito WITH(regions=1)' \
-d 'sql=CREATE TABLE monitor (host STRING, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(), cpu FLOAT64 DEFAULT 0, memory FLOAT64, TIME INDEX (ts), PRIMARY KEY(host)) ENGINE=mito' \
http://localhost:4000/v1/sql?db=public
```
