diff --git a/docs/user-guide/logs/manage-pipelines.md b/docs/user-guide/logs/manage-pipelines.md index 5c36c3cc5..67209a053 100644 --- a/docs/user-guide/logs/manage-pipelines.md +++ b/docs/user-guide/logs/manage-pipelines.md @@ -87,4 +87,138 @@ Output: Readable timestamp (UTC): 2024-06-27 12:02:34.257312110Z ``` -The output `Readable timestamp (UTC)` represents the creation time of the pipeline and also serves as the version number. \ No newline at end of file +The output `Readable timestamp (UTC)` represents the creation time of the pipeline and also serves as the version number. + +## Debug + +First, please refer to the [Quick Start example](/user-guide/logs/quick-start.md#write-logs-by-pipeline) to see the correct execution of the Pipeline. + +### Debug creating a Pipeline + +You may encounter errors when creating a Pipeline. For example, when creating a Pipeline using the following configuration: + + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \ + -H 'Content-Type: application/x-yaml' \ + -d $'processors: + - date: + field: time + formats: + - "%Y-%m-%d %H:%M:%S%.3f" + ignore_missing: true + - gsub: + fields: + - message + pattern: "\\\." + replacement: + - "-" + ignore_missing: true + +transform: + - fields: + - message + type: string + - field: time + type: time + index: timestamp' +``` + +The pipeline configuration contains an error. The `gsub` Processor expects the `replacement` field to be a string, but the current configuration provides an array. As a result, the pipeline creation fails with the following error message: + + +```json +{"error":"Failed to parse pipeline: 'replacement' must be a string"} +``` + +Therefore, We need to modify the configuration of the `gsub` Processor and change the value of the `replacement` field to a string type. + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \ + -H 'Content-Type: application/x-yaml' \ + -d $'processors: + - date: + field: time + formats: + - "%Y-%m-%d %H:%M:%S%.3f" + ignore_missing: true + - gsub: + fields: + - message + pattern: "\\\." + replacement: "-" + ignore_missing: true + +transform: + - fields: + - message + type: string + - field: time + type: time + index: timestamp' +``` + +Now that the Pipeline has been created successfully, you can test the Pipeline using the `dryrun` interface. + +### Debug writing logs + +We can test the Pipeline using the `dryrun` interface. We will test it with erroneous log data where the value of the message field is in numeric format, causing the pipeline to fail during processing. + +**This API is only used to test the results of the Pipeline and does not write logs to GreptimeDB.** + + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \ + -H 'Content-Type: application/json' \ + -d $'{"message": 1998.08,"time":"2024-05-25 20:16:37.217"}' + +{"error":"Failed to execute pipeline, reason: gsub processor: expect string or array string, but got Float64(1998.08)"} +``` + +The output indicates that the pipeline processing failed because the `gsub` Processor expects a string type rather than a floating-point number type. We need to adjust the format of the log data to ensure the pipeline can process it correctly. +Let's change the value of the message field to a string type and test the pipeline again. + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \ + -H 'Content-Type: application/json' \ + -d $'{"message": "1998.08","time":"2024-05-25 20:16:37.217"}' +``` + +At this point, the Pipeline processing is successful, and the output is as follows: + +```json +{ + "rows": [ + [ + { + "data_type": "STRING", + "key": "message", + "semantic_type": "FIELD", + "value": "1998-08" + }, + { + "data_type": "TIMESTAMP_NANOSECOND", + "key": "time", + "semantic_type": "TIMESTAMP", + "value": "2024-05-25 20:16:37.217+0000" + } + ] + ], + "schema": [ + { + "colume_type": "FIELD", + "data_type": "STRING", + "fulltext": false, + "name": "message" + }, + { + "colume_type": "TIMESTAMP", + "data_type": "TIMESTAMP_NANOSECOND", + "fulltext": false, + "name": "time" + } + ] +} +``` + +It can be seen that the `.` in the string `1998.08` has been replaced with `-`, indicating a successful processing of the Pipeline. \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md index f0175c261..5d6514072 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/logs/manage-pipelines.md @@ -87,4 +87,136 @@ timestamp_ns="1719489754257312110"; readable_timestamp=$(TZ=UTC date -d @$((${ti Readable timestamp (UTC): 2024-06-27 12:02:34.257312110Z ``` -输出的 `Readable timestamp (UTC)` 即为 Pipeline 的创建时间同时也是版本号。 \ No newline at end of file +输出的 `Readable timestamp (UTC)` 即为 Pipeline 的创建时间同时也是版本号。 + +## 问题调试 + +首先,请参考 [快速入门示例](/user-guide/logs/quick-start.md#使用-pipeline-写入日志)来查看 Pipeline 正确的执行情况。 + +### 调试创建 Pipeline + +在创建 Pipeline 的时候你可能会遇到错误,例如使用如下配置创建 Pipeline: + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \ + -H 'Content-Type: application/x-yaml' \ + -d $'processors: + - date: + field: time + formats: + - "%Y-%m-%d %H:%M:%S%.3f" + ignore_missing: true + - gsub: + fields: + - message + pattern: "\\\." + replacement: + - "-" + ignore_missing: true + +transform: + - fields: + - message + type: string + - field: time + type: time + index: timestamp' +``` + +Pipeline 配置存在错误。`gsub` processor 期望 `replacement` 字段为字符串,但当前配置提供了一个数组。因此,该 Pipeline 创建失败,并显示以下错误消息: + + +```json +{"error":"Failed to parse pipeline: 'replacement' must be a string"} +``` + +因此,你需要修改 `gsub` processor 的配置,将 `replacement` 字段的值更改为字符串类型。 + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/test" \ + -H 'Content-Type: application/x-yaml' \ + -d $'processors: + - date: + field: time + formats: + - "%Y-%m-%d %H:%M:%S%.3f" + ignore_missing: true + - gsub: + fields: + - message + pattern: "\\\." + replacement: "-" + ignore_missing: true + +transform: + - fields: + - message + type: string + - field: time + type: time + index: timestamp' +``` + +此时 Pipeline 创建成功,可以使用 `dryrun` 接口测试该 Pipeline。 + +### 调试日志写入 + +我们可以使用 `dryrun` 接口测试 Pipeline。我们将使用错误的日志数据对其进行测试,其中消息字段的值为数字格式,会导致 Pipeline 在处理过程中失败。 + +**此接口仅仅用于测试 Pipeline 的处理结果,不会将日志写入到 GreptimeDB 中。** + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \ + -H 'Content-Type: application/json' \ + -d $'{"message": 1998.08,"time":"2024-05-25 20:16:37.217"}' + +{"error":"Failed to execute pipeline, reason: gsub processor: expect string or array string, but got Float64(1998.08)"} +``` + +输出显示 Pipeline 处理失败,因为 `gsub` Processor 期望的是字符串类型,而不是浮点数类型。我们需要修改日志数据的格式,确保 Pipeline 能够正确处理。 +我们再将 message 字段的值修改为字符串类型,然后再次测试该 Pipeline。 + +```bash +curl -X "POST" "http://localhost:4000/v1/events/pipelines/dryrun?pipeline_name=test" \ + -H 'Content-Type: application/json' \ + -d $'{"message": "1998.08","time":"2024-05-25 20:16:37.217"}' +``` + +此时 Pipeline 处理成功,输出如下: + +```json +{ + "rows": [ + [ + { + "data_type": "STRING", + "key": "message", + "semantic_type": "FIELD", + "value": "1998-08" + }, + { + "data_type": "TIMESTAMP_NANOSECOND", + "key": "time", + "semantic_type": "TIMESTAMP", + "value": "2024-05-25 20:16:37.217+0000" + } + ] + ], + "schema": [ + { + "colume_type": "FIELD", + "data_type": "STRING", + "fulltext": false, + "name": "message" + }, + { + "colume_type": "TIMESTAMP", + "data_type": "TIMESTAMP_NANOSECOND", + "fulltext": false, + "name": "time" + } + ] +} +``` + +可以看到,`1998.08` 字符串中的 `.` 已经被替换为 `-`,Pipeline 处理成功。 \ No newline at end of file