feat: add a new pipeline to parse nginx log #46

Merged
merged 3 commits into from
Sep 19, 2024
121 changes: 89 additions & 32 deletions kafka-ingestion/README.md
@@ -25,44 +25,101 @@ You can access GreptimeDB using `mysql` client. Just run `mysql -h 127.0.0.1 -P
 start.

 ```
-$ mysql -h 127.0.0.1 -P 4002
-mysql: Deprecated program name. It will be removed in a future release, use '/usr/bin/mariadb' instead
-WARNING: option --ssl-verify-server-cert is disabled, because of an insecure passwordless login.
-Welcome to the MariaDB monitor. Commands end with ; or \g.
+mysql -h 127.0.0.1 -P 4002
+Welcome to the MySQL monitor. Commands end with ; or \g.
 Your MySQL connection id is 8
 Server version: 8.4.2 Greptime

-Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
+Copyright (c) 2000, 2024, Oracle and/or its affiliates.

+Oracle is a registered trademark of Oracle Corporation and/or its
+affiliates. Other names may be trademarks of their respective
+owners.
+
 Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

-MySQL [(none)]> show tables;
-+-----------+
-| Tables |
-+-----------+
-| demo_logs |
-| numbers |
-+-----------+
-2 rows in set (0.009 sec)
-
-MySQL [(none)]> select * from demo_logs order by timestamp desc limit 10;
-+------------------------+----------------------------+
-| message | timestamp |
-+------------------------+----------------------------+
-| this is a test message | 2024-08-20 08:46:50.700000 |
-| this is a test message | 2024-08-20 08:46:47.696000 |
-| this is a test message | 2024-08-20 08:46:44.693000 |
-| this is a test message | 2024-08-20 08:46:41.689000 |
-| this is a test message | 2024-08-20 08:46:38.685000 |
-| this is a test message | 2024-08-20 08:46:35.682000 |
-| this is a test message | 2024-08-20 08:46:32.679000 |
-| this is a test message | 2024-08-20 08:46:29.675000 |
-| this is a test message | 2024-08-20 08:46:26.671000 |
-| this is a test message | 2024-08-20 08:46:23.668000 |
-+------------------------+----------------------------+
-10 rows in set (0.005 sec)
-
-MySQL [(none)]> Bye
+mysql> show tables;
++-------------+
+| Tables |
++-------------+
+| demo_logs |
+| monitor_cpu |
+| numbers |
++-------------+
+3 rows in set (0.00 sec)
+
+mysql> select * from demo_logs order by timestamp desc limit 15;
++-----------------+--------+----------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------+--------+------+---------------------+----------------------------+
+| ip | method | path | protocol | user_agent | status | size | datetime | timestamp |
++-----------------+--------+----------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------+--------+------+---------------------+----------------------------+
+| 28.0.250.103 | GET | / | HTTP/1.1 | Mozilla/5.0 (Linux; U; Android 2.2.1; zh-cn; HTC_Wildfire_A3333 Build/FRG83D) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 | 200 | 162 | 2024-09-19 03:44:32 | 2024-09-19 03:44:32.172000 |
+| 76.75.15.140 | POST | /contact | HTTP/1.1 | Mozilla/5.0 (iPad; U; CPU OS 4_3_3 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8J2 Safari/6533.18.5 | 201 | 437 | 2024-09-19 03:44:29 | 2024-09-19 03:44:29.168000 |
+| 142.32.59.93 | PUT | /blog | HTTP/2.0 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser) | 404 | 842 | 2024-09-19 03:44:26 | 2024-09-19 03:44:26.165000 |
+| 162.84.33.98 | PUT | /about | HTTP/1.1 | UCWEB7.0.2.37/28/999 | 500 | 227 | 2024-09-19 03:44:23 | 2024-09-19 03:44:23.164000 |
+| 55.142.233.127 | PUT | /blog | HTTP/1.1 | Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11 | 404 | 568 | 2024-09-19 03:44:20 | 2024-09-19 03:44:20.161000 |
+| 207.190.178.233 | GET | /contact | HTTP/1.1 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SV1; QQDownload 732; .NET4.0C; .NET4.0E; SE 2.X MetaSr 1.0) | 200 | 812 | 2024-09-19 03:44:17 | 2024-09-19 03:44:17.157000 |
+| 130.138.180.236 | GET | / | HTTP/2.0 | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.11 TaoBrowser/2.0 Safari/536.11 | 404 | 517 | 2024-09-19 03:44:14 | 2024-09-19 03:44:14.154000 |
+| 12.146.223.64 | PUT | /blog | HTTP/2.0 | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/6.2.4094.1 Safari/537.36 | 404 | 145 | 2024-09-19 03:44:11 | 2024-09-19 03:44:11.150000 |
+| 240.243.25.80 | GET | /blog | HTTP/1.1 | Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; SV1; QQDownload 732; .NET4.0C; .NET4.0E; SE 2.X MetaSr 1.0) | 200 | 869 | 2024-09-19 03:44:08 | 2024-09-19 03:44:08.147000 |
+| 14.117.122.189 | DELETE | / | HTTP/1.1 | Mozilla/5.0 (Linux; U; Android 2.3.7; en-us; Nexus One Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 | 404 | 191 | 2024-09-19 03:44:05 | 2024-09-19 03:44:05.142000 |
+| 55.67.247.103 | POST | / | HTTP/1.1 | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2 | 200 | 151 | 2024-09-19 03:44:02 | 2024-09-19 03:44:02.138000 |
+| 107.235.244.182 | GET | /blog | HTTP/1.1 | Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 9.50 | 404 | 340 | 2024-09-19 03:43:59 | 2024-09-19 03:43:59.135000 |
+| 24.11.217.6 | POST | /blog | HTTP/2.0 | Mozilla/5.0 (Linux; U; Android 2.3.7; en-us; Nexus One Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 | 404 | 523 | 2024-09-19 03:43:56 | 2024-09-19 03:43:56.131000 |
+| 197.9.255.166 | GET | / | HTTP/2.0 | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.11 TaoBrowser/2.0 Safari/536.11 | 500 | 240 | 2024-09-19 03:43:53 | 2024-09-19 03:43:53.127000 |
+| 207.158.217.64 | POST | /about | HTTP/1.1 | Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; | 404 | 509 | 2024-09-19 03:43:50 | 2024-09-19 03:43:50.123000 |
++-----------------+--------+----------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------+--------+------+---------------------+----------------------------+
+15 rows in set (0.00 sec)
+
+mysql> desc demo_logs;
++------------+---------------------+------+------+---------+---------------+
+| Column | Type | Key | Null | Default | Semantic Type |
++------------+---------------------+------+------+---------+---------------+
+| ip | String | | YES | | FIELD |
+| method | String | | YES | | FIELD |
+| path | String | | YES | | FIELD |
+| protocol | String | | YES | | FIELD |
+| user_agent | String | | YES | | FIELD |
+| status | UInt32 | | YES | | FIELD |
+| size | UInt32 | | YES | | FIELD |
+| datetime | TimestampNanosecond | PRI | NO | | TIMESTAMP |
+| timestamp | TimestampNanosecond | | YES | | FIELD |
++------------+---------------------+------+------+---------+---------------+
+9 rows in set (0.00 sec)
+
+mysql> select * from monitor_cpu order by ts desc limit 15;
++----------------------------+-------+---------+------+--------+--------------------+
+| ts | host | region | size | vendor | val |
++----------------------------+-------+---------+------+--------+--------------------+
+| 2024-09-19 03:46:12.310000 | host1 | eu-east | m | Azure | 81.1847434699629 |
+| 2024-09-19 03:46:09.305000 | host2 | eu-west | s | Azure | 47.471702382943604 |
+| 2024-09-19 03:46:06.302000 | host1 | eu-west | m | Azure | 10.62541620347205 |
+| 2024-09-19 03:46:03.298000 | host2 | eu-east | l | Azure | 60.0293939887678 |
+| 2024-09-19 03:46:00.294000 | host2 | eu-east | xl | AWS | 50.92738133733691 |
+| 2024-09-19 03:45:57.290000 | host2 | eu-west | s | Azure | 6.071477553328142 |
+| 2024-09-19 03:45:54.285000 | host1 | eu-east | l | Azure | 88.49961065662427 |
+| 2024-09-19 03:45:51.281000 | host3 | eu-west | s | Azure | 54.83269013700283 |
+| 2024-09-19 03:45:48.277000 | host1 | eu-east | l | Azure | 84.48387978112943 |
+| 2024-09-19 03:45:45.271000 | host2 | us-west | xl | Azure | 96.22845331414254 |
+| 2024-09-19 03:45:42.268000 | host1 | us-west | m | GCP | 29.737405055642952 |
+| 2024-09-19 03:45:39.263000 | host1 | eu-west | l | GCP | 6.580741027423609 |
+| 2024-09-19 03:45:36.260000 | host2 | eu-west | l | AWS | 14.249790844756994 |
+| 2024-09-19 03:45:33.255000 | host1 | us-west | m | Azure | 98.00983620747533 |
+| 2024-09-19 03:45:30.250000 | host1 | us-west | s | GCP | 76.0098811353222 |
++----------------------------+-------+---------+------+--------+--------------------+
+15 rows in set (0.00 sec)
+
+mysql> desc monitor_cpu;
++--------+----------------------+------+------+---------+---------------+
+| Column | Type | Key | Null | Default | Semantic Type |
++--------+----------------------+------+------+---------+---------------+
+| ts | TimestampMillisecond | PRI | NO | | TIMESTAMP |
+| host | String | PRI | YES | | TAG |
+| region | String | PRI | YES | | TAG |
+| size | String | PRI | YES | | TAG |
+| vendor | String | PRI | YES | | TAG |
+| val | Float64 | | YES | | FIELD |
++--------+----------------------+------+------+---------+---------------+
+6 rows in set (0.00 sec)
 ```

 You can also open your browser at http://localhost:4000/dashboard for the Web
25 changes: 23 additions & 2 deletions kafka-ingestion/config_data/pipeline.yaml
@@ -1,4 +1,14 @@
 processors:
+  - dissect:
+      fields:
+        - message
+      patterns:
+        - '%{ip} - - [%{datetime}] "%{method} %{path} %{protocol}" %{status} %{size} "-" "%{user_agent}"'
+  - date:
+      fields:
+        - datetime
+      formats:
+        - "%d/%b/%Y:%H:%M:%S %z"
   - date:
       fields:
         - timestamp
@@ -7,9 +17,20 @@ processors:

 transform:
   - fields:
-      - message
+      - ip
+      - method
+      - path
+      - protocol
+      - user_agent
     type: string
   - fields:
-      - timestamp
+      - status
+      - size
+    type: uint32
+  - fields:
+      - datetime
     type: timestamp
     index: timestamp
+  - fields:
+      - timestamp
+    type: timestamp
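
To make the new processors concrete, the following is a minimal Python sketch of what the `dissect` pattern and the `date` format in `pipeline.yaml` extract from one access-log line. It is not GreptimeDB's implementation: the regular expression is a hand-written approximation of the dissect pattern, `parse_line` is a hypothetical helper, and the sample line is assembled from one row of the `demo_logs` output in the README diff with an assumed `+0000` offset. The format string happens to use directives that Python's `strptime` also understands.

```python
import re
from datetime import datetime

# Hand-written approximation of the dissect pattern in pipeline.yaml:
# '%{ip} - - [%{datetime}] "%{method} %{path} %{protocol}" %{status} %{size} "-" "%{user_agent}"'
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) - - \[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d+) (?P<size>\d+) "-" "(?P<user_agent>[^"]*)"'
)

# Same format string as the `date` processor applied to the `datetime` field.
DATE_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_line(line):
    """Split one nginx access-log line into typed fields (hypothetical helper)."""
    match = LOG_PATTERN.fullmatch(line)
    if match is None:
        raise ValueError(f"line does not match the expected layout: {line!r}")
    fields = match.groupdict()
    # Mirror the transform section: datetime becomes a timestamp,
    # status and size become unsigned integers, the rest stay strings.
    fields["datetime"] = datetime.strptime(fields["datetime"], DATE_FORMAT)
    fields["status"] = int(fields["status"])
    fields["size"] = int(fields["size"])
    return fields


# Sample line reconstructed from the README output; the +0000 offset is assumed.
sample = '162.84.33.98 - - [19/Sep/2024:03:44:23 +0000] "PUT /about HTTP/1.1" 500 227 "-" "UCWEB7.0.2.37/28/999"'
record = parse_line(sample)
print(record)
```

The parsed `datetime` field is the one the transform marks with `index: timestamp`, which matches the `desc demo_logs` output where `datetime` carries the `TIMESTAMP` semantic type.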
2 changes: 1 addition & 1 deletion kafka-ingestion/docker-compose.yml
@@ -51,7 +51,7 @@ services:
       - demo-network
     depends_on:
       greptimedb:
-        condition: service_started
+        condition: service_healthy
     volumes:
       - ./config_data:/config_data
     command: sh -c "curl -X 'POST' 'http://greptimedb:4000/v1/events/pipelines/demo_pipeline' -F 'file=@/config_data/pipeline.yaml' -v"
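
Switching the condition from `service_started` to `service_healthy` makes Compose hold back the pipeline upload until the `greptimedb` service passes its healthcheck, rather than only until its container has started. The healthcheck definition itself is not part of this diff. As a rough illustration of the same idea outside Compose, here is a small Python sketch that polls before proceeding. The `/health` URL and the helper names `http_probe` and `wait_until_healthy` are assumptions made for this example, not something the PR defines.

```python
import time
import urllib.error
import urllib.request

# Assumed health endpoint; the diff does not show the actual healthcheck.
HEALTH_URL = "http://greptimedb:4000/health"


def http_probe(url=HEALTH_URL, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe=http_probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() until it succeeds; return the number of attempts used.

    Raises TimeoutError if the service is still unhealthy after `attempts` tries.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)  # wait before the next try
    raise TimeoutError(f"service not healthy after {attempts} attempts")
```

A script could call `wait_until_healthy()` once and only then issue the `POST` to `/v1/events/pipelines/demo_pipeline`, which is the ordering the Compose change enforces declaratively.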