An AWS Athena crawler for InfluxDB.
This project is a utility that fetches AWS Athena results (CSV objects stored in AWS S3), parses them, and writes them to InfluxDB as points.
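The flow described above (read a CSV result, parse it, emit InfluxDB points) can be sketched in Go as follows. This is a minimal illustration, not the project's actual implementation: the function name `toLineProtocol` and its parameters are hypothetical, every field is treated as a float, no line-protocol escaping is performed, and the timestamp layout is assumed to follow Go's reference-time convention.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
	"time"
)

// toLineProtocol (hypothetical helper) turns CSV text, header record first,
// into InfluxDB line protocol, one line per data record. Illustrative only:
// all fields are parsed as floats and special characters are not escaped.
func toLineProtocol(data, measurement, tsCol, layout string, tagCols, fieldCols []string) ([]string, error) {
	if len(fieldCols) == 0 {
		return nil, fmt.Errorf("at least one field is required")
	}
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, nil
	}
	// Index the header so values can be looked up by name.
	idx := make(map[string]int, len(records[0]))
	for i, name := range records[0] {
		idx[name] = i
	}
	for _, col := range append(append([]string{tsCol}, tagCols...), fieldCols...) {
		if _, ok := idx[col]; !ok {
			return nil, fmt.Errorf("%q not found in CSV header", col)
		}
	}
	var lines []string
	for _, rec := range records[1:] {
		ts, err := time.Parse(layout, rec[idx[tsCol]])
		if err != nil {
			return nil, err
		}
		var b strings.Builder
		b.WriteString(measurement)
		for _, t := range tagCols {
			fmt.Fprintf(&b, ",%s=%s", t, rec[idx[t]])
		}
		fields := make([]string, 0, len(fieldCols))
		for _, f := range fieldCols {
			v, err := strconv.ParseFloat(rec[idx[f]], 64)
			if err != nil {
				return nil, err
			}
			fields = append(fields, f+"="+strconv.FormatFloat(v, 'f', -1, 64))
		}
		// Line protocol timestamps default to nanosecond precision.
		fmt.Fprintf(&b, " %s %d", strings.Join(fields, ","), ts.UnixNano())
		lines = append(lines, b.String())
	}
	return lines, nil
}

func main() {
	data := "timestamp,host,usage\n2024-01-02T03:04:05.000Z,a,0.42\n"
	lines, err := toLineProtocol(data, "cpu", "timestamp",
		"2006-01-02T15:04:05.000Z", []string{"host"}, []string{"usage"})
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

A real writer would more likely go through an InfluxDB client library, which handles escaping and batching; the string form is used here only to make the mapping from CSV to point visible.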
To interact with the S3 bucket, an AWS account with the following S3 permissions is required (note that s3:DeleteObject is only required if clean-objects is set):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket"],
          "Resource": "<BUCKET_NAME>"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:ListObjects", "s3:GetObject", "s3:DeleteObject"],
          "Resource": "<BUCKET_NAME>/*"
        }
      ]
    }

For Helm deployment, follow the influxdb-athena-crawler Helm documentation.
influxdb-athena-crawler accepts the following parameters as arguments.
Key | Description | Default |
---|---|---|
region | The AWS region. | "" |
bucket | The AWS bucket to watch. | "" |
prefix | The bucket prefix. | "" |
suffix | Filename suffix to restrict files processed on the bucket. | "" |
clean-objects | Whether to delete S3 objects after processing them. | false |
max-object-age | How long to wait after an object's last modification before cleaning it up. | 10m |
timeout | The global timeout. | "30s" |
influx-server | The InfluxDB server address. | "" |
influx-token | The InfluxDB token. | "" |
influx-org | The InfluxDB org to write to. | "" |
influx-bucket | The InfluxDB bucket to write to. | "" |
measurement | A measurement acts as a container for tags, fields, and timestamps. Use a measurement name that describes your data. | "" |
timestamp-row | The name of the CSV row that holds the timestamp. | "timestamp" |
timestamp-layout | The layout used to parse the timestamp. | "2006-01-02T15:04:05.000Z" |
tag | Tags to add to the InfluxDB point. Use --tag=foo if the tag name matches the CSV row, or --tag='foo={row:bar}' to specify the row. | "" |
field | Fields to add to the InfluxDB point, in the form --field='foo={type:int,row:bar}'. If row is not specified, the CSV row is assumed to match the field name. Type can be float, int, string or bool. | "" |
max-routines | The max number of concurrent object processing routines. | 100 |
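
To make the `tag` and `field` syntax concrete, here is a minimal Go sketch of how a value such as `foo={type:int,row:bar}` might be parsed and applied to a raw CSV value. It is an assumption-laden illustration rather than the tool's own parser: the `spec` type and the `parseSpec` and `convert` functions are hypothetical names, and leaving `Type` empty when no type is given is a choice made here, not documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// spec is a hypothetical in-memory form of one --tag or --field value.
type spec struct {
	Name string // tag or field name in InfluxDB
	Row  string // CSV row to read from; defaults to Name
	Type string // float, int, string or bool; empty if not given
}

// parseSpec accepts either "foo" or "foo={type:int,row:bar}".
func parseSpec(s string) (spec, error) {
	name, opts, hasOpts := strings.Cut(s, "=")
	if name == "" {
		return spec{}, fmt.Errorf("empty name in %q", s)
	}
	sp := spec{Name: name, Row: name}
	if !hasOpts {
		return sp, nil
	}
	if !strings.HasPrefix(opts, "{") || !strings.HasSuffix(opts, "}") {
		return spec{}, fmt.Errorf("options in %q must be wrapped in {}", s)
	}
	for _, kv := range strings.Split(opts[1:len(opts)-1], ",") {
		k, v, ok := strings.Cut(kv, ":")
		if !ok {
			return spec{}, fmt.Errorf("malformed option %q", kv)
		}
		switch strings.TrimSpace(k) {
		case "row":
			sp.Row = strings.TrimSpace(v)
		case "type":
			sp.Type = strings.TrimSpace(v)
		default:
			return spec{}, fmt.Errorf("unknown option %q", k)
		}
	}
	return sp, nil
}

// convert turns a raw CSV value into the Go value matching the given type.
func convert(raw, typ string) (any, error) {
	switch typ {
	case "float":
		return strconv.ParseFloat(raw, 64)
	case "int":
		return strconv.ParseInt(raw, 10, 64)
	case "bool":
		return strconv.ParseBool(raw)
	case "string":
		return raw, nil
	}
	return nil, fmt.Errorf("unsupported type %q", typ)
}

func main() {
	sp, err := parseSpec("foo={type:int,row:bar}")
	if err != nil {
		panic(err)
	}
	v, err := convert("42", sp.Type)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v -> %v\n", sp, v)
}
```

The default `timestamp-layout` also looks like a Go reference-time layout, so a custom value would presumably be written the same way, by spelling out how the reference time `2006-01-02 15:04:05` appears in the CSV.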
Distributed under the Apache 2.0 License. See LICENSE for more information.
We use SemVer for versioning.
Got a question? File a GitHub issue.