mariusmilea/kafka_exporter
Introduction

This is a Kafka exporter for Prometheus, written in Python. It uses JPype to talk to the brokers' JMX servers. Using JPype here may look like a hack, but it works well in practice. You only need to run one instance of the exporter; it collects the metrics from all the Kafka brokers.
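To make the JPype approach concrete, here is a minimal sketch of reading a single JMX attribute from a broker. It is not the exporter's actual code: the function names, the broker host, and the JMX port 9999 are assumptions for illustration, and it expects the `jpype1` package plus a local JVM to be installed.

```python
def jmx_service_url(host: str, port: int) -> str:
    """Build the standard RMI-based JMX service URL for a remote JVM."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


def read_mbean_attribute(host: str, port: int, mbean: str, attribute: str = "Value"):
    """Connect to a broker's JMX server and return one MBean attribute.

    Hypothetical helper, not part of kafka_exporter itself.
    """
    import jpype  # third-party; imported lazily so the URL helper works without it

    if not jpype.isJVMStarted():
        jpype.startJVM()  # assumes a default JVM can be located on this machine

    JMXServiceURL = jpype.JClass("javax.management.remote.JMXServiceURL")
    JMXConnectorFactory = jpype.JClass("javax.management.remote.JMXConnectorFactory")
    ObjectName = jpype.JClass("javax.management.ObjectName")

    connector = JMXConnectorFactory.connect(JMXServiceURL(jmx_service_url(host, port)))
    try:
        connection = connector.getMBeanServerConnection()
        return connection.getAttribute(ObjectName(mbean), attribute)
    finally:
        connector.close()


if __name__ == "__main__":
    # Hypothetical broker address; the MBean is Kafka's active controller gauge.
    value = read_mbean_attribute(
        "broker1.example.com",
        9999,
        "kafka.controller:type=KafkaController,name=ActiveControllerCount",
    )
    print(value)
```

Because each query is just a remote JMX call, a single process can loop over every broker, which is why one exporter instance is enough.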

Installation

  1. Add your Kafka brokers to kafka_prom.json.
  2. Add the JMX MBeans that you want to query to kafka_jmx_targets.json. The included file ships with a reasonable selection of MBeans; add more as needed.
  3. Add the entry below to prometheus.yml, replacing <IP_Exporter> with the IP address of the machine running the exporter:
  - job_name: 'kafka'
    file_sd_configs:
      - files: ['/etc/prometheus/kafka_prom.json']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: <IP_Exporter>:5557
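
The relabeling rules above are easier to follow when traced by hand. The sketch below simulates them in Python for one target; it only illustrates what Prometheus does internally and is not something you need to run. The exporter address 10.0.0.5, the broker address, and the `/metrics` path (the Prometheus default) are assumptions.

```python
from urllib.parse import urlencode


def apply_relabeling(labels: dict, exporter_address: str) -> dict:
    """Mimic the three relabel_configs rules from the scrape job."""
    out = dict(labels)
    # Rule 1: copy the broker address into the `target` URL parameter.
    out["__param_target"] = out["__address__"]
    # Rule 2: keep the broker address as the `instance` label on every metric.
    out["instance"] = out["__param_target"]
    # Rule 3: point the actual scrape at the exporter instead of the broker.
    out["__address__"] = exporter_address
    return out


def scrape_url(labels: dict, metrics_path: str = "/metrics") -> str:
    """Build the URL Prometheus would request after relabeling."""
    query = urlencode({"target": labels["__param_target"]})
    return f"http://{labels['__address__']}{metrics_path}?{query}"


if __name__ == "__main__":
    discovered = {"__address__": "broker1.example.com:9999"}  # hypothetical broker
    relabeled = apply_relabeling(discovered, "10.0.0.5:5557")  # hypothetical exporter IP
    print(relabeled["instance"])
    print(scrape_url(relabeled))
```

In other words, every broker listed in kafka_prom.json is scraped through the one exporter, while the `instance` label still identifies the broker the metrics came from.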
  4. Run the exporter:
make up
  5. Import the included Grafana dashboard.

  6. Add some alerting rules to Prometheus, which forwards firing alerts to Alertmanager (the rules below use the Prometheus 1.x rule syntax):

ALERT KafkaOfflinePartitions
  IF sum(OfflinePartitionsCount) by (cluster) > 0
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    summary = "{{$labels.cluster}} has offline partitions",
    description = "{{$labels.cluster}} has offline partitions"
  }

ALERT KafkaMaxLagclientIdReplica
  IF sum(MaxLagclientIdReplica) by (cluster) > 50
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    summary = "{{$labels.cluster}} has a replica lag of {{$value}}",
    description = "{{$labels.cluster}} has a replica lag of {{$value}}"
  }

ALERT KafkaActiveControllerCount
  IF sum(ActiveControllerCount) by (cluster) != 1
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    summary = "{{$labels.cluster}} has {{$value}} active controllers",
    description = "{{$labels.cluster}} has {{$value}} active controllers"
  }

ALERT KafkaFailedFetchRequestsPerSec
  IF sum(FailedFetchRequestsPerSec) by (cluster) > 0
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    summary = "{{$labels.cluster}} has {{$value}} failed fetch requests/sec",
    description = "{{$labels.cluster}} has {{$value}} failed fetch requests/sec"
  }

ALERT KafkaFailedProduceRequestsPerSec
  IF sum(FailedProduceRequestsPerSec) by (cluster) > 0
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    summary = "{{$labels.cluster}} has {{$value}} failed produce requests/sec",
    description = "{{$labels.cluster}} has {{$value}} failed produce requests/sec"
  }
  7. Run it as a service once you're happy with it:
git clone git@github.com:mariusmilea/kafka_exporter.git
cd kafka_exporter
sudo cp -r etc/kafka_exporter /etc/
sudo cp etc/systemd/system/kafka_exporter.service /etc/systemd/system/
sudo systemctl enable kafka_exporter
sudo systemctl start kafka_exporter
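
Step 1 does not spell out the layout of kafka_prom.json. Since prometheus.yml loads it through file_sd_configs, it should follow Prometheus's file-based service discovery format: a JSON list of target groups, each with `targets` and optional `labels`. The sketch below generates such a file. The broker host names, the JMX port 9999, and the use of a `cluster` label (which the alert rules reference) are assumptions, as is the idea that each target is the broker's JMX `host:port`.

```python
import json


def build_target_groups(clusters: dict) -> list:
    """Turn {cluster_name: ["host:jmx_port", ...]} into file_sd target groups."""
    return [
        {"targets": sorted(brokers), "labels": {"cluster": name}}
        for name, brokers in sorted(clusters.items())
    ]


if __name__ == "__main__":
    # Hypothetical cluster layout; replace with your own brokers.
    clusters = {
        "kafka-prod": ["broker1.example.com:9999", "broker2.example.com:9999"],
    }
    # Redirect this output to /etc/prometheus/kafka_prom.json.
    print(json.dumps(build_target_groups(clusters), indent=2))
```

Prometheus re-reads file_sd files when they change, so regenerating the file is enough to add or remove brokers.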
