3 replicas in Deployment collecting same k8s events (duplication) #520

Open
dianakutca opened this issue Jun 13, 2024 · 0 comments

Bug Report

Describe the bug
We are trying to run Fluent Bit in a high-availability configuration as a Deployment with 3 replicas. Every pod collects the same Kubernetes events, so the events are duplicated when they are sent to S3.
We could not find any filter that deduplicates events across replicas.

To Reproduce

  1. Deploy Fluent Bit as a Deployment with 3 replicas.
  2. Configure the input plugin for Kubernetes events as described in the documentation.
  3. Configure the output plugin to send events to S3.

Configuration files: server, input, filters and output

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.name }}
  namespace: {{ .Values.namespace }}
  labels:
    app: {{ .Values.appName }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Values.appName }}
  template:
    metadata:
      labels:
        app: {{ .Values.appName }}
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      {{- if .Values.fluentbit.tolerations.enable }}
      tolerations:
      {{- toYaml .Values.fluentbit.tolerations.items | nindent 6 }}
      {{- end }}
      serviceAccountName: {{ .Values.serviceAccount.name }}
      initContainers:
      - name: fmeta-list
        image: "{{ .Values.fmeta.image.repository }}:{{ .Values.fmeta.image.tag }}"
        command: ["python", "/src/get_pods.py", "list"]
        imagePullPolicy: Always
        env:
        - name: META_DIR
          value: "{{ .Values.fluentbit.env.metadata_dir }}"
        - name: POD_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - mountPath: /mnt/fmeta
          name: meta-events
      containers:
      - name: fluent-bit
        image: "{{- if .Values.fluentbit.enable_debug }}{{ .Values.fluentbit.image.repository }}:{{ .Values.fluentbit.image.tag_debug }}{{- else }}{{ .Values.fluentbit.image.repository }}:{{ .Values.fluentbit.image.tag }}{{- end }}"
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 2020
          protocol: TCP
        env:
        - name: FLUENT_CODE_CHANGE_DATE
          value: "{{ .Values.fluentbit.env.code_change_date}}"
        - name: FLUENT_OS_LOCATION
          value: "{{ .Values.fluentbit.env.object_storage_location}}"
        - name: FLUENT_OS_BUCKET
          value: "{{ .Values.fluentbit.env.object_storage_bucket}}"
        - name: FLUENT_OS_REGION
          value: "{{ .Values.fluentbit.env.object_storage_region}}"
        - name: AWS_ACCESS_KEY_ID
          value: "{{ .Values.fluentbit.env.object_storage_key}}"
        - name: AWS_SECRET_ACCESS_KEY
          value: "{{ .Values.fluentbit.env.object_storage_secret}}"
        - name: FLUENT_OS_UPLOAD_TIMEOUT
          value: "{{ .Values.fluentbit.env.object_storage_upload_timeout}}"
        - name: FLUENT_META_CACHE_DIR
          value: "{{ .Values.fluentbit.env.metadata_dir }}"
        - name: FLUENT_MEMBUFLIMIT
          value: "{{ .Values.fluentbit.env.memory_buffer_limit }}"
        - name: FLUENT_REFRESH_INTERVAL
          value: "{{ .Values.fluentbit.env.refresh_interval }}"
        - name: EMPTYDIR_PATH
          value: "{{ .Values.kubernetes.env.emptydir_dir_location }}"
        - name: PVC_PATH
          value: "{{ .Values.kubernetes.env.pvc_dir_location }}"
        - name: CLUSTER_NAME
          value: "{{ .Values.kubernetes.env.cluster_name }}"
        - name: FLUENT_TAIL_DB
          value: "{{ .Values.fluentbit.env.tail_db_file }}"
        resources:
          limits:
            cpu: "{{ .Values.fluentbit.limits.cpu }}"
            memory: "{{ .Values.fluentbit.limits.memory }}"
            ephemeral-storage: "{{ .Values.fluentbit.limits.storage }}"
          requests:
            cpu: "{{ .Values.fluentbit.requests.cpu }}"
            memory: "{{ .Values.fluentbit.requests.memory }}"
            ephemeral-storage: "{{ .Values.fluentbit.requests.storage }}"
        volumeMounts:
        - mountPath: /mnt/fmeta
          name: meta-events
        - mountPath: /fluent-bit/etc/
          name: {{ .Values.configMapName }}-config
        - name: varlibpath-events
          mountPropagation: HostToContainer
          mountPath: "{{ .Values.fluentbit.mount_path }}"
          readOnly: true
        - name: flb-db-path-events
          mountPath: /mnt/var/events
      - name: fmeta-watch
        image: "{{ .Values.fmeta.image.repository }}:{{ .Values.fmeta.image.tag }}"
        command: ["python", "/src/get_pods.py", "watch"]
        imagePullPolicy: Always
        env:
        - name: META_DIR
          value: "{{ .Values.fluentbit.env.metadata_dir }}"
        - name: POD_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        resources:
          limits:
            cpu: "{{ .Values.fmeta.limits.cpu }}"
            memory: "{{ .Values.fmeta.limits.memory }}"
            ephemeral-storage: "{{ .Values.fmeta.limits.storage }}"
          requests:
            cpu: "{{ .Values.fmeta.requests.cpu }}"
            memory: "{{ .Values.fmeta.requests.memory }}"
            ephemeral-storage: "{{ .Values.fmeta.requests.storage }}"
        volumeMounts:
        - mountPath: /mnt/fmeta
          name: meta-events
        - name: varlibpath-events
          mountPropagation: HostToContainer
          mountPath: "{{ .Values.fluentbit.mount_path }}"
          readOnly: true
      volumes:
      - name: meta-events
        emptyDir: {}
      - configMap:
          defaultMode: 420
          name: {{ .Values.configMapName }}-config
        name: {{ .Values.configMapName }}-config
      - name: varlibpath-events
        hostPath:
          path: "{{ .Values.kubernetes.env.var_lib_location }}"
      - name: flb-db-path-events
        hostPath:
          path: "{{ .Values.kubernetes.env.flb_db_path }}"


apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.configMapName }}-config
  namespace: {{ .Values.namespace }}
  labels:
    app: {{ .Values.appName }}
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Grace         20
        {{- if .Values.fluentbit.enable_debug }}
        Log_Level     debug
        {{- else }}
        Log_Level     debug
        {{- end }}
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   Off
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-s3.conf

  input-kubernetes.conf: |
    [INPUT]
        name              kubernetes_events
        tag               k8s_events
        kube_url          https://kubernetes.default.svc
        db                /var/log/event.db
        DB.Sync           normal


  filter-kubernetes.conf: |
    # Section Kubernetes specific filter, DO NOT CHANGE
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Annotations         Off
        Merge_Log           Off
        Labels              Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Kube_meta_preload_cache_dir  ${FLUENT_META_CACHE_DIR}
        Regex_Parser        custom-kube-filter
        Kube_Tag_Prefix     kube.
        tls.verify          Off


  output-s3.conf: |
    [OUTPUT]
        Name                         s3
        Match                        *
        Bucket                       ${FLUENT_OS_BUCKET}
        Region                       ${FLUENT_OS_REGION}
        Endpoint                     ${FLUENT_OS_LOCATION}
        Total_File_Size              90M
        Store_Dir_Limit_Size         200M
        # Compression                  gzip
        S3_Key_Format                /events/queue/%Y/%m/%d/%H/$UUID.json
        S3_Key_Format_Tag_Delimiters .-
        Upload_Timeout               ${FLUENT_OS_UPLOAD_TIMEOUT}
        Auto_Retry_Requests          True
        Workers                      1
        Use_Put_Object               True

Expected behavior
Fluent Bit should ensure that events are processed only once and sent to S3 without duplication.
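
For illustration, this is roughly the deduplication we would otherwise have to do ourselves downstream of S3. It is only a minimal sketch: it assumes each line of an uploaded object is one JSON record that carries the Kubernetes Event's `metadata.uid` and `metadata.resourceVersion` fields (we have not verified that this is the exact record layout produced by the `kubernetes_events` input), and the function name `dedupe_events` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def dedupe_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each event record once.

    Assumption: every record is a JSON object with the Kubernetes Event
    fields metadata.uid and metadata.resourceVersion. Records collected by
    different replicas for the same event would then share the same key.
    """
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        meta = record.get("metadata") or {}
        key = (meta.get("uid"), meta.get("resourceVersion"))
        if None in key:
            # Record does not match the assumed layout: pass it through
            # unchanged rather than risk dropping real data.
            yield record
            continue
        if key in seen:
            continue  # same event already seen from another replica
        seen.add(key)
        yield record
```

Keying on both fields means an updated event (same `uid`, new `resourceVersion`) is still kept, while identical copies from the other replicas are dropped. The `seen` set grows with the number of distinct events, so this would only suit batch-sized inputs.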

Screenshots
(screenshot not preserved)

Your Environment

  • Version used: 2.2
  • Environment name: Kubernetes
  • Filters and plugins: grep, lua, etc

Is there anything I am missing?
