From bbdc09ef1f85d6aa3a1e677f6e9f027e05029c55 Mon Sep 17 00:00:00 2001
From: David Kilfoyle
Date: Fri, 9 Feb 2024 10:10:55 -0500
Subject: [PATCH 1/2] Add LVM example to Filebeat file / device ID docs

---
 filebeat/docs/inputs/input-filestream.asciidoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc
index e55ff611496..4cc2062da82 100644
--- a/filebeat/docs/inputs/input-filestream.asciidoc
+++ b/filebeat/docs/inputs/input-filestream.asciidoc
@@ -94,9 +94,11 @@ By default, {beatname_uc} identifies files based on their inodes and device IDs.
 However, on network shares and cloud providers these
 values might change during the lifetime of the file. If this happens
 {beatname_uc} thinks that file is new and resends the whole content
-of the file. To solve this problem you can configure `file_identity` option. Possible
+of the file. To solve this problem you can configure the `file_identity` option. Possible
 values besides the default `inode_deviceid` are `path`, `inode_marker` and `fingerprint`.
 
+For example, when using the Linux link:https://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29[LVM] (Logical Volume Manager), device numbers are allocated dynamically at module load (refer to link:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lv#persistent_numbers[Persistent Device Numbers] in the Red Hat Enterprise Linux documentation). To avoid the possibility of data duplication in this case, you can set `file_identity` to `path`.
+
 WARNING: Changing `file_identity` methods between runs may result in duplicated events in the output.

From c0142bc0aefffc6e27a4b90f719ce5118e74e5ba Mon Sep 17 00:00:00 2001
From: David Kilfoyle
Date: Mon, 12 Feb 2024 10:08:45 -0500
Subject: [PATCH 2/2] Move addition to under 'native' in the file_identity section

---
 filebeat/docs/inputs/input-filestream-file-options.asciidoc | 3 +++
 filebeat/docs/inputs/input-filestream.asciidoc | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/filebeat/docs/inputs/input-filestream-file-options.asciidoc b/filebeat/docs/inputs/input-filestream-file-options.asciidoc
index db00a8fe766..a3be665e28e 100644
--- a/filebeat/docs/inputs/input-filestream-file-options.asciidoc
+++ b/filebeat/docs/inputs/input-filestream-file-options.asciidoc
@@ -527,6 +527,9 @@ duplicated events in the output.
 *`native`*:: The default behaviour of {beatname_uc} is to differentiate between files using their inodes and device ids.
++
+In some cases these values can change during the lifetime of a file.
+For example, when using the Linux link:https://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29[LVM] (Logical Volume Manager), device numbers are allocated dynamically at module load (refer to link:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lv#persistent_numbers[Persistent Device Numbers] in the Red Hat Enterprise Linux documentation). To avoid the possibility of data duplication in this case, you can set `file_identity` to `path` rather than `native`.
 
 [source,yaml]
 ----
diff --git a/filebeat/docs/inputs/input-filestream.asciidoc b/filebeat/docs/inputs/input-filestream.asciidoc
index 4cc2062da82..47d1b24a8e8 100644
--- a/filebeat/docs/inputs/input-filestream.asciidoc
+++ b/filebeat/docs/inputs/input-filestream.asciidoc
@@ -97,8 +97,6 @@ values might change during the lifetime of the file. If this happens
 of the file. To solve this problem you can configure the `file_identity` option. Possible
 values besides the default `inode_deviceid` are `path`, `inode_marker` and `fingerprint`.
 
-For example, when using the Linux link:https://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29[LVM] (Logical Volume Manager), device numbers are allocated dynamically at module load (refer to link:https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lv#persistent_numbers[Persistent Device Numbers] in the Red Hat Enterprise Linux documentation). To avoid the possibility of data duplication in this case, you can set `file_identity` to `path`.
-
 WARNING: Changing `file_identity` methods between runs may result in duplicated events in the output.
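
Note (illustration only, not part of the patch): the documentation text added above recommends setting `file_identity` to `path` when device numbers can change, as with LVM. A minimal sketch of what that might look like in a filestream input follows. The input `id` and the log path are hypothetical placeholders, and the `file_identity.path: ~` spelling is assumed from the style of the existing `[source,yaml]` snippets in the Filebeat docs rather than taken from this patch.

```yaml
# Hypothetical filestream input; id and paths are placeholders.
filebeat.inputs:
  - type: filestream
    id: lvm-backed-logs
    paths:
      - /var/log/app/*.log
    # Identify files by path instead of inode and device ID, so a changed
    # LVM device number does not make existing files look new.
    file_identity.path: ~
```

As the WARNING in the patched text says, switching `file_identity` methods between runs may itself produce duplicated events, so a change like this is probably best made when the input is first set up.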