From e9efc37d4fd74f0eae349eeca274804eeb7bb157 Mon Sep 17 00:00:00 2001
From: Max Phillips
Date: Fri, 4 Oct 2024 14:03:56 -0500
Subject: [PATCH] [DLP] Add file scanning context (#17294)

---
 .../dlp-profiles/advanced-settings.mdx      | 10 ++++++++--
 .../policies/data-loss-prevention/index.mdx |  2 ++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
index e9f3c5f446d3fa..eb3672ea1f2c78 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/dlp-profiles/advanced-settings.mdx
@@ -15,9 +15,15 @@ Match count refers to the number of times that any enabled entry in the profile
 
 ## Context analysis
 
-Context analysis restricts DLP detections based on proximity keywords. Additional proximity keywords must be detected within a distance of 1000 bytes (\~1000 characters) from the original detection to trigger an action. For example, the string `123-45-6789` will only count as a detection if in proximity to keywords such as `ssn`.
+Context analysis restricts detections based on proximity keywords to prevent false positives. Proximity keywords must be detected within a distance of 1000 bytes (~1000 characters) from the original detection to trigger a context-aware detection. For example, the string `123-45-6789` will only count as a detection if in proximity to keywords such as `ssn`.
 
-Additionally, you can control context analysis for scans within files. When files are excluded from the context filter, DLP only evaluates uploaded and downloaded files based on regular expression and validation checks. Additional keywords within the file are not required.
+DLP will apply context analysis to traffic and the content of [supported files](/cloudflare-one/policies/data-loss-prevention/#supported-file-types). Supported detections include the [Financial Information](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#financial-information) and [Social Security, Insurance, Tax, and Identifier Numbers](/cloudflare-one/policies/data-loss-prevention/dlp-profiles/predefined-profiles/#social-security-insurance-tax-and-identifier-numbers) predefined profiles.
+
+### Exclude files from context analysis
+
+You can exclude the content of files from context analysis while still applying context analysis to traffic. For example, if you send an email containing the string `123-45-6789`, DLP will only count a detection if the string is in proximity to keywords such as `ssn`. If you include a file in an email containing the string `123-45-6789`, DLP will match a detection regardless of keywords.
+
+To exclude file content from context analysis, in **Exclude content type**, choose _Files_.
 
 ## Optical Character Recognition (OCR)
 
diff --git a/src/content/docs/cloudflare-one/policies/data-loss-prevention/index.mdx b/src/content/docs/cloudflare-one/policies/data-loss-prevention/index.mdx
index 4ee5b181fe8d53..71730cbbb4213c 100644
--- a/src/content/docs/cloudflare-one/policies/data-loss-prevention/index.mdx
+++ b/src/content/docs/cloudflare-one/policies/data-loss-prevention/index.mdx
@@ -40,6 +40,8 @@ DLP supports scanning the following file types:
 - PDF
 - ZIP files containing the above
 
+DLP will scan the text contained in Microsoft Office and PDF files.
+
 ### Size
 
 The maximum file size is 100 MB. Size limitation is assessed against the file after unzipping. ZIP files can be recursively compressed a maximum of 10 times.