add support to api security sampling #4755

Merged
merged 20 commits into from
Nov 18, 2024

Conversation

IlyasShabi
Contributor

What does this PR do?

  • Implements a new API security sampling algorithm using an LRU cache with a 30s TTL
  • Delays schema extraction until the end of the request processing

Motivation

The current API security sampling method randomly selects 10% of requests. This PR introduces a new algorithm based on request priority, using an LRU cache with a TTL, keyed on each { url, statusCode, method } combination.

Delaying schema extraction until the end of the request ensures that the priority value is accurate when the sampling decision is made.
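The per-endpoint idea described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the class name `EndpointSampler`, the `max` bound, and the injectable `now` clock are assumptions made for the example, a plain `Map` stands in for the cache library, and the trace-priority check is left out entirely.

```javascript
'use strict'

// Hypothetical sketch: sample a given endpoint at most once per TTL window.
// A plain Map stands in for the cache; all names here are illustrative only.
const DEFAULT_TTL_MS = 30_000 // 30s, as stated in the PR description
const MAX_ENTRIES = 4096 // assumed bound, not taken from the PR

class EndpointSampler {
  constructor ({ ttlMs = DEFAULT_TTL_MS, max = MAX_ENTRIES, now = Date.now } = {}) {
    this.ttlMs = ttlMs
    this.max = max
    this.now = now
    this.expiries = new Map() // key -> expiry timestamp, kept in insertion order
  }

  static key (method, route, statusCode) {
    return `${method}|${route}|${statusCode}`
  }

  // Returns true when the request should be sampled, and records the decision.
  sample (method, route, statusCode) {
    const key = EndpointSampler.key(method, route, statusCode)
    const current = this.now()
    const expiry = this.expiries.get(key)

    if (expiry !== undefined && expiry > current) return false // sampled recently

    // Re-insert so the key moves to the end of the insertion order.
    this.expiries.delete(key)
    this.expiries.set(key, current + this.ttlMs)

    // Evict the oldest entry once the bound is exceeded.
    if (this.expiries.size > this.max) {
      const oldest = this.expiries.keys().next().value
      this.expiries.delete(oldest)
    }
    return true
  }
}

// Usage with a fake clock so the TTL window is easy to follow.
let clockMs = 0
const sampler = new EndpointSampler({ now: () => clockMs })
sampler.sample('GET', '/users/:id', 200) // true: first time this key is seen
sampler.sample('GET', '/users/:id', 200) // false: still inside the 30s window
sampler.sample('POST', '/users/:id', 200) // true: different method, different key
clockMs = 30_000
sampler.sample('GET', '/users/:id', 200) // true: the window has elapsed
```

Keying on the combination rather than on the URL alone means each method and status code of an endpoint gets its own sampling window.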


github-actions bot commented Oct 3, 2024

Overall package size

Self size: 7.97 MB
Deduped: 65.01 MB
No deduping: 65.35 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/native-appsec | 8.2.1 | 19.18 MB | 19.19 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.5.0 | 2.51 MB | 2.65 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| path-to-regexp | 0.1.10 | 6.38 kB | 6.38 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@pr-commenter

pr-commenter bot commented Oct 3, 2024

Benchmarks

Benchmark execution time: 2024-11-13 08:30:06

Comparing candidate commit 37d6aab in PR branch api-security-sampling with baseline commit 1ee8000 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.

@IlyasShabi IlyasShabi self-assigned this Oct 3, 2024
@IlyasShabi IlyasShabi force-pushed the api-security-sampling branch from e5a079e to ca8bb68 Compare October 3, 2024 15:09
@IlyasShabi IlyasShabi marked this pull request as ready for review October 3, 2024 15:17
@IlyasShabi IlyasShabi requested review from a team as code owners October 3, 2024 15:17
index.d.ts (resolved)
@uurien uurien requested a review from iunanua October 4, 2024 10:43
@IlyasShabi IlyasShabi force-pushed the api-security-sampling branch 2 times, most recently from f87c0d0 to 7d0ace0 Compare October 7, 2024 12:22
@IlyasShabi IlyasShabi requested a review from a team as a code owner October 8, 2024 07:59
docs/test.ts (resolved)
@IlyasShabi IlyasShabi force-pushed the api-security-sampling branch 6 times, most recently from a84bfce to f5c3bd3 Compare October 10, 2024 17:43
@IlyasShabi IlyasShabi force-pushed the api-security-sampling branch from f5c3bd3 to 3ecd200 Compare October 11, 2024 14:58
Member

@simon-id simon-id left a comment


I haven't reviewed it all yet, but this should keep you busy:

index.d.ts (outdated, resolved)
packages/dd-trace/src/appsec/api_security_sampler.js (outdated, resolved)
packages/dd-trace/src/appsec/api_security_sampler.js (outdated, resolved)
packages/dd-trace/src/appsec/api_security_sampler.js (outdated, resolved)
@simon-id
Member

simon-id commented Nov 7, 2024

Code LGTM now, but I haven't had time to fully review the tests. Still, I found some problems.

@IlyasShabi IlyasShabi force-pushed the api-security-sampling branch from 56d45e9 to fad6a65 Compare November 12, 2024 10:15
@simon-id
Member

You have some system test failures for the API Security scenario. Is it just the old tests that we should disable now?

@simon-id
Member

Code and tests LGTM, but let's resolve that CI failure

@IlyasShabi
Contributor Author

In some scenarios, we need to set the sampling delay to 0 ("DD_API_SECURITY_SAMPLE_DELAY": "0.0"), which means we want to sample all requests. Unfortunately, the @isaacs/ttlcache library does not support ttl: 0, so we need to handle this case ourselves.

Here are some failing system test scenarios:

Solutions:

  • As implemented in the last commit, if the delay is 0, we instantiate a dummy TTL cache with basic functions to handle this edge case.
  • Completely remove the @isaacs/ttlcache library and replace it with a custom implementation, similar to the Python implementation.
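For illustration, the first option could look roughly like the sketch below. It is not the code from the commit: `NoopTtlCache`, `SimpleTtlCache`, `createSampleCache`, and `shouldSample` are hypothetical names, and a small `Map`-backed class stands in for the TTL cache library so the example runs on its own.

```javascript
'use strict'

// Hypothetical no-op cache for a delay of 0: it never remembers a key,
// so every lookup is a miss and every request ends up sampled.
class NoopTtlCache {
  has (key) { return false }
  set (key, value) { return this }
  clear () {}
}

// Minimal Map-backed TTL cache, standing in for a real TTL cache library.
class SimpleTtlCache {
  constructor (ttlMs, now = Date.now) {
    this.ttlMs = ttlMs
    this.now = now
    this.expiries = new Map() // key -> expiry timestamp
  }

  has (key) {
    const expiry = this.expiries.get(key)
    if (expiry === undefined) return false
    if (expiry > this.now()) return true
    this.expiries.delete(key) // entry expired, drop it lazily
    return false
  }

  set (key, value) {
    this.expiries.set(key, this.now() + this.ttlMs)
    return this
  }

  clear () { this.expiries.clear() }
}

// Picks the cache from the configured delay in seconds (e.g. "0.0" or "30").
// Assumption for this sketch: non-numeric or non-positive values fall back
// to the no-op cache.
function createSampleCache (rawDelay, now = Date.now) {
  const delaySeconds = Number.parseFloat(rawDelay)
  return delaySeconds > 0
    ? new SimpleTtlCache(delaySeconds * 1000, now)
    : new NoopTtlCache()
}

// The sampling check stays the same whichever cache is in use.
function shouldSample (cache, key) {
  if (cache.has(key)) return false
  cache.set(key, true)
  return true
}
```

Because both classes expose the same `has` and `set` methods, the zero-delay case is decided once at construction time and the sampling path needs no special branch.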

WDYT? @iunanua @simon-id

@IlyasShabi IlyasShabi merged commit bdbeb02 into master Nov 18, 2024
214 checks passed
@IlyasShabi IlyasShabi deleted the api-security-sampling branch November 18, 2024 08:30
rochdev pushed a commit that referenced this pull request Nov 19, 2024
* add support to api security sampling

* fix express plugin schema extraction

* use priority sampler to get span priority

* use lru cache package

* use route path instead of url

* use route.path or url to generate the key

* use ttlcache

* Fix standalone integration test

* Increase test timeout

* simplify force sample

* avoid checking is sampled twice

* use route.path or url to generate the key

* remove unnecessary tests of sample delay

* fix non experimental options test

* remove unused isSampled

* always sample request if delay is 0

---------

Co-authored-by: Igor Unanua <igor.unanua@datadoghq.com>
Co-authored-by: simon-id <simon.id@datadoghq.com>
@rochdev rochdev mentioned this pull request Nov 19, 2024

5 participants