Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New automatic user event collection #4674

Merged
merged 93 commits into from
Dec 16, 2024
Merged

New automatic user event collection #4674

merged 93 commits into from
Dec 16, 2024

Conversation

simon-id
Copy link
Member

@simon-id simon-id commented Sep 11, 2024

What does this PR do?

TL;DR:

  • support blocking in passport strategies instrumentation
  • don't send passport strategies instrumentation events if there is an error
  • add new user login events collection logic
  • add a mode config to control the logic (+remote config)
  • pass user data to WAF in user login SDK calls
  • move some code around because god damn it was ugly

RTFM:

  • passport instrumentation:

    • chore: simplify the instrumentation for passport-local and passport-http by sharing the wrappers between the two in passport-utils.js (by using this._passReqToCallback instead of the config object)
    • fix: handle the err argument in passport-utils.js when wrapping the callback of verify() to not publish auth events when a db error occured for example
    • refactor: remove credentials fields in passport verify event as they're useless. Instead add success and login
    • feat: add abort controller blocking support in the datadog:passport:verify:finish channel
    • refactor: use this.name to get strategy name instead of setting an hardcoded value
  • passport-http and passport-local tests:

    • fix: pass passReqToCallback: true in the correct place to actually test this case, which requires having two strategies
  • chore: delete passport-utlis tests as they were not useful and replace them with passport-http and passport-local tests

  • RC for auto user events collection mode:

    • feat: send new RC capability
    • feat: add handler to configure collection mode
      • add a special logic to reject overlapping configurations
  • SDK:

    • refactor: separate code between SDK and automated user events, it was getting too complicated to handle both in one spot
      • remove non SDK logic from the SDK files
    • refactor: extract WAF call from trackEvent() into its own function for better clarity
    • feat: pass user data to the WAF in SDK user event calls
      • add special logic to convert SDK API into WAF events, by passing user ID and login depending on if success or failure
    • feat: add usr.login to the metadata tags if not already provided, with the value of usr.id
  • index.js

    • chore: at startup, set collection mode but do not override if already present, in case RC changed the value already
    • refactor: always subscribe onPassportVerify() because it can be enabled and disabled by RC
    • refactor: change onPassportVerify() a little to put the responsibility of providing the correct data on the publisher instead of the subscriber
    • feat: add blocking support in onPassportVerify()
  • Telemetry:

    • feat: add a counter for login events with missing data
  • Automated User Events:

    • feat: add a new file user_tracking.js to differentiate automated user events from SDK user events
      • remove passport.js and migrated some of the logic to user_tracking.js to make it framework independent and clearer
    • HOW IT WORKS:
      • 3 collection modes, set with setCollectionMode(), called by appsec index and remote config:
        • anon/anonymous/safe: hash the id and login
        • ident/identification/extended: don't hash the id and login
        • disabled: don't collect the user id and login
      • getUserId(): called by trackLogin()
        • we have a list of fields we try to get from the user object: USER_ID_FIELDS, if we find one we use it
        • we try to stringify the value for max compatibility
        • if it looks like an non-serializable object that starts with [object , we skip it
      • trackLogin(): called by appsec index in the passport.verify DC subscriber
        • make a list of tags to be added to rootspan depending on if login is successful or not
        • some tags overlap with the SDK so we don't set them if the SDK already did
        • in case of login success, if the user ID is present, we consider it as the "current" user by setting the tag usr.id, and also sending it to the WAF
        • in all cases we send the user login and the event type to the WAF and return the results so appsec index can handle the results and block if needed
  • Config:

    • refactor: remove internal config appsec.eventTracking.enabled
    • feat: add new collection modes identification and anonymous to appsec.eventTracking.mode
    • feat: add new env var for colleciton modes DD_APPSEC_AUTO_USER_INSTRUMENTATION_MODE
    • feat: add retrocompatibility with the old collection modes from DD_APPSEC_AUTOMATED_USER_EVENTS_TRACKING
  • tests: add missing coverage for existing and new features, fix various typos, incorrect tests, inconsistencies, etc...

  • chore: update typings for new config

Copy link

github-actions bot commented Sep 11, 2024

Overall package size

Self size: 8.28 MB
Deduped: 94.57 MB
No deduping: 95.09 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.2.2 | 29.27 MB | 29.27 MB | | @datadog/native-appsec | 8.3.0 | 19.37 MB | 19.38 MB | | @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB | | @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.6.0 | 2.58 MB | 2.72 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@pr-commenter
Copy link

pr-commenter bot commented Sep 11, 2024

Benchmarks

Benchmark execution time: 2024-12-13 12:41:15

Comparing candidate commit a752423 in PR branch new_user_collection with baseline commit 4304684 in branch master.

Found 3 performance improvements and 0 performance regressions! Performance is the same for 261 metrics, 2 unstable metrics.

scenario:plugin-graphql-with-depth-and-collapse-on-18

  • 🟩 max_rss_usage [-87.694MB; -76.382MB] or [-9.273%; -8.077%]

scenario:plugin-graphql-with-depth-off-18

  • 🟩 max_rss_usage [-84.680MB; -72.280MB] or [-9.049%; -7.724%]

scenario:plugin-graphql-with-depth-on-max-18

  • 🟩 max_rss_usage [-121.074MB; -64.386MB] or [-12.653%; -6.729%]

@simon-id simon-id marked this pull request as ready for review December 11, 2024 17:01
@simon-id simon-id requested review from a team as code owners December 11, 2024 17:01
uurien
uurien previously approved these changes Dec 12, 2024
iunanua
iunanua previously approved these changes Dec 12, 2024
@simon-id simon-id dismissed stale reviews from iunanua and uurien via 1df40e2 December 12, 2024 18:19
@simon-id
Copy link
Member Author

system tests are expected to fail. See this PR for green system-tests: DataDog/system-tests#3666

iunanua
iunanua previously approved these changes Dec 13, 2024
Co-authored-by: Igor Unanua <igor.unanua@datadoghq.com>
@simon-id simon-id merged commit 048868e into master Dec 16, 2024
292 checks passed
@simon-id simon-id deleted the new_user_collection branch December 16, 2024 16:14
@rochdev rochdev mentioned this pull request Dec 17, 2024
rochdev pushed a commit that referenced this pull request Dec 17, 2024
@rochdev rochdev mentioned this pull request Dec 17, 2024
rochdev pushed a commit that referenced this pull request Dec 17, 2024
rochdev pushed a commit that referenced this pull request Dec 18, 2024
rochdev pushed a commit that referenced this pull request Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants