Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[add_session_metadata processor] Enrich events with user and group names #39537

Merged
merged 8 commits into from
May 15, 2024

Conversation

mjwolf
Copy link
Contributor

@mjwolf mjwolf commented May 13, 2024

Proposed commit message

Update the add_session_metadata processor to add user and group names to enriched events, rather than just IDs, as it was doing previously.

This also renames the UpdateDB function to SyncDB. Previously this function was confusing, because it didn't always update the DB. With ebpf, the DB update is done separately. By renaming and updating the func comment, it should be more clear that the function should synchronize the DB so it's ready for enriching events, either by waiting until the DB is updated, or doing the synchronization itself, as it does with procfs backend.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Adding user and group names will increase event sized very slightly, it shouldn't have an impact on users.

How to test this PR locally

For process events enriched with the add_session_metadata processor, check that documents have values for these fields: process.user.name, process.group.name, process.parent.user.name, process.parent.group.name, process.session_leader.user.name, process.session_leader.group.name, process.group_leader.user.name, process.group_leader.group.name, process.entry_leader.user.name, process.entry_leader.group.name.

Related issues

Use cases

Screenshots

Screenshots showing session view, with user and group names for a process and related processes

Screenshot 2024-05-13 at 2 26 02 PM
Screenshot 2024-05-13 at 2 26 19 PM

Enrich process events with user and group names for the process, and related
processes (parent, session leader, group leader, entry leader).

A new cache for user and group names is added to the process DB, so that the
names only need to be read once, and not on every process event that's received.
@mjwolf mjwolf added enhancement backport-skip Skip notification from the automated backport with mergify labels May 13, 2024
@mjwolf mjwolf requested a review from a team May 13, 2024 22:13
@mjwolf mjwolf requested a review from a team as a code owner May 13, 2024 22:13
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label May 13, 2024
@mergify mergify bot assigned mjwolf May 13, 2024
cval.name = user.Username
cval.found = true
}
return cval.name, cval.found
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I get this correctly before we return it we should save cval in the map right? That said, I think that the cache sure helps with the performance (especially if we are talking about LDAP-based users) but on the other hand it can affect the reliability of the data, e.g.

$ sudo useradd jane
$ sudo su jane
$ id
uid=1000(jane) gid=1001(jane) groups=1001(jane)
$ exit
$ sudo usermod -l new_jane jane
$ sudo su new_jane
$ id
uid=1000(new_jane) gid=1001(jane) groups=1001(jane)

I just renamed a user but the Id remains the same so better to search for it every-time?!?

Copy link
Contributor Author

@mjwolf mjwolf May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed all the caching here, it turns out it has no benefit. I've also tried changing the username as the benchmarking ran, and it returned the new username immediately.

@mjwolf
Copy link
Contributor Author

mjwolf commented May 15, 2024

I did some benchmarking on this (with this), and I learned a few things

  1. The uncached use of user.LookupId(id) is faster than my cached version
  2. The library has two implementations depending on if CGO is enabled or not, it will either use getpwnam syscall or read /etc/passwd.
  3. Both versions must have some sort of caching lower in the stack, they run much faster than if a file were actually read every time (~70ns/op vs ~5500ns/op for just reading the file)

Here's the benchmark results of 1. My caching code, 2. user.LookupId(id) with CGO disabled, 3. user.LookupId(id) with CGO enabled, 4. Just reading /etc/passwd

mwolf@fedora:~/git/beats/x-pack/auditbeat/processors/sessionmd/processdb$ CGO_ENABLED=0 go test -gcflags="all=-N -l" -v -bench=. -run=none  -count=5 -benchtime=5s
goos: linux
goarch: arm64
pkg: github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb
BenchmarkCached
BenchmarkCached-8       60177172               101.4 ns/op
BenchmarkCached-8       60853856               102.9 ns/op
BenchmarkCached-8       59322025               103.3 ns/op
BenchmarkCached-8       60308276               102.6 ns/op
BenchmarkCached-8       60827533               103.1 ns/op
BenchmarkUnCached
BenchmarkUnCached-8     90296116                68.92 ns/op
BenchmarkUnCached-8     87325711                68.64 ns/op
BenchmarkUnCached-8     87732696                69.06 ns/op
BenchmarkUnCached-8     92894774                69.88 ns/op
BenchmarkUnCached-8     75549919                68.36 ns/op
PASS
ok      github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb     64.974s
mwolf@fedora:~/git/beats/x-pack/auditbeat/processors/sessionmd/processdb$ CGO_ENABLED=1 go test -gcflags="all=-N -l" -v -bench=BenchmarkUnCached -run=none  -count=5 -benchtime=5s
goos: linux
goarch: arm64
pkg: github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb
BenchmarkUnCached
BenchmarkUnCached-8     87594390                68.83 ns/op
BenchmarkUnCached-8     90117325                66.67 ns/op
BenchmarkUnCached-8     89905770                68.13 ns/op
BenchmarkUnCached-8     90145305                69.15 ns/op
BenchmarkUnCached-8     82918654                68.41 ns/op
PASS
ok      github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb     30.452s
mwolf@fedora:~/git/beats/x-pack/auditbeat/processors/sessionmd/processdb$ CGO_ENABLED=1 go test -gcflags="all=-N -l" -v -bench=BenchmarkFileRead -run=none  -count=5 -benchtime=5s
goos: linux
goarch: arm64
pkg: github.com/elastic/beats/v7/x-pack/auditbeat/processors/sessionmd/processdb
BenchmarkFileRead
BenchmarkFileRead-8      1000000              5431 ns/op
BenchmarkFileRead-8      1000000              5458 ns/op
BenchmarkFileRead-8      1000000              5484 ns/op
BenchmarkFileRead-8      1000000              5455 ns/op
BenchmarkFileRead-8      1000000              5666 ns/op
PASS

Profiling has shown that `user.LookupId(id)` is already very fast, and there's
no need to add a cache for this data.
@mjwolf mjwolf added the Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution label May 15, 2024
@elasticmachine
Copy link
Collaborator

Pinging @elastic/sec-deployment-and-devices (Team:Security-Deployment and Devices)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label May 15, 2024
Copy link
Contributor

@pkoutsovasilis pkoutsovasilis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice findings @mjwolf , LGTM

@mjwolf mjwolf merged commit 4f65d8c into elastic:main May 15, 2024
17 checks passed
@mjwolf mjwolf deleted the session-view-names branch May 15, 2024 23:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Skip notification from the automated backport with mergify enhancement Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Auditbeat] Session view showing uid of Linux user initiated the session instead of user name
3 participants