Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PrincsFilter to ClientFilter reimplementation #186

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

MrAnno
Copy link
Contributor

@MrAnno MrAnno commented Oct 25, 2024

Client filter

Filtering modes:

  • Only: the subscription will only be shown to the listed clients
  • Except: the subscription will be shown to everyone except the listed clients

Filtering types:

  • KerberosPrinc: the filter will be evaluated on the Kerberos principal
  • TLSCertSubject: the filter will be evaluated on the TLS certificate's subject field
  • MachineID: the filtering is done based on the name of the computer

The default is either KerberosPrinc or TLSCertSubject, depending on how server authentication is configured.

Warning: MachineID is not cryptographically authenticated information, it can be spoofed.

Filtering flags:

  • GlobPattern: Glob patterns like * and ? can be used in targets
  • CaseSensitive: Filter matching will be case-sensitive

Flags are composable using the | operator.
The comparison is case-insensitive by default.


Backward compatibility:

  • case-sensitive behavior remains default when using the princs field instead of targets.
  • the deprecated CLI tool operates in compat mode, it will continue to work with old filters, but it can cause confusion if someone starts using both subscription config files (with new filters) and the CLI tool to modify filters.
  • should we implement the V3 Import/Export API?

@MrAnno MrAnno force-pushed the subscription-wildcard-filter branch from 4144e3a to 7220434 Compare October 25, 2024 22:10
@vruello
Copy link
Contributor

vruello commented Oct 26, 2024

Hi! Thanks, again, for this cool PR!

Let's start this (huge) answer with a some history! At first, OpenWEC only supported Kerberos authentication. The Kerberos principal filter was implemented to mimic the subscription ACL of MS WEC, even if explicitly listing principals is quite tedious. I had some plans to parse the Kerberos PAC at some point to enable filtering on AD groups or claims, but I never found the time (and the motivation) to do it. I later added the custom "uri" system to manage which subscriptions are available to which machines, depending on the configuration of the machines. When TLS authentication was implemented, we extended the "Kerberos principal filter" to support TLS subjects, but we did not change the name out of laziness (sorry).

In practice, my organization used custom "uri" to tag events based on the location of the machines, but we stopped doing that at some point. We never used the prinicpal/subject filtering feature. I assume you have a specific use case in mind. Could you share it?

cert_subjects alias

I do agree that the name princ is very unfriendly since the support of TLS was introduced. I wanted to change it to machine to be less "Kerberos" specific, but I never took the time to do it. Your solution of aliasing the field in configuration files looks good for now (but see my thoughts at the end).

glob patterns

I agree that using glob patterns maybe useful in some cases, but I'm not a big fan of the has_wildcard function to distinguish between "normal" strings and "glob" strings. I see two ways to handle this:

  1. The user must say, for each string or for all "strings" at once, whether it is a glob pattern or a normal string. This puts the burden on the user, but he is the only one who knows what he really wants.
  2. We consider all strings to be glob patterns. It might have a slight impact on performance, but I think it should be negligible. However, there may be issues where a user expects the filter to explicitly match a string that OpenWEC interprets as a glob pattern, leading to unexpected behaviors.

MachineID filter

There is a huge difference between authentication level filters and "machine controlled fields" filters. I decided to implement only authentication level filters because they can be trusted. On the other side, machine controlled fields (such as "MachineID") can be spoofed very easily (see https://github.com/cea-sec/openwec/blob/main/doc/issues.md#hunting-rogue-windows-event-forwarder). There is no mechanism to ensure that MachineID or Hostname matches the Kerberos principal or the TLS subject name.

Do you have a specific use case where you can't filter by subject/principal/uri?

Filter rework

Before today, I considered subscriptions filter to be a legacy feature that still works. However, if they are really being used and if they need to be improved/changed, then I think we should rebuild them from the ground instead of trying to add some patches here and there.

Currently, each subscription can only have a single filter which consists of an "operation" (None/Only/Except) and a list of values wrongly named "princs" :

pub enum PrincsFilterOperation {
    Only,
    Except,
}
pub struct PrincsFilter {
    operation: Option<PrincsFilterOperation>,
    princs: HashSet<String>,
}

We could:

  • rename "princ" to "values" or "trustees".
  • add an enum field "type" which could be equal to "kerberos principal", "tls subject", "machine id", ...
  • maybe add another field named "flags" that would specify whether the comparison is case sensitive, whether the values are patterns, ..., maybe implemented using bitflags ?
  • also, I can't remember why I decided that SubscriptionData.princs_filter would be a PrincsFilter (and using PrincsFilter.operation = None as a noop) instead of an Option<PrincsFilter>, so maybe we could change that too.

That would lead to a PrincsFilter struct that looks like this:

pub struct PrincsFilter {
    operation: PrincsFilterOperation,
    type: FilterType,
    flags: FilterFlags,
    trustees: HashSet<String>,
}

To go further, we could make it work like a Windows "ACL", where each "ACE" would have an operation type (allow, deny), a trustee type (kerberos principal, tls subject, MachineID, ...), some flags, and a trustees list. Then, SubscriptionData.PrincsFilter would be a (possibly empty) list of filters.

What do you think?

@MrAnno
Copy link
Contributor Author

MrAnno commented Oct 26, 2024

Thank you for the detailed and thorough explanation. :)

I also had a feeling that something bigger needs to be done around the current filtering implementation, so I really like your idea. The case-insensitivity option is also something that I find very useful, I would even consider changing the default (but only with the new config naming, I don't want to break anything when one uses the old option names, of course).

In our use-cases, machine filtering is essential, as events are coming from very different domains, locations, etc. towards the same collector, and even with wildcards, it is sometimes non-trivial to achieve the the desired categorization when using TLS.

The URI trick sounds good, but managing "what we collect and from where" would be better being stored on the server side, we would like to keep everything unified and stupid on the client side.

So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework". (Sorry for the typo, I usually wrap up conversations terribly, so I asked for a little help :))

@MrAnno
Copy link
Contributor Author

MrAnno commented Oct 26, 2024

About the MachineID part:

I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.

@vruello
Copy link
Contributor

vruello commented Oct 26, 2024

I don't have a specific use case that would not work with principals or cert subjects, I just had a fairly old memory when I was investigating the Microsoft implementation of WEC that in case of the source-initiated push method with TLS, their wildcard filters might have been applied on the Machine field of the request and not on the cert DN or common name.

Ok. I'm fine with adding such a filter as long as the documentation clearly states that the value can be manipulated by an evil machine.

So all things considered, I would gladly continue this implementation in the direction you suggested as "filter rework".

That's nice! I will be happy to help you if you need it 😄 Do you plan to implement the "single filter" approach or the last one with multiple filters?

@MrAnno
Copy link
Contributor Author

MrAnno commented Oct 29, 2024

I would go with the single filter list and try to implement it in a way that wouldn't cause too many problems if one wanted to extend the functionality to the ACL-like idea you mentioned.

The single filter list with flags (case-sensitivity, pattern) and a type field covers all my use cases.

@MrAnno
Copy link
Contributor Author

MrAnno commented Dec 3, 2024

Sorry for the delay, I'll update the PR soon.

The name "princs" does not make sense when TLS is used, so an alias
has been added:

[filter]
operation = "Only"
cert_subjects = [
    "win10.windomain.local",
]
@MrAnno MrAnno changed the title Support wildcards in principal/cert_subject filters Support wildcards in client filters Dec 5, 2024
@MrAnno MrAnno force-pushed the subscription-wildcard-filter branch from 7220434 to 3bbb970 Compare December 6, 2024 16:00
@MrAnno MrAnno marked this pull request as draft December 6, 2024 16:01
@MrAnno MrAnno force-pushed the subscription-wildcard-filter branch from 3bbb970 to 183c7b5 Compare December 11, 2024 18:43
MrAnno and others added 4 commits December 13, 2024 12:22
This was used only by the CLI interface, which is deprecated.
Subscriptions are now config files, which represents the current state,
so no modifier methods are required.
It will be used to filter clients based on their self-advertised name.
@MrAnno MrAnno force-pushed the subscription-wildcard-filter branch from 183c7b5 to b7c61b4 Compare December 13, 2024 11:25
@MrAnno MrAnno changed the title Support wildcards in client filters PrincsFilter to ClientFilter reimplementation Dec 13, 2024
@MrAnno MrAnno marked this pull request as ready for review December 13, 2024 11:28
@MrAnno
Copy link
Contributor Author

MrAnno commented Dec 13, 2024

@vruello I updated the PR description.

I'm still learning Rust, and it seems I have made a few relatively bad decisions, so the implementation does not look as clean as I wanted it to be.
Using static or dynamic dispatch through a ClientFilter trait and with 2 implementations would have ended up being cleaner.

@MrAnno MrAnno force-pushed the subscription-wildcard-filter branch from b7c61b4 to 1f5850f Compare December 13, 2024 11:55
@vruello
Copy link
Contributor

vruello commented Dec 13, 2024

Hi! Thanks for all your work on this feature! Do you think it is ready for review?

@MrAnno
Copy link
Contributor Author

MrAnno commented Dec 13, 2024

My pleasure. :)

Do you think it is ready for review?

Yes, I think it is ready.
If you find this enum-based implementation terrible, please let me know and I will try the static or dynamic dispatch alternatives.

@vruello
Copy link
Contributor

vruello commented Dec 13, 2024

Cool! I will check it out as soon as possible 😄

Copy link
Contributor

@vruello vruello left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! (and sorry for being a bit slow to review it)

I think it is much better than before. I wrote some comments. Most of the remaining work seems to be adding a new export version and adding some tests for filters.

I have some doubts about whether the "type/kind" (TLS subject, Kerberos principal, Machine ID) and the "flags" (case sensitive, "exact" or "glob") of targets should be global or target-specific. In fact, it feels more natural (and powerful) to specify them "per target", but I am not sure of how to format them in configuration files.

Do you have an opinion on this? (we need to decide this before (re-)working on the implementation)

# [filter]
# operation = "Only"
# princs = ["courgette@REALM", "radis@REALM"]
# type = "MachineID"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the targets are Kerberos principals so this is odd. We should either use the KerberosPrinc type or change the targets to computer names.

Only,
Except,
}

impl Display for PrincsFilterOperation {
impl Display for ClientFilterOperation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably replace this by a strum macro.

}
}
}

impl PrincsFilterOperation {
pub fn opt_from_str(op: &str) -> Result<Option<PrincsFilterOperation>> {
impl FromStr for ClientFilterOperation {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be replaced by strum (https://docs.rs/strum/latest/strum/derive.EnumString.html).

pub kind: crate::subscription::ClientFilterType,
#[serde(default)]
pub flags: crate::subscription::ClientFilterFlags,
#[serde(default)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should allow a filter without targets.

Copy link
Contributor Author

@MrAnno MrAnno Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is optional only because I had to implement backward-compatibility for the old princs field, where case case-sensitivity is on by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be easier to keep the filter case-sensitive by default even with the new syntax. What do you think?

#[serde(default)]
pub flags: crate::subscription::ClientFilterFlags,
#[serde(default)]
pub targets: HashSet<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use a serde alias "princs" instead of another attribute (https://serde.rs/field-attrs.html#alias)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to use an alias, but I had to differentiate the new and the old configuration mode so that I can be backward compatible with case-sensitivity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same answer as just above: what about keeping case-sensitive filters by default?)

let client_filter = match client_filter_op {
Some(op) => {
let client_filter_type: Option<_> = row.get("client_filter_type")?;
let client_filter_type = client_filter_type.unwrap_or("KerberosPrinc".to_owned());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this hardcoded string KerberosPrinc. I think it is better to keep client_filter_type an option and let ClientFilter::from pick the default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you say if I:

  • updated the database column to use an enum type and set Kerberos as default; or
  • implemented a default method for ClientFilterType, which can be used here when the database contains old empty values?

Copy link
Contributor

@vruello vruello Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the database column to use an enum type and set Kerberos as default;

This is PostgreSQL-specific as there are no enums in SQLite. We could use a default value for a string field, but it does not feel right.

implemented a default method for ClientFilterType, which can be used here when the database contains old empty values?

If client_filter_op is set, client_filter_type MUST be set EXCEPT if we are processing an "old" filter. In this case, ClientFilter::from could easily use an unwrap_or_default() on ClientFilterType as you say, but... there is actually a third option:

Write the "legacy" value in the migration (i.e. put "KerberosPrinc" in client_filter_type if client_filter_op is not null in the migration).

Since this is only a "migration" issue, I think it should be handled properly in the migration code. We don't want to have to maintain weird cases caused by past migrations in the application code forever.

@@ -546,6 +556,11 @@ impl Database for SQLiteDatabase {

async fn store_subscription(&self, subscription: &SubscriptionData) -> Result<()> {
let subscription = subscription.clone();
let client_filter_op: Option<String> = subscription.client_filter().map(|f| f.operation().to_string());
let client_filter_type = subscription.client_filter().map(|f| f.kind().to_string());
let client_filter_flags = subscription.client_filter().map(|f| f.flags().to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to store flags as a number in database.

}
}

pub fn eval(&self, client: &str, machine_id: Option<&str>) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cool to have some tests for filters as their logic gets a bit more complicated.

filter.set_princs(value.princs)?;
Ok(filter)
fn try_from(value: ClientFilter) -> std::prelude::v1::Result<Self, Self::Error> {
#[allow(deprecated)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this allow since new_legacy itself is not deprecated (and shouldn't be to support backwards compatibility)?

pub fn from(operation: String, kind: String, flags: Option<String>, targets: Option<String>) -> Result<Self> {
let flags: ClientFilterFlags = bitflags::parser::from_str_strict(flags.unwrap_or_default().as_str()).map_err(|e| anyhow!("{:?}", e))?;

let mut t = if flags.contains(ClientFilterFlags::GlobPattern) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like one-letter variables 😢

@MrAnno
Copy link
Contributor Author

MrAnno commented Dec 20, 2024

Do you have an opinion on this? (we need to decide this before (re-)working on the implementation)

I think it would be unnecessary (at least in our use cases) to allow per-target types and flags.
Specifying this per-target in the configuration would add a lot of boilerplates, so we would need to support a "default" as well to avoid repetition. It would be too much in my opinion.

My reasoning is something like this:
Filtering should be uniform and easy to understand. Enabling/disabling wildcards for specific entries may sound reasonable, but playing with case-sensitivity and where the filter input comes from (MachineID vs TLS/Kerberos) seems inconsistent and hard to oversee. If one wanted something like this, I would say a separate subscription would be reasonable (with the side note that I think this won't happen in 90% of the use cases, because most people will just decide on a unified filtering mechanism).

@vruello
Copy link
Contributor

vruello commented Dec 20, 2024

Your reasoning is quite convincing. We are probably in one of those cases where perfect is the enemy of good, and supporting limited filtering capabilities is probably the best choice for now. We'll always be able to add a more complete (and complex) filtering syntax later, if the need arises.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants