Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPA: Implement row level filtering and column masking #20921

Merged
merged 5 commits into from
Mar 6, 2024

Conversation

vagaerg
Copy link
Member

@vagaerg vagaerg commented Mar 4, 2024

Description

Add row level filtering and column masking to the OPA System access control plugin

Additional context and related issues

This was originally discussed as a side-effort of #19532 , and a first cut was added to our fork in bloomberg#16

@mosabua should I create a separate PR for docs for this?

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

## Security
* Added support for row filtering and column masking in the OPA access control

@cla-bot cla-bot bot added the cla-signed label Mar 4, 2024
Copy link
Member

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Just some minor comments.

In the future, please separate changes like changing the testing setup code from the new feature. It make reviewing quite painful to mix them.

plugin/trino-opa/pom.xml Show resolved Hide resolved
Comment on lines +31 to +32
private Optional<URI> opaRowFiltersUri = Optional.empty();
private Optional<URI> opaColumnMaskingUri = Optional.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, it would be nice if the OPA server had a configration endpoint (like Oauth), which told us these URI along with the batch URI. We could then either call this once during setup, or periodically in the background.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea, we were playing with a similar concept. The one thing I'm unsure about is allowing a configuration endpoint to determine whether row filtering is applied at all, for instance - as that feels more like a configuration responsibility. Kind of like if the OAuth config endpoint allowed you to disable auth altogether.

Not opposed to it, just some thoughts.


An alternative:

What's interesting is that we don't technically need these to be different URIs, they could just be different keys from the response and we could do everything OPA-side without including another type of request.

Suppose we had:

  • opa.policy.uri = https://opa/v1/data/my_policy/allow
  • opa.policy.batched-uri = https://opa/v1/data/my_policy/batchAllow
  • opa.policy.row-filters-uri = https://opa/v1/data/my_policy/rowFilters
  • opa.policy.column-masking-uri = https://opa/v1/data/my_policy/columnMask

This is equivalent to querying https://opa/v1/data/my_policy for all possible types of request and pulling the relevant field from the response (allow, batchAllow, rowFilters or columnMask).

The OPA policy itself can even directly determine what result to provide so that Trino doesn't need to pop fields from the response manually. This is because the request unequivocally determines what Trino is looking for:

  • If operation is GetRowFilters: the relevant result is the row filters decision
  • If operation is GetColumnMask: the relevant result is the column mask decision
  • If action.filterResources is present: the relevant result is the batch filtering decision
  • Otherwise, the relevant result is the standard boolean allow decision

Some care would need to be exercised for performance (but we can provide a small rego wrapper to deal with this)

The config would then become:

  • opa.policy.unified-uri (or whatever we call this new "catch-all" URI)
  • opa.policy.enable-row-filtering
  • opa.policy.enable-column-masking

The last two would allow us to disable row filtering and column masking from the config in case there's performance concerns

Maybe we could take this into an issue? CC @sbernauer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds interesting. Let's discuss on slack

Comment on lines 74 to 81
ConfigurationMetadata<OpaConfig> metadata = ConfigurationMetadata.getValidConfigurationMetadata(OpaConfig.class);
ImmutableMap.Builder<String, String> opaConfigBuilder = ImmutableMap.builder();
try {
for (ConfigurationMetadata.AttributeMetadata attribute : metadata.getAttributes().values()) {
convertPropertyToString(attribute.getGetter().invoke(config)).ifPresent(
propertyValue -> opaConfigBuilder.put(attribute.getInjectionPoint().getProperty(), propertyValue));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like a good idea. This makes this code dependent on the internal details of the configuration system (and I think this is the only code in Trino that does something like this). This will make maitaining the Airlift configuration system more complex if this pattern catches on as changes to it will require changes to lots of Trino tests.

Instead, I suggest you either manually build the map, or just use the properties in the tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I was conflicted about whether I liked this or not, I was only trying to avoid having to keep duplicates of every config key.

I'll change this function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side note, do you think there's any merit to trying to contribute a function that does this to Airlift? Or is it intentionally not provided?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Airlift supports configurations that can't be reversed like this

@vagaerg
Copy link
Member Author

vagaerg commented Mar 5, 2024

In the future, please separate changes like changing the testing setup code from the new feature. It make reviewing quite painful to mix them.

Sorry about that! This was rebased from our fork and I squashed without too much thought. Will avoid it going forward

Comment on lines +31 to +32
private Optional<URI> opaRowFiltersUri = Optional.empty();
private Optional<URI> opaColumnMaskingUri = Optional.empty();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds interesting. Let's discuss on slack

@dain dain merged commit 3d3a0f0 into trinodb:master Mar 6, 2024
20 checks passed
@github-actions github-actions bot added this to the 440 milestone Mar 6, 2024
@Zahid-Iqbal-Marth
Copy link

Is there any documentation on the usage of this feature? (Row filtering and Column Masking)

@tiagotxm
Copy link

@Zahid-Iqbal-Marth you can get an example here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants