-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add data masking and filtering to the OPA plugin #16
Add data masking and filtering to the OPA plugin #16
Conversation
- Simplify tests and remove some parameterization - Make more fields final - Remove unnecessary annotations - Make tests more parallelizable - Remove permissioning operations and replace them by a blanket allow/deny setting And rebase - bump pom version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the awesome work and continuing on the implementation!
This is really helpful!
Left one relevant comment regarding batched API
|
||
import static java.util.Objects.requireNonNull; | ||
|
||
public record TrinoColumn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im not sure whats the convention in Trino on javadoc, but it might sense to include your helpful comments on TrinoColumn here:
TrinoColumn is (perhaps counterintuitively) not used for FilterColumns for three easons:
- API stability between the batch & non-batch modes: we don't want to send a full TrinoColumn object (which includes the catalog, schema and table names) per column when performing a FilterColumns operation as this would make the requests rather large
- TrinoColumn may be changed to include additional information about column-specific operations in the future, which are not relevant to FilterColumns
- Backwards compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense, will do!
public OpaContainer() | ||
{ | ||
this.container = new GenericContainer<>(DockerImageName.parse("openpolicyagent/opa:latest-rootless")) | ||
.withCommand("run", "--server", "--addr", ":%d".formatted(OPA_PORT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Sorry to bother you with the same again, but we could test we don't fail on the decision_id
field being present:
.withCommand("run", "--server", "--addr", ":%d".formatted(OPA_PORT)) | |
.withCommand("run", "--server", "--addr", ":%d".formatted(OPA_PORT) | |
"--set", "decision_logs.console=true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, sorry if I missed this on the other PR. Will add it!
@@ -709,6 +714,23 @@ public void checkCanDropFunction(SystemSecurityContext systemSecurityContext, Ca | |||
OpaQueryInputResource.builder().function(TrinoFunction.fromTrinoFunction(functionName)).build()); | |||
} | |||
|
|||
@Override | |||
public List<ViewExpression> getRowFilters(SystemSecurityContext context, CatalogSchemaTableName tableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding getRowFilters
and getColumnMask
: Is my understanding correct that this is currently only implemented by having an OPA call per table (for row filter) / column (for column mask)?
Do you have any plans of implementing a batch variant as well?
Would you even go one step further to assume users using these two functionalities are advanced enough to justify only offering a batched API for code complexity reasons?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinged you on Slack to discuss a bit more, will update this PR comment once I understand the options available a bit better
fd6d65c
to
54873d2
Compare
This is now being pushed upstream |
trinodb#9787 Implement column masking and row level filtering to the proposed plugin in trinodb#19532
This PR adds row level filtering and column masking. It creates a
TrinoColumn
type, which is only used for these operations.TrinoColumn
is (perhaps counterintuitively) not used forFilterColumns
for three easons:TrinoColumn
object (which includes the catalog, schema and table names) per column when performing aFilterColumns
operation as this would make the requests rather largeTrinoColumn
may be changed to include additional information about column-specific operations in the future, which are not relevant toFilterColumns
This PR also adds system tests for row level filtering & column masking.
The idea is to take this PR upstream once trinodb#19532 is merged
CC @mosiac1 as co-developer