Add support for row filtering and column masking access control #24278

Open
BryanCutler opened this issue Dec 18, 2024 · 0 comments

Comments

@BryanCutler
Contributor

I would like to discuss adding row filtering and column masking to access control as part of governance requirements. This has been discussed several times before but hasn't reached consensus on an implementation; see #20572, #21913, and #18119.

I propose using the following commits cherry-picked from TrinoDB as a basis:

This implementation has existed for a while, is straightforward, and has been in use in, and is compatible with, current production systems. Extensions already exist for Ranger https://github.com/trinodb/trino/blob/9499dc82f2d23314dbc76b0443bedd121e6400eb/plugin/trino-ranger/src/main/java/io/trino/plugin/ranger/RangerSystemAccessControl.java#L822, OPA https://github.com/trinodb/trino/blob/9499dc82f2d23314dbc76b0443bedd121e6400eb/plugin/trino-opa/src/main/java/io/trino/plugin/opa/OpaAccessControl.java#L732, etc. These could also be ported to Presto.

Expected Behavior or Use Case

Presto Component, Service, or Connector

The access control SPIs need new interfaces to get row filters and column masks. Presto main needs changes to apply the filters/masks to queries.

Possible Implementation

Below are the major changes to Presto from this implementation:

Changes to SPI

The major SPI changes add access control interfaces for getting row filters and column masks:

    /**
     * Get a row filter associated with the given table and identity.
     *
     * The filter must be a scalar SQL expression of boolean type over the columns in the table.
     *
     * @return the filter, or {@link Optional#empty()} if not applicable
     */
    default Optional<ViewExpression> getRowFilter(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName)
    {
        return Optional.empty();
    }

    /**
     * Get a column mask associated with the given table, column and identity.
     *
     * The mask must be a scalar SQL expression of a type coercible to the type of the column being masked. The expression
     * must be written in terms of columns in the table.
     *
     * @return the mask, or {@link Optional#empty()} if not applicable
     */
    default Optional<ViewExpression> getColumnMask(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName, String columnName, Type type)
    {
        return Optional.empty();
    }
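
To make the contract above more concrete, here is a minimal sketch of what a policy behind these two methods might return. It is not taken from the cherry-picked commits: `ExamplePolicy`, the `sales.orders` and `sales.customers` tables, the `region` and `ssn` columns, and the simplified method parameters are all hypothetical, and the two records are stand-ins for the real SPI types so that the snippet compiles on its own.

```java
import java.util.Optional;

// Simplified stand-ins for the SPI types named above (assumption: the real
// classes carry more state and live in the Presto SPI module).
record SchemaTableName(String schema, String table) {}

record ViewExpression(String identity, Optional<String> catalog, Optional<String> schema, String expression) {}

// Hypothetical policy: restrict sales.orders to one region and mask sales.customers.ssn.
final class ExamplePolicy
{
    private ExamplePolicy() {}

    // Returns a boolean SQL expression over the table's columns, or empty if no filter applies.
    static Optional<ViewExpression> getRowFilter(String user, SchemaTableName table)
    {
        if (table.equals(new SchemaTableName("sales", "orders"))) {
            return Optional.of(new ViewExpression(user, Optional.empty(), Optional.of("sales"), "region = 'EMEA'"));
        }
        return Optional.empty();
    }

    // Returns an expression coercible to the column's type, or empty if the column is not masked.
    static Optional<ViewExpression> getColumnMask(String user, SchemaTableName table, String columnName)
    {
        if (table.equals(new SchemaTableName("sales", "customers")) && columnName.equals("ssn")) {
            return Optional.of(new ViewExpression(user, Optional.empty(), Optional.of("sales"), "'***-**-' || substr(ssn, 8)"));
        }
        return Optional.empty();
    }
}
```

Returning `Optional.empty()` for everything else mirrors the default methods above, so tables and columns without a policy are left untouched.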

Class ViewExpression to hold the filter/mask expression

    public ViewExpression(String identity, Optional<String> catalog, Optional<String> schema, String expression)
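
A value class with that constructor could look roughly like the sketch below. Only the constructor signature comes from the proposal; the getters, the null checks, and the field meanings given in the comments are assumptions inferred from the parameter names, not a copy of the actual class.

```java
import java.util.Objects;
import java.util.Optional;

// Sketch of an immutable holder for a filter or mask expression.
final class ViewExpression
{
    private final String identity;          // assumed: the user the expression is evaluated as
    private final Optional<String> catalog; // assumed: catalog for resolving unqualified names
    private final Optional<String> schema;  // assumed: schema for resolving unqualified names
    private final String expression;        // the SQL text of the filter or mask

    ViewExpression(String identity, Optional<String> catalog, Optional<String> schema, String expression)
    {
        this.identity = Objects.requireNonNull(identity, "identity is null");
        this.catalog = Objects.requireNonNull(catalog, "catalog is null");
        this.schema = Objects.requireNonNull(schema, "schema is null");
        this.expression = Objects.requireNonNull(expression, "expression is null");
    }

    String getIdentity() { return identity; }

    Optional<String> getCatalog() { return catalog; }

    Optional<String> getSchema() { return schema; }

    String getExpression() { return expression; }
}
```

Keeping the expression as SQL text means the engine, not the access control plugin, parses and analyzes it.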

Changes to Presto main

The major changes to presto-main are in StatementAnalyzer.java, which retrieves filters/masks from access control during the analysis phase and translates each filter or mask into an Expression.

Then in RelationPlanner.java the filter/mask Expressions are applied during a rewrite of the plan.
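
The net effect of that rewrite can be illustrated at the SQL-text level. The sketch below is only an analogy: the implementation described above operates on analyzed Expressions and plan nodes, not on strings, and `MaskedScanSketch` and its `rewrite` method are hypothetical names. It shows a table reference being replaced by a projection that substitutes mask expressions for masked columns, with the row filter applied as a predicate on the underlying table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class MaskedScanSketch
{
    private MaskedScanSketch() {}

    // Builds the SQL a user would effectively run against `table` once masks and a filter apply.
    static String rewrite(String table, List<String> columns, Map<String, String> masks, Optional<String> rowFilter)
    {
        List<String> projections = new ArrayList<>();
        for (String column : columns) {
            String mask = masks.get(column);
            // Unmasked columns pass through; masked columns are replaced by their expression.
            projections.add(mask == null ? column : "(" + mask + ") AS " + column);
        }
        String sql = "SELECT " + String.join(", ", projections) + " FROM " + table;
        // The filter goes in WHERE, so in this sketch it is evaluated over the unmasked column values.
        return rowFilter.map(filter -> sql + " WHERE " + filter).orElse(sql);
    }
}
```

For example, `rewrite("customers", List.of("id", "ssn"), Map.of("ssn", "'***'"), Optional.of("region = 'EMEA'"))` yields `SELECT id, ('***') AS ssn FROM customers WHERE region = 'EMEA'`.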

Draft PR with cherry-picked commits

#24277

Alternate design considered

An alternative way to rewrite the query to apply column masks and row filters is to use the existing SPI for connector plan optimization. This allows the query plan to be rewritten during the optimization phase. This approach has been discussed before and the major downside is that each connector would need to add the feature to enable row filtering and column masking, making it not as centralized as the design above.

Context

Most production systems have governance requirements, including the need to filter and mask sensitive data. Presto should have this functionality built in.
