Data Access Request flow #639
Replies: 4 comments 2 replies
-
Copied from the Slack channel, @harshach wrote: (...) most of orgs/teams have a central policy store such as https://ranger.apache.org or their custom one. Wouldn’t it be beneficial if we do the following workflow
-
This feature would be awesome for our use case too; we are really struggling to find something that can take care of this "request access workflow".
-
FYI, also looking for something like this "request access button" + workflow.
-
Definitely a useful feature.
-
I would like to propose a data access request/grant flow:
Imagine a data scientist, Sandra, goes to the metadata catalog and finds a table owned by the Finance department. The data in this table is highly sensitive, so the data scientist does not have permission to access it, not even a sample (unless a washed/masked sample is made available). She can tell from the schema that it likely contains the data she needs for her analysis, so she clicks a “Request access” button right there in the catalog.
Sandra fills in a short request form:
This request now appears in her Data Access Requests list, and the data owner is notified of the request. The data owner can now open the Request details page and either:
Once a request is granted, the metadata platform would need a way to enforce it (or alternatively, a way to allow an external system to do so); I see a few options:
The reason for implementing the above flow in the metadata catalog should be obvious; no other system contains data about data assets, streaming topics, and even dashboards, as well as data ownership metadata for all of them. This would also create a useful authorization audit trail for checking exactly who had access to what data asset at any given point - and for what reason.
And while there are other solutions for data access control - Kerberos, Active Directory, Lake Formation - these are all tied to specific vendors / technology stacks which makes them unsuitable for an open standard.
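To make the proposal a bit more concrete, here is a minimal Python sketch of the request lifecycle and the audit-trail query described above. It is only an illustration, not an existing API of any catalog: the class and function names (`AccessRequest`, `decide`, `who_had_access`), the form fields (`reason`, `expires_at`), and the assumption that the owner's two choices are grant or deny are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional, Tuple


class Status(Enum):
    PENDING = "pending"
    GRANTED = "granted"
    DENIED = "denied"


@dataclass
class AccessRequest:
    """One data access request (all field names are hypothetical)."""
    requester: str
    asset: str
    reason: str
    created_at: datetime
    status: Status = Status.PENDING
    decided_by: Optional[str] = None
    decided_at: Optional[datetime] = None
    expires_at: Optional[datetime] = None

    def decide(self, owner: str, grant: bool, at: datetime,
               expires_at: Optional[datetime] = None) -> None:
        """Record the data owner's decision; a request is decided only once."""
        if self.status is not Status.PENDING:
            raise ValueError("request has already been decided")
        self.status = Status.GRANTED if grant else Status.DENIED
        self.decided_by = owner
        self.decided_at = at
        # An expiry only makes sense for granted requests.
        self.expires_at = expires_at if grant else None

    def grants_access_at(self, when: datetime) -> bool:
        """True if this request gave the requester access at time `when`."""
        if self.status is not Status.GRANTED or self.decided_at is None:
            return False
        if when < self.decided_at:
            return False
        return self.expires_at is None or when < self.expires_at


def who_had_access(requests: List[AccessRequest], asset: str,
                   when: datetime) -> List[Tuple[str, str]]:
    """Audit query: (requester, reason) pairs with access to `asset` at `when`."""
    return [(r.requester, r.reason)
            for r in requests
            if r.asset == asset and r.grants_access_at(when)]
```

Since every decision is stored with its owner, timestamp, and stated reason, the "who had access to what, and why" question becomes a simple filter over the stored requests. Actual enforcement would still have to happen in the catalog or in an external policy system.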