Fix #16573: get table owners for databaricks & unitycatalog tables #17282

Merged 4 commits on Aug 10, 2024

Contributor

@harshsoni2024 harshsoni2024 commented Aug 2, 2024

Describe your changes:

Fix #16573

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@harshsoni2024 harshsoni2024 requested a review from a team as a code owner August 2, 2024 13:16
@github-actions github-actions bot added the Ingestion and safe to test labels Aug 2, 2024

def _check_if_email(self, email_id: str) -> bool:
    """check if email string is valid"""
    email_pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
Collaborator

Do we need to do this here? It seems like a generic check rather than something to keep only for Databricks. How are we solving this for other systems such as Postgres, where usernames are not necessarily emails?

Contributor Author

In Databricks, the owner comes back as either a name or an email, so I added this check to try the email lookup first and then fall back to the name.
For the other database/dashboard connections, we have confirmed that we get the owner's email (from the API or inspector method) and pass it directly to get the reference:
domodb, iceberg, metabase.
I think we can put this method in DatabaseServiceSource or DashboardServiceSource and check for a valid email whenever we're not sure. WDYT?
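A minimal sketch of the email-first, name-fallback lookup described in this comment. It is not the code from the PR: `StubMetadata`, `get_reference_by_name`, `resolve_owner`, and the sample data are hypothetical stand-ins, and only the regex and the `get_reference_by_email` name are taken from the diff excerpts in this thread. Falling back to the name lookup when the email lookup finds nothing is also an assumption.

```python
import re
from typing import Dict, Optional

# Pattern quoted in the review thread for _check_if_email.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$")


class StubMetadata:
    """Hypothetical stand-in for the metadata client; not the real API."""

    def __init__(self, by_email: Dict[str, str], by_name: Dict[str, str]):
        self._by_email = by_email
        self._by_name = by_name

    def get_reference_by_email(self, email: str) -> Optional[str]:
        return self._by_email.get(email)

    def get_reference_by_name(self, name: str) -> Optional[str]:
        # Assumed helper for the name-based lookup.
        return self._by_name.get(name)


def resolve_owner(metadata: StubMetadata, owner: str) -> Optional[str]:
    """Try the owner string as an email first, then fall back to a name."""
    if EMAIL_PATTERN.match(owner):
        ref = metadata.get_reference_by_email(email=owner)
        if ref is not None:
            return ref
    # Either the string is not an email or the email lookup found nothing.
    return metadata.get_reference_by_name(name=owner)


metadata = StubMetadata(
    by_email={"jane@example.com": "user:jane"},
    by_name={"data-eng": "team:data-eng"},
)
print(resolve_owner(metadata, "jane@example.com"))
print(resolve_owner(metadata, "data-eng"))
```

Keeping the check in one helper like this is what would make it easy to move into a shared base class, as suggested above, if other connectors turn out to need it.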

Collaborator

I believe we are already installing the extra; if not, it's OK to add it. It's better than a regex.

@harshsoni2024 harshsoni2024 changed the title Fix #16573: get table owners for databaricks table Fix #16573: get table owners for databaricks & unitycatalog tables Aug 6, 2024
try:
    owner_email = EmailStr._validate(owner)
    owner_ref = self.metadata.get_reference_by_email(email=owner_email)
except Exception:
Collaborator

  1. We don't need to catch ANY exception here, no? Only the one triggered by invalid strings inside EmailStr.
  2. Also, I don't think we need to call ._validate. You might be able to just do EmailStr(<string>) and see whether it blows up.
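The first point can be illustrated with a small stdlib-only sketch. Everything here is hypothetical: `InvalidEmailError`, `parse_email`, `owner_ref_from_email`, and `failing_lookup` are illustrative names, and the toy validator merely stands in for a real one such as EmailStr. With Pydantic, the concrete exception type to catch should be confirmed against the installed version.

```python
from typing import Callable, Optional


class InvalidEmailError(ValueError):
    """Hypothetical error raised when a string is not an email address."""


def parse_email(value: str) -> str:
    """Toy validator (assumption): stands in for a real email validator."""
    local, sep, domain = value.partition("@")
    if not (sep and local and "." in domain):
        raise InvalidEmailError(f"not an email address: {value!r}")
    return value


def owner_ref_from_email(
    owner: str, lookup: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Return a reference for an email owner, or None if owner is not an email."""
    try:
        owner_email = parse_email(owner)
    except InvalidEmailError:
        # Only the validation failure is swallowed.
        return None
    # Errors raised by the lookup itself are not caught and reach the caller.
    return lookup(owner_email)


def failing_lookup(email: str) -> Optional[str]:
    """Simulates an unrelated failure, e.g. the metadata server being down."""
    raise ConnectionError("metadata server unreachable")
```

With a bare `except Exception`, the `ConnectionError` from `failing_lookup` would be silently treated as "not an email"; with the narrow `except`, it surfaces to the caller.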

Contributor Author

@harshsoni2024 harshsoni2024 Aug 9, 2024

  1. I tried EmailStr(string), but it doesn't work that way (maybe this changed in Pydantic v2).
    change


sonarcloud bot commented Aug 9, 2024

Quality Gate failed for 'open-metadata-ingestion'

Failed conditions
7.8% Coverage on New Code (required ≥ 20%)

See analysis details on SonarCloud

@ulixius9 ulixius9 merged commit 1b04f1f into open-metadata:main Aug 10, 2024
10 of 14 checks passed
Successfully merging this pull request may close these issues.

Support ownership ingestion for Databricks
4 participants