feat(github-invite): email nudge task #55703

Merged
merged 12 commits into from
Sep 11, 2023

Conversation

Member

@cathteng cathteng commented Sep 5, 2023

Every month, we will nudge owners and managers from orgs with GitHub integrations to invite their missing members -- commit authors who have committed to active repositories in the org but are not org members.

For ER-1794
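
To make the "missing members" definition concrete, here is a minimal plain-Python sketch. It is not the PR's implementation (which works on Django querysets); the `Commit` tuple, the `missing_members` helper, and the sample data are all hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Commit(NamedTuple):
    # Hypothetical stand-in for a commit row; not Sentry's model.
    author_email: str
    repository_id: int


def missing_members(
    commits: Iterable[Commit],
    active_repo_ids: Set[int],
    member_emails: Set[str],
) -> Counter:
    """Count commits per author who committed to an active repo but is not a member."""
    counts: Counter = Counter()
    for commit in commits:
        if commit.repository_id not in active_repo_ids:
            continue  # repository is not an active repo of this org
        if commit.author_email in member_emails:
            continue  # author already belongs to the org
        counts[commit.author_email] += 1
    return counts


commits = [
    Commit("a@example.com", 1),
    Commit("a@example.com", 1),
    Commit("b@example.com", 2),
    Commit("c@example.com", 3),
]
result = missing_members(
    commits, active_repo_ids={1, 2}, member_emails={"b@example.com"}
)
print(result)
```

Only `a@example.com` qualifies here: `b@example.com` is already a member and `c@example.com` committed to a repository outside the active set.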

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Sep 5, 2023
Base automatically changed from cathy/github/growth/email-nudge-notification to master September 5, 2023 19:01

codecov bot commented Sep 5, 2023

Codecov Report

Merging #55703 (dea4422) into master (7300946) will increase coverage by 0.00%.
Report is 2 commits behind head on master.
The diff coverage is 98.46%.

@@           Coverage Diff           @@
##           master   #55703   +/-   ##
=======================================
  Coverage   79.98%   79.98%           
=======================================
  Files        5060     5062    +2     
  Lines      217635   217701   +66     
  Branches    36848    36857    +9     
=======================================
+ Hits       174066   174124   +58     
- Misses      38228    38235    +7     
- Partials     5341     5342    +1     
Files Changed Coverage
src/sentry/conf/server.py ø
.../api/endpoints/organization_missing_org_members.py 96.87%
src/sentry/tasks/invite_missing_org_members.py 100.00%

Contributor

@nhsiehgit nhsiehgit left a comment

No glaring issues afaict
looks good!

a little hesitant to sign off since i don't have full context though 😅

@@ -28,73 +29,69 @@

 class MissingOrgMemberSerializer(Serializer):
     def serialize(self, obj, attrs, user, **kwargs):
-        return {"email": obj.email, "externalId": obj.external_id, "commitCount": obj.commit_count}
+        return {"email": obj.email, "externalId": obj.external_id, "commitCount": obj.commit__count}
Contributor

🤔 did you mean to make this a double __?

Member Author

yes -- since i annotated the queryset using .annotate(Count("commit")), the way to get the value of the annotation is with commit__count. i was previously setting the annotation value manually to commit_count by doing .annotate(commit_count=Count("commit")) but i think it's better to use the default naming
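
A small sketch of the naming rule described in the reply above. The helper `implicit_annotation_name` is hypothetical (it is not a Django API); it only mimics how Django names an annotation when no keyword is given, so it runs without a configured project. The Django calls are shown as comments.

```python
def implicit_annotation_name(field_name: str, aggregate_name: str) -> str:
    # Mimics Django's default alias: "<field>__<lowercased aggregate name>".
    return f"{field_name}__{aggregate_name.lower()}"


implicit = implicit_annotation_name("commit", "Count")
print(implicit)

# Roughly equivalent Django usage (needs a configured project to run):
#   CommitAuthor.objects.annotate(Count("commit"))               # read as obj.commit__count
#   CommitAuthor.objects.annotate(commit_count=Count("commit"))  # read as obj.commit_count
```

Either spelling works; the keyword form just trades the default name for an explicit one, which is why the serializer attribute changed along with the query.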

organization: Organization, provider: str, integration_ids: Sequence[int]
) -> QuerySet[WithAnnotations[CommitAuthor]]:
member_emails = set(organization.member_set.exclude(email=None).values_list("email", flat=True))
member_emails.update(
Contributor

Huh.
Do you need to do this in 2 steps?
maybe instead of an update you could splat both of these into the same set construction? 🤷

Member Author

oh i can do set union with set(A) | set(B)


return (
nonmember_authors.filter(
commit__repository_id__in=set(org_repos),
Contributor

Oh interesting.

Not sure this is really an issue, but I wonder if there's a way to make this more efficient - rather than querying for both tables, and instead FK-ing the tables or something 🤔

Member

Good point. They're relatively small queries and since this is offline computation I think we can get away with the current logic.

name="sentry.tasks.invite_missing_members.send_nudge_email",
silo_mode=SiloMode.REGION,
)
def send_nudge_email(org_id):
Contributor

Random thought.

It looks like this might be very bursty and trigger a bunch of queries every month.

Do we want to introduce a jitter or some sort of batching so we don't attempt to do this for every org every month, but instead do every org.id%2 bi monthly or something?

Member Author

i'll filter for Github org integrations first and get the orgs from there. doing that will queue ~50k orgs vs the ~300k that would be queued by filtering for Github repos
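
For reference, the batching and jitter idea raised in this thread could look roughly like the sketch below. This is not what the PR does (the author chose to shrink the queued set by starting from GitHub org integrations); the function names, the two-batch split, and the one-hour window are assumptions for illustration.

```python
import zlib


def in_this_months_batch(org_id: int, month: int, num_batches: int = 2) -> bool:
    # With num_batches=2, each org is processed every other month.
    return org_id % num_batches == month % num_batches


def jitter_seconds(org_id: int, window_seconds: int = 3600) -> int:
    # crc32 is stable across processes, so an org always gets the same delay.
    return zlib.crc32(str(org_id).encode()) % window_seconds


org_ids = range(1, 7)
september_batch = [o for o in org_ids if in_this_months_batch(o, month=9)]
delays = {o: jitter_seconds(o) for o in september_batch}
print(september_batch, delays)
```

In a Celery setup, a delay like this could be passed as the `countdown` argument of `apply_async` to spread the per-org tasks over the window instead of queueing them all at once.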

set(organization.member_set.exclude(user_email=None).values_list("user_email", flat=True))
)
member_emails = set(
organization.member_set.exclude(email=None).values_list("email", flat=True)
Contributor

oh - does | join them?
you could also do set1.union(set2) right?

Member Author

or that yes
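
The options discussed in this thread all build the same set. A quick sketch with made-up email lists (the variable names are illustrative, not the PR's):

```python
direct_emails = ["a@example.com", "b@example.com"]
user_emails = ["b@example.com", "c@example.com"]

# Two steps: build the set, then update it in place.
two_step = set(direct_emails)
two_step.update(user_emails)

# One expression: the | operator needs a set on both sides.
with_operator = set(direct_emails) | set(user_emails)

# One expression: .union() accepts any iterable as its argument.
with_method = set(direct_emails).union(user_emails)

# One expression: unpack ("splat") both iterables into a set display.
splatted = {*direct_emails, *user_emails}

print(two_step == with_operator == with_method == splatted)
```

The practical difference is small: `|` raises `TypeError` if the right-hand side is a list or a queryset-like iterable, while `.union()` and the unpacking form accept any iterable.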

Contributor

@nhsiehgit nhsiehgit left a comment

cool cool cool

+1

Member

@sentaur-athena sentaur-athena left a comment

Looks good. A few notes:


@@ -125,7 +122,7 @@ def provider_reducer(dict, integration):
     if integration_provider != "github":
         continue
 
-    queryset = self._get_missing_members(
+    queryset = _get_missing_organization_members(
Member

In most cases this would be empty so you can early exit here

Member Author

i do have to return something even if it's empty though

Member Author

i can do extra processing only if the queryset exists
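
A rough sketch of the compromise reached here: always return a payload, but skip the extra work when nothing is missing. The payload shape, the `build_payload` name, and the sorting step are hypothetical stand-ins; on a Django queryset the emptiness check would typically be `queryset.exists()` rather than a truthiness test on a list.

```python
from typing import Dict, List


def build_payload(integration: str, authors: List[dict]) -> Dict[str, object]:
    # The caller always gets a payload, even when there is nothing to report.
    payload: Dict[str, object] = {"integration": integration, "users": []}
    if not authors:
        return payload  # early exit: no extra processing for the common empty case
    # Stand-in for the "extra processing": order authors by commit count.
    payload["users"] = sorted(authors, key=lambda a: a["commitCount"], reverse=True)
    return payload


empty = build_payload("github", [])
filled = build_payload(
    "github",
    [
        {"email": "a@example.com", "commitCount": 1},
        {"email": "b@example.com", "commitCount": 5},
    ],
)
print(empty, filled)
```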

Member

@JoshFerge JoshFerge left a comment

are we going to gate this behind some kind of feature flag? excuse me if that gating is in a separate PR.

@cathteng
Member Author

@JoshFerge the gating right now is per org. do you mean some kind of option that prevents the task from scheduling all of the other tasks in the first place?

@cathteng cathteng force-pushed the cathy/github-growth/email-nudge-task branch from c6ef1f1 to dea4422 Compare September 11, 2023 19:35
@JoshFerge
Member

> @JoshFerge the gating right now is per org. do you mean some kind of option that prevents the task from scheduling all of the other tasks in the first place?

basically: when this is merged, will it begin to send notifications out for every org with a github integration?

@cathteng
Member Author

cathteng commented Sep 11, 2023

@JoshFerge this task will run once a month on the first of the month and will only run for orgs with the organizations:integrations-gh-invite flag. although the queueing will still happen for all of them with this current implementation
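
The behaviour described in this comment (every org is queued, but the flag is only checked when the per-org task runs) can be sketched as follows. The function names and signatures are hypothetical; only the flag name comes from the comment above.

```python
from typing import Callable, Iterable, List

FLAG = "organizations:integrations-gh-invite"


def schedule_all(org_ids: Iterable[int], enqueue: Callable[[int], None]) -> None:
    # The monthly scheduler queues one task per org without checking the flag.
    for org_id in org_ids:
        enqueue(org_id)


def nudge_task(
    org_id: int,
    has_flag: Callable[[str, int], bool],
    send: Callable[[int], None],
) -> bool:
    # The per-org task is where the gating happens.
    if not has_flag(FLAG, org_id):
        return False
    send(org_id)
    return True


enabled_orgs = {2}
queued: List[int] = []
sent: List[int] = []

schedule_all([1, 2, 3], queued.append)
for org_id in queued:
    nudge_task(org_id, lambda flag, oid: oid in enabled_orgs, sent.append)

print(queued, sent)
```

All three orgs are queued, but only the flagged one receives an email. Moving the flag check into the scheduler would avoid the wasted queueing, at the cost of evaluating the flag for every org up front.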

@JoshFerge
Member

invite

got it. great. thanks!

@cathteng cathteng merged commit c984fd2 into master Sep 11, 2023
58 checks passed
@cathteng cathteng deleted the cathy/github-growth/email-nudge-task branch September 11, 2023 21:29
@cathteng cathteng added the Trigger: Revert Add to a merged PR to revert it (skips CI) label Sep 11, 2023
@getsentry-bot
Contributor

PR reverted: 22deb13

getsentry-bot added a commit that referenced this pull request Sep 11, 2023
This reverts commit c984fd2.

Co-authored-by: cathteng <70817427+cathteng@users.noreply.github.com>
@cathteng cathteng restored the cathy/github-growth/email-nudge-task branch September 11, 2023 23:25
@github-actions github-actions bot locked and limited conversation to collaborators Sep 27, 2023
@cathteng cathteng changed the title feat(github-growth): email nudge task feat(github-invite): email nudge task Dec 14, 2023