-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(github-invite): email nudge task #55703
Conversation
Codecov Report
@@ Coverage Diff @@
## master #55703 +/- ##
=======================================
Coverage 79.98% 79.98%
=======================================
Files 5060 5062 +2
Lines 217635 217701 +66
Branches 36848 36857 +9
=======================================
+ Hits 174066 174124 +58
- Misses 38228 38235 +7
- Partials 5341 5342 +1
|
f0c9348
to
510e310
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No glaring issues afaict
looks good!
a little hesitant to sign off since i don't have full context though 😅
@@ -28,73 +29,69 @@ | |||
|
|||
class MissingOrgMemberSerializer(Serializer): | |||
def serialize(self, obj, attrs, user, **kwargs): | |||
return {"email": obj.email, "externalId": obj.external_id, "commitCount": obj.commit_count} | |||
return {"email": obj.email, "externalId": obj.external_id, "commitCount": obj.commit__count} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 did yo umean to make this a double __
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes -- since i annotated the queryset using .annotate(Count("commit"))
, the way to get the value of the annotation is with commit__count
. i was previously setting the annotation value manually to commit_count
by doing .annotate(commit_count=Count("commit"))
but i think it's better to use the default naming
organization: Organization, provider: str, integration_ids: Sequence[int] | ||
) -> QuerySet[WithAnnotations[CommitAuthor]]: | ||
member_emails = set(organization.member_set.exclude(email=None).values_list("email", flat=True)) | ||
member_emails.update( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh.
Do you need to do this in 2 steps?
maybe instead of an update you could splat both of these into the same set construction? 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i can do set union with set(A) | set(B)
|
||
return ( | ||
nonmember_authors.filter( | ||
commit__repository_id__in=set(org_repos), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting.
Not sure this is really an issue, but I wonder if there's a way to make this more efficient - rather than querying for both tables, and instead FK-ing the tables or something 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. They're relatively small queries and since this is offline computation I think we can get away with the current logic.
name="sentry.tasks.invite_missing_members.send_nudge_email", | ||
silo_mode=SiloMode.REGION, | ||
) | ||
def send_nudge_email(org_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random thought.
It looks like this might be very bursty and trigger a bunch of queries every month.
Do we want to introduce a jitter or some sort of batching so we don't attempt to do this for every org every month, but instead do every org.id%2
bi monthly or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll filter for Github org integrations first and get the orgs from there. doing that will queue ~50k orgs vs the ~300k that would be queued by filtering for Github repos
set(organization.member_set.exclude(user_email=None).values_list("user_email", flat=True)) | ||
) | ||
member_emails = set( | ||
organization.member_set.exclude(email=None).values_list("email", flat=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh - does |
join them?
you could also do set1.union(set2)
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or that yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool cool cool
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. A few notes:
|
||
return ( | ||
nonmember_authors.filter( | ||
commit__repository_id__in=set(org_repos), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. They're relatively small queries and since this is offline computation I think we can get away with the current logic.
@@ -125,7 +122,7 @@ def provider_reducer(dict, integration): | |||
if integration_provider != "github": | |||
continue | |||
|
|||
queryset = self._get_missing_members( | |||
queryset = _get_missing_organization_members( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most cases this would be empty so you can early exit here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do have to return something even if it's empty though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can do extra processing only if the queryset exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we going to gate this behind some kind of feature flag? excuse me if that gating is in a separate PR.
@JoshFerge the gating right now is per org. do you mean some kind of option that prevents the task from scheduling all of the other tasks in the first place? |
c6ef1f1
to
dea4422
Compare
basically: when this is merged, will it begin to send notifications out for every org with a github integration? |
@JoshFerge this task will run once a month on the first of the month and will only run for orgs with the |
got it. great. thanks! |
PR reverted: 22deb13 |
This reverts commit c984fd2. Co-authored-by: cathteng <70817427+cathteng@users.noreply.github.com>
Every month, we will nudge owners and managers from orgs with Github integrations to invite their missing members -- commit authors that have committed to active repositories in the org but are not org members.
For ER-1794