Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve user in-guild sync process #165

Merged
merged 4 commits into from
Jul 1, 2024
Merged

Conversation

ChrisLovering
Copy link
Member

@ChrisLovering ChrisLovering commented Jun 30, 2024

Closes #157

Previously we set all users in_guild to False, and relied on users being set back to in_guild when iterating through guild.members

However, this caused two problems

  1. For a short window a users in_guild status was incorrect
  2. It required an update for all users in_guild to be sent to postgres to update in_guild back to True.

This diff changes that, so instead only users who are not found in the guild have in_guild set to False.

The bottleneck for this process currently is the number of users that need to be committed back to the database.
Testing locally, with 380k users off guild, the perf logging outputs: in_guild sync: total time 19.140096s, query 3.486625s, processing 0.540042s, commit 15.113428s. The commit timing isn't reliabale, as it is dependent on how long it takes for control to be passed back after awaiting

@ChrisLovering ChrisLovering requested a review from jb3 as a code owner June 30, 2024 17:58
@ChrisLovering ChrisLovering force-pushed the in_guild-sync-perf-improvement branch from 05e7053 to b8a6198 Compare June 30, 2024 18:00
@jchristgit
Copy link
Member

jchristgit commented Jun 30, 2024 via email

@ChrisLovering ChrisLovering force-pushed the in_guild-sync-perf-improvement branch from c298d9a to e47b967 Compare June 30, 2024 18:28
Previously we set all users in_guild to False, and relied on users being set back to in_guild when iterating through guild.members

However, this caused two problems
1. For a short window a users in_guild status was incorrect
2. It required an update for all users in_guild to be sent to postgres to update in_guild back to True.

This diff changes that, so instead only users who are not found in the guild have in_guild set to False.

The bottleneck for this query is the number of users that are currently in_guild=False.
Testing locally, with 360k users off guild, this took 7.4s to query out, and 0.5s to process & 15.1 s to commit.
@ChrisLovering ChrisLovering force-pushed the in_guild-sync-perf-improvement branch from e47b967 to c2fe170 Compare June 30, 2024 18:29
@ChrisLovering
Copy link
Member Author

Shouldn't the condition be if user.id _not_ in guild_member_ids, since guild_member_ids contains the users that are on the guild? From my understanding this currently sets in_guild=False for all users that are on the guild (and found in the queryset). Also, nitpicking but it would be nice to remove the space before %d in the log.info line.

Yeah, should have marked this as draft as there were a few more things I wanted to do. I've pushed some new commits and updated the PR description with perf stats.

This means the other, larger, columns do not need to be deserialised, bringing the query time down from 7.4s down to 3.5s.

I couldn't simply do select(models.User.id) here, as we need the full User object in order to mutate and update it later inthe process.
@ChrisLovering ChrisLovering force-pushed the in_guild-sync-perf-improvement branch from c2fe170 to db22e38 Compare June 30, 2024 18:35
@jchristgit
Copy link
Member

jchristgit commented Jul 1, 2024 via email

Copy link
Member

@jchristgit jchristgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member

@jb3 jb3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chris, this is fantastic, and changes everything. Thank you for everything you do.

@jb3 jb3 merged commit 95d2227 into main Jul 1, 2024
4 checks passed
@jb3 jb3 deleted the in_guild-sync-perf-improvement branch July 1, 2024 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Users table has all users set to in_guild=False on user upsert
3 participants