Limit Bundles being backfilled (#55835)
This limits the number of ArtifactBundles that are being processed in a
single backfilling operation.

We have seen customers uploading thousands of single-file bundles, which
are all scheduled to be backfilled. However, loading *all* the bundles in
a single query, opening them up, and collecting their manifests is
costly, so let's limit that to BACKFILL_BATCH_SIZE.
Swatinem authored Sep 7, 2023
1 parent bc87a55 commit 05be00d
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/sentry/debug_files/artifact_bundle_indexing.py
@@ -270,6 +270,8 @@ def backfill_artifact_index_updates() -> bool:
     # we will randomize the order in which we update the indexes
     random.shuffle(indexes_needing_update)

+    index_not_fully_updated = False
+
     # First, we are processing all the indexes that need bundles *added* to them,
     # we also process *removals* at the same time.
     for index in indexes_needing_update:
@@ -278,7 +280,10 @@ def backfill_artifact_index_updates() -> bool:
         artifact_bundles = ArtifactBundle.objects.filter(
             flatfileindexstate__flat_file_index=index,
             flatfileindexstate__indexing_state=ArtifactBundleIndexingState.NOT_INDEXED.value,
-        ).select_related("file")
+        ).select_related("file")[:BACKFILL_BATCH_SIZE]
+
+        if len(artifact_bundles) >= BACKFILL_BATCH_SIZE:
+            index_not_fully_updated = True

         bundles_to_add = []
         for artifact_bundle in artifact_bundles:
@@ -337,6 +342,7 @@ def backfill_artifact_index_updates() -> bool:
     return (
         len(indexes_needing_update) >= BACKFILL_BATCH_SIZE
         or len(deletion_keys) >= BACKFILL_BATCH_SIZE
+        or index_not_fully_updated
     )
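
The pattern in this change (cap each fetch at a batch size, and treat a full batch as a signal that more work may remain) can be sketched in isolation. This is a minimal illustration, not the Sentry implementation: `take_batch` and `backfill_all` are hypothetical helpers, a plain list stands in for the queryset, and the batch size of 3 is an arbitrary value chosen for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical small value for illustration; the real constant lives in the Sentry module.
BACKFILL_BATCH_SIZE = 3


def take_batch(
    pending: Sequence[str], batch_size: int = BACKFILL_BATCH_SIZE
) -> Tuple[List[str], bool]:
    """Return at most `batch_size` items and a flag saying whether more may remain."""
    batch = list(pending[:batch_size])
    # A full batch means the slice may have cut off further items,
    # so the caller should schedule another run.
    more_may_remain = len(batch) >= batch_size
    return batch, more_may_remain


def backfill_all(pending: List[str], batch_size: int = BACKFILL_BATCH_SIZE) -> int:
    """Process `pending` in capped batches and return the number of runs needed."""
    runs = 0
    while True:
        batch, more_may_remain = take_batch(pending, batch_size)
        # Stand-in for the real work: drop the processed items from the queue.
        del pending[: len(batch)]
        runs += 1
        if not more_may_remain:
            return runs
```

Because the check is `>=` and not `>`, a batch that is exactly full also triggers one more run, which then finds nothing left to do. That costs one extra cheap pass but avoids a second query just to count the remaining rows.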

