Unable to delete more than a few hundred items #1090
This is due to the number of related objects involved, so there's no simple way to know the limit without counting all of the related objects first, since some items may have far more assets, transcriptions, tags, etc. than others. If this is something you need to do regularly, we can buy some performance by optimizing the query pattern: the confirmation display is slow because it looks up information for many related objects one by one unless those are prefetched, and the deletion process uses that same path to log the object deletion history. However, we would need to write a custom admin delete method which could do that more efficiently while keeping some of the data integrity safeguards.
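One way a custom delete could stay under gateway timeouts is to process the selected items in fixed-size batches rather than in one request. The sketch below is hypothetical, not the project's actual code; the `chunked` helper name and batch size are assumptions for illustration.

```python
def chunked(ids, size):
    """Yield successive batches of at most `size` ids.

    Each batch could then be deleted in its own transaction/request,
    so no single operation touches an unbounded number of related rows.
    """
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


# Hypothetical usage: delete 100 items per pass instead of all at once.
for batch in chunked(list(range(450)), 100):
    pass  # e.g. Item.objects.filter(pk__in=batch).delete() in a Django app
```

Smaller batches trade total wall-clock time for predictability: each request stays short enough that a proxy like Cloudflare does not cut it off.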
Thanks for the info @acdha! Deleting may not be as common as unpublishing, but in the case of Whitman it was an honest mistake that could easily happen again. Ultimately, keeping the application organized and holding only items we intend to use makes things easier to track (especially as we continue to add more) and reduces the risk of publishing items we are not supposed to. What is the level of effort of creating a custom delete, @acdha @rstorey? Knowing this would help us prioritize it against other work.
We need to write a custom delete anyway, because otherwise the S3 objects for the items/assets being deleted become orphaned.
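A custom delete that also prunes S3 would need to batch its key removals: the real S3 `DeleteObjects` API accepts at most 1,000 keys per call. The helper below only builds the request payloads; an actual implementation would presumably pass each batch to boto3's `client.delete_objects(Bucket=..., Delete=batch)`. The key layout and function name here are assumptions, not the project's actual code.

```python
def s3_delete_batches(keys, batch_size=1000):
    """Group object keys into DeleteObjects-style payloads.

    S3's DeleteObjects call takes at most 1000 keys, so a bulk item
    delete that removes thousands of page images must split its keys.
    """
    batches = []
    for i in range(0, len(keys), batch_size):
        batches.append(
            {"Objects": [{"Key": k} for k in keys[i:i + batch_size]]}
        )
    return batches
```

Deleting in 1,000-key batches rather than one object at a time also cuts the number of round trips to S3 by three orders of magnitude.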
Just an update: I'm currently able to delete around 500-1000 items each time I try before I get the above error message. At the current rate it will take me roughly 88 days to do this manually. I seem to be able to do this only about once a day, twice a day if I'm lucky.
Is there any harm in leaving them there until someone has time to write a custom bulk delete handler which will also prune the corresponding S3 objects? If these have to go now, the fastest way would be to delete the transcriptions and/or assets first, so the item deletion doesn't have to process those related records.
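The "delete children first" suggestion can be sketched with plain dicts standing in for database tables. This is a minimal illustration, assuming a transcription belongs to an asset and an asset belongs to an item; all names here are hypothetical, not the project's actual models.

```python
def delete_items_bottom_up(items, assets, transcriptions, item_ids):
    """Delete transcriptions, then assets, then items, for the given item ids.

    Removing the child records first means the final item deletion has
    no related rows left to cascade through, which is the cheap path.
    Stores are dicts keyed by id; links are `item_id` / `asset_id` fields.
    """
    doomed_assets = {aid for aid, a in assets.items() if a["item_id"] in item_ids}
    for tid in [t for t, tr in transcriptions.items() if tr["asset_id"] in doomed_assets]:
        del transcriptions[tid]
    for aid in doomed_assets:
        del assets[aid]
    for iid in item_ids:
        del items[iid]
```

In a real ORM the same ordering would be three bulk deletes (transcriptions, assets, items) instead of one cascading delete per item.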
These can stay put for now. It just means we need to subtract them from our total items count when we talk about our data. To clarify, these were imported but never published so they won't have associated transcriptions. |
This issue of being unable to delete efficiently also affects Catt, NAWSA, and Blackwell. We've discovered a duplication issue which was replicated into BTP from loc.gov. A significant number of assets will have to be deleted from each of these campaigns. This issue is not pressing, but the deletion will need to occur before export for ingest to loc.gov. |
Work this period has focused on understanding the information presented above, reviewing the work done (code and conversation) on #1257, and researching how to extend the various pieces of the code base to add deletion of the related S3 objects. This is also in line with 'Sunsetting of Campaigns'.
Tried to delete Whitman data at a relatively quiet time: the end of a day on a Friday. Selecting 100 items at a time, I successfully deleted around 300 items (consisting of several thousand pages). This worked okay a few times, but at the 400 or 500 mark a Cloudflare 502 Bad Gateway notification appeared, suggesting I try again in a few minutes. I did and had the same result twice, waiting around 10 minutes between the second and third tries. I sometimes get a 504 error instead.
How can we reproduce the bug?
Steps to reproduce the behavior:
What is the expected behavior?
Maybe this is the expected behavior?
I'd love to be able to delete more items at once. That might not be possible, but it would be helpful to know roughly how much a CM can expect to delete in a given window.