Documents not showing on some records #1091

Closed
RFK250 opened this issue Aug 28, 2023 · 2 comments

@RFK250
Collaborator

RFK250 commented Aug 28, 2023

Describe the Bug
Since the Keycloak migration, some records that show as having attachments - and that previously showed attachments - are not showing their attachments.

At the time, I noticed that all the roles except "public" had been recreated in the new Common Hosted SSO interface. So, I created the public role. The public role affects whether or not Issued To can be seen, and whether or not Documents can be seen.

Expected Behaviour
When I view a record with a paperclip on NRCED, I should be able to expand the record and then open the attachment.

Actual Behaviour
For some records (I haven't been able to determine an exhaustive list), a paperclip is shown but the user cannot view the attached document. See Steps To Reproduce below for an example.

Implications
Low risk; however, it is a confusing experience for the user. Additionally, documents are important records that are published for transparency, so the fact that they are currently not visible to users diminishes the value of NRCED.

Steps To Reproduce
This is just one example, but there may be others in the event this is a more global issue.

  1. Go to here
  2. Notice that most records in the list show a paperclip indicating a document.
  3. Expand a record, e.g. the Canadian National Railway record.
  4. Note that no option for viewing the document actually exists. Most, if not all, of these records did originally have documents attached to them. They probably still do.
@davidclaveau
Contributor

davidclaveau commented Aug 31, 2023

For future context: it looks like these documents were all deleted back in April by a user. Each document's ObjectID still exists in the NRCED collection, but the master NRPTI collection no longer has the corresponding document record and S3 link. The NRCED collection drives the "paperclip" icon, while the NRPTI collection is what loads the S3 URL. So when viewing the record list on NRCED we see the paperclip, but opening the record (to see more details) shows "This record does not have documents".
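To make the mismatch concrete, here is a minimal in-memory sketch of the behaviour described above. The function names, field names, and sample data are all hypothetical and are not taken from the NRPTI codebase; the only assumption is that the list view checks the record's own "documents" array while the detail view resolves those IDs against the master collection.

```python
# Hypothetical stand-ins for the two collections (all names are assumptions).
nrced_records = [
    {"_id": "rec-1", "documents": ["doc-a"]},  # document still exists
    {"_id": "rec-2", "documents": ["doc-b"]},  # document was deleted from master
    {"_id": "rec-3", "documents": []},         # never had a document
]

# Master collection keyed by document ID; "doc-b" is missing (deleted).
master_documents = {"doc-a": {"url": "https://example.invalid/doc-a.pdf"}}


def shows_paperclip(record):
    """List view: the icon depends only on the record's own ID array."""
    return len(record.get("documents") or []) > 0


def detail_message(record, master):
    """Detail view: IDs are resolved against the master collection."""
    resolved = [master[d] for d in record.get("documents") or [] if d in master]
    if not resolved:
        return "This record does not have documents"
    return f"{len(resolved)} document(s) available"


for rec in nrced_records:
    print(rec["_id"], shows_paperclip(rec), detail_message(rec, master_documents))
```

In this sketch, "rec-2" is the problem case: the paperclip is shown because the ID array is non-empty, yet the detail view finds nothing to load.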

The list of deleted documents matches the missing attachments on NRCED for those records (filtered by Order and Administrative Penalty types and the Wildfire Act). The deleted files are no longer in the S3 bucket either. Recovery doesn't look possible: the S3 bucket doesn't have a "soft-delete" option, and no snapshot or backup of the document(s) is available past 60 days.

Next steps would be to remove the orphaned ObjectIDs from the NRCED collection, which will remove the paperclip icon on the NRCED frontend. This will resolve any confusion about an attachment's availability. Users would then need to re-upload the documents that were previously deleted.
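As a rough illustration of that cleanup, the sketch below strips IDs that no longer resolve from each record's "documents" array. It works on plain Python lists rather than a real database, and the function name, field names, and sample IDs are hypothetical; the actual change was done as a db-migrate migration whose code is not shown in this thread.

```python
def remove_orphaned_ids(records, known_ids):
    """Drop document IDs that are not in known_ids; return the removed IDs."""
    removed = []
    for record in records:
        docs = record.get("documents") or []
        kept = [d for d in docs if d in known_ids]
        if len(kept) != len(docs):
            removed.extend(d for d in docs if d not in known_ids)
            record["documents"] = kept  # an empty array means no paperclip
    return removed


# Hypothetical sample data: "doc-b" and "doc-d" no longer exist anywhere.
records = [
    {"_id": "rec-1", "documents": ["doc-a"]},
    {"_id": "rec-2", "documents": ["doc-b"]},
    {"_id": "rec-3", "documents": ["doc-c", "doc-d"]},
]
known_ids = {"doc-a", "doc-c"}

removed = remove_orphaned_ids(records, known_ids)
print(removed)  # ['doc-b', 'doc-d']
print(records)
```

Running the function a second time on the same data removes nothing, which makes a repeat run a cheap way to verify the cleanup.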

LolandaE added the Ready label Sep 8, 2023
@davidclaveau
Contributor

davidclaveau commented Sep 8, 2023

I was able to find a good solution for pulling the missing doc IDs from the redacted_record_subset collection. It is explained in more depth in the PR changelog.

Output log from running the migration:

dclaveau api % db-migrate up 20230901215837-updateDeletedDocRecords -e local
**** Started tracking all documents in redacted_record_subset ****
**** Found 42605 documents with an existing, non-empty "documents" array in redacted_record_subset collection ****
**** Found 43665 ObjectIDs in the "documents" arrays ****
**** Checking if each ObjectID exists in the "nrpti" or "redacted_record_subset" collection ****
**** Found 43442 matching ObjectIDs ****
**** Found 223 non-matching ObjectIDs ****
**** Updating "documents" arrays in redacted_record_subset collection ****
Removed 1 instances of 60132db39fe081001be3386b from "redacted_record_subset" collection.

[ Continues for 222 additional lines... ]

[INFO] Processed migration 20230901215837-updateDeletedDocRecords
[INFO] Done

You can then run the migration once more to check whether any orphaned ObjectIDs remain. We expect (and see) 0 non-matching ObjectIDs, showing that all document ObjectIDs are accounted for between nrpti and redacted_record_subset:

dclaveau api % db-migrate up 20230901215837-updateDeletedDocRecords -e local
**** Started tracking all documents in redacted_record_subset ****
**** Found 42501 documents with an existing, non-empty "documents" array in redacted_record_subset collection ****
**** Found 43442 ObjectIDs in the "documents" arrays ****
**** Checking if each ObjectID exists in the "nrpti" or "redacted_record_subset" collection ****
**** Found 43442 matching ObjectIDs ****
**** Found 0 non-matching ObjectIDs ****
**** Updating "documents" arrays in redacted_record_subset collection ****
[INFO] Processed migration 20230901215837-updateDeletedDocRecords
[INFO] Done
dclaveau api %
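The counts in the two runs can be cross-checked with a little arithmetic. The sketch below copies the numbers from the logs above; the dictionary keys are made up for illustration. Reading the drop in records with a non-empty "documents" array as "records whose array became empty" is an inference from the log and is not stated in it.

```python
# Counts copied from the two migration runs above (key names are hypothetical).
first_run = {"records": 42605, "object_ids": 43665, "matching": 43442, "non_matching": 223}
second_run = {"records": 42501, "object_ids": 43442, "matching": 43442, "non_matching": 0}


def is_consistent(run):
    """Matching plus non-matching IDs should equal the total IDs found."""
    return run["matching"] + run["non_matching"] == run["object_ids"]


print(is_consistent(first_run), is_consistent(second_run))  # True True

# IDs removed between runs should equal the orphans found in the first run.
removed_ids = first_run["object_ids"] - second_run["object_ids"]
print(removed_ids)  # 223

# Fewer records have a non-empty array afterwards; presumably these are
# records whose only document IDs were orphaned.
records_emptied = first_run["records"] - second_run["records"]
print(records_emptied)  # 104
```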
