We have discovered yet another bug in the Nuxeo API. This query doesn’t actually return results ordered by lastModified:
query = (
    "SELECT * FROM SampleCustomPicture, CustomFile, CustomVideo, CustomAudio, CustomThreeD "
    f"WHERE ecm:ancestorId = '{collection['uid']}' AND "
    "ecm:isVersion = 0 AND "
    "ecm:isTrashed = 0 "
    "ORDER BY lastModified desc"
)
The results are completely out of order when it comes to lastModified, and are in the same order as when you leave off the ORDER BY clause entirely. I've tried different API endpoints and also querying on the path rather than the UID, and get the same results. This happens for small collections with fewer than 100 records.
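To make the ordering problem easy to reproduce, here is a minimal sketch of a check on a page of results. It assumes each record is a dict carrying an ISO 8601 `lastModified` string (as in the sample data below, which is made up); the helper names `parse_ts`, `is_sorted_desc`, and `sort_desc` are hypothetical and not part of nuxeo-merritt or the Nuxeo client.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_sorted_desc(entries, key="lastModified"):
    """Return True if entries are in newest-first order by `key`."""
    stamps = [parse_ts(entry[key]) for entry in entries]
    return all(a >= b for a, b in zip(stamps, stamps[1:]))


def sort_desc(entries, key="lastModified"):
    """Return a new list sorted newest-first by `key` (client-side)."""
    return sorted(entries, key=lambda entry: parse_ts(entry[key]), reverse=True)


# Made-up records standing in for one page of API results.
sample = [
    {"uid": "a", "lastModified": "2024-10-01T08:00:00.000Z"},
    {"uid": "b", "lastModified": "2024-10-28T17:30:00.000Z"},
    {"uid": "c", "lastModified": "2024-09-15T12:00:00.000Z"},
]

print(is_sorted_desc(sample))                 # order as returned
print([e["uid"] for e in sort_desc(sample)])  # order after client-side sort
```

Sorting client-side like this only helps when the whole result set fits in memory and is returned completely, so it does not address the missing-results problem on very large collections.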
Given the unreliability of the API in general (it doesn't reliably return the same result set for very large collections), I think we should just bypass the API altogether and figure out how to query the DB directly.
It’ll take a little work to figure out the schema. It’s also a bit of extra infrastructure work, because the DB is locked down in a VPN in the pad-dsc account, while the nuxeo-merritt job runs in the pad-prd account because that’s where Airflow is. I’m not sure how to go about giving cross-account access to the DB, but I’m hoping/guessing it can be done.
Barbara will ask IAS about the possibility of cross-account database sharing.
Can also ask IAS if it's possible to move the Nuxeo data S3 bucket?
barbarahui changed the title from "Update nuxeo-merritt feed to query the DB directly rather than using the API" to "Update nuxeo-merritt feed and rikolti nuxeo fetcher to query the DB directly rather than using the API" on Oct 28, 2024
barbarahui changed the title from "Update nuxeo-merritt feed and rikolti nuxeo fetcher to query the DB directly rather than using the API" to "Create mechanism for querying Nuxeo DB directly rather than using API" on Oct 28, 2024