This is a discussion that came up in #156. Replicated here to keep track of what we should do.
@davidwaroquiers
One thing I'm wondering: if we delete the outputs (in the database) but not the files (on the worker), we no longer have an easy way to access those files. Should we:
1. always delete the files when we delete the outputs (maybe with an option to keep the files?), though that may be dangerous;
2. delete most of the content of the dict in the database, but keep uuid, index, db_id and run directory somehow;
3. do nothing?
In any case, I think we should document this carefully, whichever option we choose above.
@gpetretto
Good point. I think the preferred way would depend a lot on users' preferences, so we should probably make the least invasive one the default.
@davidwaroquiers
So maybe something like the second option? I.e. "delete most of the content of the dict in the database, but keep uuid, index, db_id and run directory somehow".
@gpetretto
Looking more into this, the problem with that approach is that changes to the output documents are made through the JobStore (and cannot be made otherwise). Since the maggma Store only allows full deletion or updates, leaving only a minimal version of the outputs means all the documents have to be fetched from the Store, cleaned up and reinserted into the Store. If there are many documents this may be an expensive operation, and I am not sure it is worth it.
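To make the cost concrete, here is a minimal sketch of the fetch, clean up and reinsert cycle described above. It does not use the real JobStore or maggma API: `InMemoryStore`, `strip_outputs`, `KEEP_FIELDS` and the `run_dir` field name are all hypothetical stand-ins, assumed only for illustration of a store that supports whole-document operations and nothing finer.

```python
from copy import deepcopy

# Fields the discussion proposes to keep. "run_dir" is an assumed name
# for the run directory field, not taken from the real schema.
KEEP_FIELDS = ("uuid", "index", "db_id", "run_dir")


class InMemoryStore:
    """Hypothetical stand-in for a store with whole-document operations only."""

    def __init__(self):
        self._docs = {}

    def query(self, criteria=None):
        """Yield copies of all documents whose fields match the criteria."""
        criteria = criteria or {}
        for doc in self._docs.values():
            if all(doc.get(k) == v for k, v in criteria.items()):
                yield deepcopy(doc)

    def update(self, docs):
        """Replace each document wholesale, keyed on (uuid, index)."""
        for doc in docs:
            self._docs[(doc["uuid"], doc["index"])] = deepcopy(doc)


def strip_outputs(store, criteria=None):
    """Fetch matching documents, keep only KEEP_FIELDS, and write them back."""
    # Every matching document is read in full before being rewritten.
    minimal = [
        {k: doc[k] for k in KEEP_FIELDS if k in doc}
        for doc in store.query(criteria)
    ]
    store.update(minimal)
    return len(minimal)


store = InMemoryStore()
store.update([
    {"uuid": "a1", "index": 1, "db_id": "7", "run_dir": "/tmp/run_a1",
     "output": {"energy": -1.0}},
])
n_stripped = strip_outputs(store)
remaining = list(store.query({"uuid": "a1"}))
```

In this sketch every matching document is transferred twice, once out of the store and once back in, which is where the expense comes from when there are many large outputs.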