Allow Rucio DID finder to emit partial results #871

Open
ponyisi opened this issue Oct 1, 2024 · 3 comments

Comments

@ponyisi
Collaborator

ponyisi commented Oct 1, 2024

The Rucio DID finder currently does not return any files until the entire lookup (including all file replicas) is complete. For the data18 PHYSLITE this takes something like 7 minutes. However, the DID finding infrastructure can handle partial results, and there's no reason the Rucio DID finder couldn't return file replicas for large dataset containers on a dataset-by-dataset basis instead of waiting until the entire container is looked up. If we sort the dataset names, this will even be reasonably reproducible between runs.
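
To make the idea concrete, here is a minimal sketch of dataset-by-dataset emission. It is not the actual DID finder code: `iter_partial_results`, `fetch_replicas`, and `FileRecord` are hypothetical names, and `fetch_replicas` stands in for whatever performs the per-dataset replica lookup.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical type: one replica record per file (paths, size, and so on).
FileRecord = Dict[str, object]


def iter_partial_results(
    dataset_names: Iterable[str],
    fetch_replicas: Callable[[str], List[FileRecord]],
) -> Iterator[List[FileRecord]]:
    """Yield one batch of file records per dataset instead of one big list.

    `fetch_replicas` is a stand-in for the per-dataset replica lookup;
    it is an assumption, not a real Rucio or ServiceX call.
    """
    # Sorting the dataset names fixes the emission order between runs.
    for name in sorted(dataset_names):
        # The lookup for the next dataset only starts once the consumer
        # asks for the next batch, so early batches are available sooner.
        yield fetch_replicas(name)


if __name__ == "__main__":
    # Fake catalogue standing in for Rucio, for illustration only.
    fake_catalog = {
        "container.ds_b": [{"paths": ["root://site/b1.root"]}],
        "container.ds_a": [
            {"paths": ["root://site/a1.root"]},
            {"paths": ["root://site/a2.root"]},
        ],
    }
    for batch in iter_partial_results(fake_catalog, fake_catalog.__getitem__):
        print(len(batch), "files ready")
```

A caller that still wants the full list can concatenate the batches, so the current behaviour would remain available on top of this.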

@ponyisi ponyisi added this to the 1.6 new features milestone Oct 1, 2024
@ivukotic
Member

ivukotic commented Oct 1, 2024

Rucio results are returned as a metadata file (there is a reason behind this). So, they are delivered in one big chunk.

@ponyisi
Collaborator Author

ponyisi commented Oct 1, 2024

Hi @ivukotic - we still query each individual dataset within a container separately, yes? So the metadata file is per-dataset, not per-container, correct? If so, is there a reason we cannot yield the result for each dataset in lookup_request:lookup_files instead of concatenating them all together and then returning the full file list?

@ivukotic
Member

ivukotic commented Oct 1, 2024

I don't really remember ... It could be that I handle each dataset inside a data container separately. In that case it would make sense to yield them separately.
