PHYSLITE schema and EnergyPerSampling branch #1074
Comments
FYI, a fresh, debugged public ATLAS file (MC PHYSLITE) for testing purposes can be downloaded with:
I think the explanation here is related to what is happening in #1073. I'm trying to explain it from the perspective of how I remember it from pre-dask times: The actual "schema" is what went into the Unfortunately it often happens that this is one of these expensive-to-read double-jagged branches. I had experimented with modifying the schema such that it tries to avoid using a double-jagged branch (like here, where it ends up being master...nikoladze:coffea:dev-avoid-doublejagged-physlite ...), but never properly tested it, so I didn't merge it. Now, reading the code, I also notice
OK, now I notice I didn't read @alexander-held's description carefully enough. He actually wants to read the double-jagged branch We have a long list of branches for the Jets; iterating through them, they appear in this order
The PHYSLITE schema will make one common form out of this, where all these branches share a common offset array and the first branch (
We should be able to not read the additional data and only the offsets with the way that
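To illustrate the shared-offsets idea, here is a minimal pure-Python sketch. It is not coffea, uproot, or awkward code, and every name and number in it is made up for illustration. It assumes a jagged branch is stored as a flat content list plus an offsets list, and that a double-jagged branch adds a second level of offsets. The outer (per-event) offsets are then identical for every per-jet branch, which is why, in principle, only the offsets of one branch would be needed rather than its data.

```python
def unflatten(content, offsets):
    """Rebuild a list of lists from flat content and offsets (len = n_items + 1)."""
    return [content[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


# Hypothetical toy data: 3 events containing 2, 0 and 1 jets.
jet_offsets = [0, 2, 2, 3]        # per-event boundaries, shared by all jet branches
pt = [50.0, 30.0, 20.0]           # single-jagged branch: one value per jet

# Double-jagged branch: several values per jet, so a second offsets level is needed.
sampling_offsets = [0, 2, 5, 6]   # per-jet boundaries in the flat energies
energies = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

pt_per_event = unflatten(pt, jet_offsets)
energies_per_jet = unflatten(energies, sampling_offsets)
# The outer structure reuses the same offsets that describe pt.
energies_per_event = unflatten(energies_per_jet, jet_offsets)

print(pt_per_event)        # [[50.0, 30.0], [], [20.0]]
print(energies_per_event)  # [[[1.0, 2.0], [3.0, 4.0, 5.0]], [], [[6.0]]]
```

In this toy picture, building the per-event structure for the double-jagged branch only needs jet_offsets, not the pt values themselves.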
@nikoladze is being a hero and working on an AwkwardForth solution with @jpivarski to avoid overly hardcoding things.
Describe what you want to do
I am trying to read a specific branch in PHYSLITE files, Jets.EnergyPerSampling, and am seeing that reading this branch triggers the reading of another branch when using the PHYSLITE schema. I would like to understand whether this is intended. The same additional branch is not read with BaseSchema. Reproducer:
which results in
Note the difference in the required branches.
This reproducer unfortunately relies on an ATLAS-internal file sitting at the UChicago AF behind an ATLAS login. We also have a public PHYSLITE file available which can be used to reproduce the same dak.necessary_columns behavior (see the commented-out lines in the script); however, it will crash at task graph execution time for reasons I do not understand. Perhaps it is an earlier iteration of PHYSLITE that is no longer supported by the current schema version or the current version of uproot. cc @nikoladze as expert for this schema
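For readers who want to try a comparison of this kind, the following is a rough sketch only, not the script from this issue. It assumes a coffea release that accepts delayed=True in NanoEventsFactory.from_root and a dask_awkward release that still provides dak.necessary_columns. The tree name "CollectionTree", the flat branch name "AnalysisJetsAuxDyn.EnergyPerSampling", and the helpers flatten_report and compare_schemas are assumptions introduced here for illustration.

```python
def flatten_report(report):
    """Merge a {layer_name: frozenset(branch_names)} report into one sorted list."""
    branches = set()
    for columns in report.values():
        branches.update(columns)
    return sorted(branches)


def compare_schemas(path, treename="CollectionTree",
                    branch="AnalysisJetsAuxDyn.EnergyPerSampling"):
    """Return the branches each schema reports as necessary for one jet branch.

    path, treename and branch are placeholders (assumptions), not values
    taken from the original reproducer.
    """
    # Imported lazily so flatten_report works without coffea installed.
    import dask_awkward as dak
    from coffea.nanoevents import BaseSchema, NanoEventsFactory, PHYSLITESchema

    def load(schema):
        return NanoEventsFactory.from_root(
            {path: treename}, schemaclass=schema, delayed=True
        ).events()

    # Assumption: PHYSLITESchema exposes the branch as Jets.EnergyPerSampling,
    # while BaseSchema keeps the flat branch name unchanged.
    physlite = load(PHYSLITESchema).Jets.EnergyPerSampling
    base = load(BaseSchema)[branch]

    return {
        "PHYSLITESchema": flatten_report(dak.necessary_columns(physlite)),
        "BaseSchema": flatten_report(dak.necessary_columns(base)),
    }
```

Calling compare_schemas with the path to a PHYSLITE file should then show, side by side, which branches each schema would read; no expected output is given here because it depends on the file and library versions.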
Explain what documentation is missing
This is admittedly a very technical question and might go beyond what is useful in documentation, but I'd just like to understand whether the behavior is as intended.