Size of array is less than size of form with PHYSLITE schema #1083
Comments
Some more examples that might be useful when debugging: |
Don't know if this is relevant here, but we've seen the same issue (with other files) when the number of entries in the tree differs from the actual length of the first dimension of a branch. |
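A quick way to look for the mismatch described in this comment is to compare each branch's entry count with the tree's. The following is only a minimal sketch: it assumes an uproot-style tree object exposing `num_entries`, `keys()`, and item access that returns branches with their own `num_entries`. The helper name `branches_with_wrong_length` is hypothetical, and whether the branch metadata actually reflects the mismatch in a given file is also an assumption; reading the array and taking its `len()` would be the stricter check.

```python
def branches_with_wrong_length(tree):
    """Return {branch name: entries} for branches whose entry count
    differs from the tree's entry count.

    `tree` is assumed to behave like an uproot TTree: it has
    `num_entries`, `keys()`, and `tree[name].num_entries`.
    """
    expected = tree.num_entries
    mismatched = {}
    for name in tree.keys():
        n = tree[name].num_entries
        if n != expected:
            mismatched[name] = n
    return mismatched
```

An empty result means every branch reports the same number of entries as the tree; any other result lists the branches to inspect first.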
It looks like these are actually buggy samples, in the sense that different fields of the same collection don't have the same length in an event. I've seen this before, and it was explained to me that this can happen due to a mechanism in Athena that backfills branches created only later in the event loop with empty vectors. If, due to a bug in the code, a branch is not filled somewhere, this mechanism can produce wrong length-0 vectors for certain branches. Checking whether this is happening in one of the example files:

import uproot
import awkward as ak
import numpy as np

def check_collection(tree, collection_name, ref_name):
    # Read all branches of the collection plus a reference branch to compare against.
    keys = [k for k in tree.keys() if k.startswith(collection_name)]
    arrays = tree.arrays(keys)
    ref = tree[ref_name].array()
    for array, field in zip(ak.unzip(arrays), arrays.fields):
        # Skip record-like branches; only plain per-object lists are compared.
        if array.fields:
            continue
        # Shorten the branch name for printing.
        if "/" in field:
            field = field.split("/")[1]
        field = field.split(".", maxsplit=1)[1]
        # Flag events where the list length differs from the reference branch.
        different_num = ak.num(ref) != ak.num(array)
        if ak.any(different_num):
            print(
                f"Different number of entries for {field}: "
                f"{ak.num(array)[different_num]} vs {ak.num(ref)[different_num]} in ref, "
                f"at entries {np.where(different_num)[0].tolist()}"
            )

treename = "CollectionTree"
fname = "root://192.170.240.143:1094//root://fax.mwt2.org:1094//pnfs/uchicago.edu/atlaslocalgroupdisk/rucio/data18_13TeV/1f/87/DAOD_PHYSLITE.37020635._000031.pool.root.1"
tree = uproot.open({fname: treename})
check_collection(tree, "AnalysisPhotonsAuxDyn", "AnalysisPhotonsAuxDyn.pt")

gives
So this will eventually have to be fixed upstream. Of course, we can't easily fix it in already produced PHYSLITE files. Currently I don't have great ideas for a workaround, since we can't zip arrays whose lists have different lengths. We could fill the empty lists with None values (using masked arrays) or with arbitrary values like -999 or NaN, but that would need to happen at the level where the arrays are read. Maybe one could put in something using the coffea nanoevents transforms, but it would make everything a bit ugly, since every form key evaluation would then also need to process the offset array of a reference branch to figure out what length to pad the lists to ... |
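To make the None-filling idea from the comment above concrete, here is a minimal sketch in plain Python. Lists of lists stand in for the jagged arrays, and `ref_counts` stands in for the per-event lengths of a reference branch. The function name `backfill_with_none` is hypothetical and this is not coffea's or the PHYSLITE schema's implementation; a real workaround would have to act on the offsets when the arrays are read.

```python
def backfill_with_none(values, ref_counts):
    """Pad wrongly empty per-event lists with None up to the reference length.

    values     : list of per-event lists for one (possibly broken) field
    ref_counts : per-event list lengths taken from a reference field
    """
    if len(values) != len(ref_counts):
        raise ValueError("field and reference have different numbers of events")
    fixed = []
    for event_values, n in zip(values, ref_counts):
        if len(event_values) == n:
            # Consistent with the reference: keep as is.
            fixed.append(list(event_values))
        elif len(event_values) == 0:
            # Backfilled empty vector: replace with n missing values.
            fixed.append([None] * n)
        else:
            # Any other mismatch is not the known backfill pattern.
            raise ValueError(f"unexpected length {len(event_values)}, expected {n}")
    return fixed
```

After this step every field has the same list length per event as the reference, so the fields could be zipped, with the missing values made explicit instead of hidden behind a sentinel like -999.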
Hello, @alexander-held suggested I post here in case it is useful, since I'm seeing similar behaviour (i.e. sporadic errors of the form below that are not entirely reproducible). I'm running on an internal ATLAS file format (not PHYSLITE) and don't see any issues running something like @nikoladze's script on it. I'm happy to share any additional details or files, of course.
Script to reproduce the error (might need to run
Schema used and preprocessing file for completeness:
End of error stack trace:
|
@sebastien-rettie it seems you are trying to zip together the

from coffea.nanoevents import NanoEventsFactory
from schema import NtupleSchema
events = NanoEventsFactory.from_root({"user.caiyi.40860313._002582.output-tree.root": "AnalysisMiniTree"}, schemaclass=NtupleSchema).events()
events.compute()

This raises a similar exception. If I go into the debugger and step up until I hit

import pprint
pprint.pprint(form.to_dict())

There one can see:
so, a zipped collection and in the contents there are both fields starting with |
Hi @nikoladze, thanks a lot for the follow-up, that makes sense! In that case I guess I need to update the schema to group the two jet collections into separate arrays, is that right? Would you have an example of how to do this by any chance? |
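One way to approach the grouping asked about here is to sort branch names into one group per collection before any zipping happens. This is only a sketch under stated assumptions: the prefixes `recojet_` and `truthjet_` and the helper `group_by_collection` are hypothetical illustrations, not the names used in the actual `NtupleSchema` or in the file discussed above.

```python
from collections import defaultdict

def group_by_collection(branch_names, prefixes):
    """Group branch names by the collection prefix they start with.

    Branches matching no prefix are left out. Longer prefixes are tried
    first so that a short prefix does not swallow a more specific one.
    """
    ordered = sorted(prefixes, key=len, reverse=True)
    groups = defaultdict(list)
    for name in branch_names:
        for prefix in ordered:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
    return dict(groups)

# Hypothetical branch names for two jet collections plus an event-level branch.
branches = ["recojet_pt", "truthjet_pt", "recojet_eta", "eventNumber"]
grouped = group_by_collection(branches, ["recojet_", "truthjet_"])
```

Each resulting group would then be zipped on its own, so that fields of collections with different per-event lengths never end up in the same zipped record.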
Since this is unrelated to the issue reported here, maybe we can continue the discussion in your GitLab repo. I took the liberty of opening an issue for that. |
@nikoladze Okay, I think this is enough to say that this isn't a |
Yes, this is a known bug. It will essentially never be fully "fixed" in Athena; the best we can do is detect that it happened and then fix the specific instance of this problem inside Athena. And I thought we already ran some tests during derivation production that would flag this. Essentially nobody considers this an OK or healthy xAOD file, not just those of us working on columnar analysis. I'd say open a ticket in the AMG JIRA for the component "Derivation Framework" (maybe add "Columnar Analysis" as a second component): https://its.cern.ch/jira/projects/ATLASG/issues |
Describe the bug
I've run into another bug that seems PHYSLITE schema related and occurs somewhat infrequently. Unfortunately, I am not aware of suitable public files to reproduce it at the moment, so I will point to the information relevant for finding them within ATLAS. If needed, we can hopefully find a mechanism to share a relevant file. cc @nikoladze as PHYSLITE schema expert.
To Reproduce
Reading with the schema fails; it succeeds with plain uproot. The trace ends in
with the full trace attached below.
Examples of files to test with, all are in the
container:
With plain uproot, both files work fine. I've also run over many other files and have seen similar TypeError exceptions. I have not looked into their origin or tracked down whether the root cause may be similar.
Expected behavior
Successful branch reading.
Output
Desktop (please complete the following information):
Additional context
n/a