Errors reported incorrectly when handling too many files #2081
I understand that the legacy validator may be of lower priority, but is there anything I can do to fix this?
I personally have no idea how to start investigating this. Are you also seeing the issue with the schema validator?
If you mean the Deno-based validator, I'm planning to run it. I'm fetching the data now. Once I've done that I'll report back, but possibly not until next week. Thanks.
I've tried the Deno validator. It also reports odd issues for most of the files that should not be reported. I checked some of the flagged files manually and found no problems. I used this command:
I also get "[WARNING] The onset column in events.tsv files should be sorted. (EVENT_ONSET_ORDER)". My guess is that this is because we have negative onsets at the beginning. The number of files is reported correctly in the summary of both validators: 4428 files. Update: these issues show up even if I bidsignore all subjects but one.
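For reference, the ordering check that the warning implies can be sketched as follows (a hypothetical helper, not the validator's actual code). Note that negative onsets are fine as long as the column is non-decreasing:

```python
def onsets_sorted(onsets: list[float]) -> bool:
    """Return True if onset values are non-decreasing.

    Negative onsets (events before scan start) are allowed;
    only the ordering matters, not the sign.
    """
    return all(a <= b for a, b in zip(onsets, onsets[1:]))
```

If a file with negative but correctly ordered onsets still triggers the warning, that would point at a bug in the validator's comparison rather than the data.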
We had some invalid JSON files in the dataset; we knew about them. Fixing them made the legacy validator work more as expected, no longer reporting the errors I listed originally. It did not help with the issues reported by the Deno validator. It seems that something is happening internally that causes other errors to be thrown when some JSON files are invalid. I remember reading about similar issues. Are there maybe any errors set by default that get reported when something else breaks?
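One way to rule out cascading failures from malformed JSON is to pre-scan the dataset before validating. A minimal sketch (the function name is illustrative, not part of any validator API):

```python
import json
import pathlib


def find_invalid_json(root: str) -> list[pathlib.Path]:
    """Return the paths of all JSON files under root that fail to parse."""
    bad = []
    for path in pathlib.Path(root).rglob("*.json"):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            bad.append(path)
    return bad
```

Running something like this first makes it easier to tell which reported errors are genuine and which are knock-on effects of a single unparseable sidecar.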
We've been making a bunch of fixes in the last week or so since that release, so can you re-test with:
This means that either `python -c "import sys, nibabel; print(nibabel.load(sys.argv[1]).to_bytes()[:348])" <PATH>`
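The one-liner above dumps the first 348 bytes of the NIfTI-1 header via nibabel. The same basic sanity check can be done with only the standard library (a sketch, assuming you already have the raw header bytes of an uncompressed .nii file; a .nii.gz would need to be decompressed first):

```python
import struct


def looks_like_nifti1(header_bytes: bytes) -> bool:
    """Check that the first int32 (sizeof_hdr) equals 348, the NIfTI-1 header size.

    The field may be stored in either byte order depending on the
    machine that wrote the file, so both are accepted.
    """
    if len(header_bytes) < 4:
        return False
    (little,) = struct.unpack("<i", header_bytes[:4])
    (big,) = struct.unpack(">i", header_bytes[:4])
    return 348 in (little, big)
```

If this check fails for a file, the header really is damaged; if it passes but the validator still complains, the problem is more likely on the validator's side.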
This is almost impossible not to have, and a result of the schema validator systematically reporting RECOMMENDED sidecar fields. It's going to be noisy because BIDS has a lot of these, but they've only been selectively applied in the legacy validator. (Preview: I'm going to be advocating for reducing many fields to OPTIONAL.)
This could be a rounding problem: https://github.com/bids-standard/bids-validator/issues/2091
This is a problem in the schema, I'm going to submit a patch today.
This is surprising. I've just checked the sorting function and it should handle negative numbers fine. Would you open an issue with a failing file?
Now I'm getting the following error with the legacy validator, which doesn't make sense to me: why should the JSON files in the phenotype directory be validated against a schema and need to have the listed properties?
That looks like it's being picked up by a microscopy rule.
I don't know about that, but I've only been tangentially involved in the legacy validator. @rwblair might remember something here?
That is weird, but thank you. Renaming the file resolves this issue, but it's not really a solution.
This issue has grown too much 🙈 We can split it.
Damn. Okay, apparently that method was disabled by #2077. Use https://github.com/bids-standard/bids-validator/raw/deno-build/bids-validator.js. |
This problem we actually did find in our data 🙈
The command I used:
The entire output:
The NIfTI header warnings are gone now. We are aware of some of the issues, but some don't seem right:
Agreed, it looks like the schema needs tightening up to avoid trying to apply sidecar rules for data files to |
@effigies, thank you for all your help. A lot has happened in this issue, and I'm wondering how to proceed. Feel free to rename the issue to better reflect its contents. As a summary, from my side I see the following remaining problems:
Please let me know if you'd like me to report anything else or open separate issues for any of these problems.
I'm executing BIDS Validator v1.14.8 on a large dataset (~800GB, ~4500 files). The validator incorrectly reports the following errors:
These values are present in the JSON files.
It seems that the validator doesn't consider the JSON files. Is it possible that I'm hitting some limit? It's not memory, because the validator runs to completion.
When I bidsignore half of the subjects, the validation passes.
I'm happy to do more investigation but I'd need to know what.