Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EVA-3275: Improvements to validation report #9

Merged
merged 2 commits into from
Sep 7, 2023

Conversation

apriltuesday
Copy link
Contributor

@apriltuesday apriltuesday commented Aug 31, 2023

To view an example report, download expected_report.html and open in your browser.

@apriltuesday apriltuesday self-assigned this Aug 31, 2023
@apriltuesday apriltuesday marked this pull request as ready for review August 31, 2023 13:16
<section>
<h2>Sample name concordance check</h2>
<div class="description">
Checks whether information in the metadata is concordant with that contained in the VCF files, in particular sample names.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Checks whether information in the metadata is concordant with that contained in the VCF files, in particular sample names.
Checks whether information in the metadata (ex: Sample names) is concordant with that contained in the VCF files.

Copy link
Member

@tcezard tcezard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Only improvement I would like is a better indication that the collapsible section can be expanded.
I think we can do that in a later ticket

@sundarvenkata-EBI sundarvenkata-EBI merged commit 550c293 into EBIvariation:main Sep 7, 2023
@Dona094
Copy link
Contributor

Dona094 commented Sep 8, 2023

  1. Plus one on the general feedback on adding a collapsible sign as it was not so clear in the first instance.

  2. Little confused about the Full report:/path/to/metadata/report. Would this be a future task to do please?

  3. Comments on the table generated for the metadata validation results

  • For me the property field was a bit confusing. For instance, the project looks like /project.description , /project.taxId whilst the analysis looks like /analysis/0.experimentType, /analysis/0.referenceGenome (Why there is an occurrence of Zero for the analysis)

  • Property: .files Error: should have required property 'files' - does this error imply that the Files tab does not contain anything? If so, it was a bit difficult for me to understand that

  • /sample/0.bioSampleAccession, /sample/0.bioSampleObject : I could be entirely wrong here, but my understanding from the metadasheet help section for the Sample accession is that the accession can be from Biosamples, ENA or EGA so would it refer to the ENA/EGA accession if the input would be that?

  • Property: /sample/0 Error: should match exactly one schema in oneOf - I was not able to figure out the error using this. As in, little confused on what /sample/0 means, and also the error seems to be incomplete.

  • In general, there are inconsistencies between the field names mentioned in the Error column vs. the column headers in the metadata sheet. Maybe it would be better if we bring in the consistency between both

should have required property 'title' -  Referenced in the metadata sheet as Project Title
should have required property 'description' - Referenced in the metadata sheet as Description
should have required property 'experimentType' - Referenced in the metadata sheet as Experiment Type 
should have required property 'taxId' - Referenced in the metadata sheet as Tax ID
should have required property 'centre' - Referenced in the metadasheet as Center 
should have required property 'analysisTitle' - Referenced in the metadata sheet as Analysis Title 
should have required property 'referenceGenome' - Referenced in the metadata sheet as Reference 
should have required property 'bioSampleAccession' - Referenced in the metadata sheet as Sample Accession
/sample/0.bioSampleObject	should have required property 'bioSampleObject' - Not sure if this would be the Sample ID? 

Suggestions

  1. Just a small suggestion maybe we could have three columns - Metadata sheet tab, Column header, Error
  2. I can see the table at the moment, only shows errors in the scenario where they are missing. Interested to see the errors in case of other common submitter errors. Would these have a further explanation as well? For example, how would the error look like if the user has used the wrong format in the Publication section or if they have not used the Scientific Name name exactly (homo sapiens instead of Homo sapiens)
  3. Maybe we could auto-populate the collection date and geographical population as "not provided" if they are missing.

VCF Validation results

  1. Here again little confused on the Full report: /path/to/assembly_failed/report. I am not entirely sure what this path would be, but it would be great to have a downloadable text file on the assembly report generation for the submitter
  2. I feel there is no clear indication for a user to distinguish/understand if they need to fix a non-critical error or a warning. For eg- In the case of a VCF spec failure in the meta lines.

Suggestions

  1. For the assembly check, maybe we could provide one or two sentences on what exactly we do, and what reference assembly we have used in this instance to run the report (as sometimes we might not be using the version the user has specified?).
  2. What if there are a significant number of VCF files, maybe we could provide all the details downloadable as a zip file?

Sample name concordance check

  1. Maybe here we could put more emphasis on the fact that we require the Sample ID in the spreadsheet to match the ones in the VCF as this is a prevalent error
  2. In the sample name concordance check, what would AA: mean? AA: Sample names concordance check 🤔

Please let me know if any of the above feedback is not clear. Happy to test out different scenarios in the future 😀

@tcezard
Copy link
Member

tcezard commented Sep 8, 2023

@Dona094 Thanks you for your comments they are very useful.
Some of them are addressed in #10 which you should check out.
For the rest I'll create an additional ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants