diff --git a/tests/resources/validation_reports/expected_report_metadata_json.html b/tests/resources/validation_reports/expected_report_metadata_json.html
index 53b45f7..db260a2 100644
--- a/tests/resources/validation_reports/expected_report_metadata_json.html
+++ b/tests/resources/validation_reports/expected_report_metadata_json.html
@@ -19,4 +19,4 @@
.fail { background-color: #FFB6C1; }
.pass { background-color: #90EE90; }
.info { background-color: #dadada; }
- .error-list, .no-show { display: none; }
Project Summary
General details about the project
Project Title: My cool project
Validation Date: 2023-08-31 12:34:56
Submission Directory: /test/submission/dir
▶ Files mapping
VCF File | Fasta File | Analysis |
---|
input_fail.vcf | input_fail.fa | A |
input_pass.vcf | input_pass.fa | B |
input_test.vcf | input_test.fa | could not be linked |
Metadata validation results
Ensures that required fields are present and values are formatted correctly. For requirements, please refer to the
EVA website.
▶ ❌ Metadata validation check
Full report: /path/to/json/metadata/report
Json Property | Error Description |
---|
.files | should have required property 'files' |
/project.title | should have required property 'title' |
/project.description | should have required property 'description' |
/project.taxId | should have required property 'taxId' |
/project.centre | should have required property 'centre' |
/analysis/0.analysisTitle | should have required property 'analysisTitle' |
/analysis/0.description | should have required property 'description' |
/analysis/0.experimentType | should have required property 'experimentType' |
/analysis/0.referenceGenome | should have required property 'referenceGenome' |
/sample/0.bioSampleAccession | should have required property 'bioSampleAccession' |
/sample/0.bioSampleObject | should have required property 'bioSampleObject' |
/sample/0 | should match exactly one schema in oneOf |
VCF validation results
Checks whether each file is compliant with the
VCF specification. Also checks whether the variants' reference alleles match against the reference assembly.
input_fail.vcf
▶ ❌ Assembly check: 26/36 (72.22%)
First 10 errors per category are below. Full report: /path/to/assembly_failed/report
Category | Error |
---|
Parsing Error | The assembly checking could not be completed: Contig 'chr23' not found in assembly report |
mismatch error | Chromosome 1, position 35549, reference allele 'G' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35595, reference allele 'G' does not match the reference sequence, expected 'a' |
mismatch error | Chromosome 1, position 35618, reference allele 'G' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35626, reference allele 'A' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35639, reference allele 'T' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35643, reference allele 'T' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35717, reference allele 'T' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35819, reference allele 'T' does not match the reference sequence, expected 'a' |
mismatch error | Chromosome 1, position 35822, reference allele 'T' does not match the reference sequence, expected 'c' |
▶ ❌ VCF check: 1 critical errors, 1 non-critical errors
First 10 errors per category are below. Full report: /path/to/vcf_failed/report
Category | Error |
---|
critical error | Line 4: Error in meta-data section. |
non-critical error | Sample #11, field AD does not match the meta specification Number=R (expected 2 value(s)). AD=.. |
input_passed.vcf
✔ Assembly check: 247/247 (100.0%)
✔ VCF check: 0 critical errors, 0 non-critical errors
Sample name concordance check
Checks whether information in the metadata is concordant with that contained in the VCF files, in particular sample names.
▶ ❌ Analysis A: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category | Link To View All Errors |
---|
Samples described in the metadata but not in the VCF files | SampleA1, SampleA2 , SampleA3, SampleA4, SampleA5 | Show All Errors For Category |
Samples in the VCF files but not described in the metadata | A1Sample , A2Sample, A3Sample, A4Sample, A5Sample | Show All Errors For Category |
All Errors For Category - Samples in the VCF files but not described in the metadata:
- A1Sample•
- •A2Sample
- A3Sample
- A4Sample
- A5Sample
- A6Sample
- A7Sample
- A8Sample
- A9Sample
- A10Sample
Hide
✔ Analysis B: Sample names in metadata match with those in VCF files
▶ ❌ Analysis C: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category | Link To View All Errors |
---|
Samples described in the metadata but not in the VCF files | SampleC1 , SampleC2, SampleC3, SampleC4 | Show All Errors For Category |
Samples in the VCF files but not described in the metadata | C1Sample , C2Sample, C3Sample, C4Sample | Show All Errors For Category |
All Errors For Category - Samples in the VCF files but not described in the metadata:
- C1Sample•
- •C2Sample
- C3Sample
- C4Sample
Hide
Reference genome INSDC check
Checks that the reference sequences in the FASTA file used to call the variants are accessioned in INSDC. Also checks if the reference assembly accession in the metadata matches the one determined from the FASTA file.
metadata_asm_match.fa
✔ All sequences are INSDC accessioned
✔ Analysis A: Assembly accession in metadata is compatible
metadata_asm_not_found.fa
✔ All sequences are INSDC accessioned
▶ ❌ No assembly accession found in metadata
Full report: /path/to/metadata_asm_not_found.yml
Category | Accessions |
---|
Assembly accession found in metadata | Not found |
Assembly accession(s) compatible with FASTA | GCA_1 |
metadata_asm_not_match.fa
✔ All sequences are INSDC accessioned
▶ ❌ Analysis B: Assembly accession in metadata is not compatible
Full report: /path/to/metadata_asm_not_match.yml
Category | Accessions |
---|
Assembly accession found in metadata | GCA_2 |
Assembly accession(s) compatible with FASTA | GCA_1 |
metadata_error.fa
Warning: The following results may be incomplete due to problems with external services. Please try again later for complete results.
Error message: 500 Server Error: Internal Server Error for url: https://www.ebi.ac.uk/eva/webservices/contig-alias/v1/chromosomes/md5checksum/hjfdoijsfc47hfg0gh9qwjrve
✔ All sequences are INSDC accessioned
✔ Analysis C: Assembly accession in metadata is compatible
not_all_insdc.fa
▶ ❌ Some sequences are not INSDC accessioned
First 10 sequences not in INSDC. Full report: /path/to/not_all_insdc_check.yml
Sequence name | Refget md5 |
---|
2 | hjfdoijsfc47hfg0gh9qwjrve |
✔ Analysis A: Assembly accession in metadata is compatible
\ No newline at end of file
+ .error-list, .no-show { display: none; }Project Summary
General details about the project
Project Title: My cool project
Validation Date: 2023-08-31 12:34:56
Submission Directory: /test/submission/dir
▶ Files mapping
VCF File | Fasta File | Analysis |
---|
input_fail.vcf | input_fail.fa | A |
input_pass.vcf | input_pass.fa | B |
input_test.vcf | input_test.fa | could not be linked |
Metadata validation results
Ensures that required fields are present and values are formatted correctly. For requirements, please refer to the
EVA website.
▶ ❌ Metadata validation check
Full report: /path/to/json/metadata/report
Json Property | Error Description |
---|
.files | should have required property 'files' |
/project.title | should have required property 'title' |
/project.description | should have required property 'description' |
/project.taxId | should have required property 'taxId' |
/project.centre | should have required property 'centre' |
/analysis/0.analysisTitle | should have required property 'analysisTitle' |
/analysis/0.description | should have required property 'description' |
/analysis/0.experimentType | should have required property 'experimentType' |
/analysis/0.referenceGenome | should have required property 'referenceGenome' |
/sample/0.bioSampleAccession | should have required property 'bioSampleAccession' |
/sample/0.bioSampleObject | should have required property 'bioSampleObject' |
/sample/0 | should match exactly one schema in oneOf |
VCF validation results
Checks whether each file is compliant with the
VCF specification. Also checks whether the variants' reference alleles match against the reference assembly.
input_fail.vcf
▶ ❌ Assembly check: 26/36 (72.22%)
First 10 errors per category are below. Full report: /path/to/assembly_failed/report
Category | Error |
---|
Parsing Error | The assembly checking could not be completed: Contig 'chr23' not found in assembly report |
mismatch error | Chromosome 1, position 35549, reference allele 'G' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35595, reference allele 'G' does not match the reference sequence, expected 'a' |
mismatch error | Chromosome 1, position 35618, reference allele 'G' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35626, reference allele 'A' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35639, reference allele 'T' does not match the reference sequence, expected 'c' |
mismatch error | Chromosome 1, position 35643, reference allele 'T' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35717, reference allele 'T' does not match the reference sequence, expected 'g' |
mismatch error | Chromosome 1, position 35819, reference allele 'T' does not match the reference sequence, expected 'a' |
mismatch error | Chromosome 1, position 35822, reference allele 'T' does not match the reference sequence, expected 'c' |
▶ ❌ VCF check: 1 critical errors, 1 non-critical errors
First 10 errors per category are below. Full report: /path/to/vcf_failed/report
Category | Error |
---|
critical error | Line 4: Error in meta-data section. |
non-critical error | Sample #11, field AD does not match the meta specification Number=R (expected 2 value(s)). AD=.. |
input_passed.vcf
✔ Assembly check: 247/247 (100.0%)
✔ VCF check: 0 critical errors, 0 non-critical errors
Sample name concordance check
Checks whether information in the metadata is concordant with that contained in the VCF files, in particular sample names.
▶ ❌ Analysis A: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category | Link To View All Errors |
---|
Samples described in the metadata but not in the VCF files | SampleA1, SampleA2 , SampleA3, SampleA4, SampleA5 | Show All Errors For Category |
Samples in the VCF files but not described in the metadata | A1Sample , A2Sample, A3Sample, A4Sample, A5Sample | Show All Errors For Category |
All Errors For Category - Samples in the VCF files but not described in the metadata:
- A1Sample•
- •A2Sample
- A3Sample
- A4Sample
- A5Sample
- A6Sample
- A7Sample
- A8Sample
- A9Sample
- A10Sample
Hide
✔ Analysis B: Sample names in metadata match with those in VCF files
▶ ❌ Analysis C: Sample names in metadata do not match with those in VCF files
Category | First 5 Errors For Category | Link To View All Errors |
---|
Samples described in the metadata but not in the VCF files | SampleC1 , SampleC2, SampleC3, SampleC4 | Show All Errors For Category |
Samples in the VCF files but not described in the metadata | C1Sample , C2Sample, C3Sample, C4Sample | Show All Errors For Category |
All Errors For Category - Samples in the VCF files but not described in the metadata:
- C1Sample•
- •C2Sample
- C3Sample
- C4Sample
Hide
Reference genome INSDC check
Checks that the reference sequences in the FASTA file used to call the variants are accessioned in INSDC. Also checks if the reference assembly accession in the metadata matches the one determined from the FASTA file.
metadata_asm_match.fa
✔ All sequences are INSDC accessioned
✔ Analysis A: Assembly accession in metadata is compatible
metadata_asm_not_found.fa
✔ All sequences are INSDC accessioned
▶ ❌ No assembly accession found in metadata
Full report: /path/to/metadata_asm_not_found.yml
Category | Accessions |
---|
Assembly accession found in metadata | Not found |
Assembly accession(s) compatible with FASTA | GCA_1 |
metadata_asm_not_match.fa
✔ All sequences are INSDC accessioned
▶ ❌ Analysis B: Assembly accession in metadata is not compatible
Full report: /path/to/metadata_asm_not_match.yml
Category | Accessions |
---|
Assembly accession found in metadata | GCA_2 |
Assembly accession(s) compatible with FASTA | GCA_1 |
metadata_error.fa
Warning: The following results may be incomplete due to problems with external services. Please try again later for complete results.
Error message: 500 Server Error: Internal Server Error for url: https://www.ebi.ac.uk/eva/webservices/contig-alias/v1/chromosomes/md5checksum/hjfdoijsfc47hfg0gh9qwjrve
✔ All sequences are INSDC accessioned
✔ Analysis C: Assembly accession in metadata is compatible
not_all_insdc.fa
▶ ❌ Some sequences are not INSDC accessioned
First 10 sequences not in INSDC. Full report: /path/to/not_all_insdc_check.yml
Sequence name | Refget md5 |
---|
2 | hjfdoijsfc47hfg0gh9qwjrve |
✔ Analysis A: Assembly accession in metadata is compatible
\ No newline at end of file