EVA-3585 Cli documentation #45
```
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT [SampleIDs...]
```
Here's a small example to illustrate the structure of a VCF file: Example VCF file archived at EVA to be inserted
"Example VCF file archived at EVA to be inserted": is this a TODO?
@nitin-ebi Yes, I'm still searching for a good example VCF file that we have archived at EVA.
This is great! In particular, the sections on validation key points/common errors are something I think we've sorely needed for a while, and are very clearly written.
A VCF (Variant Call Format) file is a text file used in bioinformatics to store information about genetic variants. It records the differences (variants) between a sample's DNA and a reference genome. Typically, generating a VCF file involves several steps: preparing your sample, sequencing the DNA, aligning it to a reference genome, identifying variants, and finally formatting this information into a VCF file. The overall goal is to capture and record genetic differences systematically in a standardised format. A VCF file consists of two main parts: the header and the body.
Header: The header contains metadata about the file, such as the format version, reference genome information, and descriptions of the data fields. Each header line starts with `##`, except for the last one, which starts with a single `#` and lists the column names.
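The header/body split described above can be sketched in a few lines of Python. This is only an illustration, not part of eva-sub-cli: the function name `split_vcf` and the one-record `EXAMPLE` text are hypothetical and hand-written, and the example is not a file archived at EVA.

```python
import io


def split_vcf(handle):
    """Split a VCF text stream into meta lines, column names, and data rows."""
    meta, columns, records = [], None, []
    for raw in handle:
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("##"):
            # Metadata line, e.g. ##fileformat=...
            meta.append(line)
        elif line.startswith("#"):
            # Last header line: tab-separated column names after the single '#'
            columns = line[1:].split("\t")
        else:
            # Body line: one variant per line, tab-separated
            records.append(line.split("\t"))
    return meta, columns, records


# Hypothetical, hand-written content for illustration only.
EXAMPLE = (
    "##fileformat=VCFv4.3\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\n"
    "1\t12345\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n"
)

meta, columns, records = split_vcf(io.StringIO(EXAMPLE))
```

Here `meta` holds the `##` lines, `columns` holds the names from the `#CHROM` line, and `records` holds one list of fields per variant. For real files, a dedicated VCF parser and validator is a better choice than this sketch.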
File format version
Guessing this is just a reminder for yourself? But somewhere in this section we could link to the VCF specification.
great idea @apriltuesday
Co-authored-by: April Shen <ashen@ebi.ac.uk> Co-authored-by: nitin-ebi <79518737+nitin-ebi@users.noreply.github.com>
Eva-sub-cli documentation review