EVA-3585 Cli documentation #45

Merged
merged 4 commits into from
Jul 30, 2024

@Dona094 Dona094 commented Jul 16, 2024

Eva-sub-cli documentation review

Getting_Started_with_eva_sub_cli.md (outdated, resolved)
```
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT [SampleIDs...]
```
Here's a small example to illustrate the structure of a VCF file: Example VCF file archived at EVA to be inserted
Collaborator

"Example VCF file archived at EVA to be inserted": is this a TODO?

Contributor Author

@nitin-ebi Yes, I'm still searching for a good example VCF file that we have archived at EVA.

Contributor

@apriltuesday apriltuesday left a comment

This is great! In particular, the sections on validation key points/common errors are something I think we've sorely needed for a while, and are very clearly written.

Getting_Started_with_eva_sub_cli.md (resolved)
A VCF (Variant Call Format) file is a type of file used in bioinformatics to store information about genetic variants. It includes data about the differences (or variants) between a sample's DNA and a reference genome. Typically, generating a VCF file involves several steps: preparing your sample, sequencing the DNA, aligning it to a reference genome, identifying variants, and finally, formatting this information into a VCF file. The overall goal is to systematically capture and record genetic differences in a standardised format. A VCF file consists of two main parts: the header and the body.
Header: The header contains metadata about the file, such as the format version, reference genome information, and descriptions of the data fields. Each header line starts with `##`, except for the last header line, which starts with a single `#` and names the columns of the body.
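
The header/body split described above can be sketched in a few lines of Python. This is only an illustration, not part of eva-sub-cli: the `split_vcf` helper and the embedded VCF content are hypothetical and hand-written, and the snippet is not a file archived at EVA.

```python
import io

# Hypothetical, hand-written VCF content for illustration only
# (not an EVA-archived file). Columns are tab-separated.
EXAMPLE_VCF = """\
##fileformat=VCFv4.3
##reference=hypothetical_reference.fa
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t12345\t.\tA\tG\t50\tPASS\tDP=10
"""


def split_vcf(handle):
    """Split a VCF stream into metadata lines, column names, and data records."""
    metadata, columns, records = [], [], []
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("##"):
            # Metadata header line, e.g. ##fileformat=...
            metadata.append(line)
        elif line.startswith("#"):
            # Last header line: a single '#' followed by the column names
            columns = line[1:].split("\t")
        else:
            # Body line: one variant record, keyed by column name
            records.append(dict(zip(columns, line.split("\t"))))
    return metadata, columns, records


metadata, columns, records = split_vcf(io.StringIO(EXAMPLE_VCF))
print(len(metadata), columns[:2], records[0]["POS"])
```

The sketch checks for `##` before `#` because every metadata line also begins with a single `#`; testing in the other order would misclassify metadata lines as the column header.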

File format version
Contributor

Guessing this is just a reminder for yourself? But somewhere in this section we could link to the VCF specification.

Contributor Author

great idea @apriltuesday

Getting_Started_with_eva_sub_cli.md (five further outdated threads, all resolved)
Co-authored-by: April Shen <ashen@ebi.ac.uk>
Co-authored-by: nitin-ebi <79518737+nitin-ebi@users.noreply.github.com>
@Dona094 Dona094 merged commit bf150ed into main Jul 30, 2024
1 check passed
4 participants