This repository contains the documentation and scripts necessary for generating a transposable element (TE) annotation and TE presence analysis for a diploid blueberry genome, as part of a collaborative project with researchers from NC State. I used EDTA by Shujun Ou to generate the TE annotations. EDTA installation directions can be found on the EDTA repository's main page.
EDTA version 1.9.7 was used to annotate and characterize the TEs in the genome. The `Vce1.0.fasta` genome FASTA file and the `vcae1.4.cds.fa` CDS FASTA file were the primary inputs to `EDTA.pl`. Default options were used in all cases except for the `--cds`, `--sensitive 1`, and `--anno` options. The `--sensitive 1` option tells the program to use RepeatModeler, after the normal progression of the pipeline, to identify remaining TEs that were missed by the structure-based methods. Version-controlled documentation and all code needed to recreate the analysis are located in the GitHub repository at https://github.com/sjteresi/W85_TE_Annotation. The script for recreating the analysis is `src/Annotate_W85_TEs_EDTA.sb`, and the complete list of package versions can be found in the `doc/requirements.txt` and `doc/conda_list.log` files.
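
For orientation, the options described above could be combined into a single `EDTA.pl` call roughly as follows. This is only a sketch, not the contents of `src/Annotate_W85_TEs_EDTA.sb`, which remains the authoritative record of what was run. The thread count, the assumption that both input files sit in the working directory, the assumption that `EDTA.pl` is on the `PATH`, and the helper name `build_edta_cmd` are all illustrative. The value `1` passed to `--anno` reflects EDTA's documented `0|1` switch for that option. The snippet only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Illustrative sketch only; see src/Annotate_W85_TEs_EDTA.sb for the actual job script.

GENOME="Vce1.0.fasta"     # genome FASTA named in the text (assumed to be in the working directory)
CDS="vcae1.4.cds.fa"      # CDS FASTA named in the text (assumed to be in the working directory)
THREADS=4                 # assumption: not stated in the text, adjust to your machine

# Hypothetical helper: assemble the EDTA command line without executing it.
build_edta_cmd() {
    printf 'EDTA.pl --genome %s --cds %s --sensitive 1 --anno 1 --threads %s\n' \
        "$GENOME" "$CDS" "$THREADS"
}

# Print the command for review.
build_edta_cmd
```

Once the printed command looks right for your environment, it can be pasted into a terminal or a batch script with EDTA's conda environment activated.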