A targeted amplicon sequencing panel to simultaneously identify mosquito species and Plasmodium presence across the entire Anopheles genus
data
- annotation tables for 62 mosquito and 2 Plasmodium amplicons
pipeline_dada2
- sequencing data processing and QC pipeline based on DADA2
pipeline_seekdeep
- sequencing data processing and QC pipeline based on SeekDeep
tracking
- sample manifests for pipelines grouped by Illumina MiSeq run with short descriptions of sample sets
work
- analysis code and data files; each step has an associated conda environment file env.yml listing the dependencies
- 1_panel_design - search for potential amplicon sites in the alignment of 21 Anopheles genomes and annotation of the final amplicon set
- 2_plasmodium_rebalancing - search for Plasmodium primer concentrations optimal for parasite detection
- 3_plasmodium_qpcr - comparison of amplicon sequencing and qPCR for Plasmodium detection
- 4_ref_extraction - amplicon sequence extraction from reference genomes (supplementary step)
- 5_synteny_plot - plotting amplicon positions in three mosquito species genomes
- 6_ag1k_extraction - amplicon sequence extraction from Ag1000g Phase 2 haplotypes and estimation of within-species distances (supplementary step)
- 7_species_id - multiple species dataset exploration, clustering-based species ID, species tree
- 8_ag1k_analysis - Ag1000g population structure and diversity based on amplicon sequences
- 9_coi_its - COI and ITS2 Sanger sequencing data analysis for species ID confirmation, within-species diversity estimates
- panel design: long version of amplicon annotation table, synteny plot
- multiple species dataset: haplotypes, sample metadata, clustering-based species predictions, species tree
- Ag1000g exploration: Fst estimates, population diversity estimates
- species ID validation with COI and ITS2: summary table, within-species diversity across multiple species table, phylogenetic trees: COI, ITS2, amplicon sequencing
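
Since each step under work has its own env.yml, one way to reproduce a step is to build a dedicated conda environment from that file. The sketch below is illustrative only: the path work/<step>/env.yml and the environment name are assumptions made here, not details confirmed by this README, so adjust them to the actual repository layout.

```shell
# Assumed layout (hypothetical): work/<step>/env.yml
step="1_panel_design"
env_file="work/${step}/env.yml"
# Illustrative environment name chosen for this example
env_name="amplicon_${step}"

# Create the environment only if conda and the env file are both available
if command -v conda >/dev/null 2>&1 && [ -f "$env_file" ]; then
    conda env create -n "$env_name" -f "$env_file"
    echo "Environment created; activate it with: conda activate $env_name"
else
    echo "conda or $env_file not found; adjust the path to match the repository"
fi
```

Keeping a separate environment per step avoids dependency conflicts between analyses that may need different tool versions.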