Directory tree
Clara Qin edited this page Apr 20, 2020 · 5 revisions
├── data                                         # processed data - can be read into R using readRDS()
|   |
|   ├── NEON_ITS_phyloseq_DL08-13-2019.Rds       # phyloseq object based on NEON ITS sequences,
|   |                                            # which were downloaded on DL[date]
|   ├── NEON_ITS_seqtab_nochim_DL08-13-2019.Rds  # sequence table with chimeras removed
|   |
|   └── NEON_ITS_taxa_DL08-13-2019.Rds           # taxa table based on UNITE database
|
├── raw_data                                     # NONE OF THIS DIRECTORY IS PUSHED TO GITHUB - ACCESS ON SERVER (see below)
|   |
|   ├── sequence_metadata                        # metadata for linking ITS sequence data to soil and site data
|   |
|   ├── tax_ref                                  # taxonomic reference tables to match sequences with taxonomy
|   |   └── sh_general_release_dynamic_02.02.2019.fasta
|   |
|   └── Illumina                                 # raw fastq files from Illumina sequencing
|       |
|       └── NEON
|           ├── 16S                              # contains raw fastq files
|           └── ITS                              # contains raw fastq files, and directories with semi-processed files:
|               ├── 0_unzipped                   # after unzipping files downloaded from NEON, and appending run sequence IDs
|               ├── 1_filtN                      # after filtering out reads containing ambiguous bases ("N")
|               ├── 2_cutadapt                   # after removing primers/adapters
|               ├── 3_filtered                   # after passing quality filter
|               ├── 4_filtered                   # sequencing run-specific sequence tables
|               └── track_reads                  # tables summarizing the number of reads remaining at each processing step
|
└── code                                         # if running R scripts in RStudio, set the working directory
    |                                            # to the git root directory (e.g. "NEON_DoB_analysis"),
    |                                            # not the "code" subdirectory
    |
    ├── params.R                                 # contains parameters used throughout workflow
    |
    ├── utils.R                                  # contains various functions, including one that downloads all
    |                                            # NEON raw microbial sequence data
    └── workflow
        |
        ├── 00_new_server_setup.R                # downloads all raw sequence data into ./raw_data/Illumina/NEON/ITS
        |
        ├── 01a_dada2_workflow_its.R             # follows https://benjjneb.github.io/dada2/ITS_workflow.html
        |                                        # to process NEON ITS raw sequence data
        |
        └── 02a_dada2_to_phyloseq_its.R          # assembles outputs of 01a_dada2_workflow_its.R, plus soil data
                                                 # and sequence metadata, to create phyloseq object
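
As noted in the tree, the files under `data` can be read into R with `readRDS()`. Below is a minimal sketch of doing so, assuming the working directory is the git root (as recommended for the scripts in `code`). The helpers `data_path()` and `read_processed()` are hypothetical names used for illustration only; they are not part of `utils.R` or any other file in the repository.

```r
# Hypothetical helper (not in the repository): build the path to a file
# in the "data" directory, relative to the git root.
data_path <- function(file, root = ".") {
  file.path(root, "data", file)
}

# Hypothetical helper (not in the repository): read a processed .Rds file,
# failing with a hint if the file cannot be found from the given root.
read_processed <- function(file, root = ".") {
  path <- data_path(file, root)
  if (!file.exists(path)) {
    stop("File not found: ", path,
         " (is the working directory the git root?)")
  }
  readRDS(path)
}

# Example usage, assuming the working directory is the git root:
# ps     <- read_processed("NEON_ITS_phyloseq_DL08-13-2019.Rds")
# seqtab <- read_processed("NEON_ITS_seqtab_nochim_DL08-13-2019.Rds")
# taxa   <- read_processed("NEON_ITS_taxa_DL08-13-2019.Rds")
```

`readRDS()` itself is base R, but working with the phyloseq object afterwards will most likely require loading the phyloseq package first with `library(phyloseq)`.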