Directory tree

Clara Qin edited this page Apr 20, 2020 · 5 revisions
├── data  # processed data - can be read into R using readRDS()
|   |
|   ├── NEON_ITS_phyloseq_DL08-13-2019.Rds  # phyloseq object based on NEON ITS sequences;
|   |                                       # "DL[date]" in file names is the download date
|   ├── NEON_ITS_seqtab_nochim_DL08-13-2019.Rds  # sequence table with chimeras removed
|   |
|   └── NEON_ITS_taxa_DL08-13-2019.Rds  # taxa table based on UNITE database
|
├── raw_data  # NONE OF THIS DIRECTORY IS PUSHED TO GITHUB - ACCESS ON SERVER (see below)
|   |
|   ├── sequence_metadata  # metadata for linking ITS sequence data to soil and site data
|   |
|   ├── tax_ref  # taxonomic reference tables to match sequences with taxonomy
|   |   └── sh_general_release_dynamic_02.02.2019.fasta
|   |
|   └── Illumina  # raw fastq files from Illumina sequencing
|       |   
|       └── NEON
|           ├── 16S  # contains raw fastq files
|           └── ITS  # contains raw fastq files, and directories with semi-processed files:
|               ├── 0_unzipped  # after unzipping files downloaded from NEON, and appending run sequence IDs
|               ├── 1_filtN  # after filtering out reads containing ambiguous bases ("N")
|               ├── 2_cutadapt  # after removing primers/adapters
|               ├── 3_filtered  # after passing quality filter
|               ├── 4_filtered  # sequencing run-specific sequence tables
|               └── track_reads # tables summarizing the number of reads remaining at each processing step
|
└── code  # if running R scripts in RStudio, set the working directory
    |     # to the git root directory (e.g. "NEON_DoB_analysis"),
    |     # not the "code" subdirectory
    |
    ├── params.R # contains parameters used throughout workflow
    |
    ├── utils.R  # contains various functions including one which downloads all
    |            # NEON raw microbial sequence data
    └── workflow
        |
        ├── 00_new_server_setup.R # downloads all raw sequence data into ./raw_data/Illumina/NEON/ITS
        |
        ├── 01a_dada2_workflow_its.R  # follows https://benjjneb.github.io/dada2/ITS_workflow.html
        |                             # to process NEON ITS raw sequence data
        |
        └── 02a_dada2_to_phyloseq_its.R  # assembles outputs of 01a_dada2_workflow_its.R, plus soil data
                                         # and sequence metadata, to create phyloseq object
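
As the tree notes, the processed files in `data` can be read into R with `readRDS()`. The following is a minimal sketch of how that might look, assuming the working directory is the git root. The helper names `processed_path()` and `read_processed()` are hypothetical and are not part of `utils.R`; only the file-name pattern is taken from the tree above.

```r
# Hypothetical helpers (not part of the repository's utils.R).
# Build the path to a processed file, following the naming pattern
# NEON_ITS_<stem>_DL<download date>.Rds shown in the directory tree.
processed_path <- function(stem, download_date = "08-13-2019", data_dir = "data") {
  file.path(data_dir, paste0("NEON_ITS_", stem, "_DL", download_date, ".Rds"))
}

# Read a processed object, failing with a hint if the file is not found
# (most often because the working directory is not the git root).
read_processed <- function(stem, download_date = "08-13-2019", data_dir = "data") {
  path <- processed_path(stem, download_date, data_dir)
  if (!file.exists(path)) {
    stop("File not found: ", path, " (is the working directory the git root?)")
  }
  readRDS(path)
}

# Example usage, assuming the files listed under ./data are present:
# seqtab <- read_processed("seqtab_nochim")  # sequence table, chimeras removed
# taxa   <- read_processed("taxa")           # taxa table based on UNITE
# ps     <- read_processed("phyloseq")       # load library(phyloseq) to work with it
```

Keeping the download date as an argument means the same calls should still work if the data are re-downloaded and saved under a new `DL[date]` tag.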