New server setup

Clara Qin edited this page Feb 5, 2020 · 6 revisions

These instructions assume you are starting from scratch on a server of your choice. (For example, this is how I set everything up on the original server, UCSC socs-stats.) They work just as well on a personal computer.

Downloading raw sequence files as a batch

Start by modifying the parameters in the params.R script, particularly those that begin with preset_outdir, so that they match your local directory structure. An example is given in the params.R file in this repo.

Then, run new_server_setup.R.

This script downloads all of the raw sequence data and metadata that match the parameters set in params.R. It then extracts all of the downloaded zip files, reorganizes the unzipped .fastq files into "ITS" and "16S" subdirectories, and appends the sequencing run ID to the .fastq filenames.

Notes:

  • This process may take a while. If you're on a Unix system (e.g. the macOS Terminal), consider using screen. Start a new screen session by typing screen, and then start R by typing R. (You can detach the session while waiting by pressing Ctrl+A, then D. Later you can reattach by typing screen -r.) See man screen to read more about screen.
  • To save time, you can instead download only a subset of the data: run utils.R, then call the downloadRawSequenceData() function with the sites, startYrMo, or endYrMo arguments changed, e.g. downloadRawSequenceData(sites = "ONAQ", startYrMo = "2017-10", endYrMo = "2017-10"). These parameters can also be set in the params.R script.

Pre-processing

I had to rename a single file due to a capitalization error, though there may be more in the future:

cd ITS
rename PLate Plate *.fastq
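
The rename command above uses the util-linux syntax (rename FROM TO FILES). On some systems, e.g. Debian or Ubuntu, rename may be the Perl version, which expects an expression instead, and macOS does not ship rename by default. As a fallback, here is a minimal sketch of the same fix using only mv and sed. The function name fix_plate_case is made up for illustration and is not part of this repo, and the sketch assumes the only problem is "PLate" appearing where "Plate" was intended.

```shell
# Hypothetical helper (not part of this repo): fix_plate_case DIR
# Renames every .fastq file in DIR whose name contains "PLate" so that
# the first occurrence reads "Plate", mirroring `rename PLate Plate *.fastq`.
fix_plate_case() {
  dir="${1:-.}"
  for f in "$dir"/*PLate*.fastq; do
    # If no file matches, the glob is left as a literal string; skip it.
    [ -e "$f" ] || continue
    base=$(basename "$f")
    # Replace only the first "PLate" in the filename.
    fixed=$(printf '%s\n' "$base" | sed 's/PLate/Plate/')
    mv -- "$f" "$dir/$fixed"
  done
}
```

To apply it to the ITS directory, run fix_plate_case ITS from the directory that contains it. Files that are already named correctly are left untouched, so it should be safe to rerun if more mis-capitalized files show up later.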