An attempt at a workflow and folder structure to cope with the lack of scaffolding tools in R for large bioinformatics analysis projects.
Project:
- info: PDFs, PowerPoint files, documents, and anything else that is not related to the scripts
- data-input: data that will be used by the scripts but is not generated by them
- data-output: data that will be generated by the scripts but is not in the form of a proper report
- reports: only files that can be shown to someone else in a proper form
- src: all the R scripts, each one containing a prefilled description in comments
- main.R: Sets the current working directory and calls the following scripts: initialize.R, load_data.R, pull_data_from_DB.R, build.R, analyze.R.
- functions.R: All functions live in this file. If there are too many, they can be separated into several functions_XXX.R files.
- explore.R: Script chunks for testing things or exploring data go here. It is the only file that is allowed to be messy.
- initialize.R: Loads all the packages/libraries and data needed for the workspace, sources the functions.R script, and sets the global variables.
- load_data.R: Loads all the csv/txt/xlsx/RDS/etc. files needed and displays which objects have been created/loaded in the workspace.
- pull_data_from_DB.R: Uses the dbplyr package to fetch filtered and grouped tables from the DB. Client-side operations should be kept to a bare minimum here: anything that can run on the server side, such as filtering and grouping, should be done there. All new objects created in the workspace should be displayed. These objects should be saved so they can be reloaded faster. A query_db boolean variable can be used to switch between fetching from the DB and reading from an RDS file for faster data reload.
- build.R: Data wrangling with dplyr/tidyr/etc. All the magic happens here. Newly created objects in the workspace should be displayed. A build boolean variable can be used to switch between rebuilding the data and reloading it from an RDS file, which is faster.
- analyze.R: All the analysis steps are done here. Results are written in xlsx/csv format to the designated folders.
- build_ppt.R: A template for a PowerPoint report using the officer package.
- prepare_markdown.R: Sets the current working directory, loads the data, sets the markdown parameters, and renders the report.
- prepare_shiny.R: Sets the current working directory, loads the data, sets the Shiny parameters, and runs the app.
- markdown_report.Rmd: A report template.
- shiny_report.Rmd: A Shiny app template.
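
The query_db and build switches described for pull_data_from_DB.R and build.R both follow the same pattern: either recompute an object and save it, or reload the saved copy from an RDS file. The following is a minimal sketch of that pattern in base R, not the template's actual code. The helper load_or_compute, the stand-in fetch_samples, and the cache file name are hypothetical, and a real pull_data_from_DB.R would run a dbplyr query where the stand-in builds a small data frame.

```r
# Hypothetical helper: reuse a cached RDS file unless a refresh is requested.
# If the cache file does not exist yet, the object is computed and saved.
load_or_compute <- function(rds_path, refresh, compute) {
  if (!refresh && file.exists(rds_path)) {
    return(readRDS(rds_path))
  }
  result <- compute()
  saveRDS(result, rds_path)
  result
}

# TRUE: fetch from the database; FALSE: reuse the saved RDS file
query_db <- TRUE

# Stand-in for a database fetch (assumption: a real script would build
# a dbplyr query and collect() the filtered, grouped table instead).
fetch_samples <- function() {
  data.frame(sample_id = c("s1", "s2", "s3"), reads = c(120L, 95L, 143L))
}

# Hypothetical cache location; a project would more likely use data-output
cache_file <- file.path(tempdir(), "samples.rds")

samples <- load_or_compute(cache_file, refresh = query_db, compute = fetch_samples)

# Report what was created in the workspace, as the descriptions above ask
message("Loaded: samples (", nrow(samples), " rows)")
```

Setting query_db to FALSE on a later run skips the fetch and reads samples.rds directly. The same helper could serve build.R with the build variable passed as refresh.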

- Put the files [project_initializer.R, DESCRIPTION, file_descriptions.csv] in any folder with read/write privileges
- Run source("project_initializer.R") inside the project folder and start coding!
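
Once the project is set up, main.R is described as the entry point that calls the pipeline scripts in a fixed order. A minimal sketch of such a driver follows, assuming the scripts sit together in one folder. The function run_pipeline and the demo folder with stub scripts are hypothetical and exist only to make the example self-contained. It takes the script folder as an argument instead of calling setwd().

```r
# Hypothetical driver: source each pipeline script from src_dir, in order,
# stopping early if one of them is missing.
run_pipeline <- function(src_dir, scripts, envir = globalenv()) {
  for (script in scripts) {
    path <- file.path(src_dir, script)
    if (!file.exists(path)) {
      stop("Missing script: ", path)
    }
    message("Running ", script)
    source(path, local = envir)
  }
  invisible(scripts)
}

# Script order as listed for main.R
pipeline <- c("initialize.R", "load_data.R", "pull_data_from_DB.R",
              "build.R", "analyze.R")

# Demo only: create stub scripts in a temporary folder. Each stub appends
# its own name to steps_run so the execution order can be inspected.
demo_src <- file.path(tempdir(), "src_demo")
dir.create(demo_src, showWarnings = FALSE)
for (script in pipeline) {
  writeLines(sprintf('steps_run <- c(get0("steps_run"), "%s")', script),
             file.path(demo_src, script))
}

# Run the stubs in a separate environment to keep the global workspace clean
demo_env <- new.env()
run_pipeline(demo_src, pipeline, envir = demo_env)
```

In a real project, src_dir would point at the project's own script folder and the default envir would be kept, so that objects created by one script are visible to the next.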