TFs and epigenetic regulators
hs.TFs <- read.csv("https://github.com/leezx/RToolbox/raw/master/data/Homo_sapiens_TF.txt", sep = "\t")
hs.coTFs <- read.csv("https://github.com/leezx/RToolbox/raw/master/data/Homo_sapiens_TF_cofactors.txt", sep = "\t")
hs.TFs <- unique(c(hs.TFs$Symbol, hs.coTFs$Symbol))  # combine TF and TF-cofactor symbols
length(hs.TFs)
hs.epi.reg <- read.csv("https://github.com/leezx/RToolbox/raw/master/data/KAT6A-chromatin-regulators.csv", sep = ",")
hs.epi.reg <- unique(hs.epi.reg$Gene)
hs.epi.reg <- hs.epi.reg[!grepl("NonTargetingControlGuide",hs.epi.reg)]
length(hs.epi.reg)
gplots::venn(list(hs.TFs=hs.TFs, hs.epi.reg=hs.epi.reg))
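The counts shown in the Venn diagram can also be obtained directly with base R set operations, which is handy when the overlapping genes themselves are needed. This is a minimal sketch on made-up placeholder symbols (`tf.toy`, `epi.toy` and the gene names are assumptions for illustration, not values from the files above).

```r
# Placeholder gene symbols, for illustration only
tf.toy  <- c("geneA", "geneB", "geneC", "geneD")
epi.toy <- c("geneC", "geneD", "geneE")

# Genes present in both lists, and genes unique to each list
both     <- intersect(tf.toy, epi.toy)
tf.only  <- setdiff(tf.toy, epi.toy)
epi.only <- setdiff(epi.toy, tf.toy)

# Region sizes, matching the three areas of a two-set Venn diagram
overlap.sizes <- c(both = length(both),
                   tf.only = length(tf.only),
                   epi.only = length(epi.only))
print(overlap.sizes)
```

Replacing the toy vectors with hs.TFs and hs.epi.reg should give the same numbers that gplots::venn draws.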
source("https://github.com/leezx/Toolsets/raw/master/R/Toolsets.R")
source("https://github.com/leezx/Toolsets/raw/master/R/Plot.R")
- copy the template in tmpf.R
- develop the new function in local env
- add the new function to tmpf.R
- gitdoc
- gitpush
gitdoc
Rscript -e "devtools::document()" && cd .. && R CMD Rd2pdf Toolsets && mv Toolsets.pdf Toolsets && cd Toolsets
- go to Rstudio and build
- or
Rscript -e "devtools::document();devtools::check();devtools::build()"
beauty
- make function names shorter (. is better than _)
- make the descriptions shorter
- word-wrap the example code
- move newly written functions to the formal R file
- update the function names and descriptions
- update the manual
- DO NOT change the API unless you are still developing it
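How can I use, learn from, and debug other people's code at any time and then, once I fully understand it, modify it and save it as my own? (Maxing out this skill would open up a lot of possibilities in bioinformatics!)
Keep using and debugging separate. All code is interconnected, and the data is what guides everything. Don't be afraid of the gaps between different tools; quite possibly a single function is enough to bridge them.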
all in one file
naming scheme:
- prepare_
- clustering_
- trajectory_
- deg_
- toC_
- plot_
- tp_
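One practical benefit of a prefix-based scheme is that related functions can be listed by pattern. The sketch below uses hypothetical stub functions in a scratch environment (the names `plot_heatmap`, `plot_violin` and `deg_edgeR` are placeholders, not functions confirmed to exist in Toolsets).

```r
# Scratch environment holding hypothetical stubs that follow the prefix scheme
env <- new.env()
env$plot_heatmap <- function(...) NULL
env$plot_violin  <- function(...) NULL
env$deg_edgeR    <- function(...) NULL

# List every function whose name starts with "plot_"
plot.funs <- ls(envir = env, pattern = "^plot_")
print(plot.funs)
```

After sourcing the real file into the global environment, ls(pattern = "^plot_") would do the same lookup there.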
objective:
- clear function
- clear input and output
- well documented
- good handbook
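As a rough illustration of these goals, here is a small function with one job, explicit input and output, and roxygen2 comments that devtools::document() can turn into a manual page. The function `keep.genes` and its parameter `min.cells` are hypothetical examples, not part of the existing toolset.

```r
#' Keep genes detected in enough cells
#'
#' @param mat Numeric matrix with genes in rows and cells in columns.
#' @param min.cells Keep genes with a value > 0 in at least this many cells.
#' @return The input matrix restricted to the genes that pass the filter.
#' @examples
#' m <- matrix(c(0, 0, 1, 2, 3, 0), nrow = 2, byrow = TRUE,
#'             dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))
#' keep.genes(m, min.cells = 2)
#' @export
keep.genes <- function(mat, min.cells = 1) {
  # Fail early on unexpected input so the contract stays clear
  stopifnot(is.matrix(mat), is.numeric(mat))
  # Count, per gene, the cells in which it is detected
  keep <- rowSums(mat > 0) >= min.cells
  # drop = FALSE keeps the result a matrix even if one gene remains
  mat[keep, , drop = FALSE]
}

# Usage on a toy matrix: g1 is detected in 1 cell, g2 in 2 cells
m <- matrix(c(0, 0, 1, 2, 3, 0), nrow = 2, byrow = TRUE,
            dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))
res <- keep.genes(m, min.cells = 2)
print(res)
```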
others:
- collect R packages
- collect useful graphs
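- no installation needed; callable at any time
- generate a PDF handbook at any time for easy lookup
- version control
- modify on the fly
- balance practicality and completeness: don't over-encapsulate and don't chase perfection; ideally wrap one simple feature into one function, and aim for completeness only when finally bundling everything into a package
- plotting: function and style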
Coursera - Building R Packages
From course:
- Writing R Extensions
- Building R Packages Pre-Flight Check List
- devtools-cheatsheet.pdf
- Common roxygen2 tags
- testthat: Get Started with Testing
Building packages for Bioconductor
Learn from monocle, seurat, etc.
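- organize single-cell analysis by function (smart-seq and 10x, clustering, pseudotime, plotting modules);
- standardize inputs and outputs;
- write good usage documentation;
- subset(diff_test_res, qval < 0.01): my previous way of taking subsets was too clumsy;
- Q: How do I add functions from someone else's R package directly to my own package?
A: If it is a pure function, just copy it. Most code nowadays is built around objects, though, so a direct copy will not work; the class has to be imported first. I ran into exactly this today: after copying the function files from the R folder, the package would not build and reported that a class was not defined. I searched for a long time without success and finally found that the problem was the DESCRIPTION file (the original package built fine, and I tracked it down by deleting files one at a time). It contains a Depends field, which is not there for decoration: it determines which packages are imported before the build (removing any one of them reproduces the error). NAMESPACE does not need to be edited by hand; roxygen generates it automatically.
- Q: When developing a package in R, how can multiple versions coexist and be debugged together?
A: Functions with the same name overwrite each other, so the only option is to call them explicitly with their namespace.
- Q: There are already many dimensionality-reduction methods; is feature selection still needed?
A: I don't know.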
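Two of the points above can be sketched in base R: subset() as a compact row filter, and explicit namespace calls when two functions share a name. The data frame values and the masking function are made up for illustration; only the column name qval comes from the notes.

```r
# Toy result table; values are invented for illustration
diff_test_res <- data.frame(
  gene = c("g1", "g2", "g3"),
  qval = c(0.001, 0.2, 0.005)
)

# subset() evaluates the condition inside the data frame
sig <- subset(diff_test_res, qval < 0.01)

# The more verbose bracket-indexing form selects the same rows
sig2 <- diff_test_res[diff_test_res$qval < 0.01, ]

# A user-defined function that masks stats::sd in the current environment
sd <- function(x) "my own sd"

masked   <- sd(c(1, 2, 3))         # calls the masking function
explicit <- stats::sd(c(1, 2, 3))  # explicit namespace: calls the stats version
print(sig)
print(list(masked = masked, explicit = explicit))
```

The same pkg::fun form applies to two packages that export the same name, as long as they are installed under different package names.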
prepare_TPM_value_from_smart_seq_data
prepare_raw_count_matrix_from_10x_data
filtering_matrix
normalizing_matrix
batch_effect_correction
feature_selection
Highly_variable_genes_selection
dimension_reduction_by_PCA
dimension_reduction_by_diffusionMap
data_transform_matrix_to_CellDataSet
data_transform_matrix_to_SingleCellExperiment
data_transform_matrix_to_seurat
data_transform_CellDataSet_to_SingleCellExperiment
data_transform_SingleCellExperiment_to_CellDataSet
data_transform_mouse_ensemble_to_symbol
clustering_by_SC3
clustering_by_SIMLR
clustering_by_seurat
clustering_by_monocle
marker_identification_by_seurat
marker_identification_by_SC3
pseudotime_based_on_clustering
pseudotime_by_monocle1
pseudotime_by_monocle2
pseudotime_by_SLICER
module_detection_by_WGCNA
module_detection_by_MSIC
DEG_pseudotime
DEG_by_scde
DEG_by_edgeR
scoring_model_by_LR
dataset_human_TFs
dataset_mouse_TFs
dataset_human_Y_chromsome_genes
dataset_mouse_Y_chromsome_genes
dataset_human_to_mouse_genes
dataset_cell_cycle_genes
dataset_cell_cycle_phase_genes
dataset_migration_genes
dataset_ENS_markers
show_notes
tools_study_test
plotting_modules_smooth_curve_across_pseudotime
plotting_violin_plot_of_markers_across_clusters_by_seurat
plotting_boxplot
plotting_violin_plot
plotting_barplot
plotting_volcano_plot
plotting_heatmap
plots_collection_001: dot plot. The SplitDotPlotGG function can be useful for viewing conserved cell type markers across conditions, showing both the expression level and the percentage of cells in a cluster expressing any given gene. Here we plot 2-3 strong marker genes for each of our 13 clusters. See link
plots_collection_002: scatter plots, highlighting genes that exhibit dramatic responses to interferon stimulation. See link
plots_collection_00X
(Source code from CNS paper)
Algorithms_collection_001
Algorithms_collection_00X