Plotting Normalized Counts
Sometimes we want to look at a specific gene, or a group of genes within a common pathway, and see how expression differs between conditions. BinfTools lets us visualize gene expression as heatmaps or violin plots using the following functions:
- zheat()
- count_plot()
To create a heatmap, use the zheat() function, which takes normalized counts and uses pheatmap() to plot z-score-normalized gene expression. This function has the following arguments:
- genes A character vector of genes (matching rownames(norm_counts)) that indicates the genes to subset for the heatmap
- bnorm A Boolean indicating whether the z-score normalization should occur before subsetting genes. Leave as TRUE.
- counts A data.frame of normalized counts, with genes as rows and samples as columns.
- conditions A character vector indicating conditions belonging to each sample (same order as colnames(norm_counts))
- con A character string matching the control condition in conditions. Default is "WT".
- title Character indicating the title of the heatmap
- labgenes A character vector (matching rownames(norm_counts)) indicating the genes to be labelled on the heatmap. If left NULL, all genes will be labelled. If no genes are to be labelled in the heatmap use labgenes="".
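The arguments above assume inputs of a particular shape. As a minimal sketch (the gene names, sample names, and values below are made up purely for illustration), norm_counts and cond might look like this:

```r
# Toy inputs for illustration only; gene/sample names and values are hypothetical.
set.seed(1)
norm_counts <- data.frame(
  WT_1 = rpois(6, 100), WT_2 = rpois(6, 100),
  KO_1 = rpois(6, 150), KO_2 = rpois(6, 150),
  row.names = paste0("gene", 1:6)
)

# One condition label per sample, in the same order as colnames(norm_counts)
cond <- c("WT", "WT", "KO", "KO")

# Sanity checks worth running before plotting
stopifnot(length(cond) == ncol(norm_counts))
stopifnot("WT" %in% cond)  # the control passed as `con` should appear in `cond`
```

Checking that the condition vector lines up with the columns up front tends to catch mislabelled samples before they show up as a confusing plot.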
#If we want to subset the differentially expressed genes for a heatmap
DEG<-rownames(subset(res, padj<0.05 & abs(log2FoldChange)>log2(1.5)))
#Genes to label = top 5 significant DEGs
genes<-rownames(res[order(res$padj),])[1:5]
#Create the heatmap of DEGs
zheat(genes=DEG, counts=norm_counts, conditions=cond, con="WT", title="DEGs", labgenes=genes)
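To make the z-score normalization concrete, here is a sketch of row-wise (per-gene) z-scoring in base R. It illustrates the general idea only and is not taken from the BinfTools source; the matrix values are made up.

```r
# Toy matrix: 3 genes (rows) x 4 samples (columns); values are made up.
counts <- matrix(
  c(10, 12, 30, 28,
    200, 210, 90, 100,
    5, 5, 6, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("WT_1", "WT_2", "KO_1", "KO_2"))
)

# scale() centers and scales columns, so transpose to scale each gene (row),
# then transpose back to the genes-by-samples layout.
z <- t(scale(t(counts)))

# Each gene now has mean 0 and standard deviation 1 across samples
round(rowMeans(z), 10)
apply(z, 1, sd)
```

Because each gene is rescaled to its own mean and spread, genes with very different absolute expression become comparable on one colour scale. A gene with identical values in every sample has zero variance and would come out as NaN under this approach.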
To create a violin plot comparing normalized counts of a subset of genes between conditions, use count_plot(). This function scales the counts using either z-score or log10 normalization (z-score is recommended) and performs pairwise t-tests between conditions, with multiple-testing correction where applicable. The arguments are as follows:
- counts The data.frame object of normalized counts
- scaling One of "zscore" or "log10" indicating how the gene expression across samples should be scaled. Default and recommended is "zscore"
- genes Character vector matching rownames(norm_counts) indicating the group of genes to subset for this analysis/plot
- condition Character vector indicating the condition of each sample, in the same order as colnames(norm_counts).
- title Character indicating the title of the plot
- compare List of character vectors of length two indicating the conditions for pairwise comparisons. If left NULL, all possible comparisons will be made.
#Look at gene expression between groups for genes involved in a given pathway (from gmt file)
genes<-unique(unlist(qusage::read.gmt("geneset.gmt")))
#Make the count plot
count_plot(counts=norm_counts, scaling="zscore", genes=genes, condition=cond, title="Rhodopsin Genes", compare=list(c("WT","KO")))
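For readers unfamiliar with pairwise t-tests and multiple-testing correction, the following sketch shows the general technique using pairwise.t.test() from base R's stats package. It is not a reproduction of what count_plot() does internally: the simulated values, the third "HET" condition, and the choice of Benjamini-Hochberg adjustment are all assumptions made for illustration.

```r
# Toy data: 10 simulated expression values per condition (values are not real).
set.seed(42)
expr <- c(rnorm(10, mean = 0), rnorm(10, mean = 1), rnorm(10, mean = 0.2))
group <- factor(rep(c("WT", "KO", "HET"), each = 10))

# Pairwise t-tests between all conditions, adjusting p-values for the number
# of comparisons. "BH" is an assumed choice; see ?p.adjust for alternatives.
res_tt <- pairwise.t.test(expr, group, p.adjust.method = "BH")

# Lower-triangular matrix of adjusted p-values, one entry per pair of conditions
res_tt$p.value
```

With only two conditions there is a single comparison, so the adjustment leaves the p-value unchanged; correction matters once three or more conditions are compared.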