Plotting Normalized Counts

kevincjnixon edited this page Jan 12, 2021 · 2 revisions

Sometimes we want to look at one or more specific genes, or a group of genes within a common pathway, and see how they are expressed under different conditions. There are a number of ways to do this. BinfTools lets us visualize gene expression as heatmaps or violin plots with the following functions:

  • zheat()
  • count_plot()

zheat()

To create a heatmap, use the zheat() function, which takes normalized counts and uses pheatmap() to draw a heatmap of z-score-normalized gene expression. This function has the following arguments:

  • genes A character vector of genes (matching rownames(norm_counts)) that indicates the genes to subset for the heatmap
  • bnorm A Boolean indicating whether the z-score normalization should occur before subsetting genes. Leave as TRUE.
  • counts A data.frame of normalized counts, with genes as rows and samples as columns.
  • conditions A character vector indicating conditions belonging to each sample (same order as colnames(norm_counts))
  • con A character matching the control condition in conditions. Default is "WT".
  • title Character indicating the title of the heatmap
  • labgenes A character vector (matching rownames(norm_counts)) indicating the genes to be labelled on the heatmap. If left NULL, all genes will be labelled. To label no genes, use labgenes="".
#If we want to subset the differentially expressed genes for a heatmap
DEG<-rownames(subset(res, padj<0.05 & abs(log2FoldChange)>log(1.5,2)))
#Genes to label = top 5 significant DEGs
genes<-rownames(res[order(res$padj),])[1:5]
#Create the heatmap of DEGs
zheat(genes=DEG, counts=norm_counts, conditions=cond, con="WT", title="DEGs", labgenes=genes) 

Heatmap
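
To make "z-score-normalized gene expression" concrete, here is a minimal base R sketch of per-gene z-scoring. It assumes the z-score is computed for each gene across samples; the internals of zheat() are not shown on this page, so this is an illustration of the idea rather than the function's actual code. The toy_counts data frame and its values are made up for the example.

```r
# Toy normalized counts: 3 genes (rows) x 4 samples (columns).
# These values are invented purely for illustration.
toy_counts <- data.frame(
  WT_1 = c(10, 200, 5),
  WT_2 = c(12, 180, 7),
  KO_1 = c(30, 90, 6),
  KO_2 = c(28, 110, 6),
  row.names = c("geneA", "geneB", "geneC")
)

# scale() centers and scales columns, so transpose to put genes in columns,
# scale, then transpose back to genes-as-rows.
z <- t(scale(t(as.matrix(toy_counts))))

# Each gene (row) now has mean 0 and standard deviation 1 across samples.
print(round(z, 2))
```

After this transformation, colors in a heatmap reflect how far each sample sits from that gene's own average, so highly and lowly expressed genes can be compared on the same scale.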

count_plot()

To create a violin plot comparing normalized counts of a subset of genes between conditions, use count_plot(). This function scales the counts using either z-score or log10 normalization (z-score is recommended) and performs pairwise t-tests (with multiple-testing correction, if applicable) between conditions. The arguments are as follows:

  • counts The data.frame object of normalized counts
  • scaling One of "zscore" or "log10" indicating how the gene expression across samples should be scaled. Default and recommended is "zscore"
  • genes Character vector matching rownames(norm_counts) indicating the group of genes to subset for this analysis/plot
  • condition Character vector indicating the condition of each sample, in the same order as colnames(norm_counts).
  • title Character indicating the title of the plot
  • compare List of character vectors of length two indicating the conditions for pairwise comparisons. If left NULL, all possible comparisons will be made.
#Look at gene expression between groups for genes involved in a given pathway (from gmt file)
genes<-unique(unlist(qusage::read.gmt("geneset.gmt")))
#Make the count plot
count_plot(counts=norm_counts, scaling="zscore", genes=genes, condition=cond, title="Rhodopsin Genes", compare=list(c("WT","KO")))

count plot
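
The shape of the compare argument and the idea of pairwise t-tests with multiple-testing correction can be sketched in base R as follows. This is an illustration under assumptions, not count_plot()'s own code: the page does not say which correction method the function uses, so the Benjamini-Hochberg ("BH") choice here is an assumption, and the condition labels and expression values are made up.

```r
# Hypothetical condition labels for six samples (illustrative only)
cond <- c("WT", "WT", "KO", "KO", "OE", "OE")

# Every pairwise comparison, as a list of length-two character vectors.
# This is the same shape as the compare argument described above.
all_pairs <- combn(unique(cond), 2, simplify = FALSE)
print(all_pairs)

# Made-up scaled expression values, four per condition
values <- c(1.0, 1.2, 0.9, 1.1,   # WT
            2.0, 2.3, 1.9, 2.2,   # KO
            1.5, 1.4, 1.6, 1.7)   # OE
groups <- rep(c("WT", "KO", "OE"), each = 4)

# Pairwise t-tests from the stats package; "BH" is an assumed correction method
pt <- pairwise.t.test(values, groups, p.adjust.method = "BH")
print(pt$p.value)
```

Restricting compare to the pairs of interest, as in the count_plot() call above, keeps the plot readable when there are many conditions.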