Iscandar (Interactive Single Cell Data Analysis Report) is a set of Python scripts and HTML/JavaScript files used to create an interactive report for single cell RNA-seq analysis. It can be used as a standalone application to look up gene expression and gene set expression profiles on clusterings that have already been performed (it takes PCA and TSNE coordinates as input, rather than performing any clustering itself). Download a demo (5 MB) with a full published dataset (unzip and open Report.html in your browser).
- Show two types of clustering (usually PCA and TSNE) on interactive scatter plots (using plotly). There is also a side-by-side view so that comparisons can be made between the two.
- Lasso points on the screen and define these as new clusters. On the side-by-side view of the plots, lassoing one plot automatically shows corresponding points in the other.
- Look up gene expression as a colour gradient on each plot.
- Look up mean expression of a gene set as a colour gradient on each plot.
- View gene vs gene or gene vs gene set (mean) scatter plots.
- Download a plot as a PNG image (a function provided by plotly).
Clone this repo and populate all the files in the input directory. Then run:
python create_data_model.py
That's all. This Python script reads all the files in the input directory and overwrites output/js/data-model.js. The output directory contains all the files required for the report and can be sent to the user, who only needs to open Report.html in a browser to use the report.
Another option is to bypass input file creation and do the following in your own python script (or jupyter notebook):
from create_data_model import DataModel
dm = DataModel()
dm.metadata = {'name':'pera', ...}
dm.analysisMetadata = {...}
...
dm.saveJSFile()
Here we assign each of the required input variables directly to a DataModel instance and call its saveJSFile() method, which produces the same result.
Each file comes with example data so that its required format can be easily worked out. More detailed descriptions follow. All tables use tab as the column separator.
Contains a description of the analysis performed, as a two-column table of key-value pairs. None of the keys are required.
Clusters are sample assignments, often made computationally rather than originating from sample metadata (which belongs to sample groups - see below). This file contains the names of clusters and which items belong to each cluster, as well as what colour to use for each item in the cluster.
Mapping of sample ids to cluster items. Ensure that column headers here match cluster names found in clusterItems.txt.
Expression matrix in the usual format of genes as row ids and sample ids as columns. Note that the user can only search for the expression of a gene if that gene occurs as a row id in this matrix. So if gene ids are used, the user has to search with the same ids. Also note that the larger this matrix is, the larger the report will be and the longer it will take to load, as this matrix takes up the vast majority of the data.
Gene sets enable Iscandar to show the mean expression of all genes in a gene set for each sample. This file contains a list of gene sets in a three-column table, where the first column is the name of the gene set, the second is the list of genes joined by commas, and the third is the mean expression value of the gene set for each cell, joined by commas (so these values should follow the ordering of the sample ids in pca.txt). Note that the genes here can differ from the row ids of the expression matrix, as the report does not compute the mean expression values but just uses the values supplied in the third column.
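As an illustration of how the third column could be produced, here is a minimal sketch in plain Python. The function name `gene_set_row`, the nested-dict layout of `expression`, and the number formatting are assumptions made for this example, not part of Iscandar; check the example input file for whether a header row is expected.

```python
from statistics import mean

def gene_set_row(name, genes, expression, sample_ids):
    """Return one tab-separated row: set name, genes, per-sample mean expression."""
    # Only genes that exist in the expression data contribute to the mean.
    present = [g for g in genes if g in expression]
    if not present:
        raise ValueError(f"no genes from {name!r} found in the expression data")
    # One mean per sample, in the order given by sample_ids.
    means = [mean(expression[g][s] for g in present) for s in sample_ids]
    return "\t".join([name, ",".join(genes), ",".join(f"{m:.4g}" for m in means)])

# Toy data (hypothetical): expression[gene][sample_id]
expression = {
    "GeneA": {"s1": 1.0, "s2": 3.0},
    "GeneB": {"s1": 3.0, "s2": 5.0},
}
sample_ids = ["s1", "s2"]  # should follow the row order of pca.txt
row = gene_set_row("toySet", ["GeneA", "GeneB"], expression, sample_ids)
print(row)
```

Because the report only reads the supplied values, any other summary could be substituted for the mean here, as long as there is one value per sample in the pca.txt order.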
Description of the dataset in a two-column table format. The "name" key is the only required key and is used by the app.
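A quick way to check such a two-column table before building the report might look like the sketch below. The helper names `read_key_value_table` and `check_metadata` are hypothetical, and the parsing simply treats every tab-separated line with at least two fields as a key-value pair; it is not how create_data_model.py necessarily reads the file.

```python
import csv
import io

def read_key_value_table(handle):
    """Parse a tab-separated two-column table into a dict of key -> value."""
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:  # skip blank or malformed lines
            table[row[0]] = row[1]
    return table

def check_metadata(table):
    """Raise ValueError unless the required "name" key is present and non-empty."""
    if not table.get("name"):
        raise ValueError('metadata table must define a non-empty "name" key')
    return table

# Toy input (hypothetical content); a real file would be opened with open(path, newline="")
example = io.StringIO("name\tdemo\ndescription\ttoy data\n")
metadata = check_metadata(read_key_value_table(example))
print(metadata["name"])
```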
PCA coordinates used by the report to show the PCA plot, as an Nx2 table, where N is the number of samples. Note that the ordering of sample ids in this table is assumed for every list that requires a sample id list. Even though this is called pca, it can hold some other coordinates, such as MDS - just make a note in analysisMetadata.txt.
Contains the names of sample groups and which items belong to each sample group, as well as what colour to use for each item in the sample group. A 'sampleGroup' is some experimental grouping of samples, such as celltype, timepoint, etc. Each sample group comprises the items specified in the second column. Note that the ordering of the items here is the order in which the traces for that sample group are drawn in the report.
Mapping of sample ids to sample group items. Ensure that column headers here match sample group names found in sampleGroupItems.txt.
TSNE coordinates used by the report to show the TSNE plot, as an Nx2 table, where N is the number of samples. This table should have the same row indices as pca.txt.
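The requirement that both coordinate tables share the same row indices can be checked before generating the report. The following is a minimal sketch, assuming that the first line of each table is a header, that the first column holds the sample id, and that the order must match as well; the function names are hypothetical.

```python
import csv
import io

def read_row_ids(handle):
    """Return the first column of a tab-separated table, skipping the header row."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # assumption: the first line is a header
    return [row[0] for row in reader if row]

def check_same_samples(pca_handle, tsne_handle):
    """Raise ValueError if the two tables do not list the same ids in the same order."""
    pca_ids = read_row_ids(pca_handle)
    tsne_ids = read_row_ids(tsne_handle)
    if pca_ids != tsne_ids:
        raise ValueError("pca and tsne tables have different sample ids or ordering")
    return pca_ids

# Toy tables (hypothetical content) standing in for pca.txt and tsne.txt
pca = io.StringIO("\tx\ty\ns1\t0.1\t0.2\ns2\t0.3\t0.4\n")
tsne = io.StringIO("\tx\ty\ns1\t5.0\t6.0\ns2\t7.0\t8.0\n")
ids = check_same_samples(pca, tsne)
print(ids)
```

With real files, the two `io.StringIO` objects would be replaced by `open("input/pca.txt", newline="")` and the equivalent for the TSNE file, adjusting the paths to your checkout.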
I'm just adding some useful notes for development of Iscandar here.
See data-model.js for an example of how this is done. Summary: Suppose we want to add a separate javascript file called test.js, whose content is:
"use strict";
define(function(require, exports, module) {
    module.exports = {
        data: ['a','b'],
        hello: function() {
            return 'hello';
        },
    };
});
We want to include it in main.js so that we can access its data and functions. Modify main.js as follows:
require.config({
    baseUrl: 'js',
    paths: {
        d3: 'd3.v3.min',
        vue: 'vue.min',
        plotly: 'plotly-latest.min',
        test: 'test',
    },
});
define(['d3', 'vue.min', 'plotly-latest.min', 'test'], function(d3, Vue, Plotly, Test) {
    console.log(Test.hello());
});
Note that if the component's template is defined in the html, use inline-template attribute on the tag if it occurs inside the dom element of the main Vue app. See https://sebastiandedeyne.com/posts/2016/dealing-with-templates-in-vue-20 for more info.
The lasso dialog is an example of a component that is not defined inline. Note that it needs to be inside wrapperDiv, but it doesn't work inside another component. Some useful info here: https://adamwathan.me/2016/01/04/composing-reusable-modal-dialogs-with-vuejs/
Plotly.newPlot("plotDiv", ...) redraws over "#plotDiv", and seems fast enough without having to delete and add traces. To delete traces, you have to supply an array of the indices of the traces to delete, e.g. [0,3]. So if there are 5 traces and you want to delete them all, the indices must be [0,1,2,3,4]:
Plotly.deleteTraces("plotDiv", [0,1,2,3,4]);
The Plotly.downloadImage() function automatically triggers a file download in the browser:
Plotly.downloadImage(document.getElementById("plotDiv"), {format: 'png', width: 1000, height: 700, filename: "myfile"});
Jarny Choi, stemformatics.org (jarny@stemformatics.org)
- v0.1.2 - Fixed a bug where creation date showed today's date. Added help text in many parts of the pages. Disabled lasso for now.