We downloaded 13 single-cell RNA sequencing (scRNA-seq) datasets from the Gene Expression Omnibus (GEO), covering tissues such as blood, nerve, and pancreas. All datasets were collected from healthy donors rather than patients.
The script data_process.R contains the procedure we used to process the datasets.
Dataset | Tissue | Number of cell types |
---|---|---|
GSE67835 | nerve | 9 |
GSE70580 | tonsil | 4 |
GSE73721 | nerve | 5 |
GSE74310 | blood | 2 |
GSE76381 | nerve | 27 |
GSE81252 | liver | 15 |
GSE81608 | pancreas | 8 |
GSE83139 | pancreas | 8 |
GSE84133 | pancreas | 14 |
GSE89232 | blood | 3 |
GSE94820 | blood | 28 |
GSE102956 | nerve | 4 |
GSE113197 | breast | 2 |
Identifying cell-subpopulation-specific expression levels is the first step of the integrative analysis. Intuitively, if a gene is expressed specifically in a cell subpopulation, it is more likely to have a larger effect on, or a higher causal probability for, the GWAS trait. We apply five different methods: zingeR, edgeR, MAST, t-statistics, and high expression (see DE_gene_func.R and de_gene.R).
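
To make the idea of cell-subpopulation-specific expression concrete, here is a minimal base-R sketch of the two simplest scores listed above, a t-statistic and mean ("high") expression. It is an illustration only and not the code in DE_gene_func.R or de_gene.R: the toy count matrix, the labels, and the helper names `welch_t`, `specificity_t` and `high_expr` are all assumptions made for this example. The package-based methods (zingeR, edgeR, MAST) are not shown.

```r
# Toy count matrix (made-up values): 4 genes x 8 cells, two cell types A and B.
expr <- matrix(
  c(20, 22, 25, 21,  2,  3,  1,  2,   # gene1: high in type A
     5,  6,  4,  5,  5,  4,  6,  5,   # gene2: similar in both types
     1,  0,  2,  1,  9, 11, 10, 12,   # gene3: high in type B
     7,  8,  7,  9,  8,  7,  9,  7),  # gene4: similar in both types
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:8))
)
cell_type <- rep(c("A", "B"), each = 4)

# Log-transform the counts before comparing groups (an assumed preprocessing step).
log_expr <- log2(expr + 1)

# Hypothetical helper: Welch t-statistic between two numeric vectors.
welch_t <- function(x, y) {
  (mean(x) - mean(y)) / sqrt(var(x) / length(x) + var(y) / length(y))
}

# Hypothetical helper: per-gene t-statistic, target cell type vs. all other cells.
specificity_t <- function(mat, labels, target) {
  in_target <- labels == target
  apply(mat, 1, function(x) welch_t(x[in_target], x[!in_target]))
}

# Hypothetical helper: "high expression" score, the mean expression in the target type.
high_expr <- function(mat, labels, target) {
  rowMeans(mat[, labels == target, drop = FALSE])
}

t_A    <- specificity_t(log_expr, cell_type, "A")
mean_A <- high_expr(log_expr, cell_type, "A")

# Rank genes by specificity for cell type A (largest t-statistic first).
print(sort(t_A, decreasing = TRUE))
print(sort(mean_A, decreasing = TRUE))
```

In this toy example gene1 ranks first for cell type A under both scores and gene3 has the most negative t-statistic. On real data the two scores can disagree: a gene that is highly expressed in every cell type scores well on mean expression but not on the t-statistic.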