-
Notifications
You must be signed in to change notification settings - Fork 0
/
NMF.Rmd
executable file
·110 lines (82 loc) · 2.47 KB
/
NMF.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
---
title: "NMF"
author: "liuc"
date: "11/10/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
NMF的思想:V=WH(W权重矩阵、H特征矩阵、V原矩阵),通过计算从原矩阵提取权重和特征两个不同的矩阵出来。属于一个无监督学习的算法,其中限制条件就是W和H中的所有元素都要大于0.
W又称为基矩阵、H称为系数矩阵。
```{r}
library(NMF)
# introduction to NMF package
# 输入矩阵为常见的gene counts矩阵
# For example, if min(transposed_data) outputs -3.1, I add 3.100001 to the matrix
nmfAlgorithm() # 查看NMF包内的算法
nmfAlgorithm('brunet')
nmfSeed() # 查看seed的用法
# number value is used to set the state of the RNG, and the initialization is performed by the built-in seeding method ’random’
# Note that if the seeding method is deterministic there is no need to perform multiple run anymore
```
```{r}
data(esGolub)
# 为避免过拟合,此处数据可randomized
V.random <- randomize(esGolub)
estim.r <- nmf(esGolub, 2:6,
nrun=30,
seed=123456
)
plot(estim.r)
# plot(2:6,estim.r$measures$cophenetic,
# type="b",col="purple")
# 判断最佳rank值的准则就是,cophenetic值随K变化的最大变动的前点
consensusmap(estim.r,
annCol=esGolub, labCol=NA, labRow=NA
)
```
```{r}
nmfEstimateRank(x, range,
method = nmf.getOption("default.algorithm"), nrun = 30,
model = NULL, ..., verbose = FALSE, stop = FALSE)
```
```{r}
res1 <- nmf(x = esGolub,
rank = 4,
nrun=100,
method = "brunet",
seed=123456)
w <- basis(res1)
h <- coef(res1)
V.hat <- fitted(res1)
fit(res1) # model info
summary(res1) # quality measures
summary(res1, class=esGolub$Cell)
dim(res1[1:10, 1:10])
```
特征提取
```{r}
# only compute the scores
s <- featureScore(res1)
summary(s)
# compute the scores and characterize each metagene
s <- extractFeatures(res1)
str(s)
```
comparing algorithms
```{r}
# To enable a fair comparison, a deterministic seeding method should also be used. Here we fix the random seed to 123456.
res.multi.method <- nmf(esGolub, 3, list('brunet', 'lee', 'ns'),
seed=123456,
.options='t')
compare(res.multi.method)
```
# 针对W H矩阵的热图绘制
```{r}
# basis components
basismap(res1, subsetRow=TRUE)
# mixture coefficients
coefmap(res1)
```