*This code was tested and run in the **R environment**.*
**Simple Hierarchical Cluster Analysis**
> Import Packages
```
library(tidyverse) # data manipulation
library(cluster) # clustering algorithms
library(factoextra) # clustering visualization
library(dendextend) # for comparing two dendrograms
```
> Hierarchical Clustering Algorithms
1) Agglomerative clustering (AGNES)
2) Divisive hierarchical clustering (DIANA)
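The two approaches differ in the direction in which the tree is built: AGNES starts from single observations and merges upward, while DIANA starts from one all-inclusive cluster and splits downward. A minimal sketch on four made-up one-dimensional points (the vector `toy` is hypothetical and not part of the assignment data), assuming only the `cluster` package is installed:

```r
library(cluster)  # provides agnes() and diana()

# Hypothetical toy data: two obvious groups on a line
toy <- c(1, 2, 10, 11)
toy_dist <- dist(toy)

# Bottom-up (agglomerative) and top-down (divisive) hierarchies
toy_agnes <- agnes(toy_dist)  # default linkage is "average"
toy_diana <- diana(toy_dist)

# Cut both trees into two clusters and compare the memberships
groups_agnes <- cutree(as.hclust(toy_agnes), k = 2)
groups_diana <- cutree(as.hclust(toy_diana), k = 2)
print(groups_agnes)
print(groups_diana)
```

On data this clearly separated, both directions recover the same two groups; on real data the two trees can disagree.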
> Import Data
```
# Seven observations (rows A-G), four measurements each (columns)
seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")
colnames(seven_matrix) <- c("Body_length", "Tail_length", "Wingspan", "Beak_length")
```
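Euclidean distance is sensitive to the range of each column, so a wide-ranging measurement such as body length (8 to 23) weighs more than beak length (3 to 8). The analysis here works on the raw values; the sketch below only illustrates what standardizing with base R's `scale()` would look like, should equal weighting be wanted. The name `seven_scaled` is illustrative.

```r
# Same data as above, repeated so the sketch runs on its own
seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")
colnames(seven_matrix) <- c("Body_length", "Tail_length", "Wingspan", "Beak_length")

# Center each column to mean 0 and rescale it to standard deviation 1
seven_scaled <- scale(seven_matrix)

# Check the result: column means near 0, standard deviations equal to 1
print(round(colMeans(seven_scaled), 10))
print(apply(seven_scaled, 2, sd))

# Distances on the standardized values (optional alternative to the raw data)
print(round(dist(seven_scaled, method = "euclidean"), 3))
```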
> Hierarchical Clustering with R
***1) Agglomerative Hierarchical Clustering***
```
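# Dissimilarity matrix
# (the object shares its name with the dist() function; calls to dist() still resolve to the function)
dist <- dist(seven_matrix, method = "euclidean")
# Hierarchical clustering using complete linkage
hc1 <- hclust(dist, method = "complete")
# Plot the obtained dendrogram (labels aligned at the baseline)
plot(hc1, cex = 0.6, hang = -1)
# Same tree drawn horizontally
plot(as.dendrogram(hc1), horiz = TRUE)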
```
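A dendrogram becomes a concrete grouping once it is cut into a chosen number of clusters. A minimal sketch using `cutree()` from base R's stats package, assuming two clusters are wanted (the choice `k = 2` and the names `d`, `hc`, and `groups` are illustrative, not part of the original analysis):

```r
# Same data as above, repeated so the sketch runs on its own
seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")

d  <- dist(seven_matrix, method = "euclidean")
hc <- hclust(d, method = "complete")

# Cut the tree into k = 2 clusters; the result maps each row to a cluster id
groups <- cutree(hc, k = 2)
print(groups)

# Number of observations per cluster
print(table(groups))
```

With this data, B and D fall into one cluster and the remaining five observations into the other.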
- Agnes
```
# Compute with agnes
hc2 <- agnes(dist, method = "complete")
# Agglomerative coefficient (values closer to 1 suggest stronger clustering structure)
hc2$ac
## [1] 0.8213396
# Optional horizontal dendrogram:
# plot(as.dendrogram(hc2), horiz = TRUE)
```
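The agglomerative coefficient can be reproduced by hand, which makes its meaning clearer: for each observation, take the height at which it is first merged, divide by the height of the final merge, and average one minus that ratio. A sketch under the assumption that `hclust()` and `agnes()` build the same complete-linkage tree for this data; the names `d`, `hc`, `first_merge_height`, and `ac_manual` are illustrative:

```r
library(cluster)  # provides agnes()

seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")

d  <- dist(seven_matrix, method = "euclidean")
hc <- hclust(d, method = "complete")

# In hc$merge, a negative entry -j marks the step at which singleton j
# is merged for the first time; hc$height holds the height of each step
first_merge_height <- sapply(seq_len(nrow(seven_matrix)), function(j) {
  step <- which(hc$merge == -j, arr.ind = TRUE)[1, "row"]
  hc$height[step]
})

# Average of (1 - first merge height / final merge height)
ac_manual <- mean(1 - first_merge_height / max(hc$height))
ac_agnes  <- agnes(d, method = "complete")$ac
print(c(manual = ac_manual, agnes = ac_agnes))
```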
- Ward's method identifies the strongest clustering structure of the methods assessed (highest agglomerative coefficient)
A) Methods to assess
```
m <- c( "average", "single", "complete", "ward")
names(m) <- c( "average", "single", "complete", "ward")
```
B) Function to compute the agglomerative coefficient
```
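# Agglomerative coefficient of agnes for linkage method x
ac <- function(x) {
  agnes(dist, method = x)$ac
}
# One coefficient per method, returned as a named numeric vector
map_dbl(m, ac)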
```
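Once the coefficients are computed, the linkage with the highest value can be picked programmatically. A minimal sketch that uses base R's `vapply()` in place of `map_dbl()` so that it needs only the `cluster` package; the names `d`, `linkages`, `ac_values`, and `best_method` are illustrative:

```r
library(cluster)  # provides agnes()

seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")

d <- dist(seven_matrix, method = "euclidean")

# Linkage methods to compare, named so the result is labeled
linkages <- c(average = "average", single = "single",
              complete = "complete", ward = "ward")

# Agglomerative coefficient for each linkage method
ac_values <- vapply(linkages, function(x) agnes(d, method = x)$ac, numeric(1))
print(round(ac_values, 4))

# Name of the method with the highest coefficient
best_method <- names(which.max(ac_values))
print(best_method)
```

The coefficient tends to grow with the number of observations, so it is best used to compare methods on the same data rather than across datasets.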
- Dendrogram Visualization
```
hc3 <- agnes(dist, method = "ward")
pltree(hc3, cex = 0.6, hang = -1, main = "Dendrogram of agnes")
plot(as.dendrogram(hc3),horiz=TRUE)
```
***2) Divisive Hierarchical Clustering***
A) Compute divisive hierarchical clustering
```
hc4 <- diana(dist)
# Divisive coefficient
hc4$dc
## [1] 0.8514345
```
B) Plot dendrogram
```
pltree(hc4, cex = 0.6, hang = -1, main = "Dendrogram of diana")
plot(as.dendrogram(hc4),horiz=TRUE)
```
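The `dendextend` package is loaded at the top for comparing two dendrograms, but no comparison is made. One way to compare the agglomerative and divisive trees without relying on `dendextend` is the correlation between their cophenetic distances, using only base R and `cluster`. A sketch under that assumption; `hc_agg`, `hc_div`, and `coph_cor` are illustrative names:

```r
library(cluster)  # provides diana() and the as.hclust() method for its result

seven_matrix <- matrix(c(10,  8, 15, 3,
                         21, 15, 21, 6,
                          8,  8, 10, 4,
                         23, 17, 19, 8,
                         12,  9, 10, 3,
                         13, 10,  9, 3,
                          9,  6,  7, 4), nrow = 7, byrow = TRUE)
rownames(seven_matrix) <- c("A", "B", "C", "D", "E", "F", "G")

d <- dist(seven_matrix, method = "euclidean")

# Agglomerative tree (complete linkage) and divisive tree as hclust objects
hc_agg <- hclust(d, method = "complete")
hc_div <- as.hclust(diana(d))

# Cophenetic distance: the height at which two observations first share a cluster.
# Correlating the two sets of cophenetic distances measures how similar the trees are.
coph_cor <- cor(as.vector(cophenetic(hc_agg)), as.vector(cophenetic(hc_div)))
print(coph_cor)
```

A value near 1 means the two trees place the observations in nearly the same nested groups.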
> References
This code is adapted from **https://uc-r.github.io/hc_clustering**, applied to a different dataset from my school assignment.