-
Notifications
You must be signed in to change notification settings - Fork 2
/
README.Rmd
129 lines (101 loc) · 4.94 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
---
output: github_document
bibliography: '`r system.file("reference.bib", package = "ferrn")`'
editor_options:
chunk_output_type: console
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
message = FALSE
)
```
# ferrn <a href='https://huizezhang-sherry.github.io/ferrn/'><img src='man/figures/logo.png' align="right" height="138.5" /></a>
<!-- badges: start -->
[![R build status](https://github.com/huizezhang-sherry/ferrn/workflows/R-CMD-check/badge.svg)](https://github.com/huizezhang-sherry/ferrn/actions)
<!-- badges: end -->
The **ferrn** package extracts key components from the data object collected during projection pursuit (PP) guided tour optimisation, produces diagnostic plots, and calculates PP index scores.
## Installation
You can install the development version of ferrn from [GitHub](https://github.com/) with:
```{r eval = FALSE}
# install.packages("remotes")
remotes::install_github("huizezhang-sherry/ferrn")
```
## Visualise PP optimisation
The data object collected during a PP optimisation can be obtained by assigning the `tourr::annimate_xx()` function a name. In the following example, the projection pursuit is finding the best projection basis that can detect multi-modality for the `boa5` dataset using the `holes()` index function and the optimiser `search_better`:
```{r eval = FALSE}
set.seed(123456)
holes_1d_better <- animate_dist(
ferrn::boa5,
tour_path = guided_tour(holes(), d = 1, search_f = search_better),
rescale = FALSE)
holes_1d_better
```
The data structure includes the `basis` sampled by the optimiser, their corresponding index values (`index_val`), an `information` tag explaining the optimisation states, and the optimisation `method` used (`search_better`). The variables `tries` and `loop` describe the number of iterations and samples in the optimisation process, respectively. The variable `id` serves as the global identifier.
The best projection basis can be extracted via
```{r get-best}
library(ferrn)
library(dplyr)
holes_1d_better %>% get_best()
holes_1d_better %>% get_best() %>% pull(basis) %>% .[[1]]
holes_1d_better %>% get_best() %>% pull(index_val)
```
The trace plot can be used to view the optimisation progression:
```{r trace-plot}
holes_1d_better %>%
explore_trace_interp() +
scale_color_continuous_botanical()
```
Different optimisers can be compared by plotting their projection bases on the reduced PCA space. Here `holes_1d_geo` is the data obtained from the same PP problem as `holes_1d_better` introduced above, but with a `search_geodesic` optimiser. The 5 $\times$ 1 bases from the two datasets are first reduced to 2D via PCA, and then plotted to the PCA space. (PP bases are ortho-normal and the space for $n \times 1$ bases is an $n$-d sphere, hence a circle when projected into 2D.)
```{r pca-plot}
bind_rows(holes_1d_geo, holes_1d_better) %>%
bind_theoretical(matrix(c(0, 1, 0, 0, 0), nrow = 5),
index = tourr::holes(), raw_data = boa5) %>%
explore_space_pca(group = method, details = TRUE) +
scale_color_discrete_botanical()
```
The same set of bases can be visualised in the original 5-D space via tour animation:
```{r tour-anim, eval = FALSE}
bind_rows(holes_1d_geo, holes_1d_better) %>%
explore_space_tour(flip = TRUE, group = method,
palette = botanical_palettes$fern[c(1, 6)],
max_frames = 20,
point_size = 2, end_size = 5)
```
```{r eval = FALSE, echo = FALSE}
prep <- prep_space_tour(dplyr::bind_rows(holes_1d_better, holes_1d_geo),
flip = TRUE, group = method,
palette = botanical_palettes$fern[c(1,6)],
axes = "bottomleft",
point_size = 2, end_size = 5)
# render gif
set.seed(123456)
render_gif(
prep$basis,
tour_path = grand_tour(),
display = display_xy(col = prep$col, cex = prep$cex, pch = prep$pch,
edges = prep$edges, edges.col = prep$edges_col,
axes = "bottomleft"),
rescale = FALSE,
frames = 20,
gif_file = here::here("man", "figures","tour.gif")
)
```
<p float="center">
<img src="man/figures/tour.gif">
</p>
<!-- ## Calculate PP index scores -->
<!-- Properties of PP index described in @laa_using_2020s includes smoothness, squintability, flexibility, rotation invariance, and speed. Here implementations are provided to calculate smoothness and squintability scores. -->
<!-- ```{r} -->
<!-- # define the holes index as per tourr::holes -->
<!-- holes <- tourr::holes -->
<!-- basis_smoothness <- sample_bases(idx = "holes") -->
<!-- calc_smoothness(basis_smoothness) -->
<!-- basis_squint <- sample_bases(idx = "holes", n_basis = 100, step_size = 0.01, min_proj_dist = 1.5) -->
<!-- calc_squintability(basis_squint, method = "ks", bin_width = 0.01) -->
<!-- ``` -->
# Reference