update readme
huizezhang-sherry committed Jun 18, 2024
1 parent b53b845 commit 4d18876
Showing 11 changed files with 169 additions and 71 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -44,5 +44,6 @@ Suggests:
pkgdown,
testthat,
forcats,
patchwork
patchwork,
future.apply
Language: en-GB
12 changes: 1 addition & 11 deletions R/calc-squintability.R
@@ -35,17 +35,7 @@
#' @examples
#' # define the holes index as per tourr::holes
#' library(GpGp)
#' holes <- function() {
#' function(mat) {
#' n <- nrow(mat)
#' d <- ncol(mat)
#'
#' num <- 1 - 1 / n * sum(exp(-0.5 * rowSums(mat^2)))
#' den <- 1 - exp(-d / 2)
#'
#' num / den
#' }
#' }
#' library(tourr)
#' basis_smoothness <- sample_bases(idx = "holes")
#' calc_smoothness(basis_smoothness)
#' basis_squint <- sample_bases(idx = "holes", n_basis = 100, step_size = 0.01, min_proj_dist = 1.5)
50 changes: 37 additions & 13 deletions README.Rmd
@@ -1,5 +1,6 @@
---
output: github_document
bibliography: '`r system.file("reference.bib", package = "ferrn")`'
editor_options:
chunk_output_type: console
---
@@ -21,7 +22,7 @@ knitr::opts_chunk$set(
[![R build status](https://github.com/huizezhang-sherry/ferrn/workflows/R-CMD-check/badge.svg)](https://github.com/huizezhang-sherry/ferrn/actions)
<!-- badges: end -->

The **ferrn** package extracts key components in the data object collected by the guided tour optimisation, and produces diagnostic plots. An associated paper can be found at <https://journal.r-project.org/archive/2021/RJ-2021-105/index.html>.
The **ferrn** package extracts key components from the data object collected during projection pursuit (PP) guided tour optimisation, produces diagnostic plots, and calculates PP index scores.

## Installation

@@ -33,22 +34,22 @@ remotes::install_github("huizezhang-sherry/ferrn")
```


## Usage
## Visualise PP optimisation

To extract the data object from a guided tour, assign the `annimate_xx()` function a name:
The data object collected during a PP optimisation can be obtained by assigning the output of a `tourr::animate_xx()` function to a name. In the following example, the projection pursuit searches for the projection basis that best detects multi-modality in the `boa5` dataset, using the `holes()` index function and the optimiser `search_better`:

```{r eval = FALSE}
set.seed(123456)
holes_1d_better <- animate_dist(
ferrn::boa5,
tour_path = guided_tour(holes(), d = 1,
search_f = search_better),
tour_path = guided_tour(holes(), d = 1, search_f = search_better),
rescale = FALSE)
holes_1d_better
```

The above code will collect data from the 1D animation on `boa5` dataset, a simulated data in the `ferrn` package.
The data structure includes the `basis` sampled by the optimiser, their corresponding index values (`index_val`), an `information` tag explaining the optimisation states, and the optimisation `method` used (`search_better`). The variables `tries` and `loop` describe the number of iterations and samples in the optimisation process, respectively. The variable `id` serves as the global identifier.

The best projection basis found by the projection pursuit algorithm can be extracted via
The best projection basis can be extracted via

```{r get-best}
library(ferrn)
@@ -58,17 +59,15 @@ holes_1d_better %>% get_best() %>% pull(basis) %>% .[[1]]
holes_1d_better %>% get_best() %>% pull(index_val)
```


Trace plot for viewing the optimisation progression with botanical palette:
The trace plot can be used to view the optimisation progression:

```{r trace-plot}
holes_1d_better %>%
explore_trace_interp() +
scale_color_continuous_botanical()
```

Compare two algorithms via plotting the projection bases on the reduced PCA space:

Different optimisers can be compared by plotting their projection bases on the reduced PCA space. Here `holes_1d_geo` is the data obtained from the same PP problem as `holes_1d_better` introduced above, but with a `search_geodesic` optimiser. The 5 $\times$ 1 bases from the two datasets are first reduced to 2D via PCA, and then plotted to the PCA space. (PP bases are ortho-normal and the space for $n \times 1$ bases is an $n$-d sphere, hence a circle when projected into 2D.)

```{r pca-plot}
bind_rows(holes_1d_geo, holes_1d_better) %>%
@@ -78,8 +77,7 @@ bind_rows(holes_1d_geo, holes_1d_better) %>%
scale_color_discrete_botanical()
```


View the projection bases on its original 5-D space via tour animation:
The same set of bases can be visualised in the original 5-D space via tour animation:

```{r tour-anim, eval = FALSE}
bind_rows(holes_1d_geo, holes_1d_better) %>%
@@ -114,3 +112,29 @@ render_gif(
<p float="center">
<img src="man/figures/tour.gif">
</p>

## Calculate PP index scores

The properties of a PP index described in @laa_using_2020s include smoothness, squintability, flexibility, rotation invariance, and speed. Implementations are provided here to calculate smoothness and squintability scores.

```{r}
# define the holes index as per tourr::holes
holes <- function() {
function(mat) {
n <- nrow(mat)
d <- ncol(mat)
num <- 1 - 1 / n * sum(exp(-0.5 * rowSums(mat^2)))
den <- 1 - exp(-d / 2)
num / den
}
}
basis_smoothness <- sample_bases(idx = "holes")
calc_smoothness(basis_smoothness)
basis_squint <- sample_bases(idx = "holes", n_basis = 100, step_size = 0.01, min_proj_dist = 1.5)
calc_squintability(basis_squint, method = "ks", bin_width = 0.01)
```

# Reference
79 changes: 60 additions & 19 deletions README.md
@@ -9,9 +9,10 @@
status](https://github.com/huizezhang-sherry/ferrn/workflows/R-CMD-check/badge.svg)](https://github.com/huizezhang-sherry/ferrn/actions)
<!-- badges: end -->

The **ferrn** package extracts key components in the data object
collected by the guided tour optimisation, and produces diagnostic
plots. An associated paper can be found at
The **ferrn** package extracts key components from the data object
collected during projection pursuit (PP) guided tour optimisation,
produces diagnostic plots, and calculates PP index scores. An associated
paper can be found at
<https://journal.r-project.org/archive/2021/RJ-2021-105/index.html>.

## Installation
@@ -24,25 +25,31 @@ You can install the development version of ferrn from
remotes::install_github("huizezhang-sherry/ferrn")
```

## Usage
## Examples

To extract the data object from a guided tour, assign the
`annimate_xx()` function a name:
The data object collected during a PP optimisation can be obtained by
assigning the `tourr::animate_xx()` function a name. In the following
example, the projection pursuit is finding the best projection basis
that can detect multi-modality for the `boa5` dataset using the
`holes()` index function and the optimiser `search_better`:

``` r
set.seed(123456)
holes_1d_better <- animate_dist(
ferrn::boa5,
tour_path = guided_tour(holes(), d = 1,
search_f = search_better),
tour_path = guided_tour(holes(), d = 1, search_f = search_better),
rescale = FALSE)
holes_1d_better
```

The above code will collect data from the 1D animation on `boa5`
dataset, a simulated data in the `ferrn` package.
The data structure includes the `basis` sampled by the optimiser, their
corresponding index values (`index_val`), an `information` tag
explaining the optimisation states, and the optimisation `method` used
(`search_better`). The variables `tries` and `loop` describe the number
of iterations and samples in the optimisation process, respectively. The
variable `id` serves as the global identifier.

The best projection basis found by the projection pursuit algorithm can
be extracted via
The best projection basis can be extracted via

``` r
library(ferrn)
@@ -63,8 +70,7 @@ holes_1d_better %>% get_best() %>% pull(index_val)
#> [1] 0.9136095
```

Trace plot for viewing the optimisation progression with botanical
palette:
The trace plot can be used to view the optimisation progression:

``` r
holes_1d_better %>%
@@ -74,8 +80,13 @@ holes_1d_better %>%

<img src="man/figures/README-trace-plot-1.png" width="100%" />

Compare two algorithms via plotting the projection bases on the reduced
PCA space:
Different optimisers can be compared by plotting their projection bases
on the reduced PCA space. Here `holes_1d_geo` is the data obtained from
the same PP problem as `holes_1d_better` introduced above, but with a
`search_geodesic` optimiser. The 5 $\times$ 1 bases from the two
datasets are first reduced to 2D via PCA, and then plotted to the PCA
space. (PP bases are ortho-normal and the space for $n \times 1$ bases
is an $n$-d sphere, hence a circle when projected into 2D.)

``` r
bind_rows(holes_1d_geo, holes_1d_better) %>%
@@ -87,7 +98,8 @@ bind_rows(holes_1d_geo, holes_1d_better) %>%

<img src="man/figures/README-pca-plot-1.png" width="100%" />

View the projection bases on its original 5-D space via tour animation:
The same set of bases can be visualised in the original 5-D space via
tour animation:

``` r
bind_rows(holes_1d_geo, holes_1d_better) %>%
@@ -98,7 +110,36 @@ bind_rows(holes_1d_geo, holes_1d_better) %>%
```

<p float="center">

<img src="man/figures/tour.gif">

</p>

``` r
# define the holes index as per tourr::holes
holes <- function() {
function(mat) {
n <- nrow(mat)
d <- ncol(mat)

num <- 1 - 1 / n * sum(exp(-0.5 * rowSums(mat^2)))
den <- 1 - exp(-d / 2)

num / den
}
}

basis_smoothness <- sample_bases(idx = "holes")
calc_smoothness(basis_smoothness)
#> # PP index: holes
#> # No. of bases: 300 [6 x 2]
#> variance range smoothness nugget convergence
#> <dbl> <dbl> <dbl> <dbl> <lgl>
#> 1 0.00000672 18.1 1.03 1138. TRUE
basis_squint <- sample_bases(idx = "holes", n_basis = 100, step_size = 0.01, min_proj_dist = 1.5)
calc_squintability(basis_squint, method = "ks", bin_width = 0.01)
#> # PP index: holes
#> # No. of bases: 100 -> 17159
#> # method: ks
#> max_x max_d squint
#> <dbl> <dbl> <dbl>
#> 1 1.87 0.482 0.901
```
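
As a reading aid for the `holes()` definition added in this commit, the sketch below evaluates that same function on two small hand-made matrices using base R only. The matrices `at_origin` and `two_clusters` are hypothetical inputs invented for illustration; they are not part of the package or of `boa5`, and the sketch does not call `sample_bases()` or any other ferrn function.

```r
# holes index, copied from the README hunk in this commit
holes <- function() {
  function(mat) {
    n <- nrow(mat)
    d <- ncol(mat)
    num <- 1 - 1 / n * sum(exp(-0.5 * rowSums(mat^2)))
    den <- 1 - exp(-d / 2)
    num / den
  }
}

index <- holes()

# Hypothetical 1D projected data: every row sits at the origin,
# so each exp() term is 1 and the numerator is 0.
at_origin <- matrix(0, nrow = 4, ncol = 1)

# Hypothetical 1D projected data: two clusters away from the origin,
# leaving a "hole" in the centre.
two_clusters <- matrix(c(-2, -2, 2, 2), ncol = 1)

index(at_origin)
index(two_clusters)
```

The second call returns a larger value than the first, which matches the intent of the index: projections with little mass near the centre score higher.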