Merge pull request #489 from njtierney/polish-0-4-1-news
Polish 0 4 1 news
njtierney committed Mar 15, 2022
2 parents f8c67d5 + b704bde commit a06d2ca
Showing 6 changed files with 327 additions and 120 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -11,6 +11,7 @@
^CODE_OF_CONDUCT\.md$
^\.EDIT_WEBSITE\.md$
^LICENSE\.md$
LICENSE
^cran-comments\.md$

^logos$
@@ -40,3 +41,4 @@
^codecov\.yml$

^depends\.rds$
^CRAN-SUBMISSION$
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Type: Package
Package: greta
Title: Simple and Scalable Statistical Modelling in R
Version: 0.4.0
Date: 2022-02-21
Version: 0.4.1
Date: 2022-03-14
Authors@R: c(
person("Nick", "Golding", , "nick.golding.research@gmail.com", role = "aut",
comment = c(ORCID = "0000-0001-8916-5570")),
@@ -28,7 +28,7 @@ Description: Write statistical models in R and fit them by MCMC and
easy to extend and build on. See the website for more information,
including tutorials, examples, package documentation, and the greta
forum.
License: Apache License (>= 2) + file LICENSE
License: Apache License 2.0
URL: https://greta-stats.org
BugReports: https://github.com/greta-dev/greta/issues
Depends:
107 changes: 97 additions & 10 deletions NEWS.md
@@ -1,22 +1,109 @@
# greta 0.4.0 (2021-11-26)
# greta 0.4.1 (2022-03-14)

## Summary

This release brings together a variety of improvements made over the past 2 years. We are now aiming for smaller, more regular releases of `greta`. It showcases new features implemented by Nick Golding in the `calculate()` and `simulate()` functions, along with many internal changes to installation, error printing, and testing. This release also sees a change of maintainer, from Nick Golding to Nick Tierney.

### Installation

We have overhauled the installation checking process, and created a new helper function for installation, `install_greta_deps()`.

`greta` requires the TensorFlow and TensorFlow Probability Python modules.
When these aren't installed, `greta` now shows a prompt that encourages users
to run the new installation helper. The prompt looks like this:

```
#> We have detected that you do not have the expected python packages setup.
#> You can set these up by running this R code in the console:
#> `install_greta_deps()`
#> Then, restart R and run:
#> `library(greta)`
#> (Note: Your R session should not have initialised Tensorflow yet.)
#> For more information, see `?install_greta_deps`
```

Running `install_greta_deps()` will then go through the process of installing the dependencies, and ask the user to restart R and load `greta` to get it working:

```
#> ✓ Installation of greta dependencies is complete!
#> • Restart R, then load greta with: `library(greta)`
```

The `install_greta_deps()` function helps ensure Python dependencies are installed correctly. It installs exact versions of Python (3.7) and the Python modules NumPy (1.16.4), TensorFlow (1.14.0), and TensorFlow Probability (0.7.0) into a conda environment, "greta-env".

So what is a conda environment? It is similar to the R packages `packrat` and `renv` (although I believe conda environments are a much older idea!). It allows you to use specific versions of Python and Python modules (a Python module is the analogue of an R package) that do not interact with other projects. Essentially, you "activate" a specific conda environment, which loads the specified Python version and modules. This avoids situations where you update a Python module and all your other code then breaks because the new version introduced breaking changes.

Why do we need this? Currently `greta` needs specific versions of TensorFlow and TensorFlow Probability, and we know that those specific versions work with a specific version of Python. We wanted to keep things stable for users, so they don't have to go through the (often) painful process of installing dependencies.

How does it work? When `greta` is loaded, say with `library(greta)`, it searches for a "greta-env" conda environment and loads it. Using the "greta-env" conda environment is optional: you can instead install these Python modules yourself.

Overall this means that users can run the function `install_greta_deps()`, follow the prompts, and have all the Python modules they need installed, without contaminating other software that uses different Python modules.

### Error printing

We have reviewed all of the error messages in `greta` and rewritten their printing methods to use the `cli` package for prettier, more informative output. We have also used the `glue` package in place of most uses of `sprintf()`, `paste()`, and `paste0()`, as literal string interpolation makes the code easier to maintain. For example:

``` r
paste0("Object is of class: ", class(10))
#> [1] "Object is of class: numeric"
sprintf("Object is of class: %s", class(10))
#> [1] "Object is of class: numeric"
glue::glue("Object is of class: {class(10)}")
#> Object is of class: numeric
cat(cli::format_message("Object is of class: {.cls {class(10)}}"))
#> Object is of class: <numeric>
```

Using `cli` also means we get nifty outputs like this from the new `greta_sitrep()` function, which tests whether Python and its dependencies are available:

``` r
greta::greta_sitrep()
#> ℹ checking if python available
#> ✓ python (version 3.7) available
#>
#> ℹ checking if TensorFlow available
#> ✓ TensorFlow (version 1.14.0) available
#>
#> ℹ checking if TensorFlow Probability available
#> ✓ TensorFlow Probability (version 0.7.0) available
#>
#> ℹ checking if greta conda environment available
#> ✓ greta conda environment available
#>
#> ℹ Initialising python and checking dependencies, this may take a moment.
#> ✓ Initialising python and checking dependencies ... done!
#>
#> ℹ greta is ready to use!
```

### Testing

We have also overhauled the testing interface to use [snapshotting](https://testthat.r-lib.org/reference/index.html#snapshot-testing). This makes it easier to write and test new error messages, and identify issues with existing print methods, errors, and warnings.

### Looking to the future

In a future release we will switch to TensorFlow 2.6 (or higher), to ensure `greta` works on Apple computers with an M1 chip. Note that we have "skipped" version 0.4.0: we had a soft release of 0.4.0 on GitHub in December, and wanted to signify that the package has changed since then.

### Thanks

A special thanks to everyone who helped with this release: [Nick Golding](https://github.com/goldingn), [Jacob Wujciak-Jens](https://github.com/assignUser), and [Maëlle Salmon](https://github.com/maelle).

## Fixes:

* Python is now initialised when a `greta_array` is created (#468).

* head and tail S3 methods for `greta_array` are now consistent with head and tail methods for R versions 3 and 4 ([#384](https://github.com/greta-dev/greta/issues/384)).

* `greta_mcmc_list` objects (returned by `mcmc()`) are now no longer modified by operations (like code::gelman.diag()).
* `greta_mcmc_list` objects (returned by `mcmc()`) are now no longer modified by operations (like `coda::gelman.diag()`).

* joint distributions of uniform variables now have the correct constraints when sampling (#377).

* array-scalar dispatch with 3D arrays is now less buggy (#298).

* greta now provides R versions of all of R's primitive functions (I think), to prevent them from silently not executing (#317).
* `greta` now provides R versions of all of R's primitive functions (I think), to prevent them from silently not executing (#317).

* Uses `Sys.unsetenv("RETICULATE_PYTHON")` in `.onLoad` on package startup,
to prevent an issue introduced with the latest version of RStudio where they
do not find the current version of RStudio. See [#444](https://github.com/greta-dev/greta/issues/444) for more details.
to prevent an issue introduced with the "ghost orchid" version of RStudio where they do not find the current version of RStudio. See [#444](https://github.com/greta-dev/greta/issues/444) for more details.

* Internal change to code to ensure `future` continues to support parallelisation of chains. See [#447](https://github.com/greta-dev/greta/issues/447) for more details.

@@ -30,18 +117,18 @@

* `calculate()` now accepts multiple greta arrays for which to calculate values, via the `...` argument. As a consequence any other arguments must now be named.

* a number of optimiser methods are now deprecated, since they will be unavailable when greta moves to using TensorFlow v2.0: `powell()`, `cg()`, `newton_cg()`, `l_bfgs_b()`, `tnc()`, `cobyla()`, and `slsqp()`.
* A number of optimiser methods are now deprecated, since they will be unavailable when greta moves to using TensorFlow v2.0: `powell()`, `cg()`, `newton_cg()`, `l_bfgs_b()`, `tnc()`, `cobyla()`, and `slsqp()`.

* `dirichlet()` now returns a variable (rather than an operation) greta array, and the graphs created by `lkj_correlation()` and `wishart()` are now simpler as cholesky-shaped variables are now available internally.

* Python dependency installation has been overhauled with the new `install_greta_deps()` function (#417).

* Adds helper functions for helping installation get to "clean slate" (#443)
* Adds the `reinstall_greta_env()`, `reinstall_miniconda()`, `remove_greta_env()`, and `remove_miniconda()` helper functions for helping installation get to "clean slate" (#443).

* `greta` currently doesn't work on Apple Silicon (M1 Macs), as they need TF 2.0, support for which is currently being implemented. `greta` now throws an error if an M1 Mac is detected and directs users to https://github.com/greta-dev/greta/issues/458 (#487).
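
To illustrate the `calculate()` change listed above, here is a minimal sketch. It assumes `greta` and its Python dependencies are already installed; the greta arrays `x` and `y` are made up for the example. Several greta arrays are passed through `...`, so every other argument (here `nsim`) has to be named.

```r
library(greta)

# Two related greta arrays: a standard normal variable and a transformation of it
x <- normal(0, 1)
y <- x * 2

# Several greta arrays can be passed via `...`;
# any other argument (here `nsim`) must now be named
sims <- calculate(x, y, nsim = 10)

# `sims` is a named list with one array of simulated values per greta array
str(sims)
```

Because `nsim` is named, it cannot be mistaken for another greta array to calculate.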

## Features:

* New `install_greta_deps()` - provides installation of Python dependencies (#417). This installs exact versions of Python (3.7) and the Python modules NumPy (1.16.4), TensorFlow (1.14.0), and TensorFlow Probability (0.7.0) into a conda environment, "greta-env". When initialising Python, `greta` now searches for this conda environment first, which has the advantage of isolating these exact module versions from other Python installations. Using the "greta-env" conda environment is optional. Overall this means that users can run the function `install_greta_deps()`, follow the prompts, and have all the Python modules they need installed, without contaminating other software that uses different Python modules.

* `calculate()` now enables simulation of greta array values from their priors, optionally conditioned on fixed values or posterior samples. This enables prior and posterior predictive checking of models, and simulation of data.

* A `simulate()` method for greta models is now also provided, to simulate the values of all greta arrays in a model from their priors.
@@ -50,7 +137,7 @@

* There are three new variable constructor functions: `cholesky_variable()`, `simplex_variable()`, and `ordered_variable()`, for variables with these constraints but no probability distribution.

* a new function `chol2symm()` - the inverse of `chol()`.
* New `chol2symm()` is the inverse of `chol()`.

* `mcmc()`, `stashed_samples()`, and `calculate()` now return objects of class `greta_mcmc_list` which inherit from `coda`'s `mcmc.list` class, but enable custom greta methods for manipulating mcmc outputs, including a `window()` function.
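
As a rough illustration of what "the inverse of `chol()`" means for `chol2symm()`, the sketch below uses base R only. The helper `chol2symm_sketch()` is a hypothetical stand-in written for this example, not greta's implementation: it rebuilds a symmetric matrix from its upper-triangular Cholesky factor.

```r
# A small symmetric positive-definite matrix
m <- matrix(c(4, 2, 2, 3), nrow = 2)

# chol() returns the upper-triangular factor u such that t(u) %*% u equals m
u <- chol(m)

# Hypothetical stand-in for the inverse operation: rebuild the
# symmetric matrix from its Cholesky factor
chol2symm_sketch <- function(u) t(u) %*% u

m_rebuilt <- chol2symm_sketch(u)

# The round trip recovers the original matrix (up to floating point error)
isTRUE(all.equal(m_rebuilt, m))
```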
