Add initial support for saving and restoring simulation state. #280

plietar · 2024-01-22T15:20:37Z

The model can now be run for a given number of steps, have its state saved, and then restore it and run for some more time steps.

Parameters of the model may be modified when resuming, allowing a simulation to be forked with a single historical run and many futures, modelling different intervention scenarios.

There are some limitations as to which parameters may be modified. In general, the structure of the simulation must remain the same, with the same variables and same events. This means interventions cannot be enabled or disabled when resuming, only their parameterization can change. Additionally, the timing of intervention events should not be modified. These limitations may be relaxed in the future, if there is a need for it.

To avoid changing the existing API, this feature is available as a new run_resumable_simulation function. The function returns a pair of values, the existing dataframe with the simulation data and an object representing the internal simulation state. The function can be called a second time, passing the saved simulation state as an extra argument. See the test-resume.R file for usage of the new function.

The implementation builds on top of individual's new support for this. Individual already saves the state of its native objects, ie. variables and events. The malaria model keeps quite a bit of state outside of individual, which we need to save and restore explicitly. This is done by creating a set of "stateful objects", ie. R6 objects with a pair save_state and restore_state methods. This interface is implemented by every bit of the model that has state to capture:

LaggedValue objects store the short term EIR and FOIM history.
Solver objects represent the current state of ODE solvers.
Adult mosquito ODEs keep a history of incubating values which need to be restored.
CorrelationParameters stores a randomly sampled value. This needs to be saved to ensure the simulation is resumed with that same value.
In addition to R's native RNG (whose state is already saved by individual), the model uses the dqrng library, whose state needs saving.

Fixes #196

plietar · 2024-01-22T15:22:46Z

This depends on unreleased versions of both individual and of dqrng (including daqana/dqrng#75). We'll need to make/ask for releases of both of these.

plietar · 2024-01-23T17:14:33Z

Here is an example use case, from @pwinskill's use case in #196.

p_historical <- get_parameters(overrides = list(
  human_population = 1000, individual_mosquitoes = TRUE
)) |>
  set_drugs(
    drugs = list(SP_AQ_params)
  ) |>
  set_mda(
    drug = 1,
    # Because of how events are scheduled, we must set an MDA event now at
    # 365+100 otherwise it wouldn't happen in the second part.
    timesteps = c(100, 365+100),
    coverages =  c(0.5, 0.0),
    min_ages = c(0, 0),
    max_ages = c(80 * 365, 80*365)
  ) |>
  set_equilibrium(
    init_EIR = 5
  )

# Run the historical scenario up to the 1-Year mark
set.seed(123)
s_historical <- run_resumable_simulation(
  timesteps = 365,
  parameters = p_historical
)

plot(
  (s_historical$data$n_detect_730_3650 / s_historical$data$n_730_3650) ~ s_historical$data$timestep,
  t = "l",
  ylab = "Prevalence", xlab = "Time",
  ylim = c(0, 1), xlim = c(0, 365 * 2)
)
abline(v = 365, lty = 2)

p_future1 <- p_historical |> set_mda(
  drug = 1,
  timesteps = c(100, 365 + 100),
  coverages =  c(0.5, 0.1),
  min_ages = c(0, 0),
  max_ages = c(80 * 365, 80 * 365)
)

p_future2 <- p_historical |> set_mda(
  drug = 1,
  timesteps = c(100, 365 + 100),
  coverages =  c(0.5, 0.8),
  min_ages = c(0, 0),
  max_ages = c(80 * 365, 80 * 365)
)

# Run two future scenarios (t = 366:730).
# Restore the state from the end of the historical data.
s_future1 <- run_resumable_simulation(
  timesteps = 2*365,
  parameters = p_future1,
  initial_state = s_historical$state
)
lines(
  (s_future1$data$n_detect_730_3650 / s_future1$data$n_730_3650) ~ s_future1$data$timestep,
  col = adjustcolor("deeppink", alpha.f = 0.5))

s_future2 <- run_resumable_simulation(
  timesteps = 2*365,
  parameters = p_future2,
  initial_state = s_historical$state
)
lines(
  (s_future2$data$n_detect_730_3650 / s_future2$data$n_730_3650) ~ s_future1$data$timestep,
  col = adjustcolor("dodgerblue", alpha.f = 0.5))

# Sanity check, the resumed runs are equivalent to full ones
set.seed(123)
s_expected1 <- run_simulation(timesteps = 2*365,  parameters = p_future1)
set.seed(123)
s_expected2 <- run_simulation(timesteps = 2*365,  parameters = p_future2)
stopifnot(all.equal(s_future1$data, s_expected1[366:(2*365),]))
stopifnot(all.equal(s_future2$data, s_expected2[366:(2*365),]))

giovannic

Wonderful. A few minor comments.

I'll make a minor release of individual as soon as I get a moment.

If you're happy to wait for daqana, then that's fine. If not, we could depend on an mrc-ide fork in the meantime. As long as you promise to point back to daqana when they update to CRAN.

giovannic · 2024-01-30T12:30:05Z

src/timeseries.cpp

+}
+
+//[[Rcpp::export]]
+void timeseries_restore_state(Rcpp::XPtr<Timeseries> timeseries, Rcpp::List state) {


Taking std::vector<double> timesteps, std::vector<double> values as arguments will give better Rcpp errors if anything goes wrong.

I don't really like doing that as it splits the layout of the state object across two different files. We'd be constructing it in the C++ file, but de-structuring it in R.

The errors I get from Rcpp as is seem fine:

timeseries_restore_state(ts, list(timestep=1, value=NULL)) #> Error: Not compatible with requested type: [type=NULL; target=double]. timeseries_restore_state(ts, list(timestep=1, value="xx")) #> Error: Not compatible with requested type: [type=character; target=double]. timeseries_restore_state(ts, list(x=1)) #> Error: Index out of bounds: [index='timestep'].

The indexing and implicit conversion into an std::vector does the same checks Rcpp does when calling the function.

src/timeseries.cpp

The model can now be run for a given number of steps, have its state saved, and then restore it and run for some more time steps. Parameters of the model may be modified when resuming, allowing a simulation to be forked with a single historical run and many futures, modelling different intervention scenarios. There are some limitations as to which parameters may be modified. In general, the structure of the simulation must remain the same, with the same variables and same events. This means interventions cannot be enabled or disabled when resuming, only their parameterization can change. Additionally, the timing of intervention events should not be modified. These limitations may be relaxed in the future, if there is a need for it. To avoid changing the existing API, this feature is available as a new `run_resumable_simulation` function. The function returns a pair of values, the existing dataframe with the simulation data and an object representing the internal simulation state. The function can be called a second time, passing the saved simulation state as an extra argument. See the `test-resume.R` file for usage of the new function. The implementation builds on top of individual's new support for this. Individual already saves the state of its native objects, ie. variables and events. The malaria model keeps quite a bit of state outside of individual, which we need to save and restore explicitly. This is done by creating a set of "stateful objects", ie. R6 objects with a pair `save_state` and `restore_state` methods. This interface is implemented by every bit of the model that has state to capture: - `LaggedValue` objects store the short term EIR and FOIM history. - `Solver` objects represent the current state of ODE solvers. - Adult mosquito ODEs keep a history of incubating values which need to be restored. - `CorrelationParameters` stores a randomly sampled value. This needs to be saved to ensure the simulation is resumed with that same value. - In addition to R's native RNG (whose state is already saved by individual), the model uses the dqrng library, whose state needs saving.

Co-authored-by: Giovanni <giovanni.charles@gmail.com>

dqrng is now available over there.

The recent dqrng 0.4.0 release has a few small incompatible changes that makes the build fail. The changes necessary have been applied to the dev branch already as part of larger pull requests #280 and #305. This commit cherry-picks just the fixes for the master branch.

plietar requested review from giovannic and pwinskill January 22, 2024 15:20

plietar changed the base branch from master to dev January 22, 2024 15:24

plietar force-pushed the save-restore branch 2 times, most recently from 7cb5d4d to df84506 Compare January 22, 2024 15:30

giovannic approved these changes Jan 30, 2024

View reviewed changes

plietar force-pushed the save-restore branch from d087e83 to 5893dc3 Compare January 31, 2024 13:11

plietar and others added 9 commits February 22, 2024 15:16

Update src/timeseries.cpp

25114f3

Co-authored-by: Giovanni <giovanni.charles@gmail.com>

Update src/timeseries.cpp

c79da7a

Co-authored-by: Giovanni <giovanni.charles@gmail.com>

Try pulling dqrng from r-universe

007ddf7

Just use git instead of r-universe

a8f0018

fix macos build

9faa18a

fix windows build, maybe

9f51ccb

Update dependency on individual.

91a545f

Update individual dependency

b280d6e

plietar force-pushed the save-restore branch from 969e50a to b280d6e Compare February 22, 2024 15:17

plietar added 2 commits February 27, 2024 14:42

Replace rstub r-universe with mrc-ide.

da4edbf

dqrng is now available over there.

Update RcppExports.cpp

3fb1d40

plietar force-pushed the save-restore branch 3 times, most recently from 3bfc68d to 3c5c736 Compare February 27, 2024 17:05

Update documentation.

4f7ab9a

plietar force-pushed the save-restore branch from 3c5c736 to 4f7ab9a Compare February 27, 2024 18:31

plietar merged commit c42e2c5 into dev Feb 27, 2024
4 checks passed

plietar mentioned this pull request May 22, 2024

Cherry-pick fixes for latest dqrng release. #311

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add initial support for saving and restoring simulation state. #280

Add initial support for saving and restoring simulation state. #280

plietar commented Jan 22, 2024 •

edited

Loading

plietar commented Jan 22, 2024

plietar commented Jan 23, 2024

giovannic left a comment

giovannic Jan 30, 2024

plietar Jan 31, 2024

Add initial support for saving and restoring simulation state. #280

Add initial support for saving and restoring simulation state. #280

Conversation

plietar commented Jan 22, 2024 • edited Loading

plietar commented Jan 22, 2024

plietar commented Jan 23, 2024

giovannic left a comment

Choose a reason for hiding this comment

giovannic Jan 30, 2024

Choose a reason for hiding this comment

plietar Jan 31, 2024

Choose a reason for hiding this comment

plietar commented Jan 22, 2024 •

edited

Loading