Getting Started Guide
- Install R (version >=3.1.0).
- (optional) Install RStudio.
- Install the `SpaDES` package from CRAN: `install.packages("SpaDES")`

  Since `SpaDES` is still in the early stages of maturity, the development branch on GitHub may contain useful bug fixes that are not yet in the CRAN version. To install the development version:
```r
library("devtools")
install_github("PredictiveEcology/SpaDES", ref="development")
```
- Install the optional `fastshp` package. This requires OS development tools (e.g., Rtools for Windows): `install.packages("fastshp", repos="http://rforge.net", type="source")`
- Load the `SpaDES` package in your R session using `library(SpaDES)`.
- Browse locally available modules:
  - `openModules("/path/to/my/modules")` opens all modules in a directory.
  - `openModules("/path/to/my/modules", "moduleName")` opens only the named module.
- Browse modules at https://github.com/PredictiveEcology/SpaDES-modules
- Download modules for use with `downloadModule("moduleName", path = "/path/to/my/modules")`, then open them with `openModules("/path/to/my/modules", "moduleName")`.
- Create an empty module template: `newModule("moduleName", path = "/path/to/my/modules")`

  Note: on Windows, there is currently a bug in RStudio: it doesn't know which editor to open when `file.edit` is called (which is what `newModule` does). This returns an error:
```r
Error in editor(file = file, title = title) :
argument "name" is missing, with no default
```
You can just browse to the file and open it manually.
- are module metadata fully and correctly specified (module description, authorship and citation info, parameters and inputs/outputs, etc.)?
- `citation` should specify how to cite the module, or if published, the paper that describes the module.
- module object dependencies: use `moduleDiagram` and `objectDiagram` to confirm how data objects are passed among modules.
- are all event types defined in `doEvent`?
- use `sim$function(sim)` to access event functions
- use `sim$object` to access simulation data objects
- use e.g., `sim[[globals(sim)$objectName]]` to access variable-named objects
- use unique function names to reduce the risk of another module overwriting your functions. E.g., use `moduleNameFunction()` instead of `function()`.
- using `sim$function` notation is not required for the definition of event functions
- have you provided useful (meaningful) documentation in the module's `.Rmd` file and `README`?
- have you built (`knit`ted) the `.Rmd` file to generate a `.pdf` or `.html` version?
- have you specified the terms under which your module code can be reused and/or modified? Add a license!
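The object-access and naming items in the checklist above can be illustrated without SpaDES itself. The following is a minimal sketch only: a plain environment stands in for the simulation object, the list `params` is a hypothetical stand-in for what `globals(sim)` would return, and `fireSpreadBurn` is a hypothetical module-prefixed function name. None of these are part of the SpaDES API.

```r
# Stand-in for the simulation object: a plain environment, not a real SpaDES object.
sim <- new.env()

# A simulation data object stored under a fixed name.
sim$landscape <- matrix(0, nrow = 3, ncol = 3)

# Hypothetical stand-in for globals(sim): a list holding the *name* of an object.
params <- list(objectName = "landscape")

# Module-prefixed function name (hypothetical module "fireSpread") to avoid clashes
# with functions defined by other modules.
fireSpreadBurn <- function(sim) {
  # Variable-named access: the object name is looked up at run time.
  sim[[params$objectName]][2, 2] <- 1
  sim
}

sim <- fireSpreadBurn(sim)

# Fixed-name and variable-named access reach the same object.
identical(sim$landscape, sim[[params$objectName]])
```

Variable-named access lets the same module code operate on whichever object the user names in the global parameters, at the cost of slightly less readable code than `sim$landscape`.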
Since modules will often have to run many, many times because of replication, there are a few strategies that should be followed:
- Always write fast code. This means use `data.table` or `dplyr` for data and data wrangling.
  - Avoid `data.frame` if possible.
  - Matrices and vectors are fast.
  - Avoid loops.
- Avoid
- Use the `memoise` or `archivist` package to cache functions for speed. We will soon be pushing out ways and examples to do this easily for functions within a module.
- For computationally intensive functions, consider writing them in C++, via the `Rcpp` package.
- For large (out of RAM) situations, use `ff` or `bigmemory`. Sometimes, this can be done seamlessly inside functions using `getOption("spades.lowMemory")`, where two alternatives are provided, one "in memory" and the other "on disk". See the `if (lowMemory)` code block about 20 lines from the start of the `spread` function for one way to do this with `ff`.
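As a small illustration of the "avoid loops" advice above, the sketch below computes the same result with an explicit loop and with a single vectorised call. It uses base R only; the variable names are made up for this example, and actual speed differences will depend on the task.

```r
n <- 1e5
x <- seq_len(n)

# Loop version: fills a preallocated vector one element at a time.
loopSquares <- numeric(n)
for (i in x) {
  loopSquares[i] <- x[i]^2
}

# Vectorised version: one call over the whole vector.
vecSquares <- x^2

# Both approaches give the same values.
isTRUE(all.equal(loopSquares, vecSquares))
```

Wrapping each version in `system.time()` is a quick way to compare them on your own machine.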
- Don't write modules that depend internally on other modules. Instead, pass data via the `inputObjects` and `outputObjects` in the metadata. This means avoid scheduling one event in module A from module B, if possible.
- Use existing modules from the SpaDES-modules repository (https://github.com/PredictiveEcology/SpaDES-modules) using `downloadModule()`.
Modules are generic.
The only components that must exist are the metadata and the `init` event. This means that many, many types of modules can be written.
As we slowly build a SpaDES ecosystem of modules designed to be used and re-used, we can consider writing our entire workflow -- raw data, data wrangling, data analysis, calibration of simulation model, simulation, output analysis, decision support -- all in one chain.
We can cache everything along the way, so that if something must run again, but its inputs are identical to a previous run, then it can just read from disk.
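The idea of reading a result from disk when the inputs are identical to a previous run can be sketched in base R. This is an illustration under stated assumptions, not the SpaDES caching mechanism: `cachedRun`, `slowSquare`, and the cache directory are hypothetical names, and the cache key is an MD5 checksum of the serialized input.

```r
# Run fun(input), or read the stored result if this input has been seen before.
cachedRun <- function(fun, input, cacheDir) {
  # Write the serialized input to a temporary file so tools::md5sum() can hash it.
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(input, NULL), tmp)
  key <- unname(tools::md5sum(tmp))

  cacheFile <- file.path(cacheDir, paste0(key, ".rds"))
  if (file.exists(cacheFile)) {
    return(readRDS(cacheFile))  # identical input: read from disk
  }
  result <- fun(input)          # new input: compute and store
  saveRDS(result, cacheFile)
  result
}

# Usage: the counter shows whether the function body actually ran.
calls <- 0
slowSquare <- function(x) {
  calls <<- calls + 1
  x^2
}

cacheDir <- tempfile("cache")
dir.create(cacheDir)

first <- cachedRun(slowSquare, 1:3, cacheDir)   # computed
second <- cachedRun(slowSquare, 1:3, cacheDir)  # read from disk
```

Keying on the input alone means a change to the function body is not detected; a fuller approach would also include the function in the key.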
This is an evolving list of types of modules that would be useful to have in this "re-use" cycle:
- dynamic forecasting
  - "classical" simulation models
  - NetLogo-type models
  - SELES-type models
  - time is a component of the model
- static forecasting
  - e.g., predict methods from statistical outputs
- agent based models
  - animals, plants
  - processes, such as fire
- raster models
  - e.g., forest succession, cellular automata
- statistical
  - Bayesian
- calibration and optimization
  - taking outputs from other modules and rescheduling those other modules again, iterating through a heuristic optimization
- translators
  - from one data type to another to allow two different modules to talk
- GIS
  - reprojection, crop, mask etc.
- data manipulation
  - simplifying, joining etc.
  - interpolators
- output analysis
  - e.g., takes time series of rasters and visualizes them
- data fetching
  - modules that go to specific web resources (e.g., Dryad etc.)
- quality scanning
  - e.g., from external databases
Current modules on the SpaDES-modules repository (see above) include simple versions of dynamic forecasting (Forest Succession, fireSpreadLcc, forestAge), GIS (cropReprojectLccAge), translators (LccToBeaconsReclassify),
- Wolves recolonizing the Italian Alps - This is a rewrite in `SpaDES` of Marucco & McIntire 2010.