envi: Environmental Interpolation using Spatial Kernel Density Estimation

Date repository last updated: November 06, 2024

Overview

The envi package is a suite of R functions to estimate the ecological niche of a species and predict the spatial distribution of the ecological niche -- a version of environmental interpolation -- with spatial kernel density estimation techniques. A two-group comparison (e.g., presence and absence locations of a single species) is conducted using the spatial relative risk function that is estimated using the sparr package. Internal cross-validation and basic visualization are also supported.
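As a rough illustration of the two-group comparison described above, the sketch below computes a log relative risk surface as the log ratio of two kernel density estimates in a two-covariate space. It uses simulated data and MASS::kde2d(), both of which are assumptions made for brevity. envi itself relies on the sparr package, whose estimator is more involved, so treat this as a conceptual sketch rather than the package's method.

```r
# Illustrative sketch only: simulated data and MASS::kde2d(), not the sparr estimator
library(MASS)

set.seed(42)
n_obs <- 300

# Simulated covariate values (e.g., scaled elevation and gradient)
presence_cov <- cbind(rnorm(n_obs, mean = 1), rnorm(n_obs, mean = 1))
absence_cov <- cbind(rnorm(n_obs, mean = 0), rnorm(n_obs, mean = 0))

# Evaluate both densities on a shared grid with a shared bandwidth
grid_lims <- c(-3, 3, -3, 3)
n_grid <- 50
f_presence <- kde2d(presence_cov[ , 1], presence_cov[ , 2], h = 2, n = n_grid, lims = grid_lims)
f_absence <- kde2d(absence_cov[ , 1], absence_cov[ , 2], h = 2, n = n_grid, lims = grid_lims)

# Log relative risk: positive where presences are relatively more dense
lrr <- log(f_presence$z) - log(f_absence$z)
```

Cells with a positive value are covariate combinations where presences are relatively more concentrated than absences, which is the sense in which the surface describes a niche.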

Installation

To install the release version from CRAN:

install.packages('envi')

To install the development version from GitHub:

devtools::install_github('lance-waller-lab/envi')
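Before installing, it can help to check which packages are still missing. The helper below is hypothetical (it is not part of envi) and uses only base R. The package list is taken from the usage examples in this README, plus devtools for the GitHub install.

```r
# Hypothetical helper (not part of envi): report which packages are not yet installed
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# Packages loaded in the usage examples, plus 'devtools' for the GitHub install
needed <- c('envi', 'spatstat.data', 'spatstat.random', 'devtools')
to_install <- missing_packages(needed)

# Only install interactively, so sourcing this script never triggers a download
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```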

Available functions

Function      Description
lrren         Main function. Estimate an ecological niche using the spatial relative risk function and predict its location in geographic space.
perlrren      Sensitivity analysis for lrren whereby observation locations are spatially perturbed ('jittered') with specified radii, iteratively.
plot_obs      Display multiple plots of the estimated ecological niche from lrren output.
plot_predict  Display multiple plots of the predicted spatial distribution from lrren output.
plot_cv       Display multiple plots of internal k-fold cross-validation diagnostics from lrren output.
plot_perturb  Display multiple plots of output from perlrren including predicted spatial distribution of the summary statistics.
div_plot      Called within plot_obs, plot_predict, and plot_perturb, provides functionality for basic visualization of surfaces with diverging color palettes.
seq_plot      Called within plot_perturb, provides functionality for basic visualization of surfaces with sequential color palettes.
pval_correct  Called within lrren and perlrren, calculates various multiple testing corrections for the alpha level.
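To make the idea of correcting the alpha level concrete, here is a minimal base R sketch of two textbook corrections. It is not the code of pval_correct(). The Bonferroni correction is the one named in the usage example below; the Šidák correction and the number of tests are assumptions added for comparison.

```r
# Illustrative sketch, not envi's pval_correct(): adjust alpha for m simultaneous tests
bonferroni_alpha <- function(alpha, m) alpha / m
sidak_alpha <- function(alpha, m) 1 - (1 - alpha)^(1 / m)

alpha <- 0.05
n_tests <- 128 * 128 # assumed number of grid cells tested at once

bonferroni_alpha(alpha, n_tests)
sidak_alpha(alpha, n_tests)
```

Both corrections shrink the per-test alpha as the number of tests grows, with Bonferroni being slightly more conservative than Šidák.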

Authors

  • Ian D. Buller - DLH, LLC (formerly Social & Scientific Systems, Inc.), Bethesda, Maryland (current) - Occupational and Environmental Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Rockville, Maryland (former) - Environmental Health Sciences, James T. Laney School of Graduate Studies, Emory University, Atlanta, Georgia. (original) - GitHub - ORCID

See also the list of contributors who participated in this package, including:

  • Lance A. Waller - Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia. - GitHub - ORCID

Usage

For the lrren() function

set.seed(1234) # for reproducibility

# ------------------ #
# Necessary packages #
# ------------------ #

library(envi)
library(spatstat.data)
library(spatstat.random)
library(terra) # provides rast(), crds(), and extract() used below

# -------------- #
# Prepare inputs #
# -------------- #

# Using the 'bei' and 'bei.extra' data within {spatstat.data}

# Environmental Covariates
elev <- bei.extra[[1]]
grad <- bei.extra[[2]]
elev$v <- scale(elev)
grad$v <- scale(grad)
elev_raster <- rast(elev)
grad_raster <- rast(grad)

# Presence data
presence <- bei
marks(presence) <- data.frame(
  'presence' = rep(1, presence$n),
  'lon' = presence$x,
  'lat' = presence$y
)
marks(presence)$elev <- elev[presence]
marks(presence)$grad <- grad[presence]

# (Pseudo-)Absence data
absence <- rpoispp(0.008, win = elev)
marks(absence) <- data.frame(
  'presence' = rep(0, absence$n),
  'lon' = absence$x,
  'lat' = absence$y
)
marks(absence)$elev <- elev[absence]
marks(absence)$grad <- grad[absence]

# Combine
obs_locs <- superimpose(presence, absence, check = FALSE)
obs_locs <- marks(obs_locs)
obs_locs$id <- seq(1, nrow(obs_locs), 1)
obs_locs <- obs_locs[ , c(6, 2, 3, 1, 4, 5)]

# Prediction Data
predict_xy <- crds(elev_raster)
predict_locs <- as.data.frame(predict_xy)
predict_locs$elev <- extract(elev_raster, predict_xy)[ , 1]
predict_locs$grad <- extract(grad_raster, predict_xy)[ , 1]

# ----------- #
# Run lrren() #
# ----------- #

test1 <- lrren(
  obs_locs = obs_locs,
  predict_locs = predict_locs,
  predict = TRUE,
  verbose = TRUE,
  cv = TRUE
)
              
# -------------- #
# Run plot_obs() #
# -------------- #

plot_obs(test1)

# ------------------ #
# Run plot_predict() #
# ------------------ #

plot_predict(
  test1,
  cref0 = 'EPSG:5472',
  cref1 = 'EPSG:4326'
)

# ------------- #
# Run plot_cv() #
# ------------- #

plot_cv(test1)

# -------------------------------------- #
# Run lrren() with Bonferroni correction #
# -------------------------------------- #

test2 <- lrren(
  obs_locs = obs_locs,
  predict_locs = predict_locs,
  predict = TRUE,
  p_correct = 'Bonferroni'
)

# Note: Only showing third plot
plot_obs(test2)

# Note: Only showing second plot
plot_predict(
  test2,
  cref0 = 'EPSG:5472',
  cref1 = 'EPSG:4326'
)

# Note: plot_cv() will display the same results because cross-validation is only performed for the log relative risk estimate
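The example above reorders the observation columns by position with c(6, 2, 3, 1, 4, 5). The sketch below uses made-up toy values to show the same reordering done by name, which may be easier to audit. The layout it produces (id, lon, lat, presence, then the two covariates) is inferred from the example rather than quoted from the package documentation.

```r
# Toy stand-in for marks(obs_locs) before reordering (values are made up)
obs_marks <- data.frame(
  presence = c(1, 1, 0),
  lon = c(10.5, 20.1, 30.7),
  lat = c(5.2, 6.3, 7.4),
  elev = c(0.3, -1.1, 0.8),
  grad = c(1.2, 0.1, -0.4)
)
obs_marks$id <- seq_len(nrow(obs_marks))

# Same ordering as obs_locs[ , c(6, 2, 3, 1, 4, 5)], selected by name
col_order <- c('id', 'lon', 'lat', 'presence', 'elev', 'grad')
obs_by_name <- obs_marks[ , col_order]
obs_by_index <- obs_marks[ , c(6, 2, 3, 1, 4, 5)]

# Simple checks before passing the data on
stopifnot(
  identical(obs_by_name, obs_by_index),
  all(obs_by_name$presence %in% c(0, 1)),
  !anyNA(obs_by_name)
)
```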

For the perlrren() function

set.seed(1234) # for reproducibility

# ------------------ #
# Necessary packages #
# ------------------ #

library(envi)
library(spatstat.data)
library(spatstat.random)

# -------------- #
# Prepare inputs #
# -------------- #

# Using the 'bei' and 'bei.extra' data within {spatstat.data}

# Scale environmental covariates
ims <- bei.extra
ims[[1]]$v <- scale(ims[[1]]$v)
ims[[2]]$v <- scale(ims[[2]]$v)

# Presence data
presence <- bei
marks(presence) <- data.frame(
  'presence' = rep(1, presence$n),
  'lon' = presence$x,
  'lat' = presence$y
)

# (Pseudo-)Absence data
absence <- rpoispp(0.008, win = ims[[1]])
marks(absence) <- data.frame(
  'presence' = rep(0, absence$n),
  'lon' = absence$x,
  'lat' = absence$y
)

# Combine and create 'id' and 'levels' features
obs_locs <- superimpose(presence, absence, check = FALSE)
marks(obs_locs)$id <- seq(1, obs_locs$n, 1)
marks(obs_locs)$levels <- as.factor(rpois(obs_locs$n, lambda = 0.05))
marks(obs_locs) <- marks(obs_locs)[ , c(4, 2, 3, 1, 5)]

# -------------- #
# Run perlrren() #
# -------------- #

# Uncertainty in observation locations
## Most observations within 10 meters
## Some observations within 100 meters
## Few observations within 500 meters

test3 <- perlrren(
  obs_ppp = obs_locs,
  covariates = ims,
  radii = c(10, 100, 500),
  verbose = FALSE, # may not be available if parallel = TRUE
  parallel = TRUE,
  n_sim = 100
)
                 
# ------------------ #
# Run plot_perturb() #
# ------------------ #

plot_perturb(
  test3,
  cref0 = 'EPSG:5472',
  cref1 = 'EPSG:4326',
  cov_labs = c('elev', 'grad')
)
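To illustrate what spatially perturbing observations by level-specific radii can look like, the base R sketch below displaces each toy point uniformly within a disc whose radius is chosen by the point's 'levels' value. The helper jitter_in_disc() and the uniform-in-disc assumption are hypothetical; perlrren() may perturb locations differently.

```r
set.seed(1234)

# Hypothetical helper: displace points uniformly within a disc of the given radius
jitter_in_disc <- function(x, y, radius) {
  n <- length(x)
  r <- radius * sqrt(runif(n)) # sqrt() keeps the density uniform over the disc area
  theta <- runif(n, 0, 2 * pi)
  data.frame(x = x + r * cos(theta), y = y + r * sin(theta))
}

# Toy observations with an uncertainty level indexing into 'radii'
radii <- c(10, 100, 500)
obs <- data.frame(x = runif(200, 0, 1000), y = runif(200, 0, 500))
obs$levels <- factor(
  sample(0:2, 200, replace = TRUE, prob = c(0.90, 0.08, 0.02)),
  levels = 0:2
)

# as.integer() on the factor gives 1, 2, 3, matching positions in 'radii'
obs_radius <- radii[as.integer(obs$levels)]
perturbed <- jitter_in_disc(obs$x, obs$y, obs_radius)

# Displacement never exceeds the radius assigned to each observation
displacement <- sqrt((perturbed$x - obs$x)^2 + (perturbed$y - obs$y)^2)
max(displacement / obs_radius)
```

Repeating such a perturbation many times and re-estimating the surface each time is the general idea behind a sensitivity analysis of this kind.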

Funding

This package was developed while the author was a doctoral student in the Environmental Health Sciences doctoral program at Emory University and, later, a postdoctoral fellow supported by the Cancer Prevention Fellowship Program at the National Cancer Institute. Any modifications since December 05, 2022 were made while the author was an employee of DLH, LLC (formerly Social & Scientific Systems, Inc.).

Acknowledgments

When citing this package for publication, please use the citation returned by:

citation('envi')
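If a BibTeX entry is needed, one possible approach is to convert the citation object with utils::toBibtex(). The helper bibtex_for() below is hypothetical and returns NULL when the package is not installed, so the snippet runs either way.

```r
# Hypothetical helper: return a BibTeX entry if the package is installed, else NULL
bibtex_for <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(NULL)
  }
  utils::toBibtex(utils::citation(pkg))
}

envi_bib <- bibtex_for('envi')
if (!is.null(envi_bib)) print(envi_bib)
```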

Questions? Feedback?

For questions about the package, please contact the maintainer Dr. Ian D. Buller or submit a new issue.