Rebecca Pontes Salles edited this page Mar 25, 2021 · 21 revisions


TSPred Package for R: Framework for Nonstationary Time Series Prediction

Current Version: 5.1 Date: 2021-01

Authors: Rebecca Pontes Salles (rebeccapsalles@acm.org) and Eduardo Ogasawara (eogasawara@ieee.org)

Description: Functions for defining and conducting a time series prediction process, including pre(post)processing, decomposition, modeling, prediction, and accuracy assessment. The generated models and their yielded prediction errors can be used for benchmarking other time series prediction methods and for motivating the refinement of such methods. For this purpose, benchmark data from prediction competitions may be used.

Available at CRAN: https://CRAN.R-project.org/package=TSPred

Reference manual: TSPred.pdf

Acknowledgements: The authors thank CNPq, CAPES, and FAPERJ for partially sponsoring this research.


Usage:

#Install TSPred package
> install.packages("TSPred")

#Load TSPred package
> library("TSPred")

ARIMA model prediction application using TSPred

#loading CATS dataset
 > data("CATS")

#defining the time series application
 > tspred_arima <- tspred( subsetting = subsetting(test_len = 20),
                           modeling = ARIMA(), 
                           evaluating = list(MSE = MSE(),AIC = AIC()) )

#performing the prediction application and obtaining results
 > tspred_arima_res <- workflow( tspred_arima, data = CATS[5] )
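Conceptually, the application above subsets the series, fits an ARIMA model, predicts the held-out observations, and evaluates MSE and AIC. The same steps can be sketched with base R only (stats::arima on a simulated series; the series, order, and test length here are illustrative choices, not TSPred defaults):

```r
# Base-R sketch of the subsetting -> modeling -> prediction -> evaluation
# steps performed by the tspred/workflow calls above. Illustrative only.
set.seed(42)
ts_data <- as.numeric(arima.sim(model = list(ar = 0.7), n = 120))

test_len <- 20
train <- ts_data[1:(length(ts_data) - test_len)]                  # subsetting
test  <- ts_data[(length(ts_data) - test_len + 1):length(ts_data)]

fit   <- arima(train, order = c(1, 0, 0))                         # modeling
preds <- as.numeric(predict(fit, n.ahead = test_len)$pred)        # prediction

mse <- mean((test - preds)^2)                                     # evaluating (MSE)
aic <- AIC(fit)                                                   # evaluating (AIC)
```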

Definition of components/steps of a time series prediction process in TSPred

#Obtaining objects of the processing class
 > proc_subset <- subsetting( test_len = 20 )
 > proc_bct <- BCT()
 > proc_wt <- WT( level = 1, filter = "bl14" )
 > proc_sw <- SW( window_len = 6 )
 > proc_mm <- MinMax()

#Obtaining objects of the modeling class
 > modl_nnet <- NNET( size = 5, sw = proc_sw, proc = list(MM = proc_mm) )

#Obtaining objects of the evaluating class
 > eval_mse <- MSE()
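To make the roles of these objects concrete, two of the processing steps can be mimicked in plain base R: a sliding-window builder (cf. SW) and min-max normalization (cf. MinMax). These are conceptual sketches, not TSPred's actual implementations:

```r
# Illustrative base-R analogues of two processing steps defined above.
sliding_windows <- function(x, window_len) {
  n <- length(x) - window_len + 1
  t(sapply(seq_len(n), function(i) x[i:(i + window_len - 1)]))
}

min_max <- function(x) (x - min(x)) / (max(x) - min(x))

sw <- sliding_windows(1:10, window_len = 6)  # one window per row, 5 x 6
mm <- min_max(c(2, 4, 6))                    # rescaled to [0, 1]
```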

MLM prediction application using TSPred

#Defining a time series prediction process
 > tspred_mlm <- tspred( subsetting = proc_subset, 
                         processing = list(BCT = proc_bct, WT = proc_wt), 
                         modeling = modl_nnet,
                         evaluating = list(MSE = eval_mse) )

#Running the time series prediction process and obtaining results
 > tspred_mlm_res <- tspred_mlm %>% 
                     subset(data = CATS[5]) %>%
                     preprocess(prep_test = TRUE) %>% 
                     train() %>%
                     predict(input_test_data = TRUE) %>% 
                     postprocess() %>% 
                     evaluate()
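The chained steps above (preprocess, train, predict, postprocess, evaluate) can be illustrated end-to-end in base R, using a log transform as a stand-in for the BCT/WT preprocessing and stats::arima as the model. All choices here are illustrative:

```r
# Base-R sketch of the preprocess -> train -> predict -> postprocess ->
# evaluate chain, with a log transform as the (invertible) preprocessing step.
set.seed(1)
x <- exp(cumsum(rnorm(120, sd = 0.05)) + 3)     # positive nonstationary series
train <- x[1:100]; test <- x[101:120]

log_train <- log(train)                          # preprocess
fit <- arima(log_train, order = c(1, 1, 0))      # train
log_pred <- predict(fit, n.ahead = 20)$pred      # predict
pred <- exp(as.numeric(log_pred))                # postprocess (inverse transform)
mse <- mean((test - pred)^2)                     # evaluate
```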

#Benchmarking tspred objects
 > bmrk_results <- benchmark( tspred_arima_res, list(tspred_mlm_res) )

A user-defined MLM using TSPred

#Subclass my.model
 > my.model <- function(train_par=NULL, pred_par=NULL){
      MLM(train_func = my.model.func, train_par = c(train_par),
          pred_func = my.model.pred.func, pred_par = c(pred_par),
          method = "Name of my model", subclass = "my.model" )
 }

#Obtaining an instance of the subclass my.model
 > model <- my.model(train_par = list(par1="a", par2="b"), pred_par = list(par3="c"))
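The constructor/subclass pattern above can be reproduced with plain S3 objects. In this self-contained sketch, a toy base constructor stands in for TSPred's MLM(), and the model itself (a mean predictor) is purely illustrative:

```r
# S3 sketch of the user-defined model pattern: a base constructor that
# attaches a subclass, and a user constructor built on top of it.
MLM_like <- function(train_func, pred_func, method, subclass) {
  obj <- list(train_func = train_func, pred_func = pred_func, method = method)
  class(obj) <- c(subclass, "MLM_like")
  obj
}

my.mean.model <- function() {
  MLM_like(train_func = function(x) mean(x),          # "training": store the mean
           pred_func  = function(m, h) rep(m, h),     # "prediction": repeat it h times
           method = "Mean model", subclass = "my.mean.model")
}

m      <- my.mean.model()
fitted <- m$train_func(c(1, 2, 3))
preds  <- m$pred_func(fitted, 5)   # five copies of the training mean
```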

Other relevant functions:

Nonstationarity treatment:

  • LogT - Logarithmic transform.
  • BCT - Box-Cox transform.
  • an - Adaptive normalization.
  • Diff - Differencing.
  • mas - Moving average smoothing.
  • pct - Percentage change transform.
  • WaveletT - Wavelet transform.
  • emd - Empirical mode decomposition.
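Two of the listed transforms are easy to sketch in base R together with their inverses, which is what makes them usable as reversible pre(post)processing steps: differencing (cf. Diff) and the percentage change transform (cf. pct). Conceptual only, not TSPred's code:

```r
# Differencing and percentage change, each with its inverse.
x <- c(100, 110, 121, 133.1)

d  <- diff(x)                       # differencing
x2 <- cumsum(c(x[1], d))            # inverse: reintegrate from the first value

p  <- diff(x) / head(x, -1)         # percentage change (here constant 10%)
x3 <- x[1] * cumprod(c(1, 1 + p))   # inverse of the percentage change
```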

Fittest linear models:

  • fittestLM - Automatically finding fittest linear model for prediction.
  • fittestArima - Automatic ARIMA fitting, prediction and accuracy evaluation.
  • fittestArimaKF - Automatic ARIMA fitting and prediction with Kalman filter.
  • fittestPolyR - Automatic fitting and prediction of polynomial regression.
  • fittestPolyRKF - Automatic fitting and prediction of polynomial regression with Kalman filter.
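The idea behind these "fittest" helpers can be sketched in base R: fit a few candidate models and keep the one with the lowest AIC. The candidate grid below is an illustrative choice; fittestArima automates this search inside TSPred:

```r
# Minimal AIC-based model selection over a small grid of ARIMA orders.
set.seed(7)
x <- as.numeric(arima.sim(model = list(ar = 0.6), n = 200))

orders <- list(c(0, 0, 0), c(1, 0, 0), c(2, 0, 0))
fits   <- lapply(orders, function(o) arima(x, order = o))
aics   <- sapply(fits, AIC)
best   <- fits[[which.min(aics)]]   # fittest candidate by AIC
```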

Automatic preprocessing/decomposition and prediction:

  • fittestMAS - Automatic prediction with moving average smoothing.
  • fittestWavelet - Automatic prediction with wavelet transform.
  • fittestEMD - Automatic prediction with empirical mode decomposition.
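As an example of the smoothing these helpers build on, moving average smoothing (cf. mas/fittestMAS) can be written with stats::filter; the window size below is an illustrative choice:

```r
# One-sided moving average smoothing of order k, dropping the leading
# positions where the window is incomplete. Conceptual sketch only.
mas <- function(x, order) {
  as.numeric(stats::filter(x, rep(1 / order, order), sides = 1))[order:length(x)]
}

x <- 1:10
s <- mas(x, order = 3)   # means of consecutive triples
```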

Package architecture: TSPred Architecture



Works developed with TSPred:

Experimental Review of Nonstationary Time Series Transformation Methods (Knowledge-Based Systems 2019):

Starting from version 4.0, the TSPred R-Package provides functions for addressing nonstationarity in time series. These functions implement several transformation methods known to aid the prediction of nonstationary time series. Using ARMA as a baseline prediction model, TSPred enables a comparative analysis of the effect of each implemented transformation method on time series prediction.
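The shape of such a comparative analysis can be sketched in base R: predict the same nonstationary series with an ARMA-type baseline after two different treatments and compare the resulting MSEs. The series, orders, and the two treatments (differencing vs. log transform) are illustrative choices, not the experimental setup of the paper:

```r
# Compare prediction error of a baseline model under two transformations.
set.seed(3)
x <- exp(cumsum(rnorm(150, mean = 0.01, sd = 0.05)))  # nonstationary, positive
train <- x[1:130]; test <- x[131:150]

# Treatment 1: differencing, handled by the d = 1 term of ARIMA
p1 <- as.numeric(predict(arima(train, order = c(1, 1, 0)), n.ahead = 20)$pred)

# Treatment 2: log transform, model, then invert the transform
p2 <- exp(as.numeric(predict(arima(log(train), order = c(1, 1, 0)),
                             n.ahead = 20)$pred))

mses <- c(diff = mean((test - p1)^2), log = mean((test - p2)^2))
```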

Datasets, code, and results of a thorough experiment reviewing nonstationary time series transformation methods with the functions available in TSPred are presented on the following page:

An Experimental Review of Nonstationary Time Series Transformation Methods

Benchmarking machine learning methods using linear models for univariate time series prediction (IJCNN 2017):

The TSPred R-Package enables the evaluation of time series prediction methods against ARIMA. ARIMA establishes a baseline linear prediction model that can be consistently used for comparison with several other machine learning methods. To aid such comparisons, we have included some benchmark prediction competition datasets. Evaluation processes based on five of the most important time series prediction competitions organized so far are presented in the following.

These competitions were selected because they keep their datasets and results available, and because they provide access to papers from a large number of competitors describing the applied methods. The works presented in them cover a wide variety of machine learning methods, reflecting the community's sustained efforts on the time series prediction problem over the years.

Furthermore, all the selected competitions provide free and easy access to the performance evaluation metrics used, as well as the ranked prediction errors of each of their 125 competitors. This enabled us to compare the prediction errors and performance of the competitors' methods against the baseline, facilitating the testing and analysis of prediction results.

These selected benchmarks differ from each other in many aspects, such as the number and length of their time series, the number of observations to be predicted, seasonality, missing data, and prediction error metrics.