Rebecca Pontes Salles edited this page Apr 29, 2017 · 3 revisions

The CATS Competition Experiment

The CATS Competition (Lendasse et al. 2004, Lendasse et al. 2007) presented an artificial time series with 5,000 points, among which 100 are unknown.

The competition asked participants to predict the 100 unknown values of the given time series, which are grouped into five non-consecutive blocks of 20 successive values. Fig. 1 depicts the CATS Competition time series and marks the five blocks of unknown values.

Fig. 1 The CATS Competition time series containing 5 blocks of unknown values, referred to here as blocks (a), (b), (c), (d) and (e)

The unknown values are elements 981-1000, 1981-2000, 2981-3000, 3981-4000 and 4981-5000 of the series, shown in Fig. 1.a, Fig. 1.b, Fig. 1.c, Fig. 1.d, and Fig. 1.e, respectively.
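The block positions listed above follow a regular pattern, which the following base R sketch reproduces. The variable names (`block_start`, `block_end`, `unknown_idx`, `known_idx`) are illustrative only and are not part of TSPred.

```r
# Positions of the five blocks of unknown values in the 5,000-point series.
block_end   <- seq(1000, 5000, by = 1000)   # 1000, 2000, ..., 5000
block_start <- block_end - 19               # 981, 1981, ..., 4981

# One column per block (20 rows each), named a to e as in Fig. 1.
unknown_idx <- mapply(seq, block_start, block_end)
colnames(unknown_idx) <- letters[1:5]

# Every remaining position holds a known value.
known_idx <- setdiff(seq_len(5000), unknown_idx)

dim(unknown_idx)    # 20 rows, 5 columns
length(known_idx)   # 4900 known values
```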

The CATS Competition evaluated performance using the mean squared error (MSE) computed on all 100 unknown values (E1) and on the first 80 unknown values (E2).
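As a minimal sketch of the two criteria, the base R snippet below computes E1 and E2 on synthetic stand-in data (the vectors `actual` and `predicted` are made up for illustration and are not competition values). It also checks that, because every block holds 20 values, E1 equals the average of the five per-block MSEs and E2 the average of the first four.

```r
set.seed(1)
actual    <- rnorm(100)                       # stand-in for the 100 true values
predicted <- actual + rnorm(100, sd = 0.5)    # stand-in for a method's predictions

sq_err <- (actual - predicted)^2

E1 <- mean(sq_err)          # MSE over all 100 unknown values
E2 <- mean(sq_err[1:80])    # MSE over the first 80 unknown values

# Per-block MSEs: blocks of equal size, so their average matches the pooled MSE.
block   <- rep(1:5, each = 20)
mse_blk <- tapply(sq_err, block, mean)

all.equal(E1, mean(mse_blk))         # TRUE
all.equal(E2, mean(mse_blk[1:4]))    # TRUE
```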

The second error criterion (E2) was considered relevant because some of the proposed methods used interpolation techniques, which cannot be applied to the fifth block of unknown values, since no known values follow it. However, E2 was intended only to provide additional information about the performance and properties of the methods.
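The limitation of interpolation on a trailing block can be illustrated with a toy example. The sketch below uses plain linear interpolation via base R's `approx`, which is a deliberate simplification and not the ARIMA-based method used in the scripts; the series `y` is synthetic.

```r
# Toy series: 30 points with two gaps, one interior and one at the very end.
x <- seq_len(30)
y <- sin(x / 5)
y[11:15] <- NA    # interior gap: known values on both sides
y[26:30] <- NA    # trailing gap: nothing known after it

known <- !is.na(y)

# Linear interpolation between known points; rule = 1 yields NA
# for any position outside the range of the known points.
filled <- approx(x[known], y[known], xout = x, rule = 1)$y

anyNA(filled[11:15])         # FALSE: the interior gap is filled
all(is.na(filled[26:30]))    # TRUE: the trailing gap needs extrapolation
```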

The CATS time series is available in the TSPred R package as CATS. The values of this time series that were to be predicted are available as CATS.cont.


Experiment R-Scripts

#Install hydroGOF package, used for calculating MSE errors
> install.packages("hydroGOF")

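#Load the hydroGOF package (provides mse) and the TSPred package (provides the datasets and ARIMA functions)
> library("hydroGOF")
> library("TSPred")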

#Load the datasets CATS and CATS.cont
> data(CATS,CATS.cont)
Using interpolation techniques:
#Automatically fits ARIMA models to each time series of known values in CATS using interpolation techniques and predicts the values in CATS.cont
> pred <- arimainterp(CATS,n.ahead=20,extrap=TRUE)

#Calculates the MSE error of prediction between pred and CATS.cont
> MSE <- mapply(mse, CATS.cont, data.frame(pred), MoreArgs = list(na.rm=TRUE), SIMPLIFY = TRUE, USE.NAMES = TRUE)

#Calculates the errors E1 and E2 (means of the per-block MSEs) and binds them into a one-row matrix
> cbind( E1 = sum(MSE*(1/ncol(CATS.cont))), E2 = sum(head(MSE,ncol(CATS.cont)-1)*(1/(ncol(CATS.cont)-1))) )
Using only extrapolation techniques:
#Automatically fits an ARIMA model to each time series of known values in CATS and predicts the values in CATS.cont
#Also plots the predictions against CATS.cont
> pred <- marimapred(CATS,CATS.cont,plot=TRUE)

#Calculates the MSE error of prediction between pred and CATS.cont 
> MSE <- mapply(mse, CATS.cont, data.frame(pred), MoreArgs = list(na.rm=TRUE), SIMPLIFY = TRUE, USE.NAMES = TRUE)

#Calculates the errors E1 and E2 (means of the per-block MSEs) and binds them into a one-row matrix
> cbind( E1 = sum(MSE*(1/ncol(CATS.cont))), E2 = sum(head(MSE,ncol(CATS.cont)-1)*(1/(ncol(CATS.cont)-1))) )
Example of plotted graphic:

Fig. 2 ARIMA predictions (solid line) of the last of the 5 blocks of unknown values of the CATS Competition time series. The actual time series values are represented by the dashed line.


General R-Functions

Using interpolation techniques:
> ARIMA.CATS.Interp <- function(TimeSeries, TimeSeriesCont){
    if(is.null(TimeSeries) || length(TimeSeries) == 0) stop("TimeSeries is required and must have positive length")
    if(is.null(TimeSeriesCont) || length(TimeSeriesCont) == 0) stop("TimeSeriesCont is required and must have positive length")
    
    Predictions <- arimainterp(TimeSeries, n.ahead=nrow(TimeSeriesCont), extrap=TRUE)
    
    MSE <- mapply(mse, TimeSeriesCont, data.frame(Predictions), MoreArgs = list(na.rm=TRUE), SIMPLIFY = TRUE, USE.NAMES = TRUE)
    
    return (cbind( E1 = sum(MSE*(1/ncol(TimeSeriesCont))), E2 = sum(head(MSE,ncol(TimeSeriesCont)-1)*(1/(ncol(TimeSeriesCont)-1))) ))
}
Example:
> ARIMA.CATS.Interp(CATS,CATS.cont)
Using only extrapolation techniques:
> ARIMA.CATS.Extrap <- function(TimeSeries, TimeSeriesCont, plot=FALSE){
    if(is.null(TimeSeries) || length(TimeSeries) == 0) stop("TimeSeries is required and must have positive length")
    if(is.null(TimeSeriesCont) || length(TimeSeriesCont) == 0) stop("TimeSeriesCont is required and must have positive length")
    
    Predictions <- marimapred(TimeSeries, TimeSeriesCont, plot=plot)
    
    MSE <- mapply(mse, TimeSeriesCont, data.frame(Predictions), MoreArgs = list(na.rm=TRUE), SIMPLIFY = TRUE, USE.NAMES = TRUE)
    
    return (cbind( E1 = sum(MSE*(1/ncol(TimeSeriesCont))), E2 = sum(head(MSE,ncol(TimeSeriesCont)-1)*(1/(ncol(TimeSeriesCont)-1))) ))
}
Example:
> ARIMA.CATS.Extrap(CATS,CATS.cont,plot=TRUE)

References

A. Lendasse, E. Oja, O. Simula, M. Verleysen, et al., 2004, Time series prediction competition: The CATS benchmark, In: IJCNN’2004, International Joint Conference on Neural Networks.

A. Lendasse, E. Oja, O. Simula, and M. Verleysen, 2007, Time series prediction competition: The CATS benchmark, Neurocomputing, v. 70, n. 13–15 (Aug.), p. 2325–2329.

