Make it easy to make ensemble time series forecast
forecastHybrid
is an R package to make it easier to use the average predictions of ‘ensembles’ (or ‘combinations’) of time series models from Rob Hyndman’s forecast
package. It looks after the averaging, and also calculates prediction intervals by a conservative method that aims to redress the general over-optimism in forecasting prediction intervals.
New in version 0.3.0 is:
- Following developments in the
forecast
package, prediction intervals are now created fornnetar
objects in the ensemble. This should address one aspect of incorrect prediction intervals (e.g. issue #37). - theta models can be added (by including “
f
” in themodels =
argument forhybridModel()
) and are indeed part of the default - so unless set otherwise,hybridModel()
will now fit six models. accuracy.cvts()
is now exported. It returns ra ange of summary measures of the cross-validated forecast accuracy for objects created bycvts
. Note that development oncvts
forforecastHybrid
has been independent of Hyndman’s work on a similarly purposedtscv
function inforecast
- be careful not to get the two confused.plot.hybridModel()
now supportsggplot2
graphics when the argumentggplot = TRUE
is passed.- Time series must be at least four observations long.
- Fixed an error where
e.args
was passed totbats
instead oft.args
.
In some respects, version 0.2.0 was a more important upgrade because it introduced cvts
and the use of a weighted average forecast with weights for the component models chosen by cross-validation. I didn’t get around to writing a blog post when that happened.
Installation
Basic use
The hybridModel
function fits between two and six time series models to a time series, with or without xreg
external regressor explanatory variables. The default, if the models=
argument is left blank, is to fit all six available models. The modelling process also determines a set of weights to be used in subsequent forecasts. The two options are:
- equal weights ie a simple average
- cross validation weights ie a weighted average, with models that performed better in cross-validation given more weight.
Generally, we expect cross-validation weights to perform better, but they are quite computationally intensive and we haven’t got round to implementing parallel processing yet.
Here’s a demo with the median duration of unemployment in weeks from the economics
dataset in the ggplot2
package. This snippet demonstrates both ways of using weights and the default base plot method. This plot method is the one provided by forecast
for all objects of class forecast
, including those created by forecastHybrid.
The cross-validation errors should be better because they give less weight to poor performing forecasts. The forecasts from nnetar
for example are often not good, and in a set of experiments for a recent talk at the Australian Statistical Conference I showed that they are sufficiently poor that they often worsen ensemble forecasts when added to the mix. Setting weights by cross-validation should mitigate this.
All of the component models can be accessed individually if desired:
The ggplot2::autoplot()
method provided by the forecast
package also works:
Theta models
We implemented the Theta forecast method based on thetaf()
in Hyndman’s forecast
(although there are alternatives). We had to split the process into a modelling function thetam()
, and an accompanying forecast.thetam
method. The results are identical to forecast::thetaf()
, as demonstrated below:
Conservative prediction intervals?
The forecastHybrid
method of setting prediction intervals is conservative. It takes the widest range of values covered by any of the component models. In a blog post on different ways of setting forecast combinations, Rob Hyndman points out that, using the monthly co2
dataset of Mauna Loa Atmospheric CO2 Concentration distributed with R, the prediction intervals for the forecastHybrid
method with six component models look too conservative ie sufficiently wide that they look very likely to contain the correct values much more than aimed for. In this particular case I agree - the method should be used with a bit of caution.
In practice, I often use only the auto.arima
and ets
forecasts in combination rather than all six possibilities. Even then, the method is sometimes too conservative with data that of monthly or higher frequency, particularly at the 80% level. Here’s the conclusions from my recent presentation on this topic (apologies for the screenshot from PowerPoint):
So for annual and quarterly real world data, the method is not too conservative at all; and if you want your prediction intervals to contain the true value 95% of the time, the method checks out ok even for monthly data.
Feedback and suggestions
Our preferred approach for you to provide suggestions and bug reports is to file an issue on GitHub. Other comments and feedback could be provided as a comment on this blog post or on Twitter.