Prediction intervals for ensemble time series forecasts

Peter Ellis
December 2016

The only credible test...

M1

Period     DEMOGR  INDUST  INDUSTRIAL  MACRO1  MACRO2  MICRO1  MICRO2  MICRO3
MONTHLY        75     183           0      64      92      10      89     104
QUARTERLY      39      17           1      45      59       5      21      16
YEARLY         30      35           0      30      29      16      29      12

Makridakis et al., 1982

M3

Period     DEMOGRAPHI-  DEMOGRAPHIC  FINANCE  INDUSTRY  MACRO  MICRO  OTHER
MONTHLY              0          111      145       334    312    474     52
OTHER                0            0       29         0      0      4    141
QUARTERLY           57            0       76        83    336    204      0
YEARLY               0          245       58       102     83    146     11

Makridakis et al., 2000

Tourism

Period     TOURISM
MONTHLY        366
QUARTERLY      427
YEARLY         518

Athanasopoulos et al., 2011

UNITED NATIONS COPPER ORE PRODUCTION CANADA

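library(Mcomp); library(Tcomp)  # Mcomp supplies M1, Tcomp supplies forecast_comp()
forecast_comp(M1[[650]], plot = TRUE)

[Figure: plot produced by forecast_comp() for this series]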

RATIO CIVILIAN EMPLOYMENT TO TOTAL WORKING AGE POPULATION

forecast_comp(M1[[1000]], plot = TRUE)

[Figure: plot produced by forecast_comp() for this series]

Ensemble time series forecasts tend to work better than their individual component models

For example:

model               two    four   six    eight
Theta               0.77   1.06   1.35   1.62
ARIMA-ETS average   0.72   1.07   1.38   1.75
ARIMA               0.75   1.12   1.43   1.79
ETS                 0.75   1.11   1.44   1.82
Naive               1.08   1.11   1.74   1.87

Mean absolute scaled error of forecasts for 756 quarterly series from the M3 competition, forecast horizon ranging from two to eight quarters.
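
For reference, the table's metric can be written out as follows. This assumes the usual Hyndman and Koehler definition of the mean absolute scaled error with seasonal naive scaling; the symbols are introduced here for illustration and are not from the slides: n is the length of the training series, h the forecast horizon, and m the seasonal period (m = 4 for quarterly data).

```latex
\mathrm{MASE}
  = \frac{\dfrac{1}{h}\displaystyle\sum_{t=n+1}^{n+h} \lvert y_t - \hat{y}_t \rvert}
         {\dfrac{1}{n-m}\displaystyle\sum_{t=m+1}^{n} \lvert y_t - y_{t-m} \rvert}
```

A value below 1 means the out-of-sample forecast errors are, on average, smaller than the in-sample errors of a seasonal naive forecast.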

How to estimate prediction intervals?

  • The ensemble's interval is usually presumed to be some kind of weighted average of the components' intervals
  • Weights might be estimated from in-sample errors
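
The two points above can be sketched in R roughly as follows. This is a minimal illustration, not the method used in the talk: it assumes the forecast package is installed, uses the built-in AirPassengers series as a stand-in for a competition series, and picks inverse in-sample mean absolute error as one simple, hypothetical weighting scheme. The names `w`, `combine` and `ensemble` are made up for this example.

```r
library(forecast)

y     <- AirPassengers   # example monthly series from base R's datasets
h     <- 8               # forecast horizon
level <- 80              # nominal coverage of the intervals, in percent

# Fit the two component models
fit_arima <- auto.arima(y)
fit_ets   <- ets(y)

# Component forecasts, each with its own prediction interval
fc_arima <- forecast(fit_arima, h = h, level = level)
fc_ets   <- forecast(fit_ets,   h = h, level = level)

# Hypothetical weighting scheme: inverse of in-sample mean absolute error
mae <- c(arima = mean(abs(y - fitted(fit_arima))),
         ets   = mean(abs(y - fitted(fit_ets))))
w <- (1 / mae) / sum(1 / mae)

# Weighted average of one element ("mean", "lower" or "upper")
combine <- function(part) {
  w[["arima"]] * as.numeric(fc_arima[[part]]) +
    w[["ets"]] * as.numeric(fc_ets[[part]])
}

ensemble <- data.frame(
  mean  = combine("mean"),
  lower = combine("lower"),
  upper = combine("upper")
)
print(round(ensemble, 1))
```

Averaging the interval bounds this way treats the ensemble's uncertainty as a blend of the components' stated uncertainty, so it can only be as well calibrated as the intervals it starts from.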

But the components' own prediction intervals have poor coverage.
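
Coverage here means the share of actual values that land inside their prediction interval, compared with the nominal level. A rough sketch of how it might be checked on one held-out year, using base R only: the seasonal ARIMA(0,1,1)(0,1,1) model on the log scale is a stand-in chosen for illustration, not one of the models from the talk, and the `coverage` helper is hypothetical. A single series gives a very noisy estimate; in practice the check would be averaged over many series, such as a competition collection.

```r
# Share of actual values that fall inside their prediction interval
coverage <- function(actual, lower, upper) {
  mean(actual >= lower & actual <= upper)
}

# Hold out the final year of the series as a test set
train <- window(AirPassengers, end = c(1959, 12))
test  <- window(AirPassengers, start = c(1960, 1))

# Stand-in model: seasonal ARIMA(0,1,1)(0,1,1) fitted on the log scale
fit  <- arima(log(train), order = c(0, 1, 1),
              seasonal = list(order = c(0, 1, 1), period = 12))
pred <- predict(fit, n.ahead = length(test))

# Two-sided 80% interval, back-transformed to the original scale
z     <- qnorm(0.9)
lower <- exp(pred$pred - z * pred$se)
upper <- exp(pred$pred + z * pred$se)

cov80 <- coverage(as.numeric(test), as.numeric(lower), as.numeric(upper))
cat("Empirical coverage of the nominal 80% interval:", cov80, "\n")
```

If the intervals were well calibrated, the empirical figure would sit close to 0.8 once averaged over enough series and horizons.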