This vignette briefly explains how to use the adam() function and the related auto.adam() in the smooth package. It does not aim to cover all aspects of the function, but focuses on the main ones.
ADAM stands for Augmented Dynamic Adaptive Model. It is a model that underlies ETS, ARIMA and regression, connecting them in a unified framework. The underlying model for ADAM is a Single Source of Error state space model, which is explained in detail separately in an online textbook.
The main philosophy of the adam() function is to be agnostic to the provided data. This means that it will work with ts, msts, zoo, xts, data.frame, numeric and other classes of data.
The specification of seasonality in the model is done using a separate parameter, lags, so you are not obliged to transform the existing data into something specific and can use it as is. If you provide a matrix, a data.frame, a data.table, or any other multivariate structure, then the function will use the first column for the response variable and the others for the explanatory ones. One thing that is currently assumed in the function is that the data is measured at a regular frequency. If this is not the case, you will need to introduce missing values manually.
In order to run the experiments in this vignette, we need to load the following packages:
require(greybox)
require(smooth)
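As a quick illustration of the data-agnostic behaviour discussed above, here is a minimal sketch (the ANA model and the variable names are arbitrary choices for illustration): we strip the ts class from AirPassengers and declare the seasonality explicitly via lags:

yNumeric <- as.numeric(AirPassengers)
# Seasonality comes from lags, not from the ts frequency
testModelNumeric <- adam(yNumeric, "ANA", lags=c(1,12), h=12, holdout=TRUE)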
First and foremost, ADAM implements the ETS model, although in a more flexible way than in Hyndman et al. (2008): it supports different distributions for the error term, which are regulated via the distribution parameter. By default, the additive error model relies on the Normal distribution, while the multiplicative error one assumes the Inverse Gaussian. If you want to reproduce the classical ETS, you need to specify distribution="dnorm". Here is an example of ADAM ETS(MMM) with the Normal distribution on the AirPassengers data:
testModel <- adam(AirPassengers, "MMM", lags=c(1,12), distribution="dnorm",
                  h=12, holdout=TRUE)
summary(testModel)
#>
#> Model estimated using adam() function: ETS(MMM)
#> Response variable: AirPassengers
#> Distribution used in the estimation: Normal
#> Loss function type: likelihood; Loss function value: 468.9192
#> Coefficients:
#> Estimate Std. Error Lower 2.5% Upper 97.5%
#> alpha 0.8451 0.0844 0.6780 1.0000 *
#> beta 0.0205 0.0265 0.0000 0.0727
#> gamma 0.0000 0.0373 0.0000 0.0736
#> level 120.2667 14.4276 91.6885 148.7758 *
#> trend 1.0017 0.0100 0.9819 1.0216 *
#> seasonal_1 0.9132 0.0078 0.8987 0.9365 *
#> seasonal_2 0.8996 0.0082 0.8851 0.9229 *
#> seasonal_3 1.0301 0.0096 1.0156 1.0533 *
#> seasonal_4 0.9866 0.0084 0.9721 1.0098 *
#> seasonal_5 0.9852 0.0073 0.9707 1.0085 *
#> seasonal_6 1.1164 0.0095 1.1019 1.1397 *
#> seasonal_7 1.2328 0.0118 1.2183 1.2561 *
#> seasonal_8 1.2262 0.0109 1.2117 1.2495 *
#> seasonal_9 1.0671 0.0093 1.0527 1.0904 *
#> seasonal_10 0.9279 0.0090 0.9134 0.9512 *
#> seasonal_11 0.8058 0.0078 0.7914 0.8291 *
#>
#> Error standard deviation: 0.0376
#> Sample size: 132
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 115
#> Information criteria:
#> AIC AICc BIC BICc
#> 971.8385 977.2069 1020.8461 1033.9526
plot(forecast(testModel,h=12,interval="prediction"))
You might notice that the summary contains more than what is reported by other smooth functions. It also produces standard errors for the estimated parameters, based on the Fisher Information calculation. Note that this is computationally expensive, so if you have a model with more than 30 variables, the calculation of standard errors may take a long time. The default print() method produces a shorter summary of the model, without the standard errors (similar to what es() does):
testModel
#> Time elapsed: 0.12 seconds
#> Model estimated using adam() function: ETS(MMM)
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 468.9192
#> Persistence vector g:
#> alpha beta gamma
#> 0.8451 0.0205 0.0000
#>
#> Sample size: 132
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 115
#> Information criteria:
#> AIC AICc BIC BICc
#> 971.8385 977.2069 1020.8461 1033.9526
#>
#> Forecast errors:
#> ME: -4.532; MAE: 15.668; RMSE: 21.652
#> sCE: -20.72%; Asymmetry: -12.1%; sMAE: 5.969%; sMSE: 0.68%
#> MASE: 0.651; RMSSE: 0.691; rMAE: 0.206; rRMSE: 0.21
Also, note that the prediction intervals in the case of multiplicative error models are approximate. It is advisable to use simulations instead (which is slower, but more accurate):
plot(forecast(testModel,h=18,interval="simulated"))
If you want to do residual diagnostics, it is recommended to use the plot() function, something like this (you can select which of the plots to produce):
par(mfcol=c(3,4))
plot(testModel,which=c(1:11))
par(mfcol=c(1,1))
plot(testModel,which=12)
By default, ADAM estimates models via maximising the likelihood function. But there is also the loss parameter, which allows selecting from a list of already implemented loss functions (again, see the documentation for adam() for the full list) or using a function written by the user. Here is how to do the latter on the example of BJsales:
lossFunction <- function(actual, fitted, B){
    return(sum(abs(actual-fitted)^3))
}
testModel <- adam(BJsales, "AAN", silent=FALSE, loss=lossFunction,
                  h=12, holdout=TRUE)
testModel
#> Time elapsed: 0.02 seconds
#> Model estimated using adam() function: ETS(AAN)
#> Distribution assumed in the model: Normal
#> Loss function type: custom; Loss function value: 599.2829
#> Persistence vector g:
#> alpha beta
#> 1.0000 0.2253
#>
#> Sample size: 138
#> Number of estimated parameters: 4
#> Number of degrees of freedom: 134
#> Information criteria are unavailable for the chosen loss & distribution.
#>
#> Forecast errors:
#> ME: 3.02; MAE: 3.133; RMSE: 3.871
#> sCE: 15.942%; Asymmetry: 91.8%; sMAE: 1.378%; sMSE: 0.029%
#> MASE: 2.63; RMSSE: 2.523; rMAE: 1.011; rRMSE: 1.01
Note that the function must have the parameters actual, fitted and B, which correspond to the vector of actual values, the vector of fitted values on each iteration and the vector of the parameters being optimised.
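For instance, here is a hedged sketch of another user-defined loss with the same interface, minimising the sum of absolute errors (the function name and the model are arbitrary choices):

lossMAE <- function(actual, fitted, B){
    # B must be present in the signature even though this loss does not use it
    return(sum(abs(actual - fitted)))
}
testModelMAE <- adam(BJsales, "AAN", loss=lossMAE, h=12, holdout=TRUE)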
The loss and distribution parameters are independent, so in the example above we have assumed that the error term follows the Normal distribution, but we have estimated its parameters using a non-conventional loss, because we can. Some distributions assume an additional parameter, which can either be estimated or provided by the user. These include the Asymmetric Laplace (distribution="dalaplace") with alpha, the Generalised Normal and Log-Generalised Normal (distribution=c("dgnorm","dlgnorm")) with shape, and Student's t (distribution="dt") with nu:
testModel <- adam(BJsales, "MMN", silent=FALSE, distribution="dgnorm", shape=3,
                  h=12, holdout=TRUE)
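Similarly, a hedged sketch with the Asymmetric Laplace distribution, fixing alpha instead of estimating it (the value 0.95 is an arbitrary illustration, effectively targeting the 95% quantile):

testModelALaplace <- adam(BJsales, "AAN", distribution="dalaplace", alpha=0.95,
                          h=12, holdout=TRUE)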
The model selection in ADAM ETS relies on information criteria and works correctly only for loss="likelihood". There are several options for how to select the model; see them in the description of the function: ?adam(). The default one uses a branch-and-bound algorithm, similar to the one used in es(), but it only considers additive trend models (the multiplicative trend ones are less stable and need more attention from a forecaster):
testModel <- adam(AirPassengers, "ZXZ", lags=c(1,12), silent=FALSE,
                  h=12, holdout=TRUE)
#> Forming the pool of models based on... ANN , ANA , MNM , MAM , Estimation progress: 71 %86 %100 %... Done!
testModel
#> Time elapsed: 0.55 seconds
#> Model estimated using adam() function: ETS(MAM)
#> Distribution assumed in the model: Gamma
#> Loss function type: likelihood; Loss function value: 467.2981
#> Persistence vector g:
#> alpha beta gamma
#> 0.7691 0.0053 0.0000
#>
#> Sample size: 132
#> Number of estimated parameters: 17
#> Number of degrees of freedom: 115
#> Information criteria:
#> AIC AICc BIC BICc
#> 968.5961 973.9646 1017.6038 1030.7102
#>
#> Forecast errors:
#> ME: 9.537; MAE: 20.784; RMSE: 26.106
#> sCE: 43.598%; Asymmetry: 64.8%; sMAE: 7.918%; sMSE: 0.989%
#> MASE: 0.863; RMSSE: 0.833; rMAE: 0.273; rRMSE: 0.254
Note that the function produces point forecasts if h>0, but it does not generate prediction intervals. This is why you need to use the forecast() method (as shown in the first example in this vignette).
Similarly to es(), the function supports the combination of models, but it saves all the tested models in the output for potential reuse. Here is how it works:
testModel <- adam(AirPassengers, "CXC", lags=c(1,12),
                  h=12, holdout=TRUE)
testForecast <- forecast(testModel,h=18,interval="semiparametric", level=c(0.9,0.95))
testForecast
#> Point forecast Lower bound (5%) Lower bound (2.5%) Upper bound (95%)
#> Jan 1960 412.6246 389.5669 385.2802 436.2369
#> Feb 1960 407.7283 378.7151 373.3670 437.6366
#> Mar 1960 470.0235 429.8654 422.5230 511.6786
#> Apr 1960 453.1200 411.1176 403.4699 496.8268
#> May 1960 453.9442 410.7956 402.9500 498.8901
#> Jun 1960 515.9682 464.4714 455.1338 569.7223
#> Jul 1960 572.6622 512.9232 502.1196 635.1440
#> Aug 1960 571.3754 510.0484 498.9773 635.6037
#> Sep 1960 499.1215 444.9185 435.1408 555.9206
#> Oct 1960 436.8293 388.4180 379.6965 487.6094
#> Nov 1960 381.0985 338.1248 330.3919 426.2136
#> Dec 1960 429.7057 379.5381 370.5320 482.4666
#> Jan 1961 437.2582 383.1664 373.4963 494.3226
#> Feb 1961 431.9490 374.9482 364.8088 492.3048
#> Mar 1961 497.8071 427.2176 414.7363 572.8824
#> Apr 1961 479.7731 407.9884 395.3571 556.3898
#> May 1961 480.5163 407.1042 394.2120 558.9822
#> Jun 1961 546.0243 460.6961 445.7441 637.3716
#> Upper bound (97.5%)
#> Jan 1960 440.8930
#> Feb 1960 443.5806
#> Mar 1960 520.0179
#> Apr 1960 505.6094
#> May 1960 507.9326
#> Jun 1960 580.5632
#> Jul 1960 647.7740
#> Aug 1960 648.6068
#> Sep 1960 567.4270
#> Oct 1960 497.9081
#> Nov 1960 435.3726
#> Dec 1960 493.1995
#> Jan 1961 505.9721
#> Feb 1961 504.6780
#> Mar 1961 588.3503
#> Apr 1961 572.2380
#> May 1961 575.2389
#> Jun 1961 656.3307
plot(testForecast)
Yes, now we support vectors for the levels in case you want to produce several. In fact, we also support the side parameter for the prediction interval, so you can extract specific quantiles without a hassle:
forecast(testModel,h=18,interval="semiparametric", level=c(0.9,0.95,0.99), side="upper")
#> Point forecast Upper bound (90%) Upper bound (95%) Upper bound (99%)
#> Jan 1960 412.6246 430.9093 436.2369 446.3480
#> Feb 1960 407.7283 430.8493 437.6366 450.5587
#> Mar 1960 470.0235 502.1743 511.6786 529.8266
#> Apr 1960 453.1200 486.8270 496.8268 515.9494
#> May 1960 453.9442 488.5976 498.8901 518.5818
#> Jun 1960 515.9682 557.3906 569.7223 593.3382
#> Jul 1960 572.6622 620.7856 635.1440 662.6663
#> Aug 1960 571.3754 620.8272 635.6037 663.9448
#> Sep 1960 499.1215 542.8470 555.9206 581.0019
#> Oct 1960 436.8293 475.9114 487.6094 510.0618
#> Nov 1960 381.0985 415.8130 426.2136 446.1838
#> Dec 1960 429.7057 470.2851 482.4666 505.8752
#> Jan 1961 437.2582 481.1129 494.3226 519.7428
#> Feb 1961 431.9490 478.2896 492.3048 519.3197
#> Mar 1961 497.8071 555.3845 572.8824 606.6772
#> Apr 1961 479.7731 538.4800 556.3898 591.0344
#> May 1961 480.5163 540.6183 558.9822 594.5276
#> Jun 1961 546.0243 615.9648 637.3716 678.8359
A brand new thing in the function is the possibility to use several frequencies (double / triple / quadruple / … seasonal models). In order to show how it works, we will generate an artificial time series, inspired by half-hourly electricity demand, using the sim.gum() function:
ordersGUM <- c(1,1,1)
lagsGUM <- c(1,48,336)
initialGUM1 <- -25381.7
initialGUM2 <- c(23955.09, 24248.75, 24848.54, 25012.63, 24634.14, 24548.22, 24544.63, 24572.77,
                 24498.33, 24250.94, 24545.44, 25005.92, 26164.65, 27038.55, 28262.16, 28619.83,
28892.19, 28575.07, 28837.87, 28695.12, 28623.02, 28679.42, 28682.16, 28683.40,
28647.97, 28374.42, 28261.56, 28199.69, 28341.69, 28314.12, 28252.46, 28491.20,
28647.98, 28761.28, 28560.11, 28059.95, 27719.22, 27530.23, 27315.47, 27028.83,
26933.75, 26961.91, 27372.44, 27362.18, 27271.31, 26365.97, 25570.88, 25058.01)
initialGUM3 <- c(23920.16, 23026.43, 22812.23, 23169.52, 23332.56, 23129.27, 22941.20, 22692.40,
                 22607.53, 22427.79, 22227.64, 22580.72, 23871.99, 25758.34, 28092.21, 30220.46,
31786.51, 32699.80, 33225.72, 33788.82, 33892.25, 34112.97, 34231.06, 34449.53,
34423.61, 34333.93, 34085.28, 33948.46, 33791.81, 33736.17, 33536.61, 33633.48,
33798.09, 33918.13, 33871.41, 33403.75, 32706.46, 31929.96, 31400.48, 30798.24,
29958.04, 30020.36, 29822.62, 30414.88, 30100.74, 29833.49, 28302.29, 26906.72,
26378.64, 25382.11, 25108.30, 25407.07, 25469.06, 25291.89, 25054.11, 24802.21,
24681.89, 24366.97, 24134.74, 24304.08, 25253.99, 26950.23, 29080.48, 31076.33,
32453.20, 33232.81, 33661.61, 33991.21, 34017.02, 34164.47, 34398.01, 34655.21,
34746.83, 34596.60, 34396.54, 34236.31, 34153.32, 34102.62, 33970.92, 34016.13,
34237.27, 34430.08, 34379.39, 33944.06, 33154.67, 32418.62, 31781.90, 31208.69,
30662.59, 30230.67, 30062.80, 30421.11, 30710.54, 30239.27, 28949.56, 27506.96,
26891.75, 25946.24, 25599.88, 25921.47, 26023.51, 25826.29, 25548.72, 25405.78,
25210.45, 25046.38, 24759.76, 24957.54, 25815.10, 27568.98, 29765.24, 31728.25,
32987.51, 33633.74, 34021.09, 34407.19, 34464.65, 34540.67, 34644.56, 34756.59,
34743.81, 34630.05, 34506.39, 34319.61, 34110.96, 33961.19, 33876.04, 33969.95,
34220.96, 34444.66, 34474.57, 34018.83, 33307.40, 32718.90, 32115.27, 31663.53,
30903.82, 31013.83, 31025.04, 31106.81, 30681.74, 30245.70, 29055.49, 27582.68,
26974.67, 25993.83, 25701.93, 25940.87, 26098.63, 25771.85, 25468.41, 25315.74,
25131.87, 24913.15, 24641.53, 24807.15, 25760.85, 27386.39, 29570.03, 31634.00,
32911.26, 33603.94, 34020.90, 34297.65, 34308.37, 34504.71, 34586.78, 34725.81,
34765.47, 34619.92, 34478.54, 34285.00, 34071.90, 33986.48, 33756.85, 33799.37,
33987.95, 34047.32, 33924.48, 33580.82, 32905.87, 32293.86, 31670.02, 31092.57,
30639.73, 30245.42, 30281.61, 30484.33, 30349.51, 29889.23, 28570.31, 27185.55,
26521.85, 25543.84, 25187.82, 25371.59, 25410.07, 25077.67, 24741.93, 24554.62,
24427.19, 24127.21, 23887.55, 24028.40, 24981.34, 26652.32, 28808.00, 30847.09,
32304.13, 33059.02, 33562.51, 33878.96, 33976.68, 34172.61, 34274.50, 34328.71,
34370.12, 34095.69, 33797.46, 33522.96, 33169.94, 32883.32, 32586.24, 32380.84,
32425.30, 32532.69, 32444.24, 32132.49, 31582.39, 30926.58, 30347.73, 29518.04,
29070.95, 28586.20, 28416.94, 28598.76, 28529.75, 28424.68, 27588.76, 26604.13,
26101.63, 25003.82, 24576.66, 24634.66, 24586.21, 24224.92, 23858.42, 23577.32,
23272.28, 22772.00, 22215.13, 21987.29, 21948.95, 22310.79, 22853.79, 24226.06,
25772.55, 27266.27, 28045.65, 28606.14, 28793.51, 28755.83, 28613.74, 28376.47,
27900.76, 27682.75, 27089.10, 26481.80, 26062.94, 25717.46, 25500.27, 25171.05,
25223.12, 25634.63, 26306.31, 26822.46, 26787.57, 26571.18, 26405.21, 26148.41,
25704.47, 25473.10, 25265.97, 26006.94, 26408.68, 26592.04, 26224.64, 25407.27,
25090.35, 23930.21, 23534.13, 23585.75, 23556.93, 23230.25, 22880.24, 22525.52,
22236.71, 21715.08, 21051.17, 20689.40, 20099.18, 19939.71, 19722.69, 20421.58,
21542.03, 22962.69, 23848.69, 24958.84, 25938.72, 26316.56, 26742.61, 26990.79,
27116.94, 27168.78, 26464.41, 25703.23, 25103.56, 24891.27, 24715.27, 24436.51,
24327.31, 24473.02, 24893.89, 25304.13, 25591.77, 25653.00, 25897.55, 25859.32,
25918.32, 25984.63, 26232.01, 26810.86, 27209.70, 26863.50, 25734.54, 24456.96)
y <- sim.gum(orders=ordersGUM, lags=lagsGUM, nsim=1, frequency=336, obs=3360,
             measurement=rep(1,3), transition=diag(3), persistence=c(0.045,0.162,0.375),
             initial=cbind(initialGUM1,initialGUM2,initialGUM3))$data
We can then apply ADAM to this data:
<- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
testModel silent=FALSE, h=336, holdout=TRUE)
testModel#> Time elapsed: 0.79 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Gamma
#> Loss function type: likelihood; Loss function value: 22013.03
#> Persistence vector g:
#> alpha beta gamma1 gamma2
#> 0.9789 0.0000 0.0211 0.0211
#> Damping parameter: 0.9988
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#> AIC AICc BIC BICc
#> 44038.07 44038.10 44074.15 44074.27
#>
#> Forecast errors:
#> ME: 568.186; MAE: 921.821; RMSE: 1133.784
#> sCE: 628.35%; Asymmetry: 64.2%; sMAE: 3.034%; sMSE: 0.139%
#> MASE: 1.27; RMSSE: 1.112; rMAE: 0.128; rRMSE: 0.129
Note that the more lags you have, the more initial seasonal components the function will need to estimate, which is a difficult task. This is why we used initial="backcasting" in the example above: it speeds up the estimation by reducing the number of parameters to estimate. Still, the optimiser might not get close to the optimal value, so we can help it. First, we can give it more time for the calculation, increasing the number of iterations via maxeval (the default value is 40 iterations for each estimated parameter, e.g. \(40 \times 5 = 200\) in our case):
<- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
testModel silent=FALSE, h=336, holdout=TRUE, maxeval=10000)
testModel#> Time elapsed: 4.92 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Gamma
#> Loss function type: likelihood; Loss function value: 19493.85
#> Persistence vector g:
#> alpha beta gamma1 gamma2
#> 0.0214 0.0000 0.1850 0.2381
#> Damping parameter: 0.9008
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#> AIC AICc BIC BICc
#> 38999.69 38999.72 39035.78 39035.89
#>
#> Forecast errors:
#> ME: 102.323; MAE: 160.013; RMSE: 193.081
#> sCE: 113.158%; Asymmetry: 66.9%; sMAE: 0.527%; sMSE: 0.004%
#> MASE: 0.22; RMSSE: 0.189; rMAE: 0.022; rRMSE: 0.022
This will take more time, but will typically lead to more refined parameters. You can control other parameters of the optimiser as well, such as algorithm, xtol_rel, print_level and others, which are explained in the documentation for the nloptr function from the nloptr package (run nloptr.print.options() for details). Second, we can give a different set of initial parameters to the optimiser; have a look at what the function saves:
testModel$B
and use this as a starting point for the reestimation (e.g. with a different algorithm):
<- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
testModel silent=FALSE, h=336, holdout=TRUE, B=testModel$B)
testModel#> Time elapsed: 0.96 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Gamma
#> Loss function type: likelihood; Loss function value: 19493.85
#> Persistence vector g:
#> alpha beta gamma1 gamma2
#> 0.0214 0.0000 0.1850 0.2381
#> Damping parameter: 0.906
#> Sample size: 3024
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 3018
#> Information criteria:
#> AIC AICc BIC BICc
#> 38999.69 38999.72 39035.78 39035.89
#>
#> Forecast errors:
#> ME: 102.323; MAE: 160.013; RMSE: 193.081
#> sCE: 113.158%; Asymmetry: 66.9%; sMAE: 0.527%; sMSE: 0.004%
#> MASE: 0.22; RMSSE: 0.189; rMAE: 0.022; rRMSE: 0.022
If you are ready to wait, you can change the initialisation to initial="optimal", which in our case will take much more time because of the number of estimated parameters (389 for the chosen model). The estimation process in this case might take 20-30 times longer than in the example above.
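For reference, the call would look like this (a sketch, not run here, given the estimation time):

testModelOptimal <- adam(y, "MMdM", lags=c(1,48,336), initial="optimal",
                         silent=TRUE, h=336, holdout=TRUE)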
In addition, you can specify some parts of the initial state vector or some parts of the persistence vector. Here is an example:
<- adam(y, "MMdM", lags=c(1,48,336), initial="backcasting",
testModel silent=TRUE, h=336, holdout=TRUE, persistence=list(beta=0.1))
testModel#> Time elapsed: 0.62 seconds
#> Model estimated using adam() function: ETS(MMdM)[48, 336]
#> Distribution assumed in the model: Gamma
#> Loss function type: likelihood; Loss function value: 21830.26
#> Persistence vector g:
#> alpha beta gamma1 gamma2
#> 0.9671 0.1000 0.0318 0.0329
#> Damping parameter: 0.6265
#> Sample size: 3024
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 3019
#> Information criteria:
#> AIC AICc BIC BICc
#> 43670.52 43670.54 43700.60 43700.68
#>
#> Forecast errors:
#> ME: 612.352; MAE: 945.443; RMSE: 1160.15
#> sCE: 677.192%; Asymmetry: 66.9%; sMAE: 3.112%; sMSE: 0.146%
#> MASE: 1.303; RMSSE: 1.138; rMAE: 0.131; rRMSE: 0.132
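Parts of the initial state vector can be fixed in the same spirit. A hedged sketch, assuming the list interface with a named level element (the value itself is an arbitrary illustration):

testModelInit <- adam(y, "MMdM", lags=c(1,48,336), silent=TRUE,
                      h=336, holdout=TRUE,
                      # Fix the initial level, estimate everything else
                      initial=list(level=mean(y[1:336])))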
The function also handles intermittent data (data with zeroes) and data with missing values. This is partially covered in the vignette on the oes() function. Here is a simple example:
testModel <- adam(rpois(120,0.5), "MNN", silent=FALSE, h=12, holdout=TRUE,
                  occurrence="odds-ratio")
testModel
#> Time elapsed: 0.05 seconds
#> Model estimated using adam() function: iETS(MNN)[O]
#> Occurrence model type: Odds ratio
#> Distribution assumed in the model: Mixture of Bernoulli and Gamma
#> Loss function type: likelihood; Loss function value: 68.7142
#> Persistence vector g:
#> alpha
#> 0
#>
#> Sample size: 108
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 103
#> Information criteria:
#> AIC AICc BIC BICc
#> 294.7652 294.9960 308.1758 299.3518
#>
#> Forecast errors:
#> Asymmetry: -18.254%; sMSE: 23.078%; rRMSE: 0.793; sPIS: 368.43%; sCE: -45.267%
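Other occurrence model types are supported as well. A hedged sketch, assuming the types implemented in oes(), such as the inverse odds ratio:

testModelOccurrence <- adam(rpois(120,0.5), "MNN", h=12, holdout=TRUE,
                            occurrence="inverse-odds-ratio")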
Finally, adam() is faster than the es() function, because its code is more efficient and it uses a different optimisation algorithm with more finely tuned parameters by default. Let's compare:
adamModel <- adam(AirPassengers, "CCC",
                  h=12, holdout=TRUE)
esModel <- es(AirPassengers, "CCC",
              h=12, holdout=TRUE)
"adam:"
#> [1] "adam:"
adamModel
#> Time elapsed: 2.17 seconds
#> Model estimated: ETS(CCC)
#> Loss function type: likelihood
#>
#> Number of models combined: 30
#> Sample size: 132
#> Average number of estimated parameters: 21.8083
#> Average number of degrees of freedom: 110.1917
#>
#> Forecast errors:
#> ME: -1.759; MAE: 14.402; RMSE: 20.569
#> sCE: -8.04%; sMAE: 5.487%; sMSE: 0.614%
#> MASE: 0.598; RMSSE: 0.656; rMAE: 0.19; rRMSE: 0.2
"es():"
#> [1] "es():"
esModel
#> Time elapsed: 2.07 seconds
#> Model estimated: ETS(CCC)
#> Loss function type: likelihood
#>
#> Number of models combined: 30
#> Sample size: 132
#> Average number of estimated parameters: 21.3232
#> Average number of degrees of freedom: 110.6768
#>
#> Forecast errors:
#> ME: 2.906; MAE: 15.858; RMSE: 22.111
#> sCE: 13.287%; sMAE: 6.041%; sMSE: 0.71%
#> MASE: 0.658; RMSSE: 0.706; rMAE: 0.209; rRMSE: 0.215
As mentioned above, ADAM does not only contain ETS; it also contains the ARIMA model, which is regulated via the orders parameter. If you want a pure ARIMA, you need to switch off ETS, which is done via model="NNN":
<- adam(BJsales, "NNN", silent=FALSE, orders=c(0,2,2),
testModel h=12, holdout=TRUE)
testModel#> Time elapsed: 0.04 seconds
#> Model estimated using adam() function: ARIMA(0,2,2)
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 240.5944
#> ARMA parameters of the model:
#> MA:
#> theta1[1] theta2[1]
#> -0.7484 -0.0165
#>
#> Sample size: 138
#> Number of estimated parameters: 5
#> Number of degrees of freedom: 133
#> Information criteria:
#> AIC AICc BIC BICc
#> 491.1888 491.6434 505.8251 506.9449
#>
#> Forecast errors:
#> ME: 2.961; MAE: 3.087; RMSE: 3.812
#> sCE: 15.63%; Asymmetry: 90.2%; sMAE: 1.358%; sMSE: 0.028%
#> MASE: 2.591; RMSSE: 2.485; rMAE: 0.996; rRMSE: 0.995
Given that both models are implemented in the same framework, they can be compared using information criteria.
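For instance, a hedged sketch of such a comparison on BJsales, using the AICc() method from greybox (the specific models are arbitrary choices):

adamETS <- adam(BJsales, "AAN", h=12, holdout=TRUE)
adamARIMA <- adam(BJsales, "NNN", orders=c(0,2,2), h=12, holdout=TRUE)
# The model with the lower value of the information criterion is preferred
c(ETS=AICc(adamETS), ARIMA=AICc(adamARIMA))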
The functionality of ADAM ARIMA is similar to that of the msarima function in the smooth package, although there are several differences. First, changing the distribution parameter allows switching between additive / multiplicative models. For example, distribution="dlnorm" will create an ARIMA equivalent to the one applied to the logarithms of the data:
<- adam(AirPassengers, "NNN", silent=FALSE, lags=c(1,12),
testModel orders=list(ar=c(1,1),i=c(1,1),ma=c(2,2)), distribution="dlnorm",
h=12, holdout=TRUE)
testModel#> Time elapsed: 0.58 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,2)[12]
#> Distribution assumed in the model: Log-Normal
#> Loss function type: likelihood; Loss function value: 506.1549
#> ARMA parameters of the model:
#> AR:
#> phi1[1] phi1[12]
#> 0.0925 0.1077
#> MA:
#> theta1[1] theta2[1] theta1[12] theta2[12]
#> 0.0703 0.3247 -0.4464 -0.0607
#>
#> Sample size: 132
#> Number of estimated parameters: 33
#> Number of degrees of freedom: 99
#> Information criteria:
#> AIC AICc BIC BICc
#> 1078.310 1101.208 1173.442 1229.345
#>
#> Forecast errors:
#> ME: -33.764; MAE: 33.764; RMSE: 37.234
#> sCE: -154.352%; Asymmetry: -100%; sMAE: 12.863%; sMSE: 2.012%
#> MASE: 1.402; RMSSE: 1.188; rMAE: 0.444; rRMSE: 0.362
Second, if you want a model with an intercept / drift, you can do it using the constant parameter:
<- adam(AirPassengers, "NNN", silent=FALSE, lags=c(1,12), constant=TRUE,
testModel orders=list(ar=c(1,1),i=c(1,1),ma=c(2,2)), distribution="dnorm",
h=12, holdout=TRUE)
testModel#> Time elapsed: 0.49 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,2)[12] with drift
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 489.421
#> ARMA parameters of the model:
#> AR:
#> phi1[1] phi1[12]
#> -0.6043 -0.2731
#> MA:
#> theta1[1] theta2[1] theta1[12] theta2[12]
#> 0.3958 0.0039 0.1525 0.1035
#>
#> Sample size: 132
#> Number of estimated parameters: 34
#> Number of degrees of freedom: 98
#> Information criteria:
#> AIC AICc BIC BICc
#> 1046.842 1071.378 1144.857 1204.760
#>
#> Forecast errors:
#> ME: -16.487; MAE: 18.657; RMSE: 23.574
#> sCE: -75.371%; Asymmetry: -85.2%; sMAE: 7.108%; sMSE: 0.807%
#> MASE: 0.775; RMSSE: 0.752; rMAE: 0.245; rRMSE: 0.229
If the model contains non-zero differences, then the constant acts as a drift. Third, you can specify the parameters of ARIMA via the arma parameter in the following manner:
<- adam(AirPassengers, "NNN", silent=FALSE, lags=c(1,12),
testModel orders=list(ar=c(1,1),i=c(1,1),ma=c(2,2)), distribution="dnorm",
arma=list(ar=c(0.1,0.1), ma=c(-0.96, 0.03, -0.12, 0.03)),
h=12, holdout=TRUE)
testModel#> Time elapsed: 0.26 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,2)[12]
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 534.9627
#> ARMA parameters of the model:
#> AR:
#> phi1[1] phi1[12]
#> 0.1 0.1
#> MA:
#> theta1[1] theta2[1] theta1[12] theta2[12]
#> -0.96 0.03 -0.12 0.03
#>
#> Sample size: 132
#> Number of estimated parameters: 27
#> Number of degrees of freedom: 105
#> Information criteria:
#> AIC AICc BIC BICc
#> 1123.925 1138.464 1201.761 1237.255
#>
#> Forecast errors:
#> ME: 9.575; MAE: 17.082; RMSE: 19.148
#> sCE: 43.773%; Asymmetry: 61.9%; sMAE: 6.508%; sMSE: 0.532%
#> MASE: 0.709; RMSSE: 0.611; rMAE: 0.225; rRMSE: 0.186
Finally, the initials for the states can also be provided, although getting the correct ones might be a challenging task (you also need to know how many of them to provide; checking testModel$initial might help):
<- adam(AirPassengers, "NNN", silent=FALSE, lags=c(1,12),
testModel orders=list(ar=c(1,1),i=c(1,1),ma=c(2,0)), distribution="dnorm",
initial=list(arima=AirPassengers[1:24]),
h=12, holdout=TRUE)
testModel#> Time elapsed: 0.31 seconds
#> Model estimated using adam() function: SARIMA(1,1,2)[1](1,1,0)[12]
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 493.7137
#> ARMA parameters of the model:
#> AR:
#> phi1[1] phi1[12]
#> 0.100 0.153
#> MA:
#> theta1[1] theta2[1]
#> -0.3336 0.0581
#>
#> Sample size: 132
#> Number of estimated parameters: 31
#> Number of degrees of freedom: 101
#> Information criteria:
#> AIC AICc BIC BICc
#> 1049.427 1069.267 1138.794 1187.232
#>
#> Forecast errors:
#> ME: -23.835; MAE: 23.967; RMSE: 29.528
#> sCE: -108.963%; Asymmetry: -97.9%; sMAE: 9.13%; sMSE: 1.265%
#> MASE: 0.995; RMSSE: 0.942; rMAE: 0.315; rRMSE: 0.287
If you work with the ADAM ARIMA model, then there is no such thing as the "usual" bounds for the parameters, so the function will use bounds="admissible", checking the AR / MA polynomials in order to make sure that the model is stationary and invertible (aka stable).
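The bounds can also be set explicitly if needed; a minimal sketch, assuming the same naming as in the other smooth functions:

testModelBounds <- adam(BJsales, "NNN", orders=c(0,2,2), bounds="admissible",
                        h=12, holdout=TRUE)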
Similarly to ETS, you can use different distributions and losses for the estimation. Note that the order selection for ARIMA is done in the auto.adam() function, not in adam()! However, if you set orders=list(..., select=TRUE) in adam(), it will call auto.adam() and do the selection.
Finally, ARIMA is typically slower than ETS, mainly because its initial states are more difficult to estimate due to the increased complexity of the model. If you want to speed things up, use initial="backcasting" and reduce the number of iterations via the maxeval parameter.
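Combining these two tips, a hedged sketch (the maxeval value is an arbitrary illustration):

testModelFast <- adam(AirPassengers, "NNN", lags=c(1,12),
                      orders=list(ar=c(1,1), i=c(1,1), ma=c(2,2)),
                      initial="backcasting", maxeval=1000,
                      h=12, holdout=TRUE)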
Another important feature of ADAM is the introduction of explanatory variables. Unlike es(), adam() expects a matrix for data and can work with a formula. If the latter is not provided, then it will use all explanatory variables. Here is a brief example:
BJData <- cbind(BJsales,BJsales.lead)
testModel <- adam(BJData, "AAN", h=18, silent=FALSE)
If you work with a data.frame or similar structures, then you can use them directly: ADAM will extract the response variable either assuming that it is in the first column or from the provided formula (if you specify one via the formula parameter). Here is an example, where we create a matrix with lags and leads of an explanatory variable:
BJData <- cbind(as.data.frame(BJsales),as.data.frame(xregExpander(BJsales.lead,c(-7:7))))
colnames(BJData)[1] <- "y"
testModel <- adam(BJData, "ANN", h=18, silent=FALSE, holdout=TRUE, formula=y~xLag1+xLag2+xLag3)
testModel
#> Time elapsed: 0.13 seconds
#> Model estimated using adam() function: ETSX(ANN)
#> Distribution assumed in the model: Normal
#> Loss function type: likelihood; Loss function value: 210.9197
#> Persistence vector g (excluding xreg):
#> alpha
#> 1
#>
#> Sample size: 132
#> Number of estimated parameters: 6
#> Number of degrees of freedom: 126
#> Information criteria:
#> AIC AICc BIC BICc