Best procedure to select predictors and ARIMA model terms #68
I'm searching for a valid procedure to select predictors and I have a dataset from The first procedure I thought about is:
A second procedure (more difficult to implement) could be:
Which of these two procedures could be better? Is there a best practice and/or a more rigorous and tested approach? Thank you.
Replies: 1 comment 3 replies
Great question! I've done something similar to the first procedure before, and it works pretty well. Note that minimizing the AICc is asymptotically equivalent to minimizing one-step RMSE on cross-validated test sets, so there is no guarantee that the ARIMA model will be optimal for multi-step forecasting. On the other hand, if the data truly come from the fitted model, then optimizing for one-step RMSE will also give the optimal model for multi-step RMSE.

The second approach focuses more directly on the multi-step RMSE, but is less efficient in choosing the ARIMA model as there are a limited number of training/test splits that you can average over. It will also be much slower as the model search space is much bigger (including all the possible ARIMA models in each iteration).

I don't know of any literature on this, but I would guess that it won't make that much difference to forecast accuracy, and so I'd go for the faster approach, namely the first procedure.
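To make the one-step versus multi-step distinction concrete, here is a minimal sketch of rolling-origin (time series cross-validation) RMSE at a chosen horizon. It is not either of the procedures from the question. It uses a simulated series and a plain AR(1) fitted by least squares as a stand-in for a full ARIMA search, and every function name and parameter value here is an assumption made for illustration.

```python
import math
import random


def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Simulate n points from y[t] = phi * y[t-1] + e[t] (illustrative data only)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y


def fit_ar1(y):
    """Least-squares estimate of phi for a zero-mean AR(1) model."""
    num = sum(a * b for a, b in zip(y[:-1], y[1:]))
    den = sum(a * a for a in y[:-1])
    return num / den if den > 0 else 0.0


def rolling_origin_rmse(y, h, min_train):
    """Refit at every forecast origin and score the h-step-ahead forecast.

    Returns (rmse, number_of_splits). The number of usable splits shrinks
    as h grows, which is why the multi-step average is noisier.
    """
    errors = []
    for origin in range(min_train, len(y) - h + 1):
        train = y[:origin]                 # data available at the origin
        phi = fit_ar1(train)
        forecast = train[-1] * phi ** h    # h-step AR(1) point forecast
        actual = y[origin + h - 1]         # value h steps after the last training point
        errors.append(actual - forecast)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse, len(errors)


# Hypothetical settings: 200 observations, at least 50 for training.
series = simulate_ar1(n=200, phi=0.8, seed=42)
for horizon in (1, 12):
    rmse, n_splits = rolling_origin_rmse(series, h=horizon, min_train=50)
    print(f"h={horizon:2d}  splits={n_splits}  RMSE={rmse:.3f}")
```

In a real comparison the AR(1) fit would be replaced by whichever candidate model (predictor set plus ARIMA terms) is being evaluated, and the same loop run once per candidate. That per-candidate refitting at every origin is what makes a cross-validated search much slower than selecting by AICc on a single fit.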