# ahead forecasting (v0.10.0): fast time series model calibration and Python plots

*This article was first published on*

Want to share your content on python-bloggers? click here.

**T. Moudiki's Webpage - Python**, and kindly contributed to python-bloggers. (You can report issue about the content on this page here)Want to share your content on python-bloggers? click here.

`ahead`

is an R, Python and Julia package for univariate and multivariate time series forecasting with **uncertainty quantification** (including predictive simulation). Matlab is next. The aim is always to make these implementations of `ahead`

as **similar as possible**, but that’s a looot of work as you might guess. Do not hesitate to submit a pull request to these repositories if you want to help and contribute. You can also signal bugs (constructively).

The main changes in `ahead`

’s `v0.10.0`

are:

**Fast calibration**of`ridge2f`

using a Leave-One-Out Cross-Validation (LOOCV) trick (R for now)**Univariate forecasting**with`ridge2f`

with external regressors**Plotting functionality**in Python version

Read on for more details. You can also use this notebook if you want to reproduce the experiments.

# 1. **Fast calibration of **`ridge2f`

:

`ridge2f`

:The fast calibration of `ridge2f`

uses a remarkable result available for linear models’ LOOCV:

\[LOOCV error = \frac{1}{n} \sum_{i=1}^n\left(y_i-\hat{f}_{-i}\left(\mathbf{z}_i\right)\right)^2 = \frac{1}{n} \sum_{i=1}^n\left(\frac{y_i-\hat{f}\left(\mathbf{z}_i\right)}{1-\mathbf{S}_{i i}}\right)^2\]

Where \(\hat{f}_{-i}\left(\mathbf{z}_i\right)\) is a model statistical learning model $f$ fitted without the \(i\)-th observation, \(\mathbf{z}_i\) is the \(i\)-th observation, and \(\mathbf{S}_{i i}\) is the $i$-th diagonal element of the hat matrix (a.k.a smoother, a matrix so that \(\hat{y} = \mathbf{S} y\)). **LOOCV is valid for time series data** because contrary to other cross-validation techniques, it does not break the sequential structure of the inputs. **Keep in mind**, however, that time series cross-vaidation will give different results.

The LOOCV result can be extended further and approximated by Generalized Cross-Validation (GCV), still with a closed-form formula available. GCV is used in `ahead::ridge2f`

. Indeed, even though `ridge2`

is not – strictly speaking – linear, it possesses the structure of a ridge regression model with 2 regularization parameters; a regularization parameter for the original explanative variables, and a regularization parameter for new, engineered features. These engineered features transform the linear model into a non-linear one.

In the following R example, it’s worth mentioning that **only the 2 regularization parameters are calibrated**. Other hyperparameters such as the number of time series lags or the number of nodes in the hidden layer are set to their default values. Feel free try and share other experiments including them.

%load_ext rpy2.ipython

%%R options(repos = c( techtonique = 'https://techtonique.r-universe.dev', CRAN = 'https://cloud.r-project.org')) install.packages("ahead") install.packages("ggplot2") install.packages("forecast") install.packages("dfoptim")

%%R library(ahead) library(dfoptim)

# Convert to R DataFrame %R -i df

%R df_ts <- as.ts(df)

%%R objective_function <- function(xx) { ahead::loocvridge2f(df_ts, h = 20, type_pi="blockbootstrap", lambda_1=10^xx[1], lambda_2=10^xx[2], show_progress = FALSE, )$loocv } start <- proc.time()[3] (opt <- dfoptim::nmkb(fn=objective_function, lower=c(-10,-10), upper=c(10,10), par=c(0.1, 0.1))) print(proc.time()[3]-start)

elapsed 6.657

%%R start <- proc.time()[3] res <- ahead::ridge2f(df_ts, h = 20, type_pi="blockbootstrap", lambda_1=10^opt$par[1], # 'optimal' parameters lambda_2=10^opt$par[2]) # 'optimal' parameters print(proc.time()[3]-start)

|======================================================================| 100% elapsed 0.805

%%R par(mfrow=c(2, 2)) plot(res, "realgovt", type = "sims", main="50 predictive simulations \n of realgovt") plot(res, "realgovt", type = "dist", main="predictive distribution \n of realgovt") plot(res, "realgdp", type = "sims", main="50 predictive simulations \n of realgdp") plot(res, "realgdp", type = "dist", main="predictive distribution \n of realgovt")

# 2. **Univariate forecasting with **`ridge2f`

with external regressors:

`ridge2f`

with external regressors:`ridge2f`

was initially designed for multivariate time series forecasting. Nothing prevents it from being used for univariate forecasting. This functionality is now available in `ahead`

(R for now). In addition, as in the multivariate case, it is possible to use external regressors in the forecasting process.

%%R library(ggplot2) library(forecast) x <- fdeaths xreg <- ahead::createtrendseason(x) (z <- ahead::ridge2f(x, xreg = xreg, h=20)) autoplot(z)

# 3. **Plotting functionality in Python version:**

**Install and import packages**

!pip install ahead

import numpy as np import pandas as pd import statsmodels.api as sm from time import time from ahead import Ridge2Regressor # may take some time to run, ONLY the 1st time it's run from statsmodels.tsa.base.datetools import dates_from_str from time import time

**Import data**

# some example data mdata = sm.datasets.macrodata.load_pandas().data # prepare the dates index dates = mdata[['year', 'quarter']].astype(int).astype(str) quarterly = dates["year"] + "Q" + dates["quarter"] quarterly = dates_from_str(quarterly) mdata = mdata[['realgovt', 'tbilrate', 'cpi', 'realgdp']] mdata.index = pd.DatetimeIndex(quarterly) df = np.log(mdata).diff().dropna() display(df)

realgovt | tbilrate | cpi | realgdp | |
---|---|---|---|---|

1959-06-30 | 0.023664 | 0.088193 | 0.005849 | 0.024942 |

1959-09-30 | 0.020481 | 0.215321 | 0.006838 | -0.001193 |

1959-12-31 | -0.014781 | 0.125317 | 0.000681 | 0.003495 |

1960-03-31 | -0.046197 | -0.212805 | 0.005772 | 0.022190 |

1960-06-30 | -0.003900 | -0.266946 | 0.000338 | -0.004685 |

… | … | … | … | … |

2008-09-30 | 0.031005 | -0.396881 | -0.007904 | -0.006781 |

2008-12-31 | 0.015732 | -2.277267 | -0.021979 | -0.013805 |

2009-03-31 | -0.010967 | 0.606136 | 0.002340 | -0.016612 |

2009-06-30 | 0.026975 | -0.200671 | 0.008419 | -0.001851 |

2009-09-30 | 0.019888 | -0.405465 | 0.008894 | 0.006862 |

202 rows × 4 columns

regr = Ridge2Regressor(h = 20, date_formatting = "original", type_pi="blockbootstrap", # type of simulations for predictive inference B=50, # number of simulations seed=1) start = time() regr.forecast(df) print(f"Elapsed: {time() - start} s")

|======================================================================| 100% Elapsed: 0.49063634872436523 s

regr2 = Ridge2Regressor(h = 20, date_formatting = "original", type_pi="gaussian", # Gaussian prediction intervals seed=1) start = time() regr2.forecast(df) print(f"Elapsed: {time() - start} s")

Elapsed: 0.038446664810180664 s

regr.plot("realgovt")

regr.plot("realgdp")

regr2.plot("realgovt")

regr2.plot("realgdp")

**leave a comment**for the author, please follow the link and comment on their blog:

**T. Moudiki's Webpage - Python**.

Want to share your content on python-bloggers? click here.