AutoML in nnetsauce (randomized and quasi-randomized nnetworks) Pt.2: multivariate time series forecasting

This article was first published on T. Moudiki's Webpage - Python, and kindly contributed to python-bloggers.

Last week, I talked about an AutoML method for regression and classification implemented in the Python package nnetsauce. This week's post applies the same AutoML method to multivariate time series (MTS) forecasting.

In the examples below, keep in mind that the VAR (Vector Autoregression) and VECM (Vector Error Correction Model) forecasting models aren't thoroughly trained here, and nnetsauce.MTS isn't really tuned either; this is just a demo. Also note that, for models capturing forecasting uncertainty, a probabilistic error metric would be better suited than the Root Mean Squared Error (RMSE).
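
As an illustration of what a probabilistic error metric can look like, here is a minimal sketch of the pinball (quantile) loss in plain NumPy. The helper name `pinball_loss` and the toy numbers are assumptions made for this example, not part of nnetsauce. At the median (alpha = 0.5) the pinball loss is exactly half the mean absolute error, which appears consistent with the MPL column being half the MAE column in the results table further down, although the post does not state how MPL is computed.

```python
import numpy as np

def pinball_loss(y_true, y_quantile, alpha):
    """Mean pinball (quantile) loss of a predicted alpha-quantile (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_quantile = np.asarray(y_quantile, dtype=float)
    diff = y_true - y_quantile
    # under-prediction is weighted by alpha, over-prediction by (1 - alpha)
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1.0) * diff)))

# toy observations and median forecasts (made-up values)
y_obs = np.array([0.10, -0.05, 0.20, 0.00])
y_med = np.array([0.00, 0.00, 0.10, 0.05])

mae = float(np.mean(np.abs(y_obs - y_med)))
loss_median = pinball_loss(y_obs, y_med, alpha=0.5)
print(mae, loss_median)
```

For other quantiles the loss is asymmetric: with alpha = 0.9, an observation above the predicted quantile is penalized nine times more than one below it by the same distance.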

Contents

  • 1 – Install
  • 2 – MTS
  • 2 – 1 nnetsauce.MTS
  • 2 – 2 statsmodels VAR
  • 2 – 3 statsmodels VECM

1 – Install

!pip install git+https://github.com/Techtonique/nnetsauce.git@lazy-predict

import nnetsauce as ns
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.api import VAR
from statsmodels.tsa.base.datetools import dates_from_str
from statsmodels.tsa.vector_ar.vecm import VECM, select_order

2 – MTS

Macro data

# some example data
mdata = sm.datasets.macrodata.load_pandas().data

# prepare the dates index
dates = mdata[['year', 'quarter']].astype(int).astype(str)

quarterly = dates["year"] + "Q" + dates["quarter"]

quarterly = dates_from_str(quarterly)

mdata = mdata[['realgovt', 'tbilrate']]

mdata.index = pd.DatetimeIndex(quarterly)

data = np.log(mdata).diff().dropna()

display(data)
df = data

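# Index.rename returns a new Index, so assign the result back
df.index = df.index.rename('date')

# chronological split: first 80% for training, last 20% for testing
idx_train = int(df.shape[0]*0.8)
idx_end = df.shape[0]
df_train = df.iloc[0:idx_train,]
df_test = df.iloc[idx_train:idx_end,]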
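
Since the series are modelled as log-differences (np.log(mdata).diff()), forecasts come out on that scale too. A minimal sketch of mapping log-differences back to levels, assuming the last observed level is known; the helper `undo_log_diff` and the toy numbers are hypothetical and only illustrate the arithmetic:

```python
import numpy as np

def undo_log_diff(last_level, log_diffs):
    """Rebuild levels from a starting level and a sequence of log-differences."""
    # x_t = x_0 * exp(d_1 + ... + d_t)
    return last_level * np.exp(np.cumsum(log_diffs))

# toy series of levels (made-up values)
levels = np.array([100.0, 102.0, 101.0, 105.0])

# NumPy equivalent of np.log(series).diff().dropna() in pandas
log_diffs = np.diff(np.log(levels))

rebuilt = undo_log_diff(levels[0], log_diffs)
print(rebuilt)
```

Applied to forecasts, last_level would be the final observed level of the training sample and log_diffs the predicted log-differences.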

regr_mts = ns.LazyMTS(verbose=1, ignore_warnings=True, custom_metric=None,
                      lags = 1, n_hidden_features=3, n_clusters=0, random_state=1)
models, predictions = regr_mts.fit(df_train, df_test)
model_dictionary = regr_mts.provide_models(df_train, df_test)
display(models)
Model | RMSE | MAE | MPL | Time Taken
LassoCV | 0.22 | 0.12 | 0.06 | 0.20
ElasticNetCV | 0.22 | 0.12 | 0.06 | 0.19
LassoLarsCV | 0.22 | 0.12 | 0.06 | 0.08
LarsCV | 0.22 | 0.12 | 0.06 | 0.08
DummyRegressor | 0.22 | 0.12 | 0.06 | 0.06
ElasticNet | 0.22 | 0.12 | 0.06 | 0.07
LassoLars | 0.22 | 0.12 | 0.06 | 0.06
Lasso | 0.22 | 0.12 | 0.06 | 0.07
ExtraTreeRegressor | 0.22 | 0.14 | 0.07 | 0.12
KNeighborsRegressor | 0.22 | 0.12 | 0.06 | 0.09
SVR | 0.22 | 0.12 | 0.06 | 0.13
HistGradientBoostingRegressor | 0.23 | 0.13 | 0.06 | 0.79
NuSVR | 0.23 | 0.13 | 0.06 | 0.20
ExtraTreesRegressor | 0.24 | 0.13 | 0.07 | 0.87
GradientBoostingRegressor | 0.24 | 0.13 | 0.07 | 0.25
RandomForestRegressor | 0.26 | 0.16 | 0.08 | 2.06
AdaBoostRegressor | 0.28 | 0.19 | 0.10 | 0.45
DecisionTreeRegressor | 0.28 | 0.18 | 0.09 | 0.06
BaggingRegressor | 0.28 | 0.19 | 0.10 | 0.20
GaussianProcessRegressor | 8.26 | 5.90 | 2.95 | 0.17
BayesianRidge | 11774168792.68 | 3129885640.50 | 1564942820.25 | 0.08
TweedieRegressor | 1066305878860.67 | 263521546472.00 | 131760773236.00 | 0.12
LassoLarsIC | 10841414830181.57 | 2665022282527.50 | 1332511141263.75 | 0.08
PassiveAggressiveRegressor | 200205325611502239744.00 | 40689888595970097152.00 | 20344944297985048576.00 | 0.17
SGDRegressor | 1383750703550277812748288.00 | 269310062772019343130624.00 | 134655031386009671565312.00 | 0.13
LinearSVR | 6205416599219790202011648.00 | 1189414936788171753521152.00 | 594707468394085876760576.00 | 0.06
OrthogonalMatchingPursuitCV | 18588484112627753604349952.00 | 3542235944300533382119424.00 | 1771117972150266691059712.00 | 0.23
OrthogonalMatchingPursuit | 18588484112627753604349952.00 | 3542235944300533382119424.00 | 1771117972150266691059712.00 | 0.20
HuberRegressor | 50554040814422644093913571262464.00 | 9061839427591544042390898606080.00 | 4530919713795772021195449303040.00 | 0.09
RidgeCV | 1788858960353426286932811384356864.00 | 317940467527547291488891451736064.00 | 158970233763773645744445725868032.00 | 0.23
RANSACRegressor | 352805899757804849079011831705501696.00 | 61914238966205227684888230708117504.00 | 30957119483102613842444115354058752.00 | 1.44
LinearRegression | 13408548756595947978849418193194188800.00 | 2316276205868561893698967459810246656.00 | 1158138102934280946849483729905123328.00 | 0.06
TransformedTargetRegressor | 13408548756595947978849418193194188800.00 | 2316276205868561893698967459810246656.00 | 1158138102934280946849483729905123328.00 | 0.11
Lars | 13408548756596845228481163425784791040.00 | 2316276205868715960905471081985343488.00 | 1158138102934357980452735540992671744.00 | 0.08
Ridge | 27935786184657480745080678989281886208.00 | 4824713257018197525713060327109689344.00 | 2412356628509098762856530163554844672.00 | 0.12
KernelRidge | 27935786184685139645570846501298503680.00 | 4824713257022931107816326787730767872.00 | 2412356628511465553908163393865383936.00 | 0.09
MLPRegressor | 64247413650209509837810706524366567768365621314… | 10088348458681313437051396009759695398571807517… | 50441742293406567185256980048798476992859037587… | 0.42

model_dictionary['LassoCV']
MTS(n_clusters=0, n_hidden_features=3, obj=LassoCV(random_state=1), seed='mean')
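
For intuition about what lags = 1 means in this setting, here is a rough sketch in plain NumPy: each series at time t is regressed on the previous values of all series, and forecasts are produced by feeding predictions back in. This is not nnetsauce's implementation (which, among other things, adds hidden features and wraps an arbitrary scikit-learn regressor); the helpers `fit_lag1` and `forecast` and the simulated data are assumptions made for illustration, with ordinary least squares standing in for LassoCV.

```python
import numpy as np

def fit_lag1(Y):
    """Least-squares fit of Y[t] on [1, Y[t-1]], all series jointly (hypothetical helper)."""
    Y = np.asarray(Y, dtype=float)
    # design matrix: intercept column plus the lag-1 values of every series
    X = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])
    coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    return coef  # shape: (1 + n_series, n_series)

def forecast(Y, coef, h):
    """Iterate the fitted one-step map h steps ahead, feeding forecasts back in."""
    last = np.asarray(Y, dtype=float)[-1]
    out = []
    for _ in range(h):
        last = np.concatenate([[1.0], last]) @ coef
        out.append(last)
    return np.vstack(out)  # shape: (h, n_series)

# simulated bivariate series standing in for the log-differenced macro data
rng = np.random.default_rng(1)
Y = rng.normal(scale=0.1, size=(50, 2))

coef = fit_lag1(Y)
preds = forecast(Y, coef, h=5)
print(preds)
```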
