# Hyperparameters tuning with GPopt

*This article was first published on*

Want to share your content on python-bloggers? click here.

**T. Moudiki's Webpage - Python**, and kindly contributed to python-bloggers. (You can report issue about the content on this page here)Want to share your content on python-bloggers? click here.

Statistical/Machine learning models can have multiple *hyperparameters*

that control their performance (out-of-sample accuracy, area under the curve, Root Mean Squared Error, etc.). In this post, in order to determine these hyperparameters for mlsauce’s `LSBoostClassifier`

(on the wine dataset), cross-validation is used along with a Bayesian optimizer, GPopt. The *best* set of hyperparameters is the one that **maximizes 5-fold cross-validation accuracy**.

**Installing packages**

!pip install mlsauce GPopt numpy sklearn

**Import packages**

import GPopt as gp import mlsauce as ms import numpy as np from sklearn.datasets import load_breast_cancer, load_wine, load_iris, load_digits from sklearn.model_selection import train_test_split, cross_val_score from sklearn.metrics import classification_report from time import time

**Define the objective function to be minimized**

def lsboost_cv(X_train, y_train, learning_rate=0.1, n_hidden_features=5, reg_lambda=0.1, dropout=0, tolerance=1e-4, seed=123): estimator = ms.LSBoostClassifier(n_estimators=100, learning_rate=learning_rate, n_hidden_features=np.int(n_hidden_features), reg_lambda=reg_lambda, dropout=dropout, tolerance=tolerance, seed=seed, verbose=0) return -cross_val_score(estimator, X_train, y_train, scoring='accuracy', cv=5, n_jobs=-1).mean()

**Define the optimizer (based on GPopt)**

def optimize_lsboost(X_train, y_train): def crossval_objective(x): return lsboost_cv( X_train=X_train, y_train=y_train, learning_rate=x[0], n_hidden_features=np.int(x[1]), reg_lambda=x[2], dropout=x[3], tolerance=x[4]) gp_opt = gp.GPOpt(objective_func=crossval_objective, lower_bound = np.array([0.001, 5, 1e-2, 0, 0]), upper_bound = np.array([0.4, 250, 1e4, 1, 1e-1]), n_init=10, n_iter=190, seed=123) return {'parameters': gp_opt.optimize(verbose=2, abs_tol=1e-2), 'opt_object': gp_opt}

**Using the wine dataset**

wine = load_wine() X = wine.data y = wine.target # split data into training test and test set X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=15029)

**Hyperparameters optimization**

res = optimize_lsboost(X_train, y_train) print(res)

{'parameters': (array([ 0.30017694, 33.57635498, 5.50315857, 0.47113037, 0.05940552]), -0.993103448275862), 'opt_object': <GPopt.GPOpt.GPOpt.GPOpt object at 0x7f0373030ad0>}

Cross-validation average accuracy is equal to 99.31%.

**Test set accuracy**

parameters = res["parameters"] start = time() estimator_wine = ms.LSBoostClassifier(n_estimators=100, learning_rate=parameters[0][0], n_hidden_features=np.int(parameters[0][1]), reg_lambda=parameters[0][2], dropout=parameters[0][3], tolerance=parameters[0][4], seed=123, verbose=1).fit(X_train, y_train) print(f"\n\n Test set accuracy: {estimator_wine.score(X_test, y_test)}") print(f"\n Elapsed: {time() - start}")

8%|▊ | 8/100 [00:00<00:00, 473.55it/s] Test set accuracy: 1.0 Elapsed: 0.03318595886230469

Due to `LSBoostClassifier`

’s `tolerance`

hyperparameter (equal to 0.05940552 here), the learning

procedure is stopped early, and only 8 iterations of the classifier are necessary

to obtain a high accuracy.

**leave a comment**for the author, please follow the link and comment on their blog:

**T. Moudiki's Webpage - Python**.

Want to share your content on python-bloggers? click here.