A few weeks ago, I introduced a model-agnostic gradient boosting procedure that can use any base learner (available in the R and Python package mlsauce).
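To make "any base learner" concrete, here is a minimal sketch of least-squares gradient boosting for regression, in which the base learner is any object exposing `fit` and `predict`. It is not mlsauce's implementation (which also handles classification). The names `RidgeLearner`, `boost_fit` and `boost_predict` are hypothetical and only serve the illustration.

```python
import copy
import numpy as np


class RidgeLearner:
    """Hypothetical stand-in for any regressor exposing fit/predict."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        # Closed-form ridge solution on centered data
        self.x_mean_ = X.mean(axis=0)
        self.y_mean_ = y.mean()
        Xc = X - self.x_mean_
        A = Xc.T @ Xc + self.alpha * np.eye(X.shape[1])
        self.coef_ = np.linalg.solve(A, Xc.T @ (y - self.y_mean_))
        return self

    def predict(self, X):
        return (X - self.x_mean_) @ self.coef_ + self.y_mean_


def boost_fit(base_learner, X, y, n_estimators=50, learning_rate=0.1):
    """Fit successive copies of base_learner on the current residuals."""
    init = float(np.mean(y))
    pred = np.full(len(y), init)
    learners = []
    for _ in range(n_estimators):
        # Negative gradient of the squared error loss (up to a constant)
        residuals = y - pred
        learner = copy.deepcopy(base_learner).fit(X, residuals)
        pred = pred + learning_rate * learner.predict(X)
        learners.append(learner)
    return {"init": init, "learning_rate": learning_rate, "learners": learners}


def boost_predict(model, X):
    """Sum the initial constant and the shrunken learner predictions."""
    pred = np.full(X.shape[0], model["init"])
    for learner in model["learners"]:
        pred = pred + model["learning_rate"] * learner.predict(X)
    return pred


# Synthetic regression data, for illustration only
rng = np.random.default_rng(13)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

model = boost_fit(RidgeLearner(alpha=1.0), X, y)
train_mse = float(np.mean((y - boost_predict(model, X)) ** 2))
baseline_mse = float(np.mean((y - y.mean()) ** 2))
print(f"boosted MSE: {train_mse:.4f}, constant-prediction MSE: {baseline_mse:.4f}")
```

Because the loop only relies on `fit` and `predict`, a scikit-learn regressor could be passed instead of `RidgeLearner` without changing `boost_fit`.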
The rationale of the histogram-based version shown here is different from other histogram-based gradient boosting algorithms, as histograms are only used here for feature engineering of continuous features. So far, I don’t see huge differences with the original implementation of the GenericBooster, but it’s still a work in progress. I plan to try it out on a data set that contains a richer mix of continuous and categorical features (as categorical features are not histogram-engineered).
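As an illustration of histogram-based feature engineering, the sketch below replaces each continuous value with the index of the histogram bin it falls into, with bin edges computed on the training set only. This is one plausible reading of the idea, not necessarily what mlsauce does internally. The function names `fit_bin_edges` and `histogram_transform` and the choice of 10 equal-width bins are assumptions.

```python
import numpy as np


def fit_bin_edges(X_train, n_bins=10):
    """Compute per-feature equal-width histogram bin edges on the training set."""
    return [np.histogram_bin_edges(X_train[:, j], bins=n_bins)
            for j in range(X_train.shape[1])]


def histogram_transform(X, edges):
    """Replace each continuous value by the index of the bin it falls into."""
    X_binned = np.empty(X.shape, dtype=int)
    for j, e in enumerate(edges):
        # Use interior edges only: indices then range over 0..n_bins-1, and
        # test values outside the training range land in the first/last bin
        X_binned[:, j] = np.digitize(X[:, j], e[1:-1])
    return X_binned


# Synthetic continuous features, for illustration only
rng = np.random.default_rng(13)
X_train = rng.normal(size=(100, 2))
X_test = rng.normal(size=(20, 2))

edges = fit_bin_edges(X_train, n_bins=10)
X_train_binned = histogram_transform(X_train, edges)
X_test_binned = histogram_transform(X_test, edges)
print(X_train_binned[:5])
```

In a data set with mixed types, only the continuous columns would go through `histogram_transform`; categorical columns would be passed through unchanged.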
Here are a few results that can give you an idea of the performance of the algorithm:
!pip install git+https://github.com/Techtonique/mlsauce.git --verbose --upgrade --no-cache-dir

import os
import mlsauce as ms
from sklearn.datasets import load_breast_cancer, load_iris, load_wine, load_digits
from sklearn.model_selection import train_test_split
from time import time

# Data set loaders, benchmarked in this order
load_models = [load_breast_cancer, load_iris, load_wine, load_digits]

for model in load_models:

    data = model()
    X = data.data
    y = data.target

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=13)

    clf = ms.LazyBoostingClassifier(verbose=0, ignore_warnings=True, #n_jobs=2,
                                    custom_metric=None, preprocess=False)

    start = time()
    # Fit once with (hist=True) and once without (hist=False) histogram features
    models, predictions = clf.fit(X_train, X_test, y_train, y_test, hist=True)
    models2, predictions2 = clf.fit(X_train, X_test, y_train, y_test, hist=False)
    print(f"\nElapsed: {time() - start} seconds\n")

    # First table: hist=True; second table: hist=False
    display(models)
    display(models2)
2it [00:00, 2.27it/s]
100%|██████████| 38/38 [00:41<00:00, 1.09s/it]
2it [00:00, 5.14it/s]
100%|██████████| 38/38 [00:43<00:00, 1.14s/it]
Elapsed: 85.95083284378052 seconds
load_breast_cancer, hist=True:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| GenericBooster(MultiTask(TweedieRegressor)) | 0.99 | 0.99 | 0.99 | 0.99 | 1.73 |
| GenericBooster(LinearRegression) | 0.99 | 0.99 | 0.99 | 0.99 | 0.37 |
| GenericBooster(TransformedTargetRegressor) | 0.99 | 0.99 | 0.99 | 0.99 | 0.40 |
| GenericBooster(RidgeCV) | 0.99 | 0.99 | 0.99 | 0.99 | 1.28 |
| GenericBooster(Ridge) | 0.99 | 0.99 | 0.99 | 0.99 | 0.27 |
| XGBClassifier | 0.96 | 0.96 | 0.96 | 0.96 | 0.50 |
| RandomForestClassifier | 0.96 | 0.96 | 0.96 | 0.96 | 0.37 |
| GenericBooster(ExtraTreeRegressor) | 0.94 | 0.94 | 0.94 | 0.94 | 0.40 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.94 | 0.93 | 0.93 | 0.94 | 4.97 |
| GenericBooster(KNeighborsRegressor) | 0.87 | 0.89 | 0.89 | 0.87 | 0.70 |
| GenericBooster(DecisionTreeRegressor) | 0.87 | 0.88 | 0.88 | 0.87 | 2.24 |
| GenericBooster(MultiTaskElasticNet) | 0.87 | 0.79 | 0.79 | 0.86 | 0.11 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.86 | 0.79 | 0.79 | 0.85 | 1.28 |
| GenericBooster(MultiTaskLasso) | 0.85 | 0.76 | 0.76 | 0.84 | 0.06 |
| GenericBooster(ElasticNet) | 0.85 | 0.76 | 0.76 | 0.84 | 0.16 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.82 | 0.72 | 0.72 | 0.80 | 10.42 |
| GenericBooster(Lasso) | 0.82 | 0.71 | 0.71 | 0.79 | 0.09 |
| GenericBooster(LassoLars) | 0.82 | 0.71 | 0.71 | 0.79 | 0.08 |
| GenericBooster(MultiTask(LinearSVR)) | 0.81 | 0.69 | 0.69 | 0.78 | 14.75 |
| GenericBooster(DummyRegressor) | 0.68 | 0.50 | 0.50 | 0.56 | 0.01 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.50 | 0.46 | 0.46 | 0.51 | 1.67 |

load_breast_cancer, hist=False:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| GenericBooster(MultiTask(TweedieRegressor)) | 0.99 | 0.99 | 0.99 | 0.99 | 1.67 |
| GenericBooster(LinearRegression) | 0.99 | 0.99 | 0.99 | 0.99 | 0.30 |
| GenericBooster(TransformedTargetRegressor) | 0.99 | 0.99 | 0.99 | 0.99 | 0.74 |
| GenericBooster(RidgeCV) | 0.99 | 0.99 | 0.99 | 0.99 | 2.77 |
| GenericBooster(Ridge) | 0.99 | 0.99 | 0.99 | 0.99 | 0.28 |
| XGBClassifier | 0.96 | 0.96 | 0.96 | 0.96 | 0.13 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.94 | 0.93 | 0.93 | 0.94 | 7.81 |
| GenericBooster(ExtraTreeRegressor) | 0.94 | 0.94 | 0.94 | 0.94 | 0.23 |
| RandomForestClassifier | 0.92 | 0.93 | 0.93 | 0.92 | 0.25 |
| GenericBooster(KNeighborsRegressor) | 0.87 | 0.89 | 0.89 | 0.87 | 0.42 |
| GenericBooster(DecisionTreeRegressor) | 0.87 | 0.88 | 0.88 | 0.87 | 0.97 |
| GenericBooster(MultiTaskElasticNet) | 0.87 | 0.79 | 0.79 | 0.86 | 0.11 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.86 | 0.79 | 0.79 | 0.85 | 1.20 |
| GenericBooster(MultiTaskLasso) | 0.85 | 0.76 | 0.76 | 0.84 | 0.06 |
| GenericBooster(ElasticNet) | 0.85 | 0.76 | 0.76 | 0.84 | 0.09 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.82 | 0.72 | 0.72 | 0.80 | 10.57 |
| GenericBooster(LassoLars) | 0.82 | 0.71 | 0.71 | 0.79 | 0.09 |
| GenericBooster(Lasso) | 0.82 | 0.71 | 0.71 | 0.79 | 0.09 |
| GenericBooster(MultiTask(LinearSVR)) | 0.81 | 0.69 | 0.69 | 0.78 | 14.20 |
| GenericBooster(DummyRegressor) | 0.68 | 0.50 | 0.50 | 0.56 | 0.01 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.50 | 0.46 | 0.46 | 0.51 | 1.33 |
2it [00:00, 6.46it/s]
100%|██████████| 38/38 [00:12<00:00, 3.11it/s]
2it [00:00, 10.38it/s]
100%|██████████| 38/38 [00:11<00:00, 3.18it/s]
Elapsed: 24.71835470199585 seconds
load_iris, hist=True:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| GenericBooster(RidgeCV) | 1.00 | 1.00 | None | 1.00 | 0.18 |
| GenericBooster(Ridge) | 1.00 | 1.00 | None | 1.00 | 0.14 |
| GenericBooster(LinearRegression) | 0.97 | 0.97 | None | 0.97 | 0.13 |
| GenericBooster(DecisionTreeRegressor) | 0.97 | 0.97 | None | 0.97 | 0.18 |
| GenericBooster(TransformedTargetRegressor) | 0.97 | 0.97 | None | 0.97 | 0.23 |
| GenericBooster(ExtraTreeRegressor) | 0.97 | 0.97 | None | 0.97 | 0.14 |
| XGBClassifier | 0.97 | 0.97 | None | 0.97 | 0.05 |
| RandomForestClassifier | 0.93 | 0.95 | None | 0.93 | 0.26 |
| GenericBooster(KNeighborsRegressor) | 0.93 | 0.95 | None | 0.93 | 0.27 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.90 | 0.92 | None | 0.90 | 0.75 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.90 | 0.92 | None | 0.90 | 1.61 |
| GenericBooster(MultiTask(LinearSVR)) | 0.80 | 0.85 | None | 0.80 | 2.15 |
| GenericBooster(MultiTaskElasticNet) | 0.80 | 0.85 | None | 0.80 | 0.07 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.63 | 0.72 | None | 0.57 | 2.42 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.57 | 0.67 | None | 0.45 | 1.05 |
| GenericBooster(Lars) | 0.50 | 0.46 | None | 0.48 | 0.59 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.43 | 0.33 | None | 0.26 | 2.19 |
| GenericBooster(LassoLars) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(MultiTaskLasso) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(Lasso) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(ElasticNet) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(DummyRegressor) | 0.27 | 0.33 | None | 0.11 | 0.01 |

load_iris, hist=False:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| GenericBooster(RidgeCV) | 1.00 | 1.00 | None | 1.00 | 0.16 |
| GenericBooster(Ridge) | 1.00 | 1.00 | None | 1.00 | 0.16 |
| RandomForestClassifier | 0.97 | 0.97 | None | 0.97 | 0.15 |
| GenericBooster(LinearRegression) | 0.97 | 0.97 | None | 0.97 | 0.13 |
| GenericBooster(DecisionTreeRegressor) | 0.97 | 0.97 | None | 0.97 | 0.16 |
| GenericBooster(TransformedTargetRegressor) | 0.97 | 0.97 | None | 0.97 | 0.24 |
| GenericBooster(ExtraTreeRegressor) | 0.97 | 0.97 | None | 0.97 | 0.14 |
| XGBClassifier | 0.97 | 0.97 | None | 0.97 | 0.04 |
| GenericBooster(KNeighborsRegressor) | 0.93 | 0.95 | None | 0.93 | 0.28 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.90 | 0.92 | None | 0.90 | 0.78 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.90 | 0.92 | None | 0.90 | 1.35 |
| GenericBooster(MultiTask(LinearSVR)) | 0.80 | 0.85 | None | 0.80 | 2.15 |
| GenericBooster(MultiTaskElasticNet) | 0.80 | 0.85 | None | 0.80 | 0.07 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.63 | 0.72 | None | 0.57 | 1.81 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.57 | 0.67 | None | 0.45 | 1.21 |
| GenericBooster(Lars) | 0.50 | 0.46 | None | 0.48 | 0.58 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.43 | 0.33 | None | 0.26 | 2.63 |
| GenericBooster(LassoLars) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(MultiTaskLasso) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(Lasso) | 0.27 | 0.33 | None | 0.11 | 0.01 |
| GenericBooster(ElasticNet) | 0.27 | 0.33 | None | 0.11 | 0.02 |
| GenericBooster(DummyRegressor) | 0.27 | 0.33 | None | 0.11 | 0.01 |
2it [00:00, 5.45it/s]
100%|██████████| 38/38 [00:14<00:00, 2.63it/s]
2it [00:00, 9.26it/s]
100%|██████████| 38/38 [00:14<00:00, 2.58it/s]
Elapsed: 29.76035761833191 seconds
load_wine, hist=True:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| RandomForestClassifier | 1.00 | 1.00 | None | 1.00 | 0.30 |
| GenericBooster(ExtraTreeRegressor) | 1.00 | 1.00 | None | 1.00 | 0.17 |
| GenericBooster(TransformedTargetRegressor) | 1.00 | 1.00 | None | 1.00 | 0.26 |
| GenericBooster(RidgeCV) | 1.00 | 1.00 | None | 1.00 | 0.23 |
| GenericBooster(Ridge) | 1.00 | 1.00 | None | 1.00 | 0.15 |
| GenericBooster(LinearRegression) | 1.00 | 1.00 | None | 1.00 | 0.15 |
| XGBClassifier | 0.97 | 0.96 | None | 0.97 | 0.06 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.97 | 0.98 | None | 0.97 | 1.10 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.97 | 0.98 | None | 0.97 | 1.18 |
| GenericBooster(MultiTask(LinearSVR)) | 0.97 | 0.98 | None | 0.97 | 3.71 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.97 | 0.98 | None | 0.97 | 1.86 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.97 | 0.98 | None | 0.97 | 1.39 |
| GenericBooster(Lars) | 0.94 | 0.94 | None | 0.95 | 0.93 |
| GenericBooster(KNeighborsRegressor) | 0.92 | 0.93 | None | 0.92 | 0.19 |
| GenericBooster(DecisionTreeRegressor) | 0.92 | 0.92 | None | 0.92 | 0.22 |
| GenericBooster(MultiTaskElasticNet) | 0.69 | 0.61 | None | 0.61 | 0.03 |
| GenericBooster(ElasticNet) | 0.61 | 0.53 | None | 0.53 | 0.05 |
| GenericBooster(MultiTaskLasso) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(LassoLars) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(Lasso) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(DummyRegressor) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.25 | 0.33 | None | 0.10 | 2.73 |

load_wine, hist=False:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| RandomForestClassifier | 1.00 | 1.00 | None | 1.00 | 0.15 |
| GenericBooster(ExtraTreeRegressor) | 1.00 | 1.00 | None | 1.00 | 0.16 |
| GenericBooster(TransformedTargetRegressor) | 1.00 | 1.00 | None | 1.00 | 0.24 |
| GenericBooster(RidgeCV) | 1.00 | 1.00 | None | 1.00 | 0.22 |
| GenericBooster(Ridge) | 1.00 | 1.00 | None | 1.00 | 0.16 |
| GenericBooster(LinearRegression) | 1.00 | 1.00 | None | 1.00 | 0.15 |
| XGBClassifier | 0.97 | 0.96 | None | 0.97 | 0.06 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.97 | 0.98 | None | 0.97 | 0.84 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.97 | 0.98 | None | 0.97 | 1.18 |
| GenericBooster(MultiTask(LinearSVR)) | 0.97 | 0.98 | None | 0.97 | 3.41 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.97 | 0.98 | None | 0.97 | 2.15 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.97 | 0.98 | None | 0.97 | 1.91 |
| GenericBooster(Lars) | 0.94 | 0.94 | None | 0.95 | 0.93 |
| GenericBooster(KNeighborsRegressor) | 0.92 | 0.93 | None | 0.92 | 0.20 |
| GenericBooster(DecisionTreeRegressor) | 0.92 | 0.92 | None | 0.92 | 0.23 |
| GenericBooster(MultiTaskElasticNet) | 0.69 | 0.61 | None | 0.61 | 0.03 |
| GenericBooster(ElasticNet) | 0.61 | 0.53 | None | 0.53 | 0.04 |
| GenericBooster(MultiTaskLasso) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(LassoLars) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(Lasso) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(DummyRegressor) | 0.42 | 0.33 | None | 0.25 | 0.01 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.25 | 0.33 | None | 0.10 | 2.78 |
2it [00:01, 1.90it/s]
100%|██████████| 38/38 [09:30<00:00, 15.02s/it]
2it [00:01, 1.03it/s]
100%|██████████| 38/38 [09:27<00:00, 14.94s/it]
Elapsed: 1141.7054164409637 seconds
load_digits, hist=True:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| RandomForestClassifier | 0.97 | 0.97 | None | 0.97 | 0.56 |
| XGBClassifier | 0.97 | 0.97 | None | 0.97 | 0.50 |
| GenericBooster(ExtraTreeRegressor) | 0.96 | 0.96 | None | 0.96 | 1.75 |
| GenericBooster(KNeighborsRegressor) | 0.95 | 0.95 | None | 0.95 | 4.34 |
| GenericBooster(LinearRegression) | 0.94 | 0.94 | None | 0.94 | 4.47 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.94 | 0.94 | None | 0.94 | 51.97 |
| GenericBooster(TransformedTargetRegressor) | 0.94 | 0.94 | None | 0.94 | 2.54 |
| GenericBooster(RidgeCV) | 0.94 | 0.94 | None | 0.94 | 4.55 |
| GenericBooster(Ridge) | 0.94 | 0.94 | None | 0.94 | 0.63 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.93 | 0.93 | None | 0.93 | 13.86 |
| GenericBooster(DecisionTreeRegressor) | 0.88 | 0.88 | None | 0.88 | 6.14 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.79 | 0.79 | None | 0.80 | 13.46 |
| GenericBooster(MultiTask(LinearSVR)) | 0.37 | 0.39 | None | 0.26 | 297.07 |
| GenericBooster(Lars) | 0.20 | 0.20 | None | 0.21 | 19.23 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.12 | 0.10 | None | 0.03 | 140.91 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.10 | 0.10 | None | 0.06 | 9.46 |
| GenericBooster(LassoLars) | 0.07 | 0.10 | None | 0.01 | 0.05 |
| GenericBooster(Lasso) | 0.07 | 0.10 | None | 0.01 | 0.07 |
| GenericBooster(MultiTaskLasso) | 0.07 | 0.10 | None | 0.01 | 0.04 |
| GenericBooster(ElasticNet) | 0.07 | 0.10 | None | 0.01 | 0.03 |
| GenericBooster(DummyRegressor) | 0.07 | 0.10 | None | 0.01 | 0.02 |
| GenericBooster(MultiTaskElasticNet) | 0.07 | 0.10 | None | 0.01 | 0.05 |

load_digits, hist=False:

| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|---|---|---|---|---|---|
| RandomForestClassifier | 0.97 | 0.97 | None | 0.97 | 0.67 |
| XGBClassifier | 0.97 | 0.97 | None | 0.97 | 1.27 |
| GenericBooster(ExtraTreeRegressor) | 0.96 | 0.96 | None | 0.96 | 1.69 |
| GenericBooster(KNeighborsRegressor) | 0.95 | 0.95 | None | 0.95 | 4.76 |
| GenericBooster(LinearRegression) | 0.94 | 0.94 | None | 0.94 | 2.01 |
| GenericBooster(MultiTask(BayesianRidge)) | 0.94 | 0.94 | None | 0.94 | 46.87 |
| GenericBooster(TransformedTargetRegressor) | 0.94 | 0.94 | None | 0.94 | 5.40 |
| GenericBooster(RidgeCV) | 0.94 | 0.94 | None | 0.94 | 3.93 |
| GenericBooster(Ridge) | 0.94 | 0.94 | None | 0.94 | 0.60 |
| GenericBooster(MultiTask(TweedieRegressor)) | 0.93 | 0.93 | None | 0.93 | 14.96 |
| GenericBooster(DecisionTreeRegressor) | 0.88 | 0.88 | None | 0.88 | 4.12 |
| GenericBooster(MultiTask(PassiveAggressiveRegressor)) | 0.79 | 0.79 | None | 0.80 | 12.68 |
| GenericBooster(MultiTask(LinearSVR)) | 0.37 | 0.39 | None | 0.26 | 294.88 |
| GenericBooster(Lars) | 0.20 | 0.20 | None | 0.21 | 19.40 |
| GenericBooster(MultiTask(QuantileRegressor)) | 0.12 | 0.10 | None | 0.03 | 145.91 |
| GenericBooster(MultiTask(SGDRegressor)) | 0.10 | 0.10 | None | 0.06 | 10.30 |
| GenericBooster(LassoLars) | 0.07 | 0.10 | None | 0.01 | 0.02 |
| GenericBooster(Lasso) | 0.07 | 0.10 | None | 0.01 | 0.03 |
| GenericBooster(MultiTaskLasso) | 0.07 | 0.10 | None | 0.01 | 0.03 |
| GenericBooster(ElasticNet) | 0.07 | 0.10 | None | 0.01 | 0.03 |
| GenericBooster(DummyRegressor) | 0.07 | 0.10 | None | 0.01 | 0.02 |
| GenericBooster(MultiTaskElasticNet) | 0.07 | 0.10 | None | 0.01 | 0.03 |
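
To check the "no huge differences" observation more directly, the two leaderboards for a data set can be joined on the model name and compared column by column. The sketch below assumes the objects returned by `clf.fit` are pandas DataFrames indexed by `Model` (suggested by the displayed output, but still an assumption). To stay self-contained, it rebuilds a few rows by hand from the load_digits results reported above instead of calling mlsauce.

```python
import pandas as pd

model_names = pd.Index(
    ["RandomForestClassifier", "XGBClassifier",
     "GenericBooster(ExtraTreeRegressor)", "GenericBooster(Ridge)"],
    name="Model",
)

# A few rows copied from the load_digits results (hist=True, then hist=False)
with_hist = pd.DataFrame(
    {"Accuracy": [0.97, 0.97, 0.96, 0.94], "Time Taken": [0.56, 0.50, 1.75, 0.63]},
    index=model_names,
)
without_hist = pd.DataFrame(
    {"Accuracy": [0.97, 0.97, 0.96, 0.94], "Time Taken": [0.67, 1.27, 1.69, 0.60]},
    index=model_names,
)

# Align both tables on the model name and compute the accuracy gap
comparison = with_hist.join(without_hist, lsuffix=" (hist)", rsuffix=" (no hist)")
comparison["Accuracy diff"] = (
    comparison["Accuracy (hist)"] - comparison["Accuracy (no hist)"]
)
print(comparison)
```

With the real outputs, `models.join(models2, lsuffix=" (hist)", rsuffix=" (no hist)")` would play the same role, provided both are DataFrames sharing the same index.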