learningmachine for Python (new version)
Want to share your content on python-bloggers? click here.
Last week, I (re)introduced learningmachine
, an R package for Machine Learning that includes uncertainty quantification for regression and classification, and explainability through sensitivity analysis. This week, I talk about learningmachine
for Python. The Python version is a port of the R package, which means:
- It’s faster to install if R is already installed on your machine (otherwise, the Python package will attempt to install R and the package dependencies by itself)
- If R and the package dependencies are not already installed it (learningmachine Python) may take a long time to get started, but ONLY the first time it’s installed and run
Not everything is ultra-smooth yet (documentation coming in a few weeks), but you can already do some advanced stuff, as shown below.
The next algorithm I’ll include in learningmachine
is the Bayesian one described in this document, that learns in a way that’s most intuitive to us (online instead of batch).
Install learningmachine
from GitHub (tested on macOS, OK on Posit Cloud, KO on Google Colab)
!pip install git+https://github.com/Techtonique/learningmachine_python.git --verbose
Examples
import learningmachine as lm import numpy as np import pandas as pd from sklearn.datasets import load_diabetes, load_wine from sklearn.datasets import load_wine, load_iris, load_breast_cancer from sklearn.datasets import fetch_california_housing from sklearn.model_selection import train_test_split, cross_val_score from rpy2.robjects.vectors import FloatMatrix, FloatVector, StrVector from time import time from sklearn.metrics import mean_squared_error from math import sqrt # 1. Regression diabetes = load_diabetes() X = pd.DataFrame(diabetes.data[:150], columns=diabetes.feature_names) y = diabetes.target[:150] X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1213) print("\n ----- fitting krr ----- \n") fit_obj2 = lm.Regressor(method="krr", pi_method="none") start = time() fit_obj2.fit(X_train, y_train, lambda_=0.05) # R's `lambda` is renamed as `lambda_` in Python as `lambda` is reserved print("Elapsed time: ", time() - start) print(fit_obj2.summary(X=X_test, y=y_test)) # 2. Classification datasets = [load_wine(), load_iris(), load_breast_cancer()] print("\n ----- fitting Kernel Ridge Regression ----- \n") for dataset in datasets: print(f"Description: {dataset.DESCR}") X = pd.DataFrame(dataset.data, columns=dataset.feature_names) y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123) fit_obj = lm.Classifier(method = "krr", pi_method="none") start = time() fit_obj.fit(X_train, y_train, reg_lambda = 0.05) print("Elapsed time: ", time() - start) ## Compute accuracy print(fit_obj.summary(X=X_test, y=y_test, class_index=0)) print("\n ----- fitting xgboost ----- \n") for dataset in datasets: print(f"Description: {dataset.DESCR}") X = pd.DataFrame(dataset.data, columns=dataset.feature_names) y = dataset.target X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123) fit_obj = lm.Classifier(method = "xgboost", pi_method="kdesplitconformal", type_prediction_set = 'score', B=100) print("nb_hidden = 0 -----") # no hidden layer start = time() fit_obj.fit(X_train, y_train, nrounds=100, eta=0.05, max__depth=4, verbose=0) # dot ('.') in R parameters is replaced by '__' print("Elapsed time: ", time() - start) print(fit_obj.predict(X_test)) print(fit_obj.summary(X=X_test, y=y_test, class_index=1)) # specify the class whose probability is of interest fit_obj = lm.Classifier(method = "xgboost", pi_method="kdesplitconformal", type_prediction_set = 'score', nb_hidden = 5, B=100) print("nb_hidden = 5 -----") # hidden layer with 5 nodes start = time() fit_obj.fit(X_train, y_train, nrounds=100, eta=0.05, max__depth=4, verbose=0) # dot ('.') in R parameters is replaced by '__' print("Elapsed time: ", time() - start) print(fit_obj.predict(X_test)) print(fit_obj.summary(X=X_test, y=y_test, class_index=1)) # specify the class whose probability is of interest
Want to share your content on python-bloggers? click here.