Elastic net regression combines the strengths of ridge and lasso regression in a single algorithm. What this means is that elastic net can remove weak variables from the model entirely, as lasso does, or shrink them to close to zero, as ridge does. All three algorithms are examples of regularized regression.
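Concretely, scikit-learn's ElasticNet documentation gives the objective being minimized, with the l1_ratio argument controlling the blend between the two penalties:

1/(2 * n_samples) * ||y - Xw||^2
    + alpha * l1_ratio * ||w||_1
    + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2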
This post will provide an example of elastic net regression in Python. Below are the steps of the analysis.
- Data preparation
- Baseline model development
- Elastic net model development
To accomplish this, we will use the Fair dataset from the pydataset library. Our goal will be to predict marriage satisfaction based on the other independent variables. Below is some initial code to begin the analysis.
from pydataset import data
import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 5000)
pd.set_option('display.max_columns', 5000)
pd.set_option('display.width', 10000)
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import ElasticNet
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
Data Preparation
We will now load our data. The only preparation we need to do is convert the factor variables to dummy variables. Then we will make our X and y datasets. Below is the code.
df = pd.DataFrame(data('Fair'))
df.loc[df.sex == 'male', 'sex'] = 0
df.loc[df.sex == 'female', 'sex'] = 1
df['sex'] = df['sex'].astype(int)
df.loc[df.child == 'no', 'child'] = 0
df.loc[df.child == 'yes', 'child'] = 1
df['child'] = df['child'].astype(int)
X = df[['religious', 'age', 'sex', 'ym', 'education', 'occupation', 'nbaffairs']]
y = df['rate']
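As an aside, the same recoding can be done more compactly with pandas. A minimal sketch, assuming the same Fair dataframe (note the resulting columns are coded 1 for male and 1 for yes, which differs from the manual coding above):

# Hypothetical alternative: let pandas create the dummy columns.
# drop_first=True keeps one column per binary factor.
df2 = pd.DataFrame(data('Fair'))
df2 = pd.get_dummies(df2, columns=['sex', 'child'], drop_first=True)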
We can now proceed to creating the baseline model.
Baseline Model
This model is a basic regression model used for comparison. We will instantiate our regression model, fit it, and finally calculate the mean squared error on the data. The code is below.
regression = LinearRegression()
regression.fit(X, y)
first_model = mean_squared_error(y_true=y, y_pred=regression.predict(X))
print(first_model)
1.0498738644696668
This mean squared error score of 1.05 is our benchmark for determining whether the elastic net model will be better or worse. Below are the coefficients of this first model. We use a for loop to go through the model's coefficients and the zip function to pair them with the column names.
coef_dict_baseline = {}
for coef, feat in zip(regression.coef_, X.columns):
    coef_dict_baseline[feat] = coef
coef_dict_baseline
Out[63]:
{'religious': 0.04235281110639178,
 'age': -0.009059645428673819,
 'sex': 0.08882013337087094,
 'ym': -0.030458802565476516,
 'education': 0.06810255742293699,
 'occupation': -0.005979506852998164,
 'nbaffairs': -0.07882571247653956}
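One caveat worth keeping in mind: the 1.05 above is an in-sample (training) MSE, while the grid search score we compute later is a 10-fold cross-validated estimate, so the two are not strictly comparable. For a like-for-like benchmark you could cross-validate the baseline as well. A minimal sketch:

from sklearn.model_selection import cross_val_score

# 10-fold cross-validated MSE for the plain linear regression baseline.
cv_scores = cross_val_score(LinearRegression(), X, y,
                            scoring='neg_mean_squared_error', cv=10)
print(abs(cv_scores.mean()))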
We will now move to making the elastic net model.
Elastic Net Model
Elastic net, just like ridge and lasso regression, requires normalized data. This is handled here by an argument set inside the ElasticNet function. The second thing we need to do is create our grid. This is the same grid as we created for ridge and lasso in prior posts. The only thing that is new is the l1_ratio argument.
When l1_ratio is set to 0, the penalty is the same as ridge regression; when it is set to 1, it is lasso. Anything between 0 and 1 is a mix of the two. Therefore, in our grid, we need to try several values of this argument. Below is the code.
elastic = ElasticNet(normalize=True)
search = GridSearchCV(estimator=elastic,
                      param_grid={'alpha': np.logspace(-5, 2, 8),
                                  'l1_ratio': [.2, .4, .6, .8]},
                      scoring='neg_mean_squared_error',
                      n_jobs=1, refit=True, cv=10)
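A quick version note: the normalize argument was deprecated in scikit-learn 1.0 and removed in 1.2, so on current versions the usual replacement is to scale the features in a pipeline. A minimal sketch, assuming a recent scikit-learn (StandardScaler scales differently from the old normalize behavior, so the tuned values will not match exactly):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Pipeline replacement for ElasticNet(normalize=True) on scikit-learn >= 1.2.
pipe = Pipeline([('scale', StandardScaler()),
                 ('elastic', ElasticNet())])
search_pipe = GridSearchCV(estimator=pipe,
                           param_grid={'elastic__alpha': np.logspace(-5, 2, 8),
                                       'elastic__l1_ratio': [.2, .4, .6, .8]},
                           scoring='neg_mean_squared_error',
                           n_jobs=1, refit=True, cv=10)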
We will now fit our model and display the best parameters and the best results we can get with that setup.
search.fit(X, y)
search.best_params_
Out[73]: {'alpha': 0.001, 'l1_ratio': 0.8}
abs(search.best_score_)
Out[74]: 1.0816514028705004
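If you want to see how every alpha and l1_ratio combination fared, not just the winner, the full grid is available as well. A minimal sketch:

# cv_results_ holds the mean CV score for every grid point.
results = pd.DataFrame(search.cv_results_)
print(results[['param_alpha', 'param_l1_ratio', 'mean_test_score']]
      .sort_values('mean_test_score', ascending=False).head())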
The best hyperparameters were an alpha of 0.001 and an l1_ratio of 0.8. With these settings we got a cross-validated MSE of 1.08. This is above the 1.05 of our baseline model, which suggests that elastic net is doing worse than linear regression (keeping in mind that the baseline's 1.05 was computed in-sample rather than cross-validated). For clarity, we will set our hyperparameters to values close to the recommended ones and refit on the data.
elastic = ElasticNet(normalize=True, alpha=0.001, l1_ratio=0.75)
elastic.fit(X, y)
second_model = mean_squared_error(y_true=y, y_pred=elastic.predict(X))
print(second_model)
1.0566430678343806
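As a shortcut, because we passed refit=True, GridSearchCV has already refit the winning model (alpha=0.001, l1_ratio=0.8) on the full data, so the same kind of in-sample MSE can be computed without retyping the hyperparameters. A sketch:

# best_estimator_ is the model refit on all of X, y with the best grid settings.
best = search.best_estimator_
print(mean_squared_error(y_true=y, y_pred=best.predict(X)))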
Either way, the in-sample MSE is now about the same as the baseline. Below are the coefficients.
coef_dict_baseline = {}
for coef, feat in zip(elastic.coef_, X.columns):
    coef_dict_baseline[feat] = coef
coef_dict_baseline
Out[76]:
{'religious': 0.01947541724957858,
 'age': -0.008630896492807691,
 'sex': 0.018116464568090795,
 'ym': -0.024224831274512956,
 'education': 0.04429085595448633,
 'occupation': -0.0,
 'nbaffairs': -0.06679513627963515}
The coefficients are mostly similar, though generally shrunk toward zero. Notice that occupation was completely removed from the model in the elastic net version. This means that this variable contributed little to the algorithm's predictions. Traditional regression cannot do this.
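This lasso-style selection can also be read off programmatically. A minimal sketch that lists every feature elastic net zeroed out:

# Features whose coefficients were shrunk exactly to zero (dropped from the model).
dropped = [feat for feat, coef in zip(X.columns, elastic.coef_) if coef == 0]
print(dropped)  # ['occupation'] with the fit above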
Conclusion
This post provided an example of elastic net regression. Elastic net regression allows for the maximum flexibility in terms of finding the best combination of ridge and lasso regression characteristics. This flexibility is what gives elastic net its power.