Articles by Dr. Darrin

Visualizations with Altair

February 27, 2022 | Dr. Darrin

We are going to take a look at Altair, which is a data visualization library for Python. What is unique about Altair compared to other packages covered on this blog is that it allows for interactions. The interactions can take place inside Jupyter or they can be exported and loaded ...

Random Forest Classification with Python

March 31, 2019 | Dr. Darrin

Random forest is a machine learning algorithm that builds as many decision trees as you specify, each of which may use different features and a different subsample of the data. The trees then vote to determine the class of an example. This approach helps to deal with the ...
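
To make the voting idea concrete, here is a minimal sketch in plain Python. It is not the code from the article: the helper names (`fit_stump`, `fit_forest`, `predict_forest`) and the toy data are hypothetical, and one-split "stumps" stand in for full decision trees to keep the example short.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Each tree sees a bootstrap subsample and one randomly chosen feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        feature = rng.randrange(len(rows[0]))        # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """The trees vote; the most common prediction wins."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes described by two features.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (8.0, 9.0), (8.5, 9.5), (9.0, 8.7)]
labels = ["a", "a", "a", "b", "b", "b"]
forest = fit_forest(rows, labels)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (10.0, 10.0)))
```

An individual stump can be wrong when its subsample happens to be unrepresentative, but the majority vote across many stumps smooths this out.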

Data Exploration Case Study: Credit Default

February 21, 2019 | Dr. Darrin

Exploratory data analysis is the main task of a data scientist, with as much as 60% of their time being devoted to it. As such, the majority of their time is spent on something that is rather boring compared to building models. This post will provide a simple example of ...

RANSAC Regression in Python

February 7, 2019 | Dr. Darrin

RANSAC is an acronym for Random Sample Consensus. This algorithm fits a regression model on the subset of the data that it judges to be inliers, while excluding outliers. This naturally improves the fit of the model due to the removal of some data points. The process that ...
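
The inlier/outlier idea can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions (a straight line, a fixed residual threshold, made-up data) and not the article's implementation; `fit_line` and `ransac_line` are hypothetical helper names.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ransac_line(points, n_trials=100, threshold=0.5, seed=0):
    """Keep the candidate line with the most inliers, then refit on those inliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # minimal random sample
        if x1 == x2:
            continue                                 # vertical line, skip this trial
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Assumes at least one trial produced a usable line.
    return fit_line(best_inliers), best_inliers

# Hypothetical data: ten points on y = 2x + 1 plus two gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)] + [(2.5, 40.0), (7.5, -30.0)]
(slope, intercept), inliers = ransac_line(points)
naive_slope, naive_intercept = fit_line(points)      # least squares on everything
print(slope, intercept, naive_slope, naive_intercept)
```

Comparing the two fits shows how much the two outliers pull an ordinary least squares line away from the trend that the remaining points follow.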

Combining Algorithms for Classification with Python

January 20, 2019 | Dr. Darrin

Many approaches in machine learning involve building many models and combining their strengths and weaknesses to make more accurate classifications. Generally, when this is done, the same algorithm is used for every model. For example, random forest is simply many decision trees being developed. Even when bagging or boosting is being ...

Gradient Boosting Regression in Python

January 13, 2019 | Dr. Darrin

In this post, we will take a look at gradient boosting for regression. Gradient boosting builds models sequentially, with each new model trying to explain the examples that the previous models did not explain. This approach is why many consider gradient boosting superior to AdaBoost. Regression trees are most commonly paired with boosting. There ...
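
As a rough illustration of "sequential models that explain what earlier models missed", the sketch below fits each new one-split regression stump to the residuals of the current prediction. It assumes squared-error loss and a fixed learning rate; the function names and the data are hypothetical and are not taken from the article.

```python
def fit_regression_stump(xs, residuals):
    """Find the split on x that minimises the squared error of the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:           # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to what is still unexplained."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_regression_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if x <= threshold else right_mean)
    return total

def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical data with a step-like trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]
baseline = fit_gradient_boosting(xs, ys, n_rounds=0)  # predicts the mean only
boosted = fit_gradient_boosting(xs, ys, n_rounds=50)
print(mean_squared_error(baseline, xs, ys), mean_squared_error(boosted, xs, ys))
```

With squared-error loss the residuals are the negative gradient of the loss, which is where the "gradient" in the name comes from. The comparison here is on the training data only, so it says nothing about how well the model generalizes.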

Gradient Boosting Classification in Python

January 8, 2019 | Dr. Darrin

Gradient boosting is an alternative form of boosting to AdaBoost. Many consider gradient boosting to be a better performer than AdaBoost. One difference between the two algorithms is that gradient boosting uses optimization to weight the estimators. Like AdaBoost, gradient boosting can be used with most algorithms but is commonly ...

AdaBoost Regression with Python

January 6, 2019 | Dr. Darrin

This post will share how to use the AdaBoost algorithm for regression in Python. Boosting builds multiple models in a sequential manner. Each newer model tries to successfully predict what older models struggled with. For regression, the average of the models is used for the ...

AdaBoost Classification in Python

January 1, 2019 | Dr. Darrin

Boosting is a technique in machine learning in which multiple models are developed sequentially. Each new model tries to successfully predict what prior models could not. The average is used for regression and the majority vote for classification. For classification, boosting is commonly associated with decision trees. However, boosting ...

Recommendation Engine with Python

December 25, 2018 | Dr. Darrin

Recommendation engines make suggestions to a person based on their prior behavior. There are several ways to develop recommendation engines, but for our purposes, we will be looking at the development of a user-based collaborative filter. This type of filter takes the ratings of others to suggest future items to ...
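
A user-based collaborative filter can be sketched as follows. This is a minimal illustration, not the article's code: it assumes cosine similarity over co-rated items and a similarity-weighted average of other users' ratings, and the users, items, and ratings are invented.

```python
from math import sqrt

# Hypothetical ratings on a 1 to 5 scale.
ratings = {
    "ann": {"alien": 5, "brazil": 4, "casablanca": 1},
    "bob": {"alien": 5, "brazil": 5, "casablanca": 1, "dune": 5},
    "cat": {"alien": 1, "brazil": 2, "casablanca": 5, "dune": 1, "eraserhead": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(a[item] ** 2 for item in shared))
    norm_b = sqrt(sum(b[item] ** 2 for item in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Score unseen items by the similarity-weighted average of other users' ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue                                  # nothing in common, ignore
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
                weights[item] = weights.get(item, 0.0) + similarity
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted.items(), key=lambda pair: pair[1], reverse=True)[:top_n]

print(recommend("ann", ratings))
```

Because "ann" rates films much more like "bob" than like "cat", bob's high rating of "dune" carries the most weight and "dune" comes out as the top suggestion.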