What is Artificial Intelligence?—Training a Simple Neural Network Model
Artificial intelligence (AI) has widespread applications in the real world, including healthcare, robotics, autonomous vehicles, retail, and eCommerce. Together, AI, machine learning (ML), and deep learning (DL) solve various academic and industrial problems.
Due to AI’s accurate predictive capabilities, its use has skyrocketed over the last decade, and this growth is expected to continue. Currently, the global AI market is valued at $387.45 billion and is expected to grow to $1,394.30 billion by 2029.
With AI, we can develop smart machines and devices capable of performing tasks that usually require human intelligence.
In this blog, we will explore the field of artificial intelligence, its major types, and its industrial applications. We’ll also share a practical artificial intelligence example at the end. To understand how AI algorithms work, we will build an Artificial Neural Network (ANN) model using the Titanic Kaggle dataset to perform a simple classification task.
What is Artificial Intelligence?
Artificial intelligence is a set of tools and techniques that enables machines to mimic human intelligence using loads of data and smart algorithms known as models.
AI is an iterative process where the machine or algorithmic models learn from experience (historical data) during the model training phase. Then the trained AI models predict the output of real-world data for various tasks such as classification, regression, and clustering.
In summary, the term “AI” can be applied to a machine that performs actions related to the human mind and cognition, such as thinking, learning, seeing, listening, and problem-solving.
How Does Artificial Intelligence Work?
AI processes a large amount of data using an iterative algorithm to identify underlying patterns in the dataset. Every time an AI model executes, it tests and evaluates its performance. The results are observed and recorded to improve the model for a better outcome.
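To make this iterative loop concrete, here is a toy sketch (a hypothetical illustration, not any particular AI system) in which gradient descent repeatedly predicts, measures its error, and nudges a single weight to improve:

# Toy iterative learning: adjust one weight w so that w * x approximates y
x, y = 3.0, 12.0   # a single training example; the ideal weight is 4.0
w = 0.0            # initial guess
lr = 0.01          # learning rate

for step in range(100):
    pred = w * x             # make a prediction
    error = pred - y         # evaluate performance
    w -= lr * 2 * error * x  # use the observed result to improve the model

print(w)  # converges towards 4.0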
AI is an umbrella term–a combination of various technologies that work collectively to solve real-world problems. Some of these technologies are:
- Computer vision (CV)
- Machine learning
- Deep learning
- Natural language processing (NLP)
- Neural networks
- Cognitive computing
Types of Artificial Intelligence
Broadly, artificial intelligence can be classified into four main types, which are listed below:
- Reactive Machines
- Limited Memory
- Theory of Mind
- Self-Aware
Reactive Machines
Reactive machines perform basic operations and are regarded as the simplest level of AI. There is no learning phase at this level: the machine reacts to input with some output, without any information about past events. Static machine learning models, such as simple chess agents, are considered reactive machines and have the simplest architecture.
Limited Memory
This type of artificial intelligence model can record previous events and later use that data to make better predictions. These models have limited memory designated for a particular task, such as the last three actions taken in a specific task. Due to this memory use, their machine learning architecture becomes more complex compared to reactive machines.
Theory of Mind
In this type of artificial intelligence, the AI program interacts with or understands the emotions and thoughts of human beings. It is still in the early stages of development and can be observed in Natural Language Processing (NLP) models and applications. Modern NLP models can understand contextual information in human language to some extent.
Self-Aware
Lastly, in the future, AI models may become self-aware; at least, that’s what many AI enthusiasts believe. This kind of AI exists only in theory and imagination, which is why it instills fear in audiences concerned about AI taking over the world. Self-aware AI would be beyond human intelligence and is considered independent intelligence.
Currently, AI applications are somewhere between limited memory and theory of mind. However, these applications are becoming more intelligent every day. One day, we might see a truly self-aware machine as well–who knows!
Why artificial neural networks?
One of the most popular domains in artificial intelligence is artificial neural networks (ANNs).
Artificial neural networks are a type of machine learning model inspired by the human brain. The first artificial neural network was proposed by McCulloch and Pitts in 1943: a mathematical model of a simplified neuron whose units could be interconnected into networks.
Since then, there have been many different types of artificial neural networks developed, most notably the Perceptron invented by Frank Rosenblatt in 1958. This was the first time that an artificial neural network could be trained to perform a specific task without being told how to do it beforehand.
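To make the idea concrete, here is a minimal sketch of such a threshold unit (an illustration written for this article, not the original 1943 or 1958 formulation): a weighted sum of inputs followed by a hard threshold.

import numpy as np

def threshold_unit(inputs, weights, bias):
    # Weighted sum of the inputs, then a step-function threshold
    activation = np.dot(inputs, weights) + bias
    return 1 if activation > 0 else 0

# Example: weights and bias chosen so the unit behaves like a logical AND gate
weights = np.array([1.0, 1.0])
bias = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, threshold_unit(np.array(x), weights, bias))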
Recently, ANNs have become very popular due to the rise of deep learning, which has applications in countless diverse fields: from classification and regression, to autonomous vehicles, and natural language processing.
Artificial Intelligence Example—Training an Artificial Neural Network Model for Classification
To explain artificial intelligence and how it works, we have used the Titanic dataset, which is widely available on Kaggle. The dataset contains Titanic passenger information such as age, sex, ticket number, and cabin number. The goal is to predict whether a passenger survived the Titanic disaster or not!
We’ll apply the Artificial Neural Network (ANN) algorithm to the given dataset to make predictions. A typical ANN model has three types of closely interconnected layers, namely the input layer, the hidden layer, and the output layer.
Let’s discuss the steps involved in building an ANN model.
Step 1: Import libraries in Python
First, we will import some essential libraries. In Python, you can install any missing library with the pip install library_name command and then import it into your script.
We’ll use the keras API for building the neural network, seaborn and matplotlib for visualization, pandas for data manipulation, and sklearn for data preprocessing. The code snippet below imports all these libraries.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import tensorflow as tf
from tensorflow import keras
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
Step 2: Loading the dataset
Our Titanic data is stored in the train.csv file. Load the data using read_csv(); pandas offers the DataFrame structure to hold the data, along with various data manipulation methods. Next, use the pandas head() method to return the first five rows of the dataset.
Titanic_df = pd.read_csv('train.csv')
Titanic_df.head()
Step 3: Data visualization
Data visualization helps inspect the data to understand its underlying patterns better. For instance, the bar plot below shows that passengers holding class 1 tickets had the highest survival rate on the Titanic.
sns.barplot(x='Pclass', y='Survived', data=Titanic_df)
As another example, visualizing Pearson’s correlation matrix indicates the strength of the relationship between dataset variables and shows whether they have a positive or negative correlation.
In Pearson’s correlation, values range from -1 to +1: +1 means a perfect positive correlation, -1 means a perfect negative correlation, and 0 means no linear correlation at all.
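For intuition, Pearson’s r is the covariance of two variables divided by the product of their standard deviations. A quick sketch on hypothetical toy arrays (not the Titanic data):

import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 6])

# Pearson's r = cov(x, y) / (std(x) * std(y)), using sample statistics (ddof=1)
r = np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(r)
print(np.corrcoef(x, y)[0, 1])  # NumPy's built-in version gives the same value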
The correlation matrix helps identify important features. Generally, features that are highly correlated with each other are redundant, and we can remove one of them to obtain better results. Follow the code snippet below.
plt.figure(figsize=(10, 6))
# numeric_only=True (pandas >= 1.5) restricts the matrix to numeric columns
sns.heatmap(Titanic_df.corr(numeric_only=True), annot=True, cmap='coolwarm').set_title('seaborn')
plt.show()
The count plot below, which shows the total number of people in each group who survived, indicates that the survival rate of females is higher than that of males.
sns.catplot(x="Sex", hue="Survived", kind="count", data=Titanic_df)
Step 4: Data cleaning
In the data cleaning step, we perform a series of steps to extract high-quality data. First, we detect and remove the missing values from the dataset so the model can perform well. Use the pandas isnull() method to count the missing values in each column.
Titanic_df.isnull().sum()
There are missing values in certain columns, namely Age, Cabin, and Embarked. Typically, a column with a very high share of missing values can be dropped entirely.
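One way to justify this decision is to check the share of missing values per column; a short sketch:

# Fraction of missing values in each column, largest first
Titanic_df.isnull().mean().sort_values(ascending=False)
# In the standard Kaggle train.csv, Cabin is missing for roughly 77% of rows,
# which is why it is a natural candidate to drop.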
We can also identify unimportant columns using various techniques and drop them, as they don’t contribute much to the final output. For instance, PassengerId is an unimportant column because IDs carry no predictive information for the neural network. For the sake of simplicity in this tutorial, we are keeping only four columns: the Survived target plus three features.
T_df1 = Titanic_df.drop(['PassengerId', 'Name', 'Ticket', 'Fare', 'Cabin',
                         'Embarked', 'SibSp', 'Parch'], axis=1)
T_df1.shape
Now, let’s remove the missing values that we identified above. First, inspect the unique values of each remaining column using the DataFrame’s unique() method; missing values show up as “nan”. Then eliminate the rows containing them with the dropna() method.
def show_unique(df):
    for col in df:
        print(f'{col}: {df[col].unique()}')

show_unique(T_df1)
T_df1 = T_df1.dropna()
T_df1.shape, show_unique(T_df1)
After that, change the Sex column, which contains “male” and “female”, into binary values of 0 and 1, where 0 means female and 1 means male.
# Change the Sex column into integer values 0 and 1
T_df1['Sex'] = T_df1['Sex'].replace(['female', 'male'], [0, 1])
T_df1.loc[426]  # spot-check a single row
Next, separate the remaining columns into features and the Survived target, store them in variables, and transform them into NumPy arrays for further processing.
X_Var = T_df1.drop(['Survived'], axis=1)
Y_Var = T_df1.Survived
X_Var.shape
X_Var = X_Var.to_numpy()
Y_Var = Y_Var.to_numpy()
Finally, normalize the data using min-max normalization, where each column’s minimum value is mapped to zero and its maximum value is mapped to one.
scaler = MinMaxScaler()
sx = scaler.fit_transform(X_Var)
X_Var = sx
X_Var
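Under the hood, the scaler maps each column with x' = (x - min) / (max - min). As a sanity check (an illustrative sketch, not a replacement for MinMaxScaler), the same transformation in plain NumPy:

import numpy as np

def min_max(a):
    # Column-wise min-max normalization: each column's min maps to 0, max to 1
    return (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))

# On the unscaled feature matrix, this matches scaler.fit_transform(X_Var)
print(min_max(np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])))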
Step 5: Training and testing data
Now, split the data into two parts: training data and testing data. Then apply the ANN algorithm using keras, the artificial neural network and deep learning library. Our sequential ANN model contains dense layers and activation functions. In a dense layer, every neuron is connected to all the neurons in the previous layer, while an activation function decides whether a neuron’s output contributes to the final output of the network.
X_Var_train, X_Var_test, Y_Var_train, Y_Var_test = train_test_split(
    X_Var, Y_Var, test_size=0.2, random_state=42)
X_Var_train.shape, X_Var_test.shape
The code snippet below builds the model. You can experiment with the number of layers and their parameters.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(3,)),       # three input features: Pclass, Sex, Age
    keras.layers.Dense(10, activation='linear'),
    keras.layers.Dense(10, activation='linear'),
    keras.layers.Dense(10, activation='linear'),
    keras.layers.Dropout(0.1),                    # dropout to reduce overfitting
    keras.layers.Dense(1, activation='sigmoid'),  # sigmoid output for binary classification
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(X_Var_train, Y_Var_train, epochs=100)
Generally, the model’s accuracy increases with each training epoch, and the loss value decreases, which shows that our training phase is working as expected.
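One way to verify this is to plot the training curves. Keras returns a History object from fit(), so this assumes you capture it first (for example, history = model.fit(...) instead of the bare call above):

# Assumes: history = model.fit(X_Var_train, Y_Var_train, epochs=100)
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['accuracy'], label='accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()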
Finally, we can test our model on the unseen (testing) data to check how well it predicts, using the model’s evaluate() and predict() methods. Below, we inspect the first few test samples alongside their true labels and the model’s predictions.
model.evaluate(X_Var_test, Y_Var_test)
# Testing the model on data
X_Var_test[:5], Y_Var_test[:5]
# Convert the predictions to class labels based on a threshold value
Y_Var_pred = model.predict(X_Var_test)  # pass the array directly, not wrapped in a list
Y_Var_pred_actual = list()
for i in range(len(Y_Var_pred)):
    if Y_Var_pred[i] > 0.5:
        Y_Var_pred_actual.append(1)
    else:
        Y_Var_pred_actual.append(0)
Y_Var_pred_actual[:5]
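To put a number on how well the thresholded predictions match the true labels, you can use standard sklearn metrics (a quick sketch; the exact scores will depend on your split and training run):

from sklearn.metrics import accuracy_score, confusion_matrix

# Compare the thresholded predictions against the true test labels
print(accuracy_score(Y_Var_test, Y_Var_pred_actual))
print(confusion_matrix(Y_Var_test, Y_Var_pred_actual))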
What’s Next in AI?
Artificial intelligence (AI) is booming, and growing exponentially. New technological systems are being used to further extend human intelligence. Artificial neural networks and deep learning are among the most popular techniques in AI, with applications in diverse fields such as computer vision, autonomous vehicles, and natural language processing.
Today, AI has become the top trend in almost every industry. AI-based applications and systems are now being used to classify even the largest sets of galaxies and stars based on astronomical image data. The future of AI-based applications looks promising as AI continues to revolutionize the world.
Do you want to become a data scientist? Make sure to get in touch or to check out my data science course.