In a previous post, we explained how to predict stock prices using machine learning models. Today, we will show how to use a more advanced model, the Long Short-Term Memory (LSTM) network. In previous posts, we used LSTM models for Natural Language Generation (NLG), such as word-based and character-based NLG models.
The LSTM Model
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in deep learning. Unlike standard feedforward networks, it has feedback connections. It can process not only single data points, such as images, but also entire sequences of data, such as speech or video. For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition, machine translation, anomaly detection and time series analysis.
LSTM models are computationally expensive and require many data points. We usually train them on a GPU rather than a CPU. TensorFlow is a great library for training LSTM models.
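To make the "feedback connections" concrete, here is a minimal NumPy sketch of a single LSTM time step in its standard textbook formulation. The helper names (`sigmoid`, `lstm_step`), the random weights and the order in which the four gates are stacked are assumptions made for illustration; this is not how Keras lays out its weights internally.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step.

    x_t: input at this step, shape (input_dim,)
    h_prev, c_prev: previous hidden and cell state, shape (units,)
    W: input weights, shape (4 * units, input_dim)
    U: recurrent weights, shape (4 * units, units)
    b: biases, shape (4 * units,)
    Assumed stacking order of the gates: input, forget, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b           # all four gate pre-activations at once
    i = sigmoid(z[0 * units:1 * units])    # input gate
    f = sigmoid(z[1 * units:2 * units])    # forget gate
    g = np.tanh(z[2 * units:3 * units])    # candidate cell state
    o = sigmoid(z[3 * units:4 * units])    # output gate
    c_t = f * c_prev + i * g               # keep part of the old memory, add new
    h_t = o * np.tanh(c_t)                 # hidden state fed back at the next step
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, units, timesteps = 1, 4, 10
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

# Feed a random sequence of 10 one-dimensional observations through the cell
sequence = rng.normal(size=(timesteps, input_dim))
h, c = np.zeros(units), np.zeros(units)
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)

n_params = W.size + U.size + b.size
print(h.shape, n_params)
```

The hidden state `h` and cell state `c` carried from one step to the next are what let the network use earlier observations when processing later ones.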
LSTM model for Stock Prices
Get the Data
We will build an LSTM model to predict hourly stock prices. The analysis is reproducible, so you can follow along. First, we need to load the data. As an example, we take the AMZN ticker and consider the hourly close prices from '2019-06-01' to '2021-01-07'.
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Download the hourly AMZN prices and keep only the closing price
data = yf.download('AMZN', start='2019-06-01', interval='1h', end='2021-01-07', progress=False)[['Close']]
data.head()
data.plot(figsize=(10,10))
Prepare the Data
Our train data will have as features the look-back values, that is, the lagged values of the series, whose number we denote by 'lb'. For this example, we set lb equal to 10. Notice that we fit the MinMaxScaler() from scikit-learn on an initial slice of the series only (the first 80% of the observations in the code), so that the most recent observations do not influence the scaling. Finally, for this example, we keep the first 90% of the observations as the train dataset and the remaining 10% as the test dataset.
from sklearn.preprocessing import MinMaxScaler

cl = data.Close.astype('float32')
train = cl[0:int(len(cl)*0.80)]

# Fit the scaler on the first 80% of the series, then transform the whole series
scl = MinMaxScaler()
scl.fit(train.values.reshape(-1,1))
cl = scl.transform(cl.values.reshape(-1,1))

# Create a function to process the data into look-back slices of lb observations
def processData(data, lb):
    X, Y = [], []
    for i in range(len(data)-lb-1):
        X.append(data[i:(i+lb),0])
        Y.append(data[(i+lb),0])
    return np.array(X), np.array(Y)

lb = 10
X, y = processData(cl, lb)

# Create the train and test datasets (90-10)
X_train, X_test = X[:int(X.shape[0]*0.90)], X[int(X.shape[0]*0.90):]
y_train, y_test = y[:int(y.shape[0]*0.90)], y[int(y.shape[0]*0.90):]
print(X_train.shape[0], X_train.shape[1])
print(X_test.shape[0], X_test.shape[1])
print(y_train.shape[0])
print(y_test.shape[0])
2520 10
281 10
2520
281
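To see what the look-back slicing and the scaling produce, here is a toy sketch on a ten-value series with lb equal to 3. The helper `make_windows` is a hypothetical one-dimensional stand-in for processData, and the min-max scaling is written out by hand instead of using MinMaxScaler, purely for illustration.

```python
import numpy as np

def make_windows(series, lb):
    """Build look-back windows X and next-step targets y from a 1-D series.

    Mirrors the loop bounds of processData, so the last value of the
    series is never used as a target.
    """
    X, y = [], []
    for i in range(len(series) - lb - 1):
        X.append(series[i:i + lb])   # lb consecutive observations
        y.append(series[i + lb])     # the observation right after them
    return np.array(X), np.array(y)

series = np.arange(10, dtype="float32")      # 0, 1, ..., 9

# Min-max scaling fitted on the first 80% of the series only
fit_part = series[:int(len(series) * 0.80)]  # 0, 1, ..., 7
lo, hi = fit_part.min(), fit_part.max()
scaled = (series - lo) / (hi - lo)

X_toy, y_toy = make_windows(series, lb=3)
print(X_toy[0], y_toy[0])   # first window and its target
print(X_toy.shape, y_toy.shape)
print(scaled.max())         # later values can fall outside [0, 1]
```

Each row of `X_toy` holds three consecutive values and the matching entry of `y_toy` is the value that follows them. Because the scaler only sees the early part of the series, later prices that exceed the earlier maximum are mapped above 1.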
Build the LSTM Model
We will work with the following LSTM Model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Build the model
model = Sequential()
model.add(LSTM(256, input_shape=(lb,1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

# Reshape data for (Sample, Timestep, Features)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

# Fit model with history to check for overfitting
history = model.fit(X_train, y_train, epochs=300, validation_data=(X_test,y_test), shuffle=False)
model.summary()
Model: "sequential_33"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_51 (LSTM)               (None, 256)               264192
_________________________________________________________________
dense_29 (Dense)             (None, 1)                 257
=================================================================
Total params: 264,449
Trainable params: 264,449
Non-trainable params: 0
_________________________________________________________________
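The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with an input kernel, a recurrent kernel and a bias; the function names are hypothetical helpers, not part of Keras.

```python
def lstm_param_count(units, input_dim):
    # Four gates, each with input weights (input_dim x units),
    # recurrent weights (units x units) and one bias per unit
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(inputs, outputs):
    # One weight per input-output pair plus one bias per output
    return inputs * outputs + outputs

lstm_params = lstm_param_count(units=256, input_dim=1)
dense_params = dense_param_count(inputs=256, outputs=1)
total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)  # 264192 257 264449
```

Almost all of the parameters come from the 256 x 256 recurrent kernels, which is why the number of units drives the size of the model far more than the look-back length does.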
Let’s get the predictions on the train and test dataset:
plt.figure(figsize=(12,8))
Xt = model.predict(X_train)
plt.plot(scl.inverse_transform(y_train.reshape(-1,1)), label="Actual")
plt.plot(scl.inverse_transform(Xt), label="Predicted")
plt.legend()
plt.title("Train Dataset")
plt.figure(figsize=(12,8))
Xt = model.predict(X_test)
plt.plot(scl.inverse_transform(y_test.reshape(-1,1)), label="Actual")
plt.plot(scl.inverse_transform(Xt), label="Predicted")
plt.legend()
plt.title("Test Dataset")
The model seems to fit the train dataset almost perfectly and to perform very well on the test dataset. However, we need to stress that the model predicts only a single observation ahead: for every new observation that it has to predict, it takes as input the previous 10 actual observations. In real life, this is of limited use, since we mainly want to predict many observations ahead. Let's see how we can do it in Python.
Predict N Steps Ahead
The logic here is to feed the newly predicted values back into the model as input features, so that we can predict N steps ahead. In our case, we will predict 281 observations ahead, as many as there are observations in the test dataset.
def processData(data, lb):
    X = []
    for i in range(len(data)-lb-1):
        X.append(data[i:(i+lb),0])
    return np.array(X)

# Create the x_test_dummy: a copy of the scaled series whose test part
# is progressively overwritten with the model's own predictions
cl2 = cl.copy()
pred = []
for i in range(X_test.shape[0]):
    # Predict observation i of the test set from its current look-back window
    next_value = model.predict(X_test[i:i+1])[0]
    cl2[int(X.shape[0]*0.90)+i+lb] = next_value
    pred.extend(next_value)
    # Rebuild the windows so that later inputs contain the predicted value
    X = processData(cl2, lb)
    X_train, X_test = X[:int(X.shape[0]*0.90)], X[int(X.shape[0]*0.90):]
    X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

prediction = scl.inverse_transform(np.array(pred).reshape(-1, 1))
plt.figure(figsize=(12,8))
plt.plot(scl.inverse_transform(y_test.reshape(-1,1)), label="Actual")
plt.plot(prediction, label="Predicted")
plt.legend()
plt.title(f"Test Dataset {len(pred)} Obs Ahead")
Although the model worked very well when predicting one observation ahead, it failed when we tried to predict N observations ahead. This makes sense because the error compounds: our features are themselves predicted values that already include an error. Also, no model will be able to reliably predict the future n observations ahead.
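The compounding of the error can be reproduced without any neural network. The sketch below is a toy illustration under stated assumptions: the true series grows by exactly 1.0 per step, and a hypothetical one-step model (`biased_model`) over-predicts each step by 0.1. The helper `recursive_forecast` is also hypothetical and only isolates the feedback loop described above.

```python
import numpy as np

def recursive_forecast(history, predict_one, n_steps, lb):
    """Predict n_steps ahead by feeding each prediction back as an input."""
    window = list(history[-lb:])
    forecasts = []
    for _ in range(n_steps):
        next_value = predict_one(np.array(window))
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # drop the oldest value, append the prediction
    return np.array(forecasts)

# Toy data: the true series increases by exactly 1.0 per step
history = np.arange(10, dtype=float)       # 0, 1, ..., 9
future = np.arange(10, 15, dtype=float)    # 10, 11, ..., 14

# Hypothetical one-step model with a small constant bias of +0.1
def biased_model(window):
    return window[-1] + 1.1

# One step ahead: every input window contains actual values only
full = np.concatenate([history, future])
one_step = np.array([biased_model(full[t - 3:t]) for t in range(10, 15)])

# N steps ahead: input windows progressively fill up with predictions
multi_step = recursive_forecast(history, biased_model, n_steps=5, lb=3)

one_step_err = np.abs(one_step - future)
multi_step_err = np.abs(multi_step - future)
print(one_step_err)
print(multi_step_err)
```

In this toy setting the one-step error stays at 0.1, while the recursive error grows by 0.1 with every step, because each prediction inherits the error of the one before it. A real model's errors are not constant, so the drift is usually less regular, but the mechanism is the same.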