Building a Rule-based Chatbot in Python
Chatbots have become extremely popular in recent years and their use in the industry has skyrocketed. They have found a strong foothold in almost every task that requires text-based public dealing. They have become so critical in the support industry, for example, that almost 25% of all customer service operations are expected to use them by 2020.
In the first part of A Beginners Guide to Chatbots, we discussed what chatbots were, their rise to popularity and their use-cases in the industry. We also saw how the technology has evolved over the past 50 years.
In this second part of the series, we’ll be taking you through how to build a simple Rule-based chatbot in Python. Before we start with the tutorial, we need to understand the different types of chatbots and how they work.
Types of Chatbots
Chatbots can be classified into two different types, based on how they are built:
Rule-Based Chatbots
Rule-based chatbots are pretty straightforward. They are provided with a database of responses and a set of rules that help them pick out an appropriate response from that database. They cannot generate their own answers, but with an extensive database of answers and smartly designed rules, they can be very productive and useful.
The simplest form of rule-based chatbot has a one-to-one table of inputs and their responses. These bots are extremely limited and can only respond to a query if it is an exact match with one of the inputs defined in their database.
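As a minimal sketch of such a one-to-one table, the snippet below stores a few inputs and their responses in a dictionary. The table contents and the `exact_match_reply` helper are hypothetical and only illustrate the limitation described above: anything that is not an exact match falls through to a default reply.

```python
# Hypothetical one-to-one table of inputs and responses (illustrative only)
exact_responses = {
    "hi": "Hello! How can I help you?",
    "what are your hours": "We are open from 9AM to 5PM, Monday to Friday.",
}

def exact_match_reply(user_input):
    """Return the stored response only when the input matches a key exactly."""
    # Normalise case and surrounding whitespace before the lookup
    key = user_input.strip().lower()
    return exact_responses.get(key, "Sorry, I did not understand that.")
```

Here "Hi" is answered, but "hi there" is not, because the whole input has to equal a key in the table.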
AI-Based Chatbots
With the rise in the use of machine learning in recent years, a new approach to building chatbots has emerged. Using artificial intelligence, it has become possible to create extremely intuitive and precise chatbots tailored to specific purposes.
Unlike their rule-based kin, AI-based chatbots are built on complex machine learning models that enable them to self-learn.
Now that we’re familiar with how chatbots work, we’ll be looking at the libraries that will be used to build our simple Rule-based Chatbot.
Natural Language Toolkit (NLTK)
Natural Language Toolkit is a Python library that makes it easy to process human language data. It provides easy-to-use interfaces to many language-based resources such as the Open Multilingual Wordnet, as well as access to a variety of text-processing libraries.
Regular Expression (RegEx) in Python
A regular expression is a special sequence of characters that helps you search for patterns (words, sentences or sequences of letters) in strings, using a specialized syntax. Regular expressions are widely used for text searching and matching in UNIX.
Python includes support for regular expressions through the re package.
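To give a feel for the re package before it is used in the chatbot, here is a small sketch of three of its standard functions. The sample sentence and variable names are made up for illustration.

```python
import re

text = "our branch opens at 9 and closes at 5"

# re.search scans the whole string and returns a match object, or None
match = re.search(r"\d", text)
first_digit = match.group() if match else None

# re.findall returns every non-overlapping match as a list of strings
all_digits = re.findall(r"\d", text)

# re.sub replaces every match with the given replacement string
masked = re.sub(r"\d", "#", text)
```

The chatbot built in this tutorial relies on `re.sub` to clean up synonyms and on `re.search` to look for keywords in the user input.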
Building a Chatbot
This very simple rule-based chatbot will work by searching for specific keywords in the inputs given by a user. The keywords will be used to understand what action the user wants to take (the user’s intent). Once the intent is identified, the bot will pick out a response appropriate to that intent.
The list of keywords the bot will be searching for and the dictionary of responses will be built up manually based on the specific use case for the chatbot.
We’ll be designing a very simple chatbot for a Bank. The bot will be able to respond to greetings (Hi, Hello etc.) and will be able to answer questions about the bank’s hours of operation.
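The keyword, intent and response flow described above can be sketched in a few lines before any libraries are involved. This is a simplified illustration, not the implementation built in the rest of the tutorial: the keyword sets are hard-coded assumptions and matching is done on whole words with plain set operations instead of regular expressions.

```python
# Hypothetical keyword table and responses, hard-coded for illustration
intent_keywords = {
    "greet": {"hi", "hello"},
    "timings": {"hours", "timings"},
}
intent_responses = {
    "greet": "Hello! How can I help you?",
    "timings": "We are open from 9AM to 5PM, Monday to Friday.",
    "fallback": "I don't quite understand. Could you repeat that?",
}

def reply(user_input):
    # Split the input into lowercase words, dropping common punctuation
    words = {w.strip(".,!?") for w in user_input.lower().split()}
    for intent, keywords in intent_keywords.items():
        # A word shared by the input and the keyword set identifies the intent
        if words & keywords:
            return intent_responses[intent]
    # No keyword was found, so the fallback response is used
    return intent_responses["fallback"]
```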
A flow of how the chatbot would process inputs is shown below.
We will be following the steps below to build our chatbot:
- Importing Dependencies
- Building the Keyword List
- Building a dictionary of Intents
- Defining a dictionary of responses
- Matching Intents and Generating Responses
Importing Dependencies
The first thing we’ll need to do is import the packages/libraries we’ll be using. re is the package that handles regular expressions in Python. We’ll also be using WordNet from NLTK. WordNet is a lexical database that defines semantic relationships between words. We’ll be using WordNet to build up a dictionary of synonyms for our keywords. This will help us expand our list of keywords without manually having to introduce every possible word a user could use.
# Importing modules
import re
from nltk.corpus import wordnet
# If the WordNet corpus is not installed yet, run nltk.download('wordnet') once
Building a List of Keywords
Once we have imported our libraries, we’ll need to build up a list of keywords that our chatbot will look for. This list can be as exhaustive as you want. The more keywords you have, the better your chatbot will perform.
As discussed previously, we’ll be using WordNet to build up a dictionary of synonyms for our keywords. For details about how WordNet is structured, visit their website.
Code:
# Building a list of Keywords
list_words = ['hello', 'timings']
list_syn = {}
for word in list_words:
    synonyms = []
    for syn in wordnet.synsets(word):
        for lem in syn.lemmas():
            # Remove any special characters from synonym strings
            lem_name = re.sub(r'[^a-zA-Z0-9 \n.]', ' ', lem.name())
            synonyms.append(lem_name)
    list_syn[word] = set(synonyms)
print(list_syn)
Output:
{'hello': {'hello', 'howdy', 'hi', 'hullo', 'how do you do'}, 'timings': {'time', 'clock', 'timing'}}
Here, we first defined a list of words, list_words, that we will be using as our keywords. We used WordNet to expand our initial list with synonyms of the keywords. This expanded collection of keywords is stored in list_syn.
New keywords can simply be added to list_words. The chatbot will automatically pull their synonyms and add them to the keywords dictionary. You can also edit list_syn directly if you want to add specific words or phrases that you know your users will use.
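Editing list_syn directly could look like the sketch below. The synonym sets are hard-coded from the output shown earlier so that the snippet runs without WordNet, and the added words and phrases are hypothetical examples.

```python
# Synonym sets as printed in the output above, hard-coded so that the
# snippet runs without WordNet
list_syn = {
    "hello": {"hello", "howdy", "hi", "hullo", "how do you do"},
    "timings": {"time", "clock", "timing"},
}

# Add domain-specific words and phrases (hypothetical examples);
# set.update ignores entries that are already present
list_syn["hello"].update({"hey", "good morning"})
list_syn["timings"].update({"hours", "opening hours", "timings"})
```

Because the values are sets, adding a word that WordNet already supplied does not create a duplicate.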
Building a Dictionary of Intents
Once our keywords list is complete, we need to build up a dictionary that matches our keywords to intents. We also need to reformat the keywords in a special syntax that makes them visible to Regular Expression’s search function.
Code:
# Building dictionary of Intents & Keywords
keywords = {}
keywords_dict = {}
# Defining a new key in the keywords dictionary
keywords['greet'] = []
# Populating the values in the keywords dictionary with synonyms of
# keywords formatted with RegEx metacharacters
for synonym in list(list_syn['hello']):
    keywords['greet'].append('.*\\b' + synonym + '\\b.*')
# Defining a new key in the keywords dictionary
keywords['timings'] = []
# Populating the values in the keywords dictionary with synonyms of
# keywords formatted with RegEx metacharacters
for synonym in list(list_syn['timings']):
    keywords['timings'].append('.*\\b' + synonym + '\\b.*')
for intent, keys in keywords.items():
    # Joining the values in the keywords dictionary with the OR (|) operator
    # and storing the compiled pattern in the keywords_dict dictionary
    keywords_dict[intent] = re.compile('|'.join(keys))
print(keywords_dict)
Output:
{'greet': re.compile('.*\\bhello\\b.*|.*\\bhowdy\\b.*|.*\\bhi\\b.*|.*\\bhullo\\b.*|.*\\bhow do you do\\b.*'), 'timings': re.compile('.*\\btime\\b.*|.*\\bclock\\b.*|.*\\btiming\\b.*')}
The updated and formatted dictionary is stored in keywords_dict. The intent is the key, and the compiled pattern built from that intent's keywords is the value. Because the keywords come from a set, the order in which they appear in the pattern can differ from run to run.
Let’s look at one key-value pair of the keywords_dict dictionary to understand the syntax of Regular Expression:
{'greet': re.compile('.*\\bhullo\\b.*|.*\\bhow do you do\\b.*|.*\\bhowdy\\b.*|.*\\bhello\\b.*|.*\\bhi\\b.*')}
Regular Expression uses specific patterns of special Meta-Characters to search for strings or sets of strings in an expression.
Since we need our chatbot to search for specific words in larger input strings we use the following sequences of meta-characters:
.*\\bhullo\\b.*
In this specific sequence, the keyword (hullo) is enclosed between two \b sequences. \b matches a word boundary, so the pattern only matches hullo as a whole word and not as part of a longer word. The backslash is doubled (\\b) in the code because the pattern is written as an ordinary Python string.
The sequence \bhullo\b is in turn enclosed between two period-star (.*) sequences. .* matches any run of characters, so the keyword can appear anywhere in the input string, with any text before or after it.
In the dictionary, multiple such sequences are separated by the OR (|) operator. This operator tells the search function to look for any of the listed keywords in the input string.
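The effect of the \b sequence is easiest to see by comparing a pattern that has it with one that does not. The sketch below uses the keyword hi and two made-up inputs. The patterns are written as raw strings (r"..."), which is equivalent to doubling the backslashes as the tutorial code does.

```python
import re

with_boundary = re.compile(r".*\bhi\b.*")
without_boundary = re.compile(r".*hi.*")

# "hi" appears as a whole word, so both patterns match
whole_word_a = bool(with_boundary.search("hi, i need some help"))
whole_word_b = bool(without_boundary.search("hi, i need some help"))

# "this" merely contains the letters "hi": only the pattern without \b matches
inside_word_a = bool(with_boundary.search("is this the bank"))
inside_word_b = bool(without_boundary.search("is this the bank"))

# re.search already scans the whole string, so the keyword is found
# even without the surrounding .* sequences
bare = bool(re.search(r"\bhi\b", "well, hi there"))
```

Without the word boundaries, a short keyword such as hi would trigger the greeting intent for any input containing words like "this" or "which".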
More details about Regular Expression and its syntax can be found here.
You can add as many key-value pairs to the dictionary as you want to increase the functionality of the chatbot.
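Adding a further intent could look like the following sketch. The `build_intent_pattern` helper and the "balance" intent are hypothetical and not part of the tutorial code. The helper follows the same pattern format but additionally passes each synonym through `re.escape`, an assumption made here so that synonyms containing RegEx metacharacters are matched literally.

```python
import re

def build_intent_pattern(synonyms):
    """Join synonyms into one compiled pattern, following the format used above."""
    # re.escape guards against synonyms that contain RegEx metacharacters;
    # sorted() keeps the order of the alternatives reproducible
    parts = [r".*\b" + re.escape(s) + r"\b.*" for s in sorted(synonyms)]
    return re.compile("|".join(parts))

keywords_dict = {
    "greet": build_intent_pattern({"hello", "hi", "howdy"}),
    # Hypothetical new intent for account balance questions
    "balance": build_intent_pattern({"balance", "statement"}),
}
```

Each new intent also needs a matching entry in the responses dictionary, otherwise the bot has nothing to say when that intent is selected.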
Defining Responses
The next step is defining responses for each intent type. This part is very straightforward. The responses are described in another dictionary with the intent being the key.
We’ve also added a fallback intent and its response. This is a fail-safe response in case the chatbot is unable to extract any relevant keywords from the user input.
Code:
# Building a dictionary of responses
responses = {
    'greet': 'Hello! How can I help you?',
    'timings': 'We are open from 9AM to 5PM, Monday to Friday. We are closed on weekends and public holidays.',
    'fallback': "I don't quite understand. Could you repeat that?",
}
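One possible way to apply the fallback is sketched below, using the dictionary's `get` method. The `pick_response` helper is a hypothetical name; the tutorial's own loop reaches the same result with an explicit `if` check.

```python
responses = {
    "greet": "Hello! How can I help you?",
    "timings": "We are open from 9AM to 5PM, Monday to Friday. "
               "We are closed on weekends and public holidays.",
    "fallback": "I don't quite understand. Could you repeat that?",
}

def pick_response(matched_intent):
    # dict.get returns the fallback text when the intent is None or unknown
    return responses.get(matched_intent, responses["fallback"])
```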
Matching Intents and Generating Responses
Now that we have the back-end of the chatbot completed, we’ll move on to taking an input from the user and searching the input string for our keywords.
We use the RegEx search function to search the user input for the keywords stored in the values of the keywords_dict dictionary. If you recall, the values in the keywords_dict dictionary were formatted with special sequences of meta-characters. RegEx’s search function uses those sequences to compare the patterns of characters in the keywords with patterns of characters in the input string.
If a match is found, the current intent gets selected and is used as the key to the responses dictionary to select the correct response.
Code:
print("Welcome to MyBank. How may I help you?")
# While loop to run the chatbot indefinitely
while True:
    # Takes the user input and converts all characters to lowercase
    user_input = input().lower()
    # Defining the Chatbot's exit condition
    if user_input == 'quit':
        print("Thank you for visiting.")
        break
    matched_intent = None
    for intent, pattern in keywords_dict.items():
        # Using the regular expression search function to look for keywords in user input
        if re.search(pattern, user_input):
            # If a keyword matches, select the corresponding intent from the keywords_dict dictionary
            matched_intent = intent
    # The fallback intent is selected by default
    key = 'fallback'
    if matched_intent in responses:
        # If a keyword matches, the fallback intent is replaced by the
        # matched intent as the key for the responses dictionary
        key = matched_intent
    # The chatbot prints the response that matches the selected intent
    print(responses[key])
Sample conversation with Bankbot:
The chatbot picked the greeting from the first user input (‘Hi’) and responded according to the matched intent. The same happened when it located the word ‘time’ in the second user input. The third user input (‘How can I open a bank account’) didn’t contain any keywords that are present in Bankbot’s database, and so the bot went to its fallback intent.
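The behaviour described above can also be checked without typing into the loop by wrapping the matching step in a function. The sketch below is an assumption-laden variant, not the tutorial's code: `get_response` is a hypothetical helper, the patterns are hard-coded with a subset of the synonyms, and the function returns on the first matching intent, whereas the loop above keeps the last one.

```python
import re

# Patterns hard-coded with a subset of the synonyms so the snippet is self-contained
keywords_dict = {
    "greet": re.compile(r".*\bhello\b.*|.*\bhi\b.*|.*\bhowdy\b.*"),
    "timings": re.compile(r".*\btime\b.*|.*\bclock\b.*|.*\btiming\b.*"),
}
responses = {
    "greet": "Hello! How can I help you?",
    "timings": "We are open from 9AM to 5PM, Monday to Friday.",
    "fallback": "I don't quite understand. Could you repeat that?",
}

def get_response(user_input):
    # Lowercase the input, as the chatbot loop does
    text = user_input.lower()
    for intent, pattern in keywords_dict.items():
        if pattern.search(text):
            # The first intent whose pattern matches is used
            return responses[intent]
    # No pattern matched, so fall back
    return responses["fallback"]
```

Calling `get_response` with a greeting, a question containing the word "time", and "How can I open a bank account" exercises the greeting, timings and fallback paths respectively.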
You can add as many keywords/phrases/sentences and intents as you want to make sure your chatbot is robust when talking to an actual human.
Conclusion
This blog was a hands-on introduction to building a very simple rule-based chatbot in Python. We only worked with two intents in this tutorial for simplicity. You can easily expand the functionality of this chatbot by adding more keywords, intents and responses.
As we saw, building a rule-based chatbot is a laborious process. In a business environment, a chatbot could be required to have a lot more intents depending on the tasks it is supposed to undertake.
In such a situation, rule-based chatbots become very impractical, as maintaining the rule base becomes extremely complex. In addition, the chatbot would be severely limited in its conversational capabilities, as it is nearly impossible to describe exactly how a user will interact with the bot.
AI-based Chatbots are a much more practical solution for real-world scenarios. In the next blog in the series, we’ll be looking at how to build a simple AI-based Chatbot in Python.
Do you want to learn more about machine learning and its applications? Check out Data Science Dojo’s online data science certificate program!