Bokeh-Scatter Plot Basics in Python
Bokeh is another data visualization library available in Python. One of Bokeh’s distinguishing features is built-in interactivity: plots are rendered in the browser and come with tools for panning, zooming, and more. In this post, we will learn how to make a basic scatterplot in Bokeh while also exploring some of the basic interactions that are provided by default.
Data Preparation
We are going to make a scatterplot using the “Duncan” data set that is available in the “pydataset” library. Below is the initial code.
from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show
The code above just imports what we need. We loaded “pydataset” because this is where our data will come from. All of the other imports come from “bokeh.” “figure” allows us to set up the plot and its axes. “output_file” sets the HTML file that the plot will be saved to. Lastly, “show” saves the plot and displays it in the browser. In the code below we will load our dataset, give it a name, and print the first few rows.
df = data('Duncan')
df.head()

In the code above we store the “Duncan” dataset in an object called “df” using the data() function. We then display the first few rows of the data using the .head() method. The “Duncan” data describes occupations using several variables, including the “education” and “income” columns that we will plot.
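Before plotting, it can help to confirm that the columns you need are present and complete. The sketch below is only an illustration: it builds a small stand-in DataFrame with made-up values and hypothetical row names instead of loading the real “Duncan” data, and the names “needed”, “missing_columns”, and “missing_values” are my own.

```python
import pandas as pd

# Hypothetical stand-in rows (made-up values); the real data comes from data('Duncan')
df = pd.DataFrame(
    {"education": [20, 35, 50, 65, 80], "income": [15, 30, 40, 60, 70]},
    index=["job_a", "job_b", "job_c", "job_d", "job_e"],
)

# The two columns the scatterplot will use
needed = ["education", "income"]

# Columns from the list above that are absent from the dataframe
missing_columns = [col for col in needed if col not in df.columns]

# Total count of missing values across the needed columns
missing_values = int(df[needed].isna().sum().sum())

# Summary statistics give a quick feel for the range of each axis
print(df[needed].describe())
```

If “missing_columns” is empty and “missing_values” is zero, the data is ready to plot.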
Making the Scatterplot
We will now make our scatterplot. We have to do this in three steps.
- Set up the figure and label the axes
- Add the data to the plot
- Create the output file and show the results
Below is the code that produces the plot.
# Create a new figure and label the axes
fig = figure(x_axis_label="education", y_axis_label="income")

# Add circle glyphs (the dots)
fig.circle(x=df["education"], y=df["income"])

# Produce the HTML file and display the plot
output_file(filename="my_first_plot.html")
show(fig)

At the top of the code, we set up the plot and its axis labels using the “figure” function. Here we are plotting education vs income and storing all of this in an object called “fig”. Next, we add the data to our plot using the “circle” method. To do this we select the two columns we want from the “df” dataframe. Note that the shapes used to draw data on a plot (here, the circles) are called “glyphs” in Bokeh. Lastly, we set the output file using the “output_file” function and display the results with “show”.
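The three steps above can also be sketched with a few optional extras. This is a minimal variation under some assumptions, not the post's original code: the data is a made-up stand-in for the “Duncan” columns, the file name “scatter_sketch.html” is arbitrary, “scatter” is used in place of “circle” (recent Bokeh releases appear to favor it for markers sized in screen pixels), and “save” is used instead of “show” so that no browser window is opened.

```python
import pandas as pd
from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import CDN

# Hypothetical stand-in for the "Duncan" columns used above (values are made up)
df = pd.DataFrame({
    "education": [20, 35, 50, 65, 80, 95],
    "income": [15, 30, 40, 60, 70, 85],
})

# Step 1: set up the figure, give it a title, and label the axes
fig = figure(
    title="Income vs. education",
    x_axis_label="education",
    y_axis_label="income",
)

# Step 2: add one marker glyph per row; size is in screen pixels
renderer = fig.scatter(x=df["education"], y=df["income"], size=8, alpha=0.6)

# Step 3: write the HTML file; show(fig) would also open it in a browser
path = save(fig, filename="scatter_sketch.html", resources=CDN, title="Scatter sketch")
print(path)
```

The “size” and “alpha” arguments are optional styling; leaving them out gives the default marker appearance.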
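To the right of your plot, there is also a toolbar with some interaction buttons.
Here is what they do from top to bottom.
- Bokeh logo: takes you to bokeh.org
- Pan: drag to move around the plot
- Box zoom: drag a rectangle to zoom into that region
- Wheel zoom: scroll to zoom in and out
- Save: downloads the plot as an image
- Reset: restores the original view
- Help: takes you to Bokeh's documentation about the plot tools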
There are other interactions possible, but these are the default ones when you make a plot.
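As a rough sketch of how the toolbar can be changed, the example below passes a “tools” string to “figure” and adds a hover tool that is not in the default set. The data points are made up for illustration, and the tooltip labels and the “tool_names” variable are my own choices.

```python
from bokeh.plotting import figure

# Choose the tools explicitly and add "hover", which is not a default tool
fig = figure(
    x_axis_label="education",
    y_axis_label="income",
    tools="pan,box_zoom,wheel_zoom,save,reset,hover",
    # "@x" and "@y" refer to the data columns created by the scatter call below
    tooltips=[("education", "@x"), ("income", "@y")],
    toolbar_location="above",  # move the toolbar from the right side to the top
)

# Made-up points for illustration only
fig.scatter(x=[20, 35, 50, 65, 80], y=[15, 30, 40, 60, 70], size=8)

# List the tools that ended up on the toolbar
tool_names = [type(tool).__name__ for tool in fig.toolbar.tools]
print(tool_names)
```

Hovering over a point in the resulting plot would then display its education and income values.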
Conclusion
Bokeh is one of many tools used in Python for data visualization. It is particularly useful when you want a plot that readers can explore in the browser, and the default interactive tools enhance the user experience with very little extra code.