Bokeh-Scatter Plot Basics in Python

This article was first published on python – educational research techniques , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

Bokeh is another data visualization library available in Python. One of Bokeh’s unique features is that it allows for interaction. In this post, we will learn how to make a basic scatterplot in Bokeh while also exploring some of the basic interactions that are provided by default.

Data Preparation

We are going to make a scatterplot using the “Duncan” data set that is available in the “pydataset” library. Below is the initial code.

from pydataset import data
from bokeh.plotting import figure
from bokeh.io import output_file, show

The code above is just the needed libraries. We loaded “pydataset” because this is where our data will come from. All of the other libraries are related to “bokeh.” “Figure” allows us to set up our axes for the scatterplot. “Output_file” allows us to create the file of our plot. Lastly, “show” allows us to show the plot of our visualization. In the code below we will load our dataset, give it a name, and print the first few rows.

df=data('Duncan')
df.head()
ad

In the code above we store the “Duncan” dataset in an object called “df” using the data() function. We then display a snippet of the data using the .head() function. The “Duncan” data shares information on jobs as defined by several variables. We will now proceed

Making the Scatterplot

We will now make our scatterplot. We have to do this in three steps.

  1. Make the axis
  2. Add the data to the plot
  3. Create the output file and show the results

Below is the code with the output

# Create a new figure
fig = figure(x_axis_label="education", y_axis_label="income") #labels axises

# Add circle glyphs
fig.circle(x=df["education"], y=df["income"]) #adds the dots

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html")
show(fig)

At the top of the code, we create our axis information using the “figure” function. Here we are plotting education vs income and storing all of this in an object called “fig”. Next, we insert the data into our plot using the “circle” function. To insert the data we also have to subset the “df” dataframe for the variables that we want. Note that the data added to a plot are called “glyphs” in Bokeh. Lastly, we create an output file using a function with the same name and show the results.

To the right of your plot, there are also some interaction buttons as shown below

Here is what they do from top to bottom.

  1. Takes you to bokeh.org
  2. Pan the image
  3. Box zoom
  4. Wheel zoom
  5. Download image
  6. Resets image
  7. It takes you to information about the bokeh function

There are other interactions possible but these are the default ones when you make a plot.

Conclusion

Bokeh is one of many tools used in Python for data visualization. It is a powerful tool that can be used in certain contexts. The interactive tools can also enhance the user experience.

To leave a comment for the author, please follow the link and comment on their blog: python – educational research techniques .

Want to share your content on python-bloggers? click here.