Python-bloggers

Make a Bar Graph with Bokeh in Python

This article was first published on python – educational research techniques , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

Bokeh is a data visualization library available in Python with the unique ability of interaction. In this video, we will look at how to make a basic bar graph using bokeh.

To begin we need to load certain libraries as shown below.

from pydataset import data
import pandas as pd
from bokeh.plotting import figure
from bokeh.io import output_file, show

In the code above, we load the “pydataset” library to gain access to the data we will use. Next, we load “pandas” which will help us with some data preparation. The last two libraries are related to “bokeh.” The “figure” function will be used to set the actual plot, the “output_file” function will allow us to save our plot as an HTML file and the “show” function will allow us to display our plot.

Data Preparation

We need to do two things to be ready to create our bar graph. First, we need to load the data. Second, we need to calculate group means for the bar graph. Below is the code for the first step followed by the output.

df=data('Duncan')
df.head()

In the code above we use the “data” function to load the “Duncan” dataset into an object called “df”. Next, we display the output of this. The “Duncan” dataset contains data on different jobs, the type of job, income, education, and prestige. We want to graph prestige and job type as a bar graph which will require us to calculate the mean of prestige by type. The code for this is below.

# Calculate group means of prestige
positions = df.groupby('type', as_index=False)['prestige'].mean()
positions

In the code above we use the “groupby” function on the “df” object. Inside the function, we indicate we want to group by “type”. The “as_index” argument is set to false so that the “type” column is not set at the index or you can say as the row numbers. Next, we subset the data using square brackets to only include the “prestige” column. Lastly, we indicate that we want to calculate the “mean”. The result is that there are three job types and we have the mean for each job’s prestige. The job types and means from this table above are what we will use for making our visualization.

Bar Graph

We are now ready to make our bar graph. Below is the code followed by the output.

# Instantiate figure
fig = figure(x_axis_label="positions", y_axis_label="Prestige", x_range=positions["type"]) 

# Add bars
fig.vbar(x=positions["type"], top=positions["prestige"],width=0.9)

# Produce the html file and display the plot
output_file(filename="Prestige.html")
show(fig)

Here are the steps.

  1. We began by creating the “fig” object. We labeled are x and y axes and also indicated the range of the x values which means determining the categories of our data. For our purposes, this was the unique job type in the “types” column.
  2. Next, we use the “vbar” function to make our bar graph. The x values were set to the “type” column from the “positions” object. The y or “top” values were set to the means of “prestige” from the “positions” object. The “width” argument was set to 0.9 to ensure there was a little whitespace between the bars.
  3. The “output_file” creates a saved plot and the “show” function displays the bar graph.

Conclusion

Bokeh has lots of cool tools available for the data analyst. This post was focused on bar graphs but this is only the most basic information that has been shared here. There is much more possible with this library.

To leave a comment for the author, please follow the link and comment on their blog: python – educational research techniques .

Want to share your content on python-bloggers? click here.
Exit mobile version