Bokeh-Manipulating Glyph Color

Dr. Darrin

4 months ago

This article was first published on python – educational research techniques , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

In this post, we will examine how to manipulate the color of the glyphs in a Bokeh data visualization. We are doing this not necessarily for aesthetic reasons but to convey additional information. Below are the initial libraries that we need. We will load additional libraries as required.

from pydataset import data
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.io import output_file, show

pydatasets will be used to load the data we need. The next two lines create the axes we will need and the objects for storing our data. The last line includes functions for saving and displaying our visualization.

Data Preparation

For this example, data preparation is simple. We will load the dataset “Duncan” using the data() function in an object called “df”. This dataset includes data about various occupations as measured on several variables. We will then display the data using the .head() method. Below is the code and output.

df=data('Duncan')
df.head()

Color Glyphs

In the example below, we will color the glyphs based on one of the variables. We will graph education vs income and color code the glyphs based on income. Below is the code followed by the output and finally the explanation.

from bokeh.transform import linear_cmap
from bokeh.palettes import RdBu8

source = ColumnDataSource(data=df)

# Create mapper
mapper = linear_cmap(field_name="income", palette=RdBu8, low=min(df["income"]), high=max(df["income"]))

# Create the figure
fig = figure(x_axis_label="Education", y_axis_label="Income", title="Education vs. Income")

# Add circle glyphs
fig.circle(x="education", y="income", source=source, color=mapper,size=16)

output_file(filename="education_vs_income.html")
show(fig)

Here is what we did

We had to load additional libraries. liner_cmap() will be used to create the actual coloring of the glyphs. RdBu8 is the color choice for the glyphs.
We then create the source of our data using the ColumSourcData() function
We create our mapper function using the linear_map() function. The arguments inside the function are the variable we are using (income) and the low and high values for the variable.
Next, we create our figure. We label our x and y axis based on the variables we are using and set the title.
We use the .circle() method to create our glyphs. Notice how we set the color argument to our “mapper” object.
The last two lines of code are for creating our output and showing it.

You could set the glyph color to a third variable, which would allow you to express a third variable in a two-dimensional space. For example, we could have used the “prestige” variable for the coloring of the glyphs rather than income, as income was already represented on the y-axis.

Adding a Color Bar

Adding a color bar will help to explain to a reader of our visualization what the color of the glyphs means. The code below is mostly the same and is followed by the output and lastly the explanation.

from bokeh.models import ColorBar
source = ColumnDataSource(data=df)

# Create mapper
mapper = linear_cmap(field_name="income", palette=RdBu8, low=min(df["income"]), high=max(df["income"]))

# Create the figure
fig = figure(x_axis_label="Education", y_axis_label="Income", title="Education vs. Income")
fig.circle(x="education", y="income", source=source, color=mapper,size=16)

# Create the color_bar
color_bar = ColorBar(color_mapper=mapper['transform'], width=8)

# Update layout with color_bar on the right
fig.add_layout(color_bar, "right")
output_file(filename="Education_vs_prestige_color_mapped.html")
show(fig)

Here is what happened.

We loaded a new function called ColorBar().
We create our source data (same as the previous example)
We create our mapper (same as the previous example)
We create our figure and glyphs (same as the previous example)
Next, we create our color bar using the ColorBar() function. Inside this function, we set the color_mapper argument to a transformed version of the mapper object we already created. We also can set the width of the color bar using the width argument. Everything we have done in this step is saved in an object called “color_bar”
We then use the .fig_layout() method on our “fig” object and place the object “color_bar” inside it along with the phrase “right” which tells Python to place the color bar on the right-hand side of the scatterplot.

There is one more example of glyph manipulation below.

Color by Category

In this last example, we will map an additional categorical variable onto our plot using color. Below is the code, output, and explanation.

# Import modules
from bokeh.transform import factor_cmap
from bokeh.palettes import Category10_5
source = ColumnDataSource(data=df)

# Create positions
TOOLTIPS=[('Education', '@education'), ('Prestige', '@prestige')]
positions = ["wc","bc","prof"]
fig = figure(x_axis_label="Education", y_axis_label="Prestige", title="Education vs Prestige", tooltips=TOOLTIPS)

# Add circle glyphs
fig.circle(x="education", y="prestige", source=source, legend_field="type", 
           size=16,fill_color=factor_cmap("type", palette=Category10_5, factors=positions))

output_file(filename="Education_vs_prestige_by_type.html")
show(fig)

We need to load some additional libraries. factpor_cmap() will be used to color the glyphs based on the categorical variable “types. Category10_5 is the color palette.
We create an object called “source” for our data using the ColumnSourceData() function.
We create an object called “TOOLTIPS”. This is an object that will be used to display the individual data points of a glyph when the mouse hovers over it in the visualization
We create an object called “positions” which is a list of all of the job types we want to match to different colors in our plot.
We create an object called “fig” which uses the figure() function to create the x and y axes and the title of the plot. Inside this function, we also set the tooltips argument equal to the “TOOLTIPS” object we created previously
Next, we use the .circle() method on our “fig” object. Most of the arguments are self-explanatory but notice how the argument “fill_color” is set to the function factor_cmap(). Inside this function we indicate the variable “type” as the variable to use, set the palette, and set the factors to the “position” object that we made earlier.
The last two lines save the output and display it.

Conclusion

Bokeh allows you to do many cool things when creating a visualization for data purposes. This post was focused on how to manipulate the glyphs in a scatterplot. However, there is so much more that can be done beyond what was shared here.

To leave a comment for the author, please follow the link and comment on their blog: python – educational research techniques .

Want to share your content on python-bloggers? click here.