ggplot2 in Python Using Plotnine

[This article was first published on Python – Predictive Hacks, and kindly contributed to python-bloggers]. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

If you are familiar with ggplot2 in R, you know that it offers one of the most consistently structured ways to build plots. In this post we show how to create plots in Python with ggplot2 syntax, using the plotnine library.

Installation

# Using pip
$ pip install plotnine
      
# Or using conda
$ conda install -c conda-forge plotnine

Firstly, let’s import the libraries and create our dummy data.

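import random

import numpy as np
import pandas as pd
import plotnine as p9

# 300 random integers from 1 to 9 (the upper bound of randint is exclusive)
data = np.random.randint(1, 10, size=300)
df = pd.DataFrame(data, columns=['variable'])
# A random category label for each row
df['category'] = random.choices(['A', 'B', 'C'], k=300)
# 300 distinct integers drawn from 10 to 999
df['variable2'] = random.sample(range(10, 1000), 300)
# variable2 scaled by a random factor in [0, 1)
df['variable3'] = df['variable2'].apply(lambda x: x * random.random())

df.head()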
   variable category  variable2   variable3
0         3        A        747  356.282975
1         6        A        837  432.941801
2         2        A        941  195.533003
3         4        A        679  131.990057
4         7        A        912  696.910478
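
Because the data above is generated without a seed, the numbers change on every run. The following is a minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator; the helper name `make_dummy_data`, the seed value, and the `df_seeded` variable are hypothetical and not part of the original post.

```python
import numpy as np
import pandas as pd


def make_dummy_data(n=300, seed=42):
    """Build a frame shaped like the article's df, but reproducibly (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    out = pd.DataFrame({
        # integers from 1 to 9, like np.random.randint(1, 10)
        "variable": rng.integers(1, 10, size=n),
        # one random label per row
        "category": rng.choice(["A", "B", "C"], size=n),
        # n distinct integers from 10 to 999, like random.sample
        "variable2": rng.choice(np.arange(10, 1000), size=n, replace=False),
    })
    # scale variable2 by a random factor in [0, 1)
    out["variable3"] = out["variable2"] * rng.random(n)
    return out


df_seeded = make_dummy_data()
print(df_seeded.head())
```

Calling `make_dummy_data()` twice with the same seed should give identical frames, which makes the plots easier to compare between runs.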

Now, let’s create some basic plots using plotnine.

Histogram

p9.ggplot(df) + p9.aes(x='variable') + p9.geom_histogram(binwidth=2)

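As you can see, it’s almost identical to ggplot2 in R: components are combined with +, and the main visible difference is that column names are passed as quoted strings. Let’s see some other basic examples.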

Density Plot

p9.ggplot(df) + p9.aes(x='variable') + p9.geom_density(fill='darkgrey')


Boxplot

p9.ggplot(df) + p9.aes(y='variable', x='category') + p9.geom_boxplot() + p9.coord_flip()

Barchart

p9.ggplot(df) + p9.aes(x='category') + p9.geom_bar()
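
With only an x aesthetic, geom_bar counts the rows in each category by default, so the bar heights can be cross-checked in pandas. A minimal sketch, using a small made-up frame (`toy` and `counts` are illustrative names, not from the original post):

```python
import pandas as pd

# Small illustrative frame (hypothetical data)
toy = pd.DataFrame({"category": ["A", "B", "A", "C", "B", "A"]})

# Rows per category: the heights a count-based bar chart would show
counts = toy["category"].value_counts().sort_index()
print(counts.to_dict())  # {'A': 3, 'B': 2, 'C': 1}
```

Running the same `value_counts()` call on `df['category']` gives the numbers behind the bar chart above.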


Scatterplot

p9.ggplot(df) + p9.aes(y='variable3', x='variable2') + p9.geom_point(size=4)

Mapping color to the category column colors the points by group:

p9.ggplot(df) + p9.aes(y='variable3', x='variable2', color='category') + p9.geom_point(size=4)


Violin Plot

p9.ggplot(df) + p9.aes(y='variable2', x='category', fill='category') + p9.geom_violin(scale='width')


As you can see, the syntax is almost identical to ggplot2 in R. Be sure to check out dplyr pipes in Python.

To leave a comment for the author, please follow the link and comment on their blog: Python – Predictive Hacks.
