How to create interactive scatterplots with Python in Excel

This article was first published on python - Stringfest Analytics , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

In an earlier post, I discussed adding conditional formatting to an Excel chart. One key benefit of using Python in Excel for this—rather than relying on standard Excel charting—is the flexibility it offers. With Python, it’s simpler to adjust the chart when adding new variables or expanding categories:

Let’s take it up a notch. In this post, we’ll make the scatterplot fully dynamic, allowing users to select the x, y, and categorical hue variables from dropdown menus. The chart will automatically update based on these selections.

Follow along using the exercise file provided below:

 

For this example, we’ll use the popular Palmer penguins dataset. To start, we need to identify which variables are quantitative and which are categorical, as this will determine how each is used in the plot. Luckily, we can use a bit of Pandas magic here!

Each column in a DataFrame is assigned a datatype, and Pandas is generally quite reliable at interpreting the correct type for each variable.

To begin, I’ll add some cells in a helper worksheet. This is where we’ll organize the options for quantitative and categorical variable selections.

Next, I’ll import the dataset into Python in Excel and create a variable to identify all the quantitative variables in the dataset. This will help us separate them for easy selection and use in the scatterplot.

I’ll do the same with the categorical variables:

At this point, you should see something like the following in the workbook: dynamic arrays displaying all the quantitative and categorical variables from the dataset. This result will make it easy to view and select the variables as we proceed with the scatterplot.

Data setup

We can actually use this dynamic array as an input for data validation, thanks to the spill operator, #. If this notation is new to you, check out my blog post on the topic:

Now, set up a spot to build the chart—either to the right of the dataset or in a new worksheet. Then, from the ribbon go to Data > Data Validation. Select List as the validation criteria and refer to the list of quantitative variables using the spill operator (#). This allows us to use the dynamic array directly, so whenever the list updates, the dropdown options will too.

Data validation x axis

Repeat the data validation setup for the y-axis, using the same quantitative variables list as the input, and then do the same for the category, using the category range for these options.

Finally, let’s add a dynamic chart title. Unfortunately, Seaborn doesn’t add one automatically, so we’ll have to set it ourselves. Now, we have all the inputs needed to build this chart! The user can define what goes along each axis and select the category to color by, making for a fully customizable visualization:

Concatenate dynamic title

Now it’s time to create the scatterplot, and I’ll be using Seaborn for this. Since these inputs are only intended for this scatterplot, I’ll place them directly in the function arguments. However, if there’s a chance we might reuse these inputs elsewhere, it’s a good idea to assign them to their own variables first and then reference those variables in the plot. This keeps things modular and makes future adjustments easier!

Awesome! With everything set up, you should now see a scatterplot that updates dynamically like below. The user can select their own x-axis, y-axis, and hue categories, and the plot adjusts automatically—no manual tweaking needed. This shows just how scalable data visualization with Python in Excel can be.

Finished product dynamic scatterplot!

This approach can easily be adapted to build other user-driven interactive visualizations. For example, you could create a bar chart where users can not only select the category variable to plot but also choose how to aggregate the results—like displaying sums, averages, and so forth.

What questions do you have about using these techniques to create interactive data visualizations with Seaborn? Let me know in the comments.

The post How to create interactive scatterplots with Python in Excel first appeared on Stringfest Analytics.

To leave a comment for the author, please follow the link and comment on their blog: python - Stringfest Analytics .

Want to share your content on python-bloggers? click here.