How to navigate common pitfalls with using Python in Excel

This article was first published on python - Stringfest Analytics , and kindly contributed to python-bloggers. (You can report issue about the content on this page here)
Want to share your content on python-bloggers? click here.

In an earlier post, I discussed typical challenges Excel users face when transitioning to Python:

This article will explore the unique “gotchas” that arise specifically when integrating Python in Excel.

Importing packages

Python packages are crucial for streamlined and clear data analysis. Many standard data-related Python packages come pre-installed in Excel, eliminating the need for manual loading.

For details on Python’s initialization in Excel, its functioning, and the available packages, read this post:

Importing datasets

Python workflows usually start with importing datasets from various sources. That’s because Python is neither a database nor a spreadsheet and isn’t designed for data storage.

Currently, however, Python within Excel lacks the capability to import external data directly. Microsoft recommends using Power Query to import data and then interfacing with Python. You can learn more about this workflow here.

Read order of Python code cells

In Excel, the traditional formula calculations follow a dependency order rather than a strict directional order like left-to-right or top-to-bottom. Excel recalculates cells according to which cells depend on which, meaning if a cell value depends on the values of other cells, those other cells are calculated first. This can create a non-linear calculation path across the spreadsheet.

The execution order of Python cells in Excel, on the other hand, is always left-to-right, top-to-bottom, somewhat similar to how Python would execute commands in sequence in a regular Python environment. Within each cell, all code is executed top-to-bottom, just like any typical Python script.

Converting between Python objects and Excel values

Switching from Python object previews to Excel value ranges can also be confusing.

Take, for example, this simple task of wanting to sample 20 rows at random from the mpg DataFrame. You can download this example workbook here.

I’ll run the code to load this table into a DataFrame, then sample the rows. Rather than seeing 20 rows of data, however, I see the name of the resulting Python object, in this case DataFrame:

Creating a DataFrame object in Excel

The output, displaying the data type, is useful for diagnostics and further actions. The stacked square icon indicates a linked data type, allowing for easy access to additional attributes of the object with a click.

Getting a preview is helpful, but often you need the full results. For instance, if your code samples 20 rows, you’ll want to see all these rows, not just a preview, as available via this method:

To convert a Python object into a range of Excel values, use the shortcut Ctrl + Alt + Shift + M.

First, remove the previously previewed output, then apply the shortcut over the cell containing the DataFrame object:

This will convert the Python object to a range of Excel cells, then back again.

Improving calculation speed

As you add more Python cells to your workbook, you may notice slower calculation times. To speed this up, switch to Partial calculation. Go to the Formulas tab on the home ribbon, find the Calculations group, and choose Calculation Options > Partial.

Partial calculation mode

The Partial calculation setting switches Python cells and Data Tables to manual mode, preventing dependent formulas from recalculating with every cell change.

This setting accelerates workbook calculations by reducing recalculation frequency but risks leaving dependent calculations outdated.

To mitigate this, Excel allows you to visually mark “stale” values as struck through. This option is located under Calculation Options, where Partial calculation mode is set:

Format stale values

Resizing plots

Setting up data visualizations in Excel using Python requires some preparation. For instance, when creating a strip plot with seaborn to examine vehicle mileage distribution by origin, the initial output is, not surprisingly, the object type Image:

To display this seaborn plot in Excel, you can reveal the linked image data attribute or convert it to Excel data using Ctrl + Alt + Shift + M.

However, the default size may be too small, confined to the creation cell. For better visibility, right-click on the plot and select Picture in Cell > Place over Cells to adjust its size:

Place over cells

From here, you are free to resize the plot as desired.

Conclusion: More integration, more pitfalls

The integration of Python into Excel is a significant step towards broader Python use among Excel users, smoothing the interaction between the two languages. However, this seamless integration requires underlying infrastructure which is often confusing.

What challenges have you faced using Python with Excel? Let me know in the comments.

The post How to navigate common pitfalls with using Python in Excel first appeared on Stringfest Analytics.

To leave a comment for the author, please follow the link and comment on their blog: python - Stringfest Analytics .

Want to share your content on python-bloggers? click here.