Python in Excel: How to do inferential statistics with Copilot
As a data analyst, spotting intriguing patterns and insights in data is a thrilling part of your role. But how do you determine whether those patterns actually mean something or are simply random noise? That’s exactly where inferential statistics comes into play. These techniques help you judge whether the relationships you observe in your sample reflect something real in the broader population… or whether you’re just seeing things.
Traditionally, Excel analysts leaned heavily on the built-in Analysis ToolPak for inferential stats. While this ToolPak had its merits, it was notoriously limited. Analysts had access to a relatively narrow set of basic tests, rigid and sometimes confusing outputs, and minimal flexibility in how they explored their data. These limitations often forced analysts into cumbersome, manual calculations, distracting them from the valuable work of interpretation and strategic insight.
Fortunately, Excel’s recent integration with Python through Advanced Analysis and Copilot changes the game dramatically. You now have powerful inferential methods available directly in Excel, letting you spend less time wrestling with the numbers and more time uncovering the story behind your data.
Let’s see it in action using the Windsor housing prices dataset, which you can download below:
If you’ve never used Advanced Analysis with Copilot, you can learn how to set it up and use it in the post below.
Independent samples t-test
Let’s start our analysis with a fundamental inferential statistics technique: the independent samples t-test. We’ll use this to check whether there’s a meaningful difference in house prices between homes located in preferred versus non-preferred neighborhoods. This comparison matters to data analysts because identifying factors that significantly influence property values helps stakeholders make informed decisions about pricing, investment, and resource allocation:
Perform an independent samples t-test comparing house prices between homes in preferred neighborhoods (prefarea = ‘yes’) and homes outside preferred neighborhoods (prefarea = ‘no’).

Copilot conveniently gives us the t-statistic and p-value, along with an interpretation about statistical significance. But significance alone doesn’t tell us whether the difference matters in practice. That’s where effect size comes in.
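For readers who want to see what such a test looks like in code, here is a minimal sketch using SciPy. It is not the code Copilot generated, and the prices are made-up illustrative values rather than the Windsor data. Only the column names `price` and `prefarea` come from the prompt above. The choice of Welch's variant (`equal_var=False`) is also an assumption.

```python
import pandas as pd
from scipy import stats

# Made-up illustrative prices (NOT the Windsor data).
df = pd.DataFrame({
    "prefarea": ["yes"] * 8 + ["no"] * 10,
    "price": [78_000, 92_000, 85_000, 101_000, 88_000, 95_000, 110_000, 83_000,
              62_000, 70_000, 58_000, 75_000, 66_000, 81_000, 60_000, 72_000,
              68_000, 64_000],
})

pref = df.loc[df["prefarea"] == "yes", "price"]
non_pref = df.loc[df["prefarea"] == "no", "price"]

# Welch's t-test: does not assume the two groups share the same variance.
t_stat, p_value = stats.ttest_ind(pref, non_pref, equal_var=False)

print(f"Mean (preferred):     {pref.mean():,.0f}")
print(f"Mean (non-preferred): {non_pref.mean():,.0f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
```

In Python in Excel you would typically read the worksheet range into a DataFrame instead of typing the values in by hand.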
Effect size measures the practical importance or magnitude of the difference, helping analysts determine if the observed difference in house prices between preferred and non-preferred neighborhoods is meaningful enough to influence real-world decisions. Let’s ask Copilot for help here:
Calculate the effect size for the t-test comparing prices in preferred versus non-preferred neighborhoods, and explain whether this difference is practically meaningful for potential home buyers or real estate investors.

Copilot calculated Cohen’s d as approximately 0.82, indicating a large, practically meaningful difference in house prices between preferred and non-preferred neighborhoods. Cohen’s d helps analysts quickly understand the real-world importance of differences identified through statistical tests. However, it doesn’t show uncertainty about this estimate. Confidence intervals could offer clearer insight by providing a likely range for the true price difference, helping analysts and stakeholders better assess risk. We could easily prompt Copilot to calculate these intervals next, continuing to refine our analysis.
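As a rough sketch of both ideas, the snippet below computes Cohen's d from the pooled standard deviation and a 95% confidence interval for the mean difference. The prices are made-up illustrative values, so the result will not match the 0.82 reported above. The pooled-SD formula and the Welch-style interval are common choices, not necessarily what Copilot uses.

```python
import numpy as np
from scipy import stats

# Made-up illustrative prices (NOT the Windsor data).
pref = np.array([78_000, 92_000, 85_000, 101_000, 88_000,
                 95_000, 110_000, 83_000], dtype=float)
non_pref = np.array([62_000, 70_000, 58_000, 75_000, 66_000,
                     81_000, 60_000, 72_000, 68_000, 64_000], dtype=float)

n1, n2 = len(pref), len(non_pref)
v1, v2 = pref.var(ddof=1), non_pref.var(ddof=1)  # sample variances
mean_diff = pref.mean() - non_pref.mean()

# Cohen's d: mean difference divided by the pooled standard deviation.
pooled_sd = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
cohens_d = mean_diff / pooled_sd

# 95% CI for the mean difference, using the Welch standard error and
# Welch-Satterthwaite degrees of freedom.
se = np.sqrt(v1 / n1 + v2 / n2)
dof = se**4 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
t_crit = stats.t.ppf(0.975, dof)
ci_low, ci_high = mean_diff - t_crit * se, mean_diff + t_crit * se

print(f"Cohen's d = {cohens_d:.2f}")
print(f"Mean difference = {mean_diff:,.0f} (95% CI {ci_low:,.0f} to {ci_high:,.0f})")
```

A common rule of thumb reads d near 0.2 as small, 0.5 as medium, and 0.8 or more as large, which is why 0.82 counts as a large effect.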
Effectively using Copilot and Python for this kind of inferential analysis and the following examples also depends on your ability to ask the right questions. You need a strong intuition about what matters to your stakeholders. Is the observed difference in prices enough to shift marketing strategies or investment decisions? Only when you deeply understand your audience, their goals, and their decision-making criteria can you confidently interpret the results and determine if you’re truly on the right track.
Analysis of Variance (ANOVA)
Next, we’ll use Copilot to perform an ANOVA test, or Analysis of Variance. This statistical method lets us compare average values across multiple groups simultaneously:
Run an ANOVA test to determine if there’s a significant difference in average house prices based on the number of stories.

Copilot returned a very small p-value (approximately 2.12e-24), indicating a highly significant difference in price across different story categories. But ANOVA alone doesn’t tell us specifically where these differences lie, which is why we perform a post-hoc test.
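A one-way ANOVA of this kind can be sketched with SciPy's `f_oneway`. The prices are again made-up illustrative values, and the column name `stories` is an assumption based on the prompt.

```python
import pandas as pd
from scipy import stats

# Made-up illustrative prices for homes with 1 to 4 stories (NOT the Windsor data).
df = pd.DataFrame({
    "stories": [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5,
    "price": [52_000, 58_000, 61_000, 55_000, 64_000,
              66_000, 72_000, 69_000, 75_000, 63_000,
              78_000, 84_000, 73_000, 88_000, 81_000,
              97_000, 105_000, 92_000, 110_000, 99_000],
})

# One array of prices per story count.
groups = [g["price"].to_numpy() for _, g in df.groupby("stories")]

# One-way ANOVA: tests whether all group means are equal.
f_stat, p_value = stats.f_oneway(*groups)

print(df.groupby("stories")["price"].mean())
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")
```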
Conduct post-hoc tests following the ANOVA to pinpoint exactly which categories of house stories differ significantly in price. Provide clear explanations of which story-level differences matter most for real estate marketing strategies.

Running a Tukey post-hoc test allowed Copilot to pinpoint exactly which story categories differ significantly. All pairs showed significant differences, but the most substantial price differences involved homes with four stories. For example, comparing one-story and four-story homes revealed an average price difference of about $43,400.
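The pairwise comparisons can be sketched with `scipy.stats.tukey_hsd`, which assumes SciPy 1.8 or later. Copilot may use a different library, such as statsmodels. The prices are made-up illustrative values, so the differences will not match the figures above.

```python
from itertools import combinations

import pandas as pd
from scipy import stats

# Made-up illustrative prices for homes with 1 to 4 stories (NOT the Windsor data).
df = pd.DataFrame({
    "stories": [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5,
    "price": [52_000, 58_000, 61_000, 55_000, 64_000,
              66_000, 72_000, 69_000, 75_000, 63_000,
              78_000, 84_000, 73_000, 88_000, 81_000,
              97_000, 105_000, 92_000, 110_000, 99_000],
})

labels = sorted(df["stories"].unique())
samples = [df.loc[df["stories"] == s, "price"].to_numpy() for s in labels]

# Tukey's HSD compares every pair of groups while controlling the
# family-wise error rate.
result = stats.tukey_hsd(*samples)

for i, j in combinations(range(len(labels)), 2):
    diff = samples[j].mean() - samples[i].mean()  # higher story count minus lower
    print(f"{labels[j]} vs {labels[i]} stories: "
          f"diff = {diff:,.0f}, adjusted p = {result.pvalue[i, j]:.4f}")
```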
For data analysts working in real estate, this kind of detailed insight is crucial. It helps marketing teams understand exactly which property attributes (such as multi-story homes) can command higher market prices and strategically position these features in promotional efforts.
Chi-square test of independence
Next, let’s conduct a Chi-square test using Copilot. This test allows analysts to determine if two categorical variables (like “yes” or “no” features) are related or independent.
Perform a Chi-square test to see if having central air conditioning (‘airco’) is related to being located in a preferred neighborhood (‘prefarea’).

In this case, the test resulted in a small p-value (0.0095), indicating a statistically significant relationship between having central air conditioning and being in a preferred neighborhood.
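A test like this boils down to a contingency table plus `chi2_contingency`, as in the sketch below. The counts are made up for illustration and are not the Windsor data. Only the column names `airco` and `prefarea` come from the prompt.

```python
import pandas as pd
from scipy import stats

# Made-up illustrative counts (NOT the Windsor data), expanded into one row per home.
rows = (
    [("yes", "yes")] * 25 +   # preferred area, has central air
    [("yes", "no")] * 15 +    # preferred area, no central air
    [("no", "yes")] * 25 +    # non-preferred area, has central air
    [("no", "no")] * 55       # non-preferred area, no central air
)
df = pd.DataFrame(rows, columns=["prefarea", "airco"])

# Contingency table of counts: airco (rows) by prefarea (columns).
table = pd.crosstab(df["airco"], df["prefarea"])

# For 2x2 tables SciPy applies Yates' continuity correction by default.
chi2, p_value, dof, expected = stats.chi2_contingency(table)

print(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```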
However, a Chi-square test only shows if there’s an association, not how strong or practically meaningful it is. To better understand this relationship, we can request additional analysis from Copilot.
Calculate Cramér’s V or Phi coefficient from the Chi-square test examining the association between central air conditioning and preferred neighborhoods. Interpret what this means for marketing campaigns targeting these neighborhoods.

In this case, Copilot calculated Cramér’s V, which came out to approximately 0.11, indicating a small effect. From a marketing perspective, while homes in preferred neighborhoods might be slightly more likely to have central air, the association is probably too weak to emphasize heavily in campaigns. Analysts might instead focus on other attributes with stronger appeal.
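Cramér's V can be derived directly from the chi-square statistic, as this sketch shows. The counts are made up for illustration, so the value will not match the 0.11 above. Computing V from the uncorrected statistic (`correction=False`) is a common convention, but it is an assumption about how Copilot does it.

```python
import numpy as np
from scipy import stats

# Made-up illustrative counts (NOT the Windsor data).
# Rows: airco no / yes. Columns: prefarea no / yes.
table = np.array([[55, 15],
                  [25, 25]])

# Uncorrected chi-square statistic, the usual basis for Cramér's V.
chi2, p_value, dof, expected = stats.chi2_contingency(table, correction=False)

n = table.sum()
min_dim = min(table.shape) - 1

# Cramér's V = sqrt(chi2 / (n * (min(rows, cols) - 1))); ranges from 0 to 1.
cramers_v = np.sqrt(chi2 / (n * min_dim))

print(f"chi2 = {chi2:.3f}, Cramér's V = {cramers_v:.3f}")
```

For a 2x2 table like this one, Cramér's V equals the absolute value of the Phi coefficient, so either measure leads to the same reading.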
Conclusion
Inferential statistics allow analysts to confidently determine if the patterns they see in data are meaningful or just noise.
With Excel’s new integration of Python and Copilot, performing robust statistical analyses, previously cumbersome or limited, has become much more accessible. Analysts can now spend less time on tedious calculations and more on practical insights and strategic decision-making.
What questions do you have about this Copilot and Python use case, or what other scenarios would you like to explore? Let me know in the comments.
The post Python in Excel: How to do inferential statistics with Copilot first appeared on Stringfest Analytics.