Python in Excel: How to build decision trees with Copilot
Decision trees are intuitive analytical tools that help us explore and predict outcomes by asking simple, branching questions about our data. They’re popular because they’re visual, easy to interpret, and closely mimic human decision-making processes. However, Excel users historically faced an uphill battle using decision trees. There’s no built-in Excel feature for building them, forcing analysts either to leave Excel entirely or to spend hours painstakingly crafting complex logical formulas. Thankfully, that’s no longer the case.
With the integration of Python and Copilot’s Advanced Analysis directly inside Excel, decision trees have gone from intimidating to effortless. Excel users can now leverage Python’s powerful machine learning capabilities without leaving the comfort of their spreadsheet. Even better, Copilot guides you through each step, helping you understand and interpret results clearly… no coding or advanced statistics degree required.
In this post, we’ll demonstrate how you can easily create and interpret decision trees using Python and Copilot in Excel, helping HR professionals understand factors driving employee attrition. To follow along, download the IBM HR Employee Attrition dataset below. We’ll use it to build a simple decision tree model, evaluate its accuracy, visualize key insights, and turn these findings into actionable business recommendations.
Setting the business context
We start our analysis with this straightforward prompt:
“Briefly summarize what this dataset is about and explain why using a decision tree might help an HR analyst better understand attrition.”
Here, we’re taking a moment to step back and ensure we understand the underlying business problem: Why do employees leave, and what can we do about it?
Copilot’s output deliberately moves beyond simply describing what is in the dataset. It clarifies why each factor, such as age, role, income, or satisfaction level, is relevant in an HR context. By highlighting that a decision tree visually separates employees based on traits associated with higher or lower attrition risk, Copilot helps us understand the analytical process itself.
Building the decision tree
For our next prompt, we’ll ask Copilot to build our decision tree model, automatically picking out the most important factors for predicting employee attrition:
“Build a basic decision tree model to predict employee attrition using the most relevant features from the dataset. Explain briefly why the model chose these features as important.”

In the resulting output, the key predictors are clearly listed, with features like MonthlyIncome, OverTime, and TotalWorkingYears emerging at the top.
But Copilot doesn’t leave us hanging there: it also gives us context. For instance, it points out that employees with lower monthly incomes or who frequently work overtime tend to leave the company more often. This makes sense from a practical HR standpoint, right? Knowing these drivers lets us immediately think about actionable solutions.
With Copilot’s help, we’re quickly moving from raw data toward insights that can genuinely impact employee retention and organizational strategy.
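Under the hood, Copilot generates scikit-learn code roughly like the following. This is a minimal sketch, not Copilot's exact output: the tiny synthetic DataFrame stands in for the IBM HR dataset (only the column names MonthlyIncome, OverTime, TotalWorkingYears, and Attrition match the real schema), and the values are made up for illustration.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the IBM HR dataset (same column names, made-up rows).
df = pd.DataFrame({
    "MonthlyIncome":     [2500, 9800, 3100, 12000, 2800, 7600, 3300, 11000],
    "OverTime":          ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"],
    "TotalWorkingYears": [2, 15, 3, 20, 1, 12, 4, 18],
    "Attrition":         ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"],
})

# Encode categorical columns as 0/1 so the tree can split on them.
X = df[["MonthlyIncome", "OverTime", "TotalWorkingYears"]].copy()
X["OverTime"] = (X["OverTime"] == "Yes").astype(int)
y = (df["Attrition"] == "Yes").astype(int)

# max_depth keeps the tree small and interpretable.
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X, y)

# Feature importances show which columns drive the splits.
for name, imp in zip(X.columns, model.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

The `feature_importances_` attribute is how scikit-learn ranks predictors: importances across all features sum to 1, so a feature near the top of that list is doing most of the splitting work.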
Testing the decision tree on new data
Here’s our next prompt. We’re now checking how well our decision tree works with new data, something it hasn’t “seen” before:
“Split the data into training and testing sets (80/20). Train the decision tree model on the training set and tell me how accurately it predicts attrition on new data. Briefly explain why this testing step matters for business decisions.”
Why are we doing this? Because a model might be great at making sense of the past (training data) but struggle to handle future or unseen cases. Splitting the dataset lets us simulate real-world scenarios—training the model on one portion of data, then testing it on a smaller, separate portion to gauge how it might perform “in the wild.”

Copilot’s output here is clear and straightforward. It tells us the decision tree predicted employee attrition accurately about 86% of the time on new, unseen data. Then, importantly, it explains why this matters: testing gives us confidence that our insights aren’t just theoretical.
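The split-and-score step Copilot performs looks roughly like this in scikit-learn. Again a sketch on synthetic stand-in data rather than the real HR file, so the printed accuracy here won't match the 86% figure from the actual analysis.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic classification data stands in for the HR dataset.
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=42)

# Hold out 20% of rows the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Accuracy on the held-out set estimates real-world performance.
acc = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {acc:.1%}")
```

The `stratify=y` argument keeps the same proportion of leavers and stayers in both halves, which matters for attrition data because leavers are usually the minority class.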
Building and interpreting a visualization
For our next prompt, we’ll ask Copilot to create a clear visual representation of our trained decision tree right within Excel, making it easier to understand how different factors influence employee attrition:
“Create a visualization of the trained decision tree right here in Excel. Walk me through what the first few splits mean for HR—what factors most influence employees leaving?”

Visualizing the decision tree translates abstract numeric data into an intuitive flowchart, making it easy for HR analysts or stakeholders to understand how the model predicts employee attrition. The current visualization shows the first three layers of the decision tree, using color shading to represent the predicted outcomes: orange indicates employees likely to stay, blue indicates those likely to leave, and lighter shades reflect less certainty.
At the highest level, whether an employee works overtime emerges as the most important factor. Employees not working overtime tend to remain at the company, especially those with more years of experience, higher hourly rates, stock options, and greater job satisfaction. Conversely, those who do work overtime show higher attrition rates, particularly if their monthly income is low, they’re unmarried, hold lower-level job roles, or haven’t been promoted recently.
While this visualization stops after three layers for readability, in practice, the decision tree would branch further into additional, more detailed layers. To keep interpretations clear and manageable, analysts typically “prune” or simplify deeper branches. Even without further pruning here, it’s clear from this initial analysis that factors such as overtime, compensation, promotion frequency, and job satisfaction significantly influence employee retention.
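In scikit-learn terms, pruning can be done either by capping depth up front (`max_depth`, as used above) or by cost-complexity pruning after the fact (`ccp_alpha`). A minimal sketch on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=5, random_state=0)

# Unconstrained tree: grows until its leaves are pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# ccp_alpha penalizes complexity, collapsing branches that add little.
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X, y)

# The pruned tree is typically much smaller and easier to read.
print(full.tree_.node_count, "->", pruned.tree_.node_count)
```

A smaller tree usually generalizes better too, because the collapsed branches often memorized noise rather than real patterns.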
As a quick practical note, plots produced by Copilot may initially appear small, making labels hard to read. You can manually adjust the figure size (figsize) in your Python code, or explicitly request Copilot to resize it for clarity. Additionally, while Copilot is helpful for generating visuals, it can’t currently interpret detailed text within images or plots. For assistance interpreting visualizations, I’d suggest taking a screenshot and pasting it into Microsoft Copilot.
Developing business recommendations
After carefully exploring and interpreting the factors the decision tree identifies as important predictors of attrition, the natural next step is to translate these insights into practical actions:
“Based on this decision tree analysis, provide two or three simple, actionable recommendations HR can implement to reduce employee attrition.”
The resulting output directly responds to this goal, providing clear recommendations HR can feasibly implement. This ensures the analysis moves beyond descriptive insights into concrete, effective strategies for improving employee retention.
Conclusion
Decision trees are powerful yet accessible tools, helping Excel users quickly uncover meaningful insights from complex data… no advanced coding skills required. As we’ve seen, integrating Python and Copilot directly into Excel significantly reduces the friction traditionally associated with advanced analytics. Excel users in any business function can now rapidly identify key factors driving outcomes like employee attrition, customer churn, and sales performance, turning raw data into actionable strategies.
However, while decision trees provide intuitive visuals and practical recommendations, they come with certain limitations. They simplify complex relationships and might overlook subtle factors or interactions hidden in your data. Their effectiveness also heavily depends on data quality. Errors, gaps, or biases can affect the accuracy of insights.
Looking ahead, Excel users can take this analysis further by comparing decision trees against other analytical methods, such as logistic regression or random forests, to validate results and strengthen confidence in findings. Regularly refreshing your analysis with new data ensures your insights remain relevant, especially as business conditions change.
Ultimately, while Copilot and Python unlock new possibilities in Excel, the importance of your own domain expertise remains central. Your ability to interpret these outputs thoughtfully, using practical business judgment, ensures recommendations remain realistic, impactful, and tailored to your organization’s goals.
The post Python in Excel: How to build decision trees with Copilot first appeared on Stringfest Analytics.