Python in Excel: How to run a Monte Carlo simulation
Want to share your content on python-bloggers? click here.
Monte Carlo simulations leverage probability and randomness to simulate processes multiple times, exploring a wide range of possible outcomes. This approach can illuminate the inherent uncertainty and variability in business processes and outcomes.
Integrating Python’s capabilities for Monte Carlo simulations into Excel enables the modeling of complex scenarios, from financial forecasting to risk management, all within the familiar confines of a spreadsheet environment. This fusion not only makes your workflow more efficient but also broadens your analytical perspective with a spectrum of potential outcomes.
This guide presents a few relatively simple applications of Monte Carlo simulations using Python in Excel. Follow along with the exercise file provided below:
For this post, I’ll assume you’re already comfortable with the basics of Python programming, including creating objects, working with functions, and more. Additionally, I’ll presume you’re acquainted with the fundamentals of utilizing Python within Excel, such as generating Python objects from Excel data, toggling output between Python objects and Excel values, among other tasks. Should you need further assistance with these topics, I recommend my book, Advancing into Analytics, along with my series of blog posts on integrating Python with Excel.
Monte Carlo with Excel vs Monte Carlo with Python in Excel
A quick online search reveals that Monte Carlo simulations can be performed in Excel, and have been for years. This functionality isn’t exclusively new to Python. However, choosing Python over Excel for these simulations presents considerable benefits.
First, Excel encounters difficulties in setting a random seed, a critical feature for ensuring reproducibility in analysis. We will delve into the significance of this and its implementation in Python later in this post. Second, while Excel’s dynamic arrays have enhanced simulation capabilities, especially when parameter changes are necessary, they fall short in complex situations that demand scalability and flexibility. Furthermore, these arrays can contribute to the workbook’s bulkiness, as all inputs must be displayed visually, unlike in Python, where data can be stored in an object and printed only as needed.
Given that simulations inherently involve evaluating numerous outcomes, visual aids are instrumental in helping decision-makers comprehend the information. Excel’s visualization tools often prove challenging to customize, making it difficult to present a large volume of data in an easily digestible manner.
Python succeeds in areas where Excel does not, boasting powerful libraries for numerical computations and advanced visualizations that facilitate the execution of Monte Carlo simulations. It streamlines the process of setting a random seed and effortlessly manages varying iteration numbers. Additionally, Python’s straightforward syntax and the support of its community lower the learning curve, providing a robust and user-friendly alternative for conducting Monte Carlo simulations.
By integrating Python with Excel, one can leverage Excel’s intuitive interface alongside Python’s computational strength. This synergy not only enhances the efficiency of simulations but also improves their complexity and presentation within the familiar confines of the Excel environment.
Enough conjecture. Let’s build a couple of relatively basic Monte Carlo simulations with Python in Excel.
Example 1: Simulating the distribution of profits
In this initial example, we’ll explore a variety of potential profit outcomes by considering assumptions regarding the timeframe, as well as the average and variability of revenues and expenses. Below, you’ll find a Python code block that captures each input as a separate variable. These variables serve as the foundation for a comprehensive Monte Carlo simulation.
Creating the user inputs
The model dynamically recalculates every time an input is modified, leveraging Excel’s user-friendly interface while Python executes the intricate computations in the background.
Enhancements such as data validation and protection could be integrated into the input cells, augmenting the usability and robustness of your Python-enhanced Excel model.
One element you may be less acquainted with is the number of simulations. In Monte Carlo simulations, the accuracy of results is directly influenced by the number of simulations conducted. A higher number of simulations yields more precise outcomes by averaging across numerous trials. The optimal count hinges on your specific needs for accuracy and the complexity of your model. Starting with a modest number is advisable to gauge requirements, subsequently increasing the number of simulations until the results demonstrate consistency, indicating sufficient accuracy has been achieved.
In scenarios where critical analysis of performance and time efficiency is paramount, strategies for optimizing the number of simulations exist. However, for a preliminary example such as this, you can afford to conduct thousands of iterations without significant time constraints, ultimately achieving reliable results.
The unexpected 5000 outcome in cell A9 should not be a cause for concern; it simply reflects Python displaying the value of the last cell in the sequence. We plan to refine this cell by incorporating additional code to enhance its functionality.
Setting up the random seed
Next, we’ll add a row to our inputs, setting the random seed for the Python model. Setting a seed in numpy involves initializing the random number generator with a specific starting point.
This action ensures that the sequence of random numbers produced by the generator is predictable and reproducible. By setting a seed, you effectively instruct the random number generator on where to begin its sequence, and this sequence will remain consistent every time for the given seed. This predictability is crucial for reproducibility in scientific computing and data analysis, as it allows others to replicate experiments and verify results.
If you use the same seed number as provided in this post (1234), you are guaranteed to achieve identical results. If you’re interested in exploring different outcomes, you can opt for a different random seed number. While the choice of seed is arbitrary, maintaining consistency with the seed and selecting a memorable number reinforces the principle that this practice aids in reproducibility.
Simulating the profits and costs
Now that all the inputs have been added to the model, it’s time to build the simulation. You can add the below code to your Python cell to build that simulation:
Let’s take a moment to delve into this code. Initially, we are simulating monthly revenues. The function np.random.normal
is employed to generate random numbers that adhere to a normal distribution. A similar approach is taken for monthly costs, aiming to simulate various expense scenarios. It’s important to note that these calculations are directly influenced by the inputs provided by the user in the preceding cell. This includes the estimated means, standard deviations, time periods, and the number of simulations.
Finally, annual_profits
are calculated for each simulation by summing the total revenue for the year (accumulating all the monthly revenues) and then deducting the total costs (summing all the monthly expenses). The .sum(axis=1)
part technically refers to the process of summing the figures across each row, which symbolizes a year’s worth of months for each simulation.
In essence, this code conducts a series of “what-if” scenarios, exploring potential profit outcomes over a year while acknowledging that both revenues and costs are subject to fluctuations.
As you familiarize yourself with the workings of this code, don’t hesitate to display values from any object in Excel, adjust the input parameters, and explore further. After all, Excel offers a fantastic, hands-on medium for truly understanding data processing.
Visualizing the distribution of profits
Running a Monte Carlo simulation is one aspect, but interpreting the results is a distinct challenge. It’s useful to begin with summary statistics, such as average profits, and the minimum and maximum values, among others. However, to truly grasp the full spectrum and range of outcomes, visualization proves to be the most effective method, which is what we will explore now.
As our aim is to examine the distribution of outcomes, I will employ a histogram to tally the number of observations falling within various profitability ranges. The code below is designed to introduce formatting enhancements to the graph. For instance, it ensures the X-axis is represented in currency format.
Additionally, I will label the plot and its axes, among other details. Please proceed by adding this code to your existing Python in Excel block, or by creating a new block beneath the current one:
To create a Python plot in Excel, you’ll likely need to adjust its size. For guidance on resizing, refer to this post. Additionally, if your Python code blocks have become too cumbersome for the Excel formula bar, consider switching to the Python code editor provided by Excel Labs.
Great job on creating a Monte Carlo simulation with Python in Excel! To kick the tires a bit, let’s experiment by modifying the inputs and observe the outcomes:
Please keep in mind that as you incorporate more data into your workbook, the update time for the Python-generated results in Excel might extend. Updates do not always happen instantaneously.
Customizing the plot
Although I do notice that the plot changes upon modifying the inputs, I wish it more intuitively represented these changes. To this end, I am going to adjust the histogram to conditionally color the bars red where profits are negative. Additionally, I will expand the plot’s width to prevent the negative values from clustering too closely.
The code to achieve this customized plot is provided below:
With this change, go ahead and modify the simulation inputs again. Doing so will clarify whether a particular bin yields negative or positive profitability.
Example 2: Simulating changes over time
To provide another illustration, let’s design a Monte Carlo simulation to model the potential growth of a startup’s sales over the coming year. This will enable us to visualize various possible trajectories for the startup’s sales over time. Consequently, the visualization we generate will not be a histogram, but rather a line plot.
You’ve already seen the foundational concepts from the previous example, including defining model inputs from Excel cells, setting the random seed, creating simulations, and plotting the results. Below, I will provide the complete code.
If you’re still developing your intuition for how some parts of this code function, I encourage you to break it down into smaller segments within Excel cells and visually analyze the outcomes. Furthermore, if you need assistance in constructing and understanding the Python code, consider seeking help through Copilot in Excel, as suggested in this blog post.
You should now have a chart similar to the one below. Please note that due to space constraints on the plot, we are only displaying the first 100 simulations, even though 1,000 were conducted. To gain a more comprehensive understanding of the range of outcomes, it would be advisable to explore the descriptive statistics of the entire dataset.
Conclusion
I hope this introduction has illustrated the potential of using Monte Carlo simulations, especially through the combination of Excel’s user-friendliness with Python’s computing power.
Keep in mind that this technique isn’t limited to sales forecasting. It can model a wide range of scenarios, from call center hold times to the potential extent of financial losses in a portfolio. As you delve deeper into this method and your models grow in complexity, you may find the need for more inputs to construct a realistic model.
Another critical aspect to consider is the choice of distribution for simulating random numbers. Although we used the normal distribution in our examples, it has often been applied imprudently in situations where a different distribution would have been more appropriate.
What are your thoughts on creating Monte Carlo simulations with Python in Excel? Is this concept new to you? If you’ve previously used Monte Carlo simulations in Excel, how do you find the experience compares with incorporating Python? I’m eager to hear your feedback in the comments.
For those interested in further empowering your finance team to leverage Python with Excel for advanced time series analyses, forecasts, visualizations, and more, consider exploring my Python Foundations for Finance workshops:
The post Python in Excel: How to run a Monte Carlo simulation first appeared on Stringfest Analytics.
Want to share your content on python-bloggers? click here.