A recent trip to see the Barbie movie got us thinking: Barbie and Python have more in common than meets the eye (step aside, Ken!). For a start, they are both pioneers in their respective fields: Barbie is a famous fashion doll owned by millions of people around the globe, while Python is a famous programming language with millions of users worldwide. Barbie is well known for her wide range of careers, outfits and accessories. Meanwhile, Python comes in many different versions and has thousands of dedicated libraries and packages.
Crucially, they are both customisable. Barbie can be dressed in different outfits from her wardrobe to meet the demands of her busy schedule, whether that’s a day at the beach, governing her country, or kicking back for a quiet night in. With Python, meanwhile, we can customise our programming environment and switch between different combinations of packages and versions to tackle our data science projects. This is made possible through virtual environments.
What is a virtual environment?
Virtual environments are tools used in software development to create isolated environments for different projects. These environments allow developers to manage dependencies and packages separately for each project. This helps avoid conflicts between different project requirements and keeps everything organised. Each virtual environment is like a contained space where you can install packages without affecting the global Python installation.
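As a rough illustration of this isolation, the sketch below checks from inside Python whether the running interpreter belongs to a virtual environment. The helper name in_virtual_environment is our own invention. The check relies on sys.prefix differing from sys.base_prefix, which holds for environments created with venv and recent versions of virtualenv; conda environments are separate installations and are not detected this way.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Environment prefix:", sys.prefix)
    print("Base installation: ", sys.base_prefix)
    print("Inside a virtual environment:", in_virtual_environment())
```

Running this once with an environment activated and once without is a quick way to see the two prefixes diverge.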
While Barbie and Python virtual environments might seem unrelated at first glance, there are some similarities:
Customisation: Just like how you can dress up Barbie in different outfits and accessories, you can customise each Python virtual environment with specific packages and dependencies tailored to the needs of your project.
Isolation: Barbie’s different outfits don’t interfere with each other, just as Python virtual environments keep the dependencies of different projects separate, preventing conflicts.
Organisation: Barbie’s wardrobe allows her clothes to be neatly stored rather than strewn all over the floor. With Python virtual environments we can work with just the project-specific dependencies rather than hundreds of conflicting packages at once (never a good idea).
Portability: If you’re lucky enough to own multiple Barbies, you can try the same outfit on different Barbies. Similarly, with Python you can duplicate an environment to work on the same project across multiple machines and share it with your colleagues.
Virtual environment managers
There are a lot of virtual environment managers out there for Python. Below we will give a basic overview of some of the most popular options and share some useful links for more in-depth information.
Note that some of these tools double as package managers. For more on this, check out our recent blog on Python package managers.
Python’s standard library includes an easy-to-use, lightweight virtual environment module called venv. To create a virtual environment called “myenv”, you can run the following command:
python -m venv myenv
This will generate a folder within the current working directory called “myenv/” (you can call it whatever you like), which will be used to activate the virtual environment and store any packages that are installed into the environment.
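The same folder can also be created from Python itself through the standard library's venv module. The sketch below is only an illustration: the create_environment helper and the temporary directory are our own choices, and with_pip=False is used to keep the demo fast, whereas the command above installs pip into the environment by default.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(parent: Path, name: str = "myenv") -> Path:
    """Create a virtual environment folder inside `parent` and return its path."""
    env_dir = parent / name
    # with_pip=False skips bootstrapping pip, which keeps this demo quick.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_environment(Path(tmp))
        # The folder holds a pyvenv.cfg file alongside the interpreter
        # and package directories (bin/ and lib/, or Scripts\ and Lib\ on Windows).
        for entry in sorted(env_dir.iterdir()):
            print(entry.name)
```

Listing the folder contents like this makes it clear that an environment is just a directory, which is why deleting the directory is enough to remove it.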
To activate the environment on Windows:
myenv\Scripts\activate
On macOS and Linux, you have to source the activation script:
source myenv/bin/activate
Once activated, the pip install <pkg> command will now install packages into the virtual environment, keeping them separate from the user’s system environment. If you want to share your development environment with a colleague who is working on the same project, you can run:
pip freeze > requirements.txt
This will create a file called “requirements.txt” containing a list of installed Python packages and their version numbers. Your colleague can then install these dependencies into their environment by running:
pip install -r requirements.txt
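To get a feel for what ends up in “requirements.txt”, the sketch below builds a similar list of pinned packages using the standard library's importlib.metadata. It only approximates pip freeze (for example, it does not treat editable installs specially), and the freeze_lines helper is a name we made up for illustration.

```python
from importlib import metadata


def freeze_lines() -> list[str]:
    """Return 'name==version' strings for the packages visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    # Sort case-insensitively so the listing is easy to scan.
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for line in freeze_lines():
        print(line)
```

Run inside an activated environment, this lists only the packages installed there, which is exactly what makes the resulting file a useful description of the project.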
When you are finished with the environment, it can be deactivated by running:
deactivate
To delete the environment outright, simply delete the “myenv/” folder (or whatever you called it).
virtualenv and virtualenvwrapper
Virtualenv is a third-party library that predates venv. If it’s installed with the virtualenvwrapper extension library, it can provide additional commands and features like quick switching between multiple environments. Virtualenv can be installed with pip:
pip install virtualenv
You can then use the virtualenv command to create a virtual environment from the command line:
virtualenv myenv
Activating and deactivating the environment is similar to venv. With a Unix shell the commands would be:
source myenv/bin/activate
deactivate
and packages can again be installed or uninstalled using pip.
Virtualenvwrapper is a set of extensions for virtualenv that simplify the management of multiple virtual environments. It provides commands to create, delete, and switch between virtual environments easily without having to explicitly state the environment file path. To get started with virtualenvwrapper, you’ll first need to install it using pip:
pip install virtualenvwrapper
We then need to add the code below to the shell startup file (~/.profile, etc.) to set the location where the virtual environments will be stored:
# Virtualenvwrapper settings:
export WORKON_HOME=$HOME/.virtualenvs
source ~/.local/bin/virtualenvwrapper.sh
Note that these commands are specific to the Unix shell. Windows users should investigate the virtualenvwrapper-win package.
In a new shell, you can now create a virtual environment and activate it as follows:
mkvirtualenv myenv
workon myenv
Virtualenvwrapper streamlines the management of virtual environments, making it especially useful when working on multiple Python projects simultaneously.
As well as different outfits and accessories for Barbie, there are different iterations of Barbie herself: Marine Biologist Barbie and Art Teacher Barbie to name a few! Python also comes in different versions, and there are many occasions where having multiple Python installations on the same machine can be useful:
- You may have upgraded to Python 3.11 but still need Python 3.8 to run some old legacy code.
- Your colleagues may be using an older version of Python for a project that you’re working on, and switching to that version to test and debug the project code would be useful.
Pyenv is a tool that allows you to easily switch between multiple Python versions on your system. It also facilitates installing different Python versions and, through the pyenv-virtualenv plugin, supports creating virtual environments for specific Python versions using virtualenv.
The GitHub documentation provides OS-specific instructions for installing pyenv on your machine. Once installed, you can try adding an older Python version using the pyenv install command and then create a virtual environment for that version called “myenv/”:
pyenv install 3.8.6
pyenv virtualenv 3.8.6 myenv
You can activate the environment by running:
pyenv activate myenv
and install packages into the environment using pip.
This is a great way to organise Python projects that not only require different packages but also use specific Python releases. And there is a lot more that you can do with pyenv, like specifying the Python version globally or in the current directory. Note, however, that there are some common pitfalls to be wary of when using pyenv:
- It’s easy to think that you’re using your system Python installation when really you’re working with an older version through pyenv.
- Be cautious when working with package managers like pip and poetry, which may be installing packages to your system Python installation rather than to the current pyenv version.
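A quick way to guard against both pitfalls is to ask Python which interpreter is actually running. The sketch below is a minimal example using only the standard library; the describe_interpreter helper is a hypothetical name of our own.

```python
import platform
import sys


def describe_interpreter() -> dict[str, str]:
    """Collect the details that reveal which Python installation is in use."""
    return {
        "executable": sys.executable,            # full path to the running interpreter
        "version": platform.python_version(),    # e.g. the release selected through pyenv
        "prefix": sys.prefix,                    # where packages for this interpreter live
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Similarly, running pip as python -m pip ties the installation to the interpreter you just inspected rather than to whichever pip happens to come first on your PATH.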
Pipenv is a popular tool for managing both Python dependencies and virtual environments. It combines the functionality of pip and virtualenv into a single tool, and is easy to install through pip:
pip install pipenv
It even integrates with pyenv to work with specific Python versions. To create a virtual environment for Python 3.8 (assuming you have pyenv installed), you can run:
pipenv --python 3.8
This automatically sets up a “Pipfile” within the current folder to manage project dependencies. You can then activate the environment and install packages into the environment using:
pipenv shell
pipenv install <pkg>
When a package is installed using pipenv, it gets added to the Pipfile. Both the package and its dependencies are also stored in a “Pipfile.lock” file with the exact version numbers. These files can be shared with a colleague, who can then duplicate the environment on their machine by running:
pipenv install
For more information about pipenv, Pipfiles and all of pipenv’s commands you can take a look at the official website.
Conda is a cross-platform package and environment manager primarily used in data science and scientific computing. It allows you to create isolated environments with different Python versions and libraries.
Conda is included as part of the Anaconda distribution. The fastest way to obtain it is by installing the Miniconda distribution, which acts as a smaller version of Anaconda that includes conda and Python. You can check out the installation instructions in our previous blog for more info.
By default, you will be working in the conda “base” environment. To create a new environment called “myenv” with Python version 3.8:
conda create --name myenv python=3.8
You can then activate the environment by running:
conda activate myenv
and deactivate the environment by running:
conda deactivate
To install packages into the currently active environment, you should use the conda install command. For example, NumPy and pandas can be installed by running:
conda install numpy pandas
When you install packages into a conda environment, the package source files are retained inside a package cache folder within the conda installation directory. This allows you to quickly install the same package across multiple environments without having to perform multiple downloads.
It’s possible to export your conda environment to a YAML file which can then be shared with a colleague:
conda env export > environment.yml
Your colleague can add the environment to their machine by running:
conda env create -f environment.yml
For this to work, both you and your colleague need to have conda installed. Conda can also be used for R and other languages, and downloads its packages from secure repositories that are maintained by the community. For more on conda, check out the official documentation.
Poetry is a modern dependency management and packaging tool for Python projects. It not only creates virtual environments but also simplifies the management of dependencies and project packaging.
Check out our previous blog for installation instructions. To create a new poetry project, run:
poetry new myproject
This initialises a project in the “myproject/” folder. Poetry sets up a virtual environment for the project automatically the first time one is needed, for example when you add a package:
poetry add <pkg>
To activate the virtual environment, run the following command from within the myproject/ folder:
poetry shell
Note that from Poetry 2.0 onwards this command is provided by the separate poetry-plugin-shell plugin. When you install packages these are added to a “pyproject.toml” file. There is also a “poetry.lock” file which lists all dependencies plus all of their dependencies with the exact versions. If you share the project folder and files with a colleague, they can run poetry install within the folder to duplicate the environment on their machine.
We highly recommend poetry if you’re starting on a new Python project from scratch. It helps not only with environment management, but also with installing the project dependencies and organising the project folder. It can even be used to package your project and publish it to the Python Package Index (PyPI) if you want to make it publicly available. Check out the excellent documentation for more info.
Virtual environments with Jupyter
Hopefully we’ve convinced you that virtual environments are as invaluable to Python development as Barbie’s wardrobe is to Barbie! You may now be thinking about how to incorporate some of the options presented in this blog into your development workflow.
Before we conclude, it’s worth mentioning how to add a virtual environment to Jupyter, since this is one of the most popular IDEs for developing and testing Python code. To be able to use your virtual environments within a Jupyter notebook or the JupyterLab IDE you need to:
Activate your virtual environment
Install the Python package ipykernel into your virtual environment using the relevant command:
pip install ipykernel
conda install ipykernel
Register the environment as a Jupyter kernel, replacing <env> with the name of your virtual environment:
python -m ipykernel install --user --name=<env>
Next time you open a Jupyter notebook or JupyterLab, you should see your environment in the list of available kernels.
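Behind the scenes, registering a kernel writes a small “kernel.json” file that tells Jupyter which Python executable to launch. The sketch below builds an approximation of that file to show why the notebook ends up using the packages from your environment. The kernel_spec helper is our own, and the exact fields that ipykernel writes may differ between versions.

```python
import json
import sys


def kernel_spec(env_name: str, python_path: str = sys.executable) -> dict:
    """Approximate the kernel.json contents for a registered environment."""
    return {
        # Jupyter starts the kernel with the environment's own interpreter and
        # substitutes {connection_file} with a path of its choosing at launch.
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": env_name,
        "language": "python",
    }


if __name__ == "__main__":
    # Run from inside an activated environment, sys.executable is that
    # environment's interpreter, so the spec points straight at it.
    print(json.dumps(kernel_spec("myenv"), indent=2))
```

Because the interpreter path is stored in the kernel definition, Jupyter itself can be installed elsewhere and still run notebooks against the chosen environment.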
Choosing the right virtual environment manager for your Python project depends on your specific requirements and preferences. Each of the tools discussed in this post has its own strengths and use cases as summarised by the table below:
|Environment manager|Quick/easy installation|Package manager|Quick multi-environment switching|Python version manager|Multi-language support|Packaging and publishing to PyPI|
Ultimately, the key is to ensure that your Python projects remain isolated, maintainable, and compatible with the required dependencies. Experiment with these tools and discover which one best fits your development workflow.