Learn how to write readable code in Python to make it more maintainable and understandable.
It is very important to stay organized when working on a project. This is especially true with Python, where it is easy to get lost among the dependencies and versions of your libraries. Without a virtual environment, every new library you install can overwrite dependencies that other projects rely on, which, more often than not, leaves you with code that no longer works.
In this blog post, I will cover how to download and set up a Conda environment for your Python projects. I will also cover how to manage the dependencies once the virtual environment is running.
Also, I will assume that you are using a Linux distribution. If you are using Windows, I recommend installing the Windows Subsystem for Linux (WSL) and following the instructions as if you were using Linux.
Conda is a package and virtual environment manager that is widely used for Python. In other words, it allows you to create "folders" in which a specific version of Python and its libraries are installed. This way, you can isolate the dependencies of each project from the Python that your system has, and avoid conflicts between libraries. Then, if you get bored of a project, you can just delete the environment and continue as if nothing happened.
We need to install Conda on our system first. In particular, we will use Miniconda, a minimal installation of Conda that works only on the command line.
First of all, create a directory into which the installer will be downloaded. The convention is to create a directory called miniconda3 in the home folder.
mkdir ~/miniconda3
cd ~/miniconda3
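Then, download the Conda installer. We can do this directly from the terminal with the wget command. Replace [OS] with your operating system (Linux or MacOSX) and [ARCH] with your architecture (e.g., x86_64, aarch64 on ARM Linux, or arm64 on Apple Silicon). The .sh installer is not available for Windows, so if you are working inside WSL, use the Linux one.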
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-[OS]-[ARCH].sh
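If you are unsure which values to use for [OS] and [ARCH], the following minimal sketch builds the URL from what Python reports about the machine. The function name `installer_url` and the two mapping tables are assumptions made for illustration. Only the base URL and the file name pattern come from the command above.

```python
import platform

# Base URL taken from the wget command above.
BASE_URL = "https://repo.anaconda.com/miniconda"

# Assumed mappings from the values Python reports to the names used in the installer files.
OS_NAMES = {"Linux": "Linux", "Darwin": "MacOSX"}
ARCH_NAMES = {"x86_64": "x86_64", "amd64": "x86_64", "aarch64": "aarch64", "arm64": "arm64"}


def installer_url(system=None, machine=None):
    """Build the Miniconda .sh installer URL for an OS and architecture."""
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    if system not in OS_NAMES:
        raise ValueError(f"No .sh installer assumed for system {system!r}")
    if machine not in ARCH_NAMES:
        raise ValueError(f"Unrecognized architecture {machine!r}")
    return f"{BASE_URL}/Miniconda3-latest-{OS_NAMES[system]}-{ARCH_NAMES[machine]}.sh"


if __name__ == "__main__":
    # Explicit values for a typical Linux desktop; call installer_url() to auto-detect.
    print(installer_url("Linux", "x86_64"))
```

Calling `installer_url()` without arguments uses the values detected on the current machine, so it only helps on a system where Python is already available.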
We can also change the name of the installer to something more readable:
mv Miniconda3-latest-[OS]-[ARCH].sh miniconda.sh
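The file we have just downloaded is a shell script that installs Conda on our system. We run it with the bash command. The -b flag runs the installer in batch mode, without asking any questions; -u updates an existing installation in the target directory if there is one; and -p sets the installation path. If you prefer to be asked where to install Conda and whether to initialize it in your shell, run the script without these flags and say yes to both questions.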
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
Once you are done, you can optionally remove the installer.
rm ~/miniconda3/miniconda.sh
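Finally, if you open a new terminal, you should see that Conda is installed and that the base environment is active, shown as (base) at the left of the prompt.
If you don't see the base environment active (the batch-mode installation above does not modify your shell configuration), you can load Conda into the current session with the following command: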
source ~/miniconda3/bin/activate
Also, if you are using a shell other than bash, or Conda is still not available when you open a new terminal, you can initialize Conda for your shell with the following command:
~/miniconda3/bin/conda init --all
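If you would rather not have the base environment activated automatically in every new terminal, you can optionally disable that behavior:
conda config --set auto_activate_base false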
Now that Conda is installed, we can create a virtual environment for our project. This is done with the conda create command. It is also very important to specify a Python version, as this installs the pip package manager along with other useful things. I highly encourage installing Python 3.10 or newer, as these versions offer very good type hinting.
For example, to create an environment called myenv with Python 3.11.5, run:
conda create -n myenv python=3.11.5
You will be asked to confirm the installation. Type y and press Enter.
After the installation, you can activate the environment with:
conda activate myenv
You should see that your environment has been activated in the shell, as (myenv) will appear at the left of the prompt.
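To double-check from inside Python that the interpreter really belongs to the active environment, a minimal sketch like the one below can help. It relies on the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that Conda sets on activation. The function name `describe_environment` and the keys of the returned dictionary are hypothetical.

```python
import os
import sys


def describe_environment(environ=None, prefix=None):
    """Summarize which Conda environment, if any, the interpreter is running in."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    conda_prefix = environ.get("CONDA_PREFIX")
    return {
        "env_name": environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": conda_prefix,
        "python_prefix": prefix,
        # True only when the running interpreter lives inside the active Conda environment.
        "python_matches_env": conda_prefix is not None
        and os.path.realpath(prefix) == os.path.realpath(conda_prefix),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `python_matches_env` is false while an environment is active, the python command is probably resolving to a different installation, such as the system one.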
Now that the environment is created and active, you could begin installing packages. However, it is important to properly manage the dependencies of your project.
If you start installing packages without tracking them, you will end up with a lot of packages whose purpose you no longer remember. This is because Python needs the code of every library it imports to be installed, and most of those libraries depend on other libraries, which in turn depend on others, and so on. Later, when you want to check which libraries you installed yourself at the beginning, you will have a hard time telling them apart from their dependencies.
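To get a feel for how quickly dependencies pile up, the sketch below lists what an installed package declares as its direct dependencies, using the standard library's `importlib.metadata.requires`. The function name `direct_dependencies` is hypothetical, and the name parsing is deliberately simplified, so treat it as an illustration rather than a complete requirement parser.

```python
import re
from importlib import metadata


def direct_dependencies(package, requires=metadata.requires):
    """Return the names of the packages that `package` declares as dependencies."""
    names = []
    for requirement in requires(package) or []:
        # Skip optional dependencies, which carry an "extra == ..." condition.
        if re.search(r"extra\s*==", requirement):
            continue
        # Keep only the leading package name, dropping version constraints and markers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.append(match.group(0))
    return names


if __name__ == "__main__":
    try:
        print(direct_dependencies("pandas"))
    except metadata.PackageNotFoundError:
        print("pandas is not installed in this environment")
```

Each name it returns may have dependencies of its own, which is exactly why the list of installed packages grows much faster than the list of packages you asked for.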
The cleanest way of installing packages is to create a requirements.txt file in the root of your project and manually list in it the packages that you will use:
numpy
pandas
matplotlib
And to install them, simply use:
pip install -r requirements.txt
If you want to add more packages, add them to the requirements.txt file and repeat the pip installation.
This method is also very useful when sharing your code or using code from others, as the dependencies are clearly stated in the requirements.txt file.
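When you receive a project with a requirements.txt file, it can be handy to see which of the listed packages are still missing from the active environment. The following is a minimal sketch, assuming a simple file with one package per line; the helper names `requirement_names` and `missing_packages` are hypothetical, and the parsing ignores more advanced pip syntax.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract package names from requirements.txt lines, skipping blanks and comments."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # ignore comments, blank lines, and pip options such as "-r other.txt"
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(lines):
    """Return the listed packages that are not installed in the current environment."""
    missing = []
    for name in requirement_names(lines):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The same three packages used in the requirements.txt example above.
    print(missing_packages(["numpy", "pandas", "matplotlib"]))
```

To check a real file, pass the open file object instead of a list, since iterating over it yields its lines.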
I have heard many times that you should always freeze the versions of your packages. This is because, when you install a package without specifying a version, you get the latest version of it. This can be a problem if the package has changed its API since you wrote your code, as your code will break.
However, the other way around, trying to use an old version of a package can also be a problem, as it may have bugs that have been fixed in the latest version.
I recommend freezing the versions of the dependencies if you are developing a project that needs to be future-proof. Personally, I don't usually use Python for bigger projects, so I always want the latest version of the packages.
To freeze the dependencies, just write the version of each package in the requirements.txt file:
numpy==1.21.2
pandas==1.3.3
matplotlib==3.4.3
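If the packages are already installed, you do not have to look up each version by hand. The sketch below, under the assumption of a simple one-package-per-line file, appends the currently installed version to every line that has no constraint yet. The function name `pin_requirements` and its `version_of` parameter are hypothetical; the version lookup uses the standard library's `importlib.metadata.version`.

```python
import re
from importlib import metadata


def pin_requirements(lines, version_of=metadata.version):
    """Add ==<installed version> to every bare package name in the given lines."""
    pinned = []
    for line in lines:
        text = line.strip()
        if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", text):
            # Keep comments, blank lines, and already-constrained lines as they are.
            pinned.append(text)
            continue
        try:
            pinned.append(f"{text}=={version_of(text)}")
        except metadata.PackageNotFoundError:
            pinned.append(text)  # not installed, so there is no version to record
    return pinned


if __name__ == "__main__":
    for line in pin_requirements(["numpy", "pandas", "matplotlib"]):
        print(line)
```

Unlike copying the full list of installed packages, this keeps the file limited to the packages you chose yourself, which is the point of tracking them manually.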
Conda is a very useful tool, and, in my opinion, a must-have for writing code in Python. Managing dependencies in Python can be a little clumsy, but by manually tracking the packages you install, you can avoid a lot of headaches.
Also, there are alternatives to Conda, like virtualenv, pipenv, and uv, the latter being a very promising tool. I recommend trying them out and seeing which one fits your workflow better.
I hope this blog post has been useful to you!