An attempt at decent Python dependencies management
I’ve since moved to uv for all Python projects.
I can’t recommend it enough.
Finally, a decent and useful solution for Python projects!
Computers were a mistake, once again.
At work we have to handle python-based projects (Python is unavoidable in the data science/AI world). Python is not the language I have the most flight hours with, but it has some nice features and syntax.
The problem
On the topics of dependency management, packaging, and deploying, the Python ecosystem is a complete mess, which is quite sad given the popularity of the language and its otherwise nice features.
As an “engineer”, you want to:
- have a reproducible build process
- have a reproducible runtime environment
- keep your dependencies under control
All of this is important for the whole team, from the ease of onboarding new developers to the ability to deploy the code with confidence, track down regressions, fix security issues, etc.
Here are two recent articles that sum up the situation pretty well:
- How to improve Python packaging, or why fourteen tools are at least twelve too many, by Chris Warrick
- Thoughts on the Python packaging ecosystem, by Pradyun Gedam
The goal here is to:
- share my humble choices on how to handle this topic.
- Not to dive into the painful controversy of the Python packaging world (I’ve read enough heated discussions on the topic, and took in enough negativity). If you want to know more, I recommend the two linked articles above (and their comments). ==Don’t forget a decent amount of salted popcorn==.
Minimal configuration
Here’s the TL;DR: Pipenv + Docker. See below for more details.
Using the config below, a developer can run the app locally on their system (if they’re comfortable with that) or run it locally in a Docker container.
Ultra-minimal Dockerfile
```dockerfile
FROM python:3.11

# Install system dependencies (here, jq and sqlite3 as examples)
RUN apt-get update && apt-get install -y jq sqlite3

WORKDIR /app
COPY . .

RUN pip install pipenv
RUN pipenv install --deploy

# Run your application
CMD ["pipenv", "run", "./start.sh"]
```
You can adapt and expand depending on your needs/context. For example, you may want to use a multi-stage build to get a cleaner final image, and even keep pipenv out of your final image (but in that case, you may want a separate Dockerfile for development).
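A minimal sketch of what that multi-stage variant could look like. It assumes the same `start.sh` entry point as the Dockerfile above, and relies on pipenv’s `PIPENV_VENV_IN_PROJECT` variable to place the virtual environment in `/app/.venv` so it can be copied into the final stage:

```dockerfile
# Build stage: install locked dependencies with pipenv
FROM python:3.11 AS builder
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# Create the virtualenv inside the project dir so it is easy to copy out
ENV PIPENV_VENV_IN_PROJECT=1
RUN pipenv install --deploy

# Final stage: the app and its virtualenv, no pipenv
FROM python:3.11-slim
RUN apt-get update && apt-get install -y jq sqlite3 && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
COPY . .
# Put the virtualenv's executables first on the PATH
ENV PATH="/app/.venv/bin:$PATH"
CMD ["./start.sh"]
```

The final image is based on the `-slim` variant and never contains pipenv itself, only the resolved packages.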
Run locally on your system
You’ll need a recent version of Python and pipenv installed on your system.
- The recommended way to install pipenv is via `pip install pipenv --user` (the `--user` flag is important, as it installs pipenv in your user site-packages directory, not in the system-wide site-packages directory).
- You’ll need to add the user site-packages binary directory to your `PATH` (e.g. `export PATH=$PATH:$(python -m site --user-base)/bin`) in your `~/.zshrc`, `~/.bashrc`, or shell profile file. See https://pipenv.pypa.io/en/latest/install/#installing-pipenv for more info. Your mileage may vary, depending on the way you manage your system Python installation.
Go into your project’s root and run `pipenv install --dev` to install the dependencies (it will automatically create a project-specific virtual environment if one is not already present). You can now run your application using `pipenv run <COMMAND>`, or use `pipenv shell` to activate the virtual environment first.
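For reference, this is what a minimal `Pipfile` looks like (the package names below are just examples, not taken from any particular project):

```toml
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[packages]
requests = "*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.11"
```

`pipenv install --dev` installs both `[packages]` and `[dev-packages]`, while `pipenv install --deploy` (as in the Dockerfile) installs only `[packages]` and fails if `Pipfile.lock` is out of date.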
Run locally with docker
Build the docker image:
```shell
docker build . -t my-application
```
Launch the container interactively, mounting the current directory as a volume:
```shell
docker run -it --rm -v "$(pwd)":/app my-application /bin/bash
```
Now, you can edit your code and launch it using the shell inside the container. Don’t forget to use either `pipenv run <COMMAND>` or `pipenv shell`.
Why Pipenv
Pipenv has flaws. A lot of them. But it’s one of the best compromises I’ve found so far for relatively simple projects.
Pipenv stays quite close to the two tools that now ship with Python: `venv` and `pip`. You can see it as a wrapper around those two tools, with some additional features: lockfiles, a dependency graph, and dependency auditing.
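To make the lockfile part concrete: `Pipfile.lock` is plain JSON that pins every resolved package to an exact version and a set of artifact hashes. A small sketch that reads such an excerpt (the structure with `default`/`develop` sections is real; the package, version, and hash values below are made up for illustration):

```python
import json

# Illustrative Pipfile.lock excerpt (made-up package data)
lock_excerpt = """
{
  "default": {
    "requests": {
      "hashes": ["sha256:0000000000000000000000000000000000000000000000000000000000000000"],
      "version": "==2.31.0"
    }
  },
  "develop": {}
}
"""

lock = json.loads(lock_excerpt)
for name, entry in lock["default"].items():
    print(f"{name} {entry['version']} ({len(entry['hashes'])} pinned hash)")
```

Because every install goes through this file (`pipenv install --deploy` refuses to run if it’s stale), two machines building from the same lockfile end up with the same dependency versions.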
Pipenv is close to the PyPA (Python Packaging Authority) working group, and is the tool recommended by default in the Managing Application Dependencies tutorial. “Always bet on the standard (even if you don’t like it so much)” is a good default posture, especially if you’re not an expert on the topic.
The PyPA working group itself is heavily criticized (see the articles above, again), but it’s the closest thing to a “standard” that we have in the Python world.
Docker / containerization
Docker/containerization is now ubiquitous in the industry. I won’t go into the details, as there are tons of resources on the topic. But it’s a great tool to handle the runtime environment of your application, and all the dependencies you can’t completely control with the programming language’s package manager (system dependencies, etc.).