An attempt at decent Python dependencies management


This post is now (thankfully) outdated

I’ve since moved to uv for all Python projects.
I can’t recommend it enough.

Finally, a decent and useful solution for Python projects!

Computers were a mistake, once again.

At work we have to handle Python-based projects (Python is unavoidable in the data science/AI world). Python is not the language I have the most flight hours with, but it has some nice features and syntax.

The problem

On the topics of dependency management, packaging, and deployment, the Python ecosystem is a complete mess, which is quite sad given the popularity of the language and its otherwise nice features.

As an “engineer”, you want to:

  1. have a reproducible build process
  2. have a reproducible runtime environment
  3. keep your dependencies under control

All of this is important for the whole team, from the ease of onboarding new developers to the ability to deploy the code with confidence, track down regressions, fix security issues, etc.

Here are two recent articles that sum up the situation pretty well:

The goal here is to:

  • Share my humble choices on how to handle this topic.
  • Not dive into the painful controversy of the Python packaging world (I’ve read enough heated discussions on the topic, and taken in enough negativity). If you want to know more, I recommend the two articles linked above (and their comments). ==Don’t forget a decent amount of salted popcorn==.

Minimal configuration

Here’s the TL;DR: Pipenv + Docker. See below for more details.

Using the config below, a developer can run the app directly on their system (if they’re comfortable with that) or run it locally in a Docker container.
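All of this assumes your project has a Pipfile (and its Pipfile.lock) at the root. For reference, a minimal one could look like this (requests and pytest are just example dependencies, not part of the setup above):

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

[dev-packages]
pytest = "*"

[requires]
python_version = "3.11"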

Ultra-minimal Dockerfile

FROM python:3.11

# Install system dependencies (as an example here: jq and sqlite3)
RUN apt-get update && apt-get install -y jq sqlite3
WORKDIR /app
COPY . .
RUN pip install pipenv
RUN pipenv install --deploy
# Run your application
CMD ["pipenv", "run", "./start.sh"]

You can adapt and expand depending on your needs/context. For example, you may want to do a multi-stage build to get a cleaner final image, and even get pipenv out of your final image (but in this case, you may want to have a separate Dockerfile for development).
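As an illustration, here is a sketch of such a multi-stage build. It assumes the same project layout as the example above (a Pipfile/Pipfile.lock and a start.sh entry point) and keeps the virtual env inside the project directory so it can be copied into the final image; treat it as a starting point, not a drop-in Dockerfile.

# --- Build stage: install pipenv and the locked dependencies ---
FROM python:3.11 AS builder
RUN pip install pipenv
WORKDIR /app
COPY Pipfile Pipfile.lock ./
# PIPENV_VENV_IN_PROJECT=1 makes pipenv create the virtual env in /app/.venv
RUN PIPENV_VENV_IN_PROJECT=1 pipenv install --deploy

# --- Final stage: no pipenv, only the virtual env and the application code ---
FROM python:3.11-slim
RUN apt-get update && apt-get install -y jq sqlite3 && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /app/.venv ./.venv
COPY . .
# Put the virtual env first on the PATH instead of going through `pipenv run`
ENV PATH="/app/.venv/bin:$PATH"
CMD ["./start.sh"]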

Run locally on your system

You’ll need a recent version of Python and pipenv installed on your system.

  1. The recommended way to install pipenv is via pip install pipenv --user (the --user flag is important: it installs pipenv in your user site-packages directory rather than in the system-wide one).
  2. You’ll need to add the user site-packages binary directory to your PATH (e.g. export PATH=$PATH:$(python -m site --user-base)/bin) in your ~/.zshrc, ~/.bashrc, or shell profile file; see the snippet after this list. See https://pipenv.pypa.io/en/latest/install/#installing-pipenv for more info. Your mileage may vary, depending on the way you manage your system Python installation.
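On a typical Linux/macOS setup, this boils down to something like the following (a sketch; adjust the profile file to your own shell and Python installation):

# 1. Install pipenv into your user site-packages
pip install pipenv --user

# 2. Make user-level scripts available (add this line to ~/.zshrc or ~/.bashrc)
export PATH=$PATH:$(python -m site --user-base)/bin

# 3. Check that pipenv is reachable
pipenv --version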

Go into your project’s root and run pipenv install --dev to install the dependencies (it will automatically create a project-specific virtual env if not already present). You can then run your application using pipenv run <COMMAND>, or use pipenv shell to activate the virtual environment first.
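For example, a day-to-day session could look like this (my-application, start.sh and pytest are just placeholders for whatever your project actually uses):

cd my-application        # your project's root, where Pipfile and Pipfile.lock live
pipenv install --dev     # creates the virtual env and installs all dependencies

# Option 1: run individual commands through the virtual env
pipenv run ./start.sh
pipenv run pytest

# Option 2: activate the virtual env in a subshell
pipenv shell
./start.sh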

Run locally with Docker

Build the Docker image:

docker build . -t my-application

Launch the container interactively, mounting the current directory as a volume:

docker run -it --rm -v "$(pwd)":/app my-application /bin/bash

Now, you can edit your code and launch it using the shell inside the container. Don’t forget to use either pipenv run <COMMAND> or pipenv shell.
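Inside the container’s shell, the workflow is the same as on your host (start.sh again being a placeholder for your actual entry point):

# Inside the container
pipenv run ./start.sh     # run a single command through the virtual env
# or
pipenv shell              # activate the virtual env in a subshell...
./start.sh                # ...and run your commands directly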

Why Pipenv

Pipenv has flaws. A lot. But it’s one of the best compromises I’ve found so far for relatively simple projects.

Pipenv stays quite close to the two tools that now ship with Python: venv and pip. You can see it as a wrapper around those two tools, with some additional features: lockfiles, a dependency graph, and dependency auditing.
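Those extra features map to a handful of commands (a quick reference; see pipenv --help for the full list and the options available in your version):

pipenv lock      # resolve dependencies and (re)generate Pipfile.lock
pipenv graph     # print the dependency tree of the installed packages
pipenv check     # audit the environment for known security vulnerabilities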

Pipenv is close to the PyPA (Python Packaging Authority) working group, and is the tool recommended by default in the Managing Application Dependencies tutorial. “Always bet on the standard (even if you don’t like it so much)” is a good default posture, especially if you’re not an expert on the topic.

The PyPA working group itself is heavily criticized (see the articles above, again), but it’s the closest thing to a “standard” that we have in the Python world.

Docker / containerization

Docker/containerization is now ubiquitous in the industry. I won’t go into the details, as there are plenty of resources on the topic. But it’s a great tool to handle the runtime environment of your application, and all the dependencies you can’t completely control with the programming language’s package manager (system dependencies, etc.).