Managing dependencies and environments across multiple platforms can be a nightmare. That’s why I was thrilled to discover Pixi. I’ve previously talked about Pixi on LinkedIn/Twitter, but hadn’t used it in any “serious” project until recently, and so far it has worked exceptionally well!
Imagine a tool that combines the speed and efficiency of uv with the robust package management of mamba. That’s Pixi in a nutshell. Built by the creators of mamba and using uv for PyPI dependencies, Pixi offers a streamlined, powerful way to manage Python environments. Compared to mamba, Pixi takes things one step further: its PyPI dependencies are resolved together with the conda packages, on top of the additional tools Pixi brings, such as tasks.
Cherry on top? Pixi is lightning fast and supports multi-platform and multi-environment setups inside a single file, where everything is kept in sync.
Multi-platform, multi-environment means that we can sync dependencies between osx-arm64, linux-64, CUDA, CPU, … - a standout feature!
Pixi Docker Builds
After solving your local environment in an easy yet reproducible manner, the next step is to solve it for your cloud workloads: containerization.
Containerization is an important part of a developer’s toolkit in the modern world. Cloud workloads are very commonly deployed as containers; in Data Science that covers everything from training and inference to data pipelines.
With Pixi it’s quite straightforward, and they provide ready-to-use images through the pixi-docker registry. There are multiple base images, including CUDA, to get started - it can’t be any simpler!
Pixi Sample Docker Builds
Simple starter:
docker pull ghcr.io/prefix-dev/pixi:latest
Find the different tags on the Pixi Docker tags page.
For an efficient production build, use a Docker multi-stage build: prefix-docker/shell-hook.
Pixi Docker Build on AWS Sagemaker
Sagemaker can be quite challenging to work with. While deploying custom Docker builds is easiest using their own base image, this image is often bloated with unnecessary dependencies. Additionally, to run @remote jobs on AWS, you need to include a conda or mamba environment - something that pixi doesn’t inherently use.
So, how do we integrate Pixi with Sagemaker?
Here’s a workaround to make them play nicely together:
- Include micromamba: Add micromamba (available on conda-forge) as a dependency in your pixi.toml. This will allow us to create a conda-like environment within our Pixi setup.
  - In the future this could be done using a simple shell script, which is a planned improvement in my own projects.
- Add micromamba to $PATH: Ensure that the micromamba executable installed by Pixi is added to your system’s $PATH. This will make it accessible to Sagemaker.
- Set Environment Variables: Configure necessary environment variables like CONDA_PREFIX to point to the appropriate location where micromamba will manage your environment.
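The last two steps boil down to two environment variables. Here is a minimal Python sketch of what they could look like, under a few assumptions that are not spelled out above: the Pixi environment lives in `.pixi/envs/default` under the project root, executables sit in its `bin` directory (Linux layout), and the helper name `sagemaker_env` is hypothetical.

```python
import os
from pathlib import Path


def sagemaker_env(project_root: str, env_name: str = "default") -> dict[str, str]:
    """Return a copy of the current environment with the Pixi env exposed conda-style.

    Assumptions: the environment lives in <project_root>/.pixi/envs/<env_name>
    and its executables (micromamba included) are in the `bin` subdirectory.
    """
    prefix = Path(project_root) / ".pixi" / "envs" / env_name
    env = dict(os.environ)
    # Put the Pixi-installed executables first on PATH, skipping an empty PATH.
    parts = [str(prefix / "bin"), env.get("PATH", "")]
    env["PATH"] = os.pathsep.join(p for p in parts if p)
    # Point conda-aware tooling at the Pixi-managed environment.
    env["CONDA_PREFIX"] = str(prefix)
    return env


if __name__ == "__main__":
    env = sagemaker_env("/app")  # "/app" is a placeholder project root
    print(env["CONDA_PREFIX"])
    print(env["PATH"].split(os.pathsep)[0])
```

In a Dockerfile the same two values would typically be set with ENV instructions; the function form is only there to make the logic easy to read and check.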
With these steps, you’re ready to run your Pixi-managed projects on Sagemaker!
In my experiments, this approach significantly reduced the size of my CUDA images from around 12 GB down to 4.5 GB - a massive improvement in terms of storage and deployment speed!
Pixi Multi-Platform/Environment
One of Pixi’s standout features is its seamless support for multi-platform and multi-environment projects. While I initially planned to delve deeper into this, prefix.dev recently published an excellent guide on the topic. I highly recommend checking out their documentation on combining different operating systems and environments (CPU, CUDA) with PyTorch for a comprehensive overview.
Some Personal Comments
Personally, I find this part of pixi one of its biggest strengths, especially how easy it is to work with! To build a docker image you simply follow the basic example above, opting for --feature=cuda.
Keeping lock files for everything, while allowing certain operating systems to skip some dependencies, makes it very practical in real-world scenarios!
Pixi Build Slimming
When containerizing your code, it’s crucial to keep builds slim. Here are a few tricks to help you minimize your Pixi-based Docker images:
- Leverage .dockerignore: Create a .dockerignore file to exclude unnecessary files and directories (e.g., .git, __pycache__, tests) from your Docker build context.
- Optimize Dependencies:
  - Carefully consider each dependency and remove any that are not strictly required for production.
  - Utilize multiple environments within your pixi.toml, e.g. prod and dev environments. This allows you to exclude dev-specific dependencies (test, lint, ...) from your production container.
- Employ Multi-Stage Docker Builds: Multi-stage builds reduce the image size. Use a build stage to install dependencies and compile your application, and then copy only the necessary artifacts to a smaller, leaner final image. The pixi-docker project provides guidance on using multi-stage builds with shell-hook.
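For the .dockerignore tip, here is a small Python sketch that renders and writes such a file. The entries are the ones named in the list, with two assumptions of mine: `**/__pycache__` is used so nested cache folders are matched as well, and `.pixi` (the local environment folder) is added because the environment is rebuilt inside the image. The helper names are hypothetical.

```python
from pathlib import Path

# Entries from the list above; `.pixi` and the `**/` prefix are my assumptions.
DEFAULT_ENTRIES = [".git", "**/__pycache__", "tests", ".pixi"]


def render_dockerignore(entries: list[str]) -> str:
    """Render one pattern per line, dropping duplicates but keeping order."""
    unique = list(dict.fromkeys(entries))
    return "\n".join(unique) + "\n"


def write_dockerignore(project_root: str, entries: list[str] = DEFAULT_ENTRIES) -> Path:
    """Write the rendered patterns to <project_root>/.dockerignore."""
    target = Path(project_root) / ".dockerignore"
    target.write_text(render_dockerignore(entries), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Show what would be written without touching the file system.
    print(render_dockerignore(DEFAULT_ENTRIES), end="")
```

Whether to exclude tests depends on your setup: if the image is also used to run the test suite, leave that entry out.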
Pixi vs uv
While uv has gained significant traction in the Python community, I believe Pixi offers a more compelling solution for my specific needs, especially when it comes to complex, real-world projects.
Why?
- tasks are awesome. They might not be perfect but they’re great to me!
- Multi-platform and multi-environment projects (personal opinion) somehow end up easier in Pixi
  - I really tried to embrace the uv approach as I appreciate it as more lightweight. But Pixi is somehow “smoother”.
- Pixi has base images with CUDA
  - Both tools are easy to build from a raw base image too, so it’s not a huge problem
- Access to conda packages
  - Some hate it, but I like getting pre-built binaries.
  - It’s quite interesting to install shell tools via conda for container deployment.
- Possible to work with languages other than Python
What is the one big uv pro?
UV’s Inline Script Dependencies
I think this feature is really cool, but as pixi utilizes uv you can use it in pixi too! ;)
# /// script
# dependencies = [
# "requests<3",
# "rich",
# ]
# ///
import requests
from rich.pretty import pprint
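
# Fetch the PEP index and parse the JSON payload.
resp = requests.get("https://peps.python.org/api/peps.json")
data = resp.json()
# Pretty-print the first ten (PEP number, title) pairs.
pprint([(k, v["title"]) for k, v in data.items()][:10])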
Outro
If you’re a Python developer struggling with dependency management, environment inconsistencies, or cumbersome container builds, I urge you to give Pixi a try. It’s a powerful tool that has the potential to streamline your workflow and make you a happier developer. Pixi has certainly made a significant difference in mine!
Thanks for this time, Hampus Londögård