Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pixi, using `pixi --version`.
Reproducible example
- Set up a Pixi project on a shared network drive (Lustre).
- Submit multiple SLURM jobs (e.g., 10+) that execute `pixi install --frozen` simultaneously in the same project directory (see the sketch below).
- Some jobs will succeed, while others fail with the `File was modified during parsing` error.
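For context, a job script along these lines is enough to trigger the failure on our cluster. This is only a sketch: the project path is a placeholder, and the array size just mirrors the 14-sample batch described below.

#!/bin/bash
#SBATCH --job-name=pixi-race
#SBATCH --array=1-14          # one task per sample; 14 concurrent tasks matched the failing batch
#SBATCH --cpus-per-task=1

# Every array task works in the same project directory on the Lustre mount.
cd /lustre/projects/my-pipeline    # placeholder path

# --frozen skips solving and lockfile updates, but each task still reads the
# prefix records under .pixi/envs/, which is where the parsing race occurs.
pixi install --frozen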
Commands I ran and their output:
pixi install
Error: × failed to collect prefix records from '/***/.pixi/envs/default'
╰─▶ File was modified during parsing
help: try `pixi clean` to reset the environment and run the command again

pixi.toml/pyproject.toml file that reproduces my issue:
[workspace]
channels = ["conda-forge", "bioconda", "biobakery"]
platforms = ["linux-64"]
[system-requirements]
libc = { family = "glibc", version = "2.17" }
[dependencies]
pyega3 = ">=5.2.0,<6"
r-base = ">=4.4,<4.5"
humann = "4.0.0a1.*"
metaphlan = "4.1.1.*"
# ... (other dependencies)

pixi info output:
System
------------
Pixi version: 0.63.2
TLS backend: rustls
Platform: linux-64
Virtual packages: __unix=0=0
: __linux=3.10.0=0
: __glibc=2.17=0
: __archspec=1=skylake_avx512
Cache dir: /***/.cache/rattler/cache
Auth storage: /***/.rattler/credentials.json
Config locations: No config files found
Global
------------
Bin dir: /***/.pixi/bin
Environment dir: /***/.pixi/envs
Manifest dir: /***/.pixi/manifests/pixi-global.toml
Workspace
------------
Name: ***
Version: 0.1.0
Manifest file: /***/pixi.toml
Last updated: 19-01-2026 07:27:29
Environments
------------
Environment: default
Features: default
Channels: conda-forge, bioconda, biobakery
Dependency count: 13
Dependencies: pyega3, r-base, humann, metaphlan
Target platforms: linux-64
Prefix location: /***/.pixi/envs/default
System requirements: libc = { family = "glibc", version = "2.17" }
Issue description
I am encountering an intermittent race condition when running multiple instances of a pipeline via SLURM on an HPC cluster.
Even when using `pixi install --frozen` to prevent environment modifications, concurrent tasks on a shared filesystem (Lustre) sometimes collide when parsing prefix records.
This behavior is non-deterministic:
- In a batch of 14 concurrent samples, 6 failed with this error while the rest succeeded.
- Re-running those 6 failed samples later (with fewer processes competing) worked without issue.
- Conversely, a separate batch of 88 samples, processed with a maximum concurrency of 18 tasks (the CPU quota limit for my user), finished successfully on the first try.
This suggests that the error is not strictly tied to the number of concurrent tasks, but rather to a transient race condition during the initial environment validation on the shared filesystem.
I’m curious if pixi currently implements a locking mechanism for these read/verify operations, or if this is an area where the high latency of Lustre might be causing unexpected behavior.
Regardless of this issue, I would like to thank the maintainers for their incredible work on Pixi; it has significantly improved our bioinformatics workflows! 💙
Expected behavior
Pixi should ideally use a file-locking mechanism (such as `flock`) to ensure that only one process reads or modifies the environment prefix records at a time, or it should handle concurrent read/verify operations gracefully instead of crashing when a file is accessed by multiple processes.
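Until something like that exists inside pixi, a wrapper along these lines serializes the install step from the submission side. This is only a sketch of a workaround, not anything pixi provides: the lock path is a placeholder, and note that flock(1) is only honored across nodes on Lustre when the client mounts the filesystem with the `flock` mount option.

#!/bin/bash
# Sketch: serialize concurrent `pixi install` runs with flock(1).
# Caveat: on Lustre, cross-node flock requires the "flock" mount option
# ("localflock" only provides node-local semantics).

LOCKFILE="$PWD/.pixi-install.lock"   # placeholder; any file on the shared mount works

# Block until the lock is free, then run the install while holding it
# exclusively; the lock is released automatically when pixi exits.
flock "$LOCKFILE" pixi install --frozen

With this in place, the array tasks queue behind one another for the install step only, while the rest of the pipeline still runs concurrently.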