Description
Polyscript provides package loading functionality for the Pyodide backend through the `packages` config key (the link is to the PyScript docs; I couldn't find this documented for Polyscript). My understanding is that the feature works as follows:
- The first time, Polyscript loads the packages via `micropip`. It then generates a lockfile via `micropip.freeze()`, and caches it in IndexedDB with a key corresponding to the string representation of the array of packages loaded. This happens in `importPackages()`.
- On later page loads, Polyscript looks up the package list in IndexedDB, and if it finds it, it creates a blob for the cached lockfile and sets `options.lockFileURL` to point at that blob. It also sets the package list as `options.packages`, so that Pyodide loads the packages during startup. This happens in `engine()`.
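To make my reading of the flow concrete, here is a minimal sketch of it. This is not Polyscript's actual code: `cacheKey`, `rememberLockfile`, `startupOptions` and `blobURL` are hypothetical names, a `Map` stands in for IndexedDB, and `JSON.stringify` is only my assumption for the "string representation" of the package array.

```javascript
// Hypothetical key derivation: the real code may stringify the array differently.
const cacheKey = (packages) => JSON.stringify(packages);

// First page load: after micropip has installed the packages and
// micropip.freeze() has produced a lockfile, store it under the package list.
function rememberLockfile(cache, packages, lockfileText) {
  cache.set(cacheKey(packages), lockfileText);
}

// Default blob-URL helper (browser API; injectable so the sketch stays testable).
const blobURL = (text) =>
  URL.createObjectURL(new Blob([text], { type: "application/json" }));

// Later page loads: if a lockfile is cached for exactly this package list,
// hand it to Pyodide together with the packages to load during startup.
function startupOptions(cache, packages, toURL = blobURL) {
  const options = {};
  const lockfileText = cache.get(cacheKey(packages));
  if (lockfileText !== undefined) {
    options.lockFileURL = toURL(lockfileText);
    options.packages = packages;
  }
  return options;
}
```

Note that nothing in this lookup depends on the Pyodide version or on the age of the entry, which is what the rest of this report is about.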
AFAICT, the cache of lockfiles is never invalidated, except when setting `package_cache: 'never'` in the config (which clears the cache completely). This causes issues in at least two situations:
- When updating Pyodide: The lockfile contains the Pyodide version. Pyodide checks it against its own version, and fails to load in case of mismatch.
- When updating packages: While the cached lockfile ensures that the same package versions are always used, sometimes I do want a newer version of a package to be picked up.
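The first situation boils down to a version field in the cached lockfile no longer matching the runtime. A rough sketch of such a check, assuming the lockfile is JSON with the version under `info.version` (as in `pyodide-lock.json`); `lockfileMatchesPyodide` is a hypothetical helper, and the Pyodide version is simply passed in as an argument here:

```javascript
// Returns true only if the lockfile parses and declares the given Pyodide version.
// Assumption: the version lives at `info.version` in the lockfile JSON.
function lockfileMatchesPyodide(lockfileText, pyodideVersion) {
  try {
    const lock = JSON.parse(lockfileText);
    return lock?.info?.version === pyodideVersion;
  } catch {
    // An unparsable lockfile is treated like a mismatch.
    return false;
  }
}
```

For example, `lockfileMatchesPyodide('{"info":{"version":"1.0.0"}}', "1.0.1")` evaluates to `false` (the version strings are made up for illustration).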
I haven't found a good solution to either of these cases. Getting the current Pyodide version requires loading Pyodide, but this fails due to the version mismatch, so I cannot detect beforehand if I should set `package_cache: 'never'`. And for the packages, I would have to compare the cached lockfile to the current one; but the cache is an implementation detail, so that's not ideal.
My current workaround, also not ideal, is to not set `packages`, and instead pass the package list as a different config key `my_packages`, then retrieve that in Python and call `await pyodide_js.loadPackage(polyscript.config.my_packages)` from there. This increases the latency until Pyodide is ready, because packages cannot be loaded concurrently with Pyodide initialization.
My use case actually doesn't need the cache at all: I would be fine with Polyscript never loading packages, and instead passing `packages` directly as `options.packages` to Pyodide, without using `micropip` and freezing. But there is currently no way to do that. Setting `package_cache: 'never'` causes the packages to always be loaded with `micropip`, which has even worse latency than my workaround.
So this issue report is really two things:
- A bug report for the cache invalidation issues related to Pyodide updates. Both ideas below require getting the Pyodide version before loading it, which I haven't found how to do, and they don't solve the issue with package updates.
  - Have Polyscript check the version in the cached lockfile against the Pyodide version, and invalidate the cache entry if there is a mismatch.
  - Include the Pyodide version in the cache key.
- A feature request for disabling the package loading in Polyscript and passing the package list directly to Pyodide. Ideas:
  - Accept a `pyodide_options` config key that is merged with `options`. This would have the additional benefit of allowing more customization of Pyodide, e.g. setting `fullStdLib`, `stdLibURL`, `env`.
  - Accept `package_cache: 'passthrough'`, and when this is set, set `options.packages = packages` and don't load packages.
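As a rough illustration of the two feature-request ideas, a sketch of how the options object could be assembled. Neither `pyodide_options` nor `'passthrough'` exists today; `buildPyodideOptions` and `shouldLoadPackagesInPolyscript` are hypothetical names, and letting user-supplied options override Polyscript's defaults is my assumption, not a decided behavior.

```javascript
// Build the object that would be passed to Pyodide at startup.
// `baseOptions` stands for whatever Polyscript already computes.
function buildPyodideOptions(config, baseOptions = {}) {
  const { packages = [], package_cache, pyodide_options = {} } = config;
  // Assumption: keys from the user's pyodide_options win over the defaults.
  const options = { ...baseOptions, ...pyodide_options };
  if (package_cache === "passthrough" && packages.length > 0) {
    // Let Pyodide load the packages itself during startup.
    options.packages = [...packages];
  }
  return options;
}

// With 'passthrough', Polyscript would skip its own micropip-based loading.
function shouldLoadPackagesInPolyscript(config) {
  const count = config.packages ? config.packages.length : 0;
  return config.package_cache !== "passthrough" && count > 0;
}
```

With a config such as `{ packages: ["numpy"], package_cache: "passthrough", pyodide_options: { fullStdLib: true } }`, the resulting options carry both `packages` and `fullStdLib`, and Polyscript itself would not load anything.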
I would be happy to send a PR for at least the feature request. For the bug, I'm still missing the bit about getting the Pyodide version, and how to handle package updates.