-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch API package management from Pipenv to PDM #4107
Conversation
Full-stack documentation: https://docs.openverse.org/_preview/4107 Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again. You can check the GitHub pages deployment action list to see the current status of the deployments. New files ➕: Changed files 🔄: |
I'm testing this out locally and tried the instructions - I made some changes to |
It should have reflected immediately because the |
Works for me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so exciting! Really looking forward to a world without pipenv, however we get there.
@AetherUnbound I undrafted it because I felt quite confident that PDM would be our pick and also so I could also receive feedback on the actual implementation of the migration like that from @sarayourfriend regarding using an Now that #4128 is merged, feel free to review this PR as deeply as you like. Thanks! Also opened a discussion-type issue #4165 so that we can finalise what to name and where to place our Python packages directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I have a few comments/questions, and per the discussion in #4165, I think we should have packages/python/
instead of py-packages/
at the top level (to mirror what we have in automation/
).
api/Dockerfile
Outdated
# Copy subpackages from additional build-context 'packages' | ||
COPY --from=packages openverse-attribution ./py-packages/openverse-attribution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder here if there's a way for us to "wildcard" this 🤔 As in, not have to be explicit about every package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to make this a wildcard, I only didn't do that because we have only one package right now (and that too, is more of a PoC than a real library worth distributing).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Best not to wildcard it either, if we can avoid it; it would be nice for it to be "automatic" if new packages are added (I'm sure they will be, eh?) but we can look into ways to derive it from the pyproject.toml rather than a catch-all wild card. There might eventually be dev dependencies in the packages, or packages only used by catalog and ingestion worker (for example) that we would not want copied into the context of the API container (and vice-versa re ingestion workers).
dev = [ | ||
"ipython >=8.22.1, <9", | ||
"pgcli >=3.5.0, <4", | ||
"remote-pdb >=2.1.0, <3", | ||
] | ||
test = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a functional difference between dev
and test
here, or is the separation only semantic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows us to selectively build smaller and smaller images from dev (all deps) to CI (main + test) to prod (main only).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! I hadn't seen the test install happening anywhere, so I wasn't sure.
py-packages/openverse-attribution/src/openverse_attribution/attribution.py
Outdated
Show resolved
Hide resolved
py-packages/openverse-attribution/src/openverse_attribution/attribution.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Can't wait to see the PRs that Renovate will open.
I tried changing the attribution package code, and it immediately showed up in the API response.
My request for change is to add the testing of the openverse-attribution
to the CI tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the work on selecting the package manager and implementing the changes, @dhruvkb . Makes sense to split the changes into several PRs.
Can we create a milestone for all the pdm-related accompanying issues?
I love this, Madison! 💯 |
@sarayourfriend I'll update #4165 to indicate that general support for this naming scheme. Also I'm preparing for Python packages to go in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exciting! The transition looks smoother than I anticipated ✨ I haven't reviewed all the PR, but I noticed some things I want to comment on.
Also, I updated the testing instructions since they mentioned an abbreviated name for the new Python package and were missing an installation step.
|
||
This library is a part of the Openverse project. For more information, refer to | ||
the | ||
[documentation](https://docs.openverse.org/packages/ov_attribution/index.html). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[documentation](https://docs.openverse.org/packages/ov_attribution/index.html). | |
[documentation](https://docs.openverse.org/packages/openverse_attribution/index.html). |
# These commands use standard GNU tools instead of `pdm` because we do not | ||
# require PDM on the host machine for API development. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This conflicts with the testing instructions for the openverse-attribution
package. Maybe there are alternative instructions using docker containers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement still holds. If one is working on the API (and even if they are making changes to the attribution package), they do not need PDM on the host (because the package will be mounted into the API container directly and be usable via the API).
PDM is only needed on the host if one is purely working outside the context of the API, like the attribution package directly.
@@ -0,0 +1,65 @@ | |||
[project] | |||
name = "api" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: more specific would be...
name = "api" | |
name = "openverse-api" |
or
name = "api" | |
name = "openverse_api" |
FWIW, this isn't the official recommendation from PDM, and I think I've run into weird virtual environment issues sometimes when using the pipx approach that I never had happen when using the project's own recommendation: https://pdm-project.org/en/latest/#recommended-installation-method If you're not having issues, no worried, but if anyone does run into things, I was able to get everything working without a hitch by avoiding pipx for this. PDM's installer isolates itself and pdm is self-updating, so pipx's specific benefits are elusive (other than being consistent for one's own personal preferences, which I appreciate can matter very much). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGMT, Dhruv, keeping in mind my opinion that implementation details of the openverse-attribution
package are out of scope for this PR and should be pushed to a separate issue dedicated that package.
I'm approving this personally now despite the py-packages
directory, because I don't think it's strictly necessary to block this PR on the location of those files, and it'll be trivial to move them after the related JS package PR is merged.
One thing that could be nice is to have a fast-follow to implement a just recipe like the p
recipe, that runs pdm
for all Python packages, optionally following a pattern. If we don't use pdm workspaces, I suppose we can just use find
to identify pyproject.toml
files and use PDM's -P
flag to point to them and run the command in each one. To that end, we should have consistent script names as well (cf. test:unit
, types
for the JS packages).
That, to me, is a nice to have in a follow up issue though, and not necessary for this first PR, which is already interesting enough as it is. I agree with many of the things others have brought up as potential avenues to explore in follow up issues, but do not believe any to be blockers for this specific PR, which we'll want to get out and able to build upon for the rest of the -> PDM migration. Feel free to correct me @krysal and @AetherUnbound, just leave a request changes, but in the spirit of expedient decision making, it'd be great to be able to move forward with this without accidentally expanding the scope, especially for anything that don't have lasting harmful implications for the project or its team members (cf. the definition of "blocker" in our decision making process.
Not trying to override anyone else's intuition, just want to encourage making sure we carefully evaluate and respect a defined scope for the PR.
api/Dockerfile
Outdated
# Copy subpackages from additional build-context 'packages' | ||
COPY --from=packages openverse-attribution ./py-packages/openverse-attribution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Best not to wildcard it either, if we can avoid it; it would be nice for it to be "automatic" if new packages are added (I'm sure they will be, eh?) but we can look into ways to derive it from the pyproject.toml rather than a catch-all wild card. There might eventually be dev dependencies in the packages, or packages only used by catalog and ingestion worker (for example) that we would not want copied into the context of the API container (and vice-versa re ingestion workers).
api/Dockerfile
Outdated
@@ -96,6 +98,7 @@ RUN useradd --create-home opener \ | |||
USER opener | |||
|
|||
# Copy code into the final image | |||
COPY --chown=opener --from=packages openverse-attribution /py-packages/openverse-attribution/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary, I thought, because it should be copied into the .venv
for production (non-editable). Unless this is explicitly to support the development-mode of this target? I wish it were possible to conditionally copy, if that is the reason.
Bit of a nit, though, we can iterate on this without issue.
Edit (after reading the compose file again): Even in development mode, it shouldn't be necessary, because we map the python packages directory in the compose file, so the editable path reference should still be valid.
@@ -8,12 +8,12 @@ | |||
from django.utils.html import format_html | |||
|
|||
from elasticsearch import Elasticsearch, NotFoundError | |||
from openverse_attribution.attribution import get_attribution_text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've gone back and forth on the name, but it ocurred to me this doesn't even need openverse specified in the package name. Nothing about it is actually Openverse specific. It is, however, CC-specific! cc_attribution
would be a good name for it.
Just a nit though, don't bother spending time on this, we can bikeshed the name if we ever get around to publishing it.
@@ -0,0 +1,79 @@ | |||
# `openverse-attribution` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, occurring to me that it'd be important for package README's to be in the package itself, maybe just copied to documentation during the build step, otherwise it isn't available for inclusion in the build output of the package. npm and pypi require a README (twine check
will fail on build outputs that don't have it, and you won't be allowed to upload it).
Not an issue for this PR, but something we will need to address in the future when we publish packages from the packages directories.
py-packages/openverse-attribution/src/openverse_attribution/attribution.py
Outdated
Show resolved
Hide resolved
py-packages/openverse-attribution/src/openverse_attribution/attribution.py
Outdated
Show resolved
Hide resolved
Since these changes are working and the PR's core ideas have 2 approvals, I will merge this PR. Not dismissing any of the comments from the reviews, I'll make separate issues for each of them and add them to the milestone. |
Related
Related to #286. Additional background can be read in this comment.
Description
This PR replaces Pipenv with PDM. While PDM has monorepo support, we have opted not to use this feature in this PR. Instead we will use a path dependency to depend on subpackages located at
packages/
in the monorepo. To see this in action, the license and attribution related functionality in the API has been separated into its own package atpackages/ov-attribution
.The
ov-attribution
subpackage is listed as a dependency as well as an editable dev-dependency of the API. This means that in dev mode, the package will not actually be copied into the virtualenv but rather only point to../packages/ov-attribution
. When installed with--no-editable
it will be copied into the virtualenv. Thanks @sarayourfriend for suggesting this and making usage of PDM possible.Testing Instructions
pipx
approach, I am too.just dc build web
.py-packages/openverse-attribution
to see the change immediately in the API.openverse-attribution
with:just api/test
.Checklist
Update index.md
).main
) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin