Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️ REFACTOR: Source package metadata from PyPI #201

Merged
merged 8 commits into from
Jan 17, 2022
Merged

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented Jan 15, 2022

This refactor makes it so that package data is primarily sourced from PyPI,
if a pypi_name key is present if the pip_url points to a PyPI package,
with a fallback to the repository build file (setup.json, setup.cfg, pyproject.toml).

(a) this ensures the released metadata is shown (not unreleased, development metadata),
and (b) this handles dynamic metadata, such as the version in flit and setup.cfg.

Currently, the entry points are always read from the build file,
since PyPI JSON does not include entry points,
and so we would need to download and read the wheel file to obtain them.

when the bdist_wheel is available on PyPI, it is downloaded and the entry-points extracted from it

I diffed the original plugin_metadata.json to the new, and everything looked correct,
except for one thing:
The aiida_version is null for some more packages.
This is because PyPI only extracts/stores requires_dist data, if abdist_wheel distribution is available (see https://www.python.org/dev/peps/pep-0491/).
This is essentially a bug with those packages (and we should warn them),
since all best practice build frontends (https://github.com/pypa/build, flit publish, poetry publish)
build both the sdist and the wheel.
Therefore, there is now a new report warning for this: WARNING: No bdist_wheel available for PyPI release

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jan 15, 2022

Oh and it also adds reading of PEP621 compliant pyproject.toml,
which is basically the goal here: allow us to recommend https://github.com/pypa/flit, as the best practice for pure-python packages 😄

FYI, the latest flit + pip versions now support editable builds with no setup.py 🎉 (i.e. pip install -e .), which is something that had stopped me from using it previously

(I've now implemented it in https://github.com/aiidateam/aiida-firecrest, and it's working great)

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chrisjsewell !

If I were to play devil's advocate, instead of needing to fetch just the build file from the repo, we now need to fetch both from PyPI and from the repo...

I agree, however, that it is a step in the right direction - we want to show the properties of the published packages (if available); showing updated data from the repository that conflicts with the info on PyPI is undesirable.

I just have one main change request, after this happy to merge


Currently the entry points are always read from the build,
since PyPI JSON does not include entry points,
and so we would need to download and read the wheel file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we open an issue for this then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep 👍 (see the comment I just made)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually now fixed 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great! Would you mind updating the comment as well?

plugins.json Outdated
@@ -6,49 +6,56 @@
"pip_url": "aiida-core",
"plugin_info": "https://raw.githubusercontent.com/aiidateam/aiida-core/master/setup.json",
"code_home": "https://github.com/aiidateam/aiida-core",
"documentation_url": "https://aiida-core.readthedocs.io/"
"documentation_url": "https://aiida-core.readthedocs.io/",
"pypi_name": "aiida-core"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong but I don't think this additional field is necessary.

As described in the README, the pip_url will contain either the package name or a URL to the repo.

One can argue about whether that is the best solution, but it has been in place since a while, so I suggest you simply get the package name from there.

Copy link
Member Author

@chrisjsewell chrisjsewell Jan 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep sure, will just have to add a bit of logic that decides if it is a pypi name (and so we can source metadata from there) or an external url (so we can only use the plugin_info url for metadata)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok fixed in 03f09bd (#201)

@chrisjsewell
Copy link
Member Author

If I were to play devil's advocate, instead of needing to fetch just the build file from the repo, we now need to fetch both from PyPI and from the repo...

But yeh, as you said, the repo data is not what we should be referencing; it should be the latest release.
It's a shame that the pypi json does not directly include the entry points, then we absolutely would not need to read from the repo.
You can, download/parse the released wheel file (the url is supplied in the pypi json), then you wouldn't have to touch the source. The only thing is, obviously this would incur a cost in the amount of data you have to download.
Probably leave that for a separate PR though.

@chrisjsewell chrisjsewell requested a review from ltalirz January 16, 2022 13:29
@chrisjsewell chrisjsewell changed the title ♻️ REFACTOR: Source package data from PyPI ♻️ REFACTOR: Source package metadata from PyPI Jan 16, 2022
@chrisjsewell
Copy link
Member Author

chrisjsewell commented Jan 16, 2022

If I were to play devil's advocate, instead of needing to fetch just the build file from the repo, we now need to fetch both from PyPI and from the repo...

Ok now you don't 😝

The entry points are fetched from the PyPI bdist_wheel (if available) and, in this case, downloading/reading from the repo is entirely skipped.

tested the speed, and there is really not that much difference:

$ time aiida-registry fetch --no-fetch-pypi
...
aiida-registry fetch --no-fetch-pypi  23.48s user 0.57s system 27% cpu 1:26.61 total
$ time aiida-registry fetch --fetch-pypi
...
aiida-registry fetch --fetch-pypi  26.97s user 0.92s system 25% cpu 1:47.67 total

(and obviously this does not include request caching)

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks Chris!

can you still update the comment; then it should be good to go.

I trust that you checked that the fetched metadata is at least as complete as before.


Currently the entry points are always read from the build,
since PyPI JSON does not include entry points,
and so we would need to download and read the wheel file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great! Would you mind updating the comment as well?

@chrisjsewell chrisjsewell requested a review from ltalirz January 16, 2022 17:14
@chrisjsewell
Copy link
Member Author

I trust that you checked that the fetched metadata is at least as complete as before.

yep it is indeed 👍

@chrisjsewell
Copy link
Member Author

can you still update the comment; then it should be good to go.

done

@ltalirz ltalirz merged commit d9b7ffb into master Jan 17, 2022
@ltalirz ltalirz deleted the data-from-pypi branch March 18, 2022 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants