
Publishing a package is hard. New command 'pip publish'? #60

Open
hickford opened this issue Jan 1, 2015 · 105 comments

Comments

@hickford
Contributor

hickford commented Jan 1, 2015

Even after you've written a setup.py, publishing a package to PyPI is hard. Certainly I found it confusing the first time.

The sequence of steps is complex and made stressful by all the decisions left to the user:

  • Register on PyPI (on the website or with setup.py register?)
  • Login to PyPI (with setup.py register or by writing a .pypirc?)
  • Build distributions (source? egg? windows installers? wheel?)
  • Upload distributions (with setup.py upload or with twine?)

It would be neat to have a single command pip publish analogous to npm publish that did all of this, correctly.

It would build whichever distributions are deemed fashionable (source + wheel). If you weren't logged in, it would automatically run the pip register wizard.

@beaugunderson

Huge 👍 for this.

Related, there's a pip-init command (analogous to npm init). Would be great to have a similarly easy experience with publishing!

(See this thread for details of why I came looking for this)

@daenney

daenney commented Apr 17, 2015

One of the really non-obvious things I ran into with passwords and .pypirc is that if your password contains characters like { and }, you need to double them; otherwise they get treated as string-format interpolation tokens and it blows up in interesting ways.

@beaugunderson

Oh, that is very non-obvious! The thing I was publishing when I was thinking about how hard all of this is was actually a tiny wrapper around ConfigParser that disables interpolation and implements some other niceties. I didn't have a good example of how interpolation can blow up unexpectedly, so thank you for that, @daenney!
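
As an illustration (this is my sketch, not something from the thread), here's how configparser interpolation can choke on special characters in a stored value, and how disabling interpolation sidesteps it. With the default BasicInterpolation the trigger character is %; exactly which characters bite you depends on how the tool reading .pypirc configures its parser, so treat the values below as made-up examples.

import configparser

raw = """
[pypi]
username = alice
password = s3cret%40value
"""

# With the default BasicInterpolation, "%" starts an interpolation token,
# so retrieving the value raises an InterpolationSyntaxError.
parser = configparser.ConfigParser()
parser.read_string(raw)
try:
    parser.get("pypi", "password")
except configparser.InterpolationSyntaxError as exc:
    print("interpolation blew up:", exc)

# Disabling interpolation returns the stored value verbatim.
safe = configparser.ConfigParser(interpolation=None)
safe.read_string(raw)
print(safe.get("pypi", "password"))  # -> s3cret%40value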

@glyph

glyph commented May 28, 2015

Since nobody's said it yet - always, always publish with twine. setup.py upload uploads via plain text, exposing all your users to unnecessary risk.

@ekohl

ekohl commented May 28, 2015

That's great advice which I didn't know. For the lazy people, https://pypi.python.org/pypi/twine
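
For anyone wanting the concrete commands, here's a minimal sketch (mine, not from the thread) of the build-then-upload flow being recommended, driven through subprocess purely for illustration; it assumes a setuptools-based project with a setup.py, and the shell equivalents are in the comments.

import glob
import subprocess
import sys

# python setup.py sdist bdist_wheel  -> build a source distribution and a wheel into ./dist
subprocess.run([sys.executable, "setup.py", "sdist", "bdist_wheel"], check=True)

# twine upload dist/*  -> upload over verified HTTPS; twine prompts for PyPI credentials
subprocess.run(["twine", "upload", *glob.glob("dist/*")], check=True)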

@merwok

merwok commented May 28, 2015

FTR Distutils in Python 2.7.7+ uploads using HTTPS.

@dstufft
Member

dstufft commented May 28, 2015

Not verified HTTPS; that requires 2.7.9+ (I think it'll use verified HTTPS then, though I haven't made sure of it).

@dstufft
Member

dstufft commented May 28, 2015

Also 3.4.3+ for verified TLS on 3.x.

But that doesn't matter a whole lot, because the design of distutils means that if you want to, say, upload wheel files for 2.6, 3.2, or 3.3, then you have to upload with a version of Python that doesn't verify HTTPS.

@hickford
Contributor Author

@dstufft thanks, Donald, for explaining; I hadn't appreciated why part of the solution was to create a new tool as well as fix the bug. Cool. I think this is all the more reason for a friendly and reliable pip publish command, one that would be useful to (and receive security updates for) all Python versions.

@jacobbridges

I'm 100% all for this. Python is such a beautiful language, and its use cases range from simple "proof of concept" scripts to proprietary codebases. Yet it seems to be fading in popularity when compared with the Node community. I believe npm is a primary reason Node is so popular. The numbing simplicity of creating and publishing a Node package to the internet drastically lowers the bar for innovation, allowing more people to express their ideas and contributions.

We need something like this for Python. What can I do to help?

@glyph

glyph commented Jul 13, 2015

I just realized that the only thing I've said so far on this issue is "use twine". What I should really say is: Yes. pip publish is the right solution to this problem.

@Pomax

Pomax commented Apr 5, 2017

been a while - as someone coming to pip from the Node.js + npm world where npm publish is pretty much all you need (on first run, it asks you to set up creds if there aren't any), is there any chance to revive this effort?

@axsaucedo

+1

@Pomax

Pomax commented Jun 23, 2017

Also note that coming from the Node world, there is npm init, which generates the metadata file without needing to write it yourself. A pip revision that takes its inspiration from npm and yarn in terms of ease of use would be super great.

@pdonorio

pdonorio commented Jul 5, 2017

I have been discovering the "python packaging world" the hard way in the last two months.

Frankly, as a Python teacher, I am quite disappointed by how confusing releasing your code is, compared with how straightforward most other things in Python are.

So a huge +1 for a pip publish and please tell us how to help!

@ihavenonickname

I am looking forward to this improvement.

@ncoghlan
Member

I still think tightly coupling the preferred publishing tool with the preferred package consumption tool again would be a major mistake, as the entire point of much of our work in PyPA has been to break the previous tight coupling of publishing tools and compatible download tools established by distutils and setuptools/easy_install. (Keep in mind that the tight coupling favoured by commercial publishing platform operators doesn't apply to the PSF or PyPA)

While twine itself is mainly a workaround for the limitations of the upload command in setuptools/distutils, there are also projects like @takluyver's flit or @ofek's hatch, which have their own flit publish and hatch release commands, such that introducing a new pip publish command would potentially create confusion around "What's the difference between publishing with pip and publishing with my publishing tool's publishing command?".

pip is an installer, let's keep it as an installer, and not try to morph it into an ever growing "all that and the kitchen sink" project manager. We've already seen in our attempts to incorporate virtual environment management into the packaging.python.org tutorials that doing so can significantly increase the learning curve for folks using Python for ad hoc personal scripting, and publishing capabilities fall into a similar category where they're only useful to folks writing software for publication, and completely irrelevant to folks that are only interested in local personal automation.

@njsmith
Member

njsmith commented Feb 28, 2018

I think there's a big difference between putting virtualenv into tutorials, and putting uploading into pip. With virtualenv, you're adding a new tool and set of cognitive complexity to the first workflow people read about. With pip publish, we wouldn't mention it in the intro tutorial at all, and when people do get around to reading the tutorial on publishing their own packages, it would let us remove a tool and its associated cognitive complexity. "Oh, I just use pip, I already know pip."

It's really important that we've separated installing+uploading from building. It's not clear to me what the value is in separating installing from uploading. (Besides "we've always done it that way.") What's the value add in having an ecosystem of competing upload tools? Can't we have one obvious way to upload a wheel? Is there any reason flit publish exists, except to let people skip having to learn about and install yet another tool just for this? (These are genuine questions. I guess @takluyver is the one who can answer the last.)

@takluyver
Member

Is there any reason flit publish exists, except to let people skip having to learn about and install yet another tool just for this?

That's definitely part of it. If someone has prepared a package using flit, I don't want to make them learn about twine to get it published.

There's also a difference in approach, though. I think that integrating build+upload into one command reduces the risk of mistakes where you upload the wrong files - for instance, if you make a last-minute change to the code and forget to rebuild the distributions. Other people would rather separate the steps so they can test the built distributions before uploading those precise files.

pip is an installer, let's keep it as an installer, and not try to morph it into an ever growing "all that and the kitchen sink" project manager.

I guess that the push for features like this and pip init are inspired by tools like cargo, which is one tool that can do ~everything you need to manage a typical rust project - from starting a new project to running tests to publishing a crate.

I admire how this helps make rust approachable, and I think we should keep 'tool overload' in mind when designing packaging tools and documentation for Python (*). But standardising and unifying a collection of different tools which people already use is a much bigger task than designing a unified tool on a blank canvas. I don't want to say it's impossible, and give up on a potentially valuable endeavour before it is begun, but I would expect it to take many years of wrangling with people who, if they aren't happy, can easily walk away and keep using existing tools.

(* It is of course a bit hypocritical for me to talk about tool overload after adding a new tool which serves the same purpose as existing tools.)

@pfmoore
Member

pfmoore commented Feb 28, 2018

The fact that pip wheel exists makes this a grey area - in the sense that if there were no pip wheel, it would be obvious (to me, at least) that we shouldn't have pip publish. But pip wheel is (IMO) targeted at allowing users to ensure that they can build the same wheel that pip uses, rather than being particularly targeted at a developer workflow (although it's obviously an option in that case - but questions like reusing build artifacts have very different answers depending on whether pip wheel is for an end user workflow or a developer workflow).

Personally, I do not think pip should try to cover the development workflow. Specifically, I'm against adding pip publish.

As well as pip wheel, we currently have some "convenience" features that support the development workflow (notably editable installs). But editable installs cause a lot of support issues, because they sit somewhat uncomfortably with pip's other functionality. To be honest, if we wanted to make an even clearer split between end user and developer toolsets, I'd be OK with moving editable installs out of core pip as well (but that's a completely separate discussion, and not one I think needs raising at the moment).

(I've just seen @takluyver's comment - basically I agree with pretty much everything he said).

@njsmith
Member

njsmith commented Feb 28, 2018

Oh, pip definitely shouldn't try to cover the development workflow: that would require implementing the functionality of tox, pipenv, flake8, pytest, flit, setuptools, and probably a bunch more I'm forgetting :-). Development is a complex activity that of course will require a wide variety of tools.

But none of this gives me any sense of why pip publish is a bad idea. Pip is already the tool I use to talk to pypi, to build things, and (once PEP 516 lands) to wrap build backends in a uniform interface. The point of pip publish would be to talk to pypi, and maybe wrap build backends and build things. So it feels very natural that pip might be the tool that covers these features.

Again, how does the pip/twine separation benefit users? Are there so many different ways to upload something to pypi that users benefit from a variety of options?

@Pomax

Pomax commented Dec 15, 2018

I have to say I don't understand this part:

Personally, I think it's incredibly bad practice to have software publication tools installed by default alongside a language runtime, and consider it a design mistake that the Python standard library currently includes support for the legacy distutils setup.py upload command (unfortunately, it's a major compatibility issue to get rid of it, and in the environments where folks care, they tend to just remove the entirety of the distutils module).

While I share the opinion that legacy support for something like distutils should never have happened, I don't see the connection between the fact that that support is there, and the notion that it's somehow bad practice to have the publication tools bundled with the language suite. Those are two completely separate things, and I'd like to understand what makes you think it's bad practice to offer those tools as part of the standard library.

I'd also caution against drawing parallels between pip and apt/yum. In part because they're only similar on the surface, in the sense that they might all fit the "installation managers" label while differing substantially in context, but also in large part because Python is a cross-platform language: discussions about its package manager that require drawing parallels should draw parallels to other cross-platform programming language package managers, not to OS-specific installation managers (which gets even worse in the case of apt or yum, which aren't even Linux-specific, but "only certain flavours of Linux"-specific).

So that means comparing pip, as a programming language dependency manager, to other such tools like cargo or npm. These tools of course have the benefit of being very new tools indeed, so there are lessons to be learned from the decisions they made after looking at what people want out of these tools, and what they actually get out of these tools, looking at all the languages that came before them, including how Python has handled package management. As it turns out, truly making these tools the package manager, not just the package installer (with up/downgrades just a more specific form of install), and having them be part of the default installation greatly benefits everyone.

So I'd like to understand the comments about why it would be a bad thing to (in the limited fashion I'm reading about so far) effect this kind of empowerment for users of Python. The added disk space amounts to "basically irrelevant" except for embedded systems (where no sane person would use a standard install anyway), and it sounds like the maintainers of twine are up for folding its functionality into pip, so this all sounds like great stuff. I still really hope to see a fully functional pip publish come with a near-future version of Python, ideally with an interim solution in the very next version where pip publish either tells people what to do, or asks them whether it should bootstrap everything so that, with minimal additional work, the user can get their code pushed up and available to the rest of the world for use and collaboration.

@ncoghlan
Member

ncoghlan commented Dec 16, 2018

(Note: thinking out loud in this comment so folks can see where I'm coming from in relation to this. I'll post a second comment describing a version of pip publish that would address all my technical design concerns without the UX awkwardness of pip enable-publishing)

The root of the design problem we face at the pip level actually lives at the Python interpreter level: unlike other ecosystems, we don't make a clear distinction between "development environments" and "runtime environments".

C/C++:

  • runtime is just libc
  • distribution is of the built binaries, not the original source code
  • development environments add a compiler, debugger, etc (accessed either as CLI tools or via an IDE)

Rust:

  • essentially the same set up as C/C++, except with proper structured dependency management in cargo

Java:

  • runtime is just a JVM bytecode interpreter
  • distribution is of the built JAR and WAR files, not the original source code
  • development environments add javac, maven, gradle, etc (accessed either as CLI tools or via an IDE)

JavaScript:

  • runtime is Node.js or a browser JS engine with a built-in source compiler
  • distribution is of either the output of a JS build pipeline, or else of a directory with node_modules embedded in it
  • browser debuggers live in the client browser as an optional add-on
  • Node.js debuggers live outside the interpreter and communicate over a defined WebSocket protocol

Python:

  • runtime is, for the purpose we're considering here, CPython, as PyPy typically doesn't get installed by beginners, and a lot of pip-installable things won't work out of the box with MicroPython anyway
  • CPython not only comes with a source compiler built-in, it also comes with a native debugger, a legacy build management system that we're looking to deprecate (distutils: Making setuptools the reference implementation of the distutils API #127), and a modern package installer that's more deliberately designed to be optional (pip/ensurepip).
  • distribution is a wild and wooly mess of different technologies adopted at different times (https://packaging.python.org/overview/)

So, in writing that out, I think my main concern is actually a factoring problem, in that I'd like users to be able to easily decide for themselves whether they want to configure a system as:

  1. A Python runtime system: no pip, no wheel, no twine, no setuptools, no distutils (the first 3 of those are readily achievable today, the latter two are still a work in progress)
  2. A Python application build & deployment system: able to consume from Python packaging repositories, but not set up to publish back to them (this is all pipenv needs, for example, along with any other pipeline that converts Python packages to a different packaging ecosystem, whether that's a Linux distro, conda, etc)
  3. A Python library build & deployment system: both consumes from and publishes back to Python packaging repositories

Point 1 is handled at the standard library level with ensurepip (once we figure out the thorny mess of having the distutils API be provided by setuptools instead of the standard library)

That means it's only points 2 & 3 that impact the design of a pip publish command. Saying "we don't care about the concerns of folks that want to minimise the attack surface of their build pipelines" is certainly an option, but I don't think it's a choice that needs to be made (hence the design in the next comment).

@ncoghlan
Member

ncoghlan commented Dec 16, 2018

I realised there's a way of tackling pip publish that would actually address all my design concerns:

  1. pip would declare a publish extra, such that running pip install --upgrade pip[publish] instead of pip install --upgrade pip installed any extra dependencies needed to make pip publish work. (Declaring an extra this way covers points 2 & 3 in my previous comment; a sketch of such a declaration follows this list.)
  2. pip publish would be implemented using the in-progress Twine API @sigmavirus24 mentioned earlier in this thread (and presumably influence the design of that API)
  3. pip publish would prompt to auto-install the pip[publish] extra if it found any of its import dependencies missing
  4. With that approach, the PyPI registration helper functionality would likely make the most sense as an addition to twine, so it could be iterated on outside the pip release cycle
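
For concreteness, here's a minimal sketch (mine, not ncoghlan's) of what declaring such an extra could look like in a setuptools-based setup.py; the dependency list and version pin are illustrative assumptions, not pip's actual metadata.

# Hypothetical setup.py fragment -- not pip's real packaging metadata.
from setuptools import setup

setup(
    name="pip",
    # ... the rest of the project's metadata ...
    extras_require={
        # Installed via:  pip install --upgrade "pip[publish]"
        "publish": ["twine>=1.12"],
    },
)

With an extra like that, a plain pip install keeps the installer lean, while opting in to pip[publish] pulls in the upload dependencies at the same time as you set up credentials.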

From an end-user perspective, that would all end up looking like an "on first use" configuration experience for the pip publish command (which is always going to exist due to the need to set up PyPI credentials on the machine).

From a maintenance perspective, while the existence of twine as a support library would become a hidden implementation detail, the twine maintainers would likely still need to become pip maintainers as well, so they can handle pip publish issue reports, and bump the minimum twine version requirement as needed (similar to the way @dstufft originally became a CPython core dev primarily to handle updating the bundled pip to new versions).

As an added bonus, all the documentation about extras would gain a concrete example that it can point to: pip[publish] :)

@njsmith
Member

njsmith commented Dec 16, 2018

Is attack surface your main concern? Because python already ships with multiple ways to make arbitrary HTTP requests, and doesn't ship with any PyPI credentials. So I'm having trouble seeing how having twine available would increase attack surface in a meaningful way? What's your threat model?

@ncoghlan
Member

ncoghlan commented Dec 16, 2018

@njsmith Every other part of pip can interact with PyPI anonymously, but upload needs the ability to work with the user's PyPI credentials.

Not putting the credentials on the machine in the first place is obviously the primary defence against compromise, but if the code is useless without credentials, why have it there, instead of designing the tooling to add the dependencies at the same time as you add the credentials?

Keeping the dependencies separate also means that if a CVE is raised against the way twine accesses the system keyring, or the way it interacts with a user's account on PyPI, then it's only a vulnerability on systems that have twine installed, not on all systems that have pip installed. (A future version of pip would presumably raise the minimum required version of twine to a version without the vulnerability, but that would be a matter of dependency management hygiene, rather than urgent CVE response)

That said, laying out the considerations as I did above means I now think most of the cases where this kind of concern really matters will be ones where the feature to be removed from the deployment environment is the entire build and installation toolchain, and that's already possible by doing pip uninstall pip wheel setuptools once the target venv has been set up (getting rid of distutils is more difficult, but still possible).
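
To make that concrete, here is a minimal sketch (mine, not from the comment) of building a runtime-only virtual environment and then stripping the installer toolchain out of it; "some-application" is a placeholder package name and the bin/ path assumes a POSIX layout.

import subprocess
import venv

# Create the deployment venv with pip available for the initial install.
venv.create("deploy-env", with_pip=True)
pip = "deploy-env/bin/pip"  # deploy-env\Scripts\pip.exe on Windows

# Install the application's runtime dependencies...
subprocess.run([pip, "install", "some-application"], check=True)

# ...then remove the build/installation toolchain so the venv is runtime-only.
subprocess.run([pip, "uninstall", "--yes", "pip", "wheel", "setuptools"], check=True)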

So while I think the "extra"-based approach would be architecturally clearer (i.e. pip primarily remains an installation tool, but has some core publication functionality that relies on some optional dependencies), I don't think having it baked into the default install would create any unsolvable problems - worst case is that it would just give some folks an increased incentive to figure out how to remove pip from their deployment artifacts entirely, and there might end up being some future CVEs that impact more Python installs than they otherwise would have.

@pradyunsg
Member

pradyunsg commented Dec 17, 2018

I like @ncoghlan's idea -- have a pip command that's providing (optional) upload functionality, implemented using twine's public API, with an extra in pip to install the dependencies for it. :)

@ArjunDandagi

This comment has been minimized.

@Pomax

Pomax commented Sep 26, 2020

It's been over 5 years since this issue got filed, and almost 2 years since the discussion died down and nothing happened. However, the entire world would still benefit from being able to type pip publish, because publishing a package is still ridiculously hard in this ecosystem.

Just pick an approach, implement it, and then iterate on refining or even wholesale changing that implementation as the command sees adoption. As long as pip publish works at all, improving how it works can be a rolling target.

@astrojuanlu

  1. If nobody has complained about this in 2 years maybe it's not that crucial.
  2. That last comment left 2 years ago actually expresses agreement on a way forward. What about sending a pull request instead of demanding free labour?
  3. In 2020 neither flit publish nor twine upload are "ridiculously hard" by any standards, and if they are perceived as such it's a documentation issue, not a tooling issue.

@hoechenberger

hoechenberger commented Sep 26, 2020

@astrojuanlu

  1. If nobody has complained about this in 2 years maybe it's not that crucial.

Why should one constantly add complaints if there's an issue open already? I guess only few would agree that Python packaging tooling is a pleasant thing to use. Besides, there are complaints now, and you're complaining about those. So maybe you should make up your mind on this matter.

  3. In 2020 neither flit publish nor twine upload are "ridiculously hard" by any standards, and if they are perceived as such it's a documentation issue, not a tooling issue.

Oh come on, the tooling is really not great compared to what we're seeing e.g. with NPM. Nobody's saying that the pip / PyPA team hasn't been doing an amazing job, but in comparison to other ecosystems, Python is just so far behind.

@pfmoore
Member

pfmoore commented Sep 26, 2020

Oh come on, the tooling is really not great compared to what we're seeing e.g. with NPM. Nobody's saying that the pip / PyPA team hasn't been doing an amazing job, but in comparison to other ecosystems, Python is just so far behind.

How many people work on and support npm? Wikipedia says "The company behind the npm software is npm, Inc, based in Oakland, California. [...] GitHub announced in March 2020 it is acquiring npm, Inc". The pip development team consists in total of about 5 people, all of whom only work on pip in their spare time. Frankly, I'd hope npm would be better than pip, with that level of disparity in development resource...

@layday
Member

layday commented Sep 26, 2020

Most of the work in the Python packaging space appears to be - with the sole exception of the new dependency resolver - unfunded and is carried out by volunteers in their free time. npm was VC-funded as early as 2013 and is now maintained by GitHub.

Edit: heh, we posted almost the exact same thing at the exact same time.

@hoechenberger

Yes, I don't challenge that. This is a totally acceptable explanation for why Python packaging is in such bad shape. But one should still acknowledge that Python packaging is not great by any standards. Why that is the case is a different question. I'm thankful for the work people have put into the existing ecosystem either way, but this doesn't mean one cannot dislike or criticize it.

@pfmoore
Member

pfmoore commented Sep 27, 2020

That's a very absolute statement. There are certainly some standards by which Python packaging is fine:

  • It's fine for something which is used by millions and maintained by less than 10 volunteers
  • It's fine compared to the state it was in 10 years ago
  • More personally, it's fine for what I use it for

Progress is slow. But it's not non-existent. And there are reasons why it's slow. People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.

this doesn't mean one cannot dislike or criticize it

However, finding ways to express such a dissatisfaction without implying some level of failure on the part of the people who voluntarily give their time to the work, is very hard. And people typically don't make any effort to do that, but simply throw out criticisms, and then follow up with "yes, but I appreciate the work people have done, I just dislike the result".

And furthermore, how is complaining and criticising without offering any help, productive? If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome. But just out of the blue commenting that "this sucks" isn't really much help in moving the issue forward.

Never mind. I don't want to spend my Sunday worrying about explaining this to people. I'll go and find something more enjoyable to do. (And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion).

@pradyunsg
Member

And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion

+1

The fact that this was the first notification/issue thread I've read on this Sunday, is directly the cause of why I'm not spending any more time today to work on pip.

@hoechenberger

hoechenberger commented Sep 27, 2020

@pfmoore

Progress is slow. But it's not non-existent.

Nobody said that.

People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.

this doesn't mean one cannot dislike or criticize it

However, finding ways to express such a dissatisfaction without implying some level of failure on the part of the people who voluntarily give their time to the work, is very hard. And people typically don't make any effort to do that, but simply throw out criticisms, and then follow up with "yes, but I appreciate the work people have done, I just dislike the result".

I can very much empathize, I've been in your shoes before, many times. Maybe to clarify once more: I greatly appreciate the work and effort that people have put into PyPA and pip. But I think it's not okay to simply deny there are still many issues to be resolved when there clearly are issues. Because my impression was that this is exactly what was happening in response to @ArjunDandagi's and @Pomax's comments (and is the only reason why I joined the discussion)

And furthermore, how is complaining and criticising without offering any help, productive? If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome. But just out of the blue commenting that "this sucks" isn't really much help in moving the issue forward.

First off, I never said "it sucks".
Secondly, I believe it's a mistake to only allow criticism if the one criticising has a solution for their problem right at hand. One must be able to express dissatisfaction even if one doesn't know how to resolve the problem.

Never mind. I don't want to spend my Sunday worrying about explaining this to people. I'll go and find something more enjoyable to do. (And if that means I don't work on pip today, that's a good example of the consequences of this sort of discussion).

@pradyunsg

+1

The fact that this was the first notification/issue thread I've read on this Sunday, is directly the cause of why I'm not spending any more time today to work on pip.

You can spend your time however you want to. Nobody's forcing you to do anything.

@takluyver
Member

If you were to submit a PR implementing a pip publish command that took the discussion so far into account, your views would be much more relevant and welcome.

Just to add, as a maintainer of various open source projects (not pip), a PR like this is probably not as helpful as it initially sounds. If you're not familiar with the internals of a project, your first attempt at writing a significant new feature is likely to need a lot of work, and therefore take up a lot of reviewers' time. It can also cost a lot of mental & emotional energy to explain to a well-intentioned contributor that the changes they've spent hours or days on are not going to be merged, and at least for me, this really drains my enthusiasm to work on a project.

So, before contributing pip publish (or any other significant change to an open source project), it's a good idea to work out:

  • Is this something the project maintainers want to add? (E.g. the pip maintainers may say pip is for installing packages, not creating & publishing them - though I get the impression they're actually somewhat open to doing both)
  • Has anyone else attempted this? Where did they get stuck? Is their code a good starting point?
  • How do the maintainers envisage it working & fitting in to the project? Where would they start if they were implementing it? What's a sensible first step to submit and have reviewed?

@hoechenberger

I believe it's a mistake to only allow criticism if the one criticising has a solution for their problem right at hand

You are allowed to criticise. @pfmoore suggested that it was not productive for you to do so. It looks like you've contributed to dissuading two maintainers from spending time on pip today, so I'd have to agree with him.

The issue is that criticising Python packaging has been done to death for years. Anyone involved in Python packaging knows there are still plenty of warts and areas for improvement. So another round of "why isn't this fixed yet?" without engaging with the details of the discussion is not actually driving anything forwards.

I will endeavour to resist the urge to reply again for at least the rest of the day.

@nicoddemus

Progress is slow. But it's not non-existent. And there are reasons why it's slow. People complaining that the volunteer labour "doesn't get things done faster" is one of the reasons it's slow, because it discourages and burns out the people whose freely given efforts are being dismissed as insufficient. I speak from experience here, as I know I'd do far more on pip if I didn't have to review so many issues that left me feeling demotivated.

As a maintainer of a project used by many (pytest), I definitely concur with this statement.

@merwok

merwok commented Sep 27, 2020

It would really help if people made concrete notes about what is not good in Python tools, and what is great in other tools.

@Pomax

Pomax commented Sep 29, 2020

That would be the comment that started this thread. In npm land, which is probably the best publishing experience, there is one tool, and just one tool, and the mirror of the original sequence of steps is:

  • Register on NPM: this has to be done through the website
  • Login to NPM: one-time npm login, until you uninstall NPM or reformat your machine or the like
  • Build distributions: there is no true "publishing-specific" building of distributions. Instead:
    • there is a one-time npm init to set up the publishing metadata, with a guided CLI experience that asks you for all the information required
    • there are dedicated npm version major, npm version minor, and npm version patch commands to make "sticking to semver" as easy as possible for maintainers, which makes it possible for people who use npm packages to trust that patch/minor changes don't break a codebase when uplifting (at least, to the degree where any package that breaks that trust is a genuine surprise).
    • there is no separate building/packing step, you just need to make sure your current state is synced to your repo (code + tags).
  • Upload distributions: npm publish. This command packs up your local files, with optional exclusions through either .gitignore or .npmignore, but that archive only exists in memory, for as long as it needs in order to be uploaded.

This is essentially frictionless, through a single tool. Yes, a competitor called "yarn" was written to address npm's slowness, but they quite wisely decided to make it work in exactly the same way, so if you're coming to Python from the Node ecosystem at least (or if you're a long-time user of Python and you started working with Node), you are effectively spoiled with an excellent publishing flow and tooling for that.

There were discussions around having pip "wrap" other tools, so that it could at least act as a front end for the publishing workflow and people would only need the one command: that would still be amazing. Sure, it would end up preferring one tool over another, but that's fine: folks who don't want to have to care, don't have to care, and folks who do care don't need to change their publishing flow and can keep using their preferred tools for (part of) the release chain as before.

@pfmoore
Member

pfmoore commented Sep 29, 2020

There were discussions around having pip "wrap" other tools, so that it could at least act as a front end for the publishing workflow and people would only need the one command: that would still be amazing.

One suggestion - not intended as deflection, but as a genuine way for community members to help explore the design and maybe tease out some of the inevitable difficulties in integrating existing tools in such a front end. Maybe someone could build a tool that acted as nothing but that front end - providing the sort of user interface and workflow that node users find so attractive (I've never used node myself, so I have no feel for how npm "feels" in practice), while simply calling existing tools such as pip, twine etc, to deliver the actual functionality.

If we had such a frontend - even in the form of a working prototype - it would be a lot easier to iterate on the design, and then, once the structure of the UI has been sorted out to work in the context of Python, we could look at how (or maybe if) we would integrate the command structure into pip or whatever.

@layday
Member

layday commented Sep 29, 2020

I think it is important to recognise that these complaints pertain to setuptools. Working with Flit and Poetry, which provide their own CLI, is not unlike working with npm. The addition of a pip publish command will not meaningfully improve the situation with setuptools - not least because pip does not have an accompanying build command (there is a wheel command but that only builds.... wheels) and neither does setuptools (the setuptools CLI is deprecated and slated for removal). There is work being done in this area but it is slow both for historical reasons and for lack of resources. python-build is a generic build tool which - as I understand it - will be adopted by pip once stable and blessed by setuptools. There has been discussion on improving the setuptools documentation and on adopting pyproject.toml. There are several PEPs under discussion which seek to standardise on how to specify package metadata in pyproject.toml. These are all things that open up new possibilities for pip and for other tools, like the hypothetical integrated frontend that @pfmoore has mentioned above.
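
For reference, the generic build tool mentioned above is published on PyPI as "build" and invoked as python -m build; a minimal sketch (mine), assuming the package is installed and the current directory contains a project with PEP 517 metadata:

import subprocess
import sys

# Equivalent to running `python -m build` in the project directory:
# builds an sdist and a wheel into ./dist using the project's declared build backend.
subprocess.run([sys.executable, "-m", "build"], check=True)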

@Pomax

Pomax commented Sep 29, 2020

I think it's probably also worth noting that a (small?) elephant in the room is that if you're coming to Python from another language, or even if it's your first exposure, you get told by nearly everyone that "how you install things" is through pip. So even if in reality it's just one of many ways to install/publish packages, and some of the alternatives are quite streamlined, that's not what people are being taught to think of pip as. It's essentially "python and pip" in the same vein as "node and npm" or "rust and cargo" etc. That's not something anyone can even pretend to have any control over at this point, of course, but it's a strong factor in what people new to Python, or even folks "familiar enough with python to use it on a daily basis alongside other languages" have been implicitly conditioned to expect from pip.

Having someone write a "unified" CLI proof of concept tool sounds like a great idea, and I'd be more than happy to provide input around the "npm experience" (having published a fair number of packages there), although I would not consider myself familiar enough with the various tools (or with enough time to deep-dive) to write that PoC myself.

@pganssle
Member

We had basically these same arguments about adding a pip build command. I am personally way more in favor of @pfmoore's idea (and it's one I've suggested before) of having a top-level tool that wrangles all the various other tools for you. There's a bunch of complications that come with cramming a bunch of different tools (publisher, builder, installer) into a single common tool.

For example, the motivation behind the unix philosophy does-one-thing-well tools for building distributions and installing wheels is that many downstream distributors feel the need to bootstrap their whole builds from source, and it's a pain in the ass to bootstrap a swiss army knife monolith like pip compared to small, targeted tools.

I also think that it's easy to look at cargo and npm and think that they have everything figured out, but these big monolithic tools can be a real problem when, because of poor separation of concerns, nominally unrelated aspects of them become tightly coupled. I know a few places where we've had problems because cargo didn't (doesn't?) support any endpoint other than crates.io, and in general they are also still in the early phase, when there's a lot to build out and they're not necessarily at the point where they're suddenly constrained from making any big changes.

I'm not saying that those ecosystems and all-in-one tools are worse than what we have or even that there's no benefits to them, but in the past we had an all-in-one tool for this: distutils. setup.py was extensible, had a bunch of built-in stuff for building, testing, installation, etc. Over time bitrot, tight coupling and poorly defined interfaces have made it mostly a minefield, and on top of that it's just fundamentally incompatible with how software development works these days.

I think a bunch of individual tools with one or more wrapper CLIs for various purposes makes a lot of those problems much more tractable in the long term, and might help the people clamoring for a "single endpoint".

@pfmoore
Member

pfmoore commented Sep 29, 2020

Having someone write a "unified" CLI proof of concept tool sounds like a great idea, and I'd be more than happy to provide input around the "npm experience" (having published a fair number of packages there), although I would not consider myself familiar enough with the various tools (or with enough time to deep-dive) to write that PoC myself.

To be honest, no-one had that sort of familiarity with the tools/ecosystem when they started. Why not just write a gross hack and see how things develop from there?

mypip.py

import subprocess
import sys

if __name__ == "__main__":
    if sys.argv[1] == "publish":
        # Hand "publish" off to twine, forwarding any remaining arguments
        subprocess.run(["twine", "upload"] + sys.argv[2:])
    else:
        # Everything else passes straight through to pip
        subprocess.run(["pip"] + sys.argv[1:])

In all seriousness, that's the bare bones of a utility that adds a "publish" command to pip's CLI. Clearly, there's a lot of work to make even a prototype out of this, but if you started with that and actually used it, and improved it as you hit annoyances/rough edges, you'd pretty soon end up with something worth sharing. Most of the tools I've ever written started out like this.

(I'm not trying to insist that you do this - just pointing out that "I don't know enough" is actually far less of a hurdle than people fear).

@Pomax

Pomax commented Sep 29, 2020

No worries - for me personally, it's not just "I don't know enough" - I've done deep dives before, and they're usually fun - but it's very much also "and I don't have any free time for the next decade" because of the number of commitments I already have.

@duaneking

The sequence of steps is complex and made stressful by all the decisions left to the user:

That link gives a 404.

@Pomax

Pomax commented Jun 20, 2024

It sure does, but it's not exactly hard to just go to packaging.python.org and find the current page, which is https://packaging.python.org/en/latest/guides/distributing-packages-using-setuptools/

@ofek

ofek commented Jun 20, 2024

Consider using a workflow tool instead, like Hatch, to simplify development: https://hatch.pypa.io/latest/publish/

@henryiii
Contributor

I updated the link.

You can also use pipx run twine these days to avoid installing things manually.
