
@Yashp002

Fixes #13559

Adds a brief privacy notice to the root documentation page (docs/html/index.md) as suggested by @ichard26.

The notice is kept short and includes:

  • What data pip collects (anonymized usage data)
  • What is NOT collected (personally identifiable information)
  • A link to the full privacy policy

This addresses the need for transparency regarding pip's data collection practices.


## Privacy Notice

pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
Member

Suggested change
- pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
+ Pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and

@Yashp002
Author

@pfmoore Sorry, I overlooked your suggestion before recommitting. Could you bring it up again?


## Privacy Notice

pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
Member

Suggested change
- pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
+ Pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and

Repeated as requested.

I was surprised that GitHub didn't simply carry this suggestion forward. As far as I can see, you didn't force-push or do anything else that would have made the original suggestion unmergeable. Weird.

Member

Oh wait, I see - you removed a trailing space.

Member

This still hasn't been incorporated. Sentences should start with a capital letter.

Co-authored-by: Paul Moore <p.f.moore@gmail.com>
pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
user experience. No personally identifiable information is collected. For more details, see pip's [Privacy Policy](https://www.
pypa.io/privacy/).
Member

Also, it seems strange to line break in the middle of a URL. Even if it works, can you break the line somewhere else?

Author

yep sure :)

Author

God, there seem to be pre-commit issues. Please stand by for a minute.

@Yashp002
Author

Fixed. Moved the URL onto a single line to avoid the line break.

Member

@ichard26 left a comment


Sorry, I should've been clearer. I don't think we should be implying that the pip project is directly collecting any telemetry.[1] What pip does is send telemetry through the User-Agent HTTP header, and then it's up to the remote index to store or process it if it wishes to. In other words, I'd like something along the lines of #13559 (comment).

Also, the PyPA doesn't have an official privacy policy. I'm not sure where that link came from, but it returns a 404.

Footnotes

  1. Given the related issue was raised because corporate environments are sometimes put off by the linehaul service, we want to make it clear that if they use their private indices, there are zero privacy implications.
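
To make the mechanism described above concrete, here is a minimal sketch of a client that puts environment details into a User-Agent header and of a remote index that decodes them. This is not pip's actual implementation: the function names, the `example-installer` name, and the JSON field layout are all assumptions made for illustration.

```python
import json
import platform


def build_user_agent(tool_name: str = "example-installer",
                     tool_version: str = "0.0.0") -> str:
    """Build a User-Agent string carrying non-identifying environment info.

    Hypothetical format: "<name>/<version> <compact JSON>". The field names
    are illustrative only and are not taken from pip's source.
    """
    environment = {
        "installer": {"name": tool_name, "version": tool_version},
        "python": platform.python_version(),
        "implementation": {"name": platform.python_implementation()},
        "system": {"name": platform.system(), "release": platform.release()},
        "cpu": platform.machine(),
    }
    payload = json.dumps(environment, separators=(",", ":"), sort_keys=True)
    return f"{tool_name}/{tool_version} {payload}"


def parse_user_agent(header: str) -> dict:
    """Decode the JSON part of the header, as a remote index could choose to do."""
    # Everything after the first space is the JSON payload.
    _, _, payload = header.partition(" ")
    return json.loads(payload)
```

The client only transmits the header with each request. Whether anything is stored depends entirely on what the receiving index does with the result of something like `parse_user_agent`.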

## Privacy Notice

<<<<<<< HEAD
pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
Member

@notatallshaw commented Nov 16, 2025


> collects

"Collects" implies pip is storing data somewhere. It is not; it transmits some environment information to the remote index, which may or may not collect that data.

> anonymized

"Anonymized" is usually the term used for removing or obfuscating personally identifying data. Pip isn't removing or obfuscating data, so I don't think it's the correct term.

> (pip version, Python version, and command success/failure)

This is not a complete list, so it should either be complete or make it clear that it's not complete, e.g. by adding an "etc."

I also don't think it's an accurate list. Is "command success/failure" true? How does that even work? Pip doesn't know if a command will succeed or fail until after the HTTP request.

> improve reliability and user experience

Is either of these a stated goal of linehaul?

<<<<<<< HEAD
pip collects anonymized usage data (pip version, Python version, and command success/failure) to help improve reliability and
user experience. No personally identifiable information is collected.
For more details, see pip's [Privacy Policy](https://www.pypa.io/privacy/).
Member

> pip's Privacy Policy.

This isn't a real thing. Even if it were, it wouldn't be pip's privacy policy; it would be PyPA's.

As a side note, if you are using AI to generate the language here, it is your responsibility to validate the accuracy of the language before submitting the PR.

@ichard26
Member

ichard26 commented Nov 16, 2025

The pip project does not collect any telemetry, however, pip will send non-identifying
environment information (Python version, OS, etc.) to any remote indices used, who may
choose to retain such information. Please consult PyPI's privacy policy for their
data collection and retention practices.

How about something like this? The wording needs work, but the general idea is there.

Unfortunately, PyPI's privacy policy does not mention the linehaul service at all. That seems like a glaring omission.


## Privacy Notice

<<<<<<< HEAD
Member

You've managed to somehow leave a merge conflict marker in the file.

@pfmoore
Member

pfmoore commented Nov 16, 2025

> How about something like this?

That seems like much better wording.

@Yashp002
Author

@pfmoore @ichard26
I've updated the notice to clarify that:

  • Pip itself doesn't collect or store any data
  • Environment info is only transmitted to remote indices via User-Agent headers
  • It's the remote index (like PyPI) that may retain the information

I've also removed the link. I seem to have followed an unverified link, my bad. Let me know if the wording needs any further improvement :)


## Privacy Notice

The pip project does not collect any telemetry, however, pip will send
Member

I feel that the phrasing "the pip project" is a little awkward here. It's the program, not the project, that we're talking about. Can't we just say "Pip does not collect any telemetry..."? If this is a circumlocution to avoid the debate over whether we capitalise "pip" at the start of a sentence, I'd rather just have that debate now (IMO, we should capitalise - "pip" is just a word like any other, it shouldn't have special capitalisation rules).

Having said this, I really don't care that much - if another maintainer wants to approve this PR as it stands, I'm fine with that.

Author

Honestly, I agree on this. "Pip does not collect any telemetry" sounds more fitting, IMO.

Member

OK, here's my suggested change:

Suggested change
- The pip project does not collect any telemetry, however, pip will send
+ Pip does not collect any telemetry, however, it will send

I'll wait a while in case any other maintainers care enough to object, otherwise I'll make this change and merge.


Development

Successfully merging this pull request may close these issues.

Include explicit privacy statement in docs

4 participants