What's the problem this feature will solve?
There's no clear picture in pip's internal code of when a project name is correctly normalised. As a result, code tends to call `canonicalize_name()` "just in case".
While the cost of the extra calls is small, the redundant calls make the logic harder to reason about, and the harder it is to reason about, the more likely that people will "just call `canonicalize_name()` to be sure", compounding the issue.
Describe the solution you'd like
A well-documented and clear indication of which parts of the code are responsible for ensuring that project names are in canonical form, so that the rest of pip's code can confidently just use names as provided to it.
Alternative Solutions
It would be nice if we could use type checks to enforce normalisation hygiene, but apparently mypy treats type aliases as identical, so doesn't support this. It's not (IMO) obvious that it's worth the cost of having an actual `NormalizedName` class. I could be convinced otherwise, but I don't think it's a productive place to start.
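For reference, a possible middle ground between a plain alias and a full class is `typing.NewType`, which static checkers treat as a distinct subtype while it remains an ordinary `str` at runtime. The sketch below is only an illustration, not pip's code: the local `canonicalize_name()` is a stand-in that applies the PEP 503 rule, and `lookup()` is a hypothetical internal function.

```python
import re
from typing import NewType

# Hypothetical distinct type for names that have already been normalised.
# At runtime this is just str; only the type checker sees a difference.
NormalizedName = NewType("NormalizedName", str)


def canonicalize_name(name: str) -> NormalizedName:
    # Stand-in using the PEP 503 rule: collapse runs of "-", "_" and "."
    # into a single "-" and lowercase the result.
    return NormalizedName(re.sub(r"[-_.]+", "-", name).lower())


def lookup(name: NormalizedName) -> str:
    # Hypothetical internal function: it trusts its argument and does
    # not normalise again.
    return f"looking up {name}"


lookup(canonicalize_name("Foo.Bar"))  # accepted by a type checker
# lookup("Foo.Bar")  # a type checker should reject this: str is not NormalizedName
```

Whether annotating the internals this way is worth the churn is a separate question; the sketch only shows that the check itself is expressible.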
Additional context
The issue here is similar in principle to Unicode-safety, and can be viewed in the same way - normalise at the boundaries of the code, and use normalised values exclusively throughout the internal code.
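A minimal sketch of the "normalise at the boundaries" idea, under stated assumptions: `PackageIndex` and its methods are hypothetical names invented for illustration, and the local `canonicalize_name()` is a stand-in applying the PEP 503 rule rather than the function pip actually imports. Only the public entry points normalise; the private helpers use names as given.

```python
import re
from typing import Dict, List


def canonicalize_name(name: str) -> str:
    # Stand-in using the PEP 503 rule: runs of "-", "_", "." become "-",
    # and the result is lowercased.
    return re.sub(r"[-_.]+", "-", name).lower()


class PackageIndex:
    """Hypothetical container keyed by canonical project name."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    # --- Boundary: names arriving from outside are normalised here, once.
    def add_from_user_input(self, raw_name: str, version: str) -> None:
        self._add(canonicalize_name(raw_name), version)

    def find_from_user_input(self, raw_name: str) -> List[str]:
        return self._find(canonicalize_name(raw_name))

    # --- Internal: these assume the name is already canonical and
    # deliberately do not call canonicalize_name() again.
    def _add(self, name: str, version: str) -> None:
        self._versions.setdefault(name, []).append(version)

    def _find(self, name: str) -> List[str]:
        return self._versions.get(name, [])


index = PackageIndex()
index.add_from_user_input("Foo.Bar", "1.0")
versions = index.find_from_user_input("foo_bar")
```

The documentation burden then reduces to stating which functions form the boundary; everything behind them can rely on that contract.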
Note: I do not think this should be classed as a "good first issue", as it will involve working on some of the more complex parts of pip's internals, and it's not clear that there's good test coverage of the types of issue that could arise from mistakes in the refactoring.