You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm working on porting Warehouse over to using packaging.metadata to drive our validation, and as I'm teasing apart the existing code it's become clear to me that our metadata validation in Warehouse falls into two categories:
Validation that is required by the core metadata spec for the metadata to be valid at all.
Validation that Warehouse adds on top of the core metadata spec for various application specific policies.
I'm currently implementing this in Warehouse by basically doing this:
The metadata validation natively uses ExceptionGroup to collect all of the validation errors and raising them all at once, however there's not a clear and easy way to do that with our additional validation, so we don't see the extra validation errors if any "built in" validation errors happen.
We went to great lengths to support on demand validation of fields, allowing applications to have partial validation of only the data that they personally care about. However, implementing on demand "extra" validation would require either a wrapper around Metadata that dispatches to the underlying Metadata, fetches the value, then adds the extra validation.
An idea that popped into my mind is allowing passing in a "validation policy" of some sort, that would be used to inform the Metadata class of these additional validation checks, and have them handled inline with all of the other validation.
A policy could look something like:
importreadmeclassMetadataValidationPolicy:
# TODO: These returns an InvalidMetadata, mostly to keep it similar to full_validate, however# it's probably more ergonomic/expected to raise it? Maybe even full_vallidate should# raise and when the metadata class calls it, it supports either raising an InvalidMetadata# or a ExceptionGroup containing InvalidMetadata.defvalidate_metadata_version(self, metadata_version: str) ->InvalidMetadata|None:
# PyPI currently only supports *some* Metadata versions, but not all of them, and even if# we did support all current ones, new ones require integration work so we don't want to# start accepting them implicitly.ifmetadata_versionnotin {"1.0", "1.1", "1.2", "2.0", "2.1"}:
returnInvalidMetadata("metadata-version", f"{metadata_version!r} is an unsupported metadata version")
defvalidate_version(self, version: Version) ->InvalidMetadata|None:
ifversion.local:
returnInvalidMetadata("version", f"The use of local versions in {version!r} is not allowed.")
deffull_validate(self, metadata: Metadata) ->list[InvalidMetadata]:
errors: list[InvalidMetadata] = []
# This is sort of a bad example, because we wouldn't put this in here because we wouldn't# want to throw away the rendered value, but just as an example.ifmetadata.description:
description_content_type=metadata.description_content_typeor"text/x-rst"rendered=readme.render(metadata.description, description_content_type, use_fallback=False)
ifrenderedisNone:
errors.append(InvalidMetadata("description", f"Failed to render the description using {description_content_type!r}"))
returnerrors
Given a policy, the question then becomes how would we inform the Metadata class about it.
The easiest mechanism for doing so I think would be as an additional keyword parameter to the Metadata.from_* methods, so something like:
We don't currently support creating a Metadata instance dynamically other than through the from_* methods, but you could imagine if we did then we could still use a keyword argument to handle that:
Of course that would mean we can't ever have an core metadata field named validation_policy that we want to set through the constructor, but I think it's unlikely that we ever do (and of course if we are worried about it there are other tricks like using _validation_policy or something.
We could even support subclassing Metadata and setting a subclass wide validation policy, if a project wants that rather than manually passing it in to each invocation, using something like:
Of course, supporting this use case does add a non trivial amount of additional complexity to packaging.metadata, and since these validations are purely additive it is entirely possible to implement this outside of the metadata library with relatively minor downsides, so I think it would be perfectly reasonable to declare this out of scope. I wanted to raise it as an idea though, as it kind of surfaced during my work of integrating packaging.metadata into Warehouse.
The text was updated successfully, but these errors were encountered:
I'm working on porting Warehouse over to using
packaging.metadata
to drive our validation, and as I'm teasing apart the existing code it's become clear to me that our metadata validation in Warehouse falls into two categories:I'm currently implementing this in Warehouse by basically doing this:
This of course works, but it has a few negatives:
ExceptionGroup
to collect all of the validation errors and raising them all at once, however there's not a clear and easy way to do that with our additional validation, so we don't see the extra validation errors if any "built in" validation errors happen.Metadata
that dispatches to the underlyingMetadata
, fetches the value, then adds the extra validation.An idea that popped into my mind is allowing passing in a "validation policy" of some sort, that would be used to inform the
Metadata
class of these additional validation checks, and have them handled inline with all of the other validation.A policy could look something like:
Given a policy, the question then becomes how would we inform the
Metadata
class about it.The easiest mechanism for doing so I think would be as an additional keyword parameter to the
Metadata.from_*
methods, so something like:We don't currently support creating a
Metadata
instance dynamically other than through thefrom_*
methods, but you could imagine if we did then we could still use a keyword argument to handle that:Of course that would mean we can't ever have an core metadata field named
validation_policy
that we want to set through the constructor, but I think it's unlikely that we ever do (and of course if we are worried about it there are other tricks like using_validation_policy
or something.We could even support subclassing
Metadata
and setting a subclass wide validation policy, if a project wants that rather than manually passing it in to each invocation, using something like:Of course, supporting this use case does add a non trivial amount of additional complexity to
packaging.metadata
, and since these validations are purely additive it is entirely possible to implement this outside of the metadata library with relatively minor downsides, so I think it would be perfectly reasonable to declare this out of scope. I wanted to raise it as an idea though, as it kind of surfaced during my work of integratingpackaging.metadata
into Warehouse.The text was updated successfully, but these errors were encountered: