-
-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add a Unique validator #1793
base: dev
Are you sure you want to change the base?
add a Unique validator #1793
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feature has been requested in the past. See #1539. The resolution was fairly neutral, so I suspect a working implementation coupled with nested attribute handling gives this a better chance of landing.
I implemented something similar in a project of mine, but never clean enough to share it here. I had issues, and I think this implem has the same issues, if the iterable contains items that are not hashable because of the set being used. I ended up using a list rather than a set for the comparison if items are not hashable. I added a This is old and I didn't investigate it much more than this today. Not 100% sure it is relevant. Just bringing this up to avoid letting the issue slip through. |
I am tempted to suggest ignoring that use case, but it wouldn't take much code to fall back to the list based membership check when a def __call__(self, value):
ids = [
getattr(item, self.attribute) if self.attribute else item
for item in value
]
try:
duplicate = self._duplicate_hash(ids)
except TypeError:
duplicate = self._duplicate_equal(ids)
if duplicate:
raise ValidationError(self._format_error(duplicate)) Side rantI would be nice if the type system inherited a generic |
b6661df
to
1114b97
Compare
Thanks. This leaves open @deckar01's comment above about The docstring and error messages could be reworded for better english but this can be done afterwards. The non-hashable case is not tested. We generally don't send back faulty values in error messages. Just the fact that the value is invalid. Otherwise, I agree on the principle. |
Should we remove the value that is not unique in the error then? Regarding the use of attribute: That solves the case when you have this |
I understand the need to check the uniticy of a subitem (I managed the same case in an app of mine). What I'm saying is that unless you use a post-load decorator to instantiate an object, you're dealing with dicts at this stage so getattr shouldn't work. Or am I missing something? |
Actually that's a question I had when dealing with this. If I set a field of a Example: @dataclass
class Bar:
custom_id: int
@dataclass
class Foo:
bar_list: List[Bar] = field(
metadata={
"validate": Unique(attribute="custom_id")
}
)
foo = {"bar_list": [{"custom_id": 1}, {"custom_id": 1}]}
FooSchema = marshmallow_dataclass.class_schema(Foo)()
FooSchema.load(foo) This triggers the marshmallow.exceptions.ValidationError: {'bar_list': ['Found a duplicate object attribute (custom_id): 1.']} Debugging the variables when
Let me know if I'm also missing something... may it be because I'm using |
marshmallow-dataclass does instantiation on load so yes, that would be the reason you have an object at this stage and not a dict. A test should be added to validate unique "attribute" using dicts, not objects. |
@lafrech dict unittests have been added, also added a nested attribute unittests. Should object tests be removed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If value
is not attribute/item, I'm afraid get_value
will return missing
.
If the list has multiple elements, validation will fail due to multiple missing
but if it has a single element, it will pass silently, which is bad, at this stage (after deserialization).
Maybe it's not such a good idea to use get_value
, or missing should be caught. Or I'm totally mistaken.
Also, I wouldn't refer to value
as "attribute" name since it can also be an item. Maybe key
, to match data_key
.
It might be too complicated to achieve in a generic enough fashion to reach the core.
At @midokura we have encountered that
marshmallow
did not have any duplicate validator. We are sharing the implementation we are currently using so that others could benefit from it too.This new validator is designed to work only with iterables.
In the case of having a value which is a list of objects you can check if one is repeated using an
attribute
of the object to check if it is repeated or not.pytests
have also been added.Hope it can be reviewed and at some point included in a future release.