Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

A third batch of Pydantic validation for rest/client/account.py #13736

Merged
merged 11 commits into from
Sep 15, 2022
7 changes: 7 additions & 0 deletions synapse/rest/client/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from enum import Enum
from typing import TYPE_CHECKING, Dict, Optional

from pydantic import Extra, StrictInt, StrictStr, constr, validator
Expand All @@ -19,6 +20,12 @@
from synapse.util.threepids import validate_email


class ThreepidMedium(str, Enum):
# Per advice at https://pydantic-docs.helpmanual.io/usage/types/#enums-and-choices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you be specific about what advice you're applying / what this is commenting on?
The constants being lowercased perhaps (main thing that strikes me as weird)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. My first thought was to say medium: StrictStr but I wanted to see if there was a way to say that "medium should be one of these specific strings". The link seems to be the blessed way to do this.

What I was suspicious of is the multiple inheritance from str and Enum. These ThreepidMedium types will get passed through to the rest of the application, where I want them to be treated like any other str. I'm not fully convinced that the rest of the application will be happy with that tbh... but I don't have anything more concrete than a bad feeling about this.

Having said that, looking here it might be possible to write medium: Literal["email", "msisdn"]. If it works, that seems a bit less magic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having said that, looking here it might be possible to write medium: Literal["email", "msisdn"]. If it works, that seems a bit less magic.

This seems to work pretty well:

class ThreepidMediumEnumTestCase(stdlib_unittest.TestCase):
    class Model(BaseModel):
        medium: Literal["email", "msisdn"]

    def test_accepts_valid_medium_string(self) -> None:
        """Sanity check that Pydantic behaves sensibly with an enum-of-str

        This is arguably more of a test of a class that inherits from str and Enum
        simultaneously.
        """
        model = self.Model.parse_obj({"medium": "email"})
        self.assertIsInstance(model.medium, str)
        self.assertEqual(model.medium, "email")

    def test_rejects_invalid_medium_value(self) -> None:
        with self.assertRaises(ValidationError) as ctx:
            self.Model.parse_obj({"medium": "interpretive_dance"})
        print(ctx.exception)

    def test_rejects_invalid_medium_type(self) -> None:
        with self.assertRaises(ValidationError) as ctx:
            self.Model.parse_obj({"medium": 123})
        print(ctx.exception)

The error messages generated are

Ran 3 tests in 0.002s

PASSED (successes=3)

Process finished with exit code 0
1 validation error for Model
medium
  unexpected value; permitted: 'email', 'msisdn' (type=value_error.const; given=123; permitted=('email', 'msisdn'))
1 validation error for Model
medium
  unexpected value; permitted: 'email', 'msisdn' (type=value_error.const; given=interpretive_dance; permitted=('email', 'msisdn'))

(the errors are a little technical, but see #13337)

Copy link
Contributor

@reivilibre reivilibre Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually totally fine with the enum, but enum constants usually have uppercase names — that's what I was mostly wondering about (alongside a slightly underspecified comment).

Inheriting from both str and Enum and supplying a proper __str__ is, as far as I remember, the sensible way to do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In passing, I noticed pydantic/pydantic#4505)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. That's fair.

I really want to like stdlib enums, but I've been bitten by their sharp corners in the past. We've even seen it ourselves with IntEnum a few times I think? In that light, I think I prefer the Literal approach. But I can also see that a proper Enum can be passed throughout the rest of the application.

Maybe we should dwell on this for a bit. (I'd be interested in other opinions too!)

email = "email"
msisdn = "msisdn"


class AuthenticationData(RequestBodyModel):
"""
Data used during user-interactive authentication.
Expand Down
41 changes: 37 additions & 4 deletions tests/rest/client/test_models.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,14 +11,47 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import unittest
import unittest as stdlib_unittest

from pydantic import ValidationError
from pydantic import BaseModel, ValidationError

from synapse.rest.client.models import EmailRequestTokenBody
from synapse.rest.client.models import EmailRequestTokenBody, ThreepidMedium


class EmailRequestTokenBodyTestCase(unittest.TestCase):
class ThreepidMediumEnumTestCase(stdlib_unittest.TestCase):
class Model(BaseModel):
medium: ThreepidMedium

def test_accepts_valid_medium_string(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if we should roll our own StrEnum that also does __repr__ and __str__ in a less surprising way (i.e. by calling it on the string value)?
I suppose it ultimately doesn't matter for the sake of Pydantic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that 3.11 will have a StrEnum in the stdlib: https://docs.python.org/3.11/library/enum.html?highlight=strenum#enum.StrEnum

There's a backport here. If we do this I'd prefer to import from stdlib and fallback to the backport.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absolutely fine with that idea, as long as we have confidence that the usual pitfall around __str__ is addressed

"""Sanity check that Pydantic behaves sensibly with an enum-of-str

This is arguably more of a test of a class that inherits from str and Enum
simultaneously.
"""
model = self.Model.parse_obj({"medium": "email"})
self.assertIsInstance(model.medium, str)
self.assertEqual(model.medium, "email")
self.assertEqual(model.medium, ThreepidMedium.email)
self.assertIs(model.medium, ThreepidMedium.email)

self.assertNotEqual(model.medium, "msisdn")
self.assertNotEqual(model.medium, ThreepidMedium.msisdn)
self.assertIsNot(model.medium, ThreepidMedium.msisdn)

def test_accepts_valid_medium_enum(self) -> None:
model = self.Model.parse_obj({"medium": ThreepidMedium.email})
self.assertIs(model.medium, ThreepidMedium.email)

def test_rejects_invalid_medium_value(self) -> None:
with self.assertRaises(ValidationError):
self.Model.parse_obj({"medium": "interpretive_dance"})

def test_rejects_invalid_medium_type(self) -> None:
with self.assertRaises(ValidationError):
self.Model.parse_obj({"medium": 123})


class EmailRequestTokenBodyTestCase(stdlib_unittest.TestCase):
base_request = {
"client_secret": "hunter2",
"email": "alice@wonderland.com",
Expand Down