Domain validator allows invalid characters in some cases #366

Closed
@hagenrd

It appears that, currently, any single character is accepted as a trailing character after the gTLD when rfc_1034 is True, for example:

>>> from validators import domain
>>> domain('example.com?', rfc_1034=True, rfc_2782=False)
True
>>> domain('example.com!', rfc_1034=True, rfc_2782=False)
True

I believe the '.' just needs to be escaped in the pattern string, since an unescaped '.' matches any character rather than only a literal dot:

+ rf"[a-z]{r'.?$' if rfc_1034 else r'$'}",
             ^
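
To illustrate the difference, here is a minimal sketch using simplified stand-in patterns rather than the library's full regex. The names UNESCAPED_TAIL, ESCAPED_TAIL, and tail_matches are hypothetical and exist only for this example:

```python
import re

# Simplified stand-ins for the end of the domain pattern (assumption:
# only the TLD tail is modelled here, not the validator's full regex).
UNESCAPED_TAIL = re.compile(r"^[a-z]+.?$")   # '.' matches any single character
ESCAPED_TAIL = re.compile(r"^[a-z]+\.?$")    # '\.' matches only a literal dot


def tail_matches(pattern, tld):
    """Return True if the whole string matches the given tail pattern."""
    return pattern.match(tld) is not None


for candidate in ("com", "com.", "com?", "com!"):
    print(
        candidate,
        tail_matches(UNESCAPED_TAIL, candidate),
        tail_matches(ESCAPED_TAIL, candidate),
    )
```

With the escaped form, only the optional trailing dot permitted by RFC 1034 should get through, while '?' and '!' are rejected.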

Also, it appears question marks are allowed when rfc_2782 is True for domain validation:

>>> from validators import domain
>>> domain('example?.com', rfc_1034=False, rfc_2782=True)
True

This appears to be from the use of '?' after the '_' inside of a character class:

rf"^(?:[a-z0-9{r'_?'if rfc_2782 else ''}]"
                  ^

Presumably, this is meant to make the '_' optional, but since most metacharacters, including '?', lose their special meaning inside a character class, it is interpreted as a literal '?' instead.
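
The character-class behaviour can be checked in isolation. This is a minimal sketch with simplified label patterns, not the validator's actual regex. BUGGY_LABEL, FIXED_LABEL, and label_matches are hypothetical names, and dropping the '?' is only one possible fix:

```python
import re

# Simplified stand-ins for a single domain label (assumption: length limits
# and hyphen rules from the real pattern are left out for brevity).
BUGGY_LABEL = re.compile(r"^[a-z0-9_?]+$")   # '?' is a literal inside [...]
FIXED_LABEL = re.compile(r"^[a-z0-9_]+$")    # only '_' is added to the class


def label_matches(pattern, label):
    """Return True if the whole label matches the given pattern."""
    return pattern.match(label) is not None


for candidate in ("example", "_sip", "example?"):
    print(
        candidate,
        label_matches(BUGGY_LABEL, candidate),
        label_matches(FIXED_LABEL, candidate),
    )
```

Because a character class already matches one character out of a set, adding '_' to it is enough to make the underscore permitted without making it mandatory.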

Labels: bug (Issue: Works not as designed)