The POSIX standard does not appear to allow empty regex... #27

Open
@twhitehead

Just a quick note that many of the examples showing other implementations to be non-compliant appear to involve empty patterns (e.g., issues with how () is matched in (()|.)(b)).

If you read the linked POSIX standard, however, it seems that such empty expressions are not actually valid regexes. For example, the extended regex grammar it defines is

extended_reg_exp   :                      ERE_branch
                   | extended_reg_exp '|' ERE_branch
                   ;
ERE_branch         :            ERE_expression
                   | ERE_branch ERE_expression
                   ;
ERE_expression     : one_char_or_coll_elem_ERE
                   | '^'
                   | '$'
                   | '(' extended_reg_exp ')'
                   | ERE_expression ERE_dupl_symbol
                   ;
one_char_or_coll_elem_ERE  : ORD_CHAR
                   | QUOTED_CHAR
                   | '.'
                   | bracket_expression
                   ;
ERE_dupl_symbol    : '*'
                   | '+'
                   | '?'
                   | '{' DUP_COUNT               '}'
                   | '{' DUP_COUNT ','           '}'
                   | '{' DUP_COUNT ',' DUP_COUNT '}'
                   ;

from which I don't see how you can form (): the parentheses must contain an extended_reg_exp, which has to consist of at least one ERE_branch, which must consist of at least one ERE_expression, which must contain at least one character of some sort.
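
To make the derivation argument concrete, here is a minimal sketch of a recognizer that follows the grammar above rule by rule. It is only an illustration, not any implementation's actual parser: the function names (`is_valid_ere`, `parse_alternation`, `parse_branch`, `parse_expression`) are made up for this example, bracket expressions are handled with a crude approximation, and QUOTED_CHAR is assumed to mean a backslash followed by one of the ERE special characters.

```python
import re

# Characters that are special outside bracket expressions (assumed set).
SPECIAL = set("^.[$()|*+?{\\")
# '{' DUP_COUNT '}' | '{' DUP_COUNT ',' '}' | '{' DUP_COUNT ',' DUP_COUNT '}'
INTERVAL = re.compile(r"\{\d+(,\d*)?\}")
# Crude stand-in for bracket_expression; it ignores nested [:class:] forms.
BRACKET = re.compile(r"\[\^?\]?[^\]]*\]")


def parse_expression(s, i):
    """ERE_expression: one atom followed by zero or more ERE_dupl_symbol.

    Returns the index just past the expression, or None if none starts at i.
    """
    if i >= len(s):
        return None
    c = s[i]
    if c in "^$.":
        i += 1
    elif c == "(":
        # '(' extended_reg_exp ')': the inner part is NOT optional.
        i = parse_alternation(s, i + 1)
        if i is None or i >= len(s) or s[i] != ")":
            return None
        i += 1
    elif c == "\\":
        # QUOTED_CHAR: backslash followed by a special character.
        if i + 1 >= len(s) or s[i + 1] not in SPECIAL:
            return None
        i += 2
    elif c == "[":
        m = BRACKET.match(s, i)
        if m is None:
            return None
        i = m.end()
    elif c in SPECIAL:
        # ')', '|', '*', '+', '?', '{' cannot start an expression.
        return None
    else:
        i += 1  # ORD_CHAR
    # ERE_expression ERE_dupl_symbol, applied repeatedly.
    while i < len(s):
        if s[i] in "*+?":
            i += 1
            continue
        m = INTERVAL.match(s, i)
        if m is None:
            break
        i = m.end()
    return i


def parse_branch(s, i):
    """ERE_branch: one or more ERE_expression."""
    i = parse_expression(s, i)
    if i is None:
        return None
    while True:
        j = parse_expression(s, i)
        if j is None:
            return i
        i = j


def parse_alternation(s, i):
    """extended_reg_exp: one or more ERE_branch separated by '|'."""
    i = parse_branch(s, i)
    while i is not None and i < len(s) and s[i] == "|":
        i = parse_branch(s, i + 1)
    return i


def is_valid_ere(pattern):
    """True if the whole pattern is derivable from the grammar sketch."""
    return parse_alternation(pattern, 0) == len(pattern)
```

Under these assumptions, `is_valid_ere("()")` and `is_valid_ere("(()|.)(b)")` both return `False`, because `parse_branch` requires at least one expression inside the parentheses, while `is_valid_ere("(a|.)(b)")` returns `True`.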
