Search - expose new Lucene option for case insensitive RegExp queries #59235

Closed
@markharwood

Description

LUCENE-9386 introduced an option for case-insensitive matching in RegExp queries. This issue proposes exposing that capability to Elasticsearch users.

One complexity is that the feature currently supports only ASCII characters; full Unicode support has been left as a TODO. Rather than a simple Boolean flag (e.g. case sensitive: yes or no?), Lucene takes a bitmask parameter that accepts only one flag at the moment, "ASCII_CASE_INSENSITIVE". The assumption is that in future we could reuse this same bitmask parameter and pass a new "(Unicode)CASE_INSENSITIVE" flag.
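To make the ASCII-versus-Unicode distinction concrete, here is a small analogy using the JDK's own `java.util.regex.Pattern`, which also takes a bitmask and also treats case-insensitive matching as ASCII-only unless a second flag is added. This is not Lucene's RegExp API; the class name `CaseFlagsAnalogy` and the helper `matches` are made up for illustration.

```java
import java.util.regex.Pattern;

/** Analogy only: uses java.util.regex, not Lucene's RegExp. */
final class CaseFlagsAnalogy {

    /** Compiles the regex with the given bitmask and tests the whole input against it. */
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // One flag: case folding is applied to US-ASCII characters only.
        int asciiOnly = Pattern.CASE_INSENSITIVE;
        // Two flags OR-ed into the same bitmask: case folding becomes Unicode-aware.
        int unicode = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        System.out.println(matches("abc", asciiOnly, "ABC"));       // true
        System.out.println(matches("\u00e9", asciiOnly, "\u00c9")); // false: e-acute is not ASCII
        System.out.println(matches("\u00e9", unicode, "\u00c9"));   // true
    }
}
```

The point of the analogy is that a bitmask leaves room to add a Unicode-aware flag later without changing the parameter's type, which a Boolean would not.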

Another concern is that in Lucene we renamed the previous "flags" constructor parameter to "syntax_flags" to distinguish it from the new "match_flags" parameter. Renaming constructor parameters is harmless in Java because arguments are identified by position and type, but our REST API relies on JSON field names, so we have to keep the existing flags parameter.
The question, then, is how best to expose the new case-insensitive matching flag in the REST API.
Options are:

  1. Add an "ASCII_CASE_INSENSITIVE" value option for the existing flags parameter
  2. Add a new "match_flags" parameter with possible value option of "ASCII_CASE_INSENSITIVE"
  3. Add a simple Boolean parameter, named either case_sensitive (set to false) or case_insensitive (set to true). We'd use the documentation to state that only ASCII is supported currently.
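As a rough way to compare the three options, the sketch below shows how each request shape could be resolved into the same pair of syntax and match bitmasks. Everything here is hypothetical: the class `RegexpOptionsSketch`, its methods, and the bit values are illustrative and are not taken from Elasticsearch or Lucene source. The only assumption carried over from the existing REST API is that flags are written as a pipe-separated string.

```java
import java.util.Locale;

/** Hypothetical sketch; names and bit values are illustrative only. */
final class RegexpOptionsSketch {
    // Illustrative syntax bits (which optional regex operators are enabled).
    static final int INTERSECTION = 0x01;
    static final int COMPLEMENT = 0x02;
    static final int SYNTAX_MASK = INTERSECTION | COMPLEMENT;
    // Illustrative match bit, kept outside the syntax range.
    static final int ASCII_CASE_INSENSITIVE = 0x100;
    static final int MATCH_MASK = ASCII_CASE_INSENSITIVE;

    /** Parses a pipe-separated flag string, rejecting any flag outside allowedMask. */
    static int parse(String flags, int allowedMask) {
        int bits = 0;
        if (flags == null || flags.isEmpty()) {
            return bits;
        }
        for (String name : flags.split("\\|")) {
            int bit;
            switch (name.trim().toUpperCase(Locale.ROOT)) {
                case "INTERSECTION": bit = INTERSECTION; break;
                case "COMPLEMENT": bit = COMPLEMENT; break;
                case "ASCII_CASE_INSENSITIVE": bit = ASCII_CASE_INSENSITIVE; break;
                default: throw new IllegalArgumentException("Unknown flag [" + name + "]");
            }
            if ((bit & allowedMask) == 0) {
                throw new IllegalArgumentException("Flag [" + name + "] not allowed here");
            }
            bits |= bit;
        }
        return bits;
    }

    /** Option 1: a single "flags" string carries both kinds of flag; split them afterwards. */
    static int[] option1(String flags) {
        int all = parse(flags, SYNTAX_MASK | MATCH_MASK);
        return new int[] { all & SYNTAX_MASK, all & MATCH_MASK };
    }

    /** Option 2: separate "flags" and "match_flags" strings, each with its own vocabulary. */
    static int[] option2(String flags, String matchFlags) {
        return new int[] { parse(flags, SYNTAX_MASK), parse(matchFlags, MATCH_MASK) };
    }

    /** Option 3: "flags" stays syntax-only and a Boolean selects the single match flag. */
    static int[] option3(String flags, boolean caseInsensitive) {
        return new int[] { parse(flags, SYNTAX_MASK), caseInsensitive ? ASCII_CASE_INSENSITIVE : 0 };
    }
}
```

Under these assumptions all three shapes reach the same internal state. The difference is in what the REST contract promises: options 1 and 2 keep the flag name visible to users, so a later Unicode flag is an additive change, while option 3 hides the ASCII limitation behind documentation.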

Labels: :Search/Search, >enhancement, Team:Search