Description
Lucene issue LUCENE-9386 introduced an option for case-insensitive matching of RegExp queries. This issue is to expose that capability to Elasticsearch users.
One complexity with the feature is that it currently supports only ASCII characters; full Unicode support has been left as a TODO. Rather than a simple Boolean flag (e.g. case sensitive: yes or no?), there is a bitmask parameter which accepts only one flag at the moment: "ASCII_CASE_INSENSITIVE". The assumption is that in future we could use this same bitmask parameter and pass a new "(Unicode)CASE_INSENSITIVE" flag.
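As an analogy only (this is `java.util.regex`, not Lucene's RegExp), the JDK takes the same approach: `Pattern.CASE_INSENSITIVE` on its own covers US-ASCII only, and Unicode-aware folding is opted into by OR-ing a second flag into the same bitmask. A minimal sketch, with a hypothetical class name, of how the two masks behave:

```java
import java.util.regex.Pattern;

// Illustrative only: shows an ASCII-only case-insensitive flag and a
// Unicode-aware flag living in one bitmask, using the JDK regex engine.
class AsciiVsUnicodeCaseDemo {
    // ASCII-only case-insensitive matching (the JDK default for this flag).
    static final int ASCII_ONLY = Pattern.CASE_INSENSITIVE;
    // Unicode-aware case-insensitive matching: a second flag in the same mask.
    static final int UNICODE = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

    // Returns true when the whole input matches the regex under the given flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // ASCII letters fold under the ASCII-only flag.
        System.out.println(matches("abc", ASCII_ONLY, "ABC"));       // true
        // U+00E9 (e-acute) vs U+00C9 (E-acute): not folded by the ASCII-only flag.
        System.out.println(matches("\u00e9", ASCII_ONLY, "\u00c9")); // false
        // Adding the Unicode flag to the same bitmask enables the match.
        System.out.println(matches("\u00e9", UNICODE, "\u00c9"));    // true
    }
}
```

A bitmask leaves room for a later Unicode flag without changing the parameter's type, which is the design motivation described above.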
Another concern is that in Lucene we renamed the previous "flags" constructor parameter to "syntax_flags" to make it distinct from the new "match_flags" parameter. Renaming constructor parameters works OK in Java because they are identified by position and type, but in our REST API we rely on JSON field names, so we have to keep the existing flags parameter.
The question then is how best to expose the new case-insensitive matching flag in the REST API.
Options are:
- Add an "ASCII_CASE_INSENSITIVE" value option for the existing `flags` parameter
- Add a new "match_flags" parameter with possible value option of "ASCII_CASE_INSENSITIVE"
- Add a simple Boolean flag that can only accept one value (the flag name choice would be `case_sensitive:false` or `case_insensitive:true`). We'd use documentation to declare that only ASCII is supported currently.
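To compare the three options, here is a rough sketch of how each REST-side form could resolve to the same match bitmask before being handed to Lucene. Everything in it is hypothetical: the class and method names, the numeric value of the flag constant, and the assumption that flag names are pipe-separated as in the existing `flags` parameter. It is not the Elasticsearch parsing code.

```java
// Hypothetical sketch: three REST-side spellings, one match-flags bitmask.
class MatchFlagsSketch {
    // Stand-in constant; the real Lucene flag value is not assumed here.
    static final int ASCII_CASE_INSENSITIVE = 1;
    static final String ASCII_CASE_INSENSITIVE_NAME = "ASCII_CASE_INSENSITIVE";

    // Option 1: the match flag shares the existing "flags" string with the
    // syntax flags. Any other name is assumed to be a syntax flag and is
    // left for the existing syntax-flag parsing.
    static int matchFlagsFromCombinedFlags(String flags) {
        int mask = 0;
        if (flags == null || flags.isEmpty()) {
            return mask;
        }
        for (String name : flags.split("\\|")) {
            if (ASCII_CASE_INSENSITIVE_NAME.equals(name.trim())) {
                mask |= ASCII_CASE_INSENSITIVE;
            }
        }
        return mask;
    }

    // Option 2: a dedicated "match_flags" string; unknown names are rejected.
    static int matchFlagsFromMatchFlags(String matchFlags) {
        int mask = 0;
        if (matchFlags == null || matchFlags.isEmpty()) {
            return mask;
        }
        for (String name : matchFlags.split("\\|")) {
            if (ASCII_CASE_INSENSITIVE_NAME.equals(name.trim())) {
                mask |= ASCII_CASE_INSENSITIVE;
            } else {
                throw new IllegalArgumentException("Unknown match flag: " + name.trim());
            }
        }
        return mask;
    }

    // Option 3: a plain Boolean, documented as ASCII-only for now.
    static int matchFlagsFromBoolean(boolean caseInsensitive) {
        return caseInsensitive ? ASCII_CASE_INSENSITIVE : 0;
    }

    public static void main(String[] args) {
        System.out.println(matchFlagsFromCombinedFlags("INTERSECTION|ASCII_CASE_INSENSITIVE"));
        System.out.println(matchFlagsFromMatchFlags("ASCII_CASE_INSENSITIVE"));
        System.out.println(matchFlagsFromBoolean(true));
    }
}
```

In this sketch, options 1 and 2 extend naturally when a Unicode flag is added, whereas the Boolean in option 3 would need either a changed meaning or a second parameter.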