Description
Elasticsearch version (bin/elasticsearch --version
): 6.6.1
Plugins installed: []
JVM version (java -version
):
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
OS version (uname -a
if on a Unix-like system):
macOS Mojave 10.14.3
Description of the problem including expected versus actual behavior:
Not sure if this is a current limitation as I could not find any documentation explicitly stating this issue as described below:
When performing an SQL query using aggregate functions such as COUNT(), AVG() with LIKE
and NOT LIKE
, the actual performed query is not as expected.
Using the translate
API, we can see the issue more clearly in Kibana:
POST /_xpack/sql/translate
{
"query": "SELECT COUNT(*), station.keyword as station FROM call WHERE station.keyword LIKE 'B 4E%' and station.keyword NOT LIKE 'B 4E D%' GROUP BY station.keyword"
}
In the below response, we see that the wildcard in must
and wildcard in must_not
are exactly the same. Therefore the expected query data will not be returned correctly.
Response:
{
"size" : 0,
"query" : {
"bool" : {
"must" : [
{
"wildcard" : {
"station.keyword" : {
"wildcard" : "B 4E*",
"boost" : 1.0
}
}
},
{
"bool" : {
"must_not" : [
{
"wildcard" : {
"station.keyword" : {
"wildcard" : "B 4E*",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : false,
"stored_fields" : "_none_",
"aggregations" : {
"groupby" : {
"composite" : {
"size" : 1000,
"sources" : [
{
"21587" : {
"terms" : {
"field" : "station.keyword",
"missing_bucket" : true,
"order" : "asc"
}
}
}
]
}
}
}
}
Expected:
{
"size" : 0,
"query" : {
"bool" : {
"must" : [
{
"wildcard" : {
"station.keyword" : {
"wildcard" : "B 4E*",
"boost" : 1.0
}
}
},
{
"bool" : {
"must_not" : [
{
"wildcard" : {
"station.keyword" : {
"wildcard" : "B 4E D*",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : false,
"stored_fields" : "_none_",
"aggregations" : {
"groupby" : {
"composite" : {
"size" : 1000,
"sources" : [
{
"21587" : {
"terms" : {
"field" : "station.keyword",
"missing_bucket" : true,
"order" : "asc"
}
}
}
]
}
}
}
}
Steps to reproduce:
Please include a minimal but complete recreation of the problem, including
(e.g.) index creation, mappings, settings, query etc. The easier you make for
us to reproduce it, the more likely that somebody will take the time to look at it.
- create an index named
call
and a fieldstation
that has mapping:
"station" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
- Seed some data for
station
(optional)
B 4E Ph 1
B 4E Ph 2
B 4E D Ph 1
- Using the
translate
API, check the result of using the below
POST /_xpack/sql/translate
{
"query": "SELECT COUNT(*), station.keyword as station FROM call WHERE station.keyword LIKE 'B 4E%' and station.keyword NOT LIKE 'B 4E D%' GROUP BY station.keyword"
}
Provide logs (if relevant):