Description
Bug Report:
In ES 5.x and prior, index names & templates could use `YYYY.MM.DD` to bucket datetimes such as `@timestamp`. This enabled ES admins to limit the max timerange a user could query by setting `action.search.shard_count.limit` to `number_of_shards_per_day x max_days`.
For a more concrete example, we currently store 30 days of data, dropping older indexes via Curator. Each day is ~4TB of data, split across 80 shards. Users querying 30 days of data generally bring our cluster to a halt, so we set `action.search.shard_count.limit` to 640 in order to give users a 7-day query limit (with tails on each side).
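The arithmetic behind that limit can be sketched as follows. The helper name `shard_limit` is hypothetical, and the sketch assumes that "tails on each side" means a 7-day window not aligned to midnight touches 8 daily indexes.

```shell
# Hypothetical helper: shard limit = shards per daily index x number of daily indexes.
shard_limit() {
  echo $(( $1 * $2 ))
}

# 80 shards per day; a 7-day window that starts mid-day is assumed to touch 8 daily indexes.
shard_limit 80 8    # prints 640

# For comparison, a query across all 30 retained days would touch this many shards.
shard_limit 80 30   # prints 2400
```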
Starting in 6.0, time-based index patterns are no longer supported. This in and of itself is not a problem, but it appears that `action.search.shard_count.limit` is being enforced before the shards determine if they have data within a Range Query.
The spirit of the setting feels like it should be applied after shards determine whether or not a query should be executed against them.
Elasticsearch version (`bin/elasticsearch --version`):
Tested running via sebp's Docker image, but have also experienced this on a production ES 6.4.1 cluster.
root@aa0fc2c3283d:/# /opt/elasticsearch/bin/elasticsearch --version
Version: 6.4.2, Build: default/tar/04711c2/2018-09-26T13:34:09.098244Z, JVM: 1.8.0_181
Plugins installed: none
JVM version (`java -version`):
root@aa0fc2c3283d:/# java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
OS version (`uname -a` if on a Unix-like system):
root@aa0fc2c3283d:/# uname -a
Linux aa0fc2c3283d 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
Actual: When submitting a query that is limited to one day (index) using a `range` query, `action.search.shard_count.limit` prevents a search from executing.
Expected: When submitting a query that is limited to one day, `action.search.shard_count.limit` should be applied after determining which indexes to execute on.
Steps to reproduce:
- Start Elasticsearch (for ease, use a Docker container: `docker run -p 5601:5601 -p 9200:9200 -p 5044:5044 -it sebp/elk:642`)
- Insert some data on two different days:
  a. `curl -XPUT "http://localhost:9200/logstash-2018.11.25/doc/1?pretty" -H 'Content-Type: application/json' -d'{"custom_timestamp":"2018-11-25T19:25:51+00:00"}'`
  b. `curl -XPUT "http://localhost:9200/logstash-2018.11.26/doc/1?pretty" -H 'Content-Type: application/json' -d'{"custom_timestamp":"2018-11-26T19:25:51+00:00"}'`
- Set the query limit to 5 shards (the default number of shards per index as of 6.4.2): `curl -XPUT "http://localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d' {"persistent":{"action.search.shard_count.limit":"5"}}'`
- Query a very small time range that should only hit one index: `curl -XGET "http://localhost:9200/logstash-*/_search" -H 'Content-Type: application/json' -d' {"query":{"bool":{"must":[{"match_all":{}},{"range":{"custom_timestamp":{"gte":1543259576210,"lte":1543260476210,"format":"epoch_millis"}}}]}}}'`
- Observe `illegal_argument_exception`
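
The shard counts involved in this reproduction can be modelled with a small sketch. This is not Elasticsearch code. The function `shards_after_filtering` is hypothetical, and the sketch assumes that each daily index only holds timestamps within its own UTC day and that each index has 5 shards.

```shell
# Hypothetical model: count the shards of the indexes whose time span
# overlaps the query range [gte, lte], all values in epoch milliseconds.
# Usage: shards_after_filtering GTE LTE SHARDS_PER_INDEX START:END [START:END ...]
shards_after_filtering() {
  gte=$1; lte=$2; per_index=$3; total=0
  shift 3
  for span in "$@"; do
    start=${span%%:*}
    end=${span##*:}
    # Two closed intervals overlap when each one starts before the other ends.
    if [ "$start" -le "$lte" ] && [ "$end" -ge "$gte" ]; then
      total=$(( total + per_index ))
    fi
  done
  echo "$total"
}

# Assumed UTC day boundaries for logstash-2018.11.25 and logstash-2018.11.26.
day_25=1543104000000:1543190399999
day_26=1543190400000:1543276799999

# As reported: every shard matched by logstash-* counts (2 indexes x 5 shards).
echo $(( 2 * 5 ))
# As expected: only the index overlapping the query range counts.
shards_after_filtering 1543259576210 1543260476210 5 "$day_25" "$day_26"
```

Under these assumptions the first count is 10, which exceeds the limit of 5, while the second is 5, which would fit within it.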