Description
Elasticsearch version: 7.4.2
Plugins installed: [repository-s3, discovery-ec2]
JVM version: Bundled version
OS version: 4.14.138-114.102.amzn2.x86_64 (Amazon Linux 2)
Description of the problem including expected versus actual behavior:
Running a date_histogram aggregation on a date_range field fails with an out of memory error when some documents have a null "lte". The aggregation appears to try to create an unbounded number of buckets for those open-ended ranges. It works well for fully defined date ranges (the first document in my example), but I have a large number of documents with an undefined end point (the value is null, as in the second document in my example). I have been looking for a way to limit the buckets created by the query, but without luck. I also tried a date_range aggregation on the field to limit the potential buckets, but that caused cast exceptions.
Basically I'm attempting to find all of the months up to now that these documents cover/touch. I'd like to see buckets from 2017-10-01 to 2019-12-01 (as of the time of this issue being written).
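To make the expected result concrete, here is a minimal Python sketch of the bucketing I have in mind. It is not how Elasticsearch implements date_histogram. The function name `months_touched`, the `now` parameter, and the rule that a null "lte" is clamped to `now` are my own assumptions for illustration.

```python
from datetime import date
from typing import List, Optional


def months_touched(gte: date, lte: Optional[date], now: date) -> List[date]:
    """Return the first day of every calendar month that [gte, lte] touches.

    Assumption: a missing upper bound (lte is None) means "still active",
    so the range is clamped to `now`. Closed ranges are also capped at `now`.
    """
    end = now if lte is None else min(lte, now)
    buckets = []
    year, month = gte.year, gte.month
    while (year, month) <= (end.year, end.month):
        buckets.append(date(year, month, 1))
        # Advance one calendar month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return buckets


# Hypothetical "now", roughly when this issue was written.
now = date(2019, 12, 15)
closed = months_touched(date(2017, 10, 10), date(2018, 10, 10), now)
open_ended = months_touched(date(2017, 10, 10), None, now)
print(closed[0], closed[-1], len(closed))
print(open_ended[0], open_ended[-1], len(open_ended))
```

Under these assumptions the fully defined range yields 13 monthly buckets (2017-10-01 through 2018-10-01) and the open-ended range yields 27 (2017-10-01 through 2019-12-01), which is the bounded result I would expect from the aggregation.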
Steps to reproduce:
- Create index:
PUT test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "active_range": {
        "type": "date_range",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
- Add documents:
POST test/_doc/
{
  "active_range": {
    "gte": "2017-10-10",
    "lte": "2018-10-10"
  }
}
POST test/_doc/
{
  "active_range": {
    "gte": "2017-10-10",
    "lte": null
  }
}
- Run aggregation:
GET test/_search
{
  "size": 0,
  "aggs": {
    "active": {
      "date_histogram": {
        "field": "active_range",
        "calendar_interval": "month"
      }
    }
  }
}