Description
Elasticsearch version (bin/elasticsearch --version): 6.1.4
Plugins installed: n/a
JVM version (java -version): 1.8.191
OS version (uname -a if on a Unix-like system): CentOS 7.5
Description of the problem including expected versus actual behavior:
Under some conditions, the variance returned by an extended_stats aggregation is a negative number, which should never happen.
Mathematically, the variance is an average of squared deviations, so it cannot be negative. What makes it negative here is the way it is computed, probably as the difference of two positive numbers ("sum_of_squares / count" and "avg * avg"). Because floating-point numbers have finite precision, the two terms are almost equal, and rounding error can push their difference slightly below zero. The square root of that negative variance is then reported as a "NaN" std_deviation.
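The cancellation can be illustrated outside Elasticsearch with a minimal sketch. It assumes the variance is derived from the running totals as (sum_of_squares - sum * sum / count) / count, which is algebraically the same as the difference described above; the exact expression used inside Elasticsearch is not shown in this report. The class and method names are hypothetical, and the inputs are the "sum", "sum_of_squares" and "count" values copied from the aggregation output in the logs.

```java
class NaiveVarianceDemo {

    // Hypothetical helper: population variance computed from running totals.
    // Assumed formula, not taken from the Elasticsearch source code.
    public static double naiveVariance(double sumOfSquares, double sum, long count) {
        return (sumOfSquares - (sum * sum) / count) / count;
    }

    public static void main(String[] args) {
        // Values copied from the extended_stats output in the logs
        // (three documents, each with amount 49.95).
        double sum = 149.85000000000002;
        double sumOfSquares = 7485.0075000000015;
        long count = 3;

        double variance = naiveVariance(sumOfSquares, sum, count);

        // The two terms differ only in their last bits, so the
        // subtraction leaves nothing but rounding error.
        System.out.println("variance      = " + variance);
        // Math.sqrt of a negative double is NaN.
        System.out.println("std_deviation = " + Math.sqrt(variance));
    }
}
```

With these inputs the subtraction comes out slightly below zero, so the square root is NaN, which matches the "std_deviation" field in the logs.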
Proposed solution:
At the very least, prevent negative values from appearing in the variance by wrapping the existing computation in "Math.max(0.0, ...)".
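A minimal sketch of that clamp follows. The class and method names are hypothetical, and the formula inside Math.max is an assumed form of the computation, not code taken from Elasticsearch.

```java
class ClampedVarianceDemo {

    // Hypothetical helper: same assumed formula as the naive computation,
    // with the result clamped so it can never be reported as negative.
    public static double clampedVariance(double sumOfSquares, double sum, long count) {
        double raw = (sumOfSquares - (sum * sum) / count) / count;
        // Math.max returns NaN if either argument is NaN, so a genuinely
        // undefined result is still propagated rather than turned into 0.
        return Math.max(0.0, raw);
    }

    public static void main(String[] args) {
        // Totals copied from the extended_stats output in the logs.
        double variance = clampedVariance(7485.0075000000015, 149.85000000000002, 3);
        System.out.println("variance      = " + variance);
        System.out.println("std_deviation = " + Math.sqrt(variance));
    }
}
```

The clamp only hides the symptom: the rounding error is still present in the intermediate result, so small positive variances may remain inaccurate. It does guarantee that the reported variance is non-negative and that the standard deviation derived from it is not NaN for this case.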
Steps to reproduce:
Using the attached zip file, do the following:
- in env.properties: set the host and port of the Elasticsearch instance (default: localhost:9200)
- run ./reproduce.sh my-test-index (or any other index name; beware, this index will be dropped!)
What does it do?
- drop the given index
- recreate it with an explicit mapping (index: long, amount: double)
- load 3 documents with the same amount: 49.95
- ask for an "extended_stats" aggregation on the amounts
Provide logs (if relevant):
{"acknowledged":true}
{"acknowledged":true,"shards_acknowledged":true,"index":"my-test-index"}
{"_index":"my-test-index","_type":"document","_id":"BErfN2gBWLZgAGEHAdb8","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
{"_index":"my-test-index","_type":"document","_id":"BUrfN2gBWLZgAGEHAtY2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
{"_index":"my-test-index","_type":"document","_id":"BkrfN2gBWLZgAGEHAtZd","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
{"_shards":{"total":10,"successful":5,"failed":0}}
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-test-index",
        "_type" : "document",
        "_id" : "BUrfN2gBWLZgAGEHAtY2",
        "_score" : 1.0,
        "_source" : {
          "amount" : 49.95,
          "index" : 2
        }
      },
      {
        "_index" : "my-test-index",
        "_type" : "document",
        "_id" : "BErfN2gBWLZgAGEHAdb8",
        "_score" : 1.0,
        "_source" : {
          "amount" : 49.95,
          "index" : 1
        }
      },
      {
        "_index" : "my-test-index",
        "_type" : "document",
        "_id" : "BkrfN2gBWLZgAGEHAtZd",
        "_score" : 1.0,
        "_source" : {
          "amount" : 49.95,
          "index" : 3
        }
      }
    ]
  },
  "aggregations" : {
    "amount" : {
      "count" : 3,
      "min" : 49.95,
      "max" : 49.95,
      "avg" : 49.95000000000001,
      "sum" : 149.85000000000002,
      "sum_of_squares" : 7485.0075000000015,
      "variance" : -3.0316490059097606E-13,
      "std_deviation" : "NaN",
      "std_deviation_bounds" : {
        "upper" : "NaN",
        "lower" : "NaN"
      }
    }
  }
}
File used to reproduce:
to-reproduce-it.zip