Skip to content

Script queries using joda time getMillis() slower than using java time toInstant().toEpochMilli() #35754

Closed
@conet

Description

@conet

Elasticsearch version (bin/elasticsearch --version):
Version: 6.5.0, Build: default/tar/816e6f6/2018-11-09T18:58:36.352602Z, JVM: 1.8.0_191

Plugins installed: []
No extra plugins

JVM version (java -version):
java -version
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

Description of the problem including expected versus actual behavior:
I've been using script queries that called getMillis() without issues until I upgraded to 6.5.0 after I started noticing serious query time deterioration (I'm talking about tens of seconds up from a few seconds), eventually by reading the changelog I realized that joda time was being deprecated, after switching the getMillis() call to toInstant().toEpochMilli() I got back the original query times (the ones before the upgrade). See a more detailed description in this PR. Since I've been asked to open an issue with the steps to reproduce it here they are.

Steps to reproduce:

  1. Download and start elasticsearch 6.5.0
  2. Download and start kibana 6.5.0
  3. Load the Sample web logs sample datasets
  4. Increase the doc count by copying the sample index (I copied it 40 times):
curl -XPOST "http://localhost:9200/_reindex" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "kibana_sample_data_logs"
  },
  "dest": {
    "index": "kibana_sample_data_logs_1"
  }
}'
  1. Run the fast query:
curl -XPOST "http://localhost:9200/kibana_sample_data_logs*/_search" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "script": {
            "script": {
              "source": "doc[\"timestamp\"].value.toInstant().toEpochMilli() < params.millis",
              "lang": "painless",
              "params": {
                "millis": 1542747500000
              }
            }
          }
        }
      ]
    }
  }
}'
---
{
  "took" : 38,
  "timed_out" : false,
  "_shards" : {
    "total" : 201,
    "successful" : 201,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 94259,
    "max_score" : 0.0,
    "hits" : [ ]
  }
}
  1. Run the slow query:
curl -XPOST "http://localhost:9200/kibana_sample_data_logs*/_search" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "script": {
            "script": {
              "source": "doc[\"timestamp\"].value.getMillis() < params.millis",
              "lang": "painless",
              "params": {
                "millis": 1542747500000
              }
            }
          }
        }
      ]
    }
  }
}'
---
#! Deprecation: Use of the joda time method [getMillis()] is deprecated. Use [toInstant().toEpochMilli()] instead.
{
  "took" : 1088,
  "timed_out" : false,
  "_shards" : {
    "total" : 201,
    "successful" : 201,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 94259,
    "max_score" : 0.0,
    "hits" : [ ]
  }
}

We're talking about 95K documents and the deprecated query is already 30 times slower, one could argue that it is because the deprecation log itself but from my experience with bigger indexes the problem becomes more noticeable, the slow query seems to slow down linearly with the number of documents checked by it.

Metadata

Metadata

Assignees

Labels

:Core/Infra/ScriptingScripting abstractions, Painless, and Mustache

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions