[ML] Data frame transform silently fails if using min(timestamp) #39974

Closed
@sophiec20

Description

Found in master "version" : { "number" : "8.0.0-SNAPSHOT", "build_flavor" : "default", "build_type" : "tar", "build_hash" : "4957cad", "build_date" : "2019-03-11T15:48:39.514013Z", "build_snapshot" : true,

When creating a data frame using min and max against a date field, the data frame is not populated, yet the response from _stats implies that it worked. From a user's perspective, the data frame silently fails to populate (although an error is logged on the server).

#DELETE _data_frame/transforms/farequote-a
#DELETE df-farequote-a
PUT _data_frame/transforms/farequote-a
{
  "source": "farequote-*",
  "dest": "df-farequote-a",
  "pivot": {
    "group_by": {
      "airline": { "terms": { "field": "airline" }}
    },
    "aggregations": {
      "max_responsetime": { "max": { "field": "responsetime" }},
      "mean_responsetime": { "avg": { "field": "responsetime" }},
      "min_time": { "min": { "field": "@timestamp"}},
      "max_time": { "max": { "field": "@timestamp"}}
    }
  }
}

POST _data_frame/transforms/farequote-a/_start
GET _data_frame/transforms/farequote-a/_stats
POST _data_frame/transforms/farequote-a/_stop
GET df-farequote-a/_search

_stats returns the following, which is the same as the response for a successful data frame, i.e. one without min_time and max_time:

{
  "count" : 1,
  "transforms" : [
    {
      "id" : "farequote-a",
      "state" : {
        "transform_state" : "stopped",
        "current_position" : {
          "airline" : "VRD"
        },
        "generation" : 1
      },
      "stats" : {
        "pages_processed" : 2,
        "documents_processed" : 86274,
        "documents_indexed" : 19,
        "trigger_count" : 1,
        "index_time_in_ms" : 281,
        "index_total" : 1,
        "index_failures" : 0,
        "search_time_in_ms" : 6,
        "search_total" : 2,
        "search_failures" : 0
      }
    }
  ]
}
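
Since the stats above report documents_indexed of 19 and index_failures of 0 even though nothing was written, one way to spot this from the client side is to compare the claimed count with what the destination index actually holds. The following is only a rough sketch: the function name and the `dest_doc_counts` mapping are hypothetical, and it assumes the real document count has been fetched separately (for example with a count request against df-farequote-a).

```python
def transforms_with_missing_docs(stats_response, dest_doc_counts):
    """Return ids of transforms whose stats claim indexed documents
    while the destination index holds none.

    stats_response: parsed JSON body of the _stats response.
    dest_doc_counts: hypothetical mapping of transform id to the document
        count observed in that transform's destination index.
    """
    suspicious = []
    for transform in stats_response.get("transforms", []):
        transform_id = transform["id"]
        claimed = transform["stats"]["documents_indexed"]
        actual = dest_doc_counts.get(transform_id, 0)
        # Stats claim documents were indexed, but the destination is empty.
        if claimed > 0 and actual == 0:
            suspicious.append(transform_id)
    return suspicious


# Values taken from the _stats response above; the destination count of 0
# reflects the unpopulated data frame described in this issue.
stats_response = {
    "count": 1,
    "transforms": [
        {
            "id": "farequote-a",
            "stats": {"documents_indexed": 19, "index_failures": 0},
        }
    ],
}
print(transforms_with_missing_docs(stats_response, {"farequote-a": 0}))
```

With the values from this report the check flags farequote-a. It is a workaround for detecting the symptom, not a fix for the stats themselves.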

Error in log:

[2019-03-12T18:39:33,771][WARN ][o.e.x.c.i.AsyncTwoPhaseIndexer] [node1] Error while attempting to bulk index documents: failure in bulk execution:
