Accuracy metric fails when the dependent variable is of type 'boolean' or 'integer'

I've run a classification analysis on a synthetic dataset that tries to detect a circle on a plane.
I've indexed docs containing points on a 2D plane along with a dependent variable ("is the point inside a unit circle"). The analysis finished correctly, but then I tried to evaluate the results using the following request:
```
{
  "index": "circle-ml",
  "query": {
    "term": {
      "ml.is_training": false
    }
  },
  "evaluation": {
    "classification": {
      "actual_field": "in_unit_circle",
      "predicted_field": "ml.in_unit_circle_prediction.keyword",
      "metrics": {
        "accuracy": {},
        "multiclass_confusion_matrix": {}
      }
    }
  }
}
```

The evaluation reported an accuracy of `0` because it could not find any point for which the `dependent_variable` field (`in_unit_circle`) was equal to the prediction.
The problem is that the dependent variable is a boolean while the prediction is a string, and the Painless script compares the two values directly:
```
doc[''{0}''].value == doc[''{1}''].value
```
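The effect of a strict comparison between a boolean and a string can be illustrated outside Elasticsearch. The following is a minimal Python sketch, not the actual evaluation code: the sample documents, the `prediction` key, and the `strict_accuracy` helper are all hypothetical and only mimic the type mismatch described above.

```python
# Hypothetical documents: the dependent variable is a boolean,
# while the prediction is stored as a string (as in the report above).
docs = [
    {"in_unit_circle": True, "prediction": "true"},
    {"in_unit_circle": False, "prediction": "false"},
    {"in_unit_circle": True, "prediction": "false"},
]


def strict_accuracy(docs):
    """Fraction of docs whose actual value strictly equals the prediction."""
    matches = sum(1 for d in docs if d["in_unit_circle"] == d["prediction"])
    return matches / len(docs)


# A boolean never equals a string under strict comparison,
# so even the two correct predictions are counted as misses.
print(strict_accuracy(docs))  # prints 0.0
```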

Two solutions I see here are:
- (simpler) relax the equality check so that it treats boolean `true` and string `"true"` as equal
- (more involved) make the C++ code report the prediction using the type of the dependent variable. The type of the dependent variable can be passed down from Java.

Also, the same scenario should be reproduced for integer types.
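To make the simpler option concrete, here is a rough Python sketch of a relaxed equality check that covers both the boolean and the integer case. It is only an illustration under stated assumptions: `canonical`, `relaxed_equals`, and `relaxed_accuracy` are hypothetical names, and the sketch assumes the prediction string uses lowercase `true`/`false` for booleans and plain decimal digits for integers.

```python
def canonical(value):
    """Render a value as the string a prediction field is assumed to hold."""
    # bool must be checked first: in Python, bool is a subclass of int,
    # and str(True) would give "True" rather than "true".
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def relaxed_equals(actual, predicted):
    """Compare actual and predicted values by their canonical string form."""
    return canonical(actual) == canonical(predicted)


def relaxed_accuracy(pairs):
    """Fraction of (actual, predicted) pairs that match under relaxed equality."""
    matches = sum(1 for actual, predicted in pairs if relaxed_equals(actual, predicted))
    return matches / len(pairs)


# Hypothetical (actual, predicted) pairs with boolean and integer actual values.
boolean_pairs = [(True, "true"), (False, "false"), (True, "false"), (False, "true")]
integer_pairs = [(1, "1"), (0, "0"), (2, "1"), (3, "3")]

print(relaxed_accuracy(boolean_pairs))  # prints 0.5
print(relaxed_accuracy(integer_pairs))  # prints 0.75
```

Comparing string forms keeps the check independent of the dependent variable's mapping type, at the cost of not distinguishing representations such as `1` and `1.0`.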

Accuracy metric fails when the dependent variable is of type 'boolean' or 'integer' #49796
