Skip to content

Add runtime field of type geo_shape #100492

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Oct 16, 2023
Merged

Conversation

iverase
Copy link
Contributor

@iverase iverase commented Oct 9, 2023

This PR adds the possibility to create runtime fields of type geo-shape. In order to create them, user s can define an emit function that takes either a geojson object or a WKT string that internally creates a geo_shape object. For example from a geo_shape doc value, we can generate the WKT representing the bounding box of the shape like (in yaml):

runtime:
              geometry_from_doc_value:
                type: geo_shape
                script:
                  source: |
                    GeoBoundingBox bbox = doc["geometry"].getBoundingBox();
                    emit("BBOX(" + bbox.topLeft().getLon() + "," + bbox.topLeft().getLat() + "," + bbox.bottomRight().getLon() + "," + bbox.bottomRight().getLat() + ")");

closes #61299

@iverase iverase added >enhancement :Analytics/Geo Indexing, search aggregations of geo points and shapes v8.12.0 labels Oct 9, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 9, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @iverase, I've created a changelog YAML for you.

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, but I have a question about concurrency that I think we need to address before merge.

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@iverase
Copy link
Contributor Author

iverase commented Oct 11, 2023

@elasticmachine update branch

Copy link
Contributor

@craigtaverner craigtaverner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice feature. I was worried about how most tests just used points, but I see the yaml tests do a much better job of covering other types. I did have a number of comments that could be considered, and also noticed that there exists documentation for using geo_point_field in painless, but did not see anything for geometry_field. Should that be added to a separate PR?

context.lookup(),
OnScriptError.FAIL
);
GeometryFieldScript geoPointFieldScript = leafFactory.newInstance(leafReaderContext);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variable name is from previous block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, updated.

@@ -259,6 +259,50 @@ public void testGeoPointFieldExecutionContext() throws IOException {
assertEquals("Point", points.get(1).get("type"));
}

@SuppressWarnings("unchecked")
public void testGeoShapeFieldExecutionContext() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test seems to be a copy of the geo_point test, with just the field context name changed. Is this because we cannot test actual shapes in server, so just verify that the new script context can handle points? Should we refactor this to a private method with a single parameter?

Copy link
Contributor Author

@iverase iverase Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I improved the test, thanks!

) {
@Override
public void execute() {
emitFromObject("POINT(0 0)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there perhaps value in testing something more than a point, perhaps a two point linestring?

geoFormatterFactory,
meta.get()
);
if (script.get() == null) {
return new GeoShapeWithDocValuesFieldMapper(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed these two versions of the constructor, but when looking up the constructor stack, I see they do ultimately end up at the same place, so I wonder if it is not cleaner to have a single constructor with just null for the script parts when missing? This would reduce code in several classes that have multiple constructors just for scripts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base class expect a different constructor to be used. This might be a nice follow up but not for this PR as it touches other mappings like geo_point.

@@ -339,6 +393,21 @@ public GeoShapeWithDocValuesFieldMapper(
this.indexer = indexer;
}

protected GeoShapeWithDocValuesFieldMapper(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above that we could remove these extra constructors, I think.

import static java.util.Collections.emptyMap;
import static org.hamcrest.Matchers.equalTo;

public class GeoShapeScriptFieldTypeTests extends AbstractNonTextScriptFieldTypeTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these tests seem to test only points. I wonder if there is value in testing something a little more complex, perhaps at least a two point linestring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to run with a LINESTRING

search:
index: geometries
body:
aggs:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that max-lat==max-lon==6 and min-lat==min-lon==-1 makes it harder to test for the possibility that x/y get swapped. When I looked through the assertions below at first I thought there was a x/y swapping error, before I realized my mistake. These results are correct for the original data provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runtime fields to support runtime type geo-shape
5 participants