-
-
Notifications
You must be signed in to change notification settings - Fork 927
Description
API Platform version(s) affected: 2.6.8 (and likely earlier) through 3.x
Description
MatchFilter
and TermFilter
do not properly support multi-level nested fields in Elasticsearch if the top level is a collection.
What Works
It seems to work fine if the top level is an object. For example, given this sample document from the docs linked above:
{
"driver" : {
"last_name" : "McQueen",
"vehicle" : [
{
"make" : "Powell Motors",
"model" : "Canyonero"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
}
Asking API Platform to filter on driver.vehicle.model
would generate a valid, working query like this:
{
"nested": {
"path": "driver.vehicle",
"query": {
"term": {"driver.vehicle.model": "Canyonero"}
}
}
}
Interestingly, the Elasticsearch docs suggest the query should instead be doubly-nested like this:
{
"nested": {
"path": "driver", // <-- instead of driver.vehicle
"query": {
"nested": { // <-- additional nested layer
"path": "driver.vehicle",
"query": {
"term": {"driver.vehicle.model": "Canyonero"}
}
}
}
}
}
But I'm assuming ES will happily accept either version. Regardless, everything described above seems to work just fine.
What's Broken
However, if we modify the structure to have an array of drivers
as the top-level property like this:
{
"race": "Piston Cup",
"drivers": [
{
"last_name" : "McQueen",
"vehicle" : [
{
"make" : "Powell Motors",
"model" : "Canyonero"
},
{
"make" : "Miller-Meteor",
"model" : "Ecto-1"
}
]
}
]
}
Attempting to filter on drivers.vehicle.model
generates this query:
{
"nested": {
"path": "drivers",
"query": {
"term": {"drivers.vehicle.model": "Canyonero"}
}
}
}
Which fails to match any documents because the path
is missing the .vehicle
part. Manually adding the .vehicle
part causes the expected documents to appear in the results:
{
"nested": {
"path": "drivers.vehicle", // <-- this change seems to work
"query": {
"term": {"drivers.vehicle.model": "Canyonero"}
}
}
}
Using a proper deeply-nested query also works (but API Platform won't generate this):
{
"nested": {
"path": "drivers", // <-- instead of drivers.vehicle
"query": {
"nested": { // <-- additional nested layer
"path": "drivers.vehicle",
"query": {
"term": {"drivers.vehicle.model": "Canyonero"}
}
}
}
}
}
Likely Cause
I think the current behavior is caused by FieldDatatypeTrait::getNestedFieldPath()
never recursively calling itself when the top-level type is a collection. It will never check the rest of the path to see if it might be a nested object.
How to reproduce
I don't have a reproducer for how the query fails to work in Elasticsearch, but I did tweak TermFilterTest::testApplyWithNestedProperty()
to demonstrate the unexpected query:
public function testApplyWithDeeplyNestedProperty(): void
{
$libraryType = new Type(Type::BUILTIN_TYPE_ARRAY, false, Library::class, true, new Type(Type::BUILTIN_TYPE_INT), new Type(Type::BUILTIN_TYPE_OBJECT, false, Library::class));
$bookType = new Type(Type::BUILTIN_TYPE_ARRAY, false, Book::class, true, new Type(Type::BUILTIN_TYPE_INT), new Type(Type::BUILTIN_TYPE_OBJECT, false, Book::class));
$messageType = new Type(Type::BUILTIN_TYPE_STRING);
$propertyMetadataFactoryProphecy = $this->prophesize(PropertyMetadataFactoryInterface::class);
$propertyMetadataFactoryProphecy->create(Library::class, 'libraries')->willReturn((new ApiProperty())->withBuiltinTypes([$libraryType]))->shouldBeCalled();
$propertyMetadataFactoryProphecy->create(Library::class, 'book')->willReturn((new ApiProperty())->withBuiltinTypes([$bookType]))->shouldBeCalled();
$propertyMetadataFactoryProphecy->create(Book::class, 'message')->willReturn((new ApiProperty())->withBuiltinTypes([$messageType]))->shouldBeCalled();
$resourceClassResolverProphecy = $this->prophesize(ResourceClassResolverInterface::class);
$resourceClassResolverProphecy->isResourceClass(Library::class)->willReturn(true)->shouldBeCalled();
$resourceClassResolverProphecy->isResourceClass(Book::class)->willReturn(true)->shouldBeCalled();
$nameConverterProphecy = $this->prophesize(NameConverterInterface::class);
$nameConverterProphecy->normalize('libraries.book.message', Library::class, null, Argument::type('array'))->willReturn('libraries.book.message')->shouldBeCalled();
$nameConverterProphecy->normalize('libraries.book', Library::class, null, Argument::type('array'))->willReturn('libraries.book')->shouldBeCalled();
$nameConverterProphecy->normalize('libraries', Library::class, null, Argument::type('array'))->willReturn('libraries')->shouldBeCalled();
$termFilter = new TermFilter(
$this->prophesize(PropertyNameCollectionFactoryInterface::class)->reveal(),
$propertyMetadataFactoryProphecy->reveal(),
$resourceClassResolverProphecy->reveal(),
$this->prophesize(IriConverterInterface::class)->reveal(),
$this->prophesize(PropertyAccessorInterface::class)->reveal(),
$nameConverterProphecy->reveal(),
['libraries.book.message' => null]
);
self::assertEquals(
['bool' => ['must' => [['nested' => ['path' => 'libraries', 'query' => ['nested' => ['path' => 'libraries.book', 'query' => ['term' => ['libraries.book.message' => 'Krupicka']]]]]]]]],
$termFilter->apply([], Library::class, null, ['filters' => ['libraries.book.message' => 'Krupicka']])
);
}
Possible Solution
First, fix FieldDatatypeTrait::getNestedFieldPath()
so it returns the complete parent path.
Once that's done, modify MatchFilter::getQuery()
and TermFilter::getQuery()
so that they return deeply nested
queries as shown in the Elasticsearch documentation. This seems "more correct" than the current approach of only using one nested
key regardless of how deeply nested the parameter actually is.
Additional Context
api-platform/api-platform#1375 describes an issue with filtering nested properties where getNestedFieldPath()
is also a suspected culprit. My case seems different because the top-level item in my path is a collection and not an object - the execution path is very different.