Skip to content

Rework parent-join to not require access to DocumentMapper #63738

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Oct 19, 2020

Conversation

romseygeek
Copy link
Contributor

Parent joins work using a cluster of field mappers: the join field itself;
a set of subfields that allow multiple relationships between parents and
children to be defined; and a metadata field that acts to only allow a
single join field per index to be defined. The various queries and
aggregations that use this infrastructure retrieve the join field mapper
via a static method and then build themselves by pulling individual
relationship mappers from this main mapper.

Using mappers rather than MappedFieldTypes means that we need to
expose DocumentMapper at search time, which is something we are
trying to avoid. This commit refactors things so that the join relations
are encapsulated in a Joiner object, which lives instead on the
MappedFieldType associated with the metadata join field. Rather than
using the ParentJoinFieldMapper and connected ParentIdFieldMappers,
we can now build queries and aggregations using this Joiner object,
retrieved via the QueryShardContext or AggregationContext using
a static helper method on Joiner itself. In addition, the mappers are
refactored to parametrized form, and the Builders for the metadata and
relation mappers are removed as they are always constructed directly
by the join field mapper, and never via parsable user input.

@romseygeek romseygeek added :Search Foundations/Mapping Index mappings, including merging and defining field types >refactoring v8.0.0 v7.11.0 labels Oct 15, 2020
@romseygeek romseygeek requested review from nik9000 and javanna October 15, 2020 12:49
@romseygeek romseygeek self-assigned this Oct 15, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Mapping)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 15, 2020
@nik9000
Copy link
Member

nik9000 commented Oct 15, 2020

Rework parent-join to not require access to DocumentMapper

hell yeah!

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great change! Would it be possible to split it into smaller PRs? For instance one could be to move the mappers to parametrized field mapper, another one to replace the lookup mechanism so that looking up mappers is no longer required. Or are these two tied to each other?

@romseygeek
Copy link
Contributor Author

I can probably split the parametrization out into a separate PR, although encapsulating the connection logic into the Relations class is a major part of it - I'll give it a go.

@romseygeek
Copy link
Contributor Author

I've removed the parametrization, and also undone the moving of LateParsingQuery to its own file, which ended up adding a lot of noise - I can do that in a separate PR.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments, looks great though, thanks for untangling this, I would not have known where to start ;)

/**
* @return a filter for child documents of a specific parent type
*/
public Query getChildrenFilter(String parentType) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is there a reason why this method starts with get while the two above don't?

/**
* Get the Joiner for this context, or {@code null} if none is configured
*/
public static Joiner getJoiner(Predicate<String> isMapped, Function<String, MappedFieldType> getFieldType) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we make this package private?

/**
* Constructs a Joiner based on a join field and a set of relations
*/
public Joiner(String joinField, List<Relations> relations) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be package private?

/**
* Utility class to help build join queries and aggregations, based on a join_field
*/
public class Joiner {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this final?

return Objects.hash(parent, children);
}

static List<Relations> parse(Object node) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse looks unused?

@@ -187,6 +189,11 @@ public MappedFieldType getFieldType(String path) {
return context.getFieldType(path);
}

@Override
public boolean isFieldMapped(String field) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may also remove getMapper from this class , one less usage of getMapperService ;)

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@romseygeek
Copy link
Contributor Author

@elasticmachine update branch

@romseygeek romseygeek merged commit 70d88ef into elastic:master Oct 19, 2020
@romseygeek romseygeek deleted the mapper/join branch October 19, 2020 11:17
romseygeek added a commit that referenced this pull request Oct 19, 2020
Parent joins work using a cluster of field mappers: the join field itself;
a set of subfields that allow multiple relationships between parents and
children to be defined; and a metadata field that acts to only allow a
single join field per index to be defined. The various queries and
aggregations that use this infrastructure retrieve the join field mapper
via a static method and then build themselves by pulling individual
relationship mappers from this main mapper.

Using mappers rather than MappedFieldTypes means that we need to
expose DocumentMapper at search time, which is something we are
trying to avoid. This commit refactors things so that the join relations
are encapsulated in a Joiner object, which lives instead on the
MappedFieldType associated with the metadata join field. Rather than
using the ParentJoinFieldMapper and connected ParentIdFieldMappers,
we can now build queries and aggregations using this Joiner object,
retrieved via the QueryShardContext or AggregationContext using
a static helper method on Joiner itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v7.11.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants