Skip to content

Review our LEFT vs. INNER join behavior for Include #29645

@roji

Description

@roji

When Include is used on a required navigation, we currently generate an INNER join. Since we prune "unneeded" Includes (e.g. because the terminating operator is Count), this leads to #19649, where the following two queries yield different results:

_dbContext.Activities.Include(a => a.Customer).Count();
_dbContext.Activities.Include(a => a.Customer).ToList().Count();

To fix this, we could switch to using LEFT join when the included navigation is required; this would prevent the query filter from affecting which principals are returned, and make the query results the same above.

A more extreme question is why we use INNER join in the first place for Include; since Include isn't supposed to be a filter - only include related entities - it logically corresponds more to a LEFT join. There's an assertion that INNER join is more efficient than LEFT join, but this is something we should investigate more deeply. Note that the relationship is still defined as required, which means that there should never be a principal without a dependent in the database (unless the user misconfigured the model).

A change from INNER to LEFT join would be breaking when the required dependent has a query filter, since Include would start returning principals where no dependent pass the query filter. The more extreme change to always do a LEFT join in principle wouldn't be more breaking (i.e. when there's no query filter), since the relationship is required and there should always be at least one dependent for all principals. Users who misconfigured their model would, however, be broken by this, as principals with no dependents would start getting returned.

Metadata

Metadata

Assignees

No one assigned
    No fields configured for Feature.

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions