Description
openedon Jan 11, 2020
Entity Framework is generating a composite ORDER BY clause when a query on one entity .Includes() an entity under a one-to-many relation and that related entity is loaded in the SELECT clause. The order by clause includes the primary keys of every single entity in the query as a whole, not just those under that relation. None of this happens if the relation is on a *-to-one relation. I haven't tested many-to-many.
This in combination with the change in EF 3.0 to utilize joins instead of multiple selects can have severe negative performance implications for queries with .Include() calls that worked well enough prior to EF 3.0. Not only does that mean there's extra load on the database server to do these orderings - the sql generated by EF 3.1 for one of our large queries executes in two minutes with the ORDER BY clause and 30 seconds without it - but also the database server isn't able to even start streaming results until it more or less has all the results assembled, leading to client-side timeouts.
I'm not sure if this is a bug or just undesirable behavior. I've found nothing documenting this on pages like https://docs.microsoft.com/en-us/ef/core/querying/related-data, and I've found no clear way to disable it or get around it, absent rewriting these queries or doing something ugly with a command interceptor, but I could have just missed a source of information. I totally understand the pivot to joins implies some things need to be rewritten, but this ORDER BY especially on every entity is excessive and mandates us to manually fine tune data loading where we were previously able to rely on EF for it, even when joins alone are fine.
Solutions in order of preference:
- Don't introduce these order by clauses unless asked for in the query
- Document this and provide some way to opt out of this behavior on a global or per-query basis.
- Document this and don't order by other entitles' keys outside the one-to-many-relation
- Document this behavior loudly and provide best practices to work around it
Steps to reproduce
See my gist with code and comments at https://gist.github.com/cwhitelaw/df7d12778966e16e0119a4ffce0a06dc?ts=4
Model:
public class MainEntity
{
public int Id { get; set; }
public int SingleRelatedEntityId { get; set; }
public SingleRelatedEntity SingleRelatedEntity { get; set; }
public List<ManyRelatedEntity> ManyRelatedEntities { get; set; }
}
public class SingleRelatedEntity
{
public int Id { get; set; }
}
public class ManyRelatedEntity
{
public int Id { get; set; }
public int MainEntityId { get; set; }
public MainEntity MainEntity { get; set; }
}
Output (the problem manifests in the queries labeled with main entities including to-many relation
):
Query: main entities
SELECT [m].[Id], [m].[SingleRelatedEntityId]
FROM [MainEntities] AS [m]
Query: single-related entities
SELECT [s].[Id]
FROM [SingleRelatedEntities] AS [s]
Query: many-related entities
SELECT [m].[Id], [m].[MainEntityId]
FROM [ManyRelatedEntities] AS [m]
Query: main entities including to-one relation
SELECT [m].[Id], [m].[SingleRelatedEntityId], [s].[Id]
FROM [MainEntities] AS [m]
INNER JOIN [SingleRelatedEntities] AS [s] ON [m].[SingleRelatedEntityId] = [s].[Id]
Query: main entities including to-many relation
SELECT [m].[Id], [m].[SingleRelatedEntityId], [m0].[Id], [m0].[MainEntityId]
FROM [MainEntities] AS [m]
LEFT JOIN [ManyRelatedEntities] AS [m0] ON [m].[Id] = [m0].[MainEntityId]
ORDER BY [m].[Id], [m0].[Id]
Query: main entities including to-many relation, order by MainEntity.Id descending
SELECT [m].[Id], [m].[SingleRelatedEntityId], [m0].[Id], [m0].[MainEntityId]
FROM [MainEntities] AS [m]
LEFT JOIN [ManyRelatedEntities] AS [m0] ON [m].[Id] = [m0].[MainEntityId]
ORDER BY [m].[Id] DESC, [m0].[Id]
Query: main entities including both to-one and to-many relation
SELECT [m].[Id], [m].[SingleRelatedEntityId], [s].[Id], [m0].[Id], [m0].[MainEntityId]
FROM [MainEntities] AS [m]
INNER JOIN [SingleRelatedEntities] AS [s] ON [m].[SingleRelatedEntityId] = [s].[Id]
LEFT JOIN [ManyRelatedEntities] AS [m0] ON [m].[Id] = [m0].[MainEntityId]
ORDER BY [m].[Id], [s].[Id], [m0].[Id]
Query: many-related entities through MainEntity relation
SELECT [m0].[Id], [m0].[MainEntityId]
FROM [MainEntities] AS [m]
INNER JOIN [ManyRelatedEntities] AS [m0] ON [m].[Id] = [m0].[MainEntityId]
Query: many-related entities including their MainEntity
SELECT [m].[Id], [m].[MainEntityId], [m0].[Id], [m0].[SingleRelatedEntityId]
FROM [ManyRelatedEntities] AS [m]
INNER JOIN [MainEntities] AS [m0] ON [m].[MainEntityId] = [m0].[Id]
Further technical details
EF Core version: 3.1.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer 3.1.0
Target framework: .NET Core 3.1
Operating system: Windows 10 x64
IDE: Visual Studio 2019