Significant Query Slowdown When Using Multiple Joins Due To Changes In 3.0

Note:

* Issue #20892 has been created to track actual work on split queries in EF Core 5.0. The new issue is locked so that it can be subscribed to for updates on the implementation without noise.
* This older issue has been re-purposed to be discussion about different approaches, etc. It will not be locked because it is important to us that we don’t shut down channels of communication with the community.
---

Preview6 introduces `AsSplitQuery` API to load collections using multiple queries. Details of the API has been posted https://github.com/dotnet/efcore/issues/20892#issuecomment-644456029

---

A way to rewrite the query to avoid large result set & mimick previous version behavior is posted https://gist.github.com/smitpatel/d4cb3619e5b33e8d9ea24d3f2a88333a
<hr />

_Note: Per @roji's request, I am opening this in response to the comments on https://github.com/aspnet/EntityFrameworkCore/issues/17455_

I have a very large query that contains about 57 includes that is being used to copy down a large entity so that it can be modified and cloned. 

With EF 2.2.6, this large query ran successfully in about 1-3 seconds (variable).  With the changes in 3.0 (all includes create one entire SQL statement with joins), the query takes significantly longer and always times out with the default execution timeout settings.

### Steps to reproduce

Use a (IMO nasty)  Linq query similar to the one as follows:

``` csharp
dbContext.Projects
	.Include(z => z.Financial)
	.Include(z => z.ProjectProtocol)
	.Include(z => z.ReportCategories)
	.Include(z => z.ReportSubCategories)
	.Include(x => x.SubProjects).ThenInclude(y => y.Address)
	.Include(x => x.SubProjects).ThenInclude(y => y.LegacyAddress)
	.Include(x => x.SubProjects).ThenInclude(z => z.BuildingTypes)
	.Include(x => x.SubProjects).ThenInclude(z => z.Buildings).ThenInclude(b => b.BuildingType)
	.Include(x => x.SubProjects).ThenInclude(z => z.Buildings).ThenInclude(b => b.Site)
	.Include(x => x.SubProjects).ThenInclude(z => z.Sites).ThenInclude(s => s.Address)
	.Include(x => x.SubProjects).ThenInclude(z => z.Participants).ThenInclude(p => p.Address)
	.Include(x => x.SubProjects).ThenInclude(z => z.ExcelFileJson)
	.Include(x => x.SubProjects).ThenInclude(z => z.CompanionAddress)
	.Include(x => x.SubProjects).ThenInclude(z => z.UtilityTypes)
	.Include(x => x.SubProjects).ThenInclude(z => z.InspectedUnits).ThenInclude(i => i.Building)
	.Include(x => x.SubProjects).ThenInclude(z => z.InspectedUnits).ThenInclude(i => i.UnitType)
	.Include(x => x.SubProjects).ThenInclude(z => z.Utilities).ThenInclude(i => i.UtilityType)
	.Include(x => x.SubProjects).ThenInclude(z => z.Units).ThenInclude(i => i.Building)
	.Include(x => x.SubProjects).ThenInclude(z => z.UnitTypes)
	.Include(x => x.SubProjects).ThenInclude(z => z.CommonAreas).ThenInclude(ca => ca.Building)
	.Include(x => x.SubProjects).ThenInclude(z => z.FixtureAreas).ThenInclude(w => w.Fixtures)
	.Include(x => x.SubProjects).ThenInclude(y => y.LineItems).ThenInclude(z => z.EnergyOneItems)
	.Include(x => x.SubProjects).ThenInclude(y => y.LineItems).ThenInclude(z => z.EnergyTwoItems)
	.Include(x => x.SubProjects).ThenInclude(y => y.LineItems)
	.ThenInclude(z => z.TraditionalItem).AsNoTracking().FirstOrDefault(x => x.Id == project.Id);
```

### Results:

With EF Core 2.2.6 - I can see in the output via the SQL Server Profiler that EF is breaking up the LINQ statement into smaller queries. The overall process takes about 1-3 seconds.

With EF Core 3.0 - I can see in the output via the SQL Server Profiler that EF is emitting one massive query with lots and lots of joins.  The overall process always times out with the default execution timeout setting.

At this point, I am open to the notion that this query needs to be either re-written or the process needs to be changed for handling the cloning. I would still like to hear if there are any workarounds, findings that this is a bug or other suggestions to avoid having to devote a significant amount of time on rewriting.

**Edit**
For now we worked around this by splitting the query up manually using [EF Plus' "Includes Optimized" ](https://entityframework-plus.net/ef-core-query-include-optimized)method and then looping through the change tracker to set all of the entities as untracked so we can then reset their keys so that the graph can be comitted as a clone (this gave me a flashback to my EF 6 days).

Note: The model changed somewhat between the time this issue was first encountered and now due to user requests and other factors. I should also note that the system is now in production and users are pretty happy with the performance. 

``` csharp
            var clone = dbContext.Projects
                .IncludeOptimized(z => z.Financial)
                .IncludeOptimized(z => z.ProjectProtocol)
                .IncludeOptimized(z => z.ReportCategories)
                .IncludeOptimized(z => z.ReportSubCategories)
                .IncludeOptimized(z => z.SubProject)
                .IncludeOptimized(z => z.SubProject.ValidationFlag)
                .IncludeOptimized(z => z.SubProject.ExcelFileJson)
                .IncludeOptimized(z => z.SubProject.EnergyAuditUtilities)
                .IncludeOptimized(z => z.SubProject.EnergyAuditData)
                .IncludeOptimized(z => z.SubProject.EnergyAuditData.EnergyAuditAreas)
                .IncludeOptimized(z => z.SubProject.Address)
                .IncludeOptimized(z => z.SubProject.Utilities)
                .IncludeOptimized(z => z.SubProject.UtilityTypes)
                .IncludeOptimized(z => z.SubProject.Units)
                .IncludeOptimized(z => z.SubProject.Units.Select(y => y.Building))
                .IncludeOptimized(z => z.SubProject.BuildingTypes)
                .IncludeOptimized(z => z.SubProject.Buildings)
                .IncludeOptimized(z => z.SubProject.Buildings.Select(y => y.BuildingType))
                .IncludeOptimized(z => z.SubProject.Buildings.Select(y => y.Site))
                .IncludeOptimized(z => z.SubProject.Sites)
                .IncludeOptimized(z => z.SubProject.Sites.Select(y => y.Address))
                .IncludeOptimized(z => z.SubProject.Participants)
                .IncludeOptimized(z => z.SubProject.Participants.Select(y => y.Address))
                .IncludeOptimized(z => z.SubProject.InspectedUnits)
                .IncludeOptimized(z => z.SubProject.InspectedUnits.Select(y => y.Building))
                .IncludeOptimized(z => z.SubProject.InspectedUnits.Select(y => y.UnitType))
                .IncludeOptimized(z => z.SubProject.UnitTypes)
                .IncludeOptimized(z => z.SubProject.CommonAreas)
                .IncludeOptimized(z => z.SubProject.CommonAreas.Select(y => y.Building))
                .IncludeOptimized(z => z.SubProject.CommonAreas.Select(y => y.Building).Select(z => z.Units))
                .IncludeOptimized(z => z.SubProject.FixtureAreas)
                .IncludeOptimized(z => z.SubProject.FixtureAreas.Select(y => y.Fixtures))
                .IncludeOptimized(x => x.SubProject.LineItems)
                .IncludeOptimized(x => x.SubProject.LineItems.Select(y => y.EnergyOneItem))
                .IncludeOptimized(x => x.SubProject.LineItems.Select(y => y.EnergyTwoTwoItem))
                .IncludeOptimized(x => x.SubProject.LineItems.Select(y => y.TraditionalItem))
                .FirstOrDefault(x => x.Id == project.Id);

            if (clone != null)
            {
                foreach (var entityEntry in dbContext.ChangeTracker.Entries())
                {
                    if (entityEntry.Entity != null)
                    {
                        entityEntry.State = EntityState.Detached;
                    }
                }
                return clone;
            }
```

My team was struggling with this at first due to the fact that EF was at first wiping out the entities when we detatched them due to an issue on https://github.com/dotnet/efcore/issues/18982. Using the work around that was posted there allowed for things to work. The overall performance is actually better due to the fact there isn't any client side evaluation. However, I would still prefer if the behavior of Includes Optimized (which is pretty much what EF 2.x did, splitting the query) was something that came out of the box with EF Core. I also do not like it how this entire scenario can no longer be done with a no tracking query (or so it seems), as it was possible to do it before with EF Core 2.x. 

### Further technical details

EF Core version: 3.0.0
Database provider: Microsoft.EntityFramework.SqlServer
Target framework: .Net Core 3.0
Operating system: Windows 10 x64 / Windows Server 2016 
IDE: Visual Studio 2019 Pro (16.3)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Significant Query Slowdown When Using Multiple Joins Due To Changes In 3.0 #18022

dragnilar
openedon Sep 24, 2019

Steps to reproduce

Results:

Further technical details

Assignees

Labels

Type

Projects

Milestone

Relationships

Development