Skip to content

Can't prefetch base instance #359

@lerela

Description

@lerela

Hi,

I am struggling to understand how to make my queries more efficient when the parent object must be accessed (mass delete use case that led me here is described below).

Here is a basic scenario:

from django.db import models
from polymorphic.models import PolymorphicModel

class Base(PolymorphicModel):
    field1 = models.CharField(max_length=8)

class Child1(Base):
    field2 = models.CharField(max_length=8)

All I want to do is to load both Child1 and the Base instance so that I can access both Child1 & its Base counterpart (child1.base_ptr) without additional SQL hits.

I create my object with:

child1 = Child1(field1="a", field2="b")

And I fetch it with:

child1 = Child1.objects.filter(pk=1)[0]

This triggers the following SQL query:

 SELECT "test_polymorphic_base"."id", "test_polymorphic_base"."polymorphic_ctype_id", "test_polymorphic_base"."field1", "test_polymorphic_child1"."base_ptr_id", "test_polymorphic_child1"."field2" FROM "test_polymorphic_child1" INNER JOIN "test_polymorphic_base" ON ("test_polymorphic_child1"."base_ptr_id" = "test_polymorphic_base"."id") WHERE "test_polymorphic_child1"."base_ptr_id" = 1  LIMIT 1; args=(1,)

Note that both field1 and field2 were successfully SELECTed, as well as any field allowing to directly instantiate the Base instance.
Calling child1.field1 and child1.field2 do not trigger additional query, as expected.

However, calling child1.base_ptr does trigger a new database hit:

SELECT "test_polymorphic_base"."id", "test_polymorphic_base"."polymorphic_ctype_id", "test_polymorphic_base"."field1" FROM "test_polymorphic_base" WHERE "test_polymorphic_base"."id" = 1; args=(1,)

Note that all those fields were already fetched with previous query. I figure I'd use .select_related, but the result is the same:

child1 = Child1.objects.select_related('base_ptr').filter(pk=1)[0]
child1.base_ptr # triggers an extra SQL query

I also tried .prefetch_related:

child1 = Child1.objects.prefetch_related('base_ptr').filter(pk=1)[0]

Which fails immediately with:

ValueError: 'base_ptr' does not resolve to an item that supports prefetching - this is an invalid parameter to prefetch_related().

This seems to be related to the fact that Child1.base_ptr does not have a .get_prefetch_queryset() method.

Now, my initial (and real) use case, which explains why all my examples use querysets: when trying to delete children objects, the Django deletion collector performs (in this case) getattr(child1, 'base_ptr'), which triggers as described one spurious query for each item, even though all data required for deletion was already loaded in memory. This is greatly slowing down the delete process and seems unneeded:

Child1.objects.select_related('base_ptr').delete()

# Objects _and_ base fields are fetched when the queryset is loaded
DEBUG:django.db.backends:(0.001) SELECT "test_polymorphic_base"."id", "test_polymorphic_base"."polymorphic_ctype_id", "test_polymorphic_base"."field1", "test_polymorphic_child1"."base_ptr_id", "test_polymorphic_child1"."field2" FROM "test_polymorphic_child1" INNER JOIN "test_polymorphic_base" ON ("test_polymorphic_child1"."base_ptr_id" = "test_polymorphic_base"."id"); args=()
# Now we are in the deletion Collector .collect at line 201: useless query! There is one for each queryset object.
DEBUG:django.db.backends:(0.000) SELECT "test_polymorphic_base"."id", "test_polymorphic_base"."polymorphic_ctype_id", "test_polymorphic_base"."field1" FROM "test_polymorphic_base" WHERE "test_polymorphic_base"."id" = 1; args=(1,)
# Actual deletion
DEBUG:django.db.backends:(0.000) DELETE FROM "test_polymorphic_child1" WHERE "test_polymorphic_child1"."base_ptr_id" IN (1); args=(1,)
DEBUG:django.db.backends:(0.000) DELETE FROM "test_polymorphic_base" WHERE "test_polymorphic_base"."id" IN (1); args=(1,)

I figure that to prevent this, I'd need child1.base_ptr not to trigger an extra query but I can't find a way to let Django know it already has everything it needs to instantiate it.

Am I missing something? Is there any way around this?

Thanks

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions