-
-
Notifications
You must be signed in to change notification settings - Fork 302
Defer to chunk_size parameter on .iterators for fetching get_real_instances() #672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enhances django-polymorphic's iterator behavior by respecting the chunk_size parameter in .iterator() calls, rather than always chunking at 100 objects. The change removes the undocumented Polymorphic_QuerySet_objects_per_request constant and implements database-aware chunking that respects max_query_params limits.
Key changes:
- Polymorphic querysets now honor the
chunk_sizeparameter from.iterator()calls forget_real_instances()queries - Chunking automatically respects database-specific
max_query_paramslimits (e.g., SQLite's 1000 parameter limit) - On PostgreSQL without explicit
chunk_size, all results are now fetched in a single batch per subclass type (1+N queries where N is the number of subclasses)
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/polymorphic/query.py | Removed Polymorphic_QuerySet_objects_per_request constant and refactored _polymorphic_iterator() to respect chunk_size parameter and database max_query_params limits |
| src/polymorphic/tests/test_orm.py | Added comprehensive test_iteration() covering base object iteration, polymorphic iteration with different chunk sizes, and query count verification across database vendors |
| docs/changelog.rst | Added changelog entry for v4.2.0 documenting the new iterator behavior |
| justfile | Enhanced test-db target to accept optional test arguments for more flexible test execution |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #672 +/- ##
==========================================
+ Coverage 74.71% 74.96% +0.24%
==========================================
Files 21 21
Lines 1337 1342 +5
Branches 211 211
==========================================
+ Hits 999 1006 +7
Misses 261 261
+ Partials 77 75 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
07f39dd to
3e61fba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
0191295 to
c51d67e
Compare
- Add iteration tests. - Update documentation with iteration performance considerations.
336fe8f to
9dda8d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This PR fixes #465
Previous behavior would generate a new WHERE IN query for every 100 objects. This behavior is not problematic for most workflows because they probably fetch fewer than 100 rows - but it is clearly undesirable and overly cautious for cases that pull many more rows.
Django ORM iterators accept a chunk_size parameter that is currently ignored by polymorphic querysets after the initial query during the get_real_instances phase. Nominally, chunk_size does not modify the number of sql queries performed but instead the number of rows fetched by the connection cursor. However, there is precedent in the Django ORM for using it this way, as it does control the number of queries issued when prefetch_related is used. prefetch_related is also the most directly analogous behavior in the ORM to how polymorphic iteration is performed. I think it is therefore appropriate to allow it to control the size of the get_real_instance queries.
The other complexity here is that different rdbms have different limits on the number of parameters that can be used in a SQL query. This comes into play because get_real_instances uses a WHERE IN query to fetch rows by pk. This PR will set the chunk_size to the minimum of the passed in chunk_size and the value of
connection.features.max_query_params.When there is no max param limit (postgresql) and the chunk limit exceeds the queryset size, django-polymorphic will now require 1+N queries where N is the number of polymorphic subclasses when iterating over Parent.objects.all(). On sqlite the max param size is 999, so the queries will require 1+N queries per 999 rows.
We still use
polymorphic.query.Polymorphic_QuerySet_objects_per_requestto set a default global chunk size. We bump this value from 100 to 2000 though to be in line with the default chunk_size described in the Django docs.