-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: added parameter to optionally rename aliases of aggregated columns used in GROUP BY statments #28444
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhyy thanks for the PR! I believe some other engines have tackled this with the |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #28444 +/- ##
==========================================
+ Coverage 69.76% 70.15% +0.39%
==========================================
Files 1911 1921 +10
Lines 74994 76157 +1163
Branches 8353 8353
==========================================
+ Hits 52316 53427 +1111
- Misses 20629 20681 +52
Partials 2049 2049
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Thanks @fhyy for the PR. My thinking was the underlying SQLAlchemy dialect should handle the various nuances in terms of how/where aliases are defined. |
CC @cgivre |
Thanks for the quick feedback! @villebro I tried the Drill supports the aliases as they are, it just behaves unexpectedly with I agree with @john-bodley that the dialects should handle it instead. A boolean parameter isn't as flexible as it may need to be. In this case I would also propose updating the previous Thoughts on this? |
I know I'm a little late to this conversation, and I'm happy to help out where I can, but I'm very unclear as to what the actual issue is. Is the fact that Drill supports aliases in |
Yes. The fact that Drill supports aliases in
Drill will use the alias Renaming such aliases solves the problem, but not all aliases needs to be renamed. You can find more information in issue #28443 |
@cgivre : See my comment on issue 20349 for what the problem is with GROUP BY in Apache Drill. |
The value is true if the engine is able to pick the source column for aggregation clauses used in ORDER BY when a column in SELECT has an alias that is the same as a source column.
* Renamed attribute allows_alias_to_source_column to order_by_allows_alias_to_source_column * Added attribute group_by_allows_alias_to_source_column * Rename aliases for source columns if used in GROUP BY and if group_by_allows_alias_to_source_column is false
Closing this for now. I'm working on fixing the issues with the tests and also updating my fork to the latest version. I managed to break this PR in the process... |
I created a new pull request with a slightly different implementation and based of the newer version: |
SUMMARY
Fixed issue #28443
Renamed
db_engine_specs
parameterallows_alias_to_source_column
toorder_by_allows_alias_to_source_column
and added the new parametergroup_by_allows_alias_to_source_column
.The previous parameter is used to tell the SQLA generator to rename aliases used in
ORDER BY
statements with aggregations, to ensure that the source column is referenced. Some engines (e.g. Drill) needs to be able to do the same thing for aliases inGROUP BY
statements.The new parameter is used to tell the SQLA generator to rename any alias of a source column that is used in an aggregation in a
GROUP BY
statement, to ensure that the source column is referenced.Added documentation of the new
group_by_allows_alias_to_source_column
parameter, and fixed errors in the documentation oforder_by_allows_alias_to_source_column
/(previously)allows_alias_to_source_column
For example this query
becomes
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Example:
ADDITIONAL INFORMATION