draft and include a field naming guidance for window functions #632

@jimexist

We do allow query outputs (i.e. record batch schema and physical plan schema) to contain columns with the same name. This is quite common in join queries where more than one relation contains the same column name.

My previous comment was more about the inconsistency in how window function names are generated in the logical plan vs. the physical plan. For example, in the physical plan, we don't take the partition by clause, order by clause, and window frame into account when creating the column name. If we don't want to maintain this consistency, I think we should update this invariant instead: https://github.com/apache/arrow-datafusion/blob/master/docs/specification/invariants.md#the-physical-schema-is-invariant-under-planning.
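To make the mismatch concrete, here is a minimal sketch of the two naming strategies. It is not DataFusion's actual code: the `WindowExpr` struct, the `short_name` and `full_name` functions, and the exact name format are all hypothetical and chosen only for illustration.

```rust
/// Hypothetical, simplified description of a window expression.
/// This is NOT a DataFusion type; it exists only for this sketch.
struct WindowExpr {
    /// The function call itself, e.g. "SUM(c1)".
    func: String,
    /// Columns listed in the PARTITION BY clause (may be empty).
    partition_by: Vec<String>,
    /// Columns listed in the ORDER BY clause (may be empty).
    order_by: Vec<String>,
    /// Window frame text, e.g. "ROWS BETWEEN 1 PRECEDING AND CURRENT ROW".
    frame: Option<String>,
}

/// Naming that ignores the window specification entirely, which is
/// the behaviour described above for the physical plan.
fn short_name(e: &WindowExpr) -> String {
    e.func.clone()
}

/// Naming that folds every clause into the column name
/// (an assumed format, not an agreed-upon rule).
fn full_name(e: &WindowExpr) -> String {
    let mut name = e.func.clone();
    if !e.partition_by.is_empty() {
        name.push_str(&format!(" PARTITION BY [{}]", e.partition_by.join(", ")));
    }
    if !e.order_by.is_empty() {
        name.push_str(&format!(" ORDER BY [{}]", e.order_by.join(", ")));
    }
    if let Some(frame) = &e.frame {
        name.push_str(&format!(" {}", frame));
    }
    name
}
```

With `short_name`, `SUM(c1)` partitioned by `c2` and `SUM(c1)` ordered by `c3` get the same column name. With `full_name` they stay distinct, so a schema built from one strategy will not match a schema built from the other.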

https://github.com/apache/arrow-datafusion/blob/master/docs/specification/output-field-name-semantic.md contains some rules on how we want to generate field names for the physical schema. We don't have any rules for window functions at the moment, so it may be worth formalizing them in that document as well.

Originally posted by @houqp in #622 (comment)
