Always inline static data in every row of query results that selected a static column #7668

@jleibs

Description

This could either fully replace #7667 as the default behavior, or complement it by introducing a static = INLINE configuration option.

Motivation

When static data is included in a query result, it ends up in a special row 0 with null index values. Users need to know either to use LatestAt in their query, or to combine the static data from row 0 with the data from the other rows themselves.

However, the general understanding of static data is somewhat distinct from the concept of logging data at TimeInt::MIN and then joining it via LatestAt.

I believe we have semantically suggested that this data should behave as a constant value across all time and all timelines. By that interpretation, static data should NOT require a LatestAt join. It should just be there regardless of the index or TimeInt you use to query.

Proposal

  • Never include a separate row 0 with static results.
  • When a Column is selected that corresponds to static data, include that same data in every row we return.
  • Add a helper, .select_static(), that always returns a single-row result containing any static data columns.
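
As a rough illustration of the three points above, here is a minimal sketch in plain Python. It does not use the real dataframe API: rows are modeled as dicts, and `inline_static`, `select_static`, the `frame` index, and the `map` column are all hypothetical names chosen for the example.

```python
from typing import Any

Row = dict[str, Any]


def inline_static(rows: list[Row], index: str, static_cols: list[str]) -> list[Row]:
    """Drop the null-index static row and copy its static values into every other row."""
    static_values: Row = {}
    data_rows: list[Row] = []
    for row in rows:
        if row[index] is None:
            # Current layout: static data sits in a special row with a null index.
            static_values = {col: row[col] for col in static_cols}
        else:
            data_rows.append(row)
    # Proposed layout: every returned row carries the static values.
    return [{**row, **static_values} for row in data_rows]


def select_static(rows: list[Row], index: str, static_cols: list[str]) -> Row:
    """Hypothetical stand-in for the proposed .select_static(): one row, static columns only."""
    for row in rows:
        if row[index] is None:
            return {col: row[col] for col in static_cols}
    return {col: None for col in static_cols}


# Assumed shape of today's result: row 0 has a null index and holds the static column.
current = [
    {"frame": None, "position": None, "map": "city.osm"},
    {"frame": 1, "position": (0.0, 0.0), "map": None},
    {"frame": 2, "position": (1.0, 0.5), "map": None},
]

proposed = inline_static(current, index="frame", static_cols=["map"])
static_only = select_static(current, index="frame", static_cols=["map"])
```

In this sketch `proposed` has no null-index row and every remaining row carries the `map` value, while `static_only` is a single row holding just the static column.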

Benefits

  • More intuitive behavior for new users.
  • No special handling of row 0
  • Returned Index column never has nulls

Disadvantages

  • Data duplication for users who don't know they are working with static data
    • Static data can sometimes be used for large things like maps. If .select() includes static columns, the user could end up inadvertently creating situations where they run out of memory.

Metadata

Labels

enhancement (New feature or request), feat-dataframe-api (Everything related to the dataframe API)
