[Feature] Support multiple partitioned fields for partitioned table #195

@wuchong

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, Fluss supports partitioned tables with only a single partition field. This feature request proposes extending that capability to multiple partition fields. This would allow more flexible and efficient data partitioning, improving query performance and data management in real-time analytics use cases.

Use-case:

Consider a user who needs to partition data by both a region field and a date field. With only single-field partitioning available, the user has to fall back on complex and less efficient workarounds. With multiple partition fields, users can partition a table by region and date directly, which simplifies data organization and improves query performance: partition pushdown lets a query read only the partitions for a specific region.
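To illustrate the pushdown benefit, here is a minimal Python sketch of pruning partitions by an equality filter. It is not Fluss code: the `partitions` list and the `prune` helper are hypothetical names made up for this example.

```python
from typing import Dict, List

# Hypothetical partitions of a table partitioned by (region, date).
partitions: List[Dict[str, str]] = [
    {"region": "eu", "date": "2024-01-01"},
    {"region": "eu", "date": "2024-01-02"},
    {"region": "us", "date": "2024-01-01"},
]


def prune(parts: List[Dict[str, str]], filters: Dict[str, str]) -> List[Dict[str, str]]:
    """Keep only the partitions whose values match every equality filter."""
    return [p for p in parts if all(p.get(k) == v for k, v in filters.items())]


# A query filtering on region only needs to read the "eu" partitions.
eu_only = prune(partitions, {"region": "eu"})
```

With a single partition field (say, date only), a region filter could not skip any partition and every one would have to be scanned.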

Solution

  • Extend the current partitioning logic to handle a list of partitioned fields.
  • Update the table creation syntax to allow specifying multiple partitioned fields.
  • Modify the storage and indexing mechanisms to support partitions defined by multiple fields.
  • Ensure backward compatibility with existing tables that use a single partitioned field.
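The first and last points could be sketched roughly as follows. This is an illustration in Python, not the Fluss implementation: `PartitionSpec`, `partition_name`, and both naming formats (a bare value for single-field tables, `key=value` segments joined by `/` for multi-field tables) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PartitionSpec:
    """Hypothetical ordered list of partition fields for one table."""

    fields: Tuple[str, ...]

    def partition_name(self, row: Dict[str, str]) -> str:
        """Derive the partition name for a row from its partition field values."""
        if len(self.fields) == 1:
            # Assumed legacy format: single-field tables keep the bare value,
            # so existing partitions resolve to the same name as before.
            return str(row[self.fields[0]])
        # Assumed new format: one key=value segment per field, in declared order.
        return "/".join(f"{f}={row[f]}" for f in self.fields)


row = {"region": "eu", "date": "2024-01-01", "amount": "42"}
legacy = PartitionSpec(("date",)).partition_name(row)
multi = PartitionSpec(("region", "date")).partition_name(row)
```

Treating the single-field case as a list of length one keeps one code path for both kinds of table, while the special-cased name preserves compatibility with partitions that already exist.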

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!