-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Closed
Labels
enhancementNew feature or requestNew feature or request
Description
Is your feature request related to a problem or challenge?
DataFrameWriteOptions is missing an order by / sort by like available in SQL.
For sql we have the option to sort, for example:
You can use the WITH ORDER clause of the CREATE EXTERNAL TABLE if your data is already ordered
https://datafusion.apache.org/user-guide/sql/ddl.html#create-external-table
CREATE EXTERNAL TABLE test (
c1 VARCHAR NOT NULL,
c2 INT NOT NULL,
c3 SMALLINT NOT NULL,
c4 SMALLINT NOT NULL,
c5 INT NOT NULL,
c6 BIGINT NOT NULL,
c7 SMALLINT NOT NULL,
c8 INT NOT NULL,
c9 BIGINT NOT NULL,
c10 VARCHAR NOT NULL,
c11 FLOAT NOT NULL,
c12 DOUBLE NOT NULL,
c13 VARCHAR NOT NULL
)
STORED AS CSV
-- this line tells DataFusion the data in the file is already ordered by (c2 ASC)
WITH ORDER (c2 ASC)
LOCATION '/path/to/aggregate_test_100.csv'
OPTIONS ('has_header' 'true');But for writing my parquet or other format files, we don't support it.
Describe the solution you'd like
Add the sort support option for DataFrameWriteOptions
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request