Support cloud storage as the source/destination of SELECT ... INTO OUTFILE & LOAD DATA #20582

Open
SunRunAway opened this issue Oct 22, 2020 · 6 comments
Labels
sig/execution SIG execution type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@SunRunAway
Contributor

SunRunAway commented Oct 22, 2020

Feature Request

Is your feature request related to a problem? Please describe:

In a cloud environment, it is not possible to load data from local files.

Describe the feature you'd like:

Syntax:

SELECT * FROM t INTO OUTFILE S3 's3://bucket-name/data.txt?region=us-west-2';
-- or
SELECT * FROM t INTO OUTFILE S3 's3-us-west-2://bucket-name/data.txt';
LOAD DATA FROM S3 's3://bucket-name/data.txt?region=us-west-2' INTO TABLE t;

The URI scheme and parameters for S3 files, along with the credential policy, can be similar to BR's.

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

@SunRunAway SunRunAway added the type/feature-request Categorizes issue or PR as related to a new feature. label Oct 22, 2020
@SunRunAway
Contributor Author

SunRunAway commented Oct 22, 2020

cc @zz-jason @nullnotnil

@ghost

ghost commented Oct 22, 2020

I think this is a very useful FR. In terms of syntax, I suggest we follow AWS Aurora, which is slightly different:

SELECT * FROM employees INTO OUTFILE S3 's3-us-west-2://aurora-select-into-s3-pdx/sample_employee_data'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n';

LOAD DATA FROM S3 's3://mybucket/data.txt'
    INTO TABLE table1
    (column1, @var1)
    SET table_column2 = @var1/100;

Docs:

@SunRunAway
Contributor Author

SunRunAway commented Oct 23, 2020

I suggest we follow AWS Aurora

I've incorporated @nullnotnil's suggestion into the issue description.


The current URI format in BR puts the region in a URL query parameter, as in s3://bucket-name/data.txt?region=us-west-2, while AWS Aurora puts it in the URI scheme, as in s3-us-west-2://bucket-name/data.txt.
Would it be better to support both for consistency?
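
To make the question concrete, here is a minimal sketch of accepting both region placements and normalizing them into one form. It is only an illustration, not BR's or TiDB's actual parsing: the names `S3Location` and `parse_s3_uri` are hypothetical, and rejecting a URI that carries two different regions is an assumption about how a conflict might be handled.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlsplit


class S3Location(NamedTuple):
    """Hypothetical normalized form of an S3 file URI."""
    bucket: str
    key: str
    region: Optional[str]


def parse_s3_uri(uri: str) -> S3Location:
    """Accept the region in the query (BR style) or in the scheme (Aurora style)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()

    if scheme == "s3":
        scheme_region = None
    elif scheme.startswith("s3-") and len(scheme) > len("s3-"):
        # Aurora style: s3-us-west-2://bucket/key
        scheme_region = scheme[len("s3-"):]
    else:
        raise ValueError(f"not an S3 URI: {uri!r}")

    # BR style: s3://bucket/key?region=us-west-2
    query_region = parse_qs(parts.query).get("region", [None])[-1]

    # Assumption: two different regions in one URI is treated as an error.
    if scheme_region and query_region and scheme_region != query_region:
        raise ValueError(f"conflicting regions in {uri!r}")

    return S3Location(
        bucket=parts.netloc,
        key=parts.path.lstrip("/"),
        region=scheme_region or query_region,
    )
```

With this sketch, `parse_s3_uri("s3://bucket-name/data.txt?region=us-west-2")` and `parse_s3_uri("s3-us-west-2://bucket-name/data.txt")` yield the same `S3Location`, so the rest of the statement handling would not need to know which form the user wrote.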

@SunRunAway SunRunAway added the sig/execution SIG execution label Oct 23, 2020
@zz-jason
Member

@IANTHEREAL PTAL

@yufan022
Contributor

Hi, I'm happy to work on this issue.

@lance6716
Contributor

LOAD DATA implements this feature in #40499. Please note that some of the syntax was reverted in commits later than #40499.
