Skip to content

Minor: allow creating infinite external tables via SQL (for testing) #6235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 4, 2023

Which issue does this PR close?

Part of porting tests to sqllogic test #6234

Rationale for this change

Some tests in window.rs programmatically create "infinite" external tables. I want to port them to SQL and thus I need to be able to do the same thing programatically

What changes are included in this PR?

Add support for ('infinite_source', 'true1') in CREATE EXTERNAL TABLE:

CREATE external table t ...
OPTIONS('infinite_source' 'true')
LOCATION 'tests/data/empty.csv';

Add a infinite_source=true in the explain plan (which is super easy after #6202 from @tz70s ❤️ )

logical_plan TableScan: t projection=[c1]
physical_plan CsvExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/core/tests/data/empty.csv]]}, projection=[c1], infinite_source=true, has_header=true

Are these changes tested?

yes

Are there any user-facing changes?

I intend this to be used for testing mostly, so I didn't add any entry to the user SQL guide

@alamb alamb marked this pull request as ready for review May 4, 2023 18:23
@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels May 4, 2023
@alamb alamb requested a review from mustafasrepo May 4, 2023 20:28
@alamb
Copy link
Contributor Author

alamb commented May 4, 2023

@mustafasrepo or @ozankabak do you have time to review this small PR? I want to use this feature to simplify tests for infinite stream datasources (that can't be created via sql at this time) including tests in window.rs

Copy link
Contributor

@ozankabak ozankabak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM code-wise.

BTW, we intended to do this pretty soon -- we were thinking about naming; i.e. thinking about whether it is better UX to say OPTIONS('infinite_source' 'true'), or OPTIONS('infinite' 'true'), or OPTIONS('unbounded' 'true'). Any ideas on this?

@mustafasrepo
Copy link
Contributor

LGTM!. Thanks for this PR @alamb.

@alamb
Copy link
Contributor Author

alamb commented May 5, 2023

BTW, we intended to do this pretty soon -- we were thinking about naming; i.e. thinking about whether it is better UX to say OPTIONS('infinite_source' 'true'), or OPTIONS('infinite' 'true'), or OPTIONS('unbounded' 'true'). Any ideas on this?

I agree infinite_source is not a good name (I chose it to mirror the field in the code). It is also not documented.

In terms of a better UX, I think the ideal UX would be to have some explicit syntax (rather than string name/value pairs). Something like

CREATE external table t ...
INFINITE
LOCATION 'tests/data/empty.csv';

If you agree, perhaps we can merger this PR as is (as a testing tool) and I can file a follow on ticket to make the syntax better.

@alamb alamb changed the title Minor: allow creating infinite external tables via SQL Minor: allow creating infinite external tables via SQL (for testing) May 5, 2023
@alamb
Copy link
Contributor Author

alamb commented May 5, 2023

I filed #6251 to track making a nicer syntax -- I would like to merge this PR to unblock some test updates.

@alamb alamb merged commit 7ef4de5 into apache:main May 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants