LogicalPlan::TableScan should not depend on the physical plan #2247

Closed
@andygrove

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The logical plan currently depends on the physical plan. This prevents us from moving the logical plan into its own crate alongside the logical expressions. That move is needed to support subqueries, and to better support the use case where DataFusion is used only for SQL query planning, with a different engine executing the query.

The logical plan depends on the physical plan because LogicalPlan::TableScan holds a reference to a TableProvider, whose scan method returns an ExecutionPlan.

Describe the solution you'd like
I think the cleanest solution is going to be to have TableScan refer to a TableProvider by name rather than containing a reference to the TableProvider directly. The LogicalPlanBuilder would then need to maintain some state and that would need to get passed along to the physical planner somehow.
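
To make the proposal concrete, here is a minimal sketch of a name-based TableScan. It is not DataFusion's actual API: `ExecutionPlan`, `TableProvider`, and `TableScan` are simplified stand-ins (the real `scan` has a richer signature), and `TableRegistry`, `create_physical_plan`, and `CsvTable` are hypothetical names invented for illustration. The point is only that the logical node carries a plain string, and the name is resolved to a provider at physical planning time.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// ---- "physical" layer (simplified stand-ins, not the real DataFusion types) ----

/// Stand-in for a physical plan node.
#[derive(Debug, PartialEq)]
pub struct ExecutionPlan {
    pub description: String,
}

/// Stand-in for the provider trait. It belongs to the physical layer
/// because `scan` returns an `ExecutionPlan`.
pub trait TableProvider {
    fn scan(&self, projection: Option<&[usize]>) -> ExecutionPlan;
}

/// Hypothetical registry: the state that is built up during logical
/// planning and handed to the physical planner.
#[derive(Default)]
pub struct TableRegistry {
    tables: HashMap<String, Arc<dyn TableProvider>>,
}

impl TableRegistry {
    pub fn register(&mut self, name: &str, provider: Arc<dyn TableProvider>) {
        self.tables.insert(name.to_string(), provider);
    }
}

// ---- "logical" layer: no reference to TableProvider or ExecutionPlan ----

/// The logical node refers to the table by name only.
#[derive(Debug, Clone, PartialEq)]
pub struct TableScan {
    pub table_name: String,
    pub projection: Option<Vec<usize>>,
}

// ---- physical planning: resolve the name, then call `scan` ----

/// Hypothetical planner entry point for a single TableScan node.
pub fn create_physical_plan(
    scan: &TableScan,
    registry: &TableRegistry,
) -> Result<ExecutionPlan, String> {
    let provider = registry
        .tables
        .get(&scan.table_name)
        .ok_or_else(|| format!("table '{}' is not registered", scan.table_name))?;
    Ok(provider.scan(scan.projection.as_deref()))
}

/// Toy provider used only to exercise the sketch.
pub struct CsvTable {
    pub path: String,
}

impl TableProvider for CsvTable {
    fn scan(&self, projection: Option<&[usize]>) -> ExecutionPlan {
        ExecutionPlan {
            description: format!("CsvExec: path={}, projection={:?}", self.path, projection),
        }
    }
}
```

With this shape, everything in the logical layer could live in a crate that never imports the physical one. The cost is that a lookup can now fail at physical planning time if the name was never registered.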

Describe alternatives you've considered
Here is the original solution I was pursuing. It did not work out.

Split the TableProvider trait into two separate traits - TableProvider and TableScanProvider (which can extend TableProvider). We have no need to call the scan method when building or optimizing the logical plan.

We would need to add extra information, such as table_scan_provider_name (or maybe listing config?), to the TableScan struct so that the physical planner can look up or create the TableScanProvider.
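
For reference, the trait split described in this alternative might look roughly like the sketch below. All types are simplified stand-ins rather than DataFusion's real definitions: `schema` returns column names instead of a real schema type, `scan` takes no arguments, and `build_table_scan`, `plan_table_scan`, and `MemTable` are hypothetical helpers added only to show where each trait would be used.

```rust
use std::collections::HashMap;

/// Logical layer: metadata only, no `scan` method.
pub trait TableProvider {
    /// Column names, standing in for a real schema type.
    fn schema(&self) -> Vec<String>;
}

/// Stand-in for a physical plan node.
#[derive(Debug, PartialEq)]
pub struct ExecutionPlan {
    pub description: String,
}

/// Physical layer: extends TableProvider with the ability to scan.
pub trait TableScanProvider: TableProvider {
    fn scan(&self) -> ExecutionPlan;
}

/// Logical node carrying the extra lookup key mentioned above.
#[derive(Debug, Clone, PartialEq)]
pub struct TableScan {
    pub table_name: String,
    pub table_scan_provider_name: String,
    pub columns: Vec<String>,
}

/// Building the logical node needs only the metadata trait.
pub fn build_table_scan(
    table_name: &str,
    provider_name: &str,
    provider: &dyn TableProvider,
) -> TableScan {
    TableScan {
        table_name: table_name.to_string(),
        table_scan_provider_name: provider_name.to_string(),
        columns: provider.schema(),
    }
}

/// The physical planner looks up the scan-capable provider by name.
pub fn plan_table_scan(
    scan: &TableScan,
    providers: &HashMap<String, Box<dyn TableScanProvider>>,
) -> Option<ExecutionPlan> {
    providers
        .get(&scan.table_scan_provider_name)
        .map(|provider| provider.scan())
}

/// Toy provider implementing both traits.
pub struct MemTable {
    pub columns: Vec<String>,
}

impl TableProvider for MemTable {
    fn schema(&self) -> Vec<String> {
        self.columns.clone()
    }
}

impl TableScanProvider for MemTable {
    fn scan(&self) -> ExecutionPlan {
        ExecutionPlan {
            description: format!("MemoryExec: columns={:?}", self.columns),
        }
    }
}
```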

Additional context
This came up when attempting to add support for subqueries.

Labels: enhancement (New feature or request)