Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The logical plan currently depends on the physical plan, which prevents us from moving the logical plan into its own crate alongside the logical expressions. That move is needed in order to support subqueries, and it would also better support the use case where DataFusion is used just for SQL query planning, with a different execution engine running the query.
The reason that the logical plan depends on the physical plan is that LogicalPlan::TableScan has a reference to a TableProvider, which has a scan method that returns ExecutionPlan.
Describe the solution you'd like
I think the cleanest solution is going to be to have TableScan refer to a TableProvider by name rather than containing a reference to the TableProvider directly. The LogicalPlanBuilder would then need to maintain some state and that would need to get passed along to the physical planner somehow.
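A rough sketch of what the name-based approach could look like. This is not DataFusion's actual API: the traits below are simplified stand-ins (the real scan method takes more arguments), and TableRegistry, create_physical_plan, MemTable, and MemExec are hypothetical names used only for illustration. The registry plays the role of the state that would need to reach the physical planner.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for the physical-plan trait.
pub trait ExecutionPlan {
    fn describe(&self) -> String;
}

// Simplified stand-in for TableProvider; only the physical side calls scan.
pub trait TableProvider {
    fn scan(&self) -> Arc<dyn ExecutionPlan>;
}

// The logical plan holds only a name, so it has no physical-plan dependency.
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    TableScan { table_name: String },
}

// Hypothetical state mapping table names to providers.
#[derive(Default)]
pub struct TableRegistry {
    tables: HashMap<String, Arc<dyn TableProvider>>,
}

impl TableRegistry {
    pub fn register(&mut self, name: &str, provider: Arc<dyn TableProvider>) {
        self.tables.insert(name.to_string(), provider);
    }

    pub fn get(&self, name: &str) -> Option<Arc<dyn TableProvider>> {
        self.tables.get(name).cloned()
    }
}

// Hypothetical physical planner step: resolve the name, then call scan.
pub fn create_physical_plan(
    plan: &LogicalPlan,
    registry: &TableRegistry,
) -> Result<Arc<dyn ExecutionPlan>, String> {
    match plan {
        LogicalPlan::TableScan { table_name } => registry
            .get(table_name)
            .map(|provider| provider.scan())
            .ok_or_else(|| format!("table '{}' is not registered", table_name)),
    }
}

// Toy provider used to exercise the sketch.
pub struct MemExec;

impl ExecutionPlan for MemExec {
    fn describe(&self) -> String {
        "MemExec".to_string()
    }
}

pub struct MemTable;

impl TableProvider for MemTable {
    fn scan(&self) -> Arc<dyn ExecutionPlan> {
        Arc::new(MemExec)
    }
}
```

One consequence of this shape is that an unknown table name only surfaces as an error at physical planning time, unless the builder also validates names against the registry.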
Describe alternatives you've considered
Here is the original solution I was pursuing. It did not work out.
Split the TableProvider trait into two separate traits: TableProvider and TableScanProvider (which can extend TableProvider). We have no need to call the scan method when building or optimizing the logical plan. We will need to add extra information, such as table_scan_provider_name (or maybe listing config?), to the TableScan struct so that the physical planner can look up or create the TableScanProvider.
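For reference, the shape of this abandoned alternative might have looked roughly like the sketch below. It is only an illustration under simplifying assumptions: the schema is reduced to a list of column names, scan takes no arguments, and plan_scan, CsvTable, and CsvExec are hypothetical names that do not come from DataFusion.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified stand-in for the physical-plan trait.
pub trait ExecutionPlan {
    fn describe(&self) -> String;
}

// Logical side: metadata only, no reference to ExecutionPlan.
pub trait TableProvider {
    fn schema(&self) -> Vec<String>;
}

// Physical side: extends TableProvider with the scan method.
pub trait TableScanProvider: TableProvider {
    fn scan(&self) -> Arc<dyn ExecutionPlan>;
}

// TableScan carries the logical provider plus the extra lookup key.
pub struct TableScan {
    pub table_name: String,
    pub source: Arc<dyn TableProvider>,
    pub table_scan_provider_name: String,
}

// Hypothetical physical planner step: look up the scan provider by name.
pub fn plan_scan(
    scan: &TableScan,
    providers: &HashMap<String, Arc<dyn TableScanProvider>>,
) -> Option<Arc<dyn ExecutionPlan>> {
    providers
        .get(&scan.table_scan_provider_name)
        .map(|provider| provider.scan())
}

// Toy table that implements both traits.
pub struct CsvTable {
    pub path: String,
    pub columns: Vec<String>,
}

pub struct CsvExec {
    path: String,
}

impl ExecutionPlan for CsvExec {
    fn describe(&self) -> String {
        format!("CsvExec: {}", self.path)
    }
}

impl TableProvider for CsvTable {
    fn schema(&self) -> Vec<String> {
        self.columns.clone()
    }
}

impl TableScanProvider for CsvTable {
    fn scan(&self) -> Arc<dyn ExecutionPlan> {
        Arc::new(CsvExec { path: self.path.clone() })
    }
}
```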
Additional context
This came up when attempting to add support for subqueries.