Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move PhysicalPlanner to physical_planer module #6570

Merged
merged 4 commits into from
Jun 16, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jun 6, 2023

Which issue does this PR close?

Part of #1754 (see #1754 (comment))

Rationale for this change

I am trying to extract physical_plan into its own crate to improve modularity and compile times

I am also trying to make sure the organization of the DataFusion crate is easier to understand for beginners.

What changes are included in this PR?

Move PhysicalPlanner and related structures to physical_planner module

Are these changes tested?

Existing tests

Are there any user-facing changes?

Yes, documentation is improved (the planner is no longer buried in a sub module). Also, if users used the PhysicalPlanner directly they will need to update their use path

@alamb alamb added the api change Changes the API exposed to users of the crate label Jun 6, 2023
@alamb alamb marked this pull request as ready for review June 6, 2023 20:48
@github-actions github-actions bot added the core Core DataFusion crate label Jun 6, 2023
@alamb alamb changed the title Move PhysicalPlanner to physical_planer module Move PhysicalPlanner to physical_planer module Jun 6, 2023
@alamb alamb force-pushed the alamb/move_physical_planner branch from 3d73f4c to f004a97 Compare June 7, 2023 11:08
@alamb alamb requested review from andygrove and Ted-Jiang June 8, 2023 13:53
Copy link
Member

@Ted-Jiang Ted-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense to me 👍

btw: its own crate to improve modularity and compile times
How to improve the compile times ? increase the degree of parallelism 🤔

@alamb
Copy link
Contributor Author

alamb commented Jun 14, 2023

How to improve the compile times ? increase the degree of parallelism 🤔

Yes that is basically the hope/plan

@viirya
Copy link
Member

viirya commented Jun 15, 2023

I am trying to extract physical_plan into its own crate to improve modularity and compile times

So physical_planner will be into the same crate as physical_plan? Or a different crate?

@alamb
Copy link
Contributor Author

alamb commented Jun 15, 2023

So physical_planner will be into the same crate as physical_plan? Or a different crate?

I think initially it will be in the datafusion (datafusion/core) crate (where it is now).

We could potentially move it to its own crate in the future too (though given it is not that much code, I don't know how valuable that would be)

The issue is that the planner instantiates a bunch of things defined in datasource;

The key issue in extracting physical_plan from datafusion/core is that datasource is ALSO in datafusion/core and I can't easily take it out (b/c TableProvider depends directly on SessionState).

Since the physical planner depends on datasource it has to stay in datafusion/core too, at least initailly.

Once I have physical_plan and datasource detangled (so datasource depends on physical_plan but not the other way round) I can finally pull out physical_plan 😅 -- at least that is the hope

@alamb alamb merged commit b7cd7ba into apache:main Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants