Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: Move TableProviderFactories up out of RuntimeEnv and into SessionState #5477

Merged
merged 1 commit into from
Mar 7, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Mar 4, 2023

Which issue does this PR close?

Part of #1754

Rationale for this change

I am trying to extract the physical_plan code into its own crate; and to do so I need to remove the circular dependencies between core --> datasource --> execution --> datasource

Table factories are used for planning, not for execution, but RuntimeEnv is used for execution (and I am trying to move it into the datafusion_execution crate)

See more details in #1754 (comment)

What changes are included in this PR?

  1. Move table_factories off of RuntimeEnv and directly to SessionState

Are these changes tested?

Covered by existing tests (and the test changes illustrate what happened to the API). I believe this will affect Ballista (@avantgardnerio ) and delta-rs (cc @roeap )

Are there any user-facing changes?

Users who are specifying custom TableFactoryProviders have a slightly different API to register them.

@alamb alamb added the api change Changes the API exposed to users of the crate label Mar 4, 2023
@github-actions github-actions bot added the core Core DataFusion crate label Mar 4, 2023
@@ -278,6 +282,15 @@ impl SessionContext {
self.session_id.clone()
}

/// Return the [`TableFactoryProvider`] that is registered for the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Return the [`TableFactoryProvider`] that is registered for the
/// Return the [`TableProviderFactory`] that is registered for the


use crate::datasource::datasource::TableProviderFactory;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole point of the PR is to remove the datasource dependency from the runtime_env module (as I am trying to move the runtime_env module into a new crate, datafusion-execution, that does not depend on datasource)

I plan to move object_store registry in a separate PR to keep the reviews smaller

}

impl RuntimeConfig {
/// New with default values
pub fn new() -> Self {
let mut table_factories: HashMap<String, Arc<dyn TableProviderFactory>> =
HashMap::new();
table_factories.insert("PARQUET".into(), Arc::new(ListingTableFactory::new()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this construction now happens as part of creating SessionState

let env = RuntimeEnv::new(cfg).unwrap();
let ses = SessionConfig::new();
let ctx = SessionContext::with_config_rt(ses, Arc::new(env));
let mut state = SessionState::with_config_rt(ses, Arc::new(env));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the new pattern of how to add a table provider factory

@alamb alamb changed the title Minor: Move TableProviderFactories up out of RuntimeEnv and into Sess… Minor: Move TableProviderFactories up out of RuntimeEnv and into SessionState Mar 4, 2023
@avantgardnerio avantgardnerio self-requested a review March 6, 2023 18:56
Copy link
Contributor

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems okay...

@alamb
Copy link
Contributor Author

alamb commented Mar 7, 2023

Thank you for the review @avantgardnerio

@alamb alamb merged commit deeaa56 into apache:main Mar 7, 2023
@alamb alamb deleted the alamb/sever_runtime_env_dep branch October 18, 2024 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants