
Add downcast_to_source method for DataSourceExec #15416


Merged: 8 commits into apache:main on Mar 27, 2025

Conversation

xudong963
Member

@xudong963 xudong963 commented Mar 25, 2025

Rationale for this change

When I upgraded to DataFusion 46 (df46), I found it annoying to write a chain of downcast_ref calls like the one below. It is also easy to call the wrong method, for example by mixing up file_source() and data_source():

if let Some(scan_config) = self.data_source().as_any().downcast_ref::<FileScanConfig>() {
    if let Some(parquet_source) = scan_config
        .file_source()
        .as_any()
        .downcast_ref::<ParquetSource>()
    {
        // ...
    }
}

What changes are included in this PR?

Add a downcast_to_source method to DataSourceExec that performs both downcasts in a single call.
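
To illustrate the idea, here is a minimal sketch of how such a helper could collapse the two downcasts into one call. Every type below is a simplified, hypothetical stand-in written only so the example compiles without the DataFusion crate; the real traits and structs carry many more methods and fields. The method name follows the `downcast_to_file_source` spelling used later in this thread, and the `Option<(&FileScanConfig, &T)>` return type is an assumption made for the sketch rather than a quote from the PR diff.

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical stand-ins for the DataFusion traits (not the real definitions).
trait FileSource {
    fn as_any(&self) -> &dyn Any;
}

trait DataSource {
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical file sources; `batch_size` exists only to have something to read.
struct ParquetSource {
    batch_size: usize,
}

struct CsvSource;

impl FileSource for ParquetSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl FileSource for CsvSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// A data source that scans files and wraps a concrete file source.
struct FileScanConfig {
    file_source: Arc<dyn FileSource>,
}

impl FileScanConfig {
    fn file_source(&self) -> &Arc<dyn FileSource> {
        &self.file_source
    }
}

impl DataSource for FileScanConfig {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// A data source that does not scan files at all.
struct MemorySource;

impl DataSource for MemorySource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct DataSourceExec {
    data_source: Arc<dyn DataSource>,
}

impl DataSourceExec {
    fn data_source(&self) -> &Arc<dyn DataSource> {
        &self.data_source
    }

    // Assumed signature: returns `None` if the data source is not a
    // `FileScanConfig`, or if its file source is not of type `T`.
    fn downcast_to_file_source<T: 'static>(&self) -> Option<(&FileScanConfig, &T)> {
        let config = self
            .data_source()
            .as_any()
            .downcast_ref::<FileScanConfig>()?;
        let source = config.file_source().as_any().downcast_ref::<T>()?;
        Some((config, source))
    }
}
```

With a helper of this shape, the nested `if let` from the rationale becomes a single `if let Some((config, parquet_source)) = exec.downcast_to_file_source::<ParquetSource>()`, and there is no longer a chance of calling `file_source()` and `data_source()` in the wrong order.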

Are these changes tested?

Yes. I replaced the existing code with the new method.

Are there any user-facing changes?

Yes. The new method should be useful for users upgrading to df46.

@github-actions github-actions bot added the core (Core DataFusion crate), substrait (Changes to the substrait crate), proto (Related to proto crate), and datasource (Changes to the datasource crate) labels Mar 25, 2025
@github-actions github-actions bot added the documentation (Improvements or additions to documentation) label Mar 26, 2025
@xudong963
Member Author

I added the change to the upgrading doc to let users find it easily: 4682fcd

Contributor

@alamb alamb left a comment

Thanks @xudong963 -- I agree this is a nice improvement. I don't think we should mention this function in the 46 upgrade guide, given that the function isn't available until 47.

@@ -230,4 +231,18 @@ impl DataSourceExec {
Boundedness::Bounded,
)
}

/// Downcast the `DataSourceExec` to a specific file source
Contributor

Suggested change
/// Downcast the `DataSourceExec` to a specific file source
/// Downcast the `DataSourceExec`'s `data_source` to a specific file source
///
/// Returns `None` if
/// 1. the datasource is not scanning files (`FileScanConfig`)
/// 2. The [`FileScanConfig::file_source`] is not of type <T>

@@ -129,6 +129,20 @@ if let Some(datasource_exec) = plan.as_any().downcast_ref::<DataSourceExec>() {
# */
```

There's also a more convenient helper method `downcast_to_file_source` on `DataSourceExec`
Contributor

This code will not be available until DataFusion 47, so we probably need to put this under a new heading for the 47 upgrade guide.

Member Author

Yes, I totally forgot that 🤦‍♂️

@github-actions github-actions bot removed the documentation (Improvements or additions to documentation) label Mar 27, 2025
@xudong963
Member Author

xudong963 commented Mar 27, 2025

@mertak-synnada @alamb Thanks for your review!

@xudong963 xudong963 merged commit fdb4e84 into apache:main Mar 27, 2025
27 checks passed
qstommyshu pushed a commit to qstommyshu/datafusion that referenced this pull request Mar 28, 2025
* Add downcast_to_source method for DataSourceExec

* rename

* fix conflicts

* fix cippy

* add the change to upgrading doc

* prettier

* remove

* address comments
@alamb alamb mentioned this pull request Apr 14, 2025
9 tasks
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
Labels
core Core DataFusion crate datasource Changes to the datasource crate proto Related to proto crate substrait Changes to the substrait crate
3 participants