Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move make_date, to_char to datafusion-functions #9601

Merged
merged 6 commits into from
Mar 14, 2024

Conversation

Omega359
Copy link
Contributor

Which issue does this PR close?

Closes #9538

Rationale for this change

Function migration to new crate

What changes are included in this PR?

Code, tests

Are these changes tested?

Yes

Are there any user-facing changes?

No.

# Conflicts:
#	datafusion/expr/src/built_in_function.rs
#	datafusion/expr/src/expr_fn.rs
#	datafusion/functions/src/datetime/mod.rs
#	datafusion/physical-expr/src/datetime_expressions.rs
#	datafusion/physical-expr/src/functions.rs
#	datafusion/proto/src/logical_plan/from_proto.rs
#	datafusion/proto/src/logical_plan/to_proto.rs
@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Mar 13, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Omega359 -- I have only one question related to the benchmark, otherwise this looks great

This completes the migration of date/time functions, right?

@@ -42,8 +38,8 @@ fn random_date_in_range(
start_date + TimeDelta::try_days(random_days).unwrap()
}

fn data(rng: &mut ThreadRng) -> Date32Array {
let mut data: Vec<i32> = vec![];
fn data(rng: &mut ThreadRng) -> Vec<Expr> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this change the meaning of the benchmark from testing to_char on a single array to testing to_char on a single row with 1000's of arguments?

Like to_date(v1, v2, v3, .....)

instead of to_date(arr) ?

Is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmark functions iterate on the data and patterns so that shouldn't be the case.

        let data = data(&mut rng);
        let patterns = patterns(&mut rng);

        b.iter(|| {
            data.iter().enumerate().for_each(|(idx, i)| {
                black_box(to_char(i.clone(), patterns.get(idx).unwrap().clone()));
            })
        })

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but then the benchmark is simply testing how fast the to_char expr_fn function is creating a Expr::Function call (but not actually evaluting the expression)

data.iter().enumerate().for_each(|(idx, i)| {
   black_box(to_char(i.clone(), patterns.get(idx).unwrap().clone())); 
})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numbers looked amazing though :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

use datafusion_expr::ColumnarValue;
use datafusion_physical_expr::datetime_expressions::to_char;
use datafusion_expr::{lit, Expr};
use datafusion_functions::expr_fn::to_char;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the wrong function to be benchmarking -- it should be the implementation (not the expr_fn)

# Conflicts:
#	datafusion/expr/src/built_in_function.rs
#	datafusion/physical-expr/src/lib.rs
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Omega359 🙏

@alamb alamb merged commit 205777b into apache:main Mar 14, 2024
25 checks passed
@Omega359 Omega359 deleted the feature/9538 branch March 14, 2024 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move make_date, to_char to datafusion-functions
2 participants