-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Description
Is your feature request related to a problem or challenge?
clippy::needless_pass_by_value is a non-default clippy lint rule, however this rule is quite useful for preventing certain unnecessary clones.
See more about supporting more non-default Clippy rules for datafusion in #18467
The initial PR to enforce this rule in datafusion-common crate: #18468
Task: Enforce needless_pass_by_value rule to all crates
TODO: list the packages / find some suitable packages to start with
- Enforce lint rule
clippy::needless_pass_by_valuetodatafusion-expr#18504 - Enforce lint rule
clippy::needless_pass_by_valuetodatafusion-optimizer#18505
Rationale
clippy::needless_pass_by_value's description can be found in https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_value. It is non-default lint rule probably it's too annoying for the general cases. Specifically, it has the Clippy category pedantic, and its description is lints which are rather strict or have occasional false positives` from https://doc.rust-lang.org/nightly/clippy
However, the violation of this rule might further introduce unnecessary clones, and datafusion has been suffering from the performance issue caused by redundant clones for quite some time. (TODO: maybe we can find some issue links here)
Example: violation of needless_pass_by_value that leads to an unnecessary clone
#[allow(clippy::needless_pass_by_value)]
fn print_len(v: Vec<i32>) {
println!("{}", v.len());
}
fn main() {
let data = vec![1, 2, 3, 4];
// We need to CLONE because `print_len` takes ownership.
print_len(data.clone());
println!("Still need data: {:?}", data);
}If the linter can catch such issues, there will be no unnecessary copies
fn print_len(v: &[i32]) {
println!("{}", v.len());
}
fn main() {
let data = vec![1, 2, 3, 4];
print_len(&data); // no clone, just borrow
println!("Still need data: {:?}", data);
}Describe the solution you'd like
Initial/example PR: #18468
- For a certain package, add this extra lint rule in
Cargo.toml
[lints] # Overrides / extends workspace.lints
[lints.clippy]
# Avoid unnecessary clones for performance, okay to ignore for tests or intentional
# moves.
# Reference: <https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_value>
needless_pass_by_value = "deny"If a package is too large, we can carry it out module by module like
datafusion/datafusion/common/src/lib.rs
Lines 23 to 25 in 7b5685b
| // Make sure fast / cheap clones on Arc are explicit: | |
| // https://github.com/apache/datafusion/issues/11143 | |
| #![deny(clippy::clone_on_ref_ptr)] |
2. Fix violations if they might be on the performance critical path. For the following cases, it's okay to suppress lint violations using
#[expect(clippy::needless_pass_by_value)]:
- Tests. See CI: add
clippy::needless_pass_by_valuerule #18468 (comment) to turn it off for all tests within the package - Intentional moves
- Public APIs
- Clearly not performance critical and also very hard to fix
- Ensure
./dev/rust_lint.shpasses
Describe alternatives you've considered
No response
Additional context
The individual tasks are quite AI-friendly, but make sure you understand the rationale and do a self-review before sending a PR; otherwise, the code review process can become painful.
See datafusion AI PR policy for details: https://datafusion.apache.org/contributor-guide/index.html#ai-assisted-contributions