Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new lint: manual_contains #13817

Merged
merged 2 commits into from
Feb 15, 2025
Merged

Conversation

lapla-cogito
Copy link
Contributor

@lapla-cogito lapla-cogito commented Dec 12, 2024

close #13353

Using contains() for slices are more efficient than using iter().any().

changelog: [manual_contains]: new lint

@rustbot
Copy link
Collaborator

rustbot commented Dec 12, 2024

r? @Manishearth

rustbot has assigned @Manishearth.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label Dec 12, 2024
@lapla-cogito lapla-cogito force-pushed the contains_for_u8i8 branch 2 times, most recently from acb27cb to 8eb9d35 Compare December 12, 2024 01:30
@lapla-cogito
Copy link
Contributor Author

lapla-cogito commented Dec 12, 2024

On second thought, I thought it would be more appropriate to place the lint under the methods directory. Therefore, I am sorry, but I'm going to convert to draft this PR and modify it. now fixed

@lapla-cogito lapla-cogito marked this pull request as draft December 12, 2024 02:35
@lapla-cogito lapla-cogito marked this pull request as ready for review December 12, 2024 02:59
/// Checks for usage of `iter().any()` on slices of `u8` or `i8` and suggests using `contains()` instead.
///
/// ### Why is this bad?
/// `iter().any()` on slices of `u8` or `i8` is optimized to use `memchr`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you mean .contains()? Anyway, I think this is too specific, on my machine it seems to use compiler intrinsics and make no calls to memchr.

You're right about performances though: I benchmarked both .iter().any(==) and .contains() and the latter runs more than 10 times faster on large areas.

Copy link
Contributor Author

@lapla-cogito lapla-cogito Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I had assumed from the implementation that memchr would be used in any environment (including my local environment). Also, this godbolt does so as far as the assembly is concerned: https://rust.godbolt.org/z/Kxrzr81Me

However, if it may differ depending on the environment, the description should be indeed modified, so I'll make a change. Could you please tell me for reference, what's the environment you have checked that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the memchr itself has been replaced by an intrinsic. I'm using the latest nightly compiler on x86_64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! In any case, it looks like this lint description should be modified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the memchr() exists for larger integer types too, yes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, nope, it doesn't. It could be extended, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that contains() is now faster for certain types (u16, u32, u64, i16, i32, i64, f32, f64, usize, isize) compared to before (see: rust-lang/rust#130991). Therefore, this performance lint should be extended to cover these types starting from Rust 1.84.0 and I'll make a change for this.

@lapla-cogito lapla-cogito force-pushed the contains_for_u8i8 branch 5 times, most recently from 2b9d742 to b3837dc Compare December 14, 2024 02:10
@lapla-cogito lapla-cogito changed the title new lint to use contains() instead of iter().any() for u8 and i8 slices new lints to detect inefficient iter().any() Dec 14, 2024
Copy link
Member

@Manishearth Manishearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thing I'm unsure about is the mixing of two lints like this. I'm going to ask on Zulip if we should be doing two lints here.

https://rust-lang.zulipchat.com/#narrow/channel/257328-clippy/topic/iter.2Eany.28.29.20lint.3A.20one.20or.20two.20lints.3F

@Manishearth
Copy link
Member

I'm traveling so I'll wait for @samueltardieu to finish approving

@lapla-cogito lapla-cogito force-pushed the contains_for_u8i8 branch 2 times, most recently from 8a4f434 to e72699f Compare February 11, 2025 08:59
@lapla-cogito lapla-cogito changed the title new lint: slow_manual_contains new lint: manual_contains Feb 11, 2025
Copy link
Contributor

@samueltardieu samueltardieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks, approving on my side, provided you fix the latest nit on if/match collapse.

@samueltardieu
Copy link
Contributor

@Manishearth All good for me, hits from lintcheck are legit.

@samueltardieu
Copy link
Contributor

I'm not sure if that requires a FCP given that it has been discussed in Zulip already.

@lapla-cogito
Copy link
Contributor Author

Rebased.

@lapla-cogito
Copy link
Contributor Author

Rebased.

@Manishearth Manishearth added this pull request to the merge queue Feb 15, 2025
@Manishearth
Copy link
Member

Thanks for your patience!

Merged via the queue into rust-lang:master with commit 8cef0b6 Feb 15, 2025
11 checks passed
@nyurik
Copy link
Contributor

nyurik commented Feb 15, 2025

@lapla-cogito thanks for implementing! Two questions:

  • would it make sense to catch any non-number cases, i.e. there are a lot of contains() methods, some of which might be a good fit for this method? Technically the restriction should be "anything that implements PartialEq" rather than being a number
  • would it handle the macro-expansion variants ok (should there be a test for that?)

@lapla-cogito
Copy link
Contributor Author

@nyurik

  • This lint applies to general slices, meaning non-numeric slices are already covered (this might help clarify why it applies to general slices). There are indeed more cases that use .contains() as you mentioned, and we may add further implementations if there are less efficient ways to write .contains().
  • I think so. However, there are currently no tests for such cases. I'll consider adding them if additional test cases seem appropriate. Thanks!

@nyurik
Copy link
Contributor

nyurik commented Feb 16, 2025

@lapla-cogito thx for the link! So if I read this right, the general agreement is that we should always promote .contains because it offers better performance guarantees (let alone it is far more idiomatic and easier to read than the iter.any combination). Wouldn't that mean we would want to offer this specific lint to replace any such cases, even if it is not just for the slice?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggest faster .contains() instead of .iter().any() for [u8] and [i8] slices
5 participants