-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rustdoc-search: count path edits with separate edit limit #119331
rustdoc-search: count path edits with separate edit limit #119331
Conversation
r? @jsha (rustbot has picked a reviewer for you, use r? to override) |
Some changes occurred in HTML/CSS/JS. cc @GuillaumeGomez, @jsha |
This comment has been minimized.
This comment has been minimized.
Since the two are counted separately elsewhere, they should get their own limits, too. The biggest problem with combining them is that paths are loosely checked by not requiring every component to match, which means that if they are short and matched loosely, they can easily find "drunk typist" matches that make no sense, like this old result: std::collections::btree_map::itermut matching slice::itermut maxEditDistance = ("slice::itermut".length) / 3 = 14 / 3 = 4 editDistance("std", "slice") = 4 editDistance("itermut", "itermut") = 0 4 + 0 <= 4 PASS Of course, `slice::itermut` should not match stuff from btreemap. `slice` should not match `std`. The new result counts them separately: maxPathEditDistance = "slice".length / 3 = 5 / 3 = 1 maxEditDistance = "itermut".length / 3 = 7 / 3 = 2 editDistance("std", "slice") = 4 4 <= 1 FAIL Effectively, this makes path queries less "typo-resistant". It's not zero, but it means `vec` won't match the `v1` prelude. Queries without parent paths are unchanged.
91f41b8
to
0ea58e2
Compare
Change look good to me. Can you run JS perf check too please? |
There isn't really a case "like std" right? |
I haven't included the standard library in this benchmark tool, because rigging it up would be complicated and not worth it. The complexity comes from it being developed alongside the compiler, which means if I want to do a comparison benchmark across a bootstrap compiler version bump, I can't build the same version of the standard library with two different versions of rustdoc (unstable features are unstable) but I also shouldn't try to compare two different versions of the standard library (not an apples-to-apples comparison of the search engine changes). The other reason is that the standard library isn't small enough to be a microbenchmark, or big enough to be a stress test.
Footnotes
|
That's a good point. Well, let's consider the bench results not changing as a green light for going forward with it then. Thanks! @bors r+ rollup |
…nce, r=GuillaumeGomez rustdoc-search: count path edits with separate edit limit Avoids strange-looking results like this one, where the path component seems to be ignored: ![image](https://github.com/rust-lang/rust/assets/1593513/f0ef077a-6e09-4d67-a29d-8cabc1495f66) Since the two are counted separately elsewhere, they should get their own limits, too. The biggest problem with combining them is that paths are loosely checked by not requiring every component to match, which means that if they are short and matched loosely, they can easily find "drunk typist" matches that make no sense, like this old result: std::collections::btree_map::itermut matching slice::itermut maxEditDistance = ("slice::itermut".length) / 3 = 14 / 3 = 4 editDistance("std", "slice") = 4 editDistance("itermut", "itermut") = 0 4 + 0 <= 4 PASS Of course, `slice::itermut` should not match stuff from btreemap. `slice` should not match `std`. The new result counts them separately: maxPathEditDistance = "slice".length / 3 = 5 / 3 = 1 maxEditDistance = "itermut".length / 3 = 7 / 3 = 2 editDistance("std", "slice") = 4 4 <= 1 FAIL Effectively, this makes path queries less "typo-resistant". It's not zero, but it means `vec` won't match the `v1` prelude. This commit also adds substring matching to paths. It's stricter than the substring matching in the main part, but loose enough that what I expect to match does. Queries without parent paths are unchanged.
💥 Test timed out |
…iaskrgr Rollup of 5 pull requests Successful merges: - rust-lang#119331 (rustdoc-search: count path edits with separate edit limit) - rust-lang#119359 (Simplify Parser::ident_or_error) - rust-lang#119376 (Add regression test for rust-lang#106630) - rust-lang#119379 (Update `parse_seq` doc) - rust-lang#119380 (Don't suggest writing a bodyless arm if the pattern can never be a never pattern) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#119331 - notriddle:notriddle/maxpatheditdistance, r=GuillaumeGomez rustdoc-search: count path edits with separate edit limit Avoids strange-looking results like this one, where the path component seems to be ignored: ![image](https://github.com/rust-lang/rust/assets/1593513/f0ef077a-6e09-4d67-a29d-8cabc1495f66) Since the two are counted separately elsewhere, they should get their own limits, too. The biggest problem with combining them is that paths are loosely checked by not requiring every component to match, which means that if they are short and matched loosely, they can easily find "drunk typist" matches that make no sense, like this old result: std::collections::btree_map::itermut matching slice::itermut maxEditDistance = ("slice::itermut".length) / 3 = 14 / 3 = 4 editDistance("std", "slice") = 4 editDistance("itermut", "itermut") = 0 4 + 0 <= 4 PASS Of course, `slice::itermut` should not match stuff from btreemap. `slice` should not match `std`. The new result counts them separately: maxPathEditDistance = "slice".length / 3 = 5 / 3 = 1 maxEditDistance = "itermut".length / 3 = 7 / 3 = 2 editDistance("std", "slice") = 4 4 <= 1 FAIL Effectively, this makes path queries less "typo-resistant". It's not zero, but it means `vec` won't match the `v1` prelude. This commit also adds substring matching to paths. It's stricter than the substring matching in the main part, but loose enough that what I expect to match does. Queries without parent paths are unchanged.
Avoids strange-looking results like this one, where the path component seems to be ignored:
Since the two are counted separately elsewhere, they should get their own limits, too. The biggest problem with combining them is that paths are loosely checked by not requiring every component to match, which means that if they are short and matched loosely, they can easily find "drunk typist" matches that make no sense, like this old result:
Of course,
slice::itermut
should not match stuff from btreemap.slice
should not matchstd
.The new result counts them separately:
Effectively, this makes path queries less "typo-resistant". It's not zero, but it means
vec
won't match thev1
prelude.This commit also adds substring matching to paths. It's stricter than the substring matching in the main part, but loose enough that what I expect to match does.
Queries without parent paths are unchanged.