Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dogfood str_split_once() #79809

Merged
merged 8 commits into from
Dec 11, 2020
Merged

Dogfood str_split_once() #79809

merged 8 commits into from
Dec 11, 2020

Conversation

Eric-Arellano
Copy link
Contributor

Part of #74773.

Beyond increased clarity, this fixes some instances of a common confusion with how splitn(2) behaves: the first element will always be Some(), regardless of the delimiter, and even if the value is empty.

Given this code:

fn main() {
    let val = "...";
    let mut iter = val.splitn(2, '=');
    println!("Input: {:?}, first: {:?}, second: {:?}", val, iter.next(), iter.next());
}

We get:

Input: "no_delimiter", first: Some("no_delimiter"), second: None
Input: "k=v", first: Some("k"), second: Some("v")
Input: "=", first: Some(""), second: Some("")

Using str_split_once() makes more clear what happens when the delimiter is not found.

@rust-highfive
Copy link
Collaborator

Some changes occurred in intra-doc-links.

cc @jyn514

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @lcnr (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 7, 2020
@@ -148,23 +148,19 @@ impl DebugOptions {

if let Ok(env_debug_options) = std::env::var(RUSTC_COVERAGE_DEBUG_OPTIONS) {
for setting_str in env_debug_options.replace(" ", "").replace("-", "_").split(',') {
let mut setting = setting_str.splitn(2, '=');
match setting.next() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original match was considering more than necessary. setting.next() was always Some(). Only the value could be None.

I kept value modeled as an Option to avoid needing to change the related code like bool_option_val.

let mut parts = arg.splitn(2, '=');
let name = parts
.next()
.unwrap_or_else(|| early_error(error_format, "--extern value must not be empty"));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case could never happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Niiiice!

let (options, name) = match (first_part, second_part) {
(Some(opts), Some(name)) => (Some(opts), name),
(Some(name), None) => (None, name),
(None, None) => early_error(error_format, "--extern name must not be empty"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case could never happen, the first value is always Some.

Comment on lines +108 to +113
let (warn_str, critical_str) = durations_str.split_once(',').unwrap_or_else(|| {
panic!(
"Duration variable {} expected to have 2 numbers separated by comma, but got {}",
env_var_name, durations_str
)
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this inverts the order of validation: now we first validate that the split works, and only then validate that they can be parsed as numbers.

Some((url, fragment)) => (url, Some(fragment)),
};
// NB: the `splitn` always succeeds, even if the delimiter is not present.
let url = url.splitn(2, '?').next().unwrap();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we discard the second element, it's simpler to still use splitn().

src/tools/tidy/src/error_codes_check.rs Outdated Show resolved Hide resolved
@jyn514 jyn514 added the C-cleanup Category: PRs that clean code up or issues documenting cleanup. label Dec 7, 2020
@jyn514 jyn514 added T-libs Relevant to the library team, which will review and decide on the PR/issue. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 7, 2020
* Add assertion value is defined.
* Simplify comment.
* Fix bad change in err message.
compiler/rustc_mir/src/transform/coverage/debug.rs Outdated Show resolved Hide resolved
let mut parts = arg.splitn(2, '=');
let name = parts
.next()
.unwrap_or_else(|| early_error(error_format, "--extern value must not be empty"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Niiiice!

one of dylib, framework, or static",
s
),
);
Copy link
Member

@matklad matklad Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just writing my reasoning down for posterity:

The previous version is more compact, so I am almost inclined to say that it was better. However, the new version doesn't have .unwrap(), and, given a couple of instances where have impossible cases in other parts of this PR, I think a tad more verbosity is a fine price to pay for making illegal states unrepresentable here.

That said, would flattening the two matches into one work better here?

match s.split_once('=') {
  Some(("dylib", name)) => ()
}

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack that this is less compact, but I refactored to use the nested match because it reduces duplication and imo makes this code easier to grok, albeit longer. We no longer have to put name in multiple arms of the match, like we had previously:

match s.split_once('=') {
  Some(("dylib", name)) => (),
  Some(("framework", name)) => (),
  ...
}

Instead, what do you think about adding impl TryFrom<String> for NativeLibKind? I'm happy to add that (although I might recommend saving that for a followup to keep this PR more focused).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, what do you think about adding impl TryFrom for NativeLibKind?

Not TryFrom, but impl FromStr for NativeLibKind would work nicely. But I think the added benefit here is marginal, if any (there's drawback of more code churn), so I suggest leaving the code as is then!

@@ -707,11 +704,9 @@ fn parse_extern_html_roots(
) -> Result<BTreeMap<String, String>, &'static str> {
let mut externs = BTreeMap::new();
for arg in &matches.opt_strs("extern-html-root-url") {
let mut parts = arg.splitn(2, '=');
let name = parts.next().ok_or("--extern-html-root-url must not be empty")?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -435,8 +435,10 @@ impl<'a, 'tcx> LinkCollector<'a, 'tcx> {

// Try looking for methods and associated items.
let mut split = path_str.rsplitn(2, "::");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use rsplit_once here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to, but had a hard time getting this code working properly. My WIP ended up being more confusing imo, so I gave up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah everything here is a mess. I tried to clean it up in #76467 but it ended up not working and getting stuck for 3 months :( so I reverted it.

.next()
.unwrap();
let testname =
file_path.file_name().unwrap().to_str().unwrap().split_once('.').unwrap().0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is just file_path.file_stem() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it now, and it breaks things. From lines 12 - 17, there are some files with multiple .s in the name, like $testname.$revision.$mode.stderr, and we need to strip those. I'll add a comment.

src/tools/tidy/src/error_codes_check.rs Outdated Show resolved Hide resolved
@lcnr
Copy link
Contributor

lcnr commented Dec 8, 2020

r=me

@matklad just wrote some review comments so I will leave the final approval to them
r? @matklad

@rust-highfive rust-highfive assigned matklad and unassigned lcnr Dec 8, 2020
* Use a match statement.
* Clarify why we can't use `file_stem()`.
* Error if the `:` is missing for Tidy error codes, rather than no-oping.
@matklad
Copy link
Member

matklad commented Dec 10, 2020

LGTM now!

@bors r+

@matklad
Copy link
Member

matklad commented Dec 10, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Dec 10, 2020

📌 Commit 989edf4 has been approved by matklad

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 10, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Dec 10, 2020
Dogfood `str_split_once()`

Part of rust-lang#74773.

Beyond increased clarity, this fixes some instances of a common confusion with how `splitn(2)` behaves: the first element will always be `Some()`, regardless of the delimiter, and even if the value is empty.

Given this code:

```rust
fn main() {
    let val = "...";
    let mut iter = val.splitn(2, '=');
    println!("Input: {:?}, first: {:?}, second: {:?}", val, iter.next(), iter.next());
}
```

We get:

```
Input: "no_delimiter", first: Some("no_delimiter"), second: None
Input: "k=v", first: Some("k"), second: Some("v")
Input: "=", first: Some(""), second: Some("")
```

Using `str_split_once()` makes more clear what happens when the delimiter is not found.
@bors
Copy link
Contributor

bors commented Dec 11, 2020

⌛ Testing commit 989edf4 with merge 7414a221dbcad5b0f38a58956eab1374e5f3296f...

@tmandry
Copy link
Member

tmandry commented Dec 11, 2020

Rolled up
@bors retry

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 11, 2020
Rollup of 11 pull requests

Successful merges:

 - rust-lang#77027 (Improve documentation for `std::{f32,f64}::mul_add`)
 - rust-lang#79375 (Make the kernel_copy tests more robust/concurrent.)
 - rust-lang#79639 (Add long explanation for E0212)
 - rust-lang#79698 (Add tracking issue template for library features.)
 - rust-lang#79809 (Dogfood `str_split_once()`)
 - rust-lang#79851 (Clarify the 'default is only allowed on...' error)
 - rust-lang#79858 (Update const-fn doc in unstable-book)
 - rust-lang#79860 (Clarify that String::split_at takes a byte index.)
 - rust-lang#79871 (Fix small typo in `wrapping_shl` documentation)
 - rust-lang#79896 (Make search results tab and help button focusable with keyboard)
 - rust-lang#79917 (Use Symbol for inline asm register class names)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors
Copy link
Contributor

bors commented Dec 11, 2020

⌛ Testing commit 989edf4 with merge 65d053a...

@bors bors merged commit 17ec4b8 into rust-lang:master Dec 11, 2020
@rustbot rustbot added this to the 1.50.0 milestone Dec 11, 2020
@matklad
Copy link
Member

matklad commented Dec 11, 2020

@Eric-Arellano do you want to file a stabilization PR as well (see the instructions in the tracking issue)?

I am not sure if we should let this bake more or not, but I think it's fine to raise the issue with the libs team already.

let (item_str, item_name) = split.next().map(|i| (i, Symbol::intern(i))).unwrap();
// NB: `split`'s first element is always defined, even if the delimiter was not present.
let item_str = split.next().unwrap();
assert!(!item_str.is_empty());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This caused an assertion failure on ::some_crate, I forgot :: means the root namespace. Sending a revert PR now.

https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/Perf.20failing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Category: PRs that clean code up or issues documenting cleanup. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants