Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add doc aliases for a lot of the C standard library #81988

Closed
wants to merge 1 commit into from

Conversation

jyn514
Copy link
Member

@jyn514 jyn514 commented Feb 11, 2021

I took the very scientific approach of "ask some friends what parts of
the standard library they use a lot". They suggested
https://www.tutorialspoint.com/c_standard_library/ and also the short
list strcpy, strncpy, memcpy, memcpy, printf, scanf, fprintf, sprintf, asprintf, sscanf, strtok, strntok, malloc, calloc, free, strdup, strndup, err, errx. Many of these already have aliases; those with
*n* have no analogy in rust; asprintf, err, and errx have no analogy in Rust.
I added most of the rest.

@jyn514 jyn514 added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools labels Feb 11, 2021
@rust-highfive
Copy link
Collaborator

r? @KodrAus

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 11, 2021
@@ -279,6 +279,7 @@ impl char {
/// '1'.is_digit(37);
/// ```
#[stable(feature = "rust1", since = "1.0.0")]
#[doc(alias = "isxdigit")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is moderately iffy because you have to pass a radix of 16, but it's close enough.

library/core/src/char/methods.rs Show resolved Hide resolved
Comment on lines 892 to +890
#[doc(alias = "delete")]
#[doc(alias = "free")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very surprised someone added delete but not free.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This alias does seem very reasonable to add.

Comment on lines +776 to +788
#[doc(alias = "atoi")]
#[doc(alias = "atol")]
#[doc(alias = "strtod")]
#[doc(alias = "strtol")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured these were all "close enough" since int and long don't have a fixed size.

@@ -1202,6 +1205,7 @@ impl str {
///
/// [`split_whitespace`]: str::split_whitespace
#[stable(feature = "rust1", since = "1.0.0")]
#[doc(alias = "strtok")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is iffy, but C doesn't really have a split function and I didn't know where else to put strtok.

@@ -873,6 +873,7 @@ impl Command {
/// assert!(output.status.success());
/// ```
#[stable(feature = "process", since = "1.0.0")]
#[doc(alias = "system")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also go under spawn(). In C system blocks but returns no value.

@camelid
Copy link
Member

camelid commented Feb 11, 2021

Do people really search the Rust standard library using C function names?

@jyn514
Copy link
Member Author

jyn514 commented Feb 11, 2021

@camelid there are existing aliases for malloc, calloc, memcpy, and memmove, among others.

@camelid
Copy link
Member

camelid commented Feb 11, 2021

Just because there are existing ones doesn't mean we should add more 🙃

@jyn514
Copy link
Member Author

jyn514 commented Feb 11, 2021

I think this would be useful if you're learning rust coming from C. format and drop in particular are hard to find if you don't know what they're called - I've seen several people looking for a "drop keyword" without realizing it was in the standard library. I agree the functions on str might not be pulling their weight, but I don't think they're useless either. These aliases are very low cost to add and basically zero to maintain.

@camelid
Copy link
Member

camelid commented Feb 11, 2021

format and drop in particular are hard to find if you don't know what they're called - I've seen several people looking for a "drop keyword" without realizing it was in the standard library.

I can understand that, but that's true of many things. For example, I imagine many people will at least start learning Rust from the Book and Rust by Example (those are some of the ways I started). But perhaps others will jump right in without looking at any language-level docs.

These aliases are very low cost to add and basically zero to maintain.

That's true, but I just feel worried that we'll end up littering them all over the place and then not be able to remove them.

@joshtriplett
Copy link
Member

asprintf does have an equivalent: format!

@ojeda
Copy link
Contributor

ojeda commented Feb 11, 2021

A few nitpicks:

  • asprintf, err and errx are not part of C (or even POSIX).
  • strdup and strndup will likely be part of C23, but currently are only POSIX.
  • strntok does not exist.
  • I wouldn't say "most of the C standard library" since there are many, many more functions there! :-) For instance, we are missing all the math.h aliases.

Related: if we are going to add aliases for several languages (C, Python, Java...), it would be great if the alias functionality could take a lang parameter (or source or from or something like that) to be able to categorize the aliases and tell where they come from. This would be good for documentation and could be used in the future to let the user filter/search/list by them, e.g.:

#[doc(alias = "sprintf", lang = "C, C++, POSIX")]

library/core/src/char/methods.rs Show resolved Hide resolved
Comment on lines +165 to +166
#[doc(alias = "null")]
#[doc(alias = "nil")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C does not have nil, and null should be NULL if it is referring to the library side of C.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can separate this into a separate PR if you like, but I think the aliases are useful.

null should be NULL if it is referring to the library side of C.

doc(alias) is case insensitive.

Copy link
Contributor

@ojeda ojeda Feb 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, writing the original name would be better, no? Otherwise we should change e.g. Map to map in the Java one, etc.

Copy link
Contributor

@ojeda ojeda Feb 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I guess you want to keep null for other languages that use that one -- again, this depends on how we are looking at this: whether we are trying to keep a mapping or just search terms for any language that matches. If the latter, I agree: null is better.

@@ -64,4 +64,6 @@ pub use u64;
#[stable(feature = "core_primitive", since = "1.43.0")]
pub use u8;
#[stable(feature = "core_primitive", since = "1.43.0")]
#[doc(alias = "size_t")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are adding these, then we should also add:

  • _Bool and bool for bool.
  • The exact-width integer types: int8_t, uint8_t, etc.
  • char32_t for char.
  • ssize_t for isize (POSIX, but common enough).
  • Perhaps float and double for f32 and f64 (not guaranteed, but likely).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_Bool and bool for bool.

I have the same question as in #81989 (comment) - what do you see as the point of doc(alias)? Both of those pull up bool as the first result in search already: https://doc.rust-lang.org/stable/std/?search=_Bool, https://doc.rust-lang.org/stable/std/?search=bool

The others look useful to add, thanks :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my answer to that comment.

You're welcome! :)

@jyn514 jyn514 changed the title Add doc aliases for most of the C standard library Add doc aliases for a lot of the C standard library Feb 11, 2021
@jyn514
Copy link
Member Author

jyn514 commented Feb 11, 2021

Related: if we are going to add aliases for several languages (C, Python, Java...), it would be great if the alias functionality could take a lang parameter (or source or from or something like that) to be able to categorize the aliases and tell where they come from.

Can you open an issue for this?

@jyn514
Copy link
Member Author

jyn514 commented Feb 11, 2021

strdup and strndup will likely be part of C23, but currently are only POSIX.

I didn't distinguish very much between "standard C" and "POSIX", I think they're both helpful. Happy to add a bunch more aliases from posix if you think they'd be useful :)

@ojeda
Copy link
Contributor

ojeda commented Feb 11, 2021

Related: if we are going to add aliases for several languages (C, Python, Java...), it would be great if the alias functionality could take a lang parameter (or source or from or something like that) to be able to categorize the aliases and tell where they come from.

Can you open an issue for this?

Of course -- done! #82000

@KodrAus
Copy link
Contributor

KodrAus commented Feb 21, 2021

At this stage I think I echo @camelid's concern about collecting an unbounded number of aliases that aren't necessarily all useful to have.

I think that applies more to the string aliases than the others, which do seem pretty useful. Maybe we should look at landing this without:

  • string.rs
  • char/methods.rs
  • num/mod.rs
  • str/mod.rs

aliases. What do you all think?

@bors

This comment has been minimized.

@crlf0710 crlf0710 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 12, 2021
@JohnCSimon
Copy link
Member

@jyn514 ping from triage - please address the merge conflicts, thanks.
@rustbot label: -S-waiting-on-review +S-waiting-on-author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 29, 2021
I took the very scientific approach of "ask some friends what parts of
the standard library they use a lot". They suggested
https://www.tutorialspoint.com/c_standard_library/ and also the short
list `strcpy, strncpy, memcpy, memcpy, printf, scanf, fprintf, sprintf,
asprintf, sscanf, strtok, strntok, malloc, calloc, free, strdup,
strndup, err, errx `. Many of these already have aliases; those with
`*n*` have no analogy in rust; `err` and `errx` have no analogy in Rust.
I added most of the rest.

This also includes aliases for standard types, like fixed-width
integers.
@jyn514 jyn514 removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 29, 2021
@jyn514 jyn514 added the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Mar 29, 2021
@jyn514
Copy link
Member Author

jyn514 commented Mar 29, 2021

I think that applies more to the string aliases than the others, which do seem pretty useful. Maybe we should look at landing this without: ... what do you think?

Sure, that seems good to me :)

@m-ou-se m-ou-se assigned m-ou-se and unassigned KodrAus Apr 14, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Apr 21, 2021

We briefly discussed this in the libs meeting, see: #81989 (comment)

Some more thoughts related to the kind of aliases in this PR:

The Rust standard library reference is probably not the place to explain uint16_t is called u16 in Rust, or that one can just .clone() a String instead of calling strdup. (But as said in the linked comment, we're open for ideas for a policy here.)

Aliases can provide no context or explanation, which might lead to confusion or might even come across as somewhat passive aggressive: E.g. someone looking for the raw deallocation api searching for free or delete would now end up with mem::drop, as if the search results are saying "no you don't want to manage your own memory, just drop things!". The explanation that in Rust we usually use types with a Drop implementation instead of manual malloc/free calls probably belongs in a place that explains that whole concept, not just in a doc alias.

Even for seemingly obvious aliases such as toupper it's not clear where it should go: If by entering toupper in the search bar the user meant "what's the equivalent of toupper() from C?", then to_ascii_uppercase is the right answer. But if the question was "how do I make things uppercase? 'toupper' or something?", then to_uppercase is the right answer.

Those of us who were in the meeting agreed that questions like "what's the closest Rust thing to $thing from $language?" are better answered by some kind of cheatsheet or blog post or other resources maintained by the community. We don't know enough about all languages to be able to maintain this as part of the standard library.

I personally think some kind of policy for doc aliases in the standard library could be something like "it's okay to add alias $x to $y if someone reasonable expects $y to be named $x in Rust". E.g. the popcnt alias on u32::count_ones. But someone looking for strdup is probably looking for libc::strdup and not unaware of the concept of clone()ing things.

But as I said in the linked comment, we'd first like to see some kind of write down of such a policy proposal and reach a consensus before accepting more doc alias PRs.

(I'll put it on my own to-do list to write a draft for it, but I'm probably not going to have time for that any time soon.)

@m-ou-se m-ou-se removed the S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). label Apr 21, 2021
@jyn514
Copy link
Member Author

jyn514 commented Apr 21, 2021

Aliases can provide no context or explanation, which might lead to confusion or might even come across as somewhat passive aggressive: E.g. someone looking for the raw deallocation api searching for free or delete would now end up with mem::drop, as if the search results are saying "no you don't want to manage your own memory, just drop things!". The explanation that in Rust we usually use types with a Drop implementation instead of manual malloc/free calls probably belongs in a place that explains that whole concept, not just in a doc alias.

It seems inconsistent to have an alias for delete but not for free. I don't think "just drop things" is substantially different from free - there's no direct equivalent of malloc, so if you really don't understand RAII it will become clear when you try to construct something on the heap and realize you have to use Box or Global::alloc when no one else is using them.

That said, I agree most of the changes here are not super useful.

@jyn514 jyn514 closed this Apr 21, 2021
@jyn514 jyn514 deleted the doc-aliases branch April 21, 2021 16:00
@ojeda
Copy link
Contributor

ojeda commented Apr 21, 2021

Aliases can provide no context or explanation, which might lead to confusion or might even come across as somewhat passive aggressive

I suggested somewhere else to have some extra data for each alias. This was in the context of explaining which language the alias came from (since e.g. same typenames have different meanings in different languages), but having a "reason" or "comment" field overall per alias seems like a good idea to solve the "no context" issue you mention (even if we don't end up with aliases for different languages).

@m-ou-se
Copy link
Member

m-ou-se commented Apr 21, 2021

It seems inconsistent to have an alias for delete but not for free.

Agreed. I think alias = "delete" should also not be there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.