Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Add descriptive names to doctests #3311

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

casey
Copy link

@casey casey commented Sep 4, 2022

I wasn't sure if this needed an RFC or not. It's a rather minor feature, however, it is a user-visible change, and since I've already written one, I thought I'd open a PR. Feel free to close it if it doesn't require an RFC.

I did some supporting work on this in rust-lang/rust#78429, which made code block info string parsing only split on , and whitespace, so that an attribute of the form name=foo would be possible, but never followed up with the actual RFC.

This RFC proposes the ability to annotate documentation test code blocks with metadata of the form name=NAME. The given name will be used to identify the documentation test when it is run by the test runner, in addition to the current information of test binary, module, and line number.

For example, this documention test:

```name=arithmetic
assert_eq!(2 + 2, 4);
```

Assuming that it is in crate foo, in file src/lib.rs, in module
bar, on line 37, will produce this output when run:

   Doc-tests foo

running 1 test
test src/lib.rs - bar::arithmetic (line 37) ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Rendered

This RFC proposes the ability to annotate documentation test code blocks
with metadata of the form `name=NAME`. The given name will be used to
identify the documentation test when it is run by the test runner, in
addition to the current information of test binary, module, and line
number.

For example, this documention test:

    ```name=arithmetic
    assert_eq!(2 + 2, 4);
    ```

Assuming that it is in crate `foo`, in file `src/lib.rs`, in module
`bar`, on line 37, will produce this output when run:

       Doc-tests foo

    running 1 test
    test src/lib.rs - bar::arithmetic (line 37) ... ok

    test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
@ehuss ehuss added the T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC. label Sep 5, 2022
text/0000-doctest-name.md Outdated Show resolved Hide resolved
text/0000-doctest-name.md Outdated Show resolved Hide resolved
Co-authored-by: Aleksey Kladov <aleksey.kladov@gmail.com>
text/0000-doctest-name.md Outdated Show resolved Hide resolved
text/0000-doctest-name.md Show resolved Hide resolved
- Mention test filtering
- Remove restriction that `NAME` be a valid identifier
- Rename IDENT to NAME
@casey
Copy link
Author

casey commented Sep 22, 2022

Updated and addressed comments. Thank you for the feedback!

  • Mentioned test filtering
  • Remove stray note that NAME be a valid identifier
  • Rename IDENT to NAME

A previous version of this RFC required that doctest names be valid rust identifiers. I think this might be a good idea, but someone wasn't convinced. I think it's probably a reasonable restriction, and means that you can use the identifiers as names for the generated test binaries/functions/etc, so it's something that I'd like feedback on. However, I wanted the text of the RFC to be consistent, so I removed references to that restriction for now.

@casey
Copy link
Author

casey commented Jan 25, 2023

Thanks for the feedback! I added the requirement that names be valid rust identifiers, and used spaces to separate attributes in the examples instead of commas.

@casey
Copy link
Author

casey commented Apr 30, 2024

Friendly ping!

@notriddle
Copy link
Contributor

This probably needs re-thought through to match up with rust-lang/rust#110800. I'll start FCP if that's done, since this change otherwise looks good.

@casey
Copy link
Author

casey commented Apr 30, 2024

cc @GuillaumeGomez

Are commas the preferred way (or only?) way to separate annotations on a code block? I used spaces originally, since they were accepted by the parser, but I'm not sure if that changed.

I think that's the only change here that needs to be made.

@GuillaumeGomez
Copy link
Member

Oh damn, completely forgot to stabilize this feature. Gonna send a PR for that.

Commas are the only character to separate annotations currently. To add your feature, I think you just need to check for name in here. Once done and ready, please write a comment here so we review it one last time. 👍

@casey
Copy link
Author

casey commented May 15, 2024

@GuillaumeGomez Nice! Okay, updated the RFC to use commas in the example about separating attribute names. I think this is ready to go.

@GuillaumeGomez
Copy link
Member

Considering rust-lang/rust#124577 is about to get stabilized, I'm not sure if name is good enough now (is name supposed to be an attribute or a doctest name? Can't know unless you know very well how it works).

Another thing to note is that you will be able to add attributes to generated codeblocks with this syntax: custom,{name=something}, meaning that in your case, name here would not name the doctest but instead be an attribute generated in the HTML. So better underline this case in the RFC.

In any case, I think renaming name into doctest_name would be better. What do you think?

@notriddle
Copy link
Contributor

notriddle commented May 15, 2024

Another thing to note is that you will be able to add attributes to generated codeblocks with this syntax: custom,{name=something}, meaning that in your case, name here would not name the doctest but instead be an attribute generated in the HTML. So better underline this case in the RFC.

The docs don't say that, and the code seems to warn on "unsupported attributes" without storing them anywhere, so I don't think that's true.

If that PR is supposed to add this feature, it'll need fixed and should probably be pointed out to everyone else.

(Also, I don't see a problem with name being an attribute and a doctest name at the same time. But that's not really relevant.)

In any case, I think renaming name into doctest_name would be better. What do you think?

I think that's a bad idea. The custom classes should be verbose, because it's a feature intended to cover a corner case. This feature, however, is something we want people to use all the time. It should be terse and simple.

@GuillaumeGomez
Copy link
Member

Good catch, you can see that key-values only only certain values here.

@casey
Copy link
Author

casey commented May 15, 2024

I personally think name is better, just because it's short, and the people who use this feature will likely name every doctest, so they'll be writing out the attribute a bunch of times. Another option would be test=NAME, which is also short.

@notriddle
Copy link
Contributor

So, if we're essentially on board with calling it name, I'll start the FCP on that one.

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented May 15, 2024

Team member @notriddle has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels May 15, 2024
@GuillaumeGomez
Copy link
Member

One thing to note is that the eBNF doesn't currently allow to have key=value outside of curly braces since we followed pandoc syntax. I suppose it's ok since @notriddle approved. ;)

@notriddle
Copy link
Contributor

Or put it inside the braces. I just didn’t catch that.

@GuillaumeGomez
Copy link
Member

Then name is a valid element attribute, so I'm not sure we should keep using name for this case as it would also modify the generated HTML documentation.

@fmease
Copy link
Member

fmease commented May 19, 2024

This is a very nice feature!

@rfcbot concern braced-unbraced

As mentioned above, we need to decide if we'd like extend the code block attr grammar to support bare name=IDENT or if we would like to use/extend the existing1 pandoc syntax {...,name=IDENT,...}.

@rfcbot concern raw-idents

I guess it's implied by the use of the word Rust identifier but I'd like to raise awareness that this includes raw identifiers (prefix: r#) which is good. I think we should explicitly mention this in the RFC text.

@rfcbot concern hard-errors

Multiple name=IDENT words may not appear in a single code block's info
string.

I'd like to see this clarified. I'm pretty sure we don't want to emit a hard error for duplicate names but merely a warn-by-default rustdoc lint for backward compatibility.

Edit: The same applies to other syntax errors (e.g., RHS not being a valid ident).

@rfcbot reviewed

Footnotes

  1. Currently feature-gated under custom_code_classes_in_docs

@casey
Copy link
Author

casey commented May 19, 2024

Thank you for the review!

@rfcbot concern braced-unbraced

As mentioned above, we need to decide if we'd like extend the code block attr grammar to support bare name=IDENT or if we would like to use/extend the existing[^1] pandoc syntax {...,name=IDENT,...}.

I personally would prefer the unbraced name=IDENT syntax. Correct me if I'm wrong, but the pandoc braced syntax is only used for adding classes to the generated HTML, and it's also rather verbose.

@rfcbot concern raw-idents

I guess it's implied by the use of the word Rust identifier but I'd like to raise awareness that this includes raw identifiers (prefix: r#) which is good. I think we should call this out explicitly mention this in the RFC text.

I noted that raw identifiers are allowed.

@rfcbot concern hard-errors

Multiple name=IDENT words may not appear in a single code block's info
string.

I'd like to see this clarified. I'm pretty sure we don't want to emit a hard error for duplicate names but merely a warn-by-default rustdoc lint for backward compatibility.

In the same vein, should name=NOT_AN_IDENT warn-by-default as well, and not be used as a test name?

@fmease
Copy link
Member

fmease commented May 19, 2024

In the same vein, should name=NOT_AN_IDENT warn-by-default as well, […]?

Yes indeed, I forgot to mention that.

7f90cc4

@rfcbot resolve raw-idents

@fmease
Copy link
Member

fmease commented May 19, 2024

should [IDENT in name=IDENT] not be used as a test name?

That's a separate point. And I honestly don't know if it's feasible or worth impl'ing this, namely checking that doctest names are unique within the "#[test] namespace" of the current module (including other doctest names). Yeah, maybe we should actually check this and emit a warn-by-default lint on name clashes.

@casey
Copy link
Author

casey commented May 20, 2024

That's a separate point. And I honestly don't know if it's feasible or worth impl'ing this, namely checking that doctest names are unique within the "#[test] namespace" of the current module (including other doctest names). Yeah, maybe we should actually check this and emit a warn-by-default lint on name clashes.

Just updated the RFC. If the idea is that name=IDENT doctests will potentially be instantiated as #[test] fn test() { … } items, then they should both be unique across doctests, but also unique across all items in the current namespace.

@notriddle
Copy link
Contributor

notriddle commented May 20, 2024

Wait a minute. It reverted back to commas?

@rfcbot concern interoperability

We seem to be going in circles.

The original commit, 7641de1, from 2022, used the syntax rust,name=foo.

I objected to this syntax, citing a Babelmark run that showed most CommonMark engines interpreting this as adding <pre><code class="rust,name=foo">, when it's supposed to be class="rust". This matters because people use Rustdoc to execute doctests from external markdown files, even when they're using mdBook or GitHub to view them.

In cc3c07a, this was resolved by separating the name= attribute from the class with a space (which achieves roughly the same interoperability as Pandoc's syntax).

This syntax needs to be designed so that most markdown renderers treat it as separate from the language tag. Allowing commas is fine, but using a space to separate the name from the language needs to work.

@notriddle
Copy link
Contributor

Commas are the only character to separate annotations currently.

That doesn't seem to be true.

https://github.com/rust-lang/rust/blob/b92758a9aef1cef7b79e2b72c3d8ba113e547f89/src/librustdoc/html/markdown/tests.rs#L286

@notriddle
Copy link
Contributor

This feature was discussed in the previous monthly meeting, and we decided that it needed better motivation before adding it.

https://rust-lang.zulipchat.com/#narrow/channel/393423-t-rustdoc.2Fmeetings/topic/2024-12-09/near/487096843

To estimate how common this need might be, I used Crater to measure how many crates had items with more than five doctests to an item. The answer, as it turns out, is 0.6%.

rust-lang/rust#134530

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants