Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Report line numbers in mdtest relative to the markdown file, not the test snippet #13804

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pilleye
Copy link
Contributor

@pilleye pilleye commented Oct 18, 2024

Summary: Adds absolute line numbers to mdtest (fixes #13798)

Test Plan:

Screen.Recording.2024-10-18.at.1.00.25.AM.mov

Copy link
Contributor

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@MichaReiser MichaReiser added the red-knot Multi-file analysis & type inference label Oct 18, 2024
@@ -235,6 +247,7 @@ impl<'s> Parser<'s> {
// ignore other Markdown syntax (paragraphs, etc) used as comments in the test
if let Some(next_newline) = self.unparsed.find('\n') {
(_, self.unparsed) = self.unparsed.split_at(next_newline + 1);
self.current_line_number = self.current_line_number.saturating_add(1);
Copy link
Member

@MichaReiser MichaReiser Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't the line numbers off after a code block (that could span multiple lines)?

I think a better approach is to track the offset (TextSize) and increment it in self.scan and here. Then use LineIndex during rendering to get the line number.

use std::collections::BTreeMap;
use std::path::PathBuf;

type Failures = BTreeMap<SystemPathBuf, matcher::FailuresByLine>;
type Failures = BTreeMap<AbsoluteLineNumberPath, matcher::FailuresByLine>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'm inclined to use a regular Vec here and instead sort it before returning it in run_test. Vec's are just overall simpler data structure (e.g. for iterating). I then would add the md_offset to the FailuresByLine data structure (do we still need line_number or can we replace it with the md_offset?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove the grouping by embedded-file path from the output altogether, as I suggested below, that also suggests simplifying this data structure.

@MichaReiser
Copy link
Member

Thanks, this is great. I suspect that the line numbers may get out of sync during parsing if the error isn't in the file's first code block.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is fantastic! A couple more comments on top of Micha's :-)

Comment on lines +38 to +45
for (
AbsoluteLineNumberPath {
path,
line_number: absolute_line_number,
},
by_line,
) in failures
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just because you can do arbitrary unpacking in Rust doesn't mean you should ;)

Here, I'd find this more readable:

Suggested change
for (
AbsoluteLineNumberPath {
path,
line_number: absolute_line_number,
},
by_line,
) in failures
{
for (path, by_line) in failures {
let AbsoluteLineNumberPath {
path,
line_number: absolute_line_number,
} = path;

let line_info = format!("line {line_number}:").cyan();
println!(" {line_info} {failure}");
println!("{absolute_line_info} {line_info} {failure}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation should come before absolute_line_info here, so that the per-line diagnostics are indented below the "heading", which is the name of the embedded file

Suggested change
println!("{absolute_line_info} {line_info} {failure}");
println!(" {absolute_line_info} {line_info} {failure}");

This means that the output in the terminal looks is beautifully organised like this if you have a test which has multiple Python snippets in the same Markdown test, and failures in more than one of those code blocks:

image

I checked, and I'm still able to double-click on the exception_control_flow.md:157 if I run the test in VSCode, and it takes me straight to the line, even if it's indented.


Separately, I'm wondering if we need the relative line number to be reported at all, or if we could just make do with the relative line number? It feels like it just adds noise. Maybe we could just do this (diff relative to your PR branch):

diff --git a/crates/red_knot_test/src/lib.rs b/crates/red_knot_test/src/lib.rs
index 691a71528..e31fad615 100644
--- a/crates/red_knot_test/src/lib.rs
+++ b/crates/red_knot_test/src/lib.rs
@@ -50,9 +50,9 @@ pub fn run(path: &PathBuf, title: &str) {
                             "{}:{}",
                             title,
                             absolute_line_number.saturating_add(line_number.get())
-                        );
-                        let line_info = format!("line {line_number}:").cyan();
-                        println!("{absolute_line_info}    {line_info} {failure}");
+                        )
+                        .cyan();
+                        println!("    {absolute_line_info} {failure}");
                     }
                 }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the many line numbers and file names slightly confusing... Especially because they're next to each other.

But it's hard to come up with a sane grouping. The grouping by file seems reasonable but is also somewhat confusing with diagnostics that then have line numbers absolute to the entire file.

And do we still need to show both line numbers? Aren't the md file line numbers sufficient?

/src/single_except.py
	exception/control_flow.md:157 unmatched assertion: revealed: list[int]
	exception/control_flow.md:159 unmatched assertion: revealed: list[int]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And do we still need to show both line numbers? Aren't the md file line numbers sufficient?

I already asked that; I agree with you. Take a look at the final suggested diff in my comment above :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh shoot. Reading is hard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with removing the old embedded-relative line number entirely, in favor of just the new absolute line number.

I would favor removing the grouping by embedded-file altogether in the output, and just having flat output of one failure per line, per test. It's weird to me to have it grouped by embedded file but then the line numbers be relative to the markdown file. If you find the given line in the markdown file, it'll be obvious what embedded file you are in. And so many of the embedded file names are just /src/test.py anyway, it feels like mostly noise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would favor removing the grouping by embedded-file altogether in the output, and just having flat output of one failure per line, per test. It's weird to me to have it grouped by embedded file but then the line numbers be relative to the markdown file. If you find the given line in the markdown file, it'll be obvious what embedded file you are in. And so many of the embedded file names are just /src/test.py anyway, it feels like mostly noise.

That rationale makes sense to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[red-knot] Report line numbers in mdtest relative to the markdown file, not the test snippet
4 participants