-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ty] Remap Jupyter notebook cell indices in ruff_db
#19698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| pub(crate) enum Position { | ||
| RowCol(usize, usize), | ||
| Cell(usize, usize, usize), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess another option here is
struct Position {
row: usize,
col: usize,
cell: Option<usize>,
}or even just adding the Option<usize> to the tuple we had before. That could deduplicate a little code in the match above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer using an Option with named fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like the Option better here as well. Or the very least, enum variants with named fields. (i.e., Which usize is which?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. I think the old code had the names backwards even with only two fields in the tuple, so named fields will be helpful.
ruff/crates/ruff_annotate_snippets/src/renderer/display_list.rs
Lines 252 to 257 in af8587e
| if let Some((col, row)) = pos { | |
| buffer.append(line_offset, ":", stylesheet.none); | |
| buffer.append(line_offset, col.to_string().as_str(), stylesheet.none); | |
| buffer.append(line_offset, ":", stylesheet.none); | |
| buffer.append(line_offset, row.to_string().as_str(), stylesheet.none); | |
| } |
|
| pub(crate) enum Position { | ||
| RowCol(usize, usize), | ||
| Cell(usize, usize, usize), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer using an Option with named fields.
|
|
||
| /// The optional cell index in a Jupyter notebook, used for reporting source locations along | ||
| /// with the ranges on `annotations`. | ||
| pub(crate) cell_index: Option<usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we might need to add this to the Annotation because a Diagnostic with multiple ranges may span multiple cells. Or is this already what snippet represents. A single code frame?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is what a snippet represents, based on our RenderableSnippet::new docs:
ruff/crates/ruff_db/src/diagnostic/render.rs
Lines 578 to 579 in 5f149fc
| /// Callers should guarantee that the `input` on every `ResolvedAnnotation` | |
| /// given is identical. |
and this usage in the same function:
ruff/crates/ruff_db/src/diagnostic/render.rs
Lines 593 to 594 in 5f149fc
| let diagnostic_source = &anns[0].diagnostic_source; | |
| let source = diagnostic_source.as_source_code(); |
Oh, I guess this is documented just above this too:
ruff/crates/ruff_annotate_snippets/src/snippet.rs
Lines 67 to 70 in 5f149fc
| /// One `Snippet` is meant to represent a single, continuous, | |
| /// slice of source code that you want to annotate. | |
| #[derive(Debug)] | |
| pub struct Snippet<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// One `Snippet` is meant to represent a single, continuous,
/// slice of source code that you want to annotate.
I'm not sure this is sufficient. You could have one snippet with multiple annotations but there's no guarantee that all annotations belong to the same cell and this would also be hard to guarantee in the linter (or formatter) because they run on the concatenated notebook source.
That's why I think that each annotation might need to have its own cell in addition to the snippet itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in addition to the snippet itself
Ah, okay I think I see what you mean. It turns out we have an existing mechanism that helps here. If the context windows for the annotations don't overlap, the secondary annotation gets a small sub-header, which already works for our notebooks:
error[unused-import]: `os` imported but unused
--> notebook.ipynb:cell 1:2:8
|
1 | # cell 1
2 | import os
| ^^
|
::: notebook.ipynb:cell 3:4:5
|
2 | def foo():
3 | print()
4 | x = 1
| - second cell
|
help: Remove unused import: `os`
The - second cell line is the secondary annotation with a very short underline.
This doesn't currently trigger if the context windows overlap:
error[unused-import]: `os` imported but unused
--> notebook.ipynb:cell 1:2:8
|
1 | # cell 1
2 | import os
| ^^
3 | # cell 2
4 | import math
| ---- second cell
5 |
6 | print('hello world')
|
help: Remove unused import: `os`
So there may be a very neat solution here without having to modify the Snippet or Annotation representation. I might just need better bookkeeping of the cell indices when computing context windows. We already truncate the first cell's context in the first example to fit within the cell, so I think this is all that's missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. And we also never print the cell if they do overlap. So maybe that's fine as is? But it probably makes sense to add a test demonstrating the behavior at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should render something like this in the second case or the line numbers are a bit misleading:
error[unused-import]: `os` imported but unused
--> notebook.ipynb:cell 1:2:8
|
1 | # cell 1
2 | import os
| ^^
|
::: notebook.ipynb:cell 2:2:8
|
1 | # cell 2
2 | import math
| ---- second cell
3 |
4 | print('hello world')
|
help: Remove unused import: `os`
That's the test I'm working on passing now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that makes sense!
| context, | ||
| anns.iter().map(|ann| ann.line_end).max().unwrap(), | ||
| ); | ||
| let content_start_index = anns.iter().map(|ann| ann.line_start).min().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you extend the documentation to account for notebooks?
|
Thanks for working on this. Adding this to the rendering will also improve ty's notebook support :) |
|
This should be ready for another look. We now always render additional annotations in different cells with their own sub-headings, even if the concatenated source lines are adjacent: in this concatenated source: # cell 1
import os
# cell 2
import math
print('hello world')I double-checked that we reuse the same snippet if they are in the same cell too: I also updated the docs and |
BurntSushi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I buy what you're selling!
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this is great!
4603fc8 to
2cfad84
Compare
we want the first new test diagnostic to match the second. i.e. something like
this:
```
error[unused-import]: `os` imported but unused
--> notebook.ipynb:cell 1:2:8
|
1 | # cell 1
2 | import os
| ^^
|
::: notebook.ipynb:cell 2:2:8
|
1 | # cell 2
2 | import math
| ---- second cell
3 |
4 | print('hello world')
|
help: Remove unused import: `os`
```
I thought this was unreachable when adding it (#19644 (comment)), but obviously it's tested here! Without the spacing tweak, the snapshots were rendering like this: ``` error[unused-import][*] : `os` imported but unused error[unused-import][*] : `math` imported but unused ``` with awkward extra spaces before the colons.
0d3c8dd to
5601e03
Compare
Summary
This PR remaps ranges in Jupyter notebooks from simple
row:columnindices in the concatenated source code tocell:row:colto match Ruff's output. This is probably not a likely change to land upstream inannotate-snippets, but I didn't see a good way around it.The remapping logic is taken nearly verbatim from here:
ruff/crates/ruff_linter/src/message/text.rs
Lines 212 to 222 in cd6bf14
Test Plan
New
fullrendering test for a notebookI was mainly focused on Ruff, but in local tests this also works for ty:
This isn't a duplicate diagnostic, just an unimaginative example: