Skip to content

Conversation

@ntBre
Copy link
Contributor

@ntBre ntBre commented Aug 1, 2025

Summary

This PR remaps ranges in Jupyter notebooks from simple row:column indices in the concatenated source code to cell:row:col to match Ruff's output. This is probably not a likely change to land upstream in annotate-snippets, but I didn't see a good way around it.

The remapping logic is taken nearly verbatim from here:

// If we're working with a Jupyter Notebook, skip the lines which are
// outside of the cell containing the diagnostic.
if let Some(index) = self.notebook_index {
let content_end_cell = index.cell(content_end_index).unwrap_or(OneIndexed::MIN);
while end_index > content_end_index {
if index.cell(end_index).unwrap_or(OneIndexed::MIN) == content_end_cell {
break;
}
end_index = end_index.saturating_sub(1);
}
}

Test Plan

New full rendering test for a notebook

I was mainly focused on Ruff, but in local tests this also works for ty:

error[invalid-assignment]: Object of type `Literal[1]` is not assignable to `str`
 --> Untitled.ipynb:cell 1:3:1
  |
1 | import math
2 |
3 | x: str = 1
  | ^
  |
info: rule `invalid-assignment` is enabled by default

error[invalid-assignment]: Object of type `Literal[1]` is not assignable to `str`
 --> Untitled.ipynb:cell 2:3:1
  |
1 | import math
2 |
3 | x: str = 1
  | ^
  |
info: rule `invalid-assignment` is enabled by default

This isn't a duplicate diagnostic, just an unimaginative example:

# cell 1
import math

x: str = 1
# cell 2
import math

x: str = 1

@ntBre ntBre added ty Multi-file analysis & type inference diagnostics Related to reporting of diagnostics. labels Aug 1, 2025
Comment on lines 886 to 899
pub(crate) enum Position {
RowCol(usize, usize),
Cell(usize, usize, usize),
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess another option here is

struct Position {
    row: usize,
    col: usize,
    cell: Option<usize>,
}

or even just adding the Option<usize> to the tuple we had before. That could deduplicate a little code in the match above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer using an Option with named fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I like the Option better here as well. Or the very least, enum variants with named fields. (i.e., Which usize is which?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. I think the old code had the names backwards even with only two fields in the tuple, so named fields will be helpful.

if let Some((col, row)) = pos {
buffer.append(line_offset, ":", stylesheet.none);
buffer.append(line_offset, col.to_string().as_str(), stylesheet.none);
buffer.append(line_offset, ":", stylesheet.none);
buffer.append(line_offset, row.to_string().as_str(), stylesheet.none);
}

@github-actions
Copy link
Contributor

github-actions bot commented Aug 1, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@ntBre ntBre marked this pull request as ready for review August 1, 2025 21:23
Comment on lines 886 to 899
pub(crate) enum Position {
RowCol(usize, usize),
Cell(usize, usize, usize),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer using an Option with named fields.

Comment on lines +68 to +81

/// The optional cell index in a Jupyter notebook, used for reporting source locations along
/// with the ranges on `annotations`.
pub(crate) cell_index: Option<usize>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we might need to add this to the Annotation because a Diagnostic with multiple ranges may span multiple cells. Or is this already what snippet represents. A single code frame?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is what a snippet represents, based on our RenderableSnippet::new docs:

/// Callers should guarantee that the `input` on every `ResolvedAnnotation`
/// given is identical.

and this usage in the same function:

let diagnostic_source = &anns[0].diagnostic_source;
let source = diagnostic_source.as_source_code();

Oh, I guess this is documented just above this too:

/// One `Snippet` is meant to represent a single, continuous,
/// slice of source code that you want to annotate.
#[derive(Debug)]
pub struct Snippet<'a> {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// One `Snippet` is meant to represent a single, continuous, 
/// slice of source code that you want to annotate. 

I'm not sure this is sufficient. You could have one snippet with multiple annotations but there's no guarantee that all annotations belong to the same cell and this would also be hard to guarantee in the linter (or formatter) because they run on the concatenated notebook source.

That's why I think that each annotation might need to have its own cell in addition to the snippet itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in addition to the snippet itself

Ah, okay I think I see what you mean. It turns out we have an existing mechanism that helps here. If the context windows for the annotations don't overlap, the secondary annotation gets a small sub-header, which already works for our notebooks:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8                
  |                                           
1 | # cell 1                                  
2 | import os                                 
  |        ^^                                 
  |                                           
 ::: notebook.ipynb:cell 3:4:5                
  |                                           
2 | def foo():                                
3 |     print()                               
4 |     x = 1                                 
  |     - second cell                         
  |                                           
help: Remove unused import: `os`              

The - second cell line is the secondary annotation with a very short underline.

This doesn't currently trigger if the context windows overlap:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8                
  |                                           
1 | # cell 1                                  
2 | import os                                 
  |        ^^                                 
3 | # cell 2                                  
4 | import math                               
  |        ---- second cell                   
5 |                                           
6 | print('hello world')                      
  |                                           
help: Remove unused import: `os`              

So there may be a very neat solution here without having to modify the Snippet or Annotation representation. I might just need better bookkeeping of the cell indices when computing context windows. We already truncate the first cell's context in the first example to fit within the cell, so I think this is all that's missing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. And we also never print the cell if they do overlap. So maybe that's fine as is? But it probably makes sense to add a test demonstrating the behavior at least

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should render something like this in the second case or the line numbers are a bit misleading:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
  |
 ::: notebook.ipynb:cell 2:2:8
  |
1 | # cell 2
2 | import math
  |        ---- second cell
3 |
4 | print('hello world')
  |
help: Remove unused import: `os`

That's the test I'm working on passing now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that makes sense!

context,
anns.iter().map(|ann| ann.line_end).max().unwrap(),
);
let content_start_index = anns.iter().map(|ann| ann.line_start).min().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extend the documentation to account for notebooks?

@MichaReiser
Copy link
Member

Thanks for working on this. Adding this to the rendering will also improve ty's notebook support :)

@ntBre
Copy link
Contributor Author

ntBre commented Aug 4, 2025

This should be ready for another look. We now always render additional annotations in different cells with their own sub-headings, even if the concatenated source lines are adjacent:

error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8                
  |                                           
1 | # cell 1                                  
2 | import os                                 
  |        ^^                                 
  |                                           
 ::: notebook.ipynb:cell 2:2:8                
  |                                           
1 | # cell 2                                  
2 | import math                               
  |        ---- second cell                   
3 |                                           
4 | print('hello world')                      
  |                                           
help: Remove unused import: `os`              

in this concatenated source:

# cell 1            
import os           
# cell 2            
import math         
                    
print('hello world')

I double-checked that we reuse the same snippet if they are in the same cell too:

error[test-diagnostic]: main diagnostic message
 --> notebook.ipynb:cell 2:2:8                 
  |                                            
1 | # cell 2                                   
2 | import math                                
  |        ^^^^ second cell                    
3 |                                            
4 | print('hello world')                       
  | ----- print statement                      
  |                                            
help: Remove `print` statement                 

I also updated the docs and Position representation, as suggested!

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy what you're selling!

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is great!

@ntBre ntBre force-pushed the brent/full-file-diagnostics branch 2 times, most recently from 4603fc8 to 2cfad84 Compare August 5, 2025 14:56
Base automatically changed from brent/full-file-diagnostics to main August 5, 2025 15:20
ntBre added 6 commits August 5, 2025 11:31
we want the first new test diagnostic to match the second. i.e. something like
this:

```
error[unused-import]: `os` imported but unused
 --> notebook.ipynb:cell 1:2:8
  |
1 | # cell 1
2 | import os
  |        ^^
  |
 ::: notebook.ipynb:cell 2:2:8
  |
1 | # cell 2
2 | import math
  |        ---- second cell
3 |
4 | print('hello world')
  |
help: Remove unused import: `os`
```
ntBre added 3 commits August 5, 2025 11:31
I thought this was unreachable when adding
it (#19644 (comment)), but
obviously it's tested here!

Without the spacing tweak, the snapshots were rendering like this:

```
error[unused-import][*] : `os` imported but unused
error[unused-import][*] : `math` imported but unused
```

with awkward extra spaces before the colons.
@ntBre ntBre force-pushed the brent/full-jupyter branch from 0d3c8dd to 5601e03 Compare August 5, 2025 15:45
@ntBre ntBre closed this Aug 5, 2025
@ntBre ntBre reopened this Aug 5, 2025
@ntBre ntBre merged commit 5bfffe1 into main Aug 5, 2025
35 checks passed
@ntBre ntBre deleted the brent/full-jupyter branch August 5, 2025 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

diagnostics Related to reporting of diagnostics. ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants